Quantized versions of fastText embeddings for 158 languages
Source: NIAID Data Ecosystem (indexed 2026-03-11)
Download link:
https://zenodo.org/record/3629536
Description:
These are compressed (quantized) versions of the word embeddings for 158 languages originally published at https://fasttext.cc/docs/en/crawl-vectors.html . The following steps were performed to reduce file size:
the output matrix is discarded
the quantization is performed with parameters "-qnorm -dsub 1"
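To give a feel for what these two settings mean, here is a minimal sketch, not the procedure used to build these files. It assumes that "-dsub 1" means each dimension is encoded on its own and that "-qnorm" means the vector norm is handled separately from the direction. Two simplifications are made for brevity: the 256 levels are a uniform grid rather than learned centroids, and the norm is kept as a plain float instead of being quantized too. The names quantize and dequantize are hypothetical.

```python
import math


def quantize(vec, levels=256):
    """Encode a vector as (norm, codes), one small integer code per dimension.

    Simplified illustration: uniform grid over [-1, 1], norm kept as a float.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    # Normalise so every component lies in [-1, 1]; a zero vector stays as is.
    unit = [x / norm for x in vec] if norm > 0 else list(vec)
    step = 2.0 / (levels - 1)
    codes = []
    for u in unit:
        code = round((u + 1.0) / step)
        # Clamp to guard against floating-point overshoot at the edges.
        codes.append(max(0, min(levels - 1, code)))
    return norm, codes


def dequantize(norm, codes, levels=256):
    """Rebuild an approximate vector from the norm and per-dimension codes."""
    step = 2.0 / (levels - 1)
    return [norm * (c * step - 1.0) for c in codes]


if __name__ == "__main__":
    original = [0.5, -1.5, 2.0, 0.0]  # toy vector, not real embedding data
    norm, codes = quantize(original)
    print(codes)
    print(dequantize(norm, codes))
```

With 256 levels each code fits in one byte, compared with four bytes for a 32-bit float, which is where most of the saving in this kind of scheme comes from.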
The average cosine similarity between the original and quantized vectors for frequent words is 0.99. The files are 4-6 times smaller than the originals.
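A figure like the average cosine similarity can be checked along the following lines. This is only a sketch: the word lists and vectors below are invented toy values, and average_cosine is a hypothetical helper, not part of fastText or of this dataset.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def average_cosine(originals, restored):
    """Mean cosine similarity over the words present in `originals`."""
    sims = [cosine(originals[word], restored[word]) for word in originals]
    return sum(sims) / len(sims)


if __name__ == "__main__":
    # Toy stand-ins for original and quantized vectors of frequent words.
    originals = {"cat": [0.12, -0.40, 0.33], "dog": [-0.25, 0.18, 0.07]}
    restored = {"cat": [0.1, -0.4, 0.3], "dog": [-0.3, 0.2, 0.1]}
    print(average_cosine(originals, restored))
```

In practice the two dictionaries would be filled by looking up the same set of frequent words in the original model and in the quantized one.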
Created:
2020-01-31



