
English and Portuguese CBOW Models from Europarl Corpus, version 7, using FastText with Subwords Option

Source: NIAID Data Ecosystem (indexed 2026-03-13)
Download link:
https://zenodo.org/record/6603927
Description:
The models were trained with FastText using the CBOW architecture, 40 epochs, and the subwords option. Each *.BIN file is accompanied by a *.VEC file that lists the vocabulary ordered by frequency. A *.BIN file can return a vector for an out-of-vocabulary (OOV) word, provided the subword units that word needs were seen during training. Both FastText and Gensim can read these files. The English and Portuguese models are identified by _en_ and _pt_ in the file name, respectively. An accompanying Excel file records how the neighborhoods of selected words changed at each training epoch; it comes from an earlier exercise to find words with more than one meaning.
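The OOV behavior described above depends on character n-grams, and the following is a minimal sketch of that idea together with two ways of touching the files. It rests on assumptions: `minn=3` and `maxn=6` are FastText's documented defaults, not values stated for these models; the *.VEC file is assumed to be in the usual text layout (a "count dimension" header followed by one word per line); and the helper names and any paths passed to them are hypothetical. `load_facebook_vectors` is Gensim's loader for FastText binary files.

```python
from typing import List, Tuple


def char_ngrams(word: str, minn: int = 3, maxn: int = 6) -> List[str]:
    """Return the character n-grams of `word` wrapped in '<' and '>'.

    minn/maxn are assumed defaults; the source does not state the
    values used to train these models.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(minn, maxn + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


def read_vec_head(path: str, limit: int = 10) -> Tuple[int, int, List[str]]:
    """Read the header and the first `limit` words of a text .vec file.

    Because the vocabulary is ordered by frequency, these are the
    most frequent words in the model.
    """
    with open(path, encoding="utf-8") as fh:
        count, dim = map(int, fh.readline().split())
        words = []
        for _ in range(limit):
            line = fh.readline()
            if not line:
                break
            words.append(line.rstrip("\n").split(" ", 1)[0])
    return count, dim, words


def load_bin_with_gensim(path: str):
    """Load a FastText .bin file as Gensim keyed vectors (requires gensim)."""
    from gensim.models.fasttext import load_facebook_vectors
    return load_facebook_vectors(path)


if __name__ == "__main__":
    # An OOV word can still get a vector if its n-grams occurred in training.
    print(char_ngrams("where", minn=3, maxn=3))
```

With vectors loaded through `load_bin_with_gensim`, indexing by a word (for example `kv["someword"]`) returns a vector built from subword units even when the word itself is not in the vocabulary; a *.VEC file alone carries only whole-word vectors, so it cannot do this.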
Created:
2022-06-02