ELMo embeddings models for seven languages
收藏SSH Open MarketPlace2025-07-04 更新2025-07-05 收录
下载链接:
https://marketplace.sshopencloud.eu/dataset/kgI5pr
下载链接
链接失效反馈官方服务:
资源简介:
This model is used to produce contextual word embeddings. It is trained on large monolingual corpora for 7 languages. Each language's model was trained for approximately 10 epochs. Corpora sizes used in training range from over 270 M tokens in Latvian to almost 2 B tokens in Croatian. About 1 million most common tokens were provided as vocabulary during the training for each language model. The model can also infer OOV words, since the neural network input is on the character level.
The model is available for download from the CLARIN.SI repository.
创建时间:
2025-07-04



