ELMo embeddings models for seven languages

Source: SSH Open MarketPlace. Updated 2025-07-04. Indexed 2025-07-05.

Download link:

https://marketplace.sshopencloud.eu/dataset/kgI5pr

Description:

These ELMo models produce contextual word embeddings. A separate model was trained for each of seven languages on a large monolingual corpus, for approximately 10 epochs each. Training corpus sizes range from just over 270 million tokens (Latvian) to almost 2 billion tokens (Croatian). For each language model, roughly the 1 million most frequent tokens were supplied as the vocabulary during training. Because the network input operates at the character level, the models can also produce embeddings for out-of-vocabulary (OOV) words. The models are available for download from the CLARIN.SI repository.
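
The character-level input mentioned above is what makes OOV handling possible: any token, seen in training or not, can be turned into a sequence of character ids. The following is a minimal sketch of that idea, not the code used to train these models. The function name `encode_word`, the 50-slot word length, and the marker ids are assumptions modelled on common ELMo implementations, not values confirmed by this listing.

```python
# Illustrative sketch only. The constants below are assumptions modelled on
# common ELMo implementations; this listing does not confirm them.
MAX_WORD_LEN = 50              # assumed fixed number of character slots per token
BOW, EOW, PAD = 258, 259, 260  # assumed marker ids, outside the byte range 0-255


def encode_word(word: str) -> list[int]:
    """Map a token to a fixed-length list of character (UTF-8 byte) ids."""
    # Keep at most MAX_WORD_LEN - 2 bytes so both word markers always fit.
    raw = word.encode("utf-8")[: MAX_WORD_LEN - 2]
    ids = [BOW, *raw, EOW]
    # Right-pad so every token has the same length.
    return ids + [PAD] * (MAX_WORD_LEN - len(ids))


if __name__ == "__main__":
    # No vocabulary lookup is involved, so a made-up token encodes just as well.
    for token in ["Zagreb", "xqzvorp"]:
        ids = encode_word(token)
        print(token, len(ids), ids[:6])
```

Since the encoding never consults the word vocabulary, an unseen token receives a valid input representation, which a character-based network can then map to an embedding.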

Created:

2025-07-04
