Italian FastText models

Indexed from NIAID Data Ecosystem on 2026-05-01

Download link:

https://zenodo.org/record/8426106

Description:

Italian FastText models trained from scratch on a dataset composed of:

- wiki: a dump of Italian Wikipedia (as of December 15, 2022), comprising 25,548,651 sentences and 526,640,982 words (3.2 GB of raw text);
- webz: a dataset of Italian news (159,226 documents) from the webz.io platform, crawled in October 2015, containing 44,041,823 sentences and 44,544,385 words (244 MB);
- a dataset of 5,510 Italian news articles from the newspaper ModenaToday (MT) or 15,115 documents from the Italian version of Reuters (RCV2).

Models:

- ft_wiki_wbz_mt_20_epochs: FastText model trained on the dataset consisting of wiki, webz, and MT for 20 epochs
- ft_wiki_wbz_mt_50_epochs: FastText model trained on the dataset consisting of wiki, webz, and MT for 50 epochs
- ft_wiki_wbz_reut_20_epochs: FastText model trained on the dataset consisting of wiki, webz, and RCV2 for 20 epochs
- ft_wiki_wbz_reut_50_epochs: FastText model trained on the dataset consisting of wiki, webz, and RCV2 for 50 epochs
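The model names listed above encode the training corpora and the number of epochs. The following is a minimal sketch of how a name could be decoded and a model loaded, not the authors' own tooling. The helper names (`parse_model_name`, `load_vectors`) are hypothetical, and the sketch assumes the models are distributed as fastText binary files readable by the `fasttext` Python package; the description does not state the file format or file names, so the path in the usage comment is a placeholder.

```python
import re
from typing import NamedTuple, Tuple


class ModelVariant(NamedTuple):
    name: str
    corpora: Tuple[str, ...]
    epochs: int


# Abbreviations used in the model names, mapped to the corpus labels
# used in the description (wbz -> webz, reut -> RCV2).
CORPUS_TAGS = {"wiki": "wiki", "wbz": "webz", "mt": "MT", "reut": "RCV2"}

# Expected shape: ft_<tag>_<tag>_..._<N>_epochs
_NAME_RE = re.compile(r"^ft_(?P<tags>[a-z_]+?)_(?P<epochs>\d+)_epochs$")


def parse_model_name(name: str) -> ModelVariant:
    """Split a model name into its training corpora and epoch count."""
    match = _NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognised model name: {name!r}")
    tags = match.group("tags").split("_")
    unknown = [tag for tag in tags if tag not in CORPUS_TAGS]
    if unknown:
        raise ValueError(f"unknown corpus tag(s) in {name!r}: {unknown}")
    corpora = tuple(CORPUS_TAGS[tag] for tag in tags)
    return ModelVariant(name, corpora, int(match.group("epochs")))


def load_vectors(path: str):
    """Load a fastText model from disk.

    Assumption: `path` points to a fastText binary model file. The import
    is local so the name parser above works without the third-party
    `fasttext` package installed.
    """
    import fasttext  # third-party: pip install fasttext

    return fasttext.load_model(path)


if __name__ == "__main__":
    for model_name in (
        "ft_wiki_wbz_mt_20_epochs",
        "ft_wiki_wbz_mt_50_epochs",
        "ft_wiki_wbz_reut_20_epochs",
        "ft_wiki_wbz_reut_50_epochs",
    ):
        print(parse_model_name(model_name))

    # Hypothetical usage once a model file has been downloaded:
    # model = load_vectors("path/to/ft_wiki_wbz_mt_20_epochs.bin")
    # print(model.get_nearest_neighbors("giornale", k=5))
```

Keeping the `fasttext` import inside `load_vectors` means the name parsing can be used on its own, for example to pick a variant before downloading a large file.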

Created:

2023-10-10
