WiLI-2018

Indexed from Papers with Code, 2024-05-15

Download link:

https://paperswithcode.com/dataset/wili-2018

Resource description:

WiLI-2018 is a benchmark dataset for monolingual written natural language identification. It is a publicly available, free-of-charge dataset of short text extracts from Wikipedia, containing 1,000 paragraphs for each of 235 languages, 235,000 paragraphs in total. WiLI is a classification dataset: given an unknown paragraph written in one dominant language, the task is to decide which language it is.
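To make the classification task concrete, here is a minimal sketch of a character-trigram language identifier. It is not the method used by the WiLI-2018 authors, and it does not load the dataset: the training sentences, the labels `eng` and `deu`, and the helper names `char_ngrams`, `train`, and `predict` are all invented for illustration.

```python
from collections import Counter


def char_ngrams(text, n=3):
    """Return a Counter of overlapping character n-grams in lowercased text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def train(samples):
    """Build one n-gram profile per language from (text, label) pairs."""
    profiles = {}
    for text, label in samples:
        profiles.setdefault(label, Counter()).update(char_ngrams(text))
    return profiles


def predict(profiles, text):
    """Pick the label whose normalized profile overlaps most with the text."""
    grams = char_ngrams(text)

    def overlap(profile):
        total = sum(profile.values())
        # Counter returns 0 for n-grams that the profile has never seen.
        return sum(count * profile[g] / total for g, count in grams.items())

    return max(profiles, key=lambda label: overlap(profiles[label]))


# Toy training sentences, invented for illustration (not taken from WiLI-2018).
TOY_SAMPLES = [
    ("the quick brown fox jumps over the lazy dog", "eng"),
    ("this is a short paragraph written in english", "eng"),
    ("der schnelle braune fuchs springt über den faulen hund", "deu"),
    ("dies ist ein kurzer absatz in deutscher sprache", "deu"),
]

profiles = train(TOY_SAMPLES)
print(predict(profiles, "the dog is in the garden"))
print(predict(profiles, "der hund ist in dem garten"))
```

With real data, each paragraph and its language label from the dataset would take the place of `TOY_SAMPLES`; a profile-overlap score like this one is only a baseline and would likely struggle to separate closely related languages.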
