Red links from Ukrainian Wikipedia, dump September 2018
Source: DataCite Commons. Updated: 2020-08-26. Indexed: 2024-07-28.
Download link:
https://figshare.com/articles/ukredlinks_pairs_final_pkl/11550774/4
Description:
Dataset of Wikipedia red links from the Ukrainian Wikipedia edition (dump of 20 September 2018). The table 'ukredlinks_final.csv' contains the 3 171 most frequent Ukrainian red links: those that occur in 20 or more articles which have corresponding articles in English Wikipedia. The table 'ukredlinks_pairs_final.csv' consists of 2 957 927 pairs, each made of a red link and a candidate page from English Wikipedia. The pairs serve as a dictionary for the Named Entity Linking task. The tables 'redlinks_train_set_indexes.csv' and 'redlinks_test_set_indexes.csv' hold the indexes of the train and test sets of red link and English candidate page pairs.
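As a rough illustration of how the pairs table could be turned into a Named Entity Linking dictionary and split with the index files, here is a minimal Python sketch. The record does not document the CSV layout, so the column names `red_link` and `candidate`, and the reading of the index files as 0-based row positions, are assumptions rather than the dataset's actual schema. All function names are hypothetical.

```python
import csv
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def read_csv_rows(path: str) -> List[dict]:
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def build_candidate_dict(
    rows: Iterable[dict],
    red_link_col: str = "red_link",    # assumed column name, not documented
    candidate_col: str = "candidate",  # assumed column name, not documented
) -> Dict[str, List[str]]:
    """Group candidate English pages under each Ukrainian red link."""
    candidates: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        candidates[row[red_link_col]].append(row[candidate_col])
    return dict(candidates)


def select_rows(rows: List[dict], indexes: Iterable[int]) -> List[dict]:
    """Keep the pair rows whose 0-based position is listed in an index file.

    Treating the indexes as row positions is an assumption.
    """
    wanted: Set[int] = set(indexes)
    return [row for position, row in enumerate(rows) if position in wanted]
```

With the real files, `read_csv_rows("ukredlinks_pairs_final.csv")` would supply the rows, once the column names have been checked against the actual header.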
Provider:
figshare
Created:
2020-01-09



