Topics for each Wikipedia Article across Languages
Source: DataCite Commons. Updated 2020-08-25. Indexed 2024-07-28.
Download link:
https://figshare.com/articles/Topics_for_each_Wikipedia_Article_across_Languages/12127434
Description:
This dataset contains the predicted topic(s) for almost every Wikipedia article across languages.

Each row contains the following columns:

Qid,topic,probability,page_id,page_title,wiki_db

Where:

* Qid: Wikidata item ID
* topic: topic from the ORES draft topic model (https://www.mediawiki.org/wiki/Talk:ORES/Draft_topic)
* probability: predicted probability that the article belongs to the topic
* page_id: page ID of the article
* page_title: title of the article
* wiki_db: database name of the wiki, for example enwiki for English Wikipedia

For example:

Q1000211,Geography.Regions.Europe.Western_Europe,1.0,166578,Frières-Faillouël,euwiki

Topics are predicted using the Wikidata-Topic model developed by Isaac Johnson (https://github.com/geohci/wikidata-topic-model).

The source code used to create this dataset is available at:
https://github.com/digitalTranshumant/wikidata-topic-model
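The column layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not the dataset authors' own tooling: the helper names `read_topics` and `topics_by_item`, the `min_probability` threshold, and the handling of an optional header row are all assumptions made for illustration. It also assumes the file is plain comma-separated text and that titles containing commas are quoted.

```python
import csv
import io
from collections import defaultdict

# Column order as given in the dataset description.
COLUMNS = ["Qid", "topic", "probability", "page_id", "page_title", "wiki_db"]


def read_topics(lines, wiki_db=None, min_probability=0.0):
    """Yield rows as dicts, optionally filtered by wiki and probability.

    `lines` is any iterable of CSV text lines (an open file works).
    Hypothetical helper: the filters are illustrative, not part of the dataset.
    """
    reader = csv.DictReader(lines, fieldnames=COLUMNS)
    for row in reader:
        if row["Qid"] == "Qid":
            # Assumption: skip a header row if the file happens to have one.
            continue
        row["probability"] = float(row["probability"])
        row["page_id"] = int(row["page_id"])
        if wiki_db is not None and row["wiki_db"] != wiki_db:
            continue
        if row["probability"] < min_probability:
            continue
        yield row


def topics_by_item(rows):
    """Group (topic, probability) pairs by Wikidata item ID."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["Qid"]].append((row["topic"], row["probability"]))
    return dict(grouped)


# The example row from the dataset description.
sample = io.StringIO(
    "Q1000211,Geography.Regions.Europe.Western_Europe,1.0,"
    "166578,Frières-Faillouël,euwiki\n"
)
rows = list(read_topics(sample, wiki_db="euwiki", min_probability=0.5))
print(topics_by_item(rows))
```

Because an article can have more than one predicted topic, grouping by `Qid` collects all topic and probability pairs for the same Wikidata item. For the real file, pass an open file handle (opened with `encoding="utf-8"` and `newline=""`) in place of the `io.StringIO` sample.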
Provider:
figshare
Created:
2020-04-15



