WikiRank quality scores and measures for Wikipedia articles (April 2022)
Source: DataCite Commons. Updated: 2025-06-01. Indexed: 2024-08-18.
Download link:
https://figshare.com/articles/dataset/WikiRank_quality_scores_and_measures_for_Wikipedia_articles_April_2022_/19762927/1
Description:
These datasets list over 43 million Wikipedia articles in 55 languages with quality scores from WikiRank (https://wikirank.net). They also contain the quality measures (metrics) that directly affect these scores. The quality measures were extracted from Wikipedia dumps from April 2022.

License
All files included in these datasets are released under CC BY 4.0: https://creativecommons.org/licenses/by/4.0/

Format
page_id -- identifier of the Wikipedia article (int), e.g. 840191
page_name -- title of the Wikipedia article (UTF-8), e.g. Sagittarius A*
wikirank_quality -- quality score of the Wikipedia article on a scale of 0-100 (as of April 1, 2022). This is a synthetic measure calculated from the metrics below (also included in the datasets).
norm_len -- normalized "page length"
norm_refs -- normalized "number of references"
norm_img -- normalized "number of images"
norm_sec -- normalized "number of sections"
norm_reflen -- normalized "references per length ratio"
norm_authors -- normalized "number of authors" (without bots and anonymous users)
flawtemps -- flaw templates
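The field list above can be turned into a small loading sketch. The description does not state the file format, so this example assumes comma-separated text with a header row, and it assumes every measure (including flawtemps) is numeric. The helper name `load_rows`, the second sample row, and all numeric values are hypothetical placeholders, not real WikiRank scores.

```python
import csv
import io

# Column names as listed in the dataset description.
COLUMNS = [
    "page_id", "page_name", "wikirank_quality",
    "norm_len", "norm_refs", "norm_img", "norm_sec",
    "norm_reflen", "norm_authors", "flawtemps",
]

# ASSUMPTION: comma-separated text with a header row.
# The numeric values are made-up placeholders, not real WikiRank scores.
SAMPLE = """page_id,page_name,wikirank_quality,norm_len,norm_refs,norm_img,norm_sec,norm_reflen,norm_authors,flawtemps
840191,Sagittarius A*,50.0,60.0,70.0,40.0,30.0,20.0,80.0,0
1,Placeholder article,10.0,5.0,0.0,0.0,15.0,0.0,2.0,1
"""


def load_rows(text_stream):
    """Parse rows into dicts: page_id as int, page_name as str, the rest as float."""
    rows = []
    for raw in csv.DictReader(text_stream):
        row = {"page_id": int(raw["page_id"]), "page_name": raw["page_name"]}
        # ASSUMPTION: the score and all measures, including flawtemps, are numeric.
        for name in COLUMNS[2:]:
            row[name] = float(raw[name])
        rows.append(row)
    return rows


rows = load_rows(io.StringIO(SAMPLE))

# Keep only articles at or above an arbitrary quality threshold.
high_quality = [r for r in rows if r["wikirank_quality"] >= 40.0]
```

For a real file, the `io.StringIO(SAMPLE)` argument would be replaced with an open file handle, after checking the actual delimiter and header of the downloaded files.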
Provider:
figshare
Created:
2022-05-13



