WikiRank quality scores and measures for Wikipedia articles (April 2022)
Source: Figshare | Updated: 2022-05-13 | Indexed: 2026-04-28
Download link:
https://figshare.com/articles/dataset/WikiRank_quality_scores_and_measures_for_Wikipedia_articles_April_2022_/19762927
Description:
These datasets list over 43 million Wikipedia articles in 55 languages, each with a quality score from WikiRank (https://wikirank.net). The datasets also contain the quality measures (metrics) that directly affect these scores. The quality measures were extracted from Wikipedia dumps from April 2022.

License
All files included in these datasets are released under CC BY 4.0: https://creativecommons.org/licenses/by/4.0/

Format
page_id -- the identifier of the Wikipedia article (int), e.g. 840191
page_name -- the title of the Wikipedia article (utf-8), e.g. Sagittarius A*
wikirank_quality -- the quality score of the Wikipedia article on a scale of 0-100 (as of April 1, 2022). This is a synthetic measure calculated from the metrics below, which are also included in the datasets.
norm_len -- normalized "page length"
norm_refs -- normalized "number of references"
norm_img -- normalized "number of images"
norm_sec -- normalized "number of sections"
norm_reflen -- normalized "references per length ratio"
norm_authors -- normalized "number of authors" (without bots and anonymous users)
flawtemps -- flaw templates
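As a rough illustration of the format above, the following Python sketch reads records into typed objects. The column names are taken from the description. The rest is assumed and should be checked against the actual files: that each file is comma-delimited text with a header row, that the norm_* measures parse as floats, and that flawtemps can be kept as raw text. The ArticleQuality class, the read_articles helper, and the sample row values are hypothetical and are not part of the dataset.

```python
import csv
import io
from dataclasses import dataclass

# Column names come from the dataset description. The comma delimiter,
# the header row, and float-valued measures are assumptions.
MEASURES = ["norm_len", "norm_refs", "norm_img", "norm_sec",
            "norm_reflen", "norm_authors"]


@dataclass
class ArticleQuality:  # hypothetical container, not part of the dataset
    page_id: int
    page_name: str
    wikirank_quality: float
    measures: dict   # normalized measure name -> value
    flawtemps: str   # kept as raw text; its encoding is not described above


def read_articles(handle, delimiter=","):
    """Yield one ArticleQuality per row of an open text file."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    for row in reader:
        yield ArticleQuality(
            page_id=int(row["page_id"]),
            page_name=row["page_name"],
            wikirank_quality=float(row["wikirank_quality"]),
            measures={name: float(row[name]) for name in MEASURES},
            flawtemps=row["flawtemps"],
        )


# Made-up sample row for illustration only; the numbers are not real scores.
SAMPLE = (
    "page_id,page_name,wikirank_quality,norm_len,norm_refs,norm_img,"
    "norm_sec,norm_reflen,norm_authors,flawtemps\n"
    "840191,Sagittarius A*,50.0,60.0,70.0,40.0,55.0,30.0,45.0,0\n"
)

articles = list(read_articles(io.StringIO(SAMPLE)))
print(articles[0].page_name, articles[0].wikirank_quality)
```

For a real file, an open file handle (for example open(path, encoding="utf-8", newline="")) can be passed in place of the in-memory sample. If the files use a different separator, the delimiter argument can be changed accordingly.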
Created:
2022-05-13



