
Article-level image suggestions evaluation (ALISE) dataset

Source: Figshare. Updated: 2023-06-07. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Article-level_image_suggestions_evaluation_strong_ALISE_strong_dataset/23301860
Description:
Article-level image suggestions (ALIS, pronounced "alice") is a distributed computing system that recommends images for Wikipedia articles that do not have one [1]. This publication contains roughly 3,800 human ratings of ALIS output across multiple Wikipedia language editions.

Evaluation task

Data was collected through an evaluation tool [2], with code available at [3]. Given a language, the user is shown a random Wikipedia article and an image suggested by the system. They are then asked to rate the relevance of the image by clicking the Good, Okay, Bad, or Unsure button. The user is also asked to judge whether the image is unsuitable for any reason via the It's ok, It's unsuitable, or Unsure button.

Content

The archive holds 2 tab-separated-values (TSV) text files:
- evaluation_dataset.tsv contains the evaluation data;
- unillustrated_articles.tsv keeps track of unillustrated Wikipedia articles.

Evaluation dataset headers

- id (integer) - identifier used for internal storage;
- unillustratedArticleId (integer) - identifier of the unillustrated Wikipedia article;
- resultFilePage (string) - Wikimedia Commons image file name. Prepend https://commons.wikimedia.org/wiki/ to form a valid Commons URL;
- resultImageUrl (string) - Wikimedia Commons thumbnail URL;
- source (string) - suggestion source. ms = MediaSearch; ima = ALIS prototype algorithm. See [4] and [5] respectively for more details;
- confidence_class (string) - coarse degree of suggestion confidence. Either low, medium, or high;
- rating (integer) - human image relevance rating. 1 = good; 0 = okay; -1 = bad;
- sensitive (integer) - human image suitability rating. 0 = it's okay; 1 = it's unsuitable; -1 = unsure;
- viewCount (integer) - number of times the suggestion was seen by evaluators.

Example (fields are tab-separated in the file)

7357 1827 File:Cuphea_cyanea_strybing.jpg https://upload.wikimedia.org/wikipedia/commons/thumb/1/17/Cuphea_cyanea_strybing.jpg/800px-Cuphea_cyanea_strybing.jpg ima high 1 0 1

Unillustrated articles headers

- id (integer) - identifier used for internal storage. Maps to unillustratedArticleId in the evaluation data;
- langCode (string) - Wikipedia language code;
- pageTitle (string) - Wikipedia article title;
- unsuitableArticleType (integer) - whether the Wikipedia article is suitable for receiving image suggestions. 0 = suitable; 1 = not suitable.

Example (fields are tab-separated in the file)

1827 vi Cuphea_cyanea 0

References

[1] https://www.mediawiki.org/wiki/Structured_Data_Across_Wikimedia/Image_Suggestions/Data_Pipeline
[2] https://image-recommendation-test.toolforge.org/
[3] https://github.com/cormacparle/media-search-signal-test/tree/master/public_html
[4] https://www.mediawiki.org/wiki/Help:MediaSearch
[5] https://www.mediawiki.org/wiki/Structured_Data_Across_Wikimedia/Image_Suggestions/Data_Pipeline#How_it_works
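The description above says that id in unillustrated_articles.tsv maps to unillustratedArticleId in evaluation_dataset.tsv, and that resultFilePage becomes a Commons URL once the documented prefix is prepended. The following is a minimal Python sketch of that join, not an official loader. It assumes that each file begins with a header row using exactly the column names listed above and that fields are not quoted; neither is confirmed by the description. The function names, the commonsUrl and ratingLabel keys, and the SAMPLE_* constants (built from the two example rows) are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import Counter

# Prefix documented in the dataset description for resultFilePage.
COMMONS_PREFIX = "https://commons.wikimedia.org/wiki/"
# Rating codes as documented: 1 = good, 0 = okay, -1 = bad.
RATING_LABELS = {"1": "good", "0": "okay", "-1": "bad"}

# Sample data built from the example rows in the description.
# ASSUMPTION: the real files start with a header row like these.
SAMPLE_EVALUATIONS = "\n".join([
    "\t".join(["id", "unillustratedArticleId", "resultFilePage",
               "resultImageUrl", "source", "confidence_class",
               "rating", "sensitive", "viewCount"]),
    "\t".join(["7357", "1827", "File:Cuphea_cyanea_strybing.jpg",
               "https://upload.wikimedia.org/wikipedia/commons/thumb/1/17/"
               "Cuphea_cyanea_strybing.jpg/800px-Cuphea_cyanea_strybing.jpg",
               "ima", "high", "1", "0", "1"]),
]) + "\n"
SAMPLE_ARTICLES = "\n".join([
    "\t".join(["id", "langCode", "pageTitle", "unsuitableArticleType"]),
    "\t".join(["1827", "vi", "Cuphea_cyanea", "0"]),
]) + "\n"


def read_tsv(handle):
    """Read a TSV stream with a header row into a list of dicts.

    ASSUMPTION: fields are not quoted, so quoting is disabled.
    """
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def join_evaluations(evaluations, articles):
    """Attach article language and title to each rating row."""
    articles_by_id = {row["id"]: row for row in articles}
    joined = []
    for ev in evaluations:
        article = articles_by_id.get(ev["unillustratedArticleId"])
        if article is None:
            continue  # skip ratings with no matching article row
        joined.append({
            **ev,
            "langCode": article["langCode"],
            "pageTitle": article["pageTitle"],
            "commonsUrl": COMMONS_PREFIX + ev["resultFilePage"],
            "ratingLabel": RATING_LABELS.get(ev["rating"], "unknown"),
        })
    return joined


def ratings_by_language(joined):
    """Count rows per (language code, rating label) pair."""
    return Counter((row["langCode"], row["ratingLabel"]) for row in joined)


if __name__ == "__main__":
    evaluations = read_tsv(io.StringIO(SAMPLE_EVALUATIONS))
    articles = read_tsv(io.StringIO(SAMPLE_ARTICLES))
    for row in join_evaluations(evaluations, articles):
        print(row["langCode"], row["pageTitle"], row["ratingLabel"],
              row["commonsUrl"])
```

To run this against the real archive, the io.StringIO objects would be replaced by files opened with open(path, encoding="utf-8", newline=""). All values are kept as strings here, so the integer columns would still need converting with int() before any arithmetic.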
Created:
2023-06-07