
Domain expert readability dataset

Source: NIAID Data Ecosystem, indexed 2026-03-11
Download link:
https://zenodo.org/record/2651009
Description:
Judgments gathered from 10 experts through a web-based survey on the readability of publication abstracts. The abstracts were a subset of AMiner's DBLP citation network v10 dataset (https://aminer.org/citation) in the discipline of data and knowledge management. In particular, abstracts containing the following keywords were used: "database", "machine learning", "information retrieval", "data management", "cloud computing", "data mining", "algorithms", "classification", "query processing", "networks", "indexing", "distributed systems".

After reading each abstract, the expert had to answer the following questions on a 5-point scale:
Q1: Please rate how well-written the abstract is.
Q2: Does the abstract contain linguistic errors?
Q3: Please rate how clear the contribution of the paper is (based on the abstract).

For each question, the interpretation of the extreme scale values (i.e., 1 and 5) was provided. In particular, 1 = “very poorly written” / “so many ling. errors that make abstract incomprehensible” / “not clear at all” (Q1/Q2/Q3) and 5 = “excellently written” / “no errors” / “completely clear” (Q1/Q2/Q3). The pairwise correlations (Kendall’s τ) of expert judgments on questions Q1-Q3 are presented in this table.

The dataset is a TSV file that includes the following fields:
user_id: expert identifier
paper_id: AMiner's identifier from the DBLP citation network v10 dataset
rating_1: answer for Q1
rating_2: answer for Q2
rating_3: answer for Q3

Please cite: Thanasis Vergoulis, Ilias Kanellos, Anargiros Tzerefos, Serafeim Chatzopoulos, Theodore Dalamagas, Spiros Skiadopoulos. A study on the readability of scientific publications. 23rd International Conference on Theory and Practice of Digital Libraries. Oslo, Norway 2019 (to appear)
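As a rough illustration of how the fields listed in the description might be used, the sketch below parses rows in the described TSV layout and computes Kendall's τ between two rating columns. Several things here are assumptions rather than facts from the dataset: that the file has a header row with exactly the field names above, that the tie-corrected τ-b variant is appropriate (the description does not say which variant was used), and that correlations are taken between questions across all rows. The sample rows are invented for the example and are not taken from the dataset.

```python
import csv
import io
from itertools import combinations
from math import sqrt

RATING_FIELDS = ("rating_1", "rating_2", "rating_3")

# Invented sample rows in the described layout (NOT real data from the dataset).
SAMPLE_TSV = (
    "user_id\tpaper_id\trating_1\trating_2\trating_3\n"
    "u1\tp1\t4\t5\t4\n"
    "u1\tp2\t2\t3\t2\n"
    "u2\tp1\t5\t5\t4\n"
    "u2\tp3\t3\t4\t3\n"
)


def read_ratings(handle):
    """Parse TSV rows into dicts, converting the three ratings to int.

    Assumes a header row naming the fields; adjust if the real file has none.
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for field in RATING_FIELDS:
            row[field] = int(row[field])
        rows.append(row)
    return rows


def kendall_tau_b(xs, ys):
    """Kendall's tau-b: pairwise rank correlation with a correction for ties."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every count
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


def question_correlations(rows):
    """Tau-b for each pair of rating columns, keyed by (field_a, field_b)."""
    return {
        (a, b): kendall_tau_b([r[a] for r in rows], [r[b] for r in rows])
        for a, b in combinations(RATING_FIELDS, 2)
    }


rows = read_ratings(io.StringIO(SAMPLE_TSV))
correlations = question_correlations(rows)
```

To run this against the actual file, replace `io.StringIO(SAMPLE_TSV)` with an opened file handle, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded TSV was saved. A 5-point scale produces many tied pairs, which is why the sketch uses a tie-corrected variant instead of the plain τ-a.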
Created:
2020-01-21