R4C
Source: arXiv · Updated: 2020-05-02 · Indexed: 2024-06-21
Download link:
https://naoya-i.github.io/r4c/
Overview:
R4C is a dataset created jointly by Tohoku University and RIKEN. It contains 4,588 questions, each paired with 3 reference derivations (13,764 derivations in total). The annotations were collected through a crowdsourcing framework designed to be reliable and scalable. The dataset is intended to evaluate the internal reasoning of reading comprehension systems: R4C requires a system to give not only an answer but also the derivation behind it, which addresses the annotation bias found in existing datasets and the ability of systems to cheat on them. It is suited to evaluating natural language understanding systems, particularly their multi-hop question answering and explanation generation capabilities.
Provided by:
Tohoku University; RIKEN; University College London
Created:
2019-10-10



