RACE (ReAding Comprehension dataset from Examinations)

Name: RACE (ReAding Comprehension dataset from Examinations)
Creator: OpenDataLab
Published: 2026-05-24 04:30:03
License: No description available

OpenDataLab, updated 2026-05-24, indexed 2024-05-09

Download link:

https://opendatalab.org.cn/OpenDataLab/RACE

Resource overview:

The ReAding Comprehension dataset from Examinations (RACE) is a machine reading comprehension dataset of 27,933 passages and 97,867 questions drawn from English exams for Chinese students aged 12 to 18. It consists of two subsets, RACE-M and RACE-H, sourced from middle school and high school exams respectively. RACE-M contains 28,293 questions and RACE-H contains 69,574. Each question comes with four candidate answers, exactly one of which is correct. RACE differs from most machine reading comprehension datasets in how its data was produced: the questions were not generated by heuristics or crowdsourcing but were written by domain experts specifically to test human reading skills.
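The structure described above (a passage, a question, four candidate answers, one correct) can be sketched as a small Python record, together with a check that the two subset counts add up to the stated total. This is only an illustration: the class name `RaceQuestion`, its field names, and the `accuracy` helper are hypothetical and are not taken from the dataset's actual file format.

```python
from __future__ import annotations

from dataclasses import dataclass

# Question counts quoted in the description above.
RACE_M_QUESTIONS = 28_293
RACE_H_QUESTIONS = 69_574
TOTAL_QUESTIONS = 97_867

# Labels for the four candidate answers (an assumed convention).
LETTERS = ("A", "B", "C", "D")


@dataclass(frozen=True)
class RaceQuestion:
    """Hypothetical in-memory record; field names are illustrative only."""

    passage: str
    question: str
    options: tuple[str, ...]
    answer: str  # label of the single correct option

    def __post_init__(self) -> None:
        # Enforce the "four candidates, one correct" property.
        if len(self.options) != len(LETTERS):
            raise ValueError("expected exactly four candidate answers")
        if self.answer not in LETTERS:
            raise ValueError("answer must be one of A, B, C, D")

    def correct_option(self) -> str:
        """Return the text of the correct candidate answer."""
        return self.options[LETTERS.index(self.answer)]


def accuracy(questions: list[RaceQuestion], predictions: list[str]) -> float:
    """Fraction of questions whose predicted label matches the answer key."""
    if len(questions) != len(predictions):
        raise ValueError("one prediction is required per question")
    if not questions:
        return 0.0
    hits = sum(q.answer == p for q, p in zip(questions, predictions))
    return hits / len(questions)


# The two subsets together account for every question in the dataset.
assert RACE_M_QUESTIONS + RACE_H_QUESTIONS == TOTAL_QUESTIONS
```

Because every question has exactly four options, a model that guesses uniformly at random would be expected to score about 25% under this accuracy measure.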

Provider:

OpenDataLab

Created:

2022-04-28

Collection summary

Dataset introduction