MRCEval

Name: MRCEval
Creator: maas
Published: 2025-10-09 16:41:28
License: No description available

ModelScope community. Updated 2025-10-09. Indexed 2025-07-19.

Download link:

https://modelscope.cn/datasets/THU-KEG/MRCEval

Description:

# Dataset Card for MRCEval

MRCEval: A Comprehensive, Challenging and Accessible Machine Reading Comprehension Benchmark, by Shengkun Ma, Hao Peng, Lei Hou and Juanzi Li.

MRCEval is a comprehensive benchmark for machine reading comprehension (MRC) designed to assess the reading comprehension (RC) capabilities of LLMs. It covers 13 sub-tasks with a total of 2.1K high-quality multiple-choice questions.

## Dataset Structure

[More Information Needed]

### Data Instances

An example from the facts_understanding subtask looks as follows:

```
{
  "index": 0,
  "category": "Facts_entity",
  "source": "squad",
  "context": "Super_Bowl_50 The Broncos took an early lead in Super Bowl 50 and never trailed. Newton was limited by Denver's defense, which sacked him seven times and forced him into three turnovers, including a fumble which they recovered for a touchdown. Denver linebacker Von Miller was named Super Bowl MVP, recording five solo tackles, 2½ sacks, and two forced fumbles.",
  "question": "How many fumbles did Von Miller force?",
  "choices": ["two", "four", "three", "one"],
  "answer": "A"
}
```

### Data Fields

- `index`: a number, index of the instance
- `category`: a string, category of the instance
- `source`: a string, source of the instance
- `context`: a string
- `question`: a string
- `choices`: a list of 4 string features
- `answer`: a ClassLabel feature
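The field list above can be turned into a small validation and scoring sketch. The code below is illustrative only: it assumes the `answer` letter indexes into `choices` in order (`A` is the first choice), which matches the sample instance but is not stated by the card. The helper names `answer_text` and `accuracy` are hypothetical and not part of any MRCEval tooling.

```python
# Fields listed in the "Data Fields" section of the card.
REQUIRED_FIELDS = {
    "index", "category", "source", "context", "question", "choices", "answer",
}

# Assumption: answer letters map to positions in `choices` (A -> 0, ..., D -> 3).
LETTER_TO_INDEX = {"A": 0, "B": 1, "C": 2, "D": 3}

# Sample instance copied from the card.
sample = {
    "index": 0,
    "category": "Facts_entity",
    "source": "squad",
    "context": (
        "Super_Bowl_50 The Broncos took an early lead in Super Bowl 50 and "
        "never trailed. Newton was limited by Denver's defense, which sacked "
        "him seven times and forced him into three turnovers, including a "
        "fumble which they recovered for a touchdown. Denver linebacker Von "
        "Miller was named Super Bowl MVP, recording five solo tackles, "
        "2½ sacks, and two forced fumbles."
    ),
    "question": "How many fumbles did Von Miller force?",
    "choices": ["two", "four", "three", "one"],
    "answer": "A",
}


def answer_text(record):
    """Validate a record and return the choice its answer letter points to."""
    missing = REQUIRED_FIELDS - record.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    choices = record["choices"]
    if len(choices) != 4:
        raise ValueError(f"expected 4 choices, got {len(choices)}")
    letter = record["answer"]
    if letter not in LETTER_TO_INDEX:
        raise ValueError(f"unexpected answer label: {letter!r}")
    return choices[LETTER_TO_INDEX[letter]]


def accuracy(records, predicted_letters):
    """Fraction of records whose predicted letter equals the gold answer."""
    if len(records) != len(predicted_letters):
        raise ValueError("records and predictions differ in length")
    if not records:
        return 0.0
    hits = sum(r["answer"] == p for r, p in zip(records, predicted_letters))
    return hits / len(records)


print(answer_text(sample))
```

Checking each record before scoring keeps a malformed instance, such as one with a missing field or an unexpected label, from silently counting as a wrong answer.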


Provider:

maas

Created:

2025-07-15
