EQG-RACE-PLUS

ModelScope community. Updated 2025-10-09. Indexed 2025-03-01.
Download link:
https://modelscope.cn/datasets/voidful/EQG-RACE-PLUS
Resource description:
# Dataset Card for "QGG-RACE Dataset"

## Table of Contents

- Dataset Description
- Dataset Summary
- Supported Tasks and Leaderboards
- Languages
- Dataset Structure
- Data Instances
- Data Fields
- Data Splits
- Dataset Creation
- Curation Rationale
- Source Data
- Annotations
- Personal and Sensitive Information
- Considerations for Using the Data
- Social Impact of Dataset
- Discussion of Biases
- Other Known Limitations
- Additional Information
- Dataset Curators
- Licensing Information
- Citation Information
- Contributions

## Dataset Description

- GitHub Repository: N/A
- Paper: N/A
- Leaderboard: N/A
- Point of Contact: N/A

## Dataset Summary

The QGG-RACE dataset is a subset of RACE containing three types of questions: Factoid, Cloze, and Summarization.

Dataset Download: [GitHub Release](https://github.com/p208p2002/QGG-RACE-dataset/releases/download/v1.0/qgg-dataset.zip)

Data Statistics:

Types | Examples | Train | Dev | Test
------------- | ------------------------------------------ | ----- | ---- | ----
Cloze | Yingying is Wangwang's _ . | 43167 | 2405 | 2462
Factoid | What can Mimi do? | 18405 | 1030 | 944
Summarization | According to this passage we know that _ . | 3004 | 175 | 184

## Supported Tasks and Leaderboards

- Question Generation
- Reading Comprehension
- Text Summarization

## Languages

The dataset is in English.

## Dataset Structure

### Data Instances

An example data instance from the dataset is shown below:

```json
{
  "answers": ["D", "A", "B", "C"],
  "options": [
    [
      "States",
      "Doubts",
      "Confirms",
      "Removes"
    ],
    [
      "shows the kind of male birds females seek out.",
      "indicates the wandering albatross is the most faithful.",
      "is based on Professor Stutchbury's 20 years' research.",
      "suggests that female birds select males near their home."
    ],
    [
      "young birds' quality depends on their feather.",
      "some male birds care for others' young as their own.",
      "female birds go to find males as soon as autumn comes.",
      "female birds are responsible for feeding the hungry babies."
    ],
    [
      "A book about love-birds.",
      "Birds' living habits and love life",
      "The fact that birds don't love their mates forever.",
      "The factors that influence birds to look for another mate."
    ]
  ],
  "questions": [
    "What does the underline word \"dispels\" mean?",
    "The book The Private Lives of Birds _ .",
    "According to the passage, we can infer that _ .",
    "What is the passage mainly about?"
  ],
  "article": "Birds are not as loyal to their partners as you might think ...",
  "id": "high11327.txt",
  "factoid_questions": [
    "What does the underline word \"dispels\" mean?"
  ],
  "cloze_questions": [
    "The book The Private Lives of Birds _ ."
  ],
  "summarization_questions": [
    "According to the passage, we can infer that _ ."
  ]
}
```

### Data Fields

- id: Unique identifier for the example.
- article: The main text passage.
- questions: List of questions related to the passage.
- options: List of answer options for each question, in the same order as questions.
- answers: Label of the correct option for each question (the letters A to D in the example above).
- factoid_questions: List of factoid questions.
- cloze_questions: List of cloze questions.
- summarization_questions: List of summarization questions.

### Data Splits

- Train: Contains 65,576 examples.
- Dev: Contains 3,610 examples.
- Test: Contains 3,590 examples.

## Dataset Creation

### Curation Rationale

The QGG-RACE dataset was created as a subset of RACE, focusing on three types of questions: Factoid, Cloze, and Summarization. It is intended to facilitate research in question generation and reading comprehension.

### Source Data

#### Initial Data Collection and Normalization

The QGG-RACE dataset is derived from the RACE dataset.

#### Who are the source language producers?

The source language producers are the authors of the RACE dataset.

### Annotations

#### Annotation process

The dataset is annotated with questions and their corresponding answer options.

#### Who are the annotators?

The annotators are the authors of the RACE dataset.

### Personal and Sensitive Information

The dataset does not contain any personal or sensitive information.

## Considerations for Using the Data

### Social Impact of Dataset

The QGG-RACE dataset can be used for research in question generation and reading comprehension, leading to improvements in these fields.

### Discussion of Biases

Because the dataset is a subset of RACE, it may inherit some of that dataset's biases.

### Other Known Limitations

No other known limitations.

## Additional Information

### Dataset Curators

The QGG-RACE dataset is curated by the authors of the QGG-RACE dataset GitHub repository.

### Licensing Information

The dataset is released under the [CC BY 4.0 License](https://creativecommons.org/licenses/by/4.0/).

### Citation Information

No citation information is available for the QGG-RACE dataset.

### Contributions

Thanks to @p208p2002 for creating the QGG-RACE dataset.
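
The fields described under Data Fields line up position by position: the i-th entry of questions goes with the i-th entry of options and of answers. The sketch below illustrates one way to resolve each answer letter to its option text and to tag each question with its type, using a trimmed copy of the example instance from Data Instances. It assumes that the letters A to D map to option positions 0 to 3, which the card does not state explicitly; the helper names `question_type` and `resolve_instance` are hypothetical and not part of the dataset or any library.

```python
from typing import Dict, List

# Trimmed copy of the example instance from "Data Instances"
# (the "article" and "id" fields are omitted because they are not needed here).
instance = {
    "answers": ["D", "A", "B", "C"],
    "options": [
        ["States", "Doubts", "Confirms", "Removes"],
        [
            "shows the kind of male birds females seek out.",
            "indicates the wandering albatross is the most faithful.",
            "is based on Professor Stutchbury's 20 years' research.",
            "suggests that female birds select males near their home.",
        ],
        [
            "young birds' quality depends on their feather.",
            "some male birds care for others' young as their own.",
            "female birds go to find males as soon as autumn comes.",
            "female birds are responsible for feeding the hungry babies.",
        ],
        [
            "A book about love-birds.",
            "Birds' living habits and love life",
            "The fact that birds don't love their mates forever.",
            "The factors that influence birds to look for another mate.",
        ],
    ],
    "questions": [
        'What does the underline word "dispels" mean?',
        "The book The Private Lives of Birds _ .",
        "According to the passage, we can infer that _ .",
        "What is the passage mainly about?",
    ],
    "factoid_questions": ['What does the underline word "dispels" mean?'],
    "cloze_questions": ["The book The Private Lives of Birds _ ."],
    "summarization_questions": ["According to the passage, we can infer that _ ."],
}

# Type label -> name of the field that lists questions of that type.
TYPE_FIELDS = {
    "factoid": "factoid_questions",
    "cloze": "cloze_questions",
    "summarization": "summarization_questions",
}


def question_type(inst: Dict, question: str) -> str:
    """Return the type whose list contains the question, or 'untyped'."""
    for label, field in TYPE_FIELDS.items():
        if question in inst.get(field, []):
            return label
    return "untyped"


def resolve_instance(inst: Dict) -> List[Dict[str, str]]:
    """Pair each question with its type and the text of its correct option."""
    rows = []
    for question, options, letter in zip(
        inst["questions"], inst["options"], inst["answers"]
    ):
        # Assumption: "A" is option 0, "B" is option 1, and so on.
        index = ord(letter) - ord("A")
        rows.append(
            {
                "question": question,
                "type": question_type(inst, question),
                "answer": options[index],
            }
        )
    return rows


rows = resolve_instance(instance)
for row in rows:
    print(f"[{row['type']}] {row['question']} -> {row['answer']}")
```

In the example instance the fourth question does not appear in any of the three typed lists, so the sketch labels it "untyped" rather than guessing a category.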

Provider: maas
Created: 2025-03-01