arc-cot

Name: arc-cot
Creator: maas
Published: 2025-12-04 16:15:03
License: No description available

ModelScope community · Updated 2025-12-04 · Indexed 2024-06-08

Download link:

https://modelscope.cn/datasets/AI-ModelScope/arc-cot


Resource overview:

# Augmented ARC-Challenge Dataset with Chain-of-Thought Reasoning

## Dataset Description

This dataset was created by augmenting the train subset of the [AI2 Reasoning Challenge (ARC) dataset](https://allenai.org/data/arc) with chain-of-thought reasoning generated by Google's Gemini Pro language model. The goal is to provide additional context and intermediate reasoning steps to help models better solve the challenging multiple-choice science questions in ARC.

## Dataset Structure

The dataset contains 1,068 training examples with the following features:

- `question` (string): The natural language science question.
- `answer` (string): The correct answer to the question.

## Dataset Creation

The chain-of-thought reasoning for each question-answer pair was generated using Google's Gemini Pro model. The model was given each question and the correct answer, and prompted to provide a detailed chain of reasoning for why that answer is correct. The generated chains of thought aim to break down the reasoning process into clear steps, providing additional context and explanations.

The train split of the ARC-Challenge dataset was used as the base. It contains 1,068 multiple-choice science questions covering topics like physics, chemistry, biology, and earth science. The questions are generally at a 3rd-9th grade level.

## Intended Use

This dataset is intended as a resource for training question answering models to reason about science questions. By providing the intermediate reasoning steps, the hope is that models can learn to reason more effectively and transparently about complex questions.

Potential use cases include:

- Benchmarking question answering models on science reasoning
- Analyzing the types of reasoning required for science QA
- Improving model interpretability by generating reasoning traces
- Studying few-shot learning with in-context chain-of-thought examples

## Limitations and Ethical Considerations

The chains of thought are generated by an AI system and may not always be entirely accurate or complete. They should be viewed as a supplemental learning resource rather than guaranteed perfect reasoning.

Additionally, the underlying ARC-Challenge questions may contain some social biases, as they are drawn from real-world science exams. Users should be aware of potential biases when training on this data.

## Dataset Specs

- Number of examples: 1,068
- Dataset size: 472 KB
- Format: Parquet
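The few-shot use case listed above can be sketched in a few lines. This is a minimal illustration, not the dataset author's tooling: the helper names `validate_record` and `build_few_shot_prompt` are hypothetical, the demo records are made-up placeholders rather than rows from the dataset, and it assumes the `answer` string carries the generated reasoning text, which the card does not state explicitly.

```python
from typing import Iterable, Mapping

# The two string features listed on the dataset card.
REQUIRED_FIELDS = ("question", "answer")


def validate_record(record: Mapping[str, object]) -> bool:
    """Return True if both listed fields are present as non-empty strings."""
    return all(
        isinstance(record.get(field), str) and bool(record[field].strip())
        for field in REQUIRED_FIELDS
    )


def build_few_shot_prompt(
    examples: Iterable[Mapping[str, str]], target_question: str
) -> str:
    """Join worked examples and one unanswered question into a prompt string.

    Records that fail validate_record are skipped silently.
    """
    blocks = [
        f"Question: {ex['question'].strip()}\nAnswer: {ex['answer'].strip()}"
        for ex in examples
        if validate_record(ex)
    ]
    # The final block is left open so the model completes the answer.
    blocks.append(f"Question: {target_question.strip()}\nAnswer:")
    return "\n\n".join(blocks)


# Made-up placeholder records (NOT taken from the dataset), for illustration only.
demo_records = [
    {
        "question": "Which gas do plants absorb for photosynthesis?",
        "answer": (
            "Plants take in carbon dioxide and use light energy to make sugar, "
            "so the answer is carbon dioxide."
        ),
    },
    # Skipped by the prompt builder because the question is empty.
    {"question": "", "answer": "placeholder"},
]

prompt = build_few_shot_prompt(demo_records, "What force pulls objects toward Earth?")
print(prompt)
```

In practice the records would come from the Parquet file rather than an inline list; any loader that yields dictionaries with `question` and `answer` keys should work with these helpers.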


Provider:

maas

Created:

2024-05-09
