Kupasai High-Quality Education Chain-of-Thought (CoT) Dataset: Computer Science (Part 2)
Source: OpenDataLab. Updated: 2026-06-14. Indexed: 2025-12-27.
Download link:
https://opendatalab.org.cn/Kupasai/HighQualityEducationCoTDataset-CS2
Resource overview:
This open-source dataset totals 1 million entries, of which 300,000 are released in the first batch. It covers the core content of three foundational disciplines at the higher-education level. Mathematics includes advanced mathematics, probability theory and mathematical statistics, discrete mathematics, and linear algebra; Physics and Computer Science cover the key and difficult points of each chapter in their respective disciplines. The dataset targets scenarios such as classroom teaching, independent practice, and skill assessment, and is broken down into ability dimensions including concept comprehension, formula derivation, logical analysis, and comprehensive application. It is intended as a foundation for building educational intelligent systems and the reasoning capabilities of large language models (LLMs).
Provider:
Kupasai
Created:
2025-08-12
Collected summary
Dataset introduction

Background and challenges
Background overview
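This dataset focuses on complex problem solving in computer science. It provides short-answer questions, essay questions, and analyses of programming approaches, with the aim of supporting programming instruction and model reasoning training through high-quality data. The data has been rigorously cleaned and is stored in JSON Lines format. Each entry contains a reference answer together with sampled answers and chains of thought generated by large models, and the entries have passed automated evaluation for correctness and logical soundness.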
The above content was collected and summarized by 遇见数据集.
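Because the overview states that the data is stored in JSON Lines format, a minimal Python sketch for reading such a file may be useful. The listing does not document the field names, so the sketch assumes nothing about the schema: it parses one JSON object per line and counts which top-level keys actually occur. The function names `iter_jsonl` and `count_fields` and any file path passed to them are hypothetical, chosen only for illustration.

```python
import json
from collections import Counter
from typing import Iterator


def iter_jsonl(path: str) -> Iterator[dict]:
    """Yield one parsed JSON value per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines between records
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                # Report the offending line so a damaged download is easy to locate.
                raise ValueError(f"line {line_number}: invalid JSON") from exc
            yield record


def count_fields(path: str) -> Counter:
    """Count how often each top-level key appears, to discover the real schema."""
    counts: Counter = Counter()
    for record in iter_jsonl(path):
        if isinstance(record, dict):
            counts.update(record.keys())
    return counts
```

Running `count_fields` on a downloaded file first is one way to find out which keys hold the reference answer, the sampled answers, and the chain of thought before writing any code that depends on specific field names.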



