Kupasai High-Quality Education Chain-of-Thought (CoT) Dataset: Computer Science (Part 1)
Source: OpenDataLab | Updated: 2026-06-14 | Indexed: 2025-12-27
Download link:
https://opendatalab.org.cn/Kupasai/HighQualityEducationCoTDataset-CS1
Description:
This open-source release totals 1 million entries, of which a first batch of 300,000 is now available. It covers core content from three foundational disciplines at the higher-education level. Mathematics includes Advanced Mathematics, Probability Theory and Mathematical Statistics, Discrete Mathematics, and Linear Algebra; Physics and Computer Science cover the key and difficult points of each chapter in their respective fields. The dataset targets classroom teaching, independent practice, and skill assessment, and is organized by competency dimensions such as concept comprehension, formula derivation, logical analysis, and comprehensive application. It is intended as a foundation for building educational intelligent systems and the reasoning capabilities of large language models (LLMs).
Provider:
Kupasai
Created:
2025-08-12
Collected Summary
Dataset Introduction

Background and Challenges
Background Overview
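This is a high-quality educational chain-of-thought dataset focused on distinguishing concepts in computer science. It contains true/false, single-choice, multiple-choice, and fill-in-the-blank questions, and is intended to support teaching and improve the reasoning ability of AI models. The data is stored in JSON Lines format. In addition to the reference answer, each question comes with a chain of thought and sampled answers generated by large language models, and the data has been rigorously cleaned for reliability. It is released under the MIT license and supports practical applications in intelligent education and LLM research and development.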
The summary above was collected and generated by 遇见数据集.
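
Because the dataset is distributed as JSON Lines (one JSON object per line), a first look at it might follow the minimal Python sketch below. The field names used here (question_type, question, answer, cot) and the two sample rows are assumptions made up for illustration; they are not taken from the dataset, so check the actual schema after downloading.

```python
import json
from collections import Counter
from typing import Iterable, Iterator


def iter_records(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per non-empty JSON Lines row."""
    for line in lines:
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)


def count_by_type(records: Iterable[dict], type_key: str = "question_type") -> Counter:
    """Count records per question type.

    `type_key` is a hypothetical field name; replace it with the real one.
    """
    return Counter(rec.get(type_key, "unknown") for rec in records)


# Two made-up rows for illustration only (not real dataset content).
SAMPLE_LINES = [
    '{"question_type": "true_false", "question": "A stack is FIFO.", '
    '"answer": "False", "cot": "A stack removes the newest item first."}',
    "",
    '{"question_type": "single_choice", "question": "Which structure is FIFO?", '
    '"answer": "B", "cot": "A queue removes the oldest item first."}',
]

sample_counts = count_by_type(iter_records(SAMPLE_LINES))
```

For a downloaded file, the same helpers can be fed an open file handle, for example `with open(path, encoding="utf-8") as f: count_by_type(iter_records(f))`, where `path` is wherever the file was saved. Reading line by line keeps memory use flat even for hundreds of thousands of entries.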



