LinkWise Intelligence (临科智华) Education Chain-of-Thought Dataset
Source: OpenDataLab | Updated: 2026-06-07 | Indexed: 2025-12-20
Download link:
https://opendatalab.org.cn/LinkwiseDataTechnology/LinkWise-Education-CoT-Dataset
Resource description:
With the rapid development of large language models (LLMs), multimodal fusion, complex reasoning, and the efficient use of hardware resources have all advanced significantly. Benchmark scores keep rising, and a steady stream of new AI models has been deployed across many areas of daily life. In practice, however, models still hallucinate and produce incoherent or even incorrect answers. The root cause is that, although compute and algorithms keep improving, dataset quality, the most critical factor, has not kept pace, which limits model performance in complex and changing real-world scenarios.
Mathematics, as a core discipline of logical reasoning and problem solving, is an important measure of an LLM's comprehension and depth of reasoning. To improve the logical and reasoning abilities of AI systems, LinkWise Intelligence (临科智华) took the lead in open-sourcing the industry's first dataset for Chinese National College Entrance Examination (Gaokao) mathematics: <Education CoT Dataset (Chinese High School Mathematics)>. The dataset applies the idea of Chain of Thought (CoT). It was built through human-machine collaboration and manual verification, using the LinkWise Data Engine (临科数智引擎) and the joint effort of several hundred data processing engineers. The result is a high-quality mathematics dataset that can effectively improve the mathematical reasoning and thinking abilities of large models.
Provider:
LinkwiseDataTechnology
Created:
2025-06-11
Collected Summary
Dataset Introduction

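Background and Challenges
Background Overview
This is a chain-of-thought (CoT) dataset designed for Chinese Gaokao mathematics. It aims to improve the mathematical reasoning and logical thinking of large language models through high-quality human-machine collaborative construction. It offers a dedicated response to the hallucinations and errors that current AI models produce because of insufficient dataset quality.
The content above was collected and summarized by 遇见数据集.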



