WilhelmH/Defined-Reasoning

Source: Hugging Face. Updated 2026-04-28; indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/WilhelmH/Defined-Reasoning
Description:
Defined-Reasoning is a 1,400-row science reasoning dataset. Each row pairs a GLM 5.1 chain of thought with three structured auxiliary fields produced by a separate annotator pass. The columns are Question (science problem statement), Answer (GLM 5.1 final answer), Reasoning (GLM 5.1 full chain of thought), Difficulty (3 = medium, 4 = hard, 5 = extreme), Summary (condensed chronological retelling of Reasoning), KnowledgeRequired (JSON list of atomic prior facts the reasoning leans on), and AhaMoments (JSON list of {realization, trigger, shift} entries).

The dataset is a random sample stratified by difficulty: 800 medium, 400 hard, and 200 extreme rows. Question, Answer, Reasoning, and Difficulty come from Kassadin88/GLM-5.1-OpenThoughts3-Distill (science split); Summary, KnowledgeRequired, and AhaMoments were generated by DeepSeek V4 Pro.
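As a rough illustration of how the columns described above might be consumed, the sketch below decodes the two JSON-list columns of a single row and maps the numeric difficulty to its label. It rests on several assumptions not confirmed by the listing: that KnowledgeRequired and AhaMoments are stored as JSON-encoded strings, that Difficulty is stored as an integer, and that the dataset exposes a "train" split. The helper names (parse_row, difficulty_counts) and the example row are hypothetical and exist only for this sketch.

```python
import json
from collections import Counter

# Difficulty codes as given in the dataset description.
DIFFICULTY_LABELS = {3: "medium", 4: "hard", 5: "extreme"}


def parse_row(row):
    """Return a copy of one row with the JSON-list columns decoded.

    Assumption: KnowledgeRequired and AhaMoments are JSON-encoded strings.
    Values that are already lists are left untouched.
    """
    parsed = dict(row)
    for column in ("KnowledgeRequired", "AhaMoments"):
        value = parsed[column]
        if isinstance(value, str):
            parsed[column] = json.loads(value)
    parsed["DifficultyLabel"] = DIFFICULTY_LABELS.get(parsed["Difficulty"], "unknown")
    return parsed


def difficulty_counts(rows):
    """Count rows per difficulty label, e.g. to check the stratified split."""
    return Counter(DIFFICULTY_LABELS.get(r["Difficulty"], "unknown") for r in rows)


# Hypothetical row with placeholder content, shaped like the described schema.
example_row = {
    "Question": "placeholder question",
    "Answer": "placeholder answer",
    "Reasoning": "placeholder chain of thought",
    "Difficulty": 4,
    "Summary": "placeholder summary",
    "KnowledgeRequired": '["placeholder fact A", "placeholder fact B"]',
    "AhaMoments": '[{"realization": "r", "trigger": "t", "shift": "s"}]',
}

parsed = parse_row(example_row)
counts = difficulty_counts([example_row])

# Loading the real data needs network access and the `datasets` package;
# the split name "train" is an assumption:
# from datasets import load_dataset
# ds = load_dataset("WilhelmH/Defined-Reasoning", split="train")
# print(difficulty_counts(ds))
```

If the assumptions hold, running difficulty_counts over the full dataset should reproduce the 800 / 400 / 200 split stated in the description, which makes it a quick sanity check after download.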
Provider:
WilhelmH