Kassadin88/GLM-5.1-OpenThoughts3-Distill

Name: Kassadin88/GLM-5.1-OpenThoughts3-Distill
Creator: Kassadin88
Published: 2026-04-28 11:47:02
License: 暂无描述

Hugging Face2026-04-28 更新2026-05-03 收录

下载链接：

https://hf-mirror.com/datasets/Kassadin88/GLM-5.1-OpenThoughts3-Distill

下载链接

链接失效反馈

官方服务：

资源简介：

GLM-5.1-OpenThoughts3-Distill是一个通过GLM-5.1模型从OpenThoughts3-1.2M提示中蒸馏生成的推理数据集，涵盖了科学、代码和数学领域。数据集包含用户问题、模型的思维链（thinking）和最终回答（response）。数据格式为JSON，包括唯一标识符（id）、聊天格式的对话（messages）、提取的思维链内容（thinking）、最终回答（response）、质量评分（judge_scores）、领域分类（category）和难度级别（difficulty）。质量评估使用了Qwen3.6-35B-A3B作为评判模型，对响应质量、推理质量、思维与回答的一致性、格式合规性和指令遵循性进行了评分。数据过滤过程包括错误移除、仅思维内容移除等步骤，最终保留了高质量的数据行。

GLM-5.1-OpenThoughts3-Distill is a reasoning dataset generated by distilling prompts from OpenThoughts3-1.2M using the GLM-5.1 model, covering Science, Code, and Math domains. The dataset includes user questions, the models chain-of-thought (thinking), and the final response. The data format is JSON, containing a unique identifier (id), chat-formatted conversation (messages), extracted chain-of-thought content (thinking), final response (response), quality scores (judge_scores), domain classification (category), and difficulty level (difficulty). Quality evaluation was performed using Qwen3.6-35B-A3B as a judge model, scoring dimensions such as response quality, reasoning quality, thinking-response alignment, format compliance, and instruction following. Data filtering involved removing errors and thinking-only rows, resulting in a high-quality final dataset.

提供机构：

Kassadin88

5,000+

优质数据集

54 个

任务类型

进入经典数据集