five

LLM360/TxT360-3efforts

收藏
Hugging Face2025-12-11 更新2025-12-20 收录
下载链接:
https://hf-mirror.com/datasets/LLM360/TxT360-3efforts
下载链接
链接失效反馈
官方服务:
资源简介:
TxT360-3efforts是一个用于训练语言模型的监督微调(SFT)数据集,通过聊天模板控制三种推理努力(低、中、高)。该数据集包含约1000万份文档和100亿个损失标记,涵盖了数学、编码、通用聊天、STEM推理、指令遵循、工具使用和安全对齐等多个主要类别。所有问题来源均来自许可公开数据集或合成生成,并经过质量过滤、去重和去污染处理。答案大多使用GPT-OSS-120B在低、中、高推理努力水平下重新生成。该数据集已用于K2-V2 LLM的SFT,展示了随着推理努力的增加,生成长度平滑增加和性能改善的特点。

TxT360-3efforts is a supervised fine-tuning (SFT) dataset designed to train language models with three reasoning efforts (low, medium, high) controllable via chat template. The dataset consists of approximately 10 million documents with 10 billion loss tokens, covering nine major categories including mathematics, coding, general chat, STEM reasoning, instruction following, tool use, and safety alignment. All question sources are either collected from permissively licensed public datasets or synthetically generated, and are subsequently quality-filtered, deduplicated, and decontaminated against evaluation benchmarks. The answers are mostly regenerated using GPT-OSS-120B at low, medium and high reasoning effort levels. TxT360-3efforts was used for the SFT of K2-V2 LLM, demonstrating a smooth increase in generation length and improved performance with increasing reasoning effort.
提供机构:
LLM360
5,000+
优质数据集
54 个
任务类型
进入经典数据集
二维码
社区交流群

面向社区/商业的数据集话题

二维码
科研交流群

面向高校/科研机构的开源数据集话题

数据驱动未来

携手共赢发展

商业合作