
TNSA/OpenReasoner-V1

Source: Hugging Face. Updated 2026-04-28. Indexed 2026-05-03.
Download link:
https://hf-mirror.com/datasets/TNSA/OpenReasoner-V1
Resource description:
---
license: mit
language:
- en
metrics:
- accuracy
tags:
- reasoning
- math
- instruction-tuning
- ngen
---

![description](https://huggingface.co/datasets/TNSA/OpenReasoner-V1/resolve/main/banner.png)

# OpenReasoner-V1

OpenReasoner-V1 is a high-fidelity, unified dataset designed for fine-tuning advanced reasoning models. It combines complex mathematical problem-solving with high-quality general instruction data, specifically optimized for state-of-the-art small language models like **NGen-4 Lite**.

## 🚀 Key Features

- **Integrated Math Reasoning**: Includes a curated selection of 220k+ mathematical problems with detailed, step-by-step solutions to improve logical deduction capabilities.
- **Hermes-Core Alignment**: Infused with the OpenHermes-2.5 instruction set to ensure the model maintains excellent general-purpose conversational abilities and broad knowledge.
- **Deep-Thinking Optimization**: Features distilled reasoning trajectories formatted with `<think>` tags, designed to teach models how to "reason-before-acting" in complex scenarios.
- **Unified Format**: All data is provided in a standard multi-turn conversation format, ready for immediate use in SFT (Supervised Fine-Tuning) pipelines.

## 📊 Data Composition

| Component | Source Type | Focus |
| :--- | :--- | :--- |
| **Math-Logic** | OpenR1-Math | Theorem proving, calculus, and multi-step logic. |
| **General Instruction** | OpenHermes-2.5 | Creativity, coding, and general knowledge. |
| **Distilled Reasoning** | High-Fidelity SOTA Distillation | Advanced "Chain of Thought" (CoT) and strategy. |

## 🛠 Usage

This dataset is ideal for training models that require a balance between **high-IQ reasoning** and **low-latency instruction following**. It is the primary training source for the **NGen-4.1** series.

### Formatting

Reasoning samples follow the established `<think>` pattern:
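
A minimal sketch of how a sample in this style might be consumed. It assumes, without confirmation from the dataset card, that the reasoning is wrapped in a single `<think>...</think>` pair and the final answer is the text outside it. The helper name `split_reasoning` and the sample string are hypothetical and are not taken from the dataset.

```python
import re

# Assumption: one <think>...</think> block holds the reasoning;
# everything outside that block is treated as the final answer.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" if no <think> block is found."""
    match = _THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Remove the matched block and keep the remaining text as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hypothetical sample, made up for illustration only.
sample = "<think>2 + 2 is 4, then times 3 is 12.</think>\nThe answer is 12."
reasoning, answer = split_reasoning(sample)
print(reasoning)
print(answer)
```

Separating the two parts this way can be useful when a pipeline needs to mask, inspect, or length-filter the reasoning independently of the answer. The actual field names and tag layout should be checked against the dataset itself.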
Provider:
TNSA