AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1

Name: AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1
Creator: AmanPriyanshu
Published: 2026-03-16 00:28:09
License: No description available

Hugging Face | Updated 2026-03-16 | Indexed 2026-03-29

Download link:

https://hf-mirror.com/datasets/AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1

Overview:

---
license: cdla-permissive-2.0
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- reasoning
- sft
- chain-of-thought
- tool-use
- instruction-following
- agents
- synthetic
size_categories:
- 100K<n<1M
---

# MiniMax-M2.5 Reasoning SFT (Orca AgentInstruct 1M v1)

Reasoning SFT dataset generated by [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) on prompts from the [Stratified K-Means Diverse Instruction-Following 100K-1M](https://huggingface.co/datasets/AmanPriyanshu/stratified-kmeans-diverse-instruction-following-100K-1M) dataset (Orca AgentInstruct subset).

## Format

Each row has three columns:

- **`input`**: list of dicts `[{"role": "...", "content": "..."}, ...]` (conversation turns)
- **`response`**: model-generated response with a `<think>` reasoning block
- **`source`**: task category (creative_content, text_modification, rc, rag, code_, mcq, follow_up, analytical_reasoning, fermi, brain_teaser, text_classification, open_domain_qa, etc.)

## Generation

- Model: MiniMax-M2.5 (8x tensor parallel)
- Temperature: 0.3, top_p: 0.95, top_k: 40, max_tokens: 10000
- Filtered to rows with exactly one `<think>` and one `</think>` tag
- 944,531 rows survived filtering from 1,121,410 raw (84.23% pass rate)

## Source Prompts

Prompts are drawn from the Orca AgentInstruct subset of the Stratified K-Means Diverse Instruction-Following dataset, which provides embedding-based k-means clustered subsets of [microsoft/orca-agentinstruct-1M-v1](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1) with square-root rebalanced sampling across creative writing, coding, reading comprehension, text editing, analytical reasoning, and more. The original dataset is a fully synthetic set of instruction pairs generated by the AgentInstruct framework, using only raw text content publicly available on the Web as seeds.
See [AgentInstruct: Toward Generative Teaching with Agentic Flows](https://arxiv.org/abs/2407.03502) for details.

## Usage

```python
from datasets import load_dataset
import random

# Repository ID as listed for this dataset (includes "minimax").
ds = load_dataset("AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1", split="train")

print(f"Rows: {len(ds)}")
print(f"Columns: {ds.column_names}\n")

# Print three randomly chosen rows with truncated previews.
for n in range(3):
    i = random.randint(0, len(ds) - 1)
    row = ds[i]
    print(f"{'='*80}")
    print(f"ROW {n+1} (index {i})")
    print(f"{'='*80}")
    print(f"\n[source] {row['source']}")
    print(f"\n[input] ({len(row['input'])} turns)")
    for t in row["input"]:
        # Show at most 250 characters of each turn.
        preview = t["content"][:250] + ("..." if len(t["content"]) > 250 else "")
        print(f"  {t['role']}: {preview}")
    resp = row["response"]
    print(f"\n[response] ({len(resp)} chars)")
    # Show at most 1000 characters of the response.
    print(resp[:1000] + ("..." if len(resp) > 1000 else ""))
    print()
```

## Authors

- **Aman Priyanshu**: [LinkedIn](https://www.linkedin.com/in/aman-priyanshu/) | [Twitter](https://x.com/AmanPriyanshu6) | [Website](https://amanpriyanshu.github.io/)
- **Supriti Vijay**: [LinkedIn](https://www.linkedin.com/in/supriti-vijay/) | [Twitter](https://x.com/SupritiVijay) | [Website](https://supritivijay.github.io/)

## Citation

```bibtex
@misc{priyanshu2025stratifiedllm,
  title={{Stratified LLM Subsets: Pre-Training, Instruction-Following, and Reasoning SFT Data at 100K-1M Scale}},
  author={Priyanshu, Aman and Vijay, Supriti},
  year={2025},
  howpublished={\url{https://amanpriyanshu.github.io/Stratified-LLM-Subsets-100K-1M-Scale/}},
}
```

```bibtex
@misc{mitra2024agentinstruct,
  title={AgentInstruct: Toward Generative Teaching with Agentic Flows},
  author={Arindam Mitra and Luciano Del Corro and Guoqing Zheng and Shweti Mahajan and Dany Rouhana and Andres Codas and Yadong Lu and Wei-ge Chen and Olga Vrousgos and Corby Rosset and Fillipe Silva and Hamed Khanpour and Yash Lara and Ahmed Awadallah},
  year={2024},
  eprint={2407.03502},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}
```

## License

CDLA-Permissive-2.0, inherited from the original [microsoft/orca-agentinstruct-1M-v1](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1).
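
The Generation notes say rows were kept only when the response contains exactly one `<think>` and one `</think>` tag. The authors' filtering script is not shown, so the following is only a minimal sketch of what such a check might look like, plus a way to separate the reasoning from the final answer. The helper names `has_single_think_block`, `split_response`, and `pass_rate` are hypothetical and not part of the dataset or any library.

```python
THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def has_single_think_block(response: str) -> bool:
    """Return True if the text has exactly one opening and one closing tag.

    Assumption: this mirrors the filter described in the dataset card;
    the real filter may apply additional checks.
    """
    return response.count(THINK_OPEN) == 1 and response.count(THINK_CLOSE) == 1


def split_response(response: str) -> tuple[str, str]:
    """Split a response into (reasoning, answer), both whitespace-stripped."""
    if not has_single_think_block(response):
        raise ValueError("expected exactly one <think> and one </think> tag")
    start = response.index(THINK_OPEN) + len(THINK_OPEN)
    end = response.index(THINK_CLOSE)
    if end < start:
        raise ValueError("</think> appears before <think>")
    reasoning = response[start:end].strip()
    answer = response[end + len(THINK_CLOSE):].strip()
    return reasoning, answer


def pass_rate(kept: int, total: int) -> float:
    """Percentage of rows that survived filtering, rounded to two decimals."""
    return round(kept / total * 100, 2)


if __name__ == "__main__":
    sample = "<think>Add the two numbers.</think>The sum is 5."
    reasoning, answer = split_response(sample)
    print(reasoning)  # Add the two numbers.
    print(answer)     # The sum is 5.
    # Row counts quoted in the Generation section.
    print(pass_rate(944_531, 1_121_410))  # 84.23
```

Applied to a loaded row, `split_response(row["response"])` would give the reasoning trace and the final answer as separate strings, which can be convenient when training on only one of the two.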

Provided by:

AmanPriyanshu
