AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1
收藏Hugging Face2026-03-16 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/AmanPriyanshu/reasoning-sft-minimax-microsoft-orca-agentinstruct-1M-v1
下载链接
链接失效反馈官方服务:
资源简介:
---
license: cdla-permissive-2.0
task_categories:
- text-generation
- question-answering
language:
- en
tags:
- reasoning
- sft
- chain-of-thought
- tool-use
- instruction-following
- agents
- synthetic
size_categories:
- 100K<n<1M
---
# MiniMax-M2.5 Reasoning SFT (Orca AgentInstruct 1M v1)
Reasoning SFT dataset generated by [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) on prompts from the [Stratified K-Means Diverse Instruction-Following 100K-1M](https://huggingface.co/datasets/AmanPriyanshu/stratified-kmeans-diverse-instruction-following-100K-1M) dataset (Orca AgentInstruct subset).
## Format
Each row has three columns:
- **`input`** — list of dicts `[{"role": "...", "content": "..."}, ...]` (conversation turns)
- **`response`** — model-generated response with `<think>` reasoning block
- **`source`** — task category (creative_content, text_modification, rc, rag, code_, mcq, follow_up, analytical_reasoning, fermi, brain_teaser, text_classification, open_domain_qa, etc.)
## Generation
- Model: MiniMax-M2.5 (8x tensor parallel)
- Temperature: 0.3, top_p: 0.95, top_k: 40, max_tokens: 10000
- Filtered to rows with exactly one `<think>` and one `</think>` tag
- 944,531 rows survived filtering from 1,121,410 raw (84.23% pass rate)
## Source Prompts
Prompts are drawn from the Orca AgentInstruct subset of the Stratified K-Means Diverse Instruction-Following dataset, which provides embedding-based k-means clustered subsets of [microsoft/orca-agentinstruct-1M-v1](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1) with square-root rebalanced sampling across creative writing, coding, reading comprehension, text editing, analytical reasoning, and more.
The original dataset is a fully synthetic set of instruction pairs generated by the AgentInstruct framework, using only raw text content publicly available on the Web as seeds. See [AgentInstruct: Toward Generative Teaching with Agentic Flows](https://arxiv.org/abs/2407.03502) for details.
## Usage
```python
from datasets import load_dataset
import random
ds = load_dataset("AmanPriyanshu/reasoning-sft-microsoft-orca-agentinstruct-1M-v1", split="train")
print(f"Rows: {len(ds)}")
print(f"Columns: {ds.column_names}\n")
for n in range(3):
i = random.randint(0, len(ds) - 1)
row = ds[i]
print(f"{'='*80}")
print(f"ROW {n+1} (index {i})")
print(f"{'='*80}")
print(f"\n[source] {row['source']}")
print(f"\n[input] ({len(row['input'])} turns)")
for t in row["input"]:
preview = t["content"][:250] + ("..." if len(t["content"]) > 250 else "")
print(f" {t['role']}: {preview}")
resp = row["response"]
print(f"\n[response] ({len(resp)} chars)")
print(resp[:1000] + ("..." if len(resp) > 1000 else ""))
print()
```
## Authors
- **Aman Priyanshu** — [LinkedIn](https://www.linkedin.com/in/aman-priyanshu/) | [Twitter](https://x.com/AmanPriyanshu6) | [Website](https://amanpriyanshu.github.io/)
- **Supriti Vijay** — [LinkedIn](https://www.linkedin.com/in/supriti-vijay/) | [Twitter](https://x.com/SupritiVijay) | [Website](https://supritivijay.github.io/)
## Citation
```bibtex
@misc{priyanshu2025stratifiedllm,
title={{Stratified LLM Subsets: Pre-Training, Instruction-Following, and Reasoning SFT Data at 100K-1M Scale}},
author={Priyanshu, Aman and Vijay, Supriti},
year={2025},
howpublished={\url{https://amanpriyanshu.github.io/Stratified-LLM-Subsets-100K-1M-Scale/}},
}
```
```bibtex
@misc{mitra2024agentinstruct,
title={AgentInstruct: Toward Generative Teaching with Agentic Flows},
author={Arindam Mitra and Luciano Del Corro and Guoqing Zheng and Shweti Mahajan and Dany Rouhana and Andres Codas and Yadong Lu and Wei-ge Chen and Olga Vrousgos and Corby Rosset and Fillipe Silva and Hamed Khanpour and Yash Lara and Ahmed Awadallah},
year={2024},
eprint={2407.03502},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
```
## License
CDLA-Permissive-2.0 — inherited from the original [microsoft/orca-agentinstruct-1M-v1](https://huggingface.co/datasets/microsoft/orca-agentinstruct-1M-v1).
提供机构:
AmanPriyanshu



