AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only
收藏Hugging Face2026-03-15 更新2026-03-29 收录
下载链接:
https://hf-mirror.com/datasets/AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only
下载链接
链接失效反馈官方服务:
资源简介:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- reasoning
- sft
- chain-of-thought
- math
- code
- science
size_categories:
- 100K<n<1M
---
# MiniMax-M2.5 Reasoning SFT (Stratified K-Means Diverse Reasoning 1M)
Reasoning SFT dataset generated by [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) on prompts from the [Stratified K-Means Diverse Reasoning 100K-1M](https://huggingface.co/datasets/AmanPriyanshu/stratified-kmeans-diverse-reasoning-100K-1M) dataset.
## Format
Each row has three columns:
- **`input`** — list of dicts `[{"role": "...", "content": "..."}, ...]` (conversation turns)
- **`response`** — model-generated response with `<think>` reasoning block
- **`source`** — task category (math, code, science, chat, safety)
## Generation
- Model: MiniMax-M2.5 (8x tensor parallel)
- Temperature: 0.3, top_p: 0.95, top_k: 40, max_tokens: 10000
- Filtered to rows with exactly one `<think>` and one `</think>` tag
## Source Prompts
Prompts are drawn from the Stratified K-Means Diverse Reasoning dataset, which provides embedding-based k-means clustered subsets of [NVIDIA's Llama-Nemotron Post-Training Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset) with square-root rebalanced sampling across math, code, science, chat, and safety categories.
## Authors
- **Aman Priyanshu** — [LinkedIn](https://www.linkedin.com/in/aman-priyanshu/) | [Twitter](https://x.com/AmanPriyanshu6) | [Website](https://amanpriyanshu.github.io/)
- **Supriti Vijay** — [LinkedIn](https://www.linkedin.com/in/supriti-vijay/) | [Twitter](https://x.com/SupritiVijay) | [Website](https://supritivijay.github.io/)
## Usage
```python
from datasets import load_dataset
import random
ds = load_dataset("AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only", split="train")
print(f"Rows: {len(ds)}")
print(f"Columns: {ds.column_names}\n")
for n in range(3):
i = random.randint(0, len(ds) - 1)
row = ds[i]
print(f"{'='*80}")
print(f"ROW {n+1} (index {i})")
print(f"{'='*80}")
print(f"\n[source] {row['source']}")
print(f"\n[input] ({len(row['input'])} turns)")
for t in row["input"]:
preview = t["content"][:250] + ("..." if len(t["content"]) > 250 else "")
print(f" {t['role']}: {preview}")
resp = row["response"]
print(f"\n[response] ({len(resp)} chars)")
print(resp[:1000] + ("..." if len(resp) > 1000 else ""))
print()
```
## Citation
```bibtex
@misc{priyanshu2025stratifiedllm,
title={{Stratified LLM Subsets: Pre-Training, Instruction-Following, and Reasoning SFT Data at 100K-1M Scale}},
author={Priyanshu, Aman and Vijay, Supriti},
year={2025},
howpublished={\url{https://amanpriyanshu.github.io/Stratified-LLM-Subsets-100K-1M-Scale/}},
}
```
## License
CC BY 4.0 — inherited from the original [Llama-Nemotron Post-Training Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset).
提供机构:
AmanPriyanshu



