
AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only

Source: Hugging Face | Updated: 2026-03-15 | Indexed: 2026-03-29
Download link:
https://hf-mirror.com/datasets/AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only
Resource description:
---
license: cc-by-4.0
task_categories:
- text-generation
language:
- en
tags:
- reasoning
- sft
- chain-of-thought
- math
- code
- science
size_categories:
- 100K<n<1M
---

# MiniMax-M2.5 Reasoning SFT (Stratified K-Means Diverse Reasoning 1M)

Reasoning SFT dataset generated by [MiniMaxAI/MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) on prompts from the [Stratified K-Means Diverse Reasoning 100K-1M](https://huggingface.co/datasets/AmanPriyanshu/stratified-kmeans-diverse-reasoning-100K-1M) dataset.

## Format

Each row has three columns:

- **`input`**: list of dicts `[{"role": "...", "content": "..."}, ...]` (conversation turns)
- **`response`**: model-generated response with `<think>` reasoning block
- **`source`**: task category (math, code, science, chat, safety)

## Generation

- Model: MiniMax-M2.5 (8x tensor parallel)
- Temperature: 0.3, top_p: 0.95, top_k: 40, max_tokens: 10000
- Filtered to rows with exactly one `<think>` and one `</think>` tag

## Source Prompts

Prompts are drawn from the Stratified K-Means Diverse Reasoning dataset, which provides embedding-based k-means clustered subsets of [NVIDIA's Llama-Nemotron Post-Training Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset) with square-root rebalanced sampling across math, code, science, chat, and safety categories.
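
The filter described under Generation (exactly one `<think>` and one `</think>` tag per response) can be checked on the consumer side, and the same tags make it possible to separate the reasoning trace from the final answer. The following is a minimal sketch, assuming the reasoning block comes before the answer text. The helper names `has_single_think_block` and `split_reasoning` are hypothetical and are not part of the dataset or of any library.

```python
from typing import Tuple

THINK_OPEN = "<think>"
THINK_CLOSE = "</think>"


def has_single_think_block(response: str) -> bool:
    """Return True if the response has exactly one opening and one closing think tag."""
    return response.count(THINK_OPEN) == 1 and response.count(THINK_CLOSE) == 1


def split_reasoning(response: str) -> Tuple[str, str]:
    """Split a response into (reasoning, answer).

    Assumption: the single <think>...</think> block precedes the answer.
    Raises ValueError if the tags are missing, repeated, or out of order.
    """
    if not has_single_think_block(response):
        raise ValueError("expected exactly one <think> and one </think> tag")
    start = response.index(THINK_OPEN) + len(THINK_OPEN)
    end = response.index(THINK_CLOSE)
    if end < start:
        raise ValueError("</think> appears before <think>")
    reasoning = response[start:end].strip()
    answer = response[end + len(THINK_CLOSE):].strip()
    return reasoning, answer


# Hypothetical example string, not taken from the dataset.
example = "<think>2 + 2 is 4.</think>The answer is 4."
reasoning, answer = split_reasoning(example)
print(reasoning)  # 2 + 2 is 4.
print(answer)     # The answer is 4.
```

Applied to a loaded row, `split_reasoning(row["response"])` would return the two parts separately, which can be useful when training only on the final answer or when inspecting reasoning length.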

## Authors

- **Aman Priyanshu**: [LinkedIn](https://www.linkedin.com/in/aman-priyanshu/) | [Twitter](https://x.com/AmanPriyanshu6) | [Website](https://amanpriyanshu.github.io/)
- **Supriti Vijay**: [LinkedIn](https://www.linkedin.com/in/supriti-vijay/) | [Twitter](https://x.com/SupritiVijay) | [Website](https://supritivijay.github.io/)

## Usage

```python
from datasets import load_dataset
import random

# Load the full training split from the Hugging Face Hub.
ds = load_dataset("AmanPriyanshu/reasoning-sft-minimax-stratified-kmeans-diverse-reasoning-842K-only", split="train")

print(f"Rows: {len(ds)}")
print(f"Columns: {ds.column_names}\n")

# Print three randomly chosen rows with truncated previews.
for n in range(3):
    i = random.randint(0, len(ds) - 1)
    row = ds[i]
    print(f"{'='*80}")
    print(f"ROW {n+1} (index {i})")
    print(f"{'='*80}")
    print(f"\n[source] {row['source']}")
    print(f"\n[input] ({len(row['input'])} turns)")
    for t in row["input"]:
        # Show at most 250 characters of each conversation turn.
        preview = t["content"][:250] + ("..." if len(t["content"]) > 250 else "")
        print(f"  {t['role']}: {preview}")
    resp = row["response"]
    print(f"\n[response] ({len(resp)} chars)")
    # Show at most 1000 characters of the model response.
    print(resp[:1000] + ("..." if len(resp) > 1000 else ""))
    print()
```

## Citation

```bibtex
@misc{priyanshu2025stratifiedllm,
  title={{Stratified LLM Subsets: Pre-Training, Instruction-Following, and Reasoning SFT Data at 100K-1M Scale}},
  author={Priyanshu, Aman and Vijay, Supriti},
  year={2025},
  howpublished={\url{https://amanpriyanshu.github.io/Stratified-LLM-Subsets-100K-1M-Scale/}},
}
```

## License

CC BY 4.0, inherited from the original [Llama-Nemotron Post-Training Dataset](https://huggingface.co/datasets/nvidia/Llama-Nemotron-Post-Training-Dataset).

Provider:
AmanPriyanshu