miromind-ai/MiroMind-M1-SFT-719K

Name: miromind-ai/MiroMind-M1-SFT-719K
Creator: miromind-ai
Published: 2025-07-22 01:44:22
License: no description available

Source: Hugging Face (updated 2025-07-22, indexed 2025-08-09)

Download link:

https://hf-mirror.com/datasets/miromind-ai/MiroMind-M1-SFT-719K
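As a minimal sketch of how the data at this link might be inspected, the snippet below streams a few records through the mirror with the Hugging Face `datasets` library. The helper names `dataset_page_url` and `peek` are hypothetical, and the split name "train" is an assumption; this listing does not document the dataset's splits or columns.

```python
import os

REPO_ID = "miromind-ai/MiroMind-M1-SFT-719K"
MIRROR = "https://hf-mirror.com"  # mirror endpoint taken from the download link


def dataset_page_url(repo_id: str = REPO_ID, endpoint: str = MIRROR) -> str:
    """Build the dataset page URL for a Hub-compatible endpoint."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def peek(n: int = 3, split: str = "train"):
    """Stream the first n records without downloading the whole dataset.

    Assumption: a split named "train" exists; adjust if the dataset differs.
    """
    # HF_ENDPOINT has to be set before the Hugging Face libraries are imported.
    os.environ.setdefault("HF_ENDPOINT", MIRROR)
    from datasets import load_dataset  # third-party: pip install datasets

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(stream.take(n))
```

Calling `peek()` requires network access and returns a list of record dictionaries; printing `sorted(record.keys())` for one of them is a quick way to discover the actual column names before relying on any of them.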

Description:

MiroMind-M1 is a fully open-source series of reasoning language models built on Qwen-2.5, focused on advancing mathematical reasoning. It is trained through supervised fine-tuning (SFT) on 719K curated problems and reinforcement learning with verifiable rewards (RLVR) on 62K challenging examples, using a context-aware multi-stage policy optimization method (CAMPO). MiroMind-M1 achieves state-of-the-art performance among open-source 7B Qwen-2.5-based models on AIME24, AIME25, and MATH500.

Provider:

miromind-ai
