qingy2024/NuminaMath-282k-GRPO
Source: Hugging Face. Updated 2025-02-09. Indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/qingy2024/NuminaMath-282k-GRPO
Description:
NuminaMath 282k GRPO is a cleaned, verifiable dataset of math problems and answers. It is based on AI-MO/NuminaMath-CoT and partially derived from flatlander1024/numinamath_verifiable_cleaned. It was built by keeping only the responses that can be converted to SymPy expressions, which makes it suitable for training with reinforcement learning methods such as GRPO.
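The filtering idea described above, and why it matters for GRPO-style training, can be illustrated with a minimal sketch. This is not the dataset author's actual pipeline: the helper names (`to_sympy`, `is_verifiable`, `reward`) and the `answer` field are hypothetical, and `parse_expr` is used here only as one plausible way to test whether a response converts to SymPy.

```python
from sympy import simplify
from sympy.parsing.sympy_parser import parse_expr


def to_sympy(answer: str):
    """Return a SymPy object for the string, or None if it cannot be parsed.

    parse_expr evaluates the string, so only use it on trusted input.
    """
    try:
        return parse_expr(answer)
    except Exception:  # parsing can fail with several exception types
        return None


def is_verifiable(answer: str) -> bool:
    """Filter predicate: keep a record only if its answer converts to SymPy."""
    return to_sympy(answer) is not None


def reward(predicted: str, reference: str) -> float:
    """Rule-based reward: 1.0 if the two answers are symbolically equal."""
    pred, ref = to_sympy(predicted), to_sympy(reference)
    if pred is None or ref is None:
        return 0.0
    try:
        return 1.0 if simplify(pred - ref) == 0 else 0.0
    except Exception:  # e.g. operands that do not support subtraction
        return 0.0


# Hypothetical records; the field name "answer" is an assumption.
records = [{"answer": "2*x + 2"}, {"answer": "1/"}]
kept = [r for r in records if is_verifiable(r["answer"])]
```

Because every kept answer parses to a symbolic object, a model output can be scored automatically by symbolic comparison rather than by string matching, which is the kind of verifiable reward signal GRPO-style training relies on.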
Provider:
qingy2024



