virtuoussy/Math-RLVR
Source: Hugging Face. Updated 2025-04-02; indexed 2025-04-12.
Download link:
https://hf-mirror.com/datasets/virtuoussy/Math-RLVR
Description:
This dataset contains 773k Chinese question-answer (QA) pairs spanning three educational levels: elementary, middle, and high school. It is used in the research paper "Expanding RL with Verifiable Rewards Across Diverse Domains". The answers are typically free-form and interwoven with explanations or sub-questions, which makes the dataset suitable for testing how well rule-based reward functions handle unstructured answers.
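To illustrate why free-form answers are hard for rule-based rewards, here is a minimal sketch of an exact-match reward. The function names `normalize` and `exact_match_reward` and the example strings are hypothetical and made up for illustration. They are not taken from the dataset or from the paper, and the paper's actual reward design may differ.

```python
import re
import unicodedata


def normalize(text: str) -> str:
    """NFKC-normalize (folds full-width characters), drop whitespace, lowercase."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", "", text).lower()


def exact_match_reward(prediction: str, reference: str) -> float:
    """Hypothetical rule-based reward: 1.0 on a normalized exact match, else 0.0."""
    return 1.0 if normalize(prediction) == normalize(reference) else 0.0


# Made-up examples (assumptions, not dataset rows).
short_reference = "12"
free_form_reference = "x = 12, because 3x = 36"

# A short, structured reference is matched after normalization.
print(exact_match_reward(" 12 ", short_reference))

# A correct prediction gets no reward when the reference mixes in an explanation.
print(exact_match_reward("12", free_form_reference))
```

Under these assumptions, the first call scores a match and the second does not, even though both predictions state the same value. That gap is the kind of failure the description refers to when it mentions unstructured answers.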
Provider:
virtuoussy



