AmanPriyanshu/rlvr-guru-raw-data-extended
Source: Hugging Face. Updated 2025-10-20. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/AmanPriyanshu/rlvr-guru-raw-data-extended
Description:
RLVR GURU Extended is a cross-domain reasoning dataset with 150,000 training samples and 221,332 test samples spanning reasoning-intensive domains. It extends the GURU dataset with additional STEM reasoning domains while keeping the quality standards required for reinforcement learning. The dataset is designed to be compatible with the Reasoning360 VERL reward scoring framework and includes domain-specific reward functions with automated verification. Data is stored in Parquet format with standardized fields for unified reward computation, plus additional metadata for domain-specific reward computation, and is compatible with RLVR training pipelines.
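Because the listing does not name the standardized fields, a quick way to see them is to stream one record per split and print its field names and value types. The sketch below is a minimal example, not an official loader: it assumes the Hugging Face `datasets` library is installed and that the repository loads without an explicit configuration name. The helpers `summarize_example` and `peek` are hypothetical names introduced here for illustration.

```python
REPO_ID = "AmanPriyanshu/rlvr-guru-raw-data-extended"


def summarize_example(example: dict) -> dict:
    """Map each field name to the Python type name of its value."""
    return {name: type(value).__name__ for name, value in example.items()}


def peek(repo_id: str = REPO_ID) -> None:
    """Print the fields of the first record in every split (needs network access)."""
    # Imported lazily so summarize_example works without the library installed.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of downloading
    # every Parquet shard up front.
    splits = load_dataset(repo_id, streaming=True)
    for split_name, split in splits.items():
        first = next(iter(split))
        print(split_name, summarize_example(first))
```

Calling `peek()` prints one line per split. If the repository defines several configurations, `load_dataset` will ask for a configuration name, which would then have to be passed as its second argument.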
Provider:
AmanPriyanshu



