Ring-lite-rl-data

ModelScope community. Updated 2026-01-08; indexed 2025-06-21.
Download link:
https://modelscope.cn/datasets/inclusionAI/Ring-lite-rl-data
<p align="center"> <img src="https://huggingface.co/inclusionAI/Ling-lite/resolve/main/ant-bailing.png" width="100"/> </p>

<p align="center"> 🤗 <a href="https://huggingface.co/inclusionAI">Hugging Face</a> 🤖 <a href="https://modelscope.cn/organization/inclusionAI">ModelScope</a> 🖥️ <a href="https://github.com/inclusionAI/Ring">GitHub</a> </p>

# Ring-lite-rl-data

This dataset is a curated subset of high-quality problems across mathematics and code domains designed for reinforcement learning in the [Ring-lite](https://modelscope.cn/models/inclusionAI/Ring-lite) model. This dataset contains:

* **Mathematics**: Over 39,000 rigorously curated problems sourced from:
  - Open-source datasets (BigMath, DeepScaleR, DAPO, DeepMath-103K)
  - Art of Problem Solving (AoPS) contest collections
* **Code**: Approximately 8,400 verified coding problems from:
  - Programming competition resources (CodeContest, TACO, APPS)
  - All problems include validated "Accepted" solutions and test cases

**Note**: Only a partial subset of the complete dataset is publicly released due to third-party data licensing restrictions and procurement agreements. The published portion has been carefully selected to comply with all copyright requirements while maintaining research utility.

## Dataset Construction

### Data Sources

- **Mathematics**: Problems collected from open-source datasets, filtered through strict quality control
- **Code**: Problems from open-source programming competition resources with verified solutions

### Curation Pipeline

Our data undergoes a rigorous three-stage curation process:

1. **Data Cleansing**:
   - Removal of problems with invalid characters, images, or multiple subquestions
   - Strict character-based and semantic-based deduplication
   - Exclusion of easily guessable problems (multiple-choice, True/False questions)
2. **Answer Verification**:
   - LLM-based verification using models of different sizes
   - Human expert annotation
   - Problems failing verification are excluded
3. **Data Annotation**:
   - Multi-dimensional labeling (source, educational level, domain knowledge)
   - Mathematical Subject Classification (MSC) for math problems
   - Model-aware difficulty assessment

## Dataset Fields

The dataset contains the following fields for each domain:

### Mathematics

- **context**: The problem statement
- **groundtruth**: Verified correct answer
- **type**: Problem category
- **mid**: Unique problem ID
- **subject**: Discipline

### Code

- **context**: Detailed programming problem description
- **groundtruth**: Verified correct Python solution code
- **groundtruth_language**: Implementation language
- **type**: Problem category
- **code_test_cases**: List of validated test cases with:
  - **input**: Test input
  - **output**: Expected output
- **dataset**: Source dataset
- **code_language**: Programming language
- **difficulty**: Problem difficulty score
- **mid**: Unique problem ID

## Citation Information

**Please consider citing our technical report [Ring-lite](https://arxiv.org/abs/2506.14731) if you use this dataset:**

```
@misc{ringteam2025ringlitescalablereasoningc3postabilized,
  title={Ring-lite: Scalable Reasoning via C3PO-Stabilized Reinforcement Learning for LLMs},
  author={Ling Team and Bin Hu and Cai Chen and Deng Zhao and Ding Liu and Dingnan Jin and Feng Zhu and Hao Dai and Hongzhi Luan and Jia Guo and Jiaming Liu and Jiewei Wu and Jun Mei and Jun Zhou and Junbo Zhao and Junwu Xiong and Kaihong Zhang and Kuan Xu and Lei Liang and Liang Jiang and Liangcheng Fu and Longfei Zheng and Qiang Gao and Qing Cui and Quan Wan and Shaomian Zheng and Shuaicheng Li and Tongkai Yang and Wang Ren and Xiaodong Yan and Xiaopei Wan and Xiaoyun Feng and Xin Zhao and Xinxing Yang and Xinyu Kong and Xuemin Yang and Yang Li and Yingting Wu and Yongkang Liu and Zhankai Xu and Zhenduo Zhang and Zhenglei Zhou and Zhenyu Huang and Zhiqiang Zhang and Zihao Wang and Zujie Wen},
  year={2025},
  eprint={2506.14731},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2506.14731},
}
```

## Intended Usage

This dataset is designed for:

- Training and evaluating LLMs on multi-domain reasoning tasks
- Reinforcement learning applications
- Benchmarking model performance across mathematics and code domains

## Release Date

06/20/2025

## Data Version

1.0
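
The Data Cleansing stage above mentions character-based deduplication without describing how it is done. As a rough illustration only, the sketch below shows one common approach: normalize each problem's `context` text and keep the first record for each normalized form. The helper names (`normalize`, `dedup_by_context`) and the normalization rules are assumptions made for this example, not the authors' pipeline, and semantic deduplication is not covered.

```python
import hashlib
import re
import unicodedata
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Hypothetical normalization: Unicode NFKC, lowercase, collapsed whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    return re.sub(r"\s+", " ", text).strip()


def dedup_by_context(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first record for each distinct normalized `context` string."""
    seen = set()
    kept = []
    for rec in records:
        # Hash the normalized text so the set stores fixed-size keys.
        key = hashlib.sha256(normalize(rec["context"]).encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(rec)
    return kept


# Made-up records that use the documented `context` and `mid` field names.
problems = [
    {"mid": "a", "context": "What is 2+2?"},
    {"mid": "b", "context": "  what is  2+2? "},
    {"mid": "c", "context": "What is 3+3?"},
]
unique_problems = dedup_by_context(problems)
```

Exact matching after normalization only catches near-verbatim repeats; paraphrased duplicates would need a similarity-based method.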

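
The Code fields listed under Dataset Fields can be pictured as a small typed record. The sketch below is an assumption-laden illustration, not the dataset's official loader: the class and function names are hypothetical, the types of `difficulty` and `mid` are guesses, and it assumes each test case's `input` is fed on standard input and `output` is compared with standard output, which the dataset card does not state.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List


@dataclass
class TestCase:
    input: str   # test input (assumed to be supplied on stdin)
    output: str  # expected output (assumed to be compared with stdout)


@dataclass
class CodeRecord:
    context: str
    groundtruth: str
    groundtruth_language: str
    type: str
    code_test_cases: List[TestCase]
    dataset: str
    code_language: str
    difficulty: float  # assumed numeric; the card only says "difficulty score"
    mid: str           # assumed string ID


def parse_code_record(raw: dict) -> CodeRecord:
    """Build a CodeRecord from a plain dict that uses the documented field names."""
    cases = [TestCase(c["input"], c["output"]) for c in raw["code_test_cases"]]
    return CodeRecord(
        context=raw["context"],
        groundtruth=raw["groundtruth"],
        groundtruth_language=raw["groundtruth_language"],
        type=raw["type"],
        code_test_cases=cases,
        dataset=raw["dataset"],
        code_language=raw["code_language"],
        difficulty=raw["difficulty"],
        mid=raw["mid"],
    )


def passes_all_cases(record: CodeRecord, timeout_s: float = 5.0) -> bool:
    """Run the Python groundtruth on every test case and compare trimmed output."""
    for case in record.code_test_cases:
        try:
            result = subprocess.run(
                [sys.executable, "-c", record.groundtruth],
                input=case.input,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            return False
        if result.returncode != 0 or result.stdout.strip() != case.output.strip():
            return False
    return True


# Made-up record for illustration; it is not taken from the dataset.
example = {
    "context": "Read two integers and print their sum.",
    "groundtruth": "a, b = map(int, input().split())\nprint(a + b)",
    "groundtruth_language": "python",
    "type": "code",
    "code_test_cases": [{"input": "1 2\n", "output": "3\n"}],
    "dataset": "made-up",
    "code_language": "python",
    "difficulty": 1.0,
    "mid": "example-0",
}
record = parse_code_record(example)
```

Calling `passes_all_cases(record)` executes the stored solution in a subprocess, so real dataset entries are best run inside a sandbox.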
Provider:
maas
Created:
2025-06-19