LLM360/guru-RL-92k-extra-info-compressed
Source: Hugging Face | Updated: 2025-06-25 | Indexed: 2025-07-05
Download link:
https://hf-mirror.com/datasets/LLM360/guru-RL-92k-extra-info-compressed
Description:
Guru is a curated dataset spanning six reasoning-intensive domains (math, coding, science, logic, simulation, and tabular reasoning). It is intended to supply large language models (LLMs) with high-quality samples for complex reasoning tasks and reinforcement learning (RL) training. Samples pass through a five-stage curation pipeline for quality assurance, and the dataset includes domain-specific reward functions for reliable evaluation.
Provider:
LLM360
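
Loading example:
The listing gives only a download link, so the following is a minimal sketch of one way to fetch the dataset, not an official recipe. It assumes the Hugging Face `datasets` package is installed and that the repository loads with its default configuration; neither is confirmed by this page. The helper name `load_guru` and the constants are hypothetical, introduced only for illustration.

```python
import os

# Repository id and mirror host, both taken from the download link above.
REPO_ID = "LLM360/guru-RL-92k-extra-info-compressed"
MIRROR = "https://hf-mirror.com"
DATASET_URL = f"{MIRROR}/datasets/{REPO_ID}"


def load_guru(use_mirror: bool = True):
    """Hypothetical helper: download the dataset and return its splits.

    Assumes `pip install datasets` and a loadable default configuration.
    """
    if use_mirror:
        # HF_ENDPOINT redirects Hugging Face Hub traffic to the mirror.
        # It is read when the Hub library is first imported, so set it
        # first; setdefault keeps any value the user already exported.
        os.environ.setdefault("HF_ENDPOINT", MIRROR)
    # Imported lazily so the endpoint override above takes effect.
    from datasets import load_dataset

    return load_dataset(REPO_ID)
```

Calling `load_guru()` and printing the result should show which splits and columns the repository actually defines; this page does not list them, so inspect them before relying on any field name.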



