LLM360/guru-RL-92k

Name: LLM360/guru-RL-92k
Creator: LLM360
Published: 2025-08-20 18:16:52
License: No description available

Source: Hugging Face | Updated: 2025-08-20 | Indexed: 2025-07-05

Download link:

https://hf-mirror.com/datasets/LLM360/guru-RL-92k
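A minimal sketch of fetching the repository files through the mirror host shown in the link above. It assumes the `huggingface_hub` package is installed and that the mirror honors the `HF_ENDPOINT` environment variable; the helper names `dataset_page_url` and `download_guru` and the default local directory are hypothetical and chosen only for illustration. The file layout inside the repository is not described on this page, so the sketch downloads the whole snapshot rather than assuming particular splits.

```python
import os

REPO_ID = "LLM360/guru-RL-92k"
MIRROR_ENDPOINT = "https://hf-mirror.com"  # host taken from the download link above


def dataset_page_url(repo_id: str = REPO_ID, endpoint: str = MIRROR_ENDPOINT) -> str:
    """Build the dataset page URL for a given endpoint (hypothetical helper)."""
    return f"{endpoint.rstrip('/')}/datasets/{repo_id}"


def download_guru(local_dir: str = "guru-RL-92k") -> str:
    """Download every file in the dataset repository and return the local path.

    Assumptions: huggingface_hub is installed, and it has not been imported
    earlier in the process. HF_ENDPOINT is read when the library is imported,
    so it is set here before the import below.
    """
    os.environ.setdefault("HF_ENDPOINT", MIRROR_ENDPOINT)
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",  # the repository is a dataset, not a model
        local_dir=local_dir,
    )


if __name__ == "__main__":
    print(dataset_page_url())
    # Uncomment to start the download (requires network access):
    # print(download_guru())
```

Downloading the raw snapshot avoids guessing at configuration or split names; once the files are local, they can be inspected before choosing a loader.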


Description:

Guru is a curated six-domain dataset for training large language models (LLMs) to perform complex reasoning with reinforcement learning (RL). It contains 91.9K high-quality samples spanning six reasoning-intensive domains: math, coding, science, logic, simulation, and tabular reasoning. The samples pass through a five-stage curation pipeline intended to ensure both domain diversity and reward verifiability, with the goal of improving the cross-domain reasoning capabilities of LLMs.

Provider:

LLM360
