Dolci-RLZero-General-7B

Name: Dolci-RLZero-General-7B
Creator: maas
Published: 2026-01-02 16:55:33
License: No description provided

ModelScope Community. Updated: 2026-01-02. Indexed: 2025-11-29.

Download link:

https://modelscope.cn/datasets/allenai/Dolci-RLZero-General-7B

Resource summary:

# Dolci-RL-Zero-General-7B

## Dataset Summary

**Dolci-RL-Zero-General-7B** is the reinforcement learning dataset used to train the *Olmo3-RL-Zero-7B-General* model. It contains **12,841** general chat prompts sampled from the larger Dolci-Think-RL mixture. The reward was derived using an LM judge.

---

## Downloading

You can download and load this data using HuggingFace's `datasets` library with the following code:

```python
from datasets import load_dataset

dataset = load_dataset("allenai/Dolci-RL-Zero-General-7B", split="train")
```

### Licensing Information

This dataset is licensed under ODC-BY. It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use).

### Citation

Technical manuscript coming soon!
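As a follow-up to the loading snippet in the Downloading section, here is a minimal sketch for checking what was loaded. The card does not document the column names, so the code discovers them at runtime and assumes none. The helpers `load_dolci` and `column_summary` are hypothetical names introduced for illustration and are not part of the `datasets` library.

```python
from collections import Counter


def load_dolci(split="train"):
    """Load the split with the repository id given in the dataset card."""
    # Imported lazily so column_summary can be used without `datasets` installed.
    from datasets import load_dataset

    return load_dataset("allenai/Dolci-RL-Zero-General-7B", split=split)


def column_summary(rows, limit=100):
    """Count how often each column name appears in the first `limit` rows.

    `rows` can be any iterable of dict-like records, such as a loaded split.
    """
    counts = Counter()
    for index, row in enumerate(rows):
        if index >= limit:
            break
        counts.update(row.keys())
    return dict(counts)


if __name__ == "__main__":
    dataset = load_dolci()
    # The card states 12,841 prompts; compare against the actual row count.
    print("rows:", len(dataset))
    print("columns seen:", column_summary(dataset))
```

Running the script requires network access and the `datasets` package. `column_summary` can also be tried on a plain list of dictionaries.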

Provider: maas

Created: 2025-11-21
