Dolci-RLZero-General-7B
Source: ModelScope community. Updated 2026-01-02; indexed 2025-11-29.
Download link:
https://modelscope.cn/datasets/allenai/Dolci-RLZero-General-7B
Resource description:
# Dolci-RL-Zero-General-7B
## Dataset Summary
**Dolci-RL-Zero-General-7B** is the reinforcement learning dataset used to train the *Olmo3-RL-Zero-7B-General* model.
It contains **12,841** general chat prompts sampled from the larger Dolci-Think-RL mixture. The reward was derived using an LM judge.
---
## Downloading
You can download and load this dataset with Hugging Face's `datasets` library:
```python
from datasets import load_dataset
dataset = load_dataset("allenai/Dolci-RL-Zero-General-7B", split="train")
```
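This card does not list the dataset's columns, so it can help to inspect the loaded rows before relying on any field. The following is a minimal sketch, not part of the official instructions: `summarize_columns` is a hypothetical helper, and the field names in `example_rows` are made-up stand-ins rather than the dataset's actual schema.

```python
from collections import defaultdict
from itertools import islice


def summarize_columns(rows, limit=100):
    """Map each column name to the sorted Python type names seen in the first `limit` rows."""
    seen = defaultdict(set)
    for row in islice(rows, limit):
        for name, value in row.items():
            seen[name].add(type(value).__name__)
    return {name: sorted(types) for name, types in seen.items()}


# Stand-in rows with hypothetical field names, used only to exercise the helper.
example_rows = [
    {"prompt": "Hello", "id": 1},
    {"prompt": "Hi there", "id": None},
]
summary = summarize_columns(example_rows)
print(summary)
```

Iterating over a loaded `datasets.Dataset` yields one dictionary per row, so the same helper can be called as `summarize_columns(dataset)`. The built-in `dataset.column_names` and `dataset.features` attributes report the declared schema directly.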
### Licensing Information
This dataset is licensed under ODC-BY. It is intended for research and educational use in accordance with Ai2's [Responsible Use Guidelines](https://allenai.org/responsible-use).
### Citation
Technical manuscript coming soon!
Provider: maas
Created: 2025-11-21



