Harmlessness Reward Model

Indexed from arXiv on 2025-09-30

Download link:

https://huggingface.co/Ray2333/gpt2-large-harmless-reward_model


Resource overview:

This resource is a reward model built to evaluate the harmlessness of generated text. It is also one of the reward models used when training multi-objective reinforcement learning algorithms. Its specific task is reward modeling for reinforcement learning.
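To illustrate how a harmlessness reward could sit alongside other rewards in a multi-objective setup, here is a minimal sketch of linear scalarization, one common way to fold several reward scores into a single training signal. The listing does not say which combination scheme is actually used, so the function name `combine_rewards`, the objective names, the weights, and the example scores are all hypothetical.

```python
from typing import Dict


def combine_rewards(scores: Dict[str, float], weights: Dict[str, float]) -> float:
    """Linear scalarization: normalized weighted sum of per-objective rewards."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same objectives")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total weight keeps the result on the scale of the inputs.
    return sum(weights[name] * scores[name] for name in scores) / total_weight


# Hypothetical scores for one generated response. In practice the
# "harmless" value would come from a harmlessness reward model and the
# other values from separate reward models (assumed here, not from the source).
example_scores = {"harmless": 1.2, "helpful": -0.4}
example_weights = {"harmless": 0.5, "helpful": 0.5}
combined = combine_rewards(example_scores, example_weights)
```

Changing the weights shifts the trade-off between objectives: a larger weight on "harmless" makes the combined reward track the harmlessness score more closely.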
