Robust-Decoding/HH_gemma-2-2b-it

Name: Robust-Decoding/HH_gemma-2-2b-it
Creator: Robust-Decoding
Published: 2025-03-14 15:40:41
License: Not specified

Source: Hugging Face. Updated 2025-03-14; indexed 2025-04-26.

Download link:

https://hf-mirror.com/datasets/Robust-Decoding/HH_gemma-2-2b-it

Description:

This dataset contains responses generated by the gemma-2-2b-it model for prompts from the Helpful-Harmless dataset. Four responses were generated per prompt, each up to 256 tokens long. The responses are used to train value functions and to test the methods in the paper "Robust Multi-Objective Decoding", and they are evaluated with the reward models Ray2333/gpt2-large-helpful-reward_model and Ray2333/gpt2-large-harmless-reward_model.
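As a rough illustration of how the data might be loaded and sanity-checked, the sketch below uses the Hugging Face `datasets` library. The repository ID comes from this page; the split name `"train"` and the column names `"prompt"` and `"response"` are assumptions that should be checked against the dataset card before use.

```python
from collections import defaultdict

REPO_ID = "Robust-Decoding/HH_gemma-2-2b-it"  # dataset name taken from this page


def load_rows(split="train"):
    """Download the dataset from the Hugging Face Hub (needs network access).

    The split name "train" is an assumption; check the dataset card.
    """
    # Imported lazily so the helpers below can be used without `datasets` installed.
    from datasets import load_dataset
    return load_dataset(REPO_ID, split=split)


def group_by_prompt(rows, prompt_key="prompt", response_key="response"):
    """Collect responses per prompt. Both column names are hypothetical."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[prompt_key]].append(row[response_key])
    return dict(grouped)


def prompts_with_unexpected_count(grouped, expected=4):
    """Return prompts whose number of responses differs from `expected`."""
    return [p for p, responses in grouped.items() if len(responses) != expected]
```

Calling `prompts_with_unexpected_count(group_by_prompt(load_rows()))` would then list any prompts that do not have exactly four responses, assuming the dataset stores one response per row under those column names.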

Provided by:

Robust-Decoding
