Robust-Decoding/HH_gemma-2-2b-it
Source: Hugging Face. Updated: 2025-03-14. Indexed: 2025-04-26.
Download link:
https://hf-mirror.com/datasets/Robust-Decoding/HH_gemma-2-2b-it
Resource description:
This dataset is a collection of responses generated by the gemma-2-2b-it model for prompts from the Helpful-Harmless dataset. For each prompt, 4 responses were generated, each up to 256 tokens long. The responses are used to train value functions and to test the methods in the paper "Robust Multi-Objective Decoding", and are evaluated using Ray2333/gpt2-large-helpful-reward_model and Ray2333/gpt2-large-harmless-reward_model.
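A minimal sketch of how the responses might be loaded and grouped per prompt, so the "4 responses per prompt" structure can be checked. The split name `train` and the column names `prompt` and `response` are assumptions and are not confirmed by this listing; the sketch also assumes one response per row. Inspect the dataset's actual schema before relying on them.

```python
from collections import defaultdict

DATASET_ID = "Robust-Decoding/HH_gemma-2-2b-it"  # repository name from this listing


def load_rows(split="train"):
    """Download one split from the Hugging Face Hub (needs network access).

    The split name "train" is an assumption, not taken from the listing.
    """
    # Imported lazily so the helper below works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(DATASET_ID, split=split)


def group_by_prompt(rows, prompt_key="prompt", response_key="response"):
    """Collect responses per prompt.

    The default column names are hypothetical; pass the real ones if they differ.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[prompt_key]].append(row[response_key])
    return dict(grouped)


# Toy rows standing in for the real data, to show the expected shape.
toy_rows = [{"prompt": "p1", "response": f"r{i}"} for i in range(4)]
toy_grouped = group_by_prompt(toy_rows)
```

With the real data, `group_by_prompt(load_rows())` would be the equivalent call, after adjusting the column names to match the dataset.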
Provider:
Robust-Decoding



