allenai/tulu-3-ultrafeedback-cleaned-on-policy-70b

Name: allenai/tulu-3-ultrafeedback-cleaned-on-policy-70b
Creator: allenai
Published: 2024-11-21 15:59:11
License: 暂无描述

Hugging Face2024-11-21 更新2024-12-14 收录

下载链接：

https://hf-mirror.com/datasets/allenai/tulu-3-ultrafeedback-cleaned-on-policy-70b

下载链接

链接失效反馈

官方服务：

资源简介：

Llama 3.1 Tulu 3 Ultrafeedback (Cleaned) (on-policy 70B)数据集是一个偏好数据集，包含了来自Ai2的Ultrafeedback清理版本的提示，并进一步过滤了ShareGPT的实例。数据集包含了41.6k的生成对，其中一些是使用Llama-3.1-Tulu-3-70B模型生成的。数据集的特征包括id、prompt、chosen和rejected，其中chosen和rejected都是包含content和role的列表。数据集的分割包括train，包含41634个示例和229587065字节的数据。数据集的生成方法是通过合成管道结合on-policy和off-policy数据，并使用Ultrafeedback模板和LLM法官在四个不同方面获得偏好注释。数据集遵循ODC-BY许可证，适用于研究和教育用途。

The Llama 3.1 Tulu 3 Ultrafeedback (Cleaned) (on-policy 70B) dataset is a preference dataset that includes prompts from Ai2s cleaned version of Ultrafeedback, further filtered to remove instances from ShareGPT. It contains 41.6k generation pairs, some of which are generated using the Llama-3.1-Tulu-3-70B model. The dataset features include id, prompt, chosen, and rejected, where chosen and rejected are lists containing content and role. The dataset split includes train, with 41634 examples and 229587065 bytes of data. The generation approach involves a synthetic pipeline combining on-policy and off-policy data, with preference annotations obtained on four different aspects using the Ultrafeedback template and an LLM judge. The dataset is licensed under ODC-BY and is intended for research and educational use.

提供机构：

allenai

5,000+

优质数据集

54 个

任务类型

进入经典数据集