allenai/tulu-3-ultrafeedback-cleaned-on-policy-8b

Name: allenai/tulu-3-ultrafeedback-cleaned-on-policy-8b
Creator: allenai
Published: 2024-11-21 16:48:22
License: no description available

Hugging Face. Updated 2024-11-21; indexed 2024-12-14.

Download link:

https://hf-mirror.com/datasets/allenai/tulu-3-ultrafeedback-cleaned-on-policy-8b

Resource overview:

Llama 3.1 Tulu 3 Ultrafeedback (Cleaned) (on-policy 8B) is a preference dataset that is part of the Tulu 3 preference mixture. It contains prompts from Ai2's cleaned version of Ultrafeedback, further filtered to remove instances from ShareGPT. The dataset contains 41.6k generation pairs produced with a range of models, including Mistral, Tulu, Yi, MPT, Google Gemma, InternLM, Falcon, Qwen, Llama, GPT-4, and Claude. Generation combines on-policy and off-policy data, and preferences are annotated by an LLM judge using the Ultrafeedback template. The dataset is licensed under ODC-BY, is intended for research and educational use, and follows Ai2's Responsible Use Guidelines.
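The description does not say how judge ratings become a preference pair, so the following is only a minimal sketch of one plausible scheme. It assumes each candidate response carries a single numeric judge score and takes the highest-scored response as "chosen" and the lowest-scored as "rejected". The `Candidate` class, the `build_preference_pair` helper, the model labels, and the `prompt`/`chosen`/`rejected` keys are all hypothetical names for illustration, not the dataset's confirmed schema or Ai2's actual pipeline.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Candidate:
    model: str     # hypothetical label for the generating model
    response: str  # the generated text
    score: float   # hypothetical LLM-judge rating, higher is better


def build_preference_pair(
    prompt: str, candidates: List[Candidate]
) -> Optional[Dict[str, str]]:
    """Return a chosen/rejected pair, or None if no preference can be formed."""
    if len(candidates) < 2:
        return None
    # Rank candidates from best to worst by judge score.
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    best, worst = ranked[0], ranked[-1]
    # Assumption: a tie carries no preference signal, so skip it.
    if best.score == worst.score:
        return None
    return {"prompt": prompt, "chosen": best.response, "rejected": worst.response}


if __name__ == "__main__":
    pair = build_preference_pair(
        "What is the capital of France?",
        [
            Candidate("on_policy_8b", "Paris is the capital of France.", 4.5),
            Candidate("off_policy_a", "It might be Lyon.", 2.0),
            Candidate("off_policy_b", "France's capital is Paris.", 4.0),
        ],
    )
    print(pair)
```

Mixing on-policy and off-policy generations in the candidate list, as the description outlines, only changes which models supply candidates. Under these assumptions the pairing step itself stays the same.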

Provided by:

allenai
