davanstrien/magpie-preference

Name: davanstrien/magpie-preference
Creator: davanstrien
Published: 2024-07-22 06:45:57
License: No description available

Hugging Face | Updated 2024-07-22 | Indexed 2024-06-25

Download link:

https://hf-mirror.com/datasets/davanstrien/magpie-preference

Resource overview:

The Magpie Preference Dataset is a crowdsourced collection of human preferences on synthetic instruction-response pairs generated using the Magpie approach. It is continuously updated through user interactions with the Magpie Preference Gradio Space. The primary task it supports is preference learning for language models, particularly for instruction following and response generation. Each record contains a timestamp, a generated instruction, a generated response, a user preference label, and a session ID. The dataset was created to support research in preference learning for language models, particularly using the Magpie approach for generating high-quality synthetic data. Annotations are binary preference labels provided by users of the Gradio Space, and no specific qualifications are required of annotators. The dataset should not contain personal information; each session is assigned a random UUID. Considerations for using the data include potential biases in the generating model and in user feedback preferences, as well as the dependence of data quality on users' understanding and diligence in providing feedback.

Provider:

davanstrien

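Summary of original information

Dataset Card: Magpie Preference Dataset

Dataset Description

The Magpie Preference Dataset is a crowdsourced collection of human preferences on synthetic instruction-response pairs generated using the Magpie approach. The dataset is continuously updated through user interactions with the Magpie Preference Gradio Space.

Dataset Summary

The dataset contains instruction-response pairs generated by a large language model (LLM) using the Magpie method, together with human preference labels. The data is collected through a Gradio interface in which users generate instruction-response pairs and give feedback on their quality.

Supported Tasks

The dataset primarily supports preference learning for language models, particularly for instruction following and response generation.

Languages

The languages in the dataset depend on the model used for generation (meta-llama/Meta-Llama-3-8B-Instruct). The primary language is English, but other languages supported by the model may also appear.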

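Dataset Structure

Data Instances

Each instance in the dataset contains:

Timestamp
Generated instruction (prompt)
Generated response (completion)
User preference label (thumbs up / thumbs down)
Session ID

Data Fields

timestamp: ISO-format timestamp of when the data was generated and rated
prompt: the instruction generated by the LLM
completion: the response generated by the LLM
label: binary label indicating the user's preference (true for thumbs up, false for thumbs down)
session_id: UUID used to group feedback from the same session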
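
The five fields above map naturally onto a small typed record. The following is a minimal sketch, assuming each raw row arrives as a dictionary in which `timestamp` and `session_id` are strings and `label` is a boolean; the class name `PreferenceRecord`, the helper `parse_record`, and the example row are hypothetical and are not part of the dataset or its tooling.

```python
from dataclasses import dataclass
from datetime import datetime
from uuid import UUID


@dataclass(frozen=True)
class PreferenceRecord:
    """Typed view of one row (hypothetical helper, not part of the dataset)."""
    timestamp: datetime
    prompt: str
    completion: str
    label: bool
    session_id: UUID


def parse_record(raw: dict) -> PreferenceRecord:
    """Validate one raw row and convert it to a typed record."""
    label = raw["label"]
    if not isinstance(label, bool):
        raise ValueError(f"label must be a boolean, got {label!r}")
    return PreferenceRecord(
        timestamp=datetime.fromisoformat(raw["timestamp"]),  # ISO-format string
        prompt=raw["prompt"],
        completion=raw["completion"],
        label=label,
        session_id=UUID(raw["session_id"]),  # raises ValueError if malformed
    )


# Invented example row, for illustration only.
example = {
    "timestamp": "2024-07-22T06:45:57",
    "prompt": "Explain what a hash map is.",
    "completion": "A hash map stores key-value pairs.",
    "label": True,
    "session_id": "123e4567-e89b-12d3-a456-426614174000",
}
record = parse_record(example)
```

Parsing eagerly like this surfaces malformed timestamps or session IDs at load time instead of later in a training pipeline.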

Data Splits

The dataset has no predefined splits; it is continuously updated with new entries.
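
Because no splits are predefined, anyone who needs a held-out set has to create one. One possible approach, sketched here as an assumption and not as a recommendation from the dataset authors, is to assign whole sessions to a split by hashing `session_id`, so that feedback from one session never lands in both train and test and the assignment stays stable as new rows arrive. The function names and the default `test_fraction` are hypothetical.

```python
import hashlib


def assign_split(session_id: str, test_fraction: float = 0.1) -> str:
    """Deterministically map a session to 'train' or 'test'."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "test" if bucket < test_fraction else "train"


def split_records(records, test_fraction: float = 0.1) -> dict:
    """Group rows (dicts with a 'session_id' key) into train and test lists."""
    splits = {"train": [], "test": []}
    for row in records:
        splits[assign_split(row["session_id"], test_fraction)].append(row)
    return splits
```

Hash-based assignment needs no stored random seed or index file: re-running it on a later version of the data places every existing session in the same split as before.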

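Dataset Creation

Curation Rationale

The dataset supports research in preference learning for language models, particularly using the Magpie approach for generating high-quality synthetic data.

Source Data

The source data is generated in real time using meta-llama/Meta-Llama-3-8B-Instruct.

Initial Data Collection and Normalization

Instructions and responses are generated using predefined templates and the LLM. User preferences are collected through the Gradio interface.

Annotations

Annotations take the form of binary preference labels provided by users of the Gradio Space.

Annotation Process

Users generate instruction-response pairs through the Gradio interface and give thumbs-up or thumbs-down feedback.

Annotators

Annotators are users of the public Gradio Space; no specific qualifications are required.

Personal and Sensitive Information

The dataset should not contain personal information. Each session is assigned a random UUID, and no user-identifying information is collected.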

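Considerations for Using the Data

Social Impact of Dataset

The dataset aims to improve the ability of language models to follow instructions and generate high-quality responses, which may lead to more useful and aligned AI systems.

Discussion of Biases

The dataset may reflect biases of the generating model and of users' feedback preferences. These biases should be taken into account when using the dataset.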
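
One simple way to start looking at feedback bias is to compare the share of thumbs-up labels across sessions; a session that approves or rejects everything may deserve a closer look. The sketch below assumes rows are dictionaries with `session_id` and boolean `label` keys. The function name `approval_rates` is hypothetical, and this check is an illustration only, not a procedure described by the dataset authors.

```python
from collections import defaultdict


def approval_rates(records) -> dict:
    """Return the fraction of thumbs-up labels for each session_id."""
    counts = defaultdict(lambda: [0, 0])  # session_id -> [thumbs_up, total]
    for row in records:
        tally = counts[row["session_id"]]
        if row["label"]:
            tally[0] += 1
        tally[1] += 1
    return {sid: up / total for sid, (up, total) in counts.items()}
```

Sessions with very few rows give noisy rates, so any threshold applied to this output should also consider the per-session row count.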

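Other Known Limitations

Data quality depends on users' understanding and diligence when providing feedback.
The dataset evolves continuously, which may lead to inconsistencies over time.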
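
Since the dataset keeps changing, experiments run at different times may see different rows. A minimal sketch of one mitigation is to freeze a snapshot by keeping only rows rated at or before a fixed cutoff. It assumes every `timestamp` is an ISO-format string and that all timestamps are either timezone-naive or all timezone-aware (mixing the two makes the comparison fail); the function name `snapshot` is hypothetical.

```python
from datetime import datetime


def snapshot(records, cutoff: str) -> list:
    """Return rows whose timestamp is at or before the ISO-format cutoff."""
    limit = datetime.fromisoformat(cutoff)
    return [
        row for row in records
        if datetime.fromisoformat(row["timestamp"]) <= limit
    ]
```

Recording the cutoff alongside experimental results lets others rebuild the same subset from a later version of the data, provided earlier rows have not been edited or removed.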

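Additional Information

Dataset Curators

The dataset is curated by the creators of the Magpie Preference Gradio Space and contributors from the Hugging Face community.

Citation Information

If you use this dataset, please cite the Magpie paper: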

@misc{xu2024magpie,
  title={Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing},
  author={Zhangchen Xu and Fengqing Jiang and Luyao Niu and Yuntian Deng and Radha Poovendran and Yejin Choi and Bill Yuchen Lin},
  year={2024},
  eprint={2406.08464},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

Contributions

The dataset keeps growing thanks to contributions from users of the Magpie Preference Gradio Space. We welcome and appreciate all contributions!
