lhkhiem28/ultrafeedback-dpo-iter1

Name: lhkhiem28/ultrafeedback-dpo-iter1
Creator: lhkhiem28
Published: 2025-10-24 15:56:11
License: 暂无描述

Hugging Face2025-10-24 更新2025-10-25 收录

下载链接：

https://hf-mirror.com/datasets/lhkhiem28/ultrafeedback-dpo-iter1

下载链接

链接失效反馈

官方服务：

资源简介：

这是一个包含对话提示、选择的回复、被拒绝的回复以及完整对话消息的数据集。数据集特征包括对话提示文本（prompt）、提示ID（prompt_id）、选择的回复内容（chosen.content）和角色（chosen.role）、被拒绝的回复内容（rejected.content）和角色（rejected.role）、以及对话中的所有消息内容（messages.content）和角色（messages.role）。数据集提供了一个布尔字段（swap_preferences）用于表示是否交换偏好。训练集包含20378个示例，总大小为171978590字节。

This dataset includes conversation prompts, selected responses, rejected responses, and full conversation messages. The features of the dataset consist of the text of the conversation prompt (prompt), prompt ID (prompt_id), selected response content (chosen.content) and role (chosen.role), rejected response content (rejected.content) and role (rejected.role), and all message contents (messages.content) and roles (messages.role) in the conversation. The dataset provides a boolean field (swap_preferences) to indicate whether preferences are swapped. The training set contains 20378 examples with a total size of 171978590 bytes.

提供机构：

lhkhiem28

5,000+

优质数据集

54 个

任务类型

进入经典数据集