
antony-bryan-3D2Y/synthetic-preference-data

Hugging Face | Updated 2026-04-22 | Indexed 2026-04-26
Download link:
https://hf-mirror.com/datasets/antony-bryan-3D2Y/synthetic-preference-data
Description:
A small, synthetically generated preference dataset intended for testing RLHF / DPO training pipelines. Each example contains a prompt and two responses: one correct (`chosen`) and one subtly flawed (`rejected`). The examples were generated with OpenAI's gpt-4o-mini model using specific generation methods and prompt templates, then passed through a multi-stage filtering pipeline covering length, distinctness, deduplication, and safety checks. The dataset is split into training and test sets and has undergone a quality audit. Known limitations include single-model generation, small scale, synthetic rejected responses, English-only content, and a skew toward technical topics.
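
As a rough illustration of the record shape and the filtering stages named above, here is a minimal Python sketch. It is not the dataset author's pipeline: the `prompt` field name, the length thresholds, the blocklist, and every function name are assumptions made for the example. Only the `chosen` / `rejected` labels and the four stage names come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    prompt: str    # assumed field name; the description only says "a prompt"
    chosen: str    # the correct response
    rejected: str  # the subtly flawed response


# Hypothetical values; the dataset's real thresholds and safety rules are not stated.
MIN_CHARS = 20
MAX_CHARS = 2000
BLOCKLIST = ("example-unsafe-term",)


def _norm(text):
    # Lowercase and collapse whitespace so trivial variants compare equal.
    return " ".join(text.lower().split())


def passes_length(pair):
    # Both responses must fall inside the assumed character bounds.
    return all(MIN_CHARS <= len(t) <= MAX_CHARS for t in (pair.chosen, pair.rejected))


def passes_distinctness(pair):
    # The two responses must differ after normalization.
    return _norm(pair.chosen) != _norm(pair.rejected)


def passes_safety(pair):
    # Placeholder check: reject any record containing a blocklisted term.
    text = _norm(" ".join((pair.prompt, pair.chosen, pair.rejected)))
    return not any(term in text for term in BLOCKLIST)


def filter_pairs(pairs):
    # Apply the per-record checks, then drop records whose normalized
    # prompt has already been kept (exact-match deduplication).
    seen_prompts = set()
    kept = []
    for pair in pairs:
        if not (passes_length(pair) and passes_distinctness(pair) and passes_safety(pair)):
            continue
        key = _norm(pair.prompt)
        if key in seen_prompts:
            continue
        seen_prompts.add(key)
        kept.append(pair)
    return kept
```

Deduplication here is exact match on the normalized prompt, which is the simplest possible choice; a real pipeline might use near-duplicate detection instead.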
Provider:
antony-bryan-3D2Y