formula-x/prefChat_1k
Source: Hugging Face. Updated: 2025-09-25. Indexed: 2025-11-30.
Download link:
https://hf-mirror.com/datasets/formula-x/prefChat_1k
Description:
prefChat-1k is a curated preference dataset from the Formula X team, designed for aligning conversational models with Direct Preference Optimization (DPO). Each example contains a user utterance (prompt), a preferred human-like response (chosen), and a dispreferred robotic response (rejected). The goal of the dataset is to teach models to prefer natural, relatable replies over mechanical, robotic-sounding ones.
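
To make the record structure described above concrete, here is a minimal sketch in Python. It assumes each record is a mapping with the string fields prompt, chosen, and rejected; the exact column names in the published files are not confirmed here, and the sample text is invented for illustration. The dpo_loss helper shows the standard per-example DPO objective computed from summed log-probabilities; its parameter names and the default beta of 0.1 are illustrative choices, not values taken from the dataset card.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferenceExample:
    """One preference triple: a prompt plus a preferred and a dispreferred reply."""
    prompt: str
    chosen: str
    rejected: str


def parse_example(record: dict) -> PreferenceExample:
    """Validate a raw record and convert it to a PreferenceExample.

    Assumes the field names 'prompt', 'chosen', 'rejected' (hypothetical
    until checked against the actual dataset files).
    """
    fields = ("prompt", "chosen", "rejected")
    bad = [k for k in fields
           if not isinstance(record.get(k), str) or not record[k].strip()]
    if bad:
        raise ValueError(f"missing or empty fields: {bad}")
    if record["chosen"] == record["rejected"]:
        raise ValueError("chosen and rejected replies must differ")
    return PreferenceExample(*(record[k] for k in fields))


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    margin is how much more the policy favors the chosen reply over the
    rejected one, relative to a frozen reference model.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # Numerically stable form of log(1 + exp(-z)).
    if z > 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))


# Invented sample record, for illustration only.
sample = parse_example({
    "prompt": "I had a rough day at work.",
    "chosen": "That sounds exhausting. Want to talk about what happened?",
    "rejected": "I am sorry to hear that. Please describe your issue.",
})
```

When the policy and the reference model agree exactly, the margin is zero and the loss equals log 2; the loss falls as the policy shifts probability toward the chosen reply relative to the reference.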
Provider:
formula-x



