SaaSttt/DPO_Dataset
Source: Hugging Face | Updated: 2025-12-14 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/SaaSttt/DPO_Dataset
Description:
The TRAG-DPO Preference Dataset (Preview) contains sample data for the Direct Preference Optimization (DPO) fine-tuning stage. Only a subset of the training data is currently provided, for demonstration purposes; the full dataset will be open-sourced once the paper is accepted. The dataset is formatted as a JSON list in which each entry represents one preference pair used to align the model's responses. Each entry includes fields such as a unique identifier, a tomato leaf image path, the user query, text retrieved from the knowledge base, a preferred response, and a dispreferred response.
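As a rough illustration of the JSON-list layout described above, the following Python sketch parses such a list and turns each entry into a (prompt, chosen, rejected) triple. The description names the fields but not the actual JSON keys, so every key used here (`id`, `image`, `query`, `retrieved_context`, `chosen`, `rejected`) is a hypothetical placeholder, as are the prompt layout and the sample entry. Check the real file and adjust the keys before use.

```python
import json

# Hypothetical key names: the dataset description lists the fields but does
# not state the actual JSON keys, so these must be adjusted to the real file.
REQUIRED_KEYS = ("id", "image", "query", "retrieved_context", "chosen", "rejected")


def load_preference_pairs(text):
    """Parse a JSON list of preference entries and check each for the expected keys."""
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("expected a JSON list at the top level")
    for index, entry in enumerate(entries):
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing keys: {missing}")
    return entries


def to_dpo_triple(entry):
    """Build a (prompt, chosen, rejected) triple.

    The prompt layout is an illustrative assumption, not the dataset's own template.
    """
    prompt = f"Context: {entry['retrieved_context']}\nQuestion: {entry['query']}"
    return prompt, entry["chosen"], entry["rejected"]


# Placeholder entry for illustration only; none of these values come from the dataset.
sample_text = json.dumps([
    {
        "id": "example-0001",
        "image": "images/placeholder_leaf.jpg",
        "query": "placeholder user query",
        "retrieved_context": "placeholder retrieved text",
        "chosen": "placeholder preferred response",
        "rejected": "placeholder dispreferred response",
    }
])

pairs = load_preference_pairs(sample_text)
triples = [to_dpo_triple(entry) for entry in pairs]
```

Checking for missing keys at load time makes a mismatch between these assumed keys and the real file show up immediately, not later during training.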
Provided by:
SaaSttt



