mncai/orpo-vlm-pairs-full
收藏Hugging Face2026-02-06 更新2026-02-07 收录
下载链接:
https://hf-mirror.com/datasets/mncai/orpo-vlm-pairs-full
下载链接
链接失效反馈官方服务:
资源简介:
该数据集包含两个版本的视觉语言偏好对,用于使用ORPO、DPO或类似的基于偏好的对齐方法训练VLM模型。数据集包含67,754行经过精炼/过滤的对(推荐使用)和94,346行过滤前的完整数据集。图像数量为11,982张,格式为JSONL + PNG图像,语言为英语,任务为视觉语言偏好学习。每个行包含prompt(带有图像引用的聊天消息)、chosen(首选响应)、rejected(非首选响应)和meta(包括源数据集、使用的模型、判断信息等元数据)。图像在prompt中引用为{"type": "image", "image": "images/docmatix-3.png"}。元字段包括dataset(源数据集名称)、row_index(源中的原始行索引)、has_image(此数据集为true)、chosen_model/rejected_model(生成响应的模型)、judge_choice_1st/judge_choice_2nd(判断决策)和trainable(样本是否适合训练)。数据集源自HuggingFaceM4/Docmatix,许可证继承自源数据集。
This dataset contains two versions of vision-language preference pairs for training VLM models using ORPO, DPO, or similar preference-based alignment methods. The dataset includes 67,754 rows of refined/filtered pairs (recommended) and 94,346 rows of the full dataset before filtering. There are 11,982 images, the format is JSONL + PNG images, the language is English, and the task is vision-language preference learning. Each row contains prompt (chat messages with image references), chosen (preferred response), rejected (non-preferred response), and meta (metadata including source dataset, models used, judge info, etc.). Images are referenced in the prompt as {"type": "image", "image": "images/docmatix-3.png"}. Meta fields include dataset (source dataset name), row_index (original row index in source), has_image (true for this dataset), chosen_model/rejected_model (models that generated responses), judge_choice_1st/judge_choice_2nd (judge decisions), and trainable (whether the sample is suitable for training). The dataset is derived from HuggingFaceM4/Docmatix, and the license inherits from the source datasets.
提供机构:
mncai



