mncai/orpo-text-pairs

Name: mncai/orpo-text-pairs
Creator: mncai
Published: 2026-02-05 14:11:07
License: 暂无描述

Hugging Face2026-02-05 更新2026-02-07 收录

下载链接：

https://hf-mirror.com/datasets/mncai/orpo-text-pairs

下载链接

链接失效反馈

官方服务：

资源简介：

该数据集包含用于训练语言模型的偏好对，适用于ORPO（Odds Ratio Preference Optimization）、DPO或类似的基于偏好的对齐方法。数据集包含8,249个经过过滤/精炼的偏好对，格式为JSONL，语言为英语，任务为纯文本偏好学习（无图像）。每个行包含以下字段：`prompt`（聊天消息，用户回合）、`chosen`（首选响应）、`rejected`（非首选响应）和`meta`（元数据，包括来源数据集、使用的模型和判断信息）。元数据字段包括来源数据集名称、原始行索引、是否包含图像（始终为`false`）、生成响应的模型、判断决策以及样本是否适合训练。数据集来源于多个来源数据集，包括HelpSteer2、MathInstruct、CodeIO-PyEdu-Reasoning和MathV360K，使用时需遵守各自的许可证条款。

This dataset contains preference pairs for training language models using ORPO (Odds Ratio Preference Optimization), DPO, or similar preference-based alignment methods. The dataset consists of 8,249 filtered/refined preference pairs in JSONL format, with English text-only preference learning (no images). Each row includes fields for `prompt` (chat messages, user turn), `chosen` (preferred response), `rejected` (non-preferred response), and `meta` (metadata including source dataset, models used, and judge info). Meta fields include the source dataset name, original row index, has_image (always `false` for this dataset), models that generated responses, judge decisions, and whether the sample is trainable. The dataset is derived from multiple source datasets, including HelpSteer2, MathInstruct, CodeIO-PyEdu-Reasoning, and MathV360K, and users must respect each datasets license terms.

提供机构：

mncai

5,000+

优质数据集

54 个

任务类型

进入经典数据集