leduo123/assignment4-preferences
Source: Hugging Face · Updated 2026-04-27 · Indexed 2026-05-03
Download link:
https://hf-mirror.com/datasets/leduo123/assignment4-preferences
Description:
---
dataset_info:
  features:
  - name: prompt
    dtype: string
  - name: chosen
    dtype: string
  - name: rejected
    dtype: string
  splits:
  - name: train
    num_bytes: 135314
    num_examples: 50
  download_size: 90303
  dataset_size: 135314
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Assignment 4 Preference Dataset
This dataset was created for Assignment 4.
It contains 50 preference training examples built from instructions sampled from the LIMA dataset. For each instruction, 5 candidate responses were generated using `Qwen/Qwen2.5-7B-Instruct`, and the responses were ranked with `llm-blender/PairRM`. The highest-ranked response was used as `chosen`, and the lowest-ranked response was used as `rejected`.
## Fields
- `prompt`: the chat-formatted training prompt
- `chosen`: the preferred response
- `rejected`: the less preferred response
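A quick schema check for a single record might look like the sketch below. The helper `validate_record` is hypothetical and not part of the dataset; it only assumes the three string fields listed above.

```python
from typing import Any, List, Mapping

REQUIRED_FIELDS = ("prompt", "chosen", "rejected")


def validate_record(record: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in one preference record."""
    problems: List[str] = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str):
            problems.append(f"{field}: missing or not a string")
        elif not value.strip():
            problems.append(f"{field}: empty")
    # A pair with identical responses carries no preference signal.
    if not problems and record["chosen"] == record["rejected"]:
        problems.append("chosen and rejected are identical")
    return problems


good = {"prompt": "Q?", "chosen": "a helpful answer", "rejected": "a weak answer"}
bad = {"prompt": "Q?", "chosen": "same", "rejected": "same"}
print(validate_record(good))  # prints []
print(validate_record(bad))   # prints ['chosen and rejected are identical']
```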
## Usage
This dataset is intended for preference-based fine-tuning methods such as DPO.
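To show how a `chosen`/`rejected` pair enters training, here is a sketch of the standard per-example DPO loss, `-log(sigmoid(beta * margin))`. The function name, argument names, and the default `beta=0.1` are illustrative assumptions; the inputs are summed log-probabilities of each response under the policy being trained and under a frozen reference model, which a real training library would compute for you.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; tune for your setup
) -> float:
    """Per-example DPO loss for one (chosen, rejected) pair."""
    # How much more the policy favors `chosen` over `rejected`,
    # relative to the reference model.
    margin = (policy_chosen_logp - ref_chosen_logp) - (
        policy_rejected_logp - ref_rejected_logp
    )
    z = beta * margin
    # -log(sigmoid(z)) written as a numerically stable softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# Made-up log-probabilities: the policy already prefers `chosen` slightly.
loss_good = dpo_loss(-10.0, -12.0, -11.0, -11.0)
# Same numbers with the preference reversed.
loss_bad = dpo_loss(-12.0, -10.0, -11.0, -11.0)
print(loss_good < math.log(2) < loss_bad)  # prints True
```

When the policy and reference agree exactly, the margin is zero and the loss equals `log 2`; training pushes it lower by raising the relative likelihood of `chosen`. The records themselves can typically be fetched with `datasets.load_dataset("leduo123/assignment4-preferences", split="train")` from the Hugging Face `datasets` library.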
Provider:
leduo123



