ShenaoZ/0.0001_withdpo_5iters_bs256_5102lr_dataset
收藏Hugging Face2024-05-08 更新2024-06-12 收录
下载链接:
https://hf-mirror.com/datasets/ShenaoZ/0.0001_withdpo_5iters_bs256_5102lr_dataset
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: prompt
dtype: string
- name: prompt_id
dtype: string
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
- name: score_chosen
dtype: float64
- name: score_rejected
dtype: float64
- name: reference_response
dtype: string
- name: chosen
list:
- name: content
dtype: string
- name: role
dtype: string
- name: rejected
list:
- name: content
dtype: string
- name: role
dtype: string
splits:
- name: test_prefs_1
num_bytes: 17123520
num_examples: 2000
- name: train_prefs_1
num_bytes: 103596044
num_examples: 12227
- name: test_prefs_2
num_bytes: 17162212
num_examples: 2000
- name: train_prefs_2
num_bytes: 105196032
num_examples: 12227
- name: test_prefs_3
num_bytes: 17159863
num_examples: 2000
- name: train_prefs_3
num_bytes: 104817478
num_examples: 12227
- name: test_prefs_4
num_bytes: 17193567
num_examples: 2000
- name: train_prefs_4
num_bytes: 105334421
num_examples: 12227
download_size: 262075507
dataset_size: 487583137
configs:
- config_name: default
data_files:
- split: test_prefs_1
path: data/test_prefs_1-*
- split: train_prefs_1
path: data/train_prefs_1-*
- split: test_prefs_2
path: data/test_prefs_2-*
- split: train_prefs_2
path: data/train_prefs_2-*
- split: test_prefs_3
path: data/test_prefs_3-*
- split: train_prefs_3
path: data/train_prefs_3-*
- split: test_prefs_4
path: data/test_prefs_4-*
- split: train_prefs_4
path: data/train_prefs_4-*
---
# Dataset Card for "0.0001_withdpo_5iters_bs256_5102lr_dataset"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
The dataset includes multiple features such as prompt, prompt_id, messages (containing content and role), score_chosen, score_rejected, reference_response, chosen, and rejected. Each feature has its data type. The dataset is divided into multiple splits, each with different file paths and number of examples. The total download size and actual size of the dataset are also provided.
提供机构:
ShenaoZ
原始信息汇总
数据集概述
数据集名称
- 名称: 0.0001_withdpo_5iters_bs256_5102lr_dataset
数据集特征
- prompt: 字符串类型
- prompt_id: 字符串类型
- messages: 列表类型,包含以下子特征:
- content: 字符串类型
- role: 字符串类型
- score_chosen: 浮点数类型 (float64)
- score_rejected: 浮点数类型 (float64)
- reference_response: 字符串类型
- chosen: 列表类型,包含以下子特征:
- content: 字符串类型
- role: 字符串类型
- rejected: 列表类型,包含以下子特征:
- content: 字符串类型
- role: 字符串类型
数据集拆分
- test_prefs_1: 2000个示例,17123520字节
- train_prefs_1: 12227个示例,103596044字节
- test_prefs_2: 2000个示例,17162212字节
- train_prefs_2: 12227个示例,105196032字节
- test_prefs_3: 2000个示例,17159863字节
- train_prefs_3: 12227个示例,104817478字节
- test_prefs_4: 2000个示例,17193567字节
- train_prefs_4: 12227个示例,105334421字节
数据集大小
- 下载大小: 262075507字节
- 数据集大小: 487583137字节
配置文件
- config_name: default
- data_files:
- split: test_prefs_1, path: data/test_prefs_1-*
- split: train_prefs_1, path: data/train_prefs_1-*
- split: test_prefs_2, path: data/test_prefs_2-*
- split: train_prefs_2, path: data/train_prefs_2-*
- split: test_prefs_3, path: data/test_prefs_3-*
- split: train_prefs_3, path: data/train_prefs_3-*
- split: test_prefs_4, path: data/test_prefs_4-*
- split: train_prefs_4, path: data/train_prefs_4-*



