Ali-C137/ultrafeedback-arabic
收藏Hugging Face2024-01-31 更新2024-03-04 收录
下载链接:
https://hf-mirror.com/datasets/Ali-C137/ultrafeedback-arabic
下载链接
链接失效反馈官方服务:
资源简介:
---
dataset_info:
features:
- name: prompt
dtype: string
- name: prompt_id
dtype: string
- name: chosen
list:
- name: content
dtype: string
- name: role
dtype: string
- name: rejected
list:
- name: content
dtype: string
- name: role
dtype: string
- name: messages
list:
- name: content
dtype: string
- name: role
dtype: string
- name: score_chosen
dtype: float64
- name: score_rejected
dtype: float64
splits:
- name: train
num_bytes: 592459639
num_examples: 61135
- name: test
num_bytes: 19249550
num_examples: 2000
download_size: 286659820
dataset_size: 611709189
configs:
- config_name: default
data_files:
- split: train
path: data/train-*
- split: test
path: data/test-*
---
The dataset includes multiple features such as prompt, prompt_id, chosen, rejected, messages, score_chosen, and score_rejected. The data types cover strings and floats. The dataset is divided into training and test sets, containing 61135 and 2000 samples respectively. The download size of the dataset is 286659820 bytes, and the total size is 611709189 bytes.
提供机构:
Ali-C137
原始信息汇总
数据集概述
数据集特征
- prompt: 字符串类型
- prompt_id: 字符串类型
- chosen: 列表类型,包含以下字段:
- content: 字符串类型
- role: 字符串类型
- rejected: 列表类型,包含以下字段:
- content: 字符串类型
- role: 字符串类型
- messages: 列表类型,包含以下字段:
- content: 字符串类型
- role: 字符串类型
- score_chosen: 浮点数类型
- score_rejected: 浮点数类型
数据集分割
- train:
- 字节数: 592459639
- 样本数: 61135
- test:
- 字节数: 19249550
- 样本数: 2000
数据集大小
- 下载大小: 286659820 字节
- 数据集大小: 611709189 字节
配置
- default:
- 数据文件路径:
- train: data/train-*
- test: data/test-*
- 数据文件路径:



