argilla/ultrafeedback_binarized_full
Hugging Face. Updated 2023-11-14. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/argilla/ultrafeedback_binarized_full
Resource description:
---
dataset_info:
  features:
  - name: source
    dtype: string
  - name: instruction
    dtype: string
  - name: best_response
    struct:
    - name: annotations
      struct:
      - name: helpfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
      - name: honesty
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: instruction_following
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: truthfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
    - name: critique
      dtype: string
    - name: custom_system_prompt
      dtype: string
    - name: model
      dtype: string
    - name: overall_score
      dtype: float64
    - name: principle
      dtype: string
    - name: response
      dtype: string
  - name: best_model
    dtype: string
  - name: best_score
    dtype: float64
  - name: random_response
    struct:
    - name: annotations
      struct:
      - name: helpfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
      - name: honesty
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: instruction_following
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: truthfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
    - name: critique
      dtype: string
    - name: custom_system_prompt
      dtype: string
    - name: model
      dtype: string
    - name: overall_score
      dtype: float64
    - name: principle
      dtype: string
    - name: response
      dtype: string
  - name: random_model
    dtype: string
  - name: random_score
    dtype: float64
  - name: correct_answers
    sequence: string
  - name: incorrect_answers
    sequence: string
  splits:
  - name: train
    num_bytes: 447221757
    num_examples: 63967
  download_size: 199896433
  dataset_size: 447221757
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
---
# Dataset Card for "ultrafeedback_binarized_full"
[More Information needed](https://github.com/huggingface/datasets/blob/main/CONTRIBUTING.md#how-to-contribute-to-the-dataset-cards)
Each record in ultrafeedback_binarized_full holds a source, an instruction, a best response, and a random response. Each response carries annotations for helpfulness, honesty, instruction following, and truthfulness, each with a rating and a rationale; the helpfulness and truthfulness annotations additionally include a rationale for the rating and a list of types. Each response also records its critique, custom system prompt, model, overall score, and principle alongside the response text. At the top level, a record further includes the best model, best score, random model, random score, correct answers, and incorrect answers. The dataset has a single train split with 63,967 examples.
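The pairing of a best response with a random response lends itself to building preference pairs. The following is a minimal sketch of one way to do that, not a method documented by the dataset authors. The function name `to_preference_pair`, the output keys, the tie-handling rule, and the sample record are all hypothetical; only the field names come from the schema above.

```python
from typing import Any, Optional


def to_preference_pair(record: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build a prompt/chosen/rejected triple from one record.

    Assumption: the best response is treated as "chosen" and the random
    response as "rejected". Returns None when the two responses are the
    same text or the random response does not score strictly lower.
    """
    best = record["best_response"]
    rand = record["random_response"]
    same_text = best["response"] == rand["response"]
    no_margin = record["best_score"] <= record["random_score"]
    if same_text or no_margin:
        return None
    return {
        "prompt": record["instruction"],
        "chosen": best["response"],
        "rejected": rand["response"],
        "score_margin": record["best_score"] - record["random_score"],
    }


# Hypothetical record that follows the schema; the values are made up.
sample = {
    "instruction": "Name a prime number.",
    "best_response": {"response": "7 is a prime number.", "model": "model-a"},
    "best_score": 9.0,
    "random_response": {"response": "9 is a prime number.", "model": "model-b"},
    "random_score": 3.0,
}

pair = to_preference_pair(sample)
```

Dropping ties is a design choice in this sketch: a pair whose two sides are identical or equally scored carries no preference signal.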
Provided by:
argilla
Summary of original information
Dataset overview
Dataset information
- Features:
  - source: string.
  - instruction: string.
  - best_response: struct containing:
    - annotations: struct containing:
      - helpfulness: struct containing:
        - Rating: string.
        - Rationale: string.
        - Rationale For Rating: string.
        - Type: sequence of strings.
      - honesty: struct containing:
        - Rating: string.
        - Rationale: string.
      - instruction_following: struct containing:
        - Rating: string.
        - Rationale: string.
      - truthfulness: struct containing:
        - Rating: string.
        - Rationale: string.
        - Rationale For Rating: string.
        - Type: sequence of strings.
    - critique: string.
    - custom_system_prompt: string.
    - model: string.
    - overall_score: float (float64).
    - principle: string.
    - response: string.
  - best_model: string.
  - best_score: float (float64).
  - random_response: struct containing:
    - annotations: struct containing:
      - helpfulness: struct containing:
        - Rating: string.
        - Rationale: string.
        - Rationale For Rating: string.
        - Type: sequence of strings.
      - honesty: struct containing:
        - Rating: string.
        - Rationale: string.
      - instruction_following: struct containing:
        - Rating: string.
        - Rationale: string.
      - truthfulness: struct containing:
        - Rating: string.
        - Rationale: string.
        - Rationale For Rating: string.
        - Type: sequence of strings.
    - critique: string.
    - custom_system_prompt: string.
    - model: string.
    - overall_score: float (float64).
    - principle: string.
    - response: string.
  - random_model: string.
  - random_score: float (float64).
  - correct_answers: sequence of strings.
  - incorrect_answers: sequence of strings.
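Because every `Rating` field is typed as a string rather than a number, numeric use needs a parsing step. The sketch below shows one cautious way to do it. The helper names `parse_rating` and `mean_rating` are hypothetical, and the assumption that some ratings may be non-numeric (for example `"N/A"`) is a guess that should be checked against the actual data.

```python
from typing import Any, Optional

# The four annotation aspects listed in the schema.
ASPECTS = ("helpfulness", "honesty", "instruction_following", "truthfulness")


def parse_rating(raw: Optional[str]) -> Optional[int]:
    """Convert a Rating string to an int, or None if it is not numeric."""
    try:
        return int(raw)
    except (TypeError, ValueError):
        return None


def mean_rating(response: dict[str, Any]) -> Optional[float]:
    """Average the parseable aspect ratings of one response struct."""
    annotations = response.get("annotations") or {}
    values = []
    for aspect in ASPECTS:
        rating = parse_rating((annotations.get(aspect) or {}).get("Rating"))
        if rating is not None:
            values.append(rating)
    return sum(values) / len(values) if values else None


# Hypothetical response struct; the values are made up for illustration.
example_response = {
    "annotations": {
        "helpfulness": {"Rating": "4"},
        "honesty": {"Rating": "5"},
        "instruction_following": {"Rating": "N/A"},
        "truthfulness": {"Rating": "3"},
    }
}

average = mean_rating(example_response)
```

Unparseable ratings are skipped rather than counted as zero, so a missing aspect does not drag the average down.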
Dataset splits
- train:
  - Bytes: 447221757
  - Examples: 63967
Dataset size
- Download size: 199896433 bytes
- Dataset size: 447221757 bytes
Configuration
- config_name: default
- data_files:
  - split: train
  - path: data/train-*
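The size figures above allow a quick sanity check before downloading. The sketch below derives the average record size and the download-to-full-size ratio from the listed numbers, and wraps a loading call in a function so that nothing is fetched on import. It assumes the Hugging Face `datasets` library is installed and that the repository id from the download link is still valid; the function name `load_train_split` is hypothetical.

```python
# Figures copied from the split and size listing.
NUM_BYTES = 447_221_757
NUM_EXAMPLES = 63_967
DOWNLOAD_SIZE = 199_896_433

# Roughly 7 KB per example once loaded.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# The download is a bit under half the size of the loaded dataset.
download_ratio = DOWNLOAD_SIZE / NUM_BYTES


def load_train_split():
    """Load the single train split (needs network access and `datasets`)."""
    from datasets import load_dataset  # imported lazily on purpose

    return load_dataset("argilla/ultrafeedback_binarized_full", split="train")
```

Calling `load_train_split()` downloads about 200 MB, so it is left uncalled here.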



