mcding-org/CorrectDPO-Dataset-U30

Name: mcding-org/CorrectDPO-Dataset-U30
Creator: mcding-org
Published: 2024-05-21 06:33:09
License: No description available

Hugging Face | Updated 2024-05-21 | Indexed 2024-06-12

Download link:

https://hf-mirror.com/datasets/mcding-org/CorrectDPO-Dataset-U30

Resource description:

---
dataset_info:
  features:
  - name: source
    dtype: string
  - name: instruction
    dtype: string
  - name: models
    sequence: string
  - name: completions
    list:
    - name: annotations
      struct:
      - name: helpfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
      - name: honesty
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: instruction_following
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
      - name: truthfulness
        struct:
        - name: Rating
          dtype: string
        - name: Rationale
          dtype: string
        - name: Rationale For Rating
          dtype: string
        - name: Type
          sequence: string
    - name: critique
      dtype: string
    - name: custom_system_prompt
      dtype: string
    - name: fine-grained_score
      dtype: float64
    - name: model
      dtype: string
    - name: overall_score
      dtype: float64
    - name: principle
      dtype: string
    - name: response
      dtype: string
  - name: correct_answers
    sequence: string
  - name: incorrect_answers
    sequence: string
  - name: prompt
    dtype: string
  - name: chosen
    dtype: string
  - name: rejected
    dtype: string
  splits:
  - name: train
    num_bytes: 377483827
    num_examples: 26901
  - name: eval
    num_bytes: 41465252
    num_examples: 2986
  download_size: 171068972
  dataset_size: 418949079
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: eval
    path: data/eval-*
---

Provider:

mcding-org

Summary of original information

Dataset overview

Dataset features

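source: string.
instruction: string.
models: sequence of strings.
completions: list of entries, each with the following sub-features:
- annotations: struct with the following sub-features:
  - helpfulness: Rating, Rationale, and Rationale For Rating (strings), plus Type (sequence of strings).
  - honesty: Rating and Rationale (strings).
  - instruction_following: Rating and Rationale (strings).
  - truthfulness: Rating, Rationale, and Rationale For Rating (strings), plus Type (sequence of strings).
- critique: string.
- custom_system_prompt: string.
- fine-grained_score: float (float64).
- model: string.
- overall_score: float (float64).
- principle: string.
- response: string.
correct_answers: sequence of strings.
incorrect_answers: sequence of strings.
prompt: string.
chosen: string.
rejected: string.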
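
The feature list above can be made concrete with a small sketch. The record below is hypothetical (its values are invented for illustration and are not taken from the dataset), and the helper names `to_preference_triple`, `rank_completions`, and `load_train_split` are assumptions of this example, not part of any published tooling. The loader assumes the `datasets` library is installed and that the repository is still reachable on the Hugging Face Hub.

```python
from typing import Any

# Hypothetical record shaped like the feature list; values are invented.
sample_record: dict[str, Any] = {
    "source": "example_source",
    "instruction": "What is the capital of France?",
    "models": ["model-a", "model-b"],
    "completions": [
        {"model": "model-a", "response": "Paris.", "overall_score": 9.0},
        {"model": "model-b", "response": "Lyon.", "overall_score": 2.0},
    ],
    "correct_answers": ["Paris"],
    "incorrect_answers": ["Lyon"],
    "prompt": "What is the capital of France?",
    "chosen": "Paris.",
    "rejected": "Lyon.",
}


def to_preference_triple(record: dict[str, Any]) -> dict[str, str]:
    """Keep only the three string fields a preference-tuning step would read."""
    return {key: record[key] for key in ("prompt", "chosen", "rejected")}


def rank_completions(record: dict[str, Any]) -> list[dict[str, Any]]:
    """Sort the nested completions from highest to lowest overall_score."""
    return sorted(
        record["completions"], key=lambda c: c["overall_score"], reverse=True
    )


def load_train_split():
    """Load the real train split (needs network access and `pip install datasets`)."""
    from datasets import load_dataset

    return load_dataset("mcding-org/CorrectDPO-Dataset-U30", split="train")


print(to_preference_triple(sample_record))
print([c["model"] for c in rank_completions(sample_record)])
```

Only `prompt`, `chosen`, and `rejected` are read by `to_preference_triple`; the remaining columns carry the annotations and scores attached to each completion.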

Dataset splits

train: 26901 examples, 377483827 bytes.
eval: 2986 examples, 41465252 bytes.
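
As a quick illustration of what these split figures imply, the sketch below computes each split's share of the examples and its average size per example. The numbers are copied from the listing above; the helper name `split_summary` is an assumption of this example.

```python
# Example counts and byte sizes copied from the split listing.
splits = {
    "train": {"num_examples": 26901, "num_bytes": 377483827},
    "eval": {"num_examples": 2986, "num_bytes": 41465252},
}

total_examples = sum(s["num_examples"] for s in splits.values())


def split_summary(name: str) -> tuple[float, float]:
    """Return (share of all examples, average bytes per example) for a split."""
    s = splits[name]
    share = s["num_examples"] / total_examples
    bytes_per_example = s["num_bytes"] / s["num_examples"]
    return share, bytes_per_example


for name in splits:
    share, avg_bytes = split_summary(name)
    print(f"{name}: {share:.1%} of examples, about {avg_bytes / 1024:.1f} KiB each")
```

The two splits divide the examples roughly 90/10 between train and eval.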

Dataset size

Download size: 171068972 bytes.
Total dataset size: 418949079 bytes.
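
The size figures can be cross-checked with a little arithmetic. The sketch below verifies that the two split sizes add up to the total dataset size and computes the ratio of download size to dataset size. It assumes, as is usual for Hugging Face dataset metadata, that the download size refers to the stored files and the dataset size to the loaded data; the page itself does not say so.

```python
# Figures copied from the metadata on this page.
TRAIN_BYTES = 377483827
EVAL_BYTES = 41465252
DATASET_SIZE = 418949079
DOWNLOAD_SIZE = 171068972


def sizes_are_consistent() -> bool:
    """Check that the per-split byte counts sum to the reported total."""
    return TRAIN_BYTES + EVAL_BYTES == DATASET_SIZE


def download_ratio() -> float:
    """Ratio of download size to total dataset size."""
    return DOWNLOAD_SIZE / DATASET_SIZE


print("split sizes match total:", sizes_are_consistent())
print(f"download / dataset size: {download_ratio():.2f}")
```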

Configuration

default: defines the file paths for the train and eval data.
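
To show how the `default` configuration ties files to splits, the sketch below matches the two path patterns from the metadata (`data/train-*` and `data/eval-*`) against a file list. The shard names are hypothetical, since the actual file names in the repository are not listed on this page, and `files_for_split` is a helper invented for this example.

```python
from fnmatch import fnmatch

# Path patterns copied from the `default` config in the metadata.
data_files = {"train": "data/train-*", "eval": "data/eval-*"}

# Hypothetical repository listing; real shard names may differ.
repo_files = [
    "README.md",
    "data/train-00000-of-00002.parquet",
    "data/train-00001-of-00002.parquet",
    "data/eval-00000-of-00001.parquet",
]


def files_for_split(split: str, files: list[str]) -> list[str]:
    """Return the files whose path matches the split's glob pattern."""
    pattern = data_files[split]
    return [path for path in files if fnmatch(path, pattern)]


for split in data_files:
    print(split, files_for_split(split, repo_files))
```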