edinburgh-dawg/labelchaos

Name: edinburgh-dawg/labelchaos
Creator: edinburgh-dawg
Published: 2024-05-31 09:47:47
License: not specified

Source: Hugging Face (updated 2024-05-31; indexed 2024-06-12)

Download link:

https://hf-mirror.com/datasets/edinburgh-dawg/labelchaos

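Overview:

The dataset contains multiple configurations. The clean configuration merges six manually annotated datasets (OpenBookQA, ARC-Challenge, ARC-Easy, TruthfulQA, MedQA, and MathQA) into MMLU format. The dataset applies several data corruption strategies: wrong ground truth, no correct answer, multiple correct answers, unclear questions, and unclear options. Most configurations have train, test, and validation splits (the small configuration has only a test split). Features include the question, choices, answer, subject, and original dataset, plus the corruption type in the corrupted configurations.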
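To illustrate what merging a source dataset into MMLU format can involve, the sketch below converts one ARC-style record into the question/choices/answer/subject/original_dataset layout listed in this card. The input layout (parallel `text` and `label` lists under `choices`, plus `answerKey`), the helper name `to_mmlu_format`, and the sample record are assumptions for illustration; this is not the authors' conversion code.

```python
def to_mmlu_format(record, subject, original_dataset):
    """Convert an ARC-style record to an MMLU-style one (hypothetical helper)."""
    labels = record["choices"]["label"]
    texts = record["choices"]["text"]
    # MMLU-style records store the answer as an integer index into `choices`.
    answer_index = labels.index(record["answerKey"])
    return {
        "question": record["question"],
        "choices": list(texts),
        "answer": answer_index,
        "subject": subject,
        "original_dataset": original_dataset,
    }


# Made-up sample record, assumed to follow the ARC-style layout described above.
arc_style = {
    "question": "Which object is attracted by a magnet?",
    "choices": {
        "text": ["a wooden spoon", "an iron nail", "a glass cup", "a rubber band"],
        "label": ["A", "B", "C", "D"],
    },
    "answerKey": "B",
}

converted = to_mmlu_format(arc_style, subject="general", original_dataset="ARC-Easy")
print(converted)
```

Storing the answer as an index rather than a letter matches the int64 `answer` feature listed for every configuration.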

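Provider:

edinburgh-dawg

Summary of original information

Dataset overview

This dataset contains multiple configurations (configs). Each corresponds to a different variant of the dataset, with its own features and data splits. Details of each configuration follow: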

1. bad_options_clarity

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
Splits:
- train: 28730 examples, 13582226 bytes
- test: 8432 examples, 3104348 bytes
- validation: 7249 examples, 2691106 bytes
Download size: 9798546 bytes
Dataset size: 19377680 bytes

2. bad_questions_clarity

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
- llm_for_corruption: string
- original_question: string
Splits:
- train: 28730 examples, 18456693 bytes
- test: 8432 examples, 4422718 bytes
- validation: 7249 examples, 3916109 bytes
Download size: 13266776 bytes
Dataset size: 26795520 bytes

3. clean

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
Splits:
- train: 28730 examples, 13079684 bytes
- test: 8432 examples, 2953255 bytes
- validation: 7249 examples, 2557618 bytes
Download size: 9879285 bytes
Dataset size: 18590557 bytes

4. clean_subsampled

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
Splits:
- train: 29173 examples, 10934197.161861075 bytes
- test: 6758 examples, 2389029.595985832 bytes
- validation: 4076 examples, 1464083.072949581 bytes
Download size: 9509887 bytes
Dataset size: 14787309.830796488 bytes

5. multiple_correct_answers

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
- llm for corruption: string
- added_correct_answer: string
Splits:
- train: 28730 examples, 15352477 bytes
- test: 8432 examples, 3613882 bytes
- validation: 7249 examples, 3073950 bytes
Download size: 10862696 bytes
Dataset size: 22040309 bytes

6. no_correct_answer

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
- original_correct: string
Splits:
- train: 28730 examples, 14257614 bytes
- test: 8432 examples, 3298967 bytes
- validation: 7249 examples, 2854827 bytes
Download size: 10129114 bytes
Dataset size: 20411408 bytes

7. small

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
- llm_for_corruption: string
- original_question: string
- llm for corruption: string
- added_correct_answer: string
- original_correct: string
- original_grountruth: int64
Splits:
- test: 1632 examples, 704446.2903225806 bytes
Download size: 341020 bytes
Dataset size: 704446.2903225806 bytes

8. wrong_groundtruth

Features:
- question: string
- choices: sequence of strings
- answer: int64
- subject: string
- original_dataset: string
- corruptions: string
- original_grountruth: int64
Splits:
- train: 28730 examples, 13912854 bytes
- test: 8432 examples, 3197783 bytes
- validation: 7249 examples, 2767839 bytes
Download size: 9922795 bytes
Dataset size: 19878476 bytes
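The per-split byte counts listed above can be cross-checked against each configuration's reported dataset size. The sketch below does this arithmetic for the six configurations with integer byte counts; the numbers are copied from this listing, and the helper name `check_sizes` is only illustrative.

```python
# (train, test, validation) sizes in bytes, copied from the listing above.
SPLIT_BYTES = {
    "bad_options_clarity": (13582226, 3104348, 2691106),
    "bad_questions_clarity": (18456693, 4422718, 3916109),
    "clean": (13079684, 2953255, 2557618),
    "multiple_correct_answers": (15352477, 3613882, 3073950),
    "no_correct_answer": (14257614, 3298967, 2854827),
    "wrong_groundtruth": (13912854, 3197783, 2767839),
}

# Reported dataset size in bytes for the same configurations.
DATASET_BYTES = {
    "bad_options_clarity": 19377680,
    "bad_questions_clarity": 26795520,
    "clean": 18590557,
    "multiple_correct_answers": 22040309,
    "no_correct_answer": 20411408,
    "wrong_groundtruth": 19878476,
}


def check_sizes():
    """Return, per configuration, whether the split sizes sum to the dataset size."""
    return {
        name: sum(sizes) == DATASET_BYTES[name]
        for name, sizes in SPLIT_BYTES.items()
    }


results = check_sizes()
for name, ok in results.items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")

# These six configurations share the same split sizes in examples.
total_examples = 28730 + 8432 + 7249
print(total_examples)
```

The clean_subsampled and small configurations report fractional byte counts, so they are left out of this exact integer comparison.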

Split file paths

bad_options_clarity:
- train: bad_options_clarity/train-*
- test: bad_options_clarity/test-*
- validation: bad_options_clarity/validation-*
bad_questions_clarity:
- train: bad_questions_clarity/train-*
- test: bad_questions_clarity/test-*
- validation: bad_questions_clarity/validation-*
clean:
- train: clean/train-*
- test: clean/test-*
- validation: clean/validation-*
clean_subsampled:
- train: clean_subsampled/train-*
- test: clean_subsampled/test-*
- validation: clean_subsampled/validation-*
multiple_correct_answers:
- train: multiple_correct_answers/train-*
- test: multiple_correct_answers/test-*
- validation: multiple_correct_answers/validation-*
no_correct_answer:
- train: no_correct_answer/train-*
- test: no_correct_answer/test-*
- validation: no_correct_answer/validation-*
small:
- test: small/test-*
wrong_groundtruth:
- train: wrong_groundtruth/train-*
- test: wrong_groundtruth/test-*
- validation: wrong_groundtruth/validation-*
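A minimal sketch of how the configuration names and split paths above might be used with the Hugging Face `datasets` library. It assumes the repository id `edinburgh-dawg/labelchaos` (taken from the download link) resolves on the Hub and that `datasets` is installed; `split_pattern` and `load_split` are hypothetical helper names.

```python
# Splits available per configuration, as listed above.
CONFIG_SPLITS = {
    "bad_options_clarity": ["train", "test", "validation"],
    "bad_questions_clarity": ["train", "test", "validation"],
    "clean": ["train", "test", "validation"],
    "clean_subsampled": ["train", "test", "validation"],
    "multiple_correct_answers": ["train", "test", "validation"],
    "no_correct_answer": ["train", "test", "validation"],
    "small": ["test"],
    "wrong_groundtruth": ["train", "test", "validation"],
}


def split_pattern(config, split):
    """Return the file glob for a split, e.g. 'clean/train-*'."""
    if config not in CONFIG_SPLITS:
        raise ValueError(f"unknown config: {config}")
    if split not in CONFIG_SPLITS[config]:
        raise ValueError(f"config {config!r} has no {split!r} split")
    return f"{config}/{split}-*"


def load_split(config, split):
    """Download one split from the Hub (requires network access)."""
    # Imported lazily so the helpers above work without `datasets` installed.
    from datasets import load_dataset

    split_pattern(config, split)  # validates the config/split combination
    return load_dataset("edinburgh-dawg/labelchaos", config, split=split)


print(split_pattern("clean", "train"))
```

For example, `load_split("small", "test")` would fetch the only split of the small configuration, while `load_split("small", "train")` raises a `ValueError` before any download is attempted.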

Data sources

Original datasets:
- OpenBookQA (general)
- ARC-Challenge (general)
- ARC-Easy (general)
- TruthfulQA (mix)
- MedQA (medical)
- MathQA (math)

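Corruption strategies in the dataset

Wrong groundtruth: randomly select an incorrect answer choice and modify the example accordingly.
No correct answer: replace the correct answer with "every option listed".
Multiple correct answers: generate a new correct answer with the same meaning as the original correct answer. An LLM is used for this.
Bad question clarity: use an LLM to generate a new question with the same meaning as the original question.
Bad options clarity: split an incorrect option into two options. This is a common corruption in multiple-choice questions, where an incorrect option is split in two during parsing. Here the corruption is applied at random to one of the incorrect options.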
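The three strategies that do not need an LLM can be sketched on a single MMLU-style record as follows. This is an illustration under stated assumptions, not the authors' implementation: the values written to `corruptions`, the placeholder wording, the word-boundary split, and the choice to shift the answer index after a split are all assumptions. Field names such as `original_grountruth` are spelled as in the feature lists.

```python
import random


def wrong_groundtruth(example, rng):
    """Point `answer` at a randomly chosen incorrect option."""
    wrong = [i for i in range(len(example["choices"])) if i != example["answer"]]
    corrupted = dict(example)
    corrupted["original_grountruth"] = example["answer"]
    corrupted["answer"] = rng.choice(wrong)
    corrupted["corruptions"] = "wrong_groundtruth"  # assumed label value
    return corrupted


def no_correct_answer(example, placeholder="every option listed"):
    """Replace the correct option's text with a placeholder (wording assumed)."""
    choices = list(example["choices"])
    corrupted = dict(example)
    corrupted["original_correct"] = choices[example["answer"]]
    choices[example["answer"]] = placeholder
    corrupted["choices"] = choices
    corrupted["corruptions"] = "no_correct_answer"  # assumed label value
    return corrupted


def bad_options_clarity(example, rng):
    """Split one randomly chosen incorrect option into two options."""
    choices = list(example["choices"])
    # Only incorrect options with at least two words can be split here.
    candidates = [
        i for i, text in enumerate(choices)
        if i != example["answer"] and len(text.split()) >= 2
    ]
    corrupted = dict(example)
    if not candidates:
        return corrupted  # nothing to split; record returned unchanged
    target = rng.choice(candidates)
    words = choices[target].split()
    cut = rng.randint(1, len(words) - 1)
    choices[target:target + 1] = [" ".join(words[:cut]), " ".join(words[cut:])]
    # Assumption: keep `answer` pointing at the same text after the insertion.
    if target < example["answer"]:
        corrupted["answer"] = example["answer"] + 1
    corrupted["choices"] = choices
    corrupted["corruptions"] = "bad_options_clarity"  # assumed label value
    return corrupted


# Made-up sample record in the listed MMLU-style layout.
example = {
    "question": "What is 2 + 2?",
    "choices": ["three apples", "four", "five pears", "six plums"],
    "answer": 1,
    "subject": "math",
    "original_dataset": "MathQA",
}
rng = random.Random(0)  # seeded for reproducibility

wg = wrong_groundtruth(example, rng)
nca = no_correct_answer(example)
boc = bad_options_clarity(example, rng)
print(wg["answer"], nca["choices"], boc["choices"])
```

Each function returns a new record and leaves the input untouched, so several corruptions can be derived from the same clean example.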
