MLP-SEMO/eval_datasets

Source: Hugging Face. Updated 2024-05-24; indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/MLP-SEMO/eval_datasets
Resource overview:
---
dataset_info:
- config_name: BookSum
  features:
  - name: output
    dtype: string
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 93606936
    num_examples: 1484
  download_size: 54478608
  dataset_size: 93606936
- config_name: BoolQ
  features:
  - name: instruction
    dtype: string
  - name: context
    dtype: string
  - name: output
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 6891356
    num_examples: 3270
  download_size: 3892421
  dataset_size: 6891356
- config_name: CNN-DM
  features:
  - name: context
    dtype: string
  - name: output
    dtype: string
  - name: instruction
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 167811674
    num_examples: 13368
  download_size: 98983832
  dataset_size: 167811674
- config_name: CosmosQA
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 5700702
    num_examples: 2985
  download_size: 2675997
  dataset_size: 5700702
- config_name: DROP
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 33717950
    num_examples: 9535
  download_size: 2129159
  dataset_size: 33717950
- config_name: GovReport
  features:
  - name: context
    dtype: string
  - name: output
    dtype: string
  - name: instruction
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 161869948
    num_examples: 973
  download_size: 76684247
  dataset_size: 161869948
- config_name: HotpotQA
  features:
  - name: instruction
    dtype: string
  - name: output
    dtype: string
  - name: context
    sequence: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 130649377
    num_examples: 7405
  download_size: 77167918
  dataset_size: 130649377
- config_name: ReCoRD
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 36350165
    num_examples: 10000
  download_size: 16728065
  dataset_size: 36350165
- config_name: SQuAD
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 28976334
    num_examples: 10570
  download_size: 4267576
  dataset_size: 28976334
- config_name: XSum
  features:
  - name: context
    dtype: string
  - name: output
    dtype: string
  - name: instruction
    dtype: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 78896418
    num_examples: 11332
  download_size: 49124930
  dataset_size: 78896418
- config_name: infbench-choice
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    sequence: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 208601113
    num_examples: 119
  download_size: 129856115
  dataset_size: 208601113
- config_name: infbench-qa
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    sequence: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 354313400
    num_examples: 190
  download_size: 219611755
  dataset_size: 354313400
- config_name: infbench-sum
  features:
  - name: context
    dtype: string
  - name: instruction
    dtype: string
  - name: output
    sequence: string
  - name: instruction_sentence
    sequence: string
  - name: context_sentence
    sequence: string
  - name: context_sentences
    sequence: string
  splits:
  - name: validation
    num_bytes: 102964962
    num_examples: 62
  download_size: 63669989
  dataset_size: 102964962
configs:
- config_name: BookSum
  data_files:
  - split: validation
    path: BookSum/validation-*
- config_name: BoolQ
  data_files:
  - split: validation
    path: BoolQ/validation-*
- config_name: CNN-DM
  data_files:
  - split: validation
    path: CNN-DM/validation-*
- config_name: CosmosQA
  data_files:
  - split: validation
    path: CosmosQA/validation-*
- config_name: DROP
  data_files:
  - split: validation
    path: DROP/validation-*
- config_name: GovReport
  data_files:
  - split: validation
    path: GovReport/validation-*
- config_name: HotpotQA
  data_files:
  - split: validation
    path: HotpotQA/validation-*
- config_name: ReCoRD
  data_files:
  - split: validation
    path: ReCoRD/validation-*
- config_name: SQuAD
  data_files:
  - split: validation
    path: SQuAD/validation-*
- config_name: XSum
  data_files:
  - split: validation
    path: XSum/validation-*
- config_name: infbench-choice
  data_files:
  - split: validation
    path: infbench-choice/validation-*
- config_name: infbench-qa
  data_files:
  - split: validation
    path: infbench-qa/validation-*
- config_name: infbench-sum
  data_files:
  - split: validation
    path: infbench-sum/validation-*
---

The dataset provides 13 configurations (BookSum, BoolQ, CNN-DM, CosmosQA, DROP, GovReport, HotpotQA, ReCoRD, SQuAD, XSum, infbench-choice, infbench-qa, and infbench-sum), each with a single validation split. Every configuration exposes the same six features: output, context, instruction, and the sentence sequences instruction_sentence, context_sentence, and context_sentences. For each configuration's validation split, the metadata lists the size in bytes, the number of examples, and the download size.
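As a minimal sketch of how one configuration might be loaded, the snippet below uses `load_dataset` from the Hugging Face `datasets` library. The repository ID and configuration names come from the metadata above; the helper `load_validation` and the constant `CONFIG_NAMES` are hypothetical names introduced for illustration, and the call assumes the repository is reachable from your environment.

```python
# Configuration names copied from the dataset card metadata.
CONFIG_NAMES = [
    "BookSum", "BoolQ", "CNN-DM", "CosmosQA", "DROP", "GovReport",
    "HotpotQA", "ReCoRD", "SQuAD", "XSum",
    "infbench-choice", "infbench-qa", "infbench-sum",
]


def load_validation(config_name: str, repo_id: str = "MLP-SEMO/eval_datasets"):
    """Load the validation split of one configuration (hypothetical helper)."""
    if config_name not in CONFIG_NAMES:
        raise ValueError(f"unknown config: {config_name!r}")
    # Imported lazily so the name check works even without the library installed.
    from datasets import load_dataset

    # Each configuration defines only a "validation" split.
    return load_dataset(repo_id, config_name, split="validation")


# Example usage (downloads data, so it is left commented out):
# ds = load_validation("BoolQ")
# print(ds.num_rows, ds.column_names)
```

Checking the name before calling the library is a design choice for this sketch: it fails fast on typos instead of triggering a network request.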
Provider:
MLP-SEMO
Summary of original information

Dataset overview

BookSum

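  • Features: output, context, instruction, instruction_sentence, context_sentence, context_sentences
  • Data types: string (output, context, instruction); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 93,606,936 bytes
    • Examples: 1,484
    • Download size: 54,478,608 bytes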

BoolQ

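  • Features: instruction, context, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (instruction, context, output); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 6,891,356 bytes
    • Examples: 3,270
    • Download size: 3,892,421 bytes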

CNN-DM

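  • Features: context, output, instruction, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, output, instruction); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 167,811,674 bytes
    • Examples: 13,368
    • Download size: 98,983,832 bytes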

CosmosQA

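  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction, output); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 5,700,702 bytes
    • Examples: 2,985
    • Download size: 2,675,997 bytes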

DROP

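  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction, output); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 33,717,950 bytes
    • Examples: 9,535
    • Download size: 2,129,159 bytes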

GovReport

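  • Features: context, output, instruction, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, output, instruction); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 161,869,948 bytes
    • Examples: 973
    • Download size: 76,684,247 bytes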

HotpotQA

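  • Features: instruction, output, context, instruction_sentence, context_sentence, context_sentences
  • Data types: string (instruction, output); sequence of strings (context, instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 130,649,377 bytes
    • Examples: 7,405
    • Download size: 77,167,918 bytes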
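
The card metadata types HotpotQA's context as a sequence of strings, whereas most other configurations store it as a single string; the infbench configurations likewise type output as a sequence. Code that iterates over several configurations may therefore want to normalize these fields. The helper below is a hypothetical sketch: the name `as_text` and the choice of a newline separator are assumptions, not part of the dataset.

```python
from typing import List, Union


def as_text(value: Union[str, List[str]], sep: str = "\n") -> str:
    """Return one string whether the field is a string or a list of strings."""
    if isinstance(value, str):
        return value
    # Assumption: joining passages with a newline is acceptable for the task.
    return sep.join(value)


# Works for both shapes of the "context" field.
print(as_text("A single passage."))
print(as_text(["First passage.", "Second passage."]))
```

For multi-reference fields such as the infbench output, joining is probably the wrong choice when scoring; keeping the list and comparing against each reference may be more appropriate.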

ReCoRD

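  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction, output); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 36,350,165 bytes
    • Examples: 10,000
    • Download size: 16,728,065 bytes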

SQuAD

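  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction, output); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 28,976,334 bytes
    • Examples: 10,570
    • Download size: 4,267,576 bytes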

XSum

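  • Features: context, output, instruction, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, output, instruction); sequence of strings (instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 78,896,418 bytes
    • Examples: 11,332
    • Download size: 49,124,930 bytes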

infbench-choice

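  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction); sequence of strings (output, instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 208,601,113 bytes
    • Examples: 119
    • Download size: 129,856,115 bytes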

infbench-qa

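  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction); sequence of strings (output, instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 354,313,400 bytes
    • Examples: 190
    • Download size: 219,611,755 bytes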

infbench-sum

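  • Features: context, instruction, output, instruction_sentence, context_sentence, context_sentences
  • Data types: string (context, instruction); sequence of strings (output, instruction_sentence, context_sentence, context_sentences)
  • Validation split:
    • Size: 102,964,962 bytes
    • Examples: 62
    • Download size: 63,669,989 bytes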
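
The per-configuration figures above can be compared more easily as derived quantities. The sketch below computes the average uncompressed bytes per example and the ratio of download size to dataset size for each configuration. The numbers are copied from the listing; the names `STATS` and `summarize` are hypothetical, and bytes per example is only a rough proxy for input length because each record also stores the sentence-split copies of its text.

```python
# (size in bytes, number of examples, download size in bytes) per validation split.
STATS = {
    "BookSum": (93_606_936, 1_484, 54_478_608),
    "BoolQ": (6_891_356, 3_270, 3_892_421),
    "CNN-DM": (167_811_674, 13_368, 98_983_832),
    "CosmosQA": (5_700_702, 2_985, 2_675_997),
    "DROP": (33_717_950, 9_535, 2_129_159),
    "GovReport": (161_869_948, 973, 76_684_247),
    "HotpotQA": (130_649_377, 7_405, 77_167_918),
    "ReCoRD": (36_350_165, 10_000, 16_728_065),
    "SQuAD": (28_976_334, 10_570, 4_267_576),
    "XSum": (78_896_418, 11_332, 49_124_930),
    "infbench-choice": (208_601_113, 119, 129_856_115),
    "infbench-qa": (354_313_400, 190, 219_611_755),
    "infbench-sum": (102_964_962, 62, 63_669_989),
}


def summarize(stats):
    """Return one row per config, sorted by average bytes per example (descending)."""
    rows = []
    for name, (num_bytes, num_examples, download_size) in stats.items():
        rows.append({
            "config": name,
            "avg_bytes_per_example": num_bytes / num_examples,
            "download_ratio": download_size / num_bytes,
        })
    return sorted(rows, key=lambda r: r["avg_bytes_per_example"], reverse=True)


rows = summarize(STATS)
for row in rows:
    print(f"{row['config']:<16}{row['avg_bytes_per_example']:>14,.0f}"
          f"{row['download_ratio']:>8.2f}")

total_examples = sum(n for _, n, _ in STATS.values())
print("total examples:", total_examples)
```

Sorted this way, the three infbench configurations come first, each averaging well over a megabyte per example, while the short-context QA sets such as CosmosQA and BoolQ average only a few kilobytes.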