
mcarthuradal/malawi

Source: Hugging Face. Updated 2024-03-13; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/mcarthuradal/malawi
Resource description:
---
dataset_info:
- config_name: booklets
  features:
  - name: content
    dtype: string
  - name: index
    dtype: int64
  splits:
  - name: bk0
    num_bytes: 154222
    num_examples: 695
  - name: bk1
    num_bytes: 203638
    num_examples: 941
  - name: bk2
    num_bytes: 300902
    num_examples: 2064
  - name: bk3
    num_bytes: 180070
    num_examples: 740
  - name: bk4
    num_bytes: 45812
    num_examples: 270
  - name: bk5
    num_bytes: 460592
    num_examples: 515
  download_size: 6325415
  dataset_size: 1798680
- config_name: context
  features:
  - name: Section
    dtype: string
  - name: Heading
    dtype: string
  - name: Paragraph
    dtype: string
  - name: 'No'
    dtype: int64
  splits:
  - name: bk0
    num_bytes: 204564
    num_examples: 695
  - name: bk1
    num_bytes: 272474
    num_examples: 939
  - name: bk2
    num_bytes: 441129
    num_examples: 2064
  - name: bk3
    num_bytes: 253362
    num_examples: 740
  - name: bk4
    num_bytes: 59109
    num_examples: 270
  - name: bk5
    num_bytes: 452144
    num_examples: 515
  download_size: 4734383
  dataset_size: 1682782
- config_name: default
  features:
  - name: Question Text
    dtype: string
  - name: Question Answer
    dtype: string
  - name: Reference Document
    dtype: string
  - name: Paragraph(s) Number
    dtype: string
  - name: Keywords
    dtype: string
  - name: ID
    dtype: string
  splits:
  - name: train
    num_bytes: 345473
    num_examples: 748
  - name: test
    num_bytes: 71682
    num_examples: 499
  download_size: 2803571
  dataset_size: 417155
- config_name: individual
  features:
  - name: ID
    dtype: string
  - name: Question Text
    dtype: string
  - name: Question Answer
    dtype: string
  - name: Reference Document
    dtype: string
  - name: Paragraph(s) Number
    dtype: string
  - name: Individual Numbers
    dtype: string
  - name: Keywords
    dtype: string
  splits:
  - name: train
    num_bytes: 356169
    num_examples: 748
  download_size: 179442
  dataset_size: 356169
- config_name: merged
  features:
  - name: Question Text
    dtype: string
  - name: Context
    dtype: string
  - name: Question Answer
    dtype: string
  - name: Reference Document
    dtype: string
  - name: Paragraph(s) Number
    dtype: string
  - name: Keywords
    dtype: string
  - name: ID
    dtype: string
  - name: Individual Numbers
    sequence: int64
  - name: __index_level_0__
    dtype: int64
  splits:
  - name: train
    num_bytes: 442161
    num_examples: 748
  download_size: 210107
  dataset_size: 442161
configs:
- config_name: booklets
  data_files:
  - split: bk0
    path: booklets/bk0-*
  - split: bk1
    path: booklets/bk1-*
  - split: bk2
    path: booklets/bk2-*
  - split: bk3
    path: booklets/bk3-*
  - split: bk4
    path: booklets/bk4-*
  - split: bk5
    path: booklets/bk5-*
- config_name: context
  data_files:
  - split: bk0
    path: context/bk0-*
  - split: bk1
    path: context/bk1-*
  - split: bk2
    path: context/bk2-*
  - split: bk3
    path: context/bk3-*
  - split: bk4
    path: context/bk4-*
  - split: bk5
    path: context/bk5-*
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: test
    path: data/test-*
- config_name: individual
  data_files:
  - split: train
    path: individual/train-*
- config_name: merged
  data_files:
  - split: train
    path: merged/train-*
---
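The config and split names declared in the metadata above correspond to the `name` and `split` arguments of `load_dataset` in the Hugging Face `datasets` library. The following is a minimal sketch, assuming that library is installed and the repository is reachable; `CONFIG_SPLITS` and `load_malawi` are hypothetical helper names introduced here for illustration and are not part of the dataset.

```python
# Config -> split names, copied from the dataset_info metadata.
CONFIG_SPLITS = {
    "booklets": ["bk0", "bk1", "bk2", "bk3", "bk4", "bk5"],
    "context": ["bk0", "bk1", "bk2", "bk3", "bk4", "bk5"],
    "default": ["train", "test"],
    "individual": ["train"],
    "merged": ["train"],
}


def load_malawi(config: str = "default", split: str = "train"):
    """Load one split of one config (hypothetical helper; needs network access)."""
    if split not in CONFIG_SPLITS.get(config, []):
        raise ValueError(f"unknown config/split: {config}/{split}")
    # Imported lazily so the validation above works without the library installed.
    from datasets import load_dataset

    return load_dataset("mcarthuradal/malawi", config, split=split)
```

For example, `load_malawi("context", "bk2")` would request the `bk2` split of the `context` config, while `load_malawi("merged", "test")` raises `ValueError` because `merged` declares only a `train` split.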
Provider:
mcarthuradal
Summary of original information

Dataset details

Config: booklets

  • Features:
    • content: string
    • index: integer
  • Splits:
    • bk0: 154222 bytes, 695 examples
    • bk1: 203638 bytes, 941 examples
    • bk2: 300902 bytes, 2064 examples
    • bk3: 180070 bytes, 740 examples
    • bk4: 45812 bytes, 270 examples
    • bk5: 460592 bytes, 515 examples
  • Download size: 6325415 bytes
  • Dataset size: 1798680 bytes

Config: context

  • Features:
    • Section: string
    • Heading: string
    • Paragraph: string
    • No: integer
  • Splits:
    • bk0: 204564 bytes, 695 examples
    • bk1: 272474 bytes, 939 examples
    • bk2: 441129 bytes, 2064 examples
    • bk3: 253362 bytes, 740 examples
    • bk4: 59109 bytes, 270 examples
    • bk5: 452144 bytes, 515 examples
  • Download size: 4734383 bytes
  • Dataset size: 1682782 bytes
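
For the context config, the declared dataset size equals the sum of the per-split byte counts. A small sketch of that arithmetic follows; the dictionaries only restate the figures listed above, and the variable names are illustrative.

```python
# Per-split sizes (bytes) and example counts for the context config, as listed above.
context_split_bytes = {
    "bk0": 204564,
    "bk1": 272474,
    "bk2": 441129,
    "bk3": 253362,
    "bk4": 59109,
    "bk5": 452144,
}
context_split_examples = {
    "bk0": 695,
    "bk1": 939,
    "bk2": 2064,
    "bk3": 740,
    "bk4": 270,
    "bk5": 515,
}
declared_dataset_size = 1682782

total_bytes = sum(context_split_bytes.values())
total_examples = sum(context_split_examples.values())

print(total_bytes)     # 1682782
print(total_examples)  # 5223
assert total_bytes == declared_dataset_size
```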

Config: default

  • Features:
    • Question Text: string
    • Question Answer: string
    • Reference Document: string
    • Paragraph(s) Number: string
    • Keywords: string
    • ID: string
  • Splits:
    • train: 345473 bytes, 748 examples
    • test: 71682 bytes, 499 examples
  • Download size: 2803571 bytes
  • Dataset size: 417155 bytes

Config: individual

  • Features:
    • ID: string
    • Question Text: string
    • Question Answer: string
    • Reference Document: string
    • Paragraph(s) Number: string
    • Individual Numbers: string
    • Keywords: string
  • Splits:
    • train: 356169 bytes, 748 examples
  • Download size: 179442 bytes
  • Dataset size: 356169 bytes

Config: merged

  • Features:
    • Question Text: string
    • Context: string
    • Question Answer: string
    • Reference Document: string
    • Paragraph(s) Number: string
    • Keywords: string
    • ID: string
    • Individual Numbers: sequence of integers
    • __index_level_0__: integer
  • Splits:
    • train: 442161 bytes, 748 examples
  • Download size: 210107 bytes
  • Dataset size: 442161 bytes
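
The source does not say how the merged config was produced. One plausible reading of the schema is that `Individual Numbers` lists paragraph numbers (the `No` field of the context config) whose text is concatenated into `Context`. The sketch below works under that assumption only; `build_context` is a hypothetical helper, and the rows are toy data invented for illustration, not taken from the dataset.

```python
def build_context(paragraph_numbers, context_rows):
    """Join the paragraphs whose 'No' appears in paragraph_numbers, in that order."""
    by_no = {row["No"]: row["Paragraph"] for row in context_rows}
    # Numbers with no matching row are skipped rather than raising.
    return "\n".join(by_no[n] for n in paragraph_numbers if n in by_no)


# Toy rows, invented for illustration only (not taken from the dataset).
context_rows = [
    {"No": 1, "Paragraph": "First paragraph."},
    {"No": 2, "Paragraph": "Second paragraph."},
    {"No": 3, "Paragraph": "Third paragraph."},
]
question = {"Question Text": "Example question?", "Individual Numbers": [1, 3]}

# Attach the assembled context to the question record.
merged = {
    **question,
    "Context": build_context(question["Individual Numbers"], context_rows),
}
print(merged["Context"])
```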