
booydar/babilong-1k-samples

Source: Hugging Face. Updated 2024-05-21. Indexed 2024-05-25.
Download link:
https://hf-mirror.com/datasets/booydar/babilong-1k-samples
Resource description:
---
language:
- en
dataset_info:
- config_name: 0k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 214511
    num_examples: 1000
  - name: qa2
    num_bytes: 497258
    num_examples: 999
  - name: qa3
    num_bytes: 1515195
    num_examples: 999
  - name: qa4
    num_bytes: 118279
    num_examples: 999
  - name: qa5
    num_bytes: 617596
    num_examples: 999
  download_size: 355443
  dataset_size: 2962839
- config_name: 128k
  features:
  - name: target
    dtype: string
  - name: question
    dtype: string
  - name: input
    dtype: string
  splits:
  - name: qa1
    num_bytes: 507056606
    num_examples: 1000
  - name: qa2
    num_bytes: 506895155
    num_examples: 999
  - name: qa3
    num_bytes: 506392085
    num_examples: 999
  - name: qa4
    num_bytes: 505933273
    num_examples: 999
  - name: qa5
    num_bytes: 506678193
    num_examples: 999
  download_size: 1567936012
  dataset_size: 2532955312
- config_name: 16k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 61776253
    num_examples: 1000
  - name: qa2
    num_bytes: 61918118
    num_examples: 999
  - name: qa3
    num_bytes: 62127205
    num_examples: 999
  - name: qa4
    num_bytes: 61819981
    num_examples: 999
  - name: qa5
    num_bytes: 61618082
    num_examples: 999
  download_size: 191994799
  dataset_size: 309259639
- config_name: 1k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 2801155
    num_examples: 1000
  - name: qa2
    num_bytes: 2836748
    num_examples: 999
  - name: qa3
    num_bytes: 2586775
    num_examples: 862
  - name: qa4
    num_bytes: 2780635
    num_examples: 999
  - name: qa5
    num_bytes: 2833684
    num_examples: 997
  download_size: 8143277
  dataset_size: 13838997
- config_name: 2k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 6732635
    num_examples: 1000
  - name: qa2
    num_bytes: 6726012
    num_examples: 999
  - name: qa3
    num_bytes: 6915887
    num_examples: 998
  - name: qa4
    num_bytes: 6657774
    num_examples: 999
  - name: qa5
    num_bytes: 6717935
    num_examples: 999
  download_size: 20623714
  dataset_size: 33750243
- config_name: 32k
  features:
  - name: question
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  splits:
  - name: qa1
    num_bytes: 125475409
    num_examples: 1000
  - name: qa2
    num_bytes: 125188567
    num_examples: 999
  - name: qa3
    num_bytes: 125820515
    num_examples: 999
  - name: qa4
    num_bytes: 125548589
    num_examples: 999
  - name: qa5
    num_bytes: 125758751
    num_examples: 999
  download_size: 389385950
  dataset_size: 627791831
- config_name: 4k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 14544692
    num_examples: 1000
  - name: qa2
    num_bytes: 14490282
    num_examples: 999
  - name: qa3
    num_bytes: 14809504
    num_examples: 999
  - name: qa4
    num_bytes: 14373460
    num_examples: 999
  - name: qa5
    num_bytes: 14626210
    num_examples: 999
  download_size: 45139181
  dataset_size: 72844148
- config_name: 64k
  features:
  - name: question
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  splits:
  - name: qa1
    num_bytes: 252925262
    num_examples: 1000
  - name: qa2
    num_bytes: 252376557
    num_examples: 999
  - name: qa3
    num_bytes: 252406388
    num_examples: 999
  - name: qa4
    num_bytes: 251983216
    num_examples: 999
  - name: qa5
    num_bytes: 252531238
    num_examples: 999
  download_size: 783464022
  dataset_size: 1262222661
- config_name: 8k
  features:
  - name: target
    dtype: string
  - name: input
    dtype: string
  - name: question
    dtype: string
  splits:
  - name: qa1
    num_bytes: 30154491
    num_examples: 1000
  - name: qa2
    num_bytes: 29997147
    num_examples: 999
  - name: qa3
    num_bytes: 30237437
    num_examples: 999
  - name: qa4
    num_bytes: 30289396
    num_examples: 999
  - name: qa5
    num_bytes: 30114676
    num_examples: 999
  download_size: 93474610
  dataset_size: 150793147
configs:
- config_name: 0k
  data_files:
  - split: qa1
    path: 0k/qa1-*
  - split: qa2
    path: 0k/qa2-*
  - split: qa3
    path: 0k/qa3-*
  - split: qa4
    path: 0k/qa4-*
  - split: qa5
    path: 0k/qa5-*
- config_name: 128k
  data_files:
  - split: qa1
    path: 128k/qa1-*
  - split: qa2
    path: 128k/qa2-*
  - split: qa3
    path: 128k/qa3-*
  - split: qa4
    path: 128k/qa4-*
  - split: qa5
    path: 128k/qa5-*
- config_name: 16k
  data_files:
  - split: qa1
    path: 16k/qa1-*
  - split: qa2
    path: 16k/qa2-*
  - split: qa3
    path: 16k/qa3-*
  - split: qa4
    path: 16k/qa4-*
  - split: qa5
    path: 16k/qa5-*
- config_name: 1k
  data_files:
  - split: qa1
    path: 1k/qa1-*
  - split: qa2
    path: 1k/qa2-*
  - split: qa3
    path: 1k/qa3-*
  - split: qa4
    path: 1k/qa4-*
  - split: qa5
    path: 1k/qa5-*
- config_name: 2k
  data_files:
  - split: qa1
    path: 2k/qa1-*
  - split: qa2
    path: 2k/qa2-*
  - split: qa3
    path: 2k/qa3-*
  - split: qa4
    path: 2k/qa4-*
  - split: qa5
    path: 2k/qa5-*
- config_name: 32k
  data_files:
  - split: qa1
    path: 32k/qa1-*
  - split: qa2
    path: 32k/qa2-*
  - split: qa3
    path: 32k/qa3-*
  - split: qa4
    path: 32k/qa4-*
  - split: qa5
    path: 32k/qa5-*
- config_name: 4k
  data_files:
  - split: qa1
    path: 4k/qa1-*
  - split: qa2
    path: 4k/qa2-*
  - split: qa3
    path: 4k/qa3-*
  - split: qa4
    path: 4k/qa4-*
  - split: qa5
    path: 4k/qa5-*
- config_name: 64k
  data_files:
  - split: qa1
    path: 64k/qa1-*
  - split: qa2
    path: 64k/qa2-*
  - split: qa3
    path: 64k/qa3-*
  - split: qa4
    path: 64k/qa4-*
  - split: qa5
    path: 64k/qa5-*
- config_name: 8k
  data_files:
  - split: qa1
    path: 8k/qa1-*
  - split: qa2
    path: 8k/qa2-*
  - split: qa3
    path: 8k/qa3-*
  - split: qa4
    path: 8k/qa4-*
  - split: qa5
    path: 8k/qa5-*
---

The dataset provides nine configurations (0k, 128k, 16k, 1k, 2k, 32k, 4k, 64k, 8k). Each configuration has three features, target, input, and question, all of dtype string, and five splits (qa1, qa2, qa3, qa4, qa5). The metadata lists the size in bytes and the number of examples for every split, along with the download size and dataset size of each configuration. Data files are located by one path pattern per split, of the form <config>/<split>-* (for example, 0k/qa1-*).
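The configuration and split layout described above can be sketched in Python as follows. This is a minimal illustration rather than an official loader: the helper names `data_file_pattern` and `load_split` are hypothetical, and `load_split` assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the name given in the page title.

```python
from itertools import product

# Repository name as shown in the page title.
REPO_ID = "booydar/babilong-1k-samples"
# Configuration and split names as listed in the metadata.
CONFIGS = ["0k", "128k", "16k", "1k", "2k", "32k", "4k", "64k", "8k"]
SPLITS = ["qa1", "qa2", "qa3", "qa4", "qa5"]


def data_file_pattern(config: str, split: str) -> str:
    """Return the path pattern the metadata lists for one (config, split) pair."""
    if config not in CONFIGS or split not in SPLITS:
        raise ValueError(f"unknown config/split: {config}/{split}")
    return f"{config}/{split}-*"


def load_split(config: str, split: str):
    """Load one split from the Hub (hypothetical helper; needs network access)."""
    # Imported lazily so the rest of the sketch runs without `datasets`.
    from datasets import load_dataset

    data_file_pattern(config, split)  # validates the names, raises on typos
    return load_dataset(REPO_ID, config, split=split)


# Enumerate every (config, split) pair: 9 configurations x 5 splits.
all_patterns = [data_file_pattern(c, s) for c, s in product(CONFIGS, SPLITS)]
print(len(all_patterns))  # 45
print(all_patterns[0])    # 0k/qa1-*
```

Calling `load_split("1k", "qa1")` would then be expected to return a dataset with the three string columns target, input, and question, although that call is not exercised here because it downloads data.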
Provided by:
booydar
Summary of original information

Dataset overview

Configuration details

0k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 214511 bytes, 1000 examples
    • qa2: 497258 bytes, 999 examples
    • qa3: 1515195 bytes, 999 examples
    • qa4: 118279 bytes, 999 examples
    • qa5: 617596 bytes, 999 examples
  • Download size: 355443 bytes
  • Dataset size: 2962839 bytes

128k

  • Features:
    • target: string
    • question: string
    • input: string
  • Splits:
    • qa1: 507056606 bytes, 1000 examples
    • qa2: 506895155 bytes, 999 examples
    • qa3: 506392085 bytes, 999 examples
    • qa4: 505933273 bytes, 999 examples
    • qa5: 506678193 bytes, 999 examples
  • Download size: 1567936012 bytes
  • Dataset size: 2532955312 bytes

16k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 61776253 bytes, 1000 examples
    • qa2: 61918118 bytes, 999 examples
    • qa3: 62127205 bytes, 999 examples
    • qa4: 61819981 bytes, 999 examples
    • qa5: 61618082 bytes, 999 examples
  • Download size: 191994799 bytes
  • Dataset size: 309259639 bytes

1k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 2801155 bytes, 1000 examples
    • qa2: 2836748 bytes, 999 examples
    • qa3: 2586775 bytes, 862 examples
    • qa4: 2780635 bytes, 999 examples
    • qa5: 2833684 bytes, 997 examples
  • Download size: 8143277 bytes
  • Dataset size: 13838997 bytes

2k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 6732635 bytes, 1000 examples
    • qa2: 6726012 bytes, 999 examples
    • qa3: 6915887 bytes, 998 examples
    • qa4: 6657774 bytes, 999 examples
    • qa5: 6717935 bytes, 999 examples
  • Download size: 20623714 bytes
  • Dataset size: 33750243 bytes

32k

  • Features:
    • question: string
    • input: string
    • target: string
  • Splits:
    • qa1: 125475409 bytes, 1000 examples
    • qa2: 125188567 bytes, 999 examples
    • qa3: 125820515 bytes, 999 examples
    • qa4: 125548589 bytes, 999 examples
    • qa5: 125758751 bytes, 999 examples
  • Download size: 389385950 bytes
  • Dataset size: 627791831 bytes

4k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 14544692 bytes, 1000 examples
    • qa2: 14490282 bytes, 999 examples
    • qa3: 14809504 bytes, 999 examples
    • qa4: 14373460 bytes, 999 examples
    • qa5: 14626210 bytes, 999 examples
  • Download size: 45139181 bytes
  • Dataset size: 72844148 bytes

64k

  • Features:
    • question: string
    • input: string
    • target: string
  • Splits:
    • qa1: 252925262 bytes, 1000 examples
    • qa2: 252376557 bytes, 999 examples
    • qa3: 252406388 bytes, 999 examples
    • qa4: 251983216 bytes, 999 examples
    • qa5: 252531238 bytes, 999 examples
  • Download size: 783464022 bytes
  • Dataset size: 1262222661 bytes

8k

  • Features:
    • target: string
    • input: string
    • question: string
  • Splits:
    • qa1: 30154491 bytes, 1000 examples
    • qa2: 29997147 bytes, 999 examples
    • qa3: 30237437 bytes, 999 examples
    • qa4: 30289396 bytes, 999 examples
    • qa5: 30114676 bytes, 999 examples
  • Download size: 93474610 bytes
  • Dataset size: 150793147 bytes
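
The per-configuration figures listed above can be cross-checked with a little arithmetic. The sketch below transcribes the numbers from the listing into a dictionary (the names `CARD` and `summarize` are hypothetical, chosen for illustration), confirms that the split sizes of each configuration add up to its stated dataset size, and derives the total example count and the download-to-dataset size ratio.

```python
# Figures transcribed from the configuration listing:
# name -> (bytes per split qa1..qa5, examples per split qa1..qa5,
#          download size in bytes, dataset size in bytes)
CARD = {
    "0k": ([214511, 497258, 1515195, 118279, 617596],
           [1000, 999, 999, 999, 999], 355443, 2962839),
    "128k": ([507056606, 506895155, 506392085, 505933273, 506678193],
             [1000, 999, 999, 999, 999], 1567936012, 2532955312),
    "16k": ([61776253, 61918118, 62127205, 61819981, 61618082],
            [1000, 999, 999, 999, 999], 191994799, 309259639),
    "1k": ([2801155, 2836748, 2586775, 2780635, 2833684],
           [1000, 999, 862, 999, 997], 8143277, 13838997),
    "2k": ([6732635, 6726012, 6915887, 6657774, 6717935],
           [1000, 999, 998, 999, 999], 20623714, 33750243),
    "32k": ([125475409, 125188567, 125820515, 125548589, 125758751],
            [1000, 999, 999, 999, 999], 389385950, 627791831),
    "4k": ([14544692, 14490282, 14809504, 14373460, 14626210],
           [1000, 999, 999, 999, 999], 45139181, 72844148),
    "64k": ([252925262, 252376557, 252406388, 251983216, 252531238],
            [1000, 999, 999, 999, 999], 783464022, 1262222661),
    "8k": ([30154491, 29997147, 30237437, 30289396, 30114676],
           [1000, 999, 999, 999, 999], 93474610, 150793147),
}


def summarize(card):
    """Check size totals and return {config: (total examples, size ratio)}."""
    report = {}
    for name, (split_bytes, split_examples, download, dataset) in card.items():
        # The stated dataset size should equal the sum of the split sizes.
        if sum(split_bytes) != dataset:
            raise ValueError(f"{name}: split sizes do not sum to dataset size")
        report[name] = (sum(split_examples), download / dataset)
    return report


report = summarize(CARD)
for name, (examples, ratio) in report.items():
    print(f"{name}: {examples} examples, download/dataset ratio {ratio:.2f}")
```

With these figures the size check passes for all nine configurations. Most configurations total 4996 examples, while 1k totals 4857 and 2k totals 4995 because some of their splits list fewer examples than in the other configurations.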