
chargoddard/hellaswag-train-10k

Source: Hugging Face | Updated: 2024-05-16 | Indexed: 2024-06-12
Download link:
https://hf-mirror.com/datasets/chargoddard/hellaswag-train-10k
Resource description:
---
dataset_info:
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: train
    num_bytes: 10833886.480390929
    num_examples: 10000
  - name: train_4k
    num_bytes: 4333554.592156371
    num_examples: 4000
  - name: train_2k
    num_bytes: 2166777.2960781856
    num_examples: 2000
  - name: fewshot
    num_bytes: 10833886.480390929
    num_examples: 10000
  download_size: 16616377
  dataset_size: 28168104.849016413
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: train_4k
    path: data/train_4k-*
  - split: train_2k
    path: data/train_2k-*
  - split: fewshot
    path: data/fewshot-*
---

The dataset has ten features: ind (int32), endings (a sequence of strings), and eight string fields: activity_label, ctx_a, ctx_b, ctx, source_id, split, split_type, and label. It is divided into four splits: train (10,000 examples), train_4k (4,000), train_2k (2,000), and fewshot (10,000), each listed with its byte size and example count. The metadata also gives the download size (16,616,377 bytes) and the total dataset size (about 28.2 million bytes).
Provider:
chargoddard
Summary of original information

Dataset overview

Dataset features

  • ind: integer (int32)
  • activity_label: string
  • ctx_a: string
  • ctx_b: string
  • ctx: string
  • endings: sequence of strings (sequence: string)
  • source_id: string
  • split: string
  • split_type: string
  • label: string
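The feature list above can be turned into a small row validator. The following is a minimal sketch, not part of the dataset card: the helper `schema_errors` and the values in `example_row` are hypothetical and only illustrate the listed types. In particular, the `"0"` in `label` is an assumed value; the card only states that `label` is a string.

```python
# Field name -> Python type implied by the dtype listed on the card.
FIELD_TYPES = {
    "ind": int,             # int32
    "activity_label": str,
    "ctx_a": str,
    "ctx_b": str,
    "ctx": str,
    "endings": list,        # sequence of strings
    "source_id": str,
    "split": str,
    "split_type": str,
    "label": str,
}

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def schema_errors(row):
    """Return a list of schema problems; an empty list means the row conforms."""
    errors = []
    for name, expected in FIELD_TYPES.items():
        if name not in row:
            errors.append(f"missing field: {name}")
            continue
        value = row[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
        elif name == "ind" and not INT32_MIN <= value <= INT32_MAX:
            errors.append("ind: outside the int32 range")
        elif name == "endings" and not all(isinstance(e, str) for e in value):
            errors.append("endings: every element must be a string")
    return errors


# Hypothetical row, invented for illustration only (not taken from the dataset).
example_row = {
    "ind": 42,
    "activity_label": "Example activity",
    "ctx_a": "A person picks up a tool.",
    "ctx_b": "They",
    "ctx": "A person picks up a tool. They",
    "endings": ["ending one", "ending two", "ending three", "ending four"],
    "source_id": "example~0",
    "split": "train",
    "split_type": "indomain",
    "label": "0",
}

print(schema_errors(example_row))
```

A check like this is mainly useful when rows are produced or transformed outside the `datasets` library, where the declared types are not enforced automatically.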

Dataset splits

  • train: 10000 examples, 10833886.480390929 bytes in total
  • train_4k: 4000 examples, 4333554.592156371 bytes in total
  • train_2k: 2000 examples, 2166777.2960781856 bytes in total
  • fewshot: 10000 examples, 10833886.480390929 bytes in total

Dataset size

  • Download size: 16616377 bytes
  • Total dataset size: 28168104.849016413 bytes
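The size figures can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed on this page; the variable names are made up for illustration. It checks that the four split sizes add up to the reported total and computes the average bytes per example in each split. The fractional byte counts suggest the per-split sizes are proportional estimates rather than exact file sizes, though the card does not say how they were computed.

```python
import math

# (num_bytes, num_examples) per split, copied from the metadata above.
SPLITS = {
    "train": (10833886.480390929, 10000),
    "train_4k": (4333554.592156371, 4000),
    "train_2k": (2166777.2960781856, 2000),
    "fewshot": (10833886.480390929, 10000),
}
DATASET_SIZE = 28168104.849016413
DOWNLOAD_SIZE = 16616377

# The split sizes should add up to the reported dataset size.
total_bytes = sum(num_bytes for num_bytes, _ in SPLITS.values())
sizes_agree = math.isclose(total_bytes, DATASET_SIZE, rel_tol=1e-9)

# Average bytes per example in each split.
bytes_per_example = {
    name: num_bytes / num_examples
    for name, (num_bytes, num_examples) in SPLITS.items()
}

# Ratio of the download size to the in-memory dataset size.
download_ratio = DOWNLOAD_SIZE / DATASET_SIZE

print(f"sum of splits matches dataset_size: {sizes_agree}")
for name, avg in bytes_per_example.items():
    print(f"{name}: {avg:.2f} bytes per example")
print(f"download_size / dataset_size = {download_ratio:.3f}")
```

Every split works out to the same average of roughly 1083 bytes per example, which is consistent with the sizes having been scaled from a single per-example estimate.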

Configuration

  • config_name: default
  • data_files:
    • split: train, path: data/train-*
    • split: train_4k, path: data/train_4k-*
    • split: train_2k, path: data/train_2k-*
    • split: fewshot, path: data/fewshot-*
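The `data_files` entries map each split to a glob pattern. As a rough sketch of how those patterns separate the files, the code below matches file paths against them with Python's `fnmatch`. The helper `split_for` and the shard file names are hypothetical; the actual file names in the repository are not listed on this page. The `load_split` function shows one way the splits might be loaded with the Hugging Face `datasets` library, assuming it is installed and the repository is reachable; it is defined but not called here.

```python
from fnmatch import fnmatchcase

# Split name -> glob pattern, copied from the default config above.
DATA_FILES = {
    "train": "data/train-*",
    "train_4k": "data/train_4k-*",
    "train_2k": "data/train_2k-*",
    "fewshot": "data/fewshot-*",
}


def split_for(path):
    """Return the split whose pattern matches `path`, or None if none does."""
    for split, pattern in DATA_FILES.items():
        if fnmatchcase(path, pattern):
            return split
    return None


def load_split(split):
    """Load one split by name (needs the `datasets` package and network access)."""
    if split not in DATA_FILES:
        raise ValueError(f"unknown split: {split!r}")
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset("chargoddard/hellaswag-train-10k", split=split)


# Hypothetical shard names, used only to exercise the patterns.
for path in (
    "data/train-00000-of-00001.parquet",
    "data/train_4k-00000-of-00001.parquet",
    "data/fewshot-00000-of-00001.parquet",
    "README.md",
):
    print(path, "->", split_for(path))
```

Because `train-*` requires a hyphen directly after `train`, it does not pick up the `train_4k` or `train_2k` files, so each file resolves to exactly one split.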