chargoddard/hellaswag-train-10k
Source: Hugging Face. Updated: 2024-05-16. Indexed: 2024-06-12.
Download link:
https://hf-mirror.com/datasets/chargoddard/hellaswag-train-10k
Resource description:
---
dataset_info:
  features:
  - name: ind
    dtype: int32
  - name: activity_label
    dtype: string
  - name: ctx_a
    dtype: string
  - name: ctx_b
    dtype: string
  - name: ctx
    dtype: string
  - name: endings
    sequence: string
  - name: source_id
    dtype: string
  - name: split
    dtype: string
  - name: split_type
    dtype: string
  - name: label
    dtype: string
  splits:
  - name: train
    num_bytes: 10833886.480390929
    num_examples: 10000
  - name: train_4k
    num_bytes: 4333554.592156371
    num_examples: 4000
  - name: train_2k
    num_bytes: 2166777.2960781856
    num_examples: 2000
  - name: fewshot
    num_bytes: 10833886.480390929
    num_examples: 10000
  download_size: 16616377
  dataset_size: 28168104.849016413
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: train_4k
    path: data/train_4k-*
  - split: train_2k
    path: data/train_2k-*
  - split: fewshot
    path: data/fewshot-*
---
The dataset has ten features: ind (int32); the string fields activity_label, ctx_a, ctx_b, ctx, source_id, split, split_type, and label; and endings, a sequence of strings. It is divided into four splits: train (10,000 examples), train_4k (4,000), train_2k (2,000), and fewshot (10,000). The download size is 16,616,377 bytes, and the total dataset size is about 28,168,105 bytes.
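As a rough illustration of how one of these splits might be loaded, here is a minimal sketch using the Hugging Face `datasets` library. The helper name `load_split` and the default split are assumptions made for this example; only the repository ID and the split names come from the card. The library is imported lazily, so defining the helper does not require `datasets` to be installed, but calling it needs both the package and network access.

```python
# Split names as listed in the card's metadata.
SPLITS = ("train", "train_4k", "train_2k", "fewshot")
REPO_ID = "chargoddard/hellaswag-train-10k"


def load_split(split: str = "train_2k"):
    """Load one split from the Hub (hypothetical helper, not part of the card)."""
    if split not in SPLITS:
        raise ValueError(f"unknown split {split!r}; expected one of {SPLITS}")
    # Lazy import: only needed when the helper is actually called.
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)
```

Calling `load_split("train_4k")` should return a dataset object for the 4,000-example split; rejecting unknown split names up front avoids a network round trip for a typo.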
Provided by:
chargoddard
Summary of original information
Dataset overview
Dataset features
- ind: integer (int32)
- activity_label: string
- ctx_a: string
- ctx_b: string
- ctx: string
- endings: sequence of strings (sequence: string)
- source_id: string
- split: string
- split_type: string
- label: string
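The feature list above maps onto a typed record. The sketch below shows one way to model it in Python. It assumes, as in the upstream HellaSwag dataset, that label holds the index of the correct ending as a string; the card itself only states that label is a string. The `Record` type, the `gold_ending` helper, and the example values are all made up for illustration and are not taken from the dataset.

```python
from typing import List, TypedDict


class Record(TypedDict):
    """One row, with field names and types as listed in the card."""

    ind: int
    activity_label: str
    ctx_a: str
    ctx_b: str
    ctx: str
    endings: List[str]
    source_id: str
    split: str
    split_type: str
    label: str


def gold_ending(record: Record) -> str:
    # Assumption: label is a stringified index into endings (e.g. "2").
    return record["endings"][int(record["label"])]


# Made-up record for illustration only; values are placeholders.
example: Record = {
    "ind": 0,
    "activity_label": "Making tea",
    "ctx_a": "A person fills a kettle with water.",
    "ctx_b": "The person",
    "ctx": "A person fills a kettle with water. The person",
    "endings": [
        "throws the kettle away.",
        "puts the kettle on the stove.",
        "paints the kettle blue.",
        "starts running a marathon.",
    ],
    "source_id": "placeholder~0",
    "split": "train",
    "split_type": "placeholder",
    "label": "1",
}

print(gold_ending(example))
```

Because label is stored as a string, it has to be converted with `int(...)` before it can be used as an index.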
Dataset splits
- train: 10,000 examples, 10833886.480390929 bytes in total
- train_4k: 4,000 examples, 4333554.592156371 bytes in total
- train_2k: 2,000 examples, 2166777.2960781856 bytes in total
- fewshot: 10,000 examples, 10833886.480390929 bytes in total
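The fractional byte counts become easier to read with a little arithmetic. The sketch below recomputes bytes per example for each split and the sum over all splits, using only the figures listed in the card. Every split works out to the same per-example size, which suggests (though the card does not say so) that the byte counts are proportional estimates rather than measured file sizes.

```python
# (num_bytes, num_examples) per split, copied from the card's metadata.
SPLIT_STATS = {
    "train": (10833886.480390929, 10000),
    "train_4k": (4333554.592156371, 4000),
    "train_2k": (2166777.2960781856, 2000),
    "fewshot": (10833886.480390929, 10000),
}

# Average size of one example in each split.
bytes_per_example = {
    name: num_bytes / num_examples
    for name, (num_bytes, num_examples) in SPLIT_STATS.items()
}

# Sum over all splits; the card reports this as dataset_size.
total_bytes = sum(num_bytes for num_bytes, _ in SPLIT_STATS.values())

for name, value in bytes_per_example.items():
    print(f"{name}: {value:.3f} bytes per example")
print(f"total: {total_bytes:.3f} bytes")
```

Each split comes to roughly 1083.389 bytes per example, and the four byte counts add up to the listed dataset size of 28168104.849016413 bytes.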
Dataset size
- Download size: 16616377 bytes
- Total dataset size: 28168104.849016413 bytes
Configuration
- config_name: default
- data_files:
  - split: train, path: data/train-*
  - split: train_4k, path: data/train_4k-*
  - split: train_2k, path: data/train_2k-*
  - split: fewshot, path: data/fewshot-*
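To show how the path patterns in the configuration keep the splits apart, here is a small sketch that matches file names against them with Python's `fnmatch` module. The file names are hypothetical, and the Hub's own pattern resolution may differ in detail from `fnmatch`; the point is only that `data/train-*` requires a hyphen right after `train`, so it does not also capture the `train_4k` or `train_2k` files.

```python
from fnmatch import fnmatchcase
from typing import List

# Split-to-pattern mapping, copied from the card's default config.
DATA_FILES = {
    "train": "data/train-*",
    "train_4k": "data/train_4k-*",
    "train_2k": "data/train_2k-*",
    "fewshot": "data/fewshot-*",
}


def splits_for(path: str) -> List[str]:
    """Return every split whose pattern matches the given file path."""
    return [split for split, pattern in DATA_FILES.items() if fnmatchcase(path, pattern)]


# Hypothetical file names, for illustration only.
for name in (
    "data/train-00000-of-00001.parquet",
    "data/train_4k-00000-of-00001.parquet",
    "data/fewshot-00000-of-00001.parquet",
):
    print(name, "->", splits_for(name))
```

Each hypothetical file resolves to exactly one split, so the four patterns do not overlap.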



