Self-GRIT/PILE_Wikipedia_Pretraining_subset_10k-distill
Source: Hugging Face | Updated: 2024-08-09 | Indexed: 2025-04-26
Download link:
https://hf-mirror.com/datasets/Self-GRIT/PILE_Wikipedia_Pretraining_subset_10k-distill
Resource description:
---
dataset_info:
  features:
  - name: input
    dtype: string
  splits:
  - name: train
    num_bytes: 106762349
    num_examples: 10000
  - name: valid
    num_bytes: 5352699
    num_examples: 500
  download_size: 47968028
  dataset_size: 112115048
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
  - split: valid
    path: data/valid-*
---
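
The size fields in the metadata above can be cross-checked with a little arithmetic. The sketch below copies the numbers from the card and verifies that the per-split byte counts add up to `dataset_size`. It also derives the average bytes per example and the ratio of `download_size` to `dataset_size`. The helper names are illustrative, not part of any library. Reading `dataset_size` as the prepared (uncompressed) size and `download_size` as the size of the downloaded files is an assumption based on the usual Hugging Face convention.

```python
# Values copied verbatim from the dataset card metadata.
SPLITS = {
    "train": {"num_bytes": 106_762_349, "num_examples": 10_000},
    "valid": {"num_bytes": 5_352_699, "num_examples": 500},
}
DOWNLOAD_SIZE = 47_968_028
DATASET_SIZE = 112_115_048


def total_split_bytes(splits):
    """Sum num_bytes over all splits (hypothetical helper)."""
    return sum(s["num_bytes"] for s in splits.values())


def mean_bytes_per_example(split):
    """Average size of one example in a split, in bytes."""
    return split["num_bytes"] / split["num_examples"]


def download_ratio(download_size, dataset_size):
    """Fraction of the prepared size that has to be downloaded."""
    return download_size / dataset_size


if __name__ == "__main__":
    # The two splits should account for the whole dataset_size.
    assert total_split_bytes(SPLITS) == DATASET_SIZE
    for name, split in SPLITS.items():
        print(f"{name}: {mean_bytes_per_example(split):.1f} bytes/example")
    ratio = download_ratio(DOWNLOAD_SIZE, DATASET_SIZE)
    print(f"download/dataset ratio: {ratio:.3f}")
```

Both splits average roughly 10.7 kB per example, which suggests that `valid` was drawn from the same kind of documents as `train`.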
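Dataset information:
  Features:
  - Field name: input
    Data type: string
  Splits:
  - Split name: train (training set)
    Bytes: 106762349
    Examples: 10000
  - Split name: valid (validation set)
    Bytes: 5352699
    Examples: 500
  Download size: 47968028
  Total dataset size: 112115048
Configurations:
- Configuration name: default
  Data files:
  - Split: train
    Path: data/train-*
  - Split: valid
    Path: data/valid-*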
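
A minimal loading sketch, assuming the Hugging Face `datasets` library is installed and the repository is publicly reachable. The repository id is taken from the page title and the expected row counts from the split metadata above. `load_splits` and `check_splits` are hypothetical helper names introduced here for illustration.

```python
REPO_ID = "Self-GRIT/PILE_Wikipedia_Pretraining_subset_10k-distill"

# Expected example counts, copied from the split metadata.
EXPECTED_EXAMPLES = {"train": 10_000, "valid": 500}


def load_splits(repo_id=REPO_ID):
    """Load every split of the default config (needs network access)."""
    # Imported lazily so the rest of this module works without the library.
    from datasets import load_dataset

    return load_dataset(repo_id)  # DatasetDict keyed by split name


def check_splits(num_rows_by_split, expected=EXPECTED_EXAMPLES):
    """Return a list of mismatch messages; an empty list means all good."""
    problems = []
    for name, want in expected.items():
        got = num_rows_by_split.get(name)
        if got != want:
            problems.append(f"{name}: expected {want}, got {got}")
    return problems


if __name__ == "__main__":
    ds = load_splits()
    rows = {name: split.num_rows for name, split in ds.items()}
    print(check_splits(rows) or "split sizes match the card")
    # Each record has a single string field named "input".
    print(ds["train"][0]["input"][:200])
```

Note that the validation split is named `valid` rather than `validation`, so code that hardcodes the more common name will not find it.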
Provider:
Self-GRIT



