haritzpuerto/the_pile_00_StackExchange
Source: Hugging Face | Updated: 2024-09-24 | Indexed: 2024-12-14
Download link:
https://hf-mirror.com/datasets/haritzpuerto/the_pile_00_StackExchange
Resource summary:
The dataset has two main features, text and metadata; the metadata contains a field named pile_set_name. It is divided into training, validation, and test splits of 984049, 29950, and 30378 examples, occupying 2191240391, 66520547, and 66957983 bytes respectively. The total download size is 1221118982 bytes and the total dataset size is 2324718921 bytes, which equals the sum of the three split sizes. The data file paths are data/train-*, data/validation-*, and data/test-*.
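The figures in the summary can be cross-checked with a few lines of Python. The sketch below hard-codes the split statistics from the summary and confirms that they add up; the helper names (`total_bytes`, `total_examples`, `load_split`) are illustrative and not part of the dataset. `load_split` assumes the Hugging Face `datasets` library is installed and that the repository is reachable, so it is defined but not called here.

```python
# Split statistics copied from the dataset summary.
SPLITS = {
    "train": {"num_examples": 984049, "num_bytes": 2191240391},
    "validation": {"num_examples": 29950, "num_bytes": 66520547},
    "test": {"num_examples": 30378, "num_bytes": 66957983},
}
DOWNLOAD_SIZE = 1221118982  # bytes, as stated in the summary
DATASET_SIZE = 2324718921   # bytes, as stated in the summary

# Repository id taken from the listing title.
REPO_ID = "haritzpuerto/the_pile_00_StackExchange"


def total_bytes(splits):
    """Sum the byte sizes of all splits."""
    return sum(info["num_bytes"] for info in splits.values())


def total_examples(splits):
    """Sum the example counts of all splits."""
    return sum(info["num_examples"] for info in splits.values())


def load_split(split="validation"):
    """Load one split from the Hub.

    Assumption: the `datasets` package is installed and network access
    is available. Imported lazily so the checks below run without it.
    """
    from datasets import load_dataset

    return load_dataset(REPO_ID, split=split)


if __name__ == "__main__":
    print("examples:", total_examples(SPLITS))
    print("split bytes match total:", total_bytes(SPLITS) == DATASET_SIZE)
    print("download / dataset size:", round(DOWNLOAD_SIZE / DATASET_SIZE, 3))
```

The validation split is the default in `load_split` only because it is the smallest of the three and so the cheapest to download for a first look.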
Provider:
haritzpuerto



