rookshanks/pile_uncopyrighted_combined_pythia_1024
Source: Hugging Face. Updated 2025-05-29. Indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/rookshanks/pile_uncopyrighted_combined_pythia_1024
Description:
A large dataset containing the fields input_ids, attention_mask, and length, divided into three splits: training (71,697,033 examples), validation (2,240 examples), and test (20,120 examples). The download size is 101,691,331,046 bytes, and the overall dataset size is 255,096,935,158 bytes.
Provider:
rookshanks
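
The byte counts and split sizes in the description are easier to judge in human-readable units. The following is a minimal sketch that converts them and totals the examples. The figures are copied from the description; the helper `stream_validation`, the split names `train`/`validation`/`test`, and the assumption that the repository is reachable under this id through the Hugging Face `datasets` library are illustrative and not confirmed by the listing.

```python
# Figures copied from the dataset description.
REPO_ID = "rookshanks/pile_uncopyrighted_combined_pythia_1024"
SPLIT_EXAMPLES = {"train": 71_697_033, "validation": 2_240, "test": 20_120}
DOWNLOAD_BYTES = 101_691_331_046
DATASET_BYTES = 255_096_935_158


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


total_examples = sum(SPLIT_EXAMPLES.values())


def stream_validation(n: int = 2):
    """Hypothetical helper: fetch the first n validation records.

    Streaming avoids downloading the full archive. Assumes the split is
    named "validation" and that the `datasets` package is installed.
    """
    from datasets import load_dataset  # imported lazily; needs network access

    ds = load_dataset(REPO_ID, split="validation", streaming=True)
    return list(ds.take(n))


if __name__ == "__main__":
    print(f"total examples: {total_examples:,}")
    print(f"download size:  {to_gib(DOWNLOAD_BYTES):.1f} GiB")
    print(f"dataset size:   {to_gib(DATASET_BYTES):.1f} GiB")
    for name, count in SPLIT_EXAMPLES.items():
        print(f"{name:<10} {count / total_examples:.4%} of examples")
```

Given the size of the training split, streaming a small split first, as `stream_validation` does, is a cheap way to confirm the three field names before committing to the full download.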



