rookshanks/pile_uncopyrighted_1024
Source: Hugging Face | Updated: 2025-01-22 | Indexed: 2025-02-15
Download link:
https://hf-mirror.com/datasets/rookshanks/pile_uncopyrighted_1024
Description:
This is a sequence dataset with the feature input_ids, divided into training, validation, and test splits. The training split contains 23,920,745 examples (67,882,864,972 bytes), the validation split contains 361,788 examples (1,020,745,168 bytes), and the test split contains 362,652 examples (1,026,589,516 bytes). The total dataset size is 69,930,199,656 bytes, and the download size is 34,817,006,772 bytes.
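The figures above can be cross-checked with a few lines of Python. The following is a minimal sketch, not an official loader: the `SPLITS` table simply restates the numbers from the description, and `stream_first_example` is a hypothetical helper that assumes the Hugging Face `datasets` library is installed and that the validation split is exposed under the name `validation` (the listing does not state the split identifiers).

```python
# Split statistics restated from the description above.
SPLITS = {
    "train": {"examples": 23_920_745, "bytes": 67_882_864_972},
    "validation": {"examples": 361_788, "bytes": 1_020_745_168},
    "test": {"examples": 362_652, "bytes": 1_026_589_516},
}
REPORTED_TOTAL_BYTES = 69_930_199_656
REPORTED_DOWNLOAD_BYTES = 34_817_006_772

# Sum the per-split figures and compare with the reported total.
total_bytes = sum(s["bytes"] for s in SPLITS.values())
total_examples = sum(s["examples"] for s in SPLITS.values())
sizes_consistent = total_bytes == REPORTED_TOTAL_BYTES

# Ratio of download size to in-memory dataset size.
download_ratio = REPORTED_DOWNLOAD_BYTES / REPORTED_TOTAL_BYTES


def stream_first_example(split="validation"):
    """Hypothetical helper: fetch one record without downloading the full dataset.

    Assumes `pip install datasets` and that `split` is a valid split name.
    """
    from datasets import load_dataset  # imported lazily; needs network access

    ds = load_dataset(
        "rookshanks/pile_uncopyrighted_1024", split=split, streaming=True
    )
    return next(iter(ds))


if __name__ == "__main__":
    print(f"total examples: {total_examples:,}")
    print(f"total bytes: {total_bytes:,} (matches reported: {sizes_consistent})")
    print(f"download / dataset size: {download_ratio:.3f}")
```

Streaming is used in the helper because the reported download size is roughly 35 GB; if a full local copy is wanted, dropping `streaming=True` would download and cache the whole dataset instead.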
Provider:
rookshanks



