p1atdev/202408-at20240906-tokenized-shuffle-241014
Hugging Face. Updated 2024-10-15. Indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/p1atdev/202408-at20240906-tokenized-shuffle-241014
Description:
The dataset has two splits: train and test. The train split contains 7,635,694 examples and occupies approximately 928,847,774.54 bytes; the test split contains 10,000 examples and occupies approximately 1,216,454.95 bytes. The total dataset size is approximately 930,064,229.48 bytes, and the download size is 523,969,256 bytes. The data is split into 256-token segments.
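As a quick sanity check on the figures above, the following minimal sketch recomputes a few derived quantities from the listed numbers alone. It does not download or inspect the dataset; the constant names and the `summarize` helper are hypothetical and exist only for illustration. Under these numbers, both splits come to roughly 122 bytes per example, and the download is about 56% of the reported total size.

```python
# Figures copied from the description; nothing here touches the actual dataset.
TRAIN_EXAMPLES = 7_635_694
TRAIN_BYTES = 928_847_774.54
TEST_EXAMPLES = 10_000
TEST_BYTES = 1_216_454.95
TOTAL_BYTES = 930_064_229.48
DOWNLOAD_BYTES = 523_969_256


def summarize():
    """Derive simple ratios from the listed sizes (hypothetical helper)."""
    split_sum = TRAIN_BYTES + TEST_BYTES
    return {
        # Sum of the two split sizes, to compare with the reported total.
        "split_sum_bytes": split_sum,
        # Difference between that sum and the reported total (rounding only).
        "gap_vs_reported_total": abs(split_sum - TOTAL_BYTES),
        # Average stored size of one example in each split.
        "train_bytes_per_example": TRAIN_BYTES / TRAIN_EXAMPLES,
        "test_bytes_per_example": TEST_BYTES / TEST_EXAMPLES,
        # Download size relative to the reported total size.
        "download_ratio": DOWNLOAD_BYTES / TOTAL_BYTES,
        # Share of all examples that fall in the test split.
        "test_fraction": TEST_EXAMPLES / (TRAIN_EXAMPLES + TEST_EXAMPLES),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.6f}")
```

The split sizes add up to the reported total to within a rounding difference of about 0.01 bytes, which suggests the fractional byte counts are proportional estimates rather than exact file sizes; that reading is an inference from the numbers, not something the listing states.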
Provider:
p1atdev



