p1atdev/202408-at20240906-tokenized-shuffle-1
Source: Hugging Face. Updated: 2024-09-14. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/p1atdev/202408-at20240906-tokenized-shuffle-1
Description:
The dataset has two splits: a training set and a test set. The training set contains 7,635,931 examples and occupies approximately 928,876,604.52 bytes; the test set contains 10,000 examples and occupies approximately 1,216,454.95 bytes. The total dataset size is 930,093,059.47 bytes, and the download size is 524,422,872 bytes. The only feature is `input_ids`, a sequence of integers. The data files are located at `data/train-*` and `data/test-*`.
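As a rough sanity check on the figures above, the following sketch recomputes the total from the two split sizes and derives the average size per example and the download-to-total ratio. The helper names `summarize` and `load_split` are hypothetical, introduced only for illustration. `load_split` assumes the Hugging Face `datasets` package is installed and that the repository is reachable under the ID shown in the title; neither is confirmed by this listing.

```python
# Figures copied from the description above.
TRAIN_EXAMPLES = 7_635_931
TEST_EXAMPLES = 10_000
TRAIN_BYTES = 928_876_604.52
TEST_BYTES = 1_216_454.95
TOTAL_BYTES = 930_093_059.47
DOWNLOAD_BYTES = 524_422_872

# Repository ID taken from the listing title.
REPO_ID = "p1atdev/202408-at20240906-tokenized-shuffle-1"


def summarize():
    """Derive a few sanity-check numbers from the listed sizes."""
    return {
        "split_sum_bytes": round(TRAIN_BYTES + TEST_BYTES, 2),
        "avg_train_bytes_per_example": TRAIN_BYTES / TRAIN_EXAMPLES,
        "avg_test_bytes_per_example": TEST_BYTES / TEST_EXAMPLES,
        "download_to_total_ratio": DOWNLOAD_BYTES / TOTAL_BYTES,
    }


def load_split(split="test", streaming=True):
    """Load one split (hypothetical helper).

    Assumes the `datasets` package is installed and the Hub is reachable.
    With streaming=True, examples are read lazily instead of downloading
    the full archive first.
    """
    from datasets import load_dataset  # imported lazily: optional dependency

    return load_dataset(REPO_ID, split=split, streaming=streaming)


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.4f}")
```

The two split sizes add up to the listed total, both splits average roughly 121.6 bytes per example, and the download is about 56% of the total size. If the loader assumptions hold, `next(iter(load_split("test")))["input_ids"]` would give the token IDs of a single example.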
Provider:
p1atdev



