alea-institute/kl3m-data-sample-005-shuffled-test
Source: Hugging Face. Updated: 2025-11-11. Indexed: 2025-11-15.
Download link:
https://hf-mirror.com/datasets/alea-institute/kl3m-data-sample-005-shuffled-test
Resource description:
The dataset has three fields: an identifier, a MIME type, and the text content. It provides a training split of 1,000,000 examples, totaling 2,802,324,403 bytes. The download size is 1,360,134,365 bytes.
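
To put the quoted sizes in perspective, the short sketch below derives the average size per example and the ratio of download size to dataset size from the three figures in the description. The function name `size_summary` and the dictionary keys are illustrative choices, not part of the dataset. Reading the gap between the two sizes as compression of the downloaded files is an assumption, since the listing does not say how the files are stored.

```python
# Figures quoted in the resource description.
NUM_EXAMPLES = 1_000_000
DATASET_BYTES = 2_802_324_403
DOWNLOAD_BYTES = 1_360_134_365


def size_summary(num_examples: int, dataset_bytes: int, download_bytes: int) -> dict:
    """Derive simple size statistics from the listed figures (hypothetical helper)."""
    return {
        # Mean size of one example across all three fields.
        "avg_bytes_per_example": dataset_bytes / num_examples,
        # Fraction of the dataset size that has to be downloaded.
        "download_ratio": download_bytes / dataset_bytes,
        # Same sizes expressed in GiB (2**30 bytes).
        "dataset_gib": dataset_bytes / 2**30,
        "download_gib": download_bytes / 2**30,
    }


summary = size_summary(NUM_EXAMPLES, DATASET_BYTES, DOWNLOAD_BYTES)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

On these figures, an example averages roughly 2.8 KB and the download is a little under half of the dataset size.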
Provider:
alea-institute



