leogagnon/wikipedia-short-paragraphs
Source: Hugging Face. Updated 2025-10-15; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/leogagnon/wikipedia-short-paragraphs
Description:
This dataset contains an `input_ids` feature of type string and has a `train` split. The train split holds 17,631,130 examples. The total dataset size is 6,556,973,243 bytes (about 6.56 GB), and the download size is 4,105,532,280 bytes (about 4.11 GB). The dataset ships with a default configuration that specifies the path to the training data files.
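To put the figures in the description in perspective, the following is a minimal sketch that derives the average example size and the download-to-total size ratio from the stated numbers, and shows one way the train split might be streamed. It assumes the Hugging Face `datasets` library is installed; the function names `size_summary` and `stream_train` are hypothetical helpers introduced here for illustration, not part of the dataset.

```python
# Figures copied from the dataset description.
TRAIN_EXAMPLES = 17_631_130
DATASET_BYTES = 6_556_973_243
DOWNLOAD_BYTES = 4_105_532_280


def size_summary():
    """Derive simple ratios from the published size figures."""
    return {
        # Mean size of one example in the prepared dataset.
        "avg_bytes_per_example": DATASET_BYTES / TRAIN_EXAMPLES,
        # Fraction of the total size that must actually be downloaded.
        "download_ratio": DOWNLOAD_BYTES / DATASET_BYTES,
    }


def stream_train(repo_id="leogagnon/wikipedia-short-paragraphs"):
    """Return a streaming iterator over the train split (hypothetical helper).

    Streaming avoids fetching the whole download up front. The import is
    done lazily so the size arithmetic above works without `datasets`.
    """
    from datasets import load_dataset

    return load_dataset(repo_id, split="train", streaming=True)


if __name__ == "__main__":
    summary = size_summary()
    print(f"average bytes per example: {summary['avg_bytes_per_example']:.1f}")
    print(f"download / total size:     {summary['download_ratio']:.3f}")
```

With the stated figures, this works out to roughly 372 bytes per example and a download that is about 63% of the total size. Since the description lists `input_ids` as a string, each streamed record would presumably expose it as `record["input_ids"]`; how that string is encoded is not stated in the description.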
Provider:
leogagnon



