SmallDoge/decay_dataset
Source: Hugging Face. Updated: 2025-03-11. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/SmallDoge/decay_dataset
Description:
The dataset has two features: input_ids, a sequence of int32 values, and attention_mask, a sequence of int8 values. It is split into a training set of 3,276,800 samples (5,312,610,410 bytes) and a test set of 1,000 samples (1,607,570 bytes). The total download size is 2,281,838,860 bytes, and the total size after decompression is 5,314,217,980 bytes. The dataset is intended for NLP tasks that take token IDs and attention masks as input.
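The size figures in the description can be cross-checked with a little arithmetic. The Python sketch below is illustrative only. The constants are copied from the description. The helper names (`bytes_per_sample`, `compression_ratio`, `load_test_split`) are hypothetical and not part of the dataset. The loader assumes the Hugging Face `datasets` library is installed and that the repository is reachable under the ID shown in the title.

```python
# Figures copied from the dataset description.
TRAIN_SAMPLES = 3_276_800
TRAIN_BYTES = 5_312_610_410
TEST_SAMPLES = 1_000
TEST_BYTES = 1_607_570
DOWNLOAD_BYTES = 2_281_838_860
TOTAL_BYTES = 5_314_217_980


def bytes_per_sample(total_bytes: int, num_samples: int) -> float:
    """Average stored size of one sample in a split."""
    return total_bytes / num_samples


def compression_ratio(download_bytes: int, unpacked_bytes: int) -> float:
    """Fraction of the unpacked size that is actually downloaded."""
    return download_bytes / unpacked_bytes


def load_test_split():
    """Load the small test split (assumes `datasets` is installed and the
    Hub is reachable; not called automatically because it needs network)."""
    from datasets import load_dataset  # imported lazily on purpose

    return load_dataset("SmallDoge/decay_dataset", split="test")


if __name__ == "__main__":
    # The two split sizes should add up to the stated unpacked total.
    print("splits sum to total:", TRAIN_BYTES + TEST_BYTES == TOTAL_BYTES)
    print("train bytes/sample:", round(bytes_per_sample(TRAIN_BYTES, TRAIN_SAMPLES), 2))
    print("test bytes/sample:", round(bytes_per_sample(TEST_BYTES, TEST_SAMPLES), 2))
    print("download/unpacked:", round(compression_ratio(DOWNLOAD_BYTES, TOTAL_BYTES), 3))
```

Calling `load_test_split()` downloads data, so it is best tried on the 1,000-sample test split before touching the much larger training split.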
Provider:
SmallDoge



