QueenieFi/Tokenized_dataset_noText
Source: Hugging Face. Updated: 2024-10-30. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/QueenieFi/Tokenized_dataset_noText
Description:
The dataset has three fields: input_ids, attention_mask, and labels. Each field is a sequence of integers: input_ids is stored as int32, attention_mask as int8, and labels as int64. There is a single train split containing 982266 samples. The dataset size is 52315487160 bytes (about 52.3 GB) and the download size is 2482590940 bytes (about 2.48 GB).
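The size figures in the description can be cross-checked with a little arithmetic. The sketch below is only an estimate built from the numbers quoted above: it assumes that all three fields have the same length within a sample and that every sample has the same length, neither of which the listing states. The function name size_summary and the dictionary keys are illustrative, not part of the dataset.

```python
# Figures quoted in the dataset description.
NUM_SAMPLES = 982_266
DATASET_SIZE_BYTES = 52_315_487_160
DOWNLOAD_SIZE_BYTES = 2_482_590_940

# Bytes per element for each field, taken from the stated dtypes
# (int32 = 4 bytes, int8 = 1 byte, int64 = 8 bytes).
BYTES_PER_ELEMENT = {"input_ids": 4, "attention_mask": 1, "labels": 8}


def size_summary():
    """Derive per-sample size estimates from the published totals."""
    bytes_per_sample, remainder = divmod(DATASET_SIZE_BYTES, NUM_SAMPLES)

    # One token position costs one element in each of the three fields.
    bytes_per_position = sum(BYTES_PER_ELEMENT.values())

    # Assumption: every sample has the same sequence length, so the
    # per-sample size divides into whole positions plus a small leftover
    # (storage overhead, which this sketch does not try to explain).
    positions, leftover = divmod(bytes_per_sample, bytes_per_position)

    return {
        "bytes_per_sample": bytes_per_sample,
        "remainder": remainder,
        "bytes_per_position": bytes_per_position,
        "est_positions_per_sample": positions,
        "leftover_bytes_per_sample": leftover,
        "compression_ratio": DATASET_SIZE_BYTES / DOWNLOAD_SIZE_BYTES,
    }


if __name__ == "__main__":
    for key, value in size_summary().items():
        print(f"{key}: {value}")
```

Under those assumptions the total divides evenly into 53260 bytes per sample, which corresponds to roughly 4096 token positions per sample at 13 bytes per position, and the download is about 21 times smaller than the stored dataset. Treat the sequence length as an inference from the totals, not as a documented property.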
Provided by:
QueenieFi



