YoussefAnwar/tokenized_ds_second_stage
Source: Hugging Face. Updated: 2024-10-22. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/YoussefAnwar/tokenized_ds_second_stage
Description:
An NLP dataset with features such as input_ids, attention_mask, and labels, divided into a training split and a validation split. The training split contains 1,055,727 examples (3,229,226,879 bytes); the validation split contains 117,303 examples (359,887,629 bytes). The full dataset is 3,589,114,508 bytes, with a download size of 1,466,711,504 bytes.
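The figures in the description can be cross-checked with a few lines of Python. The sketch below is illustrative only: the names `SPLITS`, `summarize`, and `peek_first_example` are hypothetical helpers, not part of the dataset. The optional loader assumes the repository is still publicly available on Hugging Face and that the `datasets` library is installed; it is defined but not called here.

```python
# Figures copied from the dataset description.
SPLITS = {
    "train": {"num_examples": 1_055_727, "num_bytes": 3_229_226_879},
    "validation": {"num_examples": 117_303, "num_bytes": 359_887_629},
}
DATASET_SIZE = 3_589_114_508   # total size in bytes, as stated
DOWNLOAD_SIZE = 1_466_711_504  # download size in bytes, as stated


def summarize(splits):
    """Derive totals, split fractions, and average bytes per example."""
    total_examples = sum(s["num_examples"] for s in splits.values())
    total_bytes = sum(s["num_bytes"] for s in splits.values())
    return {
        "total_examples": total_examples,
        "total_bytes": total_bytes,
        "fractions": {
            name: s["num_examples"] / total_examples
            for name, s in splits.items()
        },
        "avg_bytes_per_example": {
            name: s["num_bytes"] / s["num_examples"]
            for name, s in splits.items()
        },
    }


def peek_first_example(split="validation"):
    """Stream one example without downloading the whole dataset.

    Assumption: the repository is reachable and `datasets` is installed.
    """
    from datasets import load_dataset  # imported lazily on purpose

    ds = load_dataset(
        "YoussefAnwar/tokenized_ds_second_stage",
        split=split,
        streaming=True,
    )
    return next(iter(ds))


summary = summarize(SPLITS)
print(summary["total_examples"], summary["total_bytes"])
print(summary["fractions"])
print(DOWNLOAD_SIZE / DATASET_SIZE)  # ratio of download size to stated total size
```

The two split sizes add up to the stated total of 3,589,114,508 bytes, and the validation split holds one tenth of the 1,173,030 examples overall. Streaming is used in the loader so that a quick inspection does not require the full download.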
Provider:
YoussefAnwar



