CHINAYWX/g06f_tokenizered
Source: Hugging Face. Updated: 2024-09-25. Indexed: 2024-12-14.
Download link:
https://hf-mirror.com/datasets/CHINAYWX/g06f_tokenizered
Description:
This dataset contains training and test data for natural language processing tasks. Each sample has three features: input_ids (a sequence of integers), attention_mask (a sequence of 8-bit integers), and labels (a sequence of 64-bit integers). The training split has 1,000,000 samples and the test split has 10,000 samples. The download size is 134,151,339 bytes, and the total dataset size is 458,097,870 bytes.
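To put the reported sizes in perspective, the short Python sketch below derives a few ratios from the figures in the description. It is only illustrative arithmetic: the variable names are made up for this example, and the per-sample average assumes the reported dataset size covers both splits, which the listing does not state explicitly.

```python
# Figures copied from the dataset description.
TRAIN_SAMPLES = 1_000_000
TEST_SAMPLES = 10_000
DOWNLOAD_BYTES = 134_151_339
DATASET_BYTES = 458_097_870

total_samples = TRAIN_SAMPLES + TEST_SAMPLES

# Assumption: the reported dataset size spans train and test together.
avg_bytes_per_sample = DATASET_BYTES / total_samples

# Fraction of the full dataset size that has to be downloaded.
download_ratio = DOWNLOAD_BYTES / DATASET_BYTES

# Share of all samples that belong to the test split.
test_fraction = TEST_SAMPLES / total_samples

print(f"total samples:        {total_samples:,}")
print(f"avg bytes per sample: {avg_bytes_per_sample:.1f}")
print(f"download / dataset:   {download_ratio:.3f}")
print(f"test fraction:        {test_fraction:.4f}")
```

The download is well under a third of the full dataset size, and the test split is about 1% of all samples.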
Provider:
CHINAYWX



