kp7742/YALM-pretrain1-tokenized-envy2
Source: Hugging Face. Updated: 2025-03-18. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/kp7742/YALM-pretrain1-tokenized-envy2
Overview:
The dataset provides tokenized sequence features: input IDs, token type IDs, attention masks, and labels. It is split into a training set of 19,755,528 examples and a test set of 199,551 examples (about 1% of the total), and is suitable for natural language processing tasks.
Provider:
kp7742



