bismarck91/enA-frA-tokenised-part3
Source: Hugging Face. Updated: 2025-04-09. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/bismarck91/enA-frA-tokenised-part3
Resource summary:
This is a training dataset for machine learning or natural language processing, containing approximately 90,000 training samples. Each sample has three features: input_ids (a sequence of int32 integers), labels (a sequence of int64 labels), and attention_mask (a sequence of int8 mask values). The dataset currently has a single split, train, whose files are stored under data/train-*.
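The schema described above can be illustrated with a minimal sketch. It is not taken from the dataset's own documentation: the helper names (load_train_split, make_record, check_record) are hypothetical, the token ids in the usage note are made up, and the check that input_ids and attention_mask have equal length is an assumption based on common tokeniser output rather than something the listing states. Loading the real data assumes the Hugging Face `datasets` package is installed and the repository is reachable.

```python
from array import array

# Repository id taken from the listing; every helper below is hypothetical.
REPO_ID = "bismarck91/enA-frA-tokenised-part3"

# Feature name -> (array typecode, item size in bytes) mirroring the listed dtypes.
SCHEMA = {
    "input_ids": ("i", 4),       # int32
    "labels": ("q", 8),          # int64
    "attention_mask": ("b", 1),  # int8
}


def load_train_split():
    """Load the only split (train); needs the `datasets` package and network access."""
    from datasets import load_dataset  # imported lazily so the rest runs offline
    return load_dataset(REPO_ID, split="train")


def make_record(input_ids, labels, attention_mask):
    """Pack one sample into typed arrays that mirror the listed schema."""
    values = {
        "input_ids": input_ids,
        "labels": labels,
        "attention_mask": attention_mask,
    }
    return {name: array(code, values[name]) for name, (code, _) in SCHEMA.items()}


def check_record(record):
    """Return True if columns, item sizes, and ids/mask lengths look as expected."""
    if set(record) != set(SCHEMA):
        return False
    for name, (code, size) in SCHEMA.items():
        column = record[name]
        if column.typecode != code or column.itemsize != size:
            return False
    # Assumption: the mask marks one position per input token.
    return len(record["input_ids"]) == len(record["attention_mask"])
```

For example, make_record([11, 12, 13, 0], [21, 22], [1, 1, 1, 0]) builds a record with placeholder ids that passes check_record, while a record whose mask is shorter than its input_ids does not.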
Provided by:
bismarck91



