amuvarma/stt-contentonly-textloss-300k
Source: Hugging Face. Updated 2024-11-06; indexed 2024-12-14.
Download link:
https://hf-mirror.com/datasets/amuvarma/stt-contentonly-textloss-300k
Description:
The dataset has three features: input_ids (a sequence of int32), attention_mask (a sequence of int8), and labels (a sequence of int64). It consists of a single train split with 300,000 examples. The dataset size is 6,099,300,000 bytes and the download size is 651,323,549 bytes. The data files match the path pattern data/train-*.
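
The figures above can be sanity-checked with a little arithmetic, and the split can be loaded with the Hugging Face `datasets` library. The following is a minimal sketch, not an official loader: the helper names (`average_bytes_per_example`, `compression_ratio`, `load_train_split`) are hypothetical, the per-element byte widths are simply those implied by the stated dtypes, and `load_train_split` assumes the repository is still publicly available on the Hugging Face Hub and that `datasets` is installed.

```python
# Figures quoted in the dataset description.
NUM_EXAMPLES = 300_000
DATASET_BYTES = 6_099_300_000
DOWNLOAD_BYTES = 651_323_549

# Bytes per element implied by the stated dtypes (int32, int8, int64).
BYTES_PER_ELEMENT = {"input_ids": 4, "attention_mask": 1, "labels": 8}


def average_bytes_per_example() -> float:
    """Average in-memory size of one example, from the stated totals."""
    return DATASET_BYTES / NUM_EXAMPLES


def compression_ratio() -> float:
    """Ratio of the stated dataset size to the stated download size."""
    return DATASET_BYTES / DOWNLOAD_BYTES


def load_train_split(streaming: bool = True):
    """Load the train split from the Hub (hypothetical helper).

    Assumes `pip install datasets` and network access. Streaming avoids
    downloading all data files before the first example is available.
    """
    from datasets import load_dataset  # imported lazily; optional dependency

    return load_dataset(
        "amuvarma/stt-contentonly-textloss-300k",
        split="train",
        streaming=streaming,
    )


if __name__ == "__main__":
    print(f"average bytes per example: {average_bytes_per_example():.0f}")
    print(f"size / download ratio: {compression_ratio():.2f}")
```

With streaming enabled, `next(iter(load_train_split()))` should yield one record containing the three features listed above, without fetching the full download first.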
Provider:
amuvarma



