kreasof-ai/SPL-Combined
Source: Hugging Face. Updated: 2025-03-14. Indexed: 2025-04-12.
Download link:
https://hf-mirror.com/datasets/kreasof-ai/SPL-Combined
Summary:
The dataset pairs each text sample (text) with a corrupted version of it (corrupted_text), and also provides input_ids, attention_mask, and labels for model training. It is split into a training set of 1,890,000 examples (53.4 GB) and a validation set of 210,000 examples (5.9 GB). The full dataset is 59.4 GB, with a download size of 3.1 GB. The dataset configuration lists the data file paths for the training and validation splits.
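Because the full dataset is far larger than most quick experiments need, it can help to inspect a few records before committing to a download. The following is a minimal sketch, not an official loader: it assumes the Hugging Face `datasets` package is installed, that the repository id matches the listing title, and that the splits are named `train` and `validation` (the summary describes them but does not give the exact names). The helper names `missing_columns` and `peek` are hypothetical.

```python
from itertools import islice

# Repository id taken from the listing title; treat it as an assumption.
REPO_ID = "kreasof-ai/SPL-Combined"

# Column names as given in the summary above.
EXPECTED_COLUMNS = ["text", "corrupted_text", "input_ids", "attention_mask", "labels"]


def missing_columns(example):
    """Return the expected column names that are absent from one example dict."""
    return [name for name in EXPECTED_COLUMNS if name not in example]


def peek(split="validation", n=3):
    """Stream the first n examples of a split without downloading the full dataset.

    The split name is an assumption; adjust it if the repository uses other names.
    """
    # Imported lazily so missing_columns() works without the datasets package.
    from datasets import load_dataset

    stream = load_dataset(REPO_ID, split=split, streaming=True)
    return list(islice(stream, n))
```

Calling `peek()` and passing each returned example to `missing_columns` gives a quick check that the columns described in the summary are actually present. Streaming is used here so that only the requested records are fetched.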
Provider:
kreasof-ai



