minashirinchi/perspell-tokens-thirtytwo-labeled
Source: Hugging Face. Updated 2025-08-05; indexed 2025-10-25.
Download link:
https://hf-mirror.com/datasets/minashirinchi/perspell-tokens-thirtytwo-labeled
Description:
This is a text correction dataset that pairs erroneous text with its corrected form. Each example provides a list of token IDs (input_ids), an attention mask (attention_mask), labels (labels), the list of tokens (tokens), a list of word IDs (word_ids), and a list of corrected texts (corrections). The data is split into training, validation, and test sets, used for model training, validation, and testing respectively.
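The fields listed above can be explored with a short script. The following is a minimal sketch, not the dataset author's own tooling. It assumes the repository loads with the standard `datasets.load_dataset` call, that `word_ids` uses `None` for special tokens and padding, and that `-100` marks label positions to ignore (a common convention, not confirmed by this listing). The helper `group_by_word` and the example record are hypothetical and for illustration only.

```python
from collections import defaultdict

REPO_ID = "minashirinchi/perspell-tokens-thirtytwo-labeled"


def load_splits():
    """Download the dataset from the Hugging Face Hub (needs network access)."""
    from datasets import load_dataset  # pip install datasets

    return load_dataset(REPO_ID)


def group_by_word(tokens, word_ids, labels, ignore_index=-100):
    """Collect the tokens and labels that belong to each word.

    Positions whose word ID is None are skipped (assumed to be special
    tokens or padding). Labels equal to ignore_index are dropped
    (assumed convention, not confirmed by the dataset listing).
    """
    words = defaultdict(lambda: {"tokens": [], "labels": []})
    for token, word_id, label in zip(tokens, word_ids, labels):
        if word_id is None:
            continue
        words[word_id]["tokens"].append(token)
        if label != ignore_index:
            words[word_id]["labels"].append(label)
    # Return the words in order of their word ID.
    return [words[i] for i in sorted(words)]


# Hypothetical record shaped like the described features (not real data).
example = {
    "tokens": ["[CLS]", "he", "llo", "world", "[SEP]"],
    "word_ids": [None, 0, 0, 1, None],
    "labels": [-100, 1, -100, 0, -100],
}
grouped = group_by_word(example["tokens"], example["word_ids"], example["labels"])
```

With a real download, `load_splits()` returns a mapping of split names to datasets; inspecting its keys and the first record of one split is a quick way to check the assumptions above before relying on them.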
Provider:
minashirinchi



