Macropodus/csc_public_de3
Source: Hugging Face. Updated 2025-01-20; indexed 2025-02-15.
Download link:
https://hf-mirror.com/datasets/Macropodus/csc_public_de3
Description:
The csc_public_de3 dataset is a Chinese text correction dataset that targets a common class of grammatical errors: confusion among the particles 的, 地, and 得 (all typically pronounced "de"). It is generated from high-quality sources such as People's Daily (人民日报) and the Xuexi Qiangguo (学习强国) website, and it includes training, validation, and test sets totaling about 141,843 entries. Sentence lengths vary widely, with an average length of 36, which makes the dataset suitable for training Chinese text correction models.
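To make the error type concrete, here is a minimal sketch of how a 的/地/得 correction can be read off a sentence pair. It assumes that errors are pure one-for-one character substitutions, so both sentences have the same length. The function name `de_corrections` and the example sentence are hypothetical, written for illustration only; they are not taken from the dataset, and the dataset's actual field names are not assumed here.

```python
from typing import List, Tuple

# The three particles the dataset focuses on.
DE_PARTICLES = {"的", "地", "得"}


def de_corrections(source: str, target: str) -> List[Tuple[int, str, str]]:
    """Return (index, wrong_char, right_char) for each position where both
    sentences hold a 的/地/得 particle but the particles differ.

    Assumption: errors are substitution-only, so lengths must match.
    """
    if len(source) != len(target):
        raise ValueError("expected equal-length sentences (substitution-only errors)")
    return [
        (i, s, t)
        for i, (s, t) in enumerate(zip(source, target))
        if s != t and s in DE_PARTICLES and t in DE_PARTICLES
    ]


# Hypothetical pair: the adverbial marker 地 was mistyped as 的.
wrong = "他高兴的跳了起来"
right = "他高兴地跳了起来"
print(de_corrections(wrong, right))  # [(3, '的', '地')]
```

A helper like this could be used to count how often each particle is confused with the others in a set of sentence pairs, once the pairs have been loaded.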
Provider:
Macropodus



