SIGHAN 2014
Indexed from arXiv on 2025-09-30
Download link:
https://github.com/wdimmy/Automatic-Corpus-Generation/blob/master/corpus/train.sgml
Resource description:
This dataset is derived from a Chinese Spelling Check competition and is intended for spelling error correction. It includes the standard training and test splits. The data has been preprocessed to convert Traditional Chinese to Simplified Chinese, which better supports the spelling error correction task.
Provider:
SIGHAN



