SIGHAN15
Source: arXiv. Indexed 2025-09-30.
Download link:
http://ir.itc.ntnu.edu.tw/lre/sighan8csc.html
Description:
SIGHAN15 is a benchmark dataset for evaluating Chinese Grammatical Error Correction (CGEC). It contains 2,339 training samples and 1,100 test samples, and it is widely used to measure model performance on this task.



