c4-200m
Source: OpenXLab (indexed 2026-04-18)
Download link:
https://openxlab.org.cn/datasets/OpenDataLab/c4-200m
Description:
c4-200m is a collection of 185 million sentence pairs generated from the cleaned English portion of C4. The dataset can be used for grammatical error correction (GEC) tasks.
The corruption edits and scripts used to synthesize this dataset come from: C4-200M Synthetic Dataset
Provider:
OpenDataLab
Created:
2023-12-07



