c4-200m

Name: c4-200m
Creator: OpenDataLab
License: No description available

Source: OpenXLab (indexed 2026-04-18)

Download link:

https://openxlab.org.cn/datasets/OpenDataLab/c4-200m

Description:

c4-200m is a collection of 185 million sentence pairs generated from the cleaned English dataset in C4. It can be used for grammatical error correction (GEC) tasks. The corruption edits and scripts used to synthesize this dataset come from the C4-200M Synthetic Dataset.

Provider:

OpenDataLab

Created:

2023-12-07