Ro551/WikiCorrupted_spanish_to_GEC-GED_myyyT
Source: Hugging Face. Updated: 2025-12-16. Indexed: 2025-12-20.
Download link:
https://hf-mirror.com/datasets/Ro551/WikiCorrupted_spanish_to_GEC-GED_myyyT
Description:
This dataset contains sentences with grammatical errors. Each example provides a corrupted (erroneous) version of the sentence together with the fields tokens (the tokenized sentence), error_tags (error labels), error_type (the type of error), and corrupted_tagged (the corrupted sentence with its errors marked). Error tags fall into specific categories, such as G-gen and G-nsing, that indicate different types of grammatical error. The dataset is divided into train, validation, and test splits, and the number of examples and the byte size are specified for each split.
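The record layout described above can be sketched in Python. This is a minimal illustration, not the dataset's documented schema: the field names come from the description, but the sample values, the "O" label for error-free tokens, and the one-tag-per-token alignment are all assumptions, and flagged_tokens is a hypothetical helper.

```python
from typing import Dict, List, Tuple

# Hypothetical record. Field names follow the description; the values, the
# "O" label for correct tokens, and the one-tag-per-token alignment are
# assumptions made for illustration only.
example: Dict[str, object] = {
    "corrupted": "La casa es bonito .",
    "tokens": ["La", "casa", "es", "bonito", "."],
    "error_tags": ["O", "O", "O", "G-gen", "O"],
    "error_type": "G-gen",
    "corrupted_tagged": "La casa es <G-gen>bonito</G-gen> .",
}


def flagged_tokens(
    record: Dict[str, object], correct_label: str = "O"
) -> List[Tuple[int, str, str]]:
    """Return (position, token, tag) for every token whose tag marks an error."""
    tokens = list(record["tokens"])
    tags = list(record["error_tags"])
    # The helper only makes sense if tags are aligned one-to-one with tokens.
    if len(tokens) != len(tags):
        raise ValueError("tokens and error_tags must have the same length")
    return [
        (i, tok, tag)
        for i, (tok, tag) in enumerate(zip(tokens, tags))
        if tag != correct_label
    ]


print(flagged_tokens(example))  # [(3, 'bonito', 'G-gen')]
```

If the actual records use a different tagging layout (for example span-level tags rather than per-token tags), the alignment check and the filter would need to be adapted accordingly.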
Provider:
Ro551



