27Group/noisy_zarma
Source: Hugging Face. Updated 2025-08-13. Indexed 2025-11-01.
Download link:
https://hf-mirror.com/datasets/27Group/noisy_zarma
Description:
The Zarma Noisy Dataset is a collection of Zarma sentences with artificially introduced noise to simulate human-like errors. It is designed for tasks such as grammatical error correction, text denoising, and robustness testing in natural language processing for low-resource languages like Zarma. The dataset is derived from a clean monolingual Zarma dataset by applying various types of noise, including character-level and word-level modifications.
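The description names character-level and word-level noise but does not say which operations were used. As a rough illustration only, the sketch below assumes three character edits (delete, swap, duplicate) and two word edits (drop, swap neighbours). The function names, the probabilities, and the choice of operations are all hypothetical and are not taken from the dataset's actual generation script.

```python
import random


def char_noise(word, rng):
    """Apply one random character-level edit: delete, swap, or duplicate."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["delete", "swap", "duplicate"])
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "swap":
        # Exchange the characters at positions i and i + 1.
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    # Duplicate the character at position i.
    return word[:i] + word[i] + word[i:]


def word_noise(words, rng):
    """Apply one random word-level edit: drop a word or swap two neighbours."""
    out = list(words)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    if rng.random() < 0.5:
        del out[i]
    else:
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def add_noise(sentence, char_prob=0.1, word_prob=0.3, seed=None):
    """Return a noisy copy of `sentence` (hypothetical noise model).

    char_prob: chance that each word receives one character edit.
    word_prob: chance that the sentence receives one word edit.
    """
    rng = random.Random(seed)
    words = sentence.split()
    if rng.random() < word_prob:
        words = word_noise(words, rng)
    words = [char_noise(w, rng) if rng.random() < char_prob else w
             for w in words]
    return " ".join(words)


# Placeholder input, not real Zarma text.
clean = "one two three four five"
noisy = add_noise(clean, char_prob=0.5, word_prob=0.5, seed=0)
```

Pairing each clean sentence with its noisy copy gives the (input, target) pairs that error-correction and denoising models train on. Fixing `seed` makes the corruption reproducible.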
Provided by:
27Group



