27Group/noisy_zarma

Name: 27Group/noisy_zarma
Creator: 27Group
Published: 2025-08-13 15:45:30
License: No description available

Source: Hugging Face. Updated 2025-08-13. Indexed 2025-11-01.

Download link:

https://hf-mirror.com/datasets/27Group/noisy_zarma


Description:

The Zarma Noisy Dataset is a collection of Zarma sentences with artificially introduced noise to simulate human-like errors. It is designed for tasks such as grammatical error correction, text denoising, and robustness testing in natural language processing for low-resource languages like Zarma. The dataset is derived from a clean monolingual Zarma dataset by applying various types of noise, including character-level and word-level modifications.
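To make the idea of character-level and word-level noise concrete, here is a minimal Python sketch. It is not the procedure 27Group used to build this dataset, which the listing does not describe. The function names (`char_noise`, `word_noise`, `add_noise`), the specific edit operations, and the rates are all assumptions chosen for illustration, and the example sentence is a placeholder rather than real Zarma text.

```python
import random


def char_noise(word, rng):
    """Apply one character-level edit: delete, duplicate, or swap adjacent characters."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    op = rng.choice(["delete", "duplicate", "swap"])
    if op == "delete":
        return word[:i] + word[i + 1:]
    if op == "duplicate":
        return word[:i + 1] + word[i] + word[i + 1:]
    # swap characters i and i + 1
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def word_noise(words, rng):
    """Apply one word-level edit: drop a word or swap two neighbouring words."""
    out = list(words)
    if len(out) < 2:
        return out
    i = rng.randrange(len(out) - 1)
    if rng.random() < 0.5:
        del out[i]
    else:
        out[i], out[i + 1] = out[i + 1], out[i]
    return out


def add_noise(sentence, char_rate=0.2, word_rate=0.3, seed=None):
    """Return a noisy copy of `sentence` (rates are illustrative assumptions)."""
    rng = random.Random(seed)  # a seeded generator keeps the output reproducible
    words = sentence.split()
    if rng.random() < word_rate:
        words = word_noise(words, rng)
    words = [char_noise(w, rng) if rng.random() < char_rate else w for w in words]
    return " ".join(words)


# Placeholder text, not real Zarma.
clean = "alpha beta gamma delta"
noisy = add_noise(clean, seed=0)
```

Pairing each `clean` sentence with its `noisy` counterpart gives the kind of (noisy, clean) examples that error correction and denoising models are typically trained or evaluated on.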

Provider:

27Group
