alakxender/dv-synthetic-errors-mixed
Hugging Face · Updated 2025-04-21 · Indexed 2026-03-29
Download link:
https://hf-mirror.com/datasets/alakxender/dv-synthetic-errors-mixed
Resource description:
---
dataset_info:
  features:
  - name: correct
    dtype: string
  - name: incorrect
    dtype: string
  splits:
  - name: train
    num_bytes: 7447530381
    num_examples: 18966984
  download_size: 2181739369
  dataset_size: 7447530381
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
license: apache-2.0
language:
- dv
tags:
- dhivehi
- thaana
- errors
pretty_name: thaana-errors
size_categories:
- 10M<n<100M
---
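As a rough sanity check on the metadata above, the declared sizes can be turned into an average size per pair and a download-to-prepared ratio. This is only a sketch: the three constants are copied from the `dataset_info` block, and the reading of `num_bytes` as the prepared (uncompressed) size and `download_size` as the size of the downloaded files is an assumption about how these fields are usually populated.

```python
# Values copied from the dataset_info block of this card.
NUM_BYTES = 7_447_530_381      # assumed: prepared (uncompressed) size of the train split
NUM_EXAMPLES = 18_966_984      # rows in the train split
DOWNLOAD_SIZE = 2_181_739_369  # assumed: size of the files fetched from the Hub

# Average footprint of one correct/incorrect pair.
avg_bytes_per_example = NUM_BYTES / NUM_EXAMPLES

# Fraction of the prepared size that actually has to be downloaded.
download_ratio = DOWNLOAD_SIZE / NUM_BYTES

print(f"{avg_bytes_per_example:.1f} bytes per pair on average")
print(f"download is {download_ratio:.1%} of the prepared size")
```

The row count also falls inside the declared `10M<n<100M` size category.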
DV Text Errors
Dhivehi text error correction dataset containing correct sentences and synthetically generated errors.
The dataset aims to test Dhivehi language error correction models and tools.
About Dataset
Task: Text error correction
Language: Dhivehi (dv)
Dataset Structure
Input-output pairs of Dhivehi text:
correct: Original correct sentences
incorrect: Sentences with synthetic errors
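A minimal sketch of how the two columns might be inspected, assuming the third-party `datasets` library is installed. The helper names `stream_pairs` and `changed_spans` are hypothetical and not part of the dataset or of any library; streaming is used only to avoid downloading the full train split.

```python
import difflib
from typing import Iterator

DATASET_ID = "alakxender/dv-synthetic-errors-mixed"


def stream_pairs(limit: int = 3) -> Iterator[dict]:
    """Yield the first `limit` rows of the train split without a full download."""
    # Imported lazily so the diff helper below works without the library.
    from datasets import load_dataset  # third-party: pip install datasets

    rows = load_dataset(DATASET_ID, split="train", streaming=True)
    for i, row in enumerate(rows):
        if i >= limit:
            break
        yield row  # each row is a dict with "correct" and "incorrect" strings


def changed_spans(correct: str, incorrect: str) -> list[tuple[str, str, str]]:
    """Return (operation, correct_text, incorrect_text) for every differing span."""
    matcher = difflib.SequenceMatcher(a=correct, b=incorrect, autojunk=False)
    return [
        (tag, correct[i1:i2], incorrect[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Requires network access; prints where each synthetic error was injected.
    for pair in stream_pairs(limit=3):
        print(changed_spans(pair["correct"], pair["incorrect"]))
```

The character-level diff is one simple way to see what kind of synthetic error a pair contains; it makes no assumption about how the errors were generated.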
***Note: This is a replica of alakxender/dv-synthetic-errors with more synthetic errors added (x5).***
Provider:
alakxender



