
LocalDoc/azerbaijani_spelling

Source: Hugging Face. Updated 2024-06-03, indexed 2024-06-12.
Download link:
https://hf-mirror.com/datasets/LocalDoc/azerbaijani_spelling
Resource description:
---
license: cc-by-nc-sa-4.0
dataset_info:
  features:
  - name: index
    dtype: string
  - name: original_text
    dtype: string
  - name: correct_text
    dtype: string
  splits:
  - name: train
    num_bytes: 19647891
    num_examples: 84154
  download_size: 14582901
  dataset_size: 19647891
configs:
- config_name: default
  data_files:
  - split: train
    path: data/train-*
task_categories:
- text2text-generation
language:
- az
tags:
- spelling
- azerbaijan
pretty_name: Dataset of Corrected Spelling Errors in Azerbaijani
size_categories:
- 10K<n<100K
---

# Dataset of Corrected Spelling Errors in Azerbaijani

## Overview

This repository contains a dataset curated for correcting spelling errors in the Azerbaijani language. The dataset consists of 84,154 text pairs, each made up of an original text and its corrected version. It is designed to aid the development and evaluation of machine learning models for spelling correction in Azerbaijani.

## Dataset Structure

- **index**: Unique index generated using uuid.
- **original_text**: Original text collected from social networks.
- **correct_text**: Corrected version of `original_text`.

## License

This dataset is licensed under the CC BY-NC-ND 4.0 license.

What does this license allow?

- **Attribution**: You must give appropriate credit, provide a link to the license, and indicate if changes were made.
- **Non-Commercial**: You may not use the material for commercial purposes.
- **No Derivatives**: If you remix, transform, or build upon the material, you may not distribute the modified material.

For more information, please refer to the [CC BY-NC-ND 4.0 license](https://creativecommons.org/licenses/by-nc-nd/4.0/).

## Contact

For more information, questions, or issues, please contact LocalDoc at [v.resad.89@gmail.com].
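## Example usage

The following is a minimal sketch, not part of the original dataset card, of how the `original_text` / `correct_text` pairs might be inspected. It assumes the Hugging Face `datasets` library is installed and that the dataset is reachable under the ID `LocalDoc/azerbaijani_spelling`. The helper names `changed_spans` and `load_pairs` are hypothetical and introduced only for illustration; the diffing uses Python's standard `difflib`.

```python
import difflib


def changed_spans(original: str, corrected: str):
    """Return (original_fragment, corrected_fragment) for every differing span."""
    matcher = difflib.SequenceMatcher(a=original, b=corrected, autojunk=False)
    return [
        (original[i1:i2], corrected[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def load_pairs(split: str = "train"):
    """Load (original_text, correct_text) pairs.

    Assumption: requires `pip install datasets` and network access.
    """
    from datasets import load_dataset  # imported lazily so the diff helper works offline

    ds = load_dataset("LocalDoc/azerbaijani_spelling", split=split)
    return [(row["original_text"], row["correct_text"]) for row in ds]


if __name__ == "__main__":
    # Print the character-level corrections for the first few pairs.
    for original, corrected in load_pairs()[:5]:
        print(changed_spans(original, corrected))
```

Character-level diffs like these can help when checking which kinds of errors a spelling-correction model needs to handle, for example missing diacritics versus typos.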
Provider:
LocalDoc