murat/kyrgyz_sentences_with_incorrect_and_correct_umlaut_characters
收藏Hugging Face2025-08-18 更新2025-10-25 收录
下载链接:
https://hf-mirror.com/datasets/murat/kyrgyz_sentences_with_incorrect_and_correct_umlaut_characters
下载链接
链接失效反馈官方服务:
资源简介:
这个数据集旨在为语言模型进行微调,以进行基里尔字符的修正。它解决了数字吉尔吉斯文本中特定字符被俄语键盘对应字符替换的问题。数据集采用对话提示格式,每个数据点包括一个可能包含字符替换的用户输入文本和一个包含正确吉尔吉斯字符的助手修正文本。
This dataset is designed to fine-tune language models for a Kyrgyz-to-Kyrgyz orthographic correction task. It addresses the common issue of specific Cyrillic characters specific to the Kyrgyz language being replaced by their Russian keyboard counterparts.
提供机构:
murat



