TSjB/qarachay-malqar_russian_parallel_corpora_dialectic-free
Source: Hugging Face. Updated 2024-04-14; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/TSjB/qarachay-malqar_russian_parallel_corpora_dialectic-free
Dataset card:
---
license: cc-by-nc-sa-4.0
language:
- krc
- ru
task_categories:
- translation
dataset_info:
  features:
  - name: krc
    dtype: string
  - name: rus
    dtype: string
  - name: short_source
    dtype: string
  - name: source
    dtype: string
  - name: type
    dtype: string
viewer: true
---
**288532** parallel sentence pairs in Russian and Qarachay-Malqar.
Sources: a corpus of Qarachay-Malqar folklore (tales, epics, etc.), poems of Kaisyn Kuliev, the Uzden Codex, religious literature, artistic literature, movies, cartoons, Soviet reports, a Qarachay-Malqar phrasebook, and a dictionary.
The dataset is divided into several **types**: one_sentence (a single sentence), one_word (a single word or short phrase), and several_sentences (several sentences, about 5, or a paragraph).
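The `type` column makes it easy to keep only the granularity you need. Below is a minimal sketch, not the authors' own tooling: the helper names (`load_rows`, `filter_by_type`, `count_types`) are hypothetical, the split name `"train"` is an assumption, and the sample rows are placeholders rather than real corpus entries. The column names come from the card's metadata.

```python
from collections import Counter

REPO_ID = "TSjB/qarachay-malqar_russian_parallel_corpora_dialectic-free"
# The three values of the `type` column listed in the dataset card.
KNOWN_TYPES = ("one_sentence", "one_word", "several_sentences")


def load_rows(split="train"):
    """Download the corpus from the Hub (needs `pip install datasets`).

    Assumption: the data is published under a split called "train".
    """
    from datasets import load_dataset  # imported lazily so the rest runs offline

    return load_dataset(REPO_ID, split=split)


def filter_by_type(rows, wanted):
    """Return only the rows whose `type` column equals `wanted`."""
    if wanted not in KNOWN_TYPES:
        raise ValueError(f"unknown type {wanted!r}, expected one of {KNOWN_TYPES}")
    return [row for row in rows if row["type"] == wanted]


def count_types(rows):
    """Count how many rows fall into each `type`."""
    return Counter(row["type"] for row in rows)


# Placeholder rows that only mimic the schema; they are NOT real corpus content.
sample_rows = [
    {"krc": "<krc sentence A>", "rus": "<rus sentence A>",
     "short_source": "<short source>", "source": "<source>", "type": "one_sentence"},
    {"krc": "<krc sentence B>", "rus": "<rus sentence B>",
     "short_source": "<short source>", "source": "<source>", "type": "one_sentence"},
    {"krc": "<krc word>", "rus": "<rus word>",
     "short_source": "<short source>", "source": "<source>", "type": "one_word"},
]

sentences_only = filter_by_type(sample_rows, "one_sentence")
print(len(sentences_only), dict(count_types(sample_rows)))
```

Both helpers only iterate over rows and read `row["type"]`, so they should work the same way on the object returned by `load_rows()` as on the in-memory list above.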
Because of dialect differences in Qarachay-Malqar and diphthongs, some letters are replaced with Latin ones:
b - б/п/ф
w - ў
q - къ
g - гъ
n - нг
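One way to read the table above is that each Latin letter stands for one or more Cyrillic spellings. Under that assumption, the sketch below enumerates the Cyrillic candidates for a dialect-free token. The function name `expand_variants` and the example token are hypothetical, and the sketch assumes lowercase Latin letters only, since the card does not say how capitals are handled.

```python
from itertools import product

# Latin placeholder -> Cyrillic spellings it may stand for (from the table above).
LATIN_TO_CYRILLIC = {
    "b": ("б", "п", "ф"),
    "w": ("ў",),
    "q": ("къ",),
    "g": ("гъ",),
    "n": ("нг",),
}


def expand_variants(token):
    """List every Cyrillic spelling a dialect-free token could correspond to.

    Characters that are not Latin placeholders are kept unchanged.
    """
    options = [LATIN_TO_CYRILLIC.get(ch, (ch,)) for ch in token]
    return ["".join(combo) for combo in product(*options)]


# "qаb" is a made-up token (Latin q, Cyrillic а, Latin b), not a real word.
print(expand_variants("qаb"))  # ['къаб', 'къап', 'къаф']
```

The number of candidates grows as 3 to the power of the number of `b` letters in the token, so this is only practical for single words or short phrases.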
We use this dataset for translation.
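For a translation setup, each row can be turned into a (source, target) pair in either direction. This is a minimal sketch assuming only the `krc` and `rus` columns from the card's metadata; the helper name `to_pair` and the placeholder row are hypothetical and do not describe the authors' training pipeline.

```python
def to_pair(row, source_lang="krc"):
    """Return (source_text, target_text) for the chosen translation direction.

    `source_lang` must be one of the two text columns: "krc" or "rus".
    """
    if source_lang == "krc":
        return row["krc"], row["rus"]
    if source_lang == "rus":
        return row["rus"], row["krc"]
    raise ValueError(f"source_lang must be 'krc' or 'rus', got {source_lang!r}")


# Placeholder row that mimics the schema; it is NOT real corpus content.
example_row = {"krc": "<krc text>", "rus": "<rus text>", "type": "one_sentence"}

print(to_pair(example_row))          # Qarachay-Malqar -> Russian
print(to_pair(example_row, "rus"))   # Russian -> Qarachay-Malqar
```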
**Authors**: [Bogdan Tewunalany](https://t.me/bogdan_tewunalany), [Ali Berberov](https://t.me/ali_berberov)
Provider:
TSjB
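Summary of original information
Dataset overview
Dataset information
- License: cc-by-nc-sa-4.0
- Languages:
  - krc (Qarachay-Malqar)
  - ru (Russian)
- Task category: translation
Dataset features
- Features:
  - krc: string
  - rus: string
  - short_source: string
  - source: string
  - type: string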
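Dataset contents
- Number of sentences: 288532 parallel sentence pairs
- Sources:
  - Qarachay-Malqar folklore (tales, epics, etc.)
  - Poems of Kaisyn Kuliev
  - Uzden Codex
  - Religious literature
  - Artistic literature
  - Movies
  - Cartoons
  - Soviet reports
  - Qarachay-Malqar phrasebook
  - Dictionary
- Types:
  - one_sentence: a single sentence
  - one_word: a single word or short phrase
  - several_sentences: several sentences (about 5, or a paragraph)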
Language variants
- Letter substitutions:
  - b: б/п/ф
  - w: ў
  - q: къ
  - g: гъ
  - n: нг
Authors
- Authors:
  - Bogdan Tewunalany
  - Ali Berberov



