allandclive/UgandaLex2

Hugging Face | Updated 2023-07-12 | Indexed 2024-03-04
Download link:
https://hf-mirror.com/datasets/allandclive/UgandaLex2
Resource description:

---
task_categories:
- text-generation
- translation
language:
- ach
- alz
- teo
- gwr
- adh
- keo
- kin
- laj
- lgg
- myx
- kdj
- nyn
- nuj
- xog
- lg
- en
- luc
- kbo
- tjl
- rub
- ndp
- nyo
- lsm
pretty_name: UgandaLex2
size_categories:
- 1K<n<10K
---

### UgandaLex2: A Parallel Text Translation Corpus in 24 Ugandan Languages (3 added languages)

UgandaLex Parallel Texts in Ugandan Languages is a dataset of parallel texts sourced from Bible translations across 21 Ugandan languages; UgandaLex2 adds 3 more, for 24 in total. The corpus is a resource for studying linguistic variation within Uganda's diverse language landscape. With aligned texts from various Bible translations, researchers, linguists, and developers can examine individual Ugandan languages, explore translation patterns, and investigate the cultural and linguistic heritage of different communities. UgandaLex supports research in computational linguistics, cross-linguistic analysis, and the development of language technologies tailored to Ugandan languages.

### Languages

**Kebu**, Acholi, **Saamya-Gwe**, **Nyoro**, Alur, Aringa, Ateso, Ganda, Gwere, Jopadhola, Kakwa, Kinyarwanda, Kumam, Lango, Lugbara, Masaaba, Ng'akarimojong, Nyankore, Nyole, Soga, Swahili, English, Gungu, Keliko, Talinga-Bwisi

### Contributors

@allandclive & @oumo_os
Provider:
allandclive
Summary of original information

Dataset overview

Name: UgandaLex2

Categories:

  • Text generation
  • Translation

Languages:

  • ach
  • alz
  • teo
  • gwr
  • adh
  • keo
  • kin
  • laj
  • lgg
  • myx
  • kdj
  • nyn
  • nuj
  • xog
  • lg
  • en
  • luc
  • kbo
  • tjl
  • rub
  • ndp
  • nyo
  • lsm

Pretty name: UgandaLex2

Size categories:

  • 1K<n<10K
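
The fields listed above come from the YAML front matter of the dataset card. As a minimal sketch of how that block could be read programmatically, the Python below parses the front matter as it is reproduced on this page. The `FRONT_MATTER` constant and the `parse_front_matter` helper are hypothetical names introduced for illustration; the parser only handles the flat `key: value` and `key:` plus `- item` shapes seen here, not general YAML.

```python
# Front matter copied from the dataset card as shown on this page.
FRONT_MATTER = """\
task_categories:
- text-generation
- translation
language:
- ach
- alz
- teo
- gwr
- adh
- keo
- kin
- laj
- lgg
- myx
- kdj
- nyn
- nuj
- xog
- lg
- en
- luc
- kbo
- tjl
- rub
- ndp
- nyo
- lsm
pretty_name: UgandaLex2
size_categories:
- 1K<n<10K
"""


def parse_front_matter(text: str) -> dict:
    """Parse a flat YAML subset: scalar `key: value` lines and
    `key:` lines followed by `- item` list entries (hypothetical helper)."""
    meta = {}
    current_key = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line == "---":
            continue  # skip blank lines and front-matter delimiters
        if line.startswith("- "):
            # A list item must follow a key that was declared without a value.
            if current_key is None or not isinstance(meta[current_key], list):
                raise ValueError(f"list item without a list key: {line!r}")
            meta[current_key].append(line[2:].strip())
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"unrecognized line: {line!r}")
            current_key = key.strip()
            value = value.strip()
            # An empty value means a list follows on the next lines.
            meta[current_key] = value if value else []
    return meta


metadata = parse_front_matter(FRONT_MATTER)
print(metadata["pretty_name"])
print(metadata["task_categories"])
print(len(metadata["language"]), "language codes")
```

On the front matter reproduced here, the parser finds 23 language codes, while the card title mentions 24 languages and the list of language names has 25 entries; the card does not explain the difference. For anything beyond this flat layout, a full YAML parser such as PyYAML's `yaml.safe_load` would be the more usual choice.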

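Dataset description

UgandaLex2 is a parallel text translation corpus covering 24 Ugandan languages, 3 of them newly added. It consists of parallel texts from Bible translations, building on the 21 Ugandan languages of the original UgandaLex, and is a resource for studying Uganda's linguistic diversity. With these aligned texts, researchers, linguists, and developers can explore the nuances of Ugandan languages, study translation patterns, and investigate the cultural and linguistic heritage of different communities. UgandaLex2 supports research in computational linguistics, cross-linguistic analysis, and the development of language technologies for Ugandan languages.

Languages included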

  • Kebu
  • Acholi
  • Saamya-Gwe
  • Nyoro
  • Alur
  • Aringa
  • Ateso
  • Ganda
  • Gwere
  • Jopadhola
  • Kakwa
  • Kinyarwanda
  • Kumam
  • Lango
  • Lugbara
  • Masaaba
  • Ng'akarimojong
  • Nyankore
  • Nyole
  • Soga
  • Swahili
  • English
  • Gungu
  • Keliko
  • Talinga-Bwisi