
rosettarandd/rosetta_balcanica

Source: Hugging Face. Updated 2021-11-14. Indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/rosettarandd/rosetta_balcanica
Resource description:
# Dataset Summary

We present *rosetta-balcanica*, a manually extracted multilingual machine translation dataset for low-resource western Balkan languages. The documents were sourced from the Organization for Security and Co-operation in Europe (OSCE) website by applying appropriate language filters. The filtered list of documents can be found [here](https://www.osce.org/resources/documents?filters=%20sm_translations%3A%28sq%29&solrsort=score%20desc&rows=10).

# Languages Supported

Currently, our dataset contains documents in [Macedonian](https://github.com/ebegoli/rosetta-balcanica) and [Albanian](https://en.wikipedia.org/wiki/Albanian_language) (also known as Shqip).
Provided by:
rosettarandd
Summary of original information

Dataset overview

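  • Name: rosetta-balcanica
  • Type: multilingual machine translation dataset
  • Target languages: low-resource western Balkan languages
  • Data source: manually extracted from the Organization for Security and Co-operation in Europe (OSCE) website, with documents selected by applying appropriate language filters.
  • Filtered documents link: filtered document list on the OSCE website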
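
The language filter mentioned above can be read off the query string of the filtered-documents URL given in the resource description. The sketch below only decodes that URL with the Python standard library; it does not contact the OSCE website. The helper name `decode_query` is hypothetical, and reading `sm_translations` as a search field that selects documents with a translation in the given language is an assumption based on the parameter name (`sq` is the ISO 639-1 code for Albanian).

```python
from urllib.parse import urlsplit, parse_qs

# URL copied verbatim from the dataset summary.
FILTER_URL = (
    "https://www.osce.org/resources/documents"
    "?filters=%20sm_translations%3A%28sq%29&solrsort=score%20desc&rows=10"
)


def decode_query(url):
    """Return the percent-decoded query parameters of `url`.

    Hypothetical helper for illustration: keeps the first value of each
    parameter and strips surrounding whitespace.
    """
    query = urlsplit(url).query
    return {key: values[0].strip() for key, values in parse_qs(query).items()}


params = decode_query(FILTER_URL)
for key, value in params.items():
    print(f"{key} = {value}")
```

Decoded this way, the `filters` parameter reads `sm_translations:(sq)`, which suggests the list was restricted to documents available in Albanian; an analogous filter for Macedonian is not shown in the source.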

Supported languages

  • Languages currently included:
    • Macedonian
    • Albanian (also known as Shqip)