MWirelabs/garo-english-parallel-corpus
Source: Hugging Face | Updated: 2025-10-01 | Indexed: 2025-11-01
Download link:
https://hf-mirror.com/datasets/MWirelabs/garo-english-parallel-corpus
Description:
The Garo-English Parallel Corpus is a small teaser subset of 2.5k sentence pairs, intended for experimentation and pipeline demos. The full corpus (200k pairs) remains proprietary. The teaser subset is not intended for benchmarking or production training. It covers English-to-Garo translation; each record contains a source sentence, a target sentence, and language codes for the source and target. The dataset is UTF-8 encoded and licensed under CC BY 4.0.
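The record layout described above (source sentence, target sentence, and a language code for each side) can be sketched as a small validation step. This is a minimal illustration only: the field names `source`, `target`, `source_lang`, and `target_lang`, the language codes `en` and `grt`, and the example record are all assumptions, not taken from the dataset itself, so check the dataset card for the actual column names and codes.

```python
from dataclasses import dataclass

# Hypothetical field names; the real column names may differ.
REQUIRED_FIELDS = ("source", "target", "source_lang", "target_lang")


@dataclass(frozen=True)
class SentencePair:
    source: str
    target: str
    source_lang: str
    target_lang: str


def parse_record(record: dict) -> SentencePair:
    """Validate one raw record and convert it to a SentencePair."""
    missing = [name for name in REQUIRED_FIELDS if name not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    # Normalize every field to a stripped string before checking for emptiness.
    values = {name: str(record[name]).strip() for name in REQUIRED_FIELDS}
    empty = [name for name, value in values.items() if not value]
    if empty:
        raise ValueError(f"empty fields: {empty}")
    return SentencePair(**values)


# Placeholder record, not real corpus data.
example = {
    "source": "Hello.",
    "target": "<Garo translation goes here>",
    "source_lang": "en",   # assumed code
    "target_lang": "grt",  # assumed code (ISO 639-3 identifier for Garo)
}
pair = parse_record(example)
print(pair.source_lang, "->", pair.target_lang)
```

Rejecting records with missing or empty fields up front keeps a pipeline demo from silently training or evaluating on incomplete pairs, which matters more than usual with a subset of only 2.5k pairs.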
Provider:
MWirelabs



