freococo/english_myanmar_corpus
Source: Hugging Face | Updated: 2025-12-19 | Indexed: 2025-12-20
Download link:
https://hf-mirror.com/datasets/freococo/english_myanmar_corpus
Description:
This dataset is an English-Myanmar (Burmese) parallel corpus focused on natural, spoken-style translations. It is intended for machine translation research, especially EN → MY. The English sentences are derived from the AAC-C4 dataset. The Myanmar translations were newly generated with Gemini Pro 3 and manually curated for fluency and naturalness. The dataset contains 300,365 entries in CSV format, each with a text ID, an English sentence, and a spoken-style Burmese translation. It is released under the Creative Commons Attribution-NonCommercial 4.0 (CC BY-NC 4.0) license: attribution is required and use is limited to non-commercial purposes, which makes it suitable for research and academic work.
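The three-field CSV layout described above can be read with Python's standard csv module. The following is a minimal sketch, not the dataset's documented schema: the column names text_id, english, and myanmar are assumptions, and the two sample rows are made-up placeholders rather than entries from the corpus. Check the header of the real file before relying on these names.

```python
import csv
import io

# Hypothetical header and placeholder rows; the real column names may differ.
SAMPLE_CSV = (
    "text_id,english,myanmar\n"
    "1,Hello.,မင်္ဂလာပါ။\n"
    "2,Thank you.,ကျေးဇူးတင်ပါတယ်။\n"
)


def read_pairs(handle, src_col="english", tgt_col="myanmar"):
    """Yield (source, target) sentence pairs, skipping rows with an empty side."""
    for row in csv.DictReader(handle):
        src = (row.get(src_col) or "").strip()
        tgt = (row.get(tgt_col) or "").strip()
        if src and tgt:
            yield src, tgt


# Parse the in-memory sample; for a real file, pass
# open(path, newline="", encoding="utf-8") instead of io.StringIO(...).
pairs = list(read_pairs(io.StringIO(SAMPLE_CSV)))
for english, myanmar in pairs:
    print(f"{english}\t{myanmar}")
```

Opening the real file with encoding="utf-8" and newline="" matters here, because Burmese script is multi-byte and quoted CSV fields may contain embedded line breaks.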
Provider:
freococo



