Machine Translation Evaluation Dataset for Amharic
Indexed by NIAID Data Ecosystem on 2026-03-11
Download link:
https://zenodo.org/record/3669948
Description:
# Machine Translation Evaluation Dataset for Amharic
The dataset contains Amharic sentences and their corresponding English
translations, collected through crowdsourcing. These ground-truth sentences
span several domains: news headlines, social media, Wikipedia, and everyday
conversation.
## Metadata of files in the dataset
**amen.tsv** (Amharic to English)
- Domain: news | wiki | twitter | convo
- Source Sentence: Amharic sentence
- Reference Translation: English translation
- Google Translate: output of Google Translate
- Yandex Translate: output of Yandex Translate
**enam.tsv** (English to Amharic)
- Domain: news | wiki | twitter | convo
- Source Sentence: English sentence
- Reference Translation: Amharic translation
- Google Translate: output of Google Translate
- Yandex Translate: output of Yandex Translate
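
As a minimal sketch of how these files might be read, the snippet below parses tab-separated rows into dictionaries and groups them by domain. The short column keys (`domain`, `source`, `reference`, `google`, `yandex`) are hypothetical names chosen here to mirror the field list above, and whether the real files start with a header row is an assumption controlled by `has_header`. The sample rows are made-up placeholders, not data from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical keys mirroring the field order listed above; the actual
# header names in amen.tsv / enam.tsv may differ.
COLUMNS = ["domain", "source", "reference", "google", "yandex"]


def read_rows(handle, has_header=False):
    """Yield one dict per TSV line, keyed by the assumed column names."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)  # skip the header row if the file has one
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        yield dict(zip(COLUMNS, fields))


def group_by_domain(rows):
    """Collect rows into a dict of lists keyed by the domain column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["domain"]].append(row)
    return dict(grouped)


# Made-up placeholder rows standing in for the real file contents.
sample = "news\tSRC1\tREF1\tG1\tY1\nconvo\tSRC2\tREF2\tG2\tY2\n"
grouped = group_by_domain(read_rows(io.StringIO(sample)))
print({domain: len(rows) for domain, rows in grouped.items()})
```

To read a downloaded file instead of the placeholder string, pass `open("amen.tsv", encoding="utf-8", newline="")` to `read_rows`, setting `has_header=True` if the first line turns out to be a header.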
## Reference translations across domains
**News**
- These are news headlines from Ethiopian news websites.
**Wikipedia**
- A random sample of sentences from the Amharic Wikipedia.
**Twitter**
- Amharic Twitter posts on consumer products.
**Conversational**
- Everyday conversational expressions from Amharic native speakers.
## Evaluation of two systems that provide Amharic translation
The dataset also includes the output of two commercial systems, [Google
Translate](https://translate.google.com/) and [Yandex
Translate](https://translate.yandex.com/), so that they can be evaluated
against the reference translations. Both systems provide free APIs; users can
sign up to obtain access keys. The translations were generated on 14 February
2020.
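
The dataset description does not name a scoring metric, so the following is only an illustrative sketch of how a system output could be compared with a reference translation. It uses a simple whitespace-tokenized unigram F1, chosen here as a stand-in and not the dataset authors' method. The row keys (`reference`, `google`, `yandex`) are hypothetical names, and the example sentences are invented.

```python
from collections import Counter


def unigram_f1(hypothesis, reference):
    """Unigram overlap F1 between two whitespace-tokenized sentences."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not hyp or not ref:
        return 0.0
    # Counter intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(hyp) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def mean_score(rows, system):
    """Average unigram F1 of one system column over a list of rows."""
    scores = [unigram_f1(row[system], row["reference"]) for row in rows]
    return sum(scores) / len(scores) if scores else 0.0


# Invented rows; the keys are assumed names, not the real column headers.
rows = [
    {"reference": "the cat sat", "google": "the cat sat", "yandex": "a dog sat"},
    {"reference": "good morning", "google": "good day", "yandex": "good morning"},
]
for system in ("google", "yandex"):
    print(system, round(mean_score(rows, system), 3))
```

Whitespace tokenization is crude, particularly for Amharic, where punctuation often attaches to words. An established metric implementation would be a better choice for any reported comparison.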
Created:
2020-03-31



