CLARIN_MT_EN
Source: DataCite Commons. Updated: 2025-06-11. Indexed: 2025-06-14
Download link:
https://b2share.eudat.eu/records/f37920cd5a634986b261dbdc4cc55476
Description:
This dataset includes trained Marian NMT models for Maltese-English translation, exploring the impact of BPE vocabulary size (5k, 10k, 20k) on translation quality. Models were trained on approximately 24,000 sentence pairs from the ELRC-SHARE Maltese-English parallel corpus. Each model was fine-tuned for 5 epochs and evaluated using BLEU and chrF. Additionally, selected samples were linguistically annotated using UDPipe and published as a searchable concordance corpus on AutoSearch. This release includes models, vocabulary files, tokenized corpora, and documentation for reproducibility.
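The description names chrF as one of the two evaluation metrics. As a rough illustration of what a character n-gram F-score measures, here is a minimal pure-Python sketch. It is a simplified variant written for this note, not the scoring code used for this dataset: the function names `char_ngrams` and `simple_chrf`, the maximum n-gram order of 6, and beta = 2 are assumptions chosen for illustration, and reported scores should come from an established evaluation tool.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams of order n, ignoring spaces."""
    chars = text.replace(" ", "")
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified chrF-style score in [0, 1].

    Averages character n-gram precision and recall over orders
    1..max_n, then combines them into an F-beta score. This is an
    illustrative approximation, not a reference implementation.
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp = char_ngrams(hypothesis, n)
        ref = char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # this order is longer than one of the strings
        # Counter intersection keeps the minimum count per n-gram.
        overlap = sum((hyp & ref).values())
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

With beta = 2, recall is weighted more heavily than precision. Because the score works on characters rather than whole words, a partially correct inflected form still earns credit, which is one reason character-level metrics are often reported alongside BLEU for morphologically rich languages such as Maltese.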
Provider:
https://b2share.eudat.eu
Created:
2025-06-11



