
CLARIN_MT_EN

Source: DataCite Commons
Updated: 2025-06-11
Indexed: 2025-06-14
Download link:
https://b2share.eudat.eu/records/f37920cd5a634986b261dbdc4cc55476
Description:
This dataset includes trained Marian NMT models for Maltese-English translation, exploring the impact of BPE vocabulary size (5k, 10k, 20k) on translation quality. Models were trained on approximately 24,000 sentence pairs from the ELRC-SHARE Maltese-English parallel corpus. Each model was fine-tuned for 5 epochs and evaluated using BLEU and chrF. Additionally, selected samples were linguistically annotated using UDPipe and published as a searchable concordance corpus on AutoSearch. This release includes models, vocabulary files, tokenized corpora, and documentation for reproducibility.
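To make the role of BPE vocabulary size concrete, the following is a minimal, self-contained sketch of byte-pair encoding on a toy word list. It is not the tokenizer shipped with this release, and the toy corpus and the function names (`learn_bpe`, `merge_pair`, `segment`) are assumptions made only for illustration. The number of merge operations stands in for vocabulary size: the final vocabulary is roughly the base symbols plus one new symbol per merge, so a larger budget such as 20k yields longer subword units and fewer tokens per word than 5k.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = Counter(tuple(w) + ("</w>",) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Most frequent pair wins; ties are broken lexicographically for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def segment(word, merges):
    """Segment a word by applying the learned merges in the order they were learned."""
    symbols = tuple(word) + ("</w>",)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Hypothetical toy corpus, not drawn from the ELRC-SHARE data.
    corpus = ["low", "low", "lower", "newest", "newest", "widest"]
    for budget in (2, 10):
        rules = learn_bpe(corpus, budget)
        print(budget, segment("lowest", rules))
```

Running the script prints the segmentation of the same unseen word under a small and a large merge budget. Because the smaller rule set is a prefix of the larger one, the larger budget can never produce more tokens for a given word.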
Provider:
https://b2share.eudat.eu
Created:
2025-06-11