Data Augmentation for Low-resource Neural Machine Translation Based on Semantic Related Word Replacement and Grammatical Error Correction
Source: Science Data Bank. Updated: 2021-12-10. Indexed: 2026-04-23.
Download link:
https://www.scidb.cn/en/detail?dataSetId=d83ceeecfbe549b9a0fe8f4202a3b7fd
Resource description:
This paper proposes a data augmentation method for low-resource neural machine translation based on semantically related word replacement and grammatical error correction. First, the low-resource language data is augmented by replacing words with semantically related words. Second, the augmented bilingual parallel corpus is grammatically corrected so that it conforms to linguistic syntax and common-sense reasoning. The results show that the proposed method not only ensures the quantity of the training corpus but also improves its quality, achieves effective data augmentation for low-resource languages, and further improves neural machine translation performance for low-resource languages.
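The two-step idea in the description (replace words with semantically related ones, then correct the grammar of the result) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the dataset or the paper: the `RELATED_WORDS` table, the `rate` parameter, and the toy `correct_grammar` rule (which only fixes the English a/an article) are hypothetical stand-ins for whatever related-word source and error-correction model the authors actually used. Keeping the other side of a parallel sentence pair consistent with the replacement is not shown.

```python
import random

# Hypothetical table of semantically related words (illustrative only).
# A real system might derive such candidates from word embeddings.
RELATED_WORDS = {
    "cat": ["dog", "kitten"],
    "big": ["large", "huge"],
}


def replace_related(tokens, related, rng, rate=0.3):
    """Swap each token that has related words with probability `rate`."""
    out = []
    for tok in tokens:
        candidates = related.get(tok)
        if candidates and rng.random() < rate:
            out.append(rng.choice(candidates))
        else:
            out.append(tok)
    return out


def correct_grammar(tokens):
    """Toy stand-in for a grammatical error correction step (assumption).

    It only repairs the a/an article so that it agrees with the next word.
    """
    fixed = list(tokens)
    for i in range(len(fixed) - 1):
        if fixed[i] in ("a", "an"):
            next_starts_with_vowel = fixed[i + 1][0].lower() in "aeiou"
            fixed[i] = "an" if next_starts_with_vowel else "a"
    return fixed


def augment(sentence, related=RELATED_WORDS, seed=0, rate=0.3):
    """Step 1: related-word replacement. Step 2: grammar correction."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    tokens = sentence.split()
    replaced = replace_related(tokens, related, rng, rate)
    return " ".join(correct_grammar(replaced))
```

For example, `augment("a cat", related={"cat": ["owl"]}, rate=1.0)` returns `"an owl"`: the replacement step alone would produce the ungrammatical "a owl", and the correction step repairs the article. This is the kind of error the second stage is meant to catch, here reduced to a single hand-written rule.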
Provided by:
Xiaobing Zhao; School of Information Engineering, Minzu University of China; National Language Resource Monitoring & Research Center of Minority Languages, Minzu University of China
Created:
2021-12-08



