Machine Translation with Large Language Models Based on Correction Mechanism of Error-Prone Words in Translations

China Scientific Data (中国科学数据), updated 2026-02-09, indexed 2026-04-25

Download link:

https://www.sciengine.com/AA/doi/10.19678/j.issn.1000-3428.0069767


Resource description:

Large Language Models (LLMs) can generate translations directly from a translation prompt and achieve reasonable performance on machine translation tasks. However, owing to limitations in the quality of pre-training corpora and in the distribution of languages they cover, LLM-generated translations still exhibit quality problems such as mistranslations, omissions, hallucinations, and off-target translations. To mitigate these problems, this paper proposes an LLM-based machine translation method built on a correction mechanism for error-prone words in translations. First, error-prone words for a given language direction are defined using the model translations and reference translations of the original training set. Next, a correction dataset is constructed from the error-prone words in the model translations and their corresponding corrections. A correction model is then obtained by fine-tuning a small pre-trained model on this dataset. During inference, the correction model rectifies error-prone words in the translation generated by the LLM, after which the LLM performs autoregressive decoding to produce a higher-quality translation. Experiments were conducted with the Llama2-7B model in six language directions (Chinese↔English, German↔English, and Russian↔English) on the WMT2022 test set. Compared with translations produced without correction, the average Crosslingual Optimized Metric for Evaluation of Translation (COMET) and SacreBLEU (Bilingual Evaluation Understudy) scores improved by 0.0187 and 1.26 points, respectively, for the X-English directions, and by 0.0879 and 7.67 points, respectively, for the English-X directions. These experiments substantiate the effectiveness of the error-prone word correction mechanism in improving the quality of LLM translation.
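
The first two steps of the method (defining error-prone words, then building a correction dataset) can be illustrated with a minimal sketch. The abstract does not state how error-prone words are actually defined, so the criterion below is an assumption made purely for illustration: a word counts as error-prone if the model emits it in at least `min_count` sentences whose reference lacks it, and that happens in at least a `min_error_rate` share of the sentences where the model emits it. The function names, parameters, whitespace tokenization, and use of the full reference as the correction target are all hypothetical simplifications, not the paper's procedure.

```python
from collections import Counter


def find_error_prone_words(model_translations, reference_translations,
                           min_count=2, min_error_rate=0.5):
    """Hypothetical heuristic (not the paper's definition): flag words the
    model emits that are frequently absent from the matching reference."""
    emitted = Counter()    # sentences in which the model produced the word
    unmatched = Counter()  # of those, sentences whose reference lacks the word
    for hyp, ref in zip(model_translations, reference_translations):
        hyp_words = set(hyp.lower().split())  # naive whitespace tokenization
        ref_words = set(ref.lower().split())
        for word in hyp_words:
            emitted[word] += 1
            if word not in ref_words:
                unmatched[word] += 1
    # Map each flagged word to its observed error rate.
    return {
        word: unmatched[word] / emitted[word]
        for word in emitted
        if unmatched[word] >= min_count
        and unmatched[word] / emitted[word] >= min_error_rate
    }


def build_correction_pairs(model_translations, reference_translations,
                           error_prone):
    """Assumed dataset format: model translation, flagged words, reference."""
    pairs = []
    for hyp, ref in zip(model_translations, reference_translations):
        flagged = sorted(w for w in set(hyp.lower().split()) if w in error_prone)
        if flagged:
            pairs.append({"input": hyp, "flagged": flagged, "target": ref})
    return pairs


# Toy data, invented for this example only.
hyps = ["the cat sat on the carpet", "a dog lay on the carpet", "the bird sang"]
refs = ["the cat sat on the mat", "a dog lay on the mat", "the bird sang"]

error_prone = find_error_prone_words(hyps, refs)
pairs = build_correction_pairs(hyps, refs, error_prone)
print(error_prone)
print(len(pairs))
```

In a setup like this, the resulting pairs would serve as fine-tuning data for the small correction model; the inference-time interaction between the correction model and the LLM's decoding is not specified in enough detail in the abstract to sketch here.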

Created:

2026-02-09
