
Statistical Significance Analysis of Metrics.

Source: Figshare. Updated: 2025-06-25. Indexed: 2026-04-28.
Download link:
https://figshare.com/articles/dataset/Statistical_Significance_Analysis_of_Metrics_/29407647
Resource description:
Demand for English translation keeps growing as technology advances, and computer systems that can comprehend and interpret English matter more than ever. To improve the quality and accuracy of corpus-based English translation models, issues such as ambiguity and improper word choice must be addressed. Hence, an edge computing-based translation model (FSRL-P2O) is proposed. It aims to improve translation accuracy by using large bilingual corpora, accounting for Fuzzy Semantic (FS) properties, and maximizing the quality of the translation output through optimal control techniques that incorporate Reinforcement Learning (RL) and Proximal Policy Optimisation (PPO). The corpus data is first gathered, then preprocessed, and features are extracted. The preprocessed sentences are passed to the fuzzy semantic similarity phase, which reduces uncertainty by using the Jaccard similarity coefficient to measure the semantic resemblance between two linguistic elements involved in a translation, such as words, phrases, or sentences. Training of the fuzzy semantic resemblance component estimates the degree of overlap or similarity between two sentences, for example by calculating the percentage of overlapping characters and the length of the longest matching sequence of characters. The proposed RL and PPO components can address specific sources of uncertainty in machine translation assessment, such as out-of-domain data and low-quality references. Beyond simple word-level comparison, the approach permits a richer grasp of the semantic link. RL and PPO are implemented as optimal control techniques to optimize the translation procedure and enhance the quality and precision of the generated translations.
RL and PPO aim to improve the machine translation system's translation policy according to a predetermined reward signal or quality parameter. The system's effectiveness is evaluated with metrics including accuracy, fuzzy semantic similarity, Bilingual Evaluation Understudy (BLEU), and the National Institute of Standards and Technology (NIST) score. The proposed system is reported to achieve higher translation quality and accuracy and to produce higher semantic similarity.
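
The similarity measures named in the description (the Jaccard coefficient, character overlap, and the longest matching character sequence) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, splitting on whitespace is an assumed tokenization, and the standard-library `difflib.SequenceMatcher` is used here only as a convenient stand-in for the character-level comparison.

```python
from difflib import SequenceMatcher


def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard coefficient over lowercase word sets: |A & B| / |A | B|."""
    # Assumption: whitespace tokenization; the source does not specify one.
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sentences count as identical
    return len(set_a & set_b) / len(set_a | set_b)


def longest_match_length(a: str, b: str) -> int:
    """Length of the longest contiguous run of characters shared by a and b."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def character_overlap(a: str, b: str) -> float:
    """Share of characters in matching blocks: 2 * M / (len(a) + len(b))."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


if __name__ == "__main__":
    reference = "the cat sat"
    candidate = "the cat ran"
    print(jaccard_similarity(reference, candidate))    # 2 shared words out of 4 distinct
    print(longest_match_length(reference, candidate))  # the shared prefix "the cat "
    print(character_overlap(reference, candidate))
```

For the two example sentences, the word sets share "the" and "cat" out of four distinct words, so the Jaccard coefficient is 0.5, and the longest shared character run is the eight-character prefix "the cat " (including the trailing space). A score of this kind could serve as one component of the reward signal mentioned above, though the description does not say how the reward is actually constructed.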

Created:
2025-06-25