TY - EJOU
AU - Wang, Liqing
AU - Xiao, Yiheng
TI - Improving Low-Resource Machine Translation Using Reinforcement Learning from Human Feedback
T2 - Intelligent Automation & Soft Computing
PY - 2024
VL - 39
IS - 4
SN - 2326-005X
AB - Neural Machine Translation is one of the key research directions in Natural Language Processing. However, limited by the scale and quality of parallel corpora, the translation quality of low-resource Neural Machine Translation has always been unsatisfactory. When Reinforcement Learning from Human Feedback (RLHF) is applied to low-resource machine translation, it commonly suffers from substandard preference data quality and the high cost of collecting manual feedback. Therefore, a more cost-effective method for obtaining feedback data is proposed: first, the quality of preference data is optimized through prompt engineering of a Large Language Model (LLM); then human feedback is combined to complete the evaluation. In this way, the reward model can acquire more semantic information and human preferences during the training phase, thereby improving feedback efficiency and result quality. Experimental results demonstrate that, compared with the traditional RLHF method, our method is effective on multiple datasets and achieves a notable improvement of 1.07 BLEU. It is also more favorably received in assessments conducted by human evaluators and GPT-4o.
KW - Low-resource neural machine translation
KW - RLHF
KW - prompt engineering
KW - LLM
DO - 10.32604/iasc.2024.052971
ER - 