
Open Access

ARTICLE

Domain-Aware Transformer for Multi-Domain Neural Machine Translation

Shuangqing Song1, Yuan Chen2, Xuguang Hu1, Juwei Zhang1,3,*
1 College of Information Engineering and Artificial Intelligence, Henan University of Science and Technology, Luoyang, 471023, China
2 School of Foreign Languages, Zhengzhou University of Aeronautics, Zhengzhou, 450046, China
3 School of Electronics and Information, Zhengzhou University of Aeronautics, Zhengzhou, 450046, China
* Corresponding Author: Juwei Zhang
(This article belongs to the Special Issue: Enhancing AI Applications through NLP and LLM Integration)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.072392

Received 26 August 2025; Accepted 04 November 2025; Published online 28 November 2025

Abstract

In multi-domain neural machine translation, the disparity in data distribution across domains makes it difficult both to distinguish domain-specific features and to share parameters across domains. This paper proposes a Transformer-based, domain-aware mixture-of-experts model. To improve domain feature differentiation, a mixture of experts (MoE) is introduced into the attention mechanism, enhancing the model's ability to perceive domain characteristics. To balance domain feature distinction against cross-domain parameter sharing, we propose a domain-aware mixture of experts (DMoE): a domain-aware gating mechanism within the MoE module activates all domain experts simultaneously, so that domain-specific features are preserved while parameters are shared across domains. A loss balancing function is further added to dynamically adjust the impact of the loss function on the expert distribution, fine-tuning the expert activation distribution to achieve balance across domains. Experimental results on multiple Chinese-to-English and English-to-French datasets show that the proposed method significantly outperforms baseline models on BLEU, chrF, and COMET, validating its effectiveness for multi-domain neural machine translation. Further analysis of the probability distribution of expert activations shows that the method performs well in both domain differentiation and cross-domain parameter sharing.
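To make the DMoE idea described above concrete, the following is a minimal sketch of a domain-aware mixture-of-experts layer with soft gating. It is an illustrative reconstruction under stated assumptions, not the authors' implementation: the class name DomainAwareMoE, the learned domain embedding, the feed-forward experts, and the squared-deviation balancing term are all assumptions, and the integration of the MoE into the attention mechanism is not shown. The key point it captures is that the gate weights every expert for every token, so experts can specialise by domain while all parameters remain shared across domains.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DomainAwareMoE(nn.Module):
    """Illustrative domain-aware MoE layer: all experts are softly activated
    via a gate conditioned on a domain embedding (names/dimensions assumed)."""

    def __init__(self, d_model=512, d_ff=2048, num_experts=4, num_domains=4):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        # Domain embedding conditions the gate on the known domain label.
        self.domain_emb = nn.Embedding(num_domains, d_model)
        self.gate = nn.Linear(2 * d_model, num_experts)

    def forward(self, x, domain_id):
        # x: (batch, seq_len, d_model); domain_id: (batch,)
        dom = self.domain_emb(domain_id).unsqueeze(1).expand_as(x)
        gate_logits = self.gate(torch.cat([x, dom], dim=-1))
        weights = F.softmax(gate_logits, dim=-1)              # (batch, seq, E): all experts active

        expert_outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, seq, d_model, E)
        y = (expert_outs * weights.unsqueeze(2)).sum(dim=-1)

        # Stand-in for the paper's loss balancing function (exact form not
        # given here): penalise deviation of the average gate from uniform.
        mean_gate = weights.mean(dim=(0, 1))
        balance_loss = ((mean_gate - 1.0 / len(self.experts)) ** 2).sum()
        return y, balance_loss

# Example usage (shapes only):
# layer = DomainAwareMoE()
# y, aux = layer(torch.randn(2, 10, 512), torch.tensor([0, 2]))

In this sketch the auxiliary balance_loss would be added to the translation loss with a small weight, mirroring the role of the loss balancing function in adjusting the expert activation distribution.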

Keywords

Natural language processing; multi-domain neural machine translation; mixture of experts; domain-aware gating mechanism