
Open Access

ARTICLE

Research on UAV–MEC Cooperative Scheduling Algorithms Based on Multi-Agent Deep Reinforcement Learning

Yonghua Huo1,2, Ying Liu1,*, Anni Jiang3, Yang Yang3
1 School of Electronic and Information Engineering, Beijing Jiaotong University, Beijing, 100044, China
2 The 54th Research Institute of CETC, Shijiazhuang, 050081, China
3 State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, 100876, China
* Corresponding Author: Ying Liu. Email: email

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.072681

Received 01 September 2025; Accepted 06 November 2025; Published online 05 December 2025

Abstract

With the advent of sixth-generation mobile communications (6G), space–air–ground integrated networks have become mainstream. This paper focuses on collaborative scheduling for mobile edge computing (MEC) under a three-tier heterogeneous architecture composed of mobile devices, unmanned aerial vehicles (UAVs), and macro base stations (BSs). This scenario typically faces fast channel fading, dynamic computational loads, and energy constraints, and classical queuing-theoretic or convex-optimization approaches struggle to yield robust solutions in such highly dynamic settings. To address this, we formulate the air–ground integrated MEC system as a multi-agent Markov decision process (MDP), unify link selection, bandwidth/power allocation, and task offloading into a single continuous action space, and propose a joint scheduling strategy based on an improved MATD3 algorithm. The improvements include Alternating Layer Normalization (ALN) in the actor to suppress gradient variance, Residual Orthogonalization (RO) in the critic to reduce the correlation between the twin Q-value estimates, and a dynamic-temperature reward that enables adaptive trade-offs during training. On a multi-user, dual-link simulation platform, we conduct ablation studies and baseline comparisons. The results show that the proposed method converges faster and more stably, and achieves more robust performance on key metrics than MADDPG, TD3, and DSAC.
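The abstract does not specify the ALN and RO modifications in detail, but the MATD3 backbone it extends inherits two well-known mechanisms from TD3: the clipped double-Q target (taking the minimum of the twin critics' estimates to curb overestimation) and target-policy smoothing (adding clipped noise to the target actor's action). The sketch below illustrates only these generic components under illustrative function names; it is not the authors' implementation.

```python
import numpy as np


def matd3_target(r, done, q1_next, q2_next, gamma=0.99):
    """Clipped double-Q target used by TD3-style critics:
    y = r + gamma * (1 - done) * min(Q1', Q2').
    Taking the minimum of the twin target critics reduces the
    overestimation bias that a single critic would accumulate."""
    return r + gamma * (1.0 - done) * np.minimum(q1_next, q2_next)


def smoothed_target_action(mu_next, noise_std=0.2, noise_clip=0.5,
                           act_low=-1.0, act_high=1.0, rng=None):
    """Target-policy smoothing: perturb the target actor's action
    with clipped Gaussian noise before evaluating the target critics,
    so the Q-targets are averaged over a small neighborhood of actions
    (e.g. nearby bandwidth/power allocations) rather than a single point."""
    rng = np.random.default_rng() if rng is None else rng
    eps = np.clip(rng.normal(0.0, noise_std, size=np.shape(mu_next)),
                  -noise_clip, noise_clip)
    return np.clip(mu_next + eps, act_low, act_high)
```

In a centralized-training, decentralized-execution setup such as MATD3, each agent's critics would receive the joint observations and joint (smoothed) actions of all agents when computing this target, while each actor conditions only on its own local observation.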

Keywords

UAV-MEC networks; multi-agent deep reinforcement learning; MATD3; task offloading