Open Access

ARTICLE

Segment-Conditioned Latent-Intent Framework for Cooperative Multi-UAV Search

Gang Hou1,#, Aifeng Liu1,#, Tao Zhao1, Wenyuan Wei2, Bo Li1, Jiancheng Liu3,*, Siwen Wei4,5,*
1 Northwest Institute of Mechanical and Electrical Engineering, Xianyang, 712099, China
2 Department of Railway Transportation Operations Management, Baotou Railway Vocational & Technical College, Baotou, 014060, China
3 School of Mechanical Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China
4 Shaanxi Key Laboratory of Antenna and Control Technology, Xi’an, 710076, China
5 39th Research Institute of China Electronics Technology Group Corporation, Xi’an, 710076, China
* Corresponding Authors: Jiancheng Liu; Siwen Wei
# These authors contributed equally to this work
(This article belongs to the Special Issue: Cooperation and Autonomy in Multi-Agent Systems: Models, Algorithms, and Applications)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.073202

Received 12 September 2025; Accepted 24 December 2025; Published online 23 January 2026

Abstract

Cooperative multi-UAV search requires jointly optimizing wide-area coverage, rapid target discovery, and endurance under sensing and motion constraints. Resolving this coupling enables scalable coordination with high data efficiency and mission reliability. We formulate this problem as a discounted Markov decision process on an occupancy grid with a cellwise Bayesian belief update, yielding a Markov state that couples agent poses with a probabilistic target field. On this belief–MDP we introduce a segment-conditioned latent-intent framework, in which a discrete intent head selects a latent skill every K steps and an intra-segment GRU policy generates per-step control conditioned on the fixed intent; both components are trained end-to-end with proximal updates under a centralized critic. On the 50×50 grid, coverage and discovery convergence times are reduced by up to 48% and 40% relative to a flat actor-critic benchmark, and the aggregated convergence metric improves by about 12% compared with a state-of-the-art hierarchical method. Qualitative analyses further reveal stable spatial sectorization, low path overlap, and fuel-aware patrolling, indicating that segment-conditioned latent intents provide an effective and scalable mechanism for coordinated multi-UAV search.
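The two mechanisms named in the abstract, the cellwise Bayesian belief update and the intent that is held fixed for K steps, can be illustrated with a minimal sketch. It is not the authors' implementation: the binary sensor model (`p_detect`, `p_false`), the function names, and the `select_intent` callback are assumptions introduced here for illustration, and the learned intent head and GRU policy are replaced by a plain callable.

```python
def update_belief(belief, observed, detections, p_detect=0.9, p_false=0.1):
    """Cellwise Bayes update of target-presence probabilities.

    belief:     2-D list of prior probabilities P(target in cell).
    observed:   set of (row, col) cells inside a sensor footprint this step.
    detections: subset of `observed` where the sensor reported a target.
    p_detect, p_false: ASSUMED sensor model, P(hit | target) and
                       P(hit | no target); not taken from the paper.
    """
    posterior = [row[:] for row in belief]  # unobserved cells keep their prior
    for (r, c) in observed:
        prior = belief[r][c]
        if (r, c) in detections:
            like_target, like_empty = p_detect, p_false
        else:
            like_target, like_empty = 1.0 - p_detect, 1.0 - p_false
        # Bayes rule applied independently to each observed cell.
        evidence = like_target * prior + like_empty * (1.0 - prior)
        posterior[r][c] = like_target * prior / evidence
    return posterior


def intent_schedule(num_steps, segment_length, select_intent):
    """Reselect a discrete intent every `segment_length` steps (the K of the
    abstract) and hold it fixed in between.

    select_intent: hypothetical stand-in for the learned intent head; it is
                   called with the step index at each segment boundary.
    """
    intents = []
    current = None
    for t in range(num_steps):
        if t % segment_length == 0:
            current = select_intent(t)
        intents.append(current)
    return intents
```

With a uniform prior of 0.5, a detection in an observed cell raises its probability and a miss lowers it, while cells outside every footprint are unchanged. The schedule only shows the timing of intent selection; in the framework described above, the per-step policy would be conditioned on the held intent.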

Keywords

Multi-agent reinforcement learning; Markov decision process; multi-UAV cooperative search