IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments

doi:10.32604/cmc.2026.080008

Open Access

ARTICLE

Xuan-Thuc Nguyen¹, Le-Minh Nguyen¹, Ngoc-Quynh Nguyen¹, Nhu-Nghia Bui², Dinh-Quy Vu³,*, Thai-Viet Dang²,*

1 Viettel High Technology Industries Corporation–Viettel Group, Hanoi, Vietnam
2 Department of Mechatronics, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
3 Department of Vehicle and Energy Conversion Engineering, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam

* Corresponding Authors: Dinh-Quy Vu; Thai-Viet Dang

(This article belongs to the Special Issue: Aerial Innovation Spectrum: All-Domain Research in UAV Communication, Navigation, and Autonomy)

Computers, Materials & Continua 2026, 88(2), 41 https://doi.org/10.32604/cmc.2026.080008

Received 01 February 2026; Accepted 14 April 2026; Issue published 15 June 2026

Abstract

The development of unmanned aerial vehicles (UAVs) has become a key focus in aerial robotics, fueling the need for navigation systems that can perform complex and delicate tasks with speed and precision. However, end-to-end path tracking often struggles with learning efficiency, generalization, and varying environmental conditions. In this paper, we propose IRL-TP, a novel framework for learning-based UAV trajectory planning built on deep inverse reinforcement learning (IRL). An RL-based path planner first needs a reward function that captures flight safety, collision avoidance, trajectory smoothness, and navigation efficiency in constrained environments filled with numerous obstacles. To this end, a deep reward network is constructed to parametrize the unknown reward function, implicitly modeling the satisfaction of these multiple objectives. A soft actor-critic (SAC) agent then optimizes the continuous control policy under the learned reward, with entropy regularization improving stability and exploration during training. By combining reward function inference with policy learning, the proposed framework enables the UAVs to mimic expert behavior and produce highly generalized navigation strategies in the “potential map”. In experimental environments with dense obstacles, our method achieves a success rate of 97.6% while keeping the instability metric as low as 0.044 throughout the process. Furthermore, the parameters converge in about 340 episodes, fewer than the other methods require. Compared with the basic IRL framework, the proposed model reaches a reward value 1.6 times higher in the first 200 training episodes and 1.3 times higher after the entire training process, with an inference time of 2.6 ms per step.
Compared with state-of-the-art methods, including DQN, PPO, SAC, BC, and GAIL, our approach achieves superior trajectory efficiency, enhanced safety margins, smoother motion, and greater training stability, even in complex 3D environments.
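
The two ingredients named in the abstract, a deep reward network and an entropy-regularized SAC update, can be illustrated with a minimal sketch. This is not the authors' implementation: the network size, the state and action dimensions, the function names (`init_reward_net`, `reward`, `soft_q_target`), and the values of `gamma` and `alpha` are all assumptions chosen for illustration. The sketch only shows how a learned reward r_theta(s, a) could replace a hand-designed reward inside the standard SAC soft Bellman target.

```python
import numpy as np

def init_reward_net(state_dim, action_dim, hidden=32, seed=0):
    """Initialise a small two-layer MLP; every size here is an illustrative assumption."""
    rng = np.random.default_rng(seed)
    in_dim = state_dim + action_dim
    return {
        "W1": rng.normal(0.0, 0.1, size=(in_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, size=(hidden, 1)),
        "b2": np.zeros(1),
    }

def reward(params, states, actions):
    """Learned reward r_theta(s, a) for a batch; returns an array of shape (batch,)."""
    x = np.concatenate([states, actions], axis=-1)
    h = np.tanh(x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).squeeze(-1)

def soft_q_target(r, q1_next, q2_next, logp_next, done, gamma=0.99, alpha=0.2):
    """Standard SAC target: r + gamma * (1 - done) * (min(Q1, Q2) - alpha * log pi)."""
    soft_value = np.minimum(q1_next, q2_next) - alpha * logp_next
    return r + gamma * (1.0 - done) * soft_value

# Hypothetical batch of 4 transitions with a 6-D state and a 3-D action.
params = init_reward_net(state_dim=6, action_dim=3)
states = np.zeros((4, 6))
actions = np.zeros((4, 3))
r = reward(params, states, actions)
y = soft_q_target(
    r,
    q1_next=np.ones(4),
    q2_next=2.0 * np.ones(4),
    logp_next=-np.ones(4),
    done=np.zeros(4),
)
```

In a full IRL loop, the parameters of the reward network would be updated from expert demonstrations while the SAC critic regresses toward `y`; that outer loop is omitted here because the abstract does not specify it.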

Keywords

Interference-constrained environments; inverse reinforcement learning; unmanned aerial vehicles (UAVs); soft actor-critic (SAC); trajectory planning

Cite This Article

APA Style

Nguyen, X., Nguyen, L., Nguyen, N., Bui, N., Vu, D. et al. (2026). IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments. Computers, Materials & Continua, 88(2), 41. https://doi.org/10.32604/cmc.2026.080008

Vancouver Style

Nguyen X, Nguyen L, Nguyen N, Bui N, Vu D, Dang T. IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments. Comput Mater Contin. 2026;88(2):41. https://doi.org/10.32604/cmc.2026.080008

IEEE Style

X. Nguyen, L. Nguyen, N. Nguyen, N. Bui, D. Vu, and T. Dang, “IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments,” Comput. Mater. Contin., vol. 88, no. 2, pp. 41, 2026. https://doi.org/10.32604/cmc.2026.080008

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
