Open Access iconOpen Access

ARTICLE

IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments

Xuan-Thuc Nguyen1, Le-Minh Nguyen1, Ngoc-Quynh Nguyen1, Nhu-Nghia Bui2, Dinh-Quy Vu3,*, Thai-Viet Dang2,*

1 Viettel High Technology Industries Corporation–Viettel Group, Hanoi, Vietnam
2 Department of Mechatronics, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
3 Department of Vehicle and Energy Conversion Engineering, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam

* Corresponding Authors: Dinh-Quy Vu. Email: email; Thai-Viet Dang. Email: email

(This article belongs to the Special Issue: Aerial Innovation Spectrum: All-Domain Research in UAV Communication, Navigation, and Autonomy)

Computers, Materials & Continua 2026, 88(2), 41 https://doi.org/10.32604/cmc.2026.080008

Abstract

The development of unmanned automated vehicles (UAVs) has become a key focus in aerial robotics, fueling the need for navigation systems capable of performing complex and delicate tasks with speed and precision. However, the end-to-end path tracking process often encounters challenges in learning efficiency, generalization, and varying environmental conditions. In this paper, we propose the novel IRL-TP framework for learning-based UAVs’ trajectory planning that employs a deep inverse reinforcement learning (IRL) approach. Firstly, the RL-based path planner must develop a reward function that effectively captures flight safety, collision avoidance, trajectory smoothness, and navigation efficiency within constrained environments filled with numerous obstacles. To achieve optimal results, a deep reward network is constructed to parametrize the unknown reward function, which effectively and implicitly models the satisfaction of multiple objectives. The regularization of entropy through the learned reward function is utilized to optimize the continuous control policy and improve stability and exploration ability during training with a soft actor-critic (SAC) agent. By combining the reward function inference and policy learning processes, the proposed framework empowers the UAVs to mimic expert behavior and create highly generalized navigation strategies in the “potential map”. In experimental environments with a dense obstacle level, our method achieves a success rate of 97.6% while maintaining an instability metric as low as 0.044 throughout the process. Furthermore, the number of episodes needed to converge the parameters was much faster than other methods (~340). The proposed model not only achieves rapid convergence and a reward value 1.6 times higher in the first 200 training episodes and 1.3 times higher after the entire training process, but also demonstrates an impressive inference time of 2.6 ms per step compared to the basic IRL framework. Compared to state-of-the-art methods—including DQN, PPO, SAC, BC, and GAIL—our approach achieves superior trajectory efficiency, enhanced safety margins, smoother motion, and greater training stability, even in complex 3D environments.

Keywords

Interference-constrained environments; inverse reinforcement learning; unmanned automated vehicles (UAV); soft actor-critic (SAC); trajectory planning

Cite This Article

APA Style
Nguyen, X., Nguyen, L., Nguyen, N., Bui, N., Vu, D. et al. (2026). IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments. Computers, Materials & Continua, 88(2), 41. https://doi.org/10.32604/cmc.2026.080008
Vancouver Style
Nguyen X, Nguyen L, Nguyen N, Bui N, Vu D, Dang T. IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments. Comput Mater Contin. 2026;88(2):41. https://doi.org/10.32604/cmc.2026.080008
IEEE Style
X. Nguyen, L. Nguyen, N. Nguyen, N. Bui, D. Vu, and T. Dang, “IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments,” Comput. Mater. Contin., vol. 88, no. 2, pp. 41, 2026. https://doi.org/10.32604/cmc.2026.080008



cc Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • 552

    View

  • 201

    Download

  • 0

    Like

Share Link