IRL-TP: Deep Inverse Reinforcement Learning-Based Trajectory Planning for UAVs in Complex and Interference-Constrained Environments

Xuan-Thuc Nguyen¹, Le-Minh Nguyen¹, Ngoc-Quynh Nguyen¹, Nhu-Nghia Bui², Dinh-Quy Vu³,*, Thai-Viet Dang²,*
1 Viettel High Technology Industries Corporation–Viettel Group, Hanoi, Vietnam
2 Department of Mechatronics, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
3 Department of Vehicle and Energy Conversion Engineering, School of Mechanical Engineering, Hanoi University of Science and Technology, Hanoi, Vietnam
* Corresponding Author: Dinh-Quy Vu. Email: email ; Thai-Viet Dang. Email: email
(This article belongs to the Special Issue: Aerial Innovation Spectrum: All-Domain Research in UAV Communication, Navigation, and Autonomy)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.080008

Received 01 February 2026; Accepted 14 April 2026; Published online 05 May 2026


Abstract

The development of unmanned aerial vehicles (UAVs) has become a key focus in aerial robotics, driving the need for navigation systems that can perform complex and delicate tasks quickly and precisely. However, end-to-end path tracking often struggles with learning efficiency, generalization, and varying environmental conditions. In this paper, we propose IRL-TP, a novel learning-based trajectory planning framework for UAVs that employs deep inverse reinforcement learning (IRL). An RL-based path planner needs a reward function that captures flight safety, collision avoidance, trajectory smoothness, and navigation efficiency in constrained environments with numerous obstacles. To this end, a deep reward network is constructed to parametrize the unknown reward function, implicitly modeling the satisfaction of these multiple objectives. The learned reward is then used to train a soft actor-critic (SAC) agent, whose entropy regularization helps optimize the continuous control policy and improves stability and exploration during training. By combining reward inference and policy learning, the proposed framework enables UAVs to mimic expert behavior and produce highly generalized navigation strategies in the “potential map”. In experimental environments with dense obstacles, our method achieves a success rate of 97.6% while keeping the instability metric as low as 0.044. It also converges in fewer training episodes than the other methods (about 340). Compared with the basic IRL framework, the proposed model attains a reward 1.6 times higher over the first 200 training episodes and 1.3 times higher after the full training process, with an inference time of 2.6 ms per step. Compared with state-of-the-art methods, including DQN, PPO, SAC, BC, and GAIL, our approach achieves superior trajectory efficiency, enhanced safety margins, smoother motion, and greater training stability, even in complex 3D environments.
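The reward-inference idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a linear reward over three hypothetical features stands in for the deep reward network, fixed synthetic samples stand in for SAC rollouts, and every name and number below (`features`, `irl_step`, the sampling ranges, the learning rate) is an assumption chosen for illustration. The update follows the standard maximum-entropy IRL gradient, which raises the reward on expert samples and lowers it on samples from the current policy.

```python
import random


def features(clearance, progress, heading_change):
    # Hypothetical per-step features: obstacle clearance, progress toward
    # the goal, and a smoothness term that penalizes sharp turns.
    return [clearance, progress, -abs(heading_change)]


def reward(theta, f):
    # Linear stand-in for the reward network: r_theta(f) = theta . f
    return sum(t * x for t, x in zip(theta, f))


def mean_features(samples):
    n = len(samples)
    return [sum(f[i] for f in samples) / n for i in range(len(samples[0]))]


def irl_step(theta, expert, policy, lr=0.1, l2=0.01):
    # One gradient-ascent step on the sample-based objective:
    # gradient = E_expert[f] - E_policy[f], with L2 weight decay.
    mu_e = mean_features(expert)
    mu_p = mean_features(policy)
    return [t + lr * ((e - p) - l2 * t) for t, e, p in zip(theta, mu_e, mu_p)]


def make_samples(rng, n, clearance_range, max_turn):
    # Synthetic data (assumption): clearance and turn magnitude are drawn
    # uniformly from the given ranges.
    return [
        features(
            rng.uniform(*clearance_range),
            rng.uniform(0.5, 1.0),
            rng.uniform(-max_turn, max_turn),
        )
        for _ in range(n)
    ]


def train(expert, policy, steps=50):
    theta = [0.0, 0.0, 0.0]
    for _ in range(steps):
        theta = irl_step(theta, expert, policy)
    return theta


if __name__ == "__main__":
    rng = random.Random(0)
    expert = make_samples(rng, 200, (2.0, 3.0), 0.1)  # safe, smooth steps
    policy = make_samples(rng, 200, (0.0, 2.0), 1.0)  # untrained behavior
    print("learned reward weights:", train(expert, policy))
```

In a full pipeline of the kind the abstract outlines, the policy samples would be refreshed from the SAC agent after each round of policy optimization instead of staying fixed as they do here.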

Keywords

Interference-constrained environments; inverse reinforcement learning; unmanned aerial vehicles (UAVs); soft actor-critic (SAC); trajectory planning




