Open Access

ARTICLE

Dynamic Integration of Q-Learning and A-APF for Efficient Path Planning in Complex Underground Mining Environments

Chang Su, Liangliang Zhao*, Dongbing Xiang
School of Mechanical and Electrical Engineering, Anhui University of Science and Technology, Huainan, 232000, China
* Corresponding Author: Liangliang Zhao.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.071319

Received 05 August 2025; Accepted 11 September 2025; Published online 20 October 2025

Abstract

To address low learning efficiency and inadequate path safety in spraying robot navigation within complex, obstacle-rich environments, where dense, dynamic, and unpredictable obstacles challenge conventional methods, this paper proposes a hybrid algorithm that integrates Q-learning with an improved A*-Artificial Potential Field (A-APF) method. Centered on the Q-learning framework, the algorithm leverages safety-oriented guidance generated by A-APF and employs a dynamic coordination mechanism that adaptively balances exploration and exploitation. The proposed system comprises four core modules: (1) an environment modeling module that constructs grid-based obstacle maps; (2) an A-APF module that combines heuristic search from the A* algorithm with repulsive force strategies from APF to generate guidance; (3) a Q-learning module that learns optimal state-action values (Q-values) through interaction between the spraying robot and its environment, using a reward function that emphasizes path optimality and safety; and (4) a dynamic optimization module that ensures adaptive cooperation between Q-learning and A-APF through exploration rate control and environment-aware constraints. Simulation results demonstrate that the proposed method significantly enhances path safety in complex underground mining environments. Compared with the traditional Q-learning algorithm, the proposed method shortens training time by 42.95% and reduces training failures from 78 to 3. Compared with the static fusion algorithm, it further reduces training time by 10.78% and training failures by 50%, thereby improving overall training efficiency.
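
The four modules listed in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 6x6 grid, the reward values, the repulsion gain `k_rep`, the linear decay of the exploration rate, and the rule that ties the share of guided moves to that rate are all assumptions made for illustration. The guidance function is a simplified one-step stand-in (Manhattan heuristic plus obstacle repulsion), not a full A* search or the paper's improved A-APF.

```python
import random

# Hypothetical 6x6 grid map: 0 = free cell, 1 = obstacle (illustrative layout only).
GRID = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]
START, GOAL = (0, 0), (5, 5)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
OBSTACLES = [(r, c) for r, row in enumerate(GRID) for c, v in enumerate(row) if v]


def is_free(cell):
    r, c = cell
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] == 0


def guidance_cost(cell, k_rep=2.0):
    """Simplified stand-in for A-APF: goal heuristic plus obstacle repulsion."""
    attract = abs(cell[0] - GOAL[0]) + abs(cell[1] - GOAL[1])
    repulse = sum(k_rep / (abs(cell[0] - r) + abs(cell[1] - c)) ** 2
                  for r, c in OBSTACLES)
    return attract + repulse


def guided_action(state):
    """Index of the action that reaches the lowest-cost free neighbour, or None."""
    best, best_cost = None, float("inf")
    for i, (dr, dc) in enumerate(ACTIONS):
        nxt = (state[0] + dr, state[1] + dc)
        if is_free(nxt) and guidance_cost(nxt) < best_cost:
            best, best_cost = i, guidance_cost(nxt)
    return best


def step(state, a):
    """Assumed reward scheme: -10 for a blocked move, +100 at the goal, -1 otherwise."""
    nxt = (state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1])
    if not is_free(nxt):
        return state, -10.0, False  # collision or wall: stay in place
    if nxt == GOAL:
        return nxt, 100.0, True
    return nxt, -1.0, False


def train(episodes=500, alpha=0.1, gamma=0.9, eps_start=0.9, eps_end=0.05, seed=0):
    rng = random.Random(seed)
    q = {}
    for ep in range(episodes):
        # Exploration rate decays linearly over training (assumed schedule).
        eps = eps_start + (eps_end - eps_start) * ep / max(1, episodes - 1)
        state = START
        for _ in range(200):
            qs = q.setdefault(state, [0.0] * 4)
            if rng.random() < eps:
                # Explore: follow guidance with probability eps, else move randomly.
                g = guided_action(state)
                a = g if (g is not None and rng.random() < eps) else rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: qs[i])  # exploit learned Q-values
            nxt, reward, done = step(state, a)
            nq = q.setdefault(nxt, [0.0] * 4)
            # Standard Q-learning update; no bootstrap from the terminal state.
            qs[a] += alpha * (reward + gamma * max(nq) * (not done) - qs[a])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Roll out the greedy policy from START using the learned Q-table."""
    state, path = START, [START]
    for _ in range(max_steps):
        a = max(range(4), key=lambda i: q.get(state, [0.0] * 4)[i])
        state, _reward, done = step(state, a)
        path.append(state)
        if done:
            break
    return path
```

Calling `greedy_path(train())` returns the sequence of cells visited by the learned greedy policy. In this sketch the guidance only biases exploration early in training, so the Q-table remains the final decision maker, which mirrors the abstract's description of a Q-learning-centered design in spirit only.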

Keywords

Q-learning; A* algorithm; artificial potential field; path planning; hybrid algorithm