Open Access

ARTICLE

Research on Adaptive Reward Optimization Method for Robot Navigation in Complex Dynamic Environment

Jie He, Dongmei Zhao, Tao Liu*, Qingfeng Zou, Jian’an Xie

School of Computer Science and Technology, Southwest University of Science and Technology, Mianyang, 621010, China

* Corresponding Author: Tao Liu.

Computers, Materials & Continua 2025, 84(2), 2733-2749. https://doi.org/10.32604/cmc.2025.065205

Abstract

Robot navigation in complex crowd service scenarios, such as medical logistics and commercial guidance, requires a dynamic balance between safety and efficiency, yet traditional fixed reward mechanisms lack environmental adaptability and struggle to cope with variations in crowd density and pedestrian motion patterns. This paper proposes a navigation method that integrates spatiotemporal risk field modeling with adaptive reward optimization, aiming to improve the robot’s decision-making ability in diverse crowd scenarios through dynamic risk assessment and nonlinear weight adjustment. We construct a spatiotemporal risk field model based on a Gaussian kernel function, combining crowd density, relative distance, and motion speed to quantify environmental complexity and realize dynamic, crowd-density-sensitive risk assessment. We apply an exponential decay function to the reward design to address the linear conflict problem of fixed weights in multi-objective optimization. Weight allocation between safety constraints and navigation efficiency is adjusted adaptively from real-time risk values, prioritizing safety in highly dense areas and navigation efficiency in sparse areas. Experimental results show that our method improves the navigation success rate by 9.0% over state-of-the-art models in high-density scenarios, with a 10.7% reduction in intrusion time ratio. Simulation comparisons validate the risk field model’s ability to capture risk-superposition effects in dense scenarios and the exponential decay mechanism’s suppression of near-field dangerous behaviors. Our parametric optimization paradigm establishes an explicit mapping between navigation objectives and risk parameters through rigorous mathematical formalization, providing an interpretable approach for the safe deployment of service robots in dynamic environments.
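The two mechanisms described above — a Gaussian-kernel risk field summed over nearby pedestrians, and exponential-decay weighting between safety and efficiency rewards — can be illustrated with a minimal sketch. The function names, the `(1 + speed)` scaling, and the parameters `sigma` and `k` are illustrative assumptions, not the paper's exact formulation:

```python
import math

def risk_field(robot_pos, pedestrians, sigma=1.0):
    """Spatiotemporal risk at the robot's position.

    Each pedestrian (x, y, speed) contributes a Gaussian kernel term
    scaled by its speed; summing the terms captures the
    risk-superposition effect in dense crowds. Hypothetical form,
    for illustration only.
    """
    rx, ry = robot_pos
    total = 0.0
    for (px, py, speed) in pedestrians:
        d2 = (px - rx) ** 2 + (py - ry) ** 2
        total += (1.0 + speed) * math.exp(-d2 / (2.0 * sigma ** 2))
    return total

def adaptive_weights(risk, k=0.5):
    """Exponential-decay trade-off between the two objectives.

    The efficiency weight decays exponentially as risk rises, so the
    remaining weight shifts toward safety in dense areas and back to
    efficiency in sparse ones.
    """
    w_eff = math.exp(-k * risk)
    w_safe = 1.0 - w_eff
    return w_safe, w_eff

def shaped_reward(r_safety, r_efficiency, risk):
    """Combine the safety and efficiency reward terms with
    risk-adaptive weights."""
    w_safe, w_eff = adaptive_weights(risk)
    return w_safe * r_safety + w_eff * r_efficiency
```

With zero ambient risk the weights collapse to pure efficiency, and as risk grows the safety term dominates — the qualitative behavior the abstract describes.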

Keywords

Machine learning; reinforcement learning; robots; autonomous navigation; reward shaping

Cite This Article

APA Style
He, J., Zhao, D., Liu, T., Zou, Q., & Xie, J. (2025). Research on Adaptive Reward Optimization Method for Robot Navigation in Complex Dynamic Environment. Computers, Materials & Continua, 84(2), 2733–2749. https://doi.org/10.32604/cmc.2025.065205
Vancouver Style
He J, Zhao D, Liu T, Zou Q, Xie J. Research on Adaptive Reward Optimization Method for Robot Navigation in Complex Dynamic Environment. Comput Mater Contin. 2025;84(2):2733–2749. https://doi.org/10.32604/cmc.2025.065205
IEEE Style
J. He, D. Zhao, T. Liu, Q. Zou, and J. Xie, “Research on Adaptive Reward Optimization Method for Robot Navigation in Complex Dynamic Environment,” Comput. Mater. Contin., vol. 84, no. 2, pp. 2733–2749, 2025. https://doi.org/10.32604/cmc.2025.065205



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.