Actor–Critic Trajectory Controller with Optimal Design for Nonlinear Robotic Systems
Nien-Tsu Hu1,*, Hsiang-Tung Kao1, Chin-Sheng Chen1, Shih-Hao Chang2
1 Graduate Institute of Automation Technology, National Taipei University of Technology, Taipei, 10608, Taiwan
2 Computer Science and Information Engineering, National Taipei University of Technology, Taipei, 10608, Taiwan
* Corresponding Author: Nien-Tsu Hu. Email:
Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.074993
Received 22 October 2025; Accepted 18 December 2025; Published online 04 January 2026
Abstract
Trajectory tracking for nonlinear robotic systems remains a fundamental yet challenging problem in control engineering, particularly when both precision and efficiency must be ensured. Conventional control methods are often effective for stabilization but do not directly optimize long-term performance. To address this limitation, this study develops an integrated framework that combines optimal control principles with reinforcement learning for a single-link robotic manipulator. The proposed scheme adopts an actor–critic structure, in which the critic network approximates the value function associated with the Hamilton–Jacobi–Bellman equation and the actor network generates near-optimal control signals in real time. This dual adaptation enables the controller to refine its policy online without explicit knowledge of the system dynamics. Stability of the closed-loop system is analyzed through Lyapunov theory, ensuring boundedness of the tracking error. Numerical simulations on the single-link manipulator demonstrate accurate trajectory following with low control effort, and show that the actor–critic learning mechanism accelerates convergence of the control policy compared with conventional optimization-based strategies. The proposed controller is further validated in a physics-based virtual Gazebo environment, demonstrating stable adaptation and real-time feasibility. This work highlights the potential of integrating reinforcement learning with optimal control for robotic manipulators and provides a foundation for future extensions to more complex multi-degree-of-freedom systems.
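The actor–critic adaptation described in the abstract can be illustrated with a minimal sketch. The dynamics, features, gains, and learning rates below are illustrative assumptions for a generic single-link manipulator, not the paper's actual design: the critic fits a quadratic approximation of the value function from temporal-difference errors, while the actor adjusts a state-feedback law to descend the approximate Hamiltonian.

```python
import numpy as np

# Assumed single-link manipulator: J*qdd = -mgl*sin(q) - b*qd + u
# (parameters are illustrative, not taken from the paper)
J, mgl, b, dt = 1.0, 0.5, 0.1, 0.01

def step(x, u):
    q, qd = x
    qdd = (-mgl * np.sin(q) - b * qd + u) / J
    return np.array([q + dt * qd, qd + dt * qdd])

def features(e):
    # Quadratic basis for the critic: V(e) ~ Wc . phi(e)
    return np.array([e[0] ** 2, e[0] * e[1], e[1] ** 2])

Wc = np.zeros(3)              # critic weights
Wa = np.array([-5.0, -1.0])   # actor weights: u = Wa . e (PD-like feedback)
alpha_c, alpha_a, gamma = 0.1, 0.001, 0.99

x = np.array([0.0, 0.0])
for k in range(2000):
    t = k * dt
    e = np.array([x[0] - np.sin(t), x[1] - np.cos(t)])   # tracking error
    u = float(Wa @ e)
    cost = e @ e + 0.01 * u ** 2                          # stage cost r(e, u)
    x_next = step(x, u)
    e_next = np.array([x_next[0] - np.sin(t + dt), x_next[1] - np.cos(t + dt)])

    # Critic update: temporal-difference error on the approximate value function
    delta = cost + gamma * (Wc @ features(e_next)) - Wc @ features(e)
    Wc += alpha_c * delta * features(e)

    # Actor update: gradient of the approximate Hamiltonian w.r.t. the policy
    dV_de2 = Wc[1] * e[0] + 2.0 * Wc[2] * e[1]            # dV/d(velocity error)
    Wa -= alpha_a * (0.02 * u + dV_de2 * dt / J) * e
    x = x_next
```

Both networks adapt simultaneously from measured data only, which is the sense in which the controller refines its policy online "without explicit system knowledge"; the plant model above is used only to generate the simulated trajectory.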
Keywords
Reinforcement learning; optimal control; actor–critic algorithm; trajectory tracking; nonlinear systems; robotic manipulator