• CN: 11-2187/TH
  • ISSN: 0577-6686

Journal of Mechanical Engineering ›› 2022, Vol. 58 ›› Issue (11): 72-87. doi: 10.3901/JME.2022.11.072


Smoothed-shortcut Q-Learning Algorithm for Optimal Robot Agent Path Planning

DUAN Shuyong1, ZHANG Linxin1, HAN Xu1, LIU Guirong2   

  1. State Key Laboratory of Reliability and Intelligence of Electrical Equipment, Hebei University of Technology, Tianjin 300401;
    2. Aeronautical Engineering and Mechanical Engineering, University of Cincinnati, Cincinnati 45221, USA
  • Received: 2021-07-22  Revised: 2022-01-24  Online: 2022-06-05  Published: 2022-08-08

Abstract: High-quality path planning for a mobile robot in operation is key to completing its task safely, efficiently and smoothly. Such path planning must often be carried out in a given environment that is initially unknown to the agent, so an effective reinforcement learning method is required. A smoothed-shortcut Q-learning (SSQL) algorithm is presented that enables the agent to learn and then determine a smoothed, short-cut path to a goal that is initially unknown to it in a given environment. The SSQL algorithm is proposed to solve the practical problem of a mobile robot effectively reaching its goal in an unfamiliar environment along a path that is a smooth, continuous curve of the shortest distance. The SSQL algorithm consists of three major ingredients. First, a virtual rectangular boundary of the environment is constructed from the pre-explored information, and the Q values of the guidance points on this virtual rectangular boundary are increased to improve the learning efficiency of the agent. Second, the path found by the agent at the current time is optimized by finding short-cuts along it, eliminating possible redundant segments and reducing zig-zag segments, thereby minimizing the total distance between the starting and target points. Third, at the turning points on the path, Bezier curves are used to further smooth the path, so as to improve the dynamics of the robot agent's movement. The final path generated by the SSQL algorithm is optimal in terms of fast convergence, smoothness and shortest distance. The SSQL algorithm is tested by comparison with the standard Q-learning algorithm in different environments with various obstacle densities and learning rates. The results show that the SSQL algorithm indeed achieves fast convergence and short, smooth paths with few turning points.
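The second and third ingredients of the abstract (short-cut elimination of redundant zig-zag segments, then Bezier smoothing at turning points) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the grid path is assumed to come from a prior Q-learning run, and the helper names `line_of_sight`, `shortcut` and `bezier_smooth` are illustrative. The sketch uses quadratic Bezier curves anchored at the midpoints of the two segments meeting at each corner, which is one common way to round turning points.

```python
import numpy as np

def line_of_sight(p, q, obstacles, steps=50):
    """Hypothetical helper: sample the straight segment p -> q and check
    that no sample falls inside an obstacle grid cell."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    for t in np.linspace(0.0, 1.0, steps):
        x, y = p + t * (q - p)
        if (int(round(x)), int(round(y))) in obstacles:
            return False
    return True

def shortcut(path, obstacles):
    """Greedy short-cut (jump point) optimization: from each waypoint,
    jump straight to the farthest later waypoint that is still visible,
    dropping the redundant zig-zag waypoints in between."""
    out = [path[0]]
    i = 0
    while i < len(path) - 1:
        j = len(path) - 1
        while j > i + 1 and not line_of_sight(path[i], path[j], obstacles):
            j -= 1
        out.append(path[j])
        i = j
    return out

def bezier_smooth(path, n=10):
    """Round each interior turning point with a quadratic Bezier curve:
    the curve runs between the midpoints of the incoming and outgoing
    segments, with the corner itself as the control point."""
    if len(path) < 3:
        return [tuple(map(float, p)) for p in path]
    pts = [np.asarray(p, float) for p in path]
    out = [tuple(pts[0])]
    for k in range(1, len(pts) - 1):
        p0 = (pts[k - 1] + pts[k]) / 2   # midpoint entering the corner
        p1 = pts[k]                      # control point = the corner
        p2 = (pts[k] + pts[k + 1]) / 2   # midpoint leaving the corner
        for t in np.linspace(0.0, 1.0, n):
            b = (1 - t) ** 2 * p0 + 2 * (1 - t) * t * p1 + t ** 2 * p2
            out.append(tuple(b))
    out.append(tuple(pts[-1]))
    return out
```

For example, a zig-zag grid path `[(0,0), (1,0), (1,1), (2,1), (2,2), (3,2)]` in free space collapses under `shortcut` to the straight segment `[(0,0), (3,2)]`, while `bezier_smooth` replaces each of its right-angle corners with a short curved arc whose endpoints match the original start and goal.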

Key words: mobile robot, Q-Learning, Q value of the guidance, jump point optimization, Bezier curve, smooth path
