• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2023, Vol. 59 ›› Issue (13): 79-88. doi: 10.3901/JME.2023.13.079

• Robotics and Mechanism •


Research and System Implementation of Quadruped Robot Following Strategy Based on Deep Reinforcement Learning

ZHONG Peicheng, LUO Deyuan, PANG Mingjun

  1. School of Mechanical and Electrical Engineering, University of Electronic Science and Technology of China, Chengdu 611731
  • Received: 2022-08-26 Revised: 2023-05-17 Online: 2023-07-05 Published: 2023-08-15
  • Corresponding author: LUO Deyuan, male, born in 1970, PhD, professor, master's supervisor. His main research interest is intelligent robotics. E-mail: luodeyuan@163.com
  • About the author: ZHONG Peicheng, male, born in 1997. His main research interests are robot environment perception and autonomous decision-making. E-mail: zhongpc2019@163.com

Abstract: The target-following strategy is a key component of a quadruped robot's target-following system. To address the many random factors in target motion, the complexity of system decision-making, and the insufficient robustness of real-world deployment, a target-following strategy based on deep reinforcement learning is first proposed. Taking the spatial position of the target relative to the robot as input, the strategy outputs following action commands, allowing the robot to make following decisions for a randomly moving target. The robot is then trained with a deep reinforcement learning algorithm based on the Actor-Critic framework. Noise is added to the observations to obtain a more robust following strategy, and a correction factor is introduced to reduce the deviation in the robot's motion speed between the simulated and real environments. The strategy is first verified on a simulation platform and finally deployed on a quadruped robot for experimental verification. The results show that the system follows the target well and meets the needs of most application scenarios.

Key words: quadruped robot, target following system, deep reinforcement learning, target following strategy
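
The abstract mentions two robustness measures: noise added to the observations and a correction factor for the simulation-to-reality speed deviation. It does not give their exact form, so the following is only a minimal sketch under stated assumptions: the noise is taken to be zero-mean Gaussian, the correction factor is taken to be a single scalar multiplier on the commanded velocity, and the function names and default values (`noisy_observation`, `corrected_command`, `sigma`, `correction_factor`) are hypothetical and not taken from the paper.

```python
import random


def noisy_observation(rel_pos, sigma=0.05, rng=random):
    """Perturb the target's relative position with zero-mean Gaussian noise.

    Assumption: the paper's observation noise is modeled here as independent
    Gaussian noise per coordinate; the real distribution and scale may differ.
    """
    return tuple(p + rng.gauss(0.0, sigma) for p in rel_pos)


def corrected_command(cmd_velocity, correction_factor=0.9):
    """Scale a commanded velocity by a scalar correction factor.

    Assumption: the correction factor is a single multiplier; the paper may
    define it per axis or apply it elsewhere in the pipeline.
    """
    return tuple(correction_factor * v for v in cmd_velocity)


if __name__ == "__main__":
    rng = random.Random(0)                  # fixed seed for a repeatable demo
    obs = noisy_observation((1.5, -0.3), sigma=0.05, rng=rng)
    cmd = corrected_command((0.6, 0.1), correction_factor=0.9)
    print("noisy observation:", obs)
    print("corrected command:", cmd)
```

In a training loop, `noisy_observation` would wrap whatever the simulator reports before it reaches the policy, while `corrected_command` would be applied to the policy's output so that simulated and real speeds are closer to each other.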
