• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2020, Vol. 56 ›› Issue (3): 64-72. doi: 10.3901/JME.2020.03.064

• Robotics and Mechanisms •

Intelligent Posture Control of Humanoid Robot in Variable Environment

SHI Qun, LÜ Lei, XIE Jiajun

  1. School of Mechanical and Electrical Engineering and Automation, Shanghai University, Shanghai 200444
  • Received: 2019-02-15 Revised: 2019-08-10 Online: 2020-02-05 Published: 2020-04-09
  • Corresponding author: LÜ Lei, male, born in 1994, master's student. His main research interests are motion control algorithms for industrial robots and motion control of humanoid robots. E-mail: 940509@shu.edu.cn
  • About the authors: SHI Qun, male, born in 1972, PhD, associate professor. His main research interests are robot motion control algorithms, robot operating systems, CNC machine tool operating systems, and intelligent control. E-mail: shiqun@staff.shu.edu.cn; XIE Jiajun, male, born in 1992, master's student. His main research interest is robot control systems. E-mail: 724217648@qq.com

Abstract: To address the poor motion control accuracy and motion stability of humanoid robots in variable, uncertain, and unstructured terrain, an intelligent posture control algorithm is proposed. Deep reinforcement learning over continuous action and continuous state spaces is applied to posture control to build an intelligent posture controller for robot motion. Because training on a physical prototype yields few samples and is inefficient, an identified model of the robot is used to pre-train the posture controller offline; the pre-trained controller then serves as prior knowledge for continued learning in the real physical environment, which improves the efficiency of later training. The optimized posture controller is then applied to the motion control of the robot. Compared with robot motion under a PID controller, an MPC controller, and a combined PID+MPC controller, the standard deviation of the tracking residual of the upper-body pitch posture trajectory is reduced by 60.97%, 46.36%, and 23.98%, respectively, in the environment-transition walking test, and by 60.38%, 26.38%, and 9.52%, respectively, in the flat-ground obstacle walking test.

Key words: bipedal walking, deep reinforcement learning, motion control

CLC Number:
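
The abstract reports its results as percentage reductions in the standard deviation of the pitch tracking residual. The following is a minimal sketch of how such a figure could be computed, not the authors' evaluation code. It assumes the residual is the measured pitch minus the reference pitch and that the population standard deviation is used; the source states neither. The function names `residual_std` and `percent_reduction` are hypothetical, and the trajectories are synthetic and purely illustrative, not data from the paper.

```python
import math
import statistics


def residual_std(reference, measured):
    """Population standard deviation of the residual (measured - reference)."""
    if len(reference) != len(measured):
        raise ValueError("reference and measured must have the same length")
    residuals = [m - r for r, m in zip(reference, measured)]
    return statistics.pstdev(residuals)


def percent_reduction(baseline_std, improved_std):
    """Reduction of improved_std relative to baseline_std, in percent."""
    if baseline_std <= 0:
        raise ValueError("baseline_std must be positive")
    return 100.0 * (baseline_std - improved_std) / baseline_std


# Synthetic pitch trajectories in radians (illustrative only, not from the paper).
t = [0.01 * k for k in range(500)]
reference = [0.05 * math.sin(2 * math.pi * 0.5 * x) for x in t]
# Hypothetical baseline controller: oscillating tracking error of amplitude 0.020 rad.
baseline = [r + 0.020 * math.sin(2 * math.pi * 3.0 * x) for r, x in zip(reference, t)]
# Hypothetical improved controller: same error shape at half the amplitude.
improved = [r + 0.010 * math.sin(2 * math.pi * 3.0 * x) for r, x in zip(reference, t)]

sigma_base = residual_std(reference, baseline)
sigma_new = residual_std(reference, improved)
reduction = percent_reduction(sigma_base, sigma_new)
print(f"baseline std = {sigma_base:.4f} rad, "
      f"improved std = {sigma_new:.4f} rad, reduction = {reduction:.1f}%")
```

Because the synthetic improved controller halves the error amplitude, the computed reduction is about 50%. Under the same assumptions, a reported reduction such as 60.97% would mean the residual standard deviation fell to roughly 39% of the baseline value.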