• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2023, Vol. 59 ›› Issue (16): 315-324. doi: 10.3901/JME.2023.16.315

• Transportation Engineering •


DQN Reinforcement Learning-based Steering Control Strategy for Autonomous Driving

LIN Xinyou, YE Zhuoming, ZHOU Binhao

  1. School of Mechanical Engineering and Automation, Fuzhou University, Fuzhou 350002
  • Received: 2022-07-30  Revised: 2022-11-12  Online: 2023-08-20  Published: 2023-11-15
  • Corresponding author: LIN Xinyou, male, born in 1981, PhD, associate professor. His main research interests are energy management control strategies for new-energy electric vehicles and vehicle-road cooperative control for autonomous driving. E-mail: linxinyoou@fzu.edu.cn
  • About the authors: YE Zhuoming, male, born in 1996; main research interest: planning and control for autonomous driving. E-mail: yipzom@163.com. ZHOU Binhao, male, born in 1995; main research interest: path-tracking control for autonomous driving. E-mail: 369728624@qq.com
  • Funding: Supported by the National Natural Science Foundation of China (52272389), the Natural Science Foundation of Fujian Province (2020J01449), the Open Project of the Key Laboratory of Conveyance and Equipment of the Ministry of Education, East China Jiaotong University (KLCF2022-08), and the Open Research Fund of the Anhui Key Laboratory of Detection Technology and Energy Saving Devices, Anhui Polytechnic University (JCKJ2021A04).

Abstract: Most existing research on the autonomous steering of self-driving vehicles is based on the model predictive control (MPC) strategy, but traditional MPC requires an accurate mathematical model of the controlled object together with a large amount of real-time control computation. To this end, a steering control strategy based on deep Q-learning neural network (DQN) reinforcement learning is proposed, which enables autonomous vehicles to track paths accurately and effectively and improves the accuracy and stability of path tracking. The strategy trains the agent with an appropriately selected learning rate, so that the trained agent can adaptively obtain the optimal front-wheel steering angle under different road conditions and vehicle speeds. Simulation comparisons show that, compared with an unconstrained linear quadratic regulator (LQR) control strategy, the cumulative absolute lateral position deviation and the cumulative absolute yaw angle deviation of the DQN-based strategy increase noticeably but remain within an acceptable range, and the strategy still achieves effective path-tracking accuracy. The final real-vehicle test results likewise demonstrate the effectiveness of the proposed control strategy.

Key words: autonomous driving, steering control, path tracking, reinforcement learning, deep Q-learning network
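As a rough illustration of the DQN steering idea summarized in the abstract, the sketch below trains a tiny Q-network to pick a discrete front-wheel angle that drives lateral and heading errors toward zero on a kinematic bicycle model. All specifics here (network size, action set, reward shaping, learning rate, vehicle parameters) are illustrative assumptions, not the paper's actual settings, and the experience replay and target network of full DQN are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete candidate front-wheel angles (rad) -- an assumed action set.
ACTIONS = np.deg2rad([-5.0, -2.0, 0.0, 2.0, 5.0])

class QNet:
    """Tiny one-hidden-layer Q-network trained on a TD(0) target (manual backprop)."""
    def __init__(self, n_in=2, n_hidden=32, n_out=len(ACTIONS), lr=1e-2):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, s):
        self.s, self.h = s, np.tanh(s @ self.W1 + self.b1)
        return self.h @ self.W2 + self.b2

    def update(self, s, a, target):
        q = self.forward(s)
        # Clip the TD error (as in the original DQN) for numerical stability.
        err = float(np.clip(q[a] - target, -1.0, 1.0))
        onehot = np.eye(len(ACTIONS))[a]
        gz = self.W2[:, a] * err * (1.0 - self.h ** 2)
        self.W2 -= self.lr * np.outer(self.h, onehot) * err
        self.b2 -= self.lr * err * onehot
        self.W1 -= self.lr * np.outer(s, gz)
        self.b1 -= self.lr * gz

def step(state, delta, v=5.0, L=2.7, dt=0.1):
    """Kinematic bicycle model tracking the straight reference path y = 0.
    State: [lateral error e_y, heading error e_psi]; delta: front-wheel angle."""
    e_y, e_psi = state
    e_y += v * np.sin(e_psi) * dt
    e_psi = np.clip(e_psi + v / L * np.tan(delta) * dt, -np.pi / 2, np.pi / 2)
    reward = -(e_y ** 2 + 0.5 * e_psi ** 2)  # penalize tracking deviations
    return np.array([e_y, e_psi]), reward

def train(episodes=300, gamma=0.95, eps=0.2):
    """Epsilon-greedy DQN-style training loop (no replay buffer / target net)."""
    q = QNet()
    for _ in range(episodes):
        s = np.array([rng.uniform(-1, 1), rng.uniform(-0.3, 0.3)])
        for _ in range(50):
            a = int(rng.integers(len(ACTIONS))) if rng.random() < eps \
                else int(np.argmax(q.forward(s)))
            s2, r = step(s, ACTIONS[a])
            target = r + gamma * np.max(q.forward(s2))
            q.update(s, a, target)
            s = s2
            if abs(s[0]) > 5.0:  # end the episode if the car drifts too far
                break
    return q
```

After training, the greedy policy `ACTIONS[np.argmax(q.forward(state))]` plays the role the paper assigns to the trained agent: mapping the current tracking errors to a front-wheel angle without an explicit plant model or online optimization, which is the contrast with MPC drawn in the abstract.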
