• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2025, Vol. 61 ›› Issue (2): 236-246. doi: 10.3901/JME.2025.02.236

• Vehicle Engineering •

Research on the Collaborative Energy Management Strategy of Hybrid Electric Vehicle Platoon Based on Multi-agent Deep Reinforcement Learning in the Transverse and Longitudinal Coupled Car-following Scenario

TANG Xiaolin1, GAN Jiongpeng1, ZHANG Zhenguo2

  1. College of Mechanical and Vehicle Engineering, Chongqing University, Chongqing 400044;
  2. School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200024
  • Received: 2024-01-19 Revised: 2024-09-20 Published: 2025-02-26
  • About the authors: TANG Xiaolin, male, born in 1984, PhD, professor and doctoral supervisor. His main research interests are the dynamics and energy-saving control of hybrid electric vehicles. E-mail: tangxl0923@cqu.edu.cn; GAN Jiongpeng, male, born in 1997. His main research interests are energy management of hybrid electric vehicles and deep reinforcement learning. E-mail: ganjiongpeng@cqu.edu.cn; ZHANG Zhenguo (corresponding author), male, born in 1985, PhD, associate professor and doctoral supervisor. His main research interest is vibration and noise analysis and control of mechanical systems. E-mail: zzgjtx@sjtu.edu.cn
  • Funding:
    Supported by the National Natural Science Foundation of China (52222215, 52072051) and the Chongqing Outstanding Youth Fund (2023NSCQ-JQX0009).

Abstract: To explore the application of multi-agent deep reinforcement learning (DRL) to multi-objective cooperative control of hybrid electric vehicles, a collaborative energy management strategy for a hybrid electric vehicle platoon is proposed based on the multi-agent deep deterministic policy gradient (MADDPG) algorithm. First, traffic simulation software is used to build a transverse and longitudinal coupled car-following scenario that simulates an internet-of-vehicles environment and gives accurate access to vehicle information. Second, a transverse and longitudinal coupled car-following strategy based on rules and grid search, covering lateral lane changing and longitudinal car following, is designed to achieve higher traffic efficiency. Finally, the MADDPG algorithm is used to design an adaptive collaborative energy management strategy that maximizes the overall benefit of the platoon; random demand power sequences for the platoon are obtained by randomizing the initial positions of the traffic flow, which increases the randomness of training and improves the adaptability of the strategy to different driving conditions. The results show that the multi-agent platoon collaborative energy management strategy achieves a better overall optimization effect than the single-agent strategy, and that its adaptability to driving conditions improves to a certain extent after training under random driving conditions.

Key words: transverse and longitudinal coupled car-following, multi-agent deep reinforcement learning, hybrid electric vehicle platoon, cooperative energy management

CLC number:
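
As a rough illustration of the MADDPG structure named in the abstract, the sketch below shows the usual split between decentralized actors (each vehicle acts on its own observation) and centralized critics (each critic scores the joint observations and actions of the whole platoon). It is not the authors' implementation: the platoon size, observation contents, action meaning, and the linear stand-ins for neural networks are all assumptions made for illustration, and no training step is included.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3  # hypothetical platoon size (not given in the abstract)
OBS_DIM = 4   # hypothetical local observation, e.g. speed, acceleration, SOC, gap
ACT_DIM = 1   # hypothetical action, e.g. a normalized engine power command


def make_actor():
    """Linear actor parameters for one vehicle (a stand-in for a neural network)."""
    return rng.normal(scale=0.1, size=(OBS_DIM, ACT_DIM)), np.zeros(ACT_DIM)


def act(actor, obs):
    """Decentralized execution: each actor sees only its own observation."""
    w, b = actor
    return np.tanh(obs @ w + b)  # deterministic action bounded to [-1, 1]


def make_critic():
    """Linear critic parameters over the joint observation-action vector."""
    joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
    return rng.normal(scale=0.1, size=joint_dim), 0.0


def q_value(critic, all_obs, all_act):
    """Centralized training: the critic scores all observations and all actions."""
    w, b = critic
    joint = np.concatenate([all_obs.ravel(), all_act.ravel()])
    return float(joint @ w + b)


# One actor and one critic per vehicle, as in the standard MADDPG layout.
actors = [make_actor() for _ in range(N_AGENTS)]
critics = [make_critic() for _ in range(N_AGENTS)]

# Placeholder observations; in the paper these would come from the traffic simulation.
all_obs = rng.uniform(-1.0, 1.0, size=(N_AGENTS, OBS_DIM))
all_act = np.stack([act(a, o) for a, o in zip(actors, all_obs)])
q_values = [q_value(c, all_obs, all_act) for c in critics]
```

The centralized critics are only needed during training; at run time each vehicle would use its own actor alone, which is what makes the scheme usable for a platoon with local sensing.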