• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2025, Vol. 61 ›› Issue (22): 198-210. doi: 10.3901/JME.2025.22.198

• Transportation Engineering •


Multi-expert Learning Method for Autonomous Lane-changing Decision-making in Multi-scenario Highway Environments

YAO Fuxing1,2, LI Haoyu1, LENG Jianghao2, YANG Xiongji2, SUN Chao1,2

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081;
  2. Shenzhen Automotive Research Institute, Beijing Institute of Technology, Shenzhen 518057
  • Received: 2024-12-25 Revised: 2025-06-16 Published: 2026-01-10
  • About the authors: YAO Fuxing, male, born in 1994, PhD candidate. His main research interest is autonomous driving decision-making. E-mail: dajiaotingqiao_23@126.com
    SUN Chao (corresponding author), male, born in 1988, PhD, associate professor, doctoral supervisor. His main research interests are autonomous driving and intelligent connected vehicles, and energy-saving management of electric vehicles. E-mail: chaosun@bit.edu.cn
  • Funding:
    Supported by the projects "Research and Application Demonstration of Key Digital Road Network Technologies for Intelligent Vehicles" (2023B0909040001) and "Key Technologies and Application Demonstration of a Vehicle-Road-Cloud Integrated Data Foundation for Intelligent Connected Operation Management" (KJZD20231023100304010).

Abstract: Autonomous lane-changing decisions for automated vehicles on highways place high demands on both driving safety and efficiency, which considerably lengthens the training of learning-based decision algorithms. This study proposes a multi-expert learning method (MELM) that integrates multiple experts (actors), each trained with the soft actor-critic (SAC) algorithm under the constraints of a distinct, predefined sub-scenario. The sub-scenarios are defined according to the lane-changing characteristics of the original highway scenario. Each expert controls the vehicle in its corresponding sub-scenario, and the experts are integrated through a scenario classifier that identifies the current sub-scenario. By markedly lowering the training difficulty of SAC, MELM reduces model training time by 62.19% compared with a single SAC model. MELM is also compared against several state-of-the-art methods under representative driving scenarios. Simulation results show a 27.06% improvement in driving efficiency over the single SAC model, with zero collisions and zero off-road incidents across 100 test episodes (approximately 100 000 timesteps). The adaptability of MELM is further validated in simulation scenarios with different road condition settings.

Key words: autonomous driving, multi-expert learning method, lane-changing decision-making
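
The integration step described in the abstract, where a scenario classifier selects which expert controls the vehicle, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the observation format, the sub-scenario definitions, or the classifier, so the labels `lane_keep` and `lane_change`, the 30 m gap threshold, and the rule-based stand-in experts below are all hypothetical; in MELM each expert would be a SAC-trained actor.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence

# Hypothetical types: the abstract does not specify observation or action formats.
Observation = Sequence[float]
Action = Sequence[float]
Expert = Callable[[Observation], Action]


@dataclass
class MultiExpertPolicy:
    """Dispatches each observation to the expert of the identified sub-scenario."""
    classifier: Callable[[Observation], str]  # returns a sub-scenario label
    experts: Dict[str, Expert]                # one trained policy per label

    def act(self, obs: Observation) -> Action:
        label = self.classifier(obs)
        if label not in self.experts:
            raise KeyError(f"no expert registered for sub-scenario {label!r}")
        return self.experts[label](obs)


# Toy stand-ins (assumptions): obs[0] is the gap to the lead vehicle in metres,
# and the two "experts" are fixed rules rather than SAC-trained actors.
def toy_classifier(obs: Observation) -> str:
    return "lane_change" if obs[0] < 30.0 else "lane_keep"


def keep_expert(obs: Observation) -> Action:
    return (0.0, 0.5)  # (steering, acceleration)


def change_expert(obs: Observation) -> Action:
    return (0.2, 0.0)


policy = MultiExpertPolicy(
    classifier=toy_classifier,
    experts={"lane_keep": keep_expert, "lane_change": change_expert},
)
```

Calling `policy.act(obs)` at every control step routes the observation to exactly one expert, so each expert only ever needs to perform well inside its own sub-scenario.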
