• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2025, Vol. 61 ›› Issue (4): 380-388. doi: 10.3901/JME.2025.04.380

• Interdisciplinary and Frontier •

Automatic Driving Decision Network of Crawler Excavator Based on Multi-modal

CHEN Qihuai1,2, LIN Tianliang1,2, MA Ronghua1,2, WEN Jianhe1,2, MIAO Cheng1,2, REN Haoling1,2

  1. College of Mechanical Engineering and Automation, Huaqiao University, Quanzhou 361021;
  2. Fujian Key Laboratory of Green Intelligent Drive and Transmission for Mobile Machinery, Quanzhou 361021
  • Received: 2024-03-19 Revised: 2024-11-03 Published: 2025-04-14
  • About the authors: CHEN Qihuai, male, PhD, associate professor. His main research interests are green and intelligent technologies for construction machinery. E-mail: chen.qihuai@163.com
    LIN Tianliang (corresponding author), male, PhD, professor, doctoral supervisor. His main research interests are electrification and intelligent technologies for construction machinery. E-mail: ltlkxl@163.com
  • Supported by:
    National Natural Science Foundation of China (52275055), Fujian Province University Industry-Academia-Research Joint Innovation Project (2022H6007), and Xiamen Major Science and Technology Program (3502Z20231013).

Abstract: Crawler excavators operate in complex application scenarios and irregular working environments. Current deep learning-based autonomous driving decision-making methods mainly take monocular camera RGB images as input. With a single data type, low prediction accuracy, and insufficient understanding of the driving scene, they cannot complete autonomous driving decisions for crawler excavators. To better realize such decisions, multiple types of binocular image information are fused and an attention mechanism is introduced to construct a multi-modal decision network for crawler excavators that outputs multi-task predictions of steering and speed. To verify the feasibility of the proposed scheme, the network is tested on open-source driving scenario datasets and a real-site dataset of unstructured roads, and real-vehicle tests are carried out. The experimental results show that the model integrating multiple types of binocular image information has a clear advantage in generalization ability when predicting the steering angle and speed of crawler excavators, and that it can complete automatic travel of the crawler excavator and perform autonomous obstacle avoidance well.

Key words: construction machinery, deep learning, autonomous driving, behavior decision, multi-modal

CLC Number: