• CN: 11-2187/TH
  • ISSN: 0577-6686

Journal of Mechanical Engineering ›› 2026, Vol. 62 ›› Issue (5): 74-87. doi: 10.3901/JME.260229

Dynamic Scheduling Optimization of Island Assembly Lines Under Uncertain Disturbances by Multi-objective Deep Reinforcement Learning

HUANG Ming1, HUANG Sihan1,2, CHEN Jianpeng1, DONG Wei1, WANG Baicun3, RUAN Bing4, GAO Yunpeng5, WANG Guoxin1,2, YAN Yan1,2   

    1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081;
    2. Key Laboratory of Industry Knowledge & Data Fusion Technology and Application, Ministry of Industry and Information Technology, Beijing Institute of Technology, Beijing 100081;
    3. School of Mechanical Engineering, Zhejiang University, Hangzhou 310058;
    4. Automotive Engineering Corporation, Tianjin 300113;
    5. SINOMACH Intelligence Technology Research Institute Co., Ltd., Beijing 100013
  • Received: 2025-02-25 Revised: 2025-07-15 Published: 2026-04-23

Abstract: With the rapid development of the new energy vehicle industry and the growth of diversified, customized market demand, the island assembly mode has emerged to address the limited flexibility of traditional automotive assembly lines. In actual assembly environments, however, frequent uncertain events such as emergency order insertion severely restrict the stability and productivity of automotive final assembly. Motivated by these practical needs, this study addresses dynamic scheduling optimization of island assembly lines under uncertain disturbances. First, a mixed-integer nonlinear programming model is formulated with two objectives: minimizing the maximum completion time and minimizing the order change index. Second, a multi-objective dueling double deep Q-network (MO-D3QN) is designed to solve this model. In this framework, state indicators and action scheduling rules are developed based on the features of assembly islands, assembly processes, assembly products, and production transportation in the island assembly scenario. A continuous immediate reward component is constructed for each of the two optimization objectives, and the components are aggregated by weighted-sum scalarization. The MO-D3QN network is then trained to select optimized scheduling rules for different environment states. Finally, computational experiments are conducted on instances of three scales. The results show that MO-D3QN outperforms single scheduling rules, a random selection strategy, and classical DQN, verifying its effectiveness and competitiveness.
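
The three mechanisms named in the abstract (weighted-sum reward scalarization, the dueling aggregation, and the double-DQN target) can be illustrated with a minimal Python sketch. It uses the textbook definitions only. The function names, the weights, and the numeric values are illustrative assumptions and are not taken from the paper, whose actual reward components and network design are not given in the abstract.

```python
from typing import List, Sequence


def scalarize(rewards: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted-sum scalarization of per-objective immediate rewards."""
    if len(rewards) != len(weights):
        raise ValueError("rewards and weights must have the same length")
    return sum(w * r for w, r in zip(weights, rewards))


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def double_dqn_target(
    reward: float,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
    done: bool,
) -> float:
    """Double-DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(q_online_next)), key=lambda i: q_online_next[i])
    return reward + gamma * q_target_next[best]


# Hypothetical values: one reward component per objective
# (completion time, order change index) with assumed equal weights.
r = scalarize([-0.5, -0.25], [0.5, 0.5])
q = dueling_q(1.0, [0.0, 1.0, 2.0])  # one Q-value per scheduling rule
y = double_dqn_target(r, 0.5, [1.0, 3.0, 2.0], [4.0, 2.0, 8.0], done=False)
```

In this sketch each action index stands for one scheduling rule, so the index of the largest Q-value is the rule selected in the current state. The weights in `scalarize` set the trade-off between the two objectives.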

Key words: island assembly line, automotive assembly, uncertain disturbances, dynamic scheduling, multi-objective deep reinforcement learning
