• CN: 11-2187/TH
  • ISSN: 0577-6686

Journal of Mechanical Engineering ›› 2026, Vol. 62 ›› Issue (5): 12-25. doi: 10.3901/JME.260224

Real-time Scheduling Simulation Optimization Method for Smart Production Lines Based on Digital Twin and Reinforcement Learning

YANG Zehao1, DONG Wei1, HUANG Sihan1,2, YIN Yanchao3, DONG Liyang4, ZHENG Zujie5   

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081;
  2. Key Laboratory of Industry Knowledge & Data Fusion Technology and Application, Ministry of Industry and Information Technology, Beijing Institute of Technology, Beijing 100081;
  3. Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650550;
  4. COSMOPlat Industrial Intelligence Research Institute (Qingdao) Co., Ltd., Qingdao 266000;
  5. Shanghai Spaceflight Precision Machinery Institute, Shanghai 201600
  • Received: 2025-08-05 Revised: 2025-12-05 Published: 2026-04-23

Abstract: Production scheduling is a long-standing research focus in industry and sets the pace for the efficient operation of production lines. With the continuous evolution of intelligent manufacturing, smart production scheduling has become a research frontier. Multi-source stochastic disturbances, such as production task variations and the coupling of manufacturing resources, make it difficult to balance scheduling efficiency and accuracy during dynamic production. To address this challenge, a real-time scheduling simulation optimization method based on digital twin (DT) and reinforcement learning (RL) is proposed. DT technology is used to construct high-fidelity models of production lines, establishing a hierarchical virtual production simulation environment. An improved Q-Learning algorithm is developed to build a scheduling optimization agent; it incorporates triple state space reconstruction, a multi-dimensional reward function, and a dual exploration strategy to mitigate the curse of dimensionality and the limited robustness of traditional algorithms. Furthermore, a hierarchical execution control architecture based on a perception-decision-execution loop is established throughout the production simulation process to achieve deep fusion between the DT and the intelligent simulation agent. A case study on an aerospace product final assembly line demonstrates the effectiveness of the proposed method. The results show that the execution distances yielded by five classical scheduling rules are 6.38% to 16.50% higher than those of the proposed method, indicating a substantial improvement in the collaborative efficiency of manufacturing resources.
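
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of standard tabular Q-Learning with a single epsilon-greedy exploration rule. It is not the improved algorithm described above: the triple state space reconstruction, multi-dimensional reward function, and dual exploration strategy are not reproduced here. The toy problem, the station names `near` and `far`, the distance values, and the function names `q_update`, `epsilon_greedy`, and `train` are all hypothetical and chosen only for illustration; the reward is assumed to be the negative travel distance, loosely mirroring the execution-distance metric.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard one-step Q-Learning update (baseline only, not the paper's variant)."""
    # A terminal transition (next_state is None) has no future value.
    if next_state is None:
        best_next = 0.0
    else:
        best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best-known action."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical toy data: distance from the start to each candidate station.
DISTANCES = {"near": 2.0, "far": 5.0}


def train(episodes=200, epsilon=0.2, seed=0):
    """Learn which station to serve first in a one-step dispatching toy problem."""
    rng = random.Random(seed)
    q = defaultdict(float)  # unseen (state, action) pairs default to 0.0
    actions = list(DISTANCES)
    for _ in range(episodes):
        action = epsilon_greedy(q, "start", actions, epsilon, rng)
        reward = -DISTANCES[action]  # assumed reward: negative travel distance
        q_update(q, "start", action, reward, None, actions)
    return q


if __name__ == "__main__":
    q_table = train()
    best = max(DISTANCES, key=lambda a: q_table[("start", a)])
    print("preferred first station:", best)
```

In this sketch the agent learns to prefer the shorter move because its estimated value approaches the negative of the shorter distance. A production-line setting would replace the single `start` state with the state observed from the DT simulation environment.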

Key words: smart production lines, production scheduling, digital twin, reinforcement learning, simulation optimization
