• CN:11-2187/TH
  • ISSN:0577-6686

Journal of Mechanical Engineering ›› 2026, Vol. 62 ›› Issue (5): 12-25. doi: 10.3901/JME.260224

• Invited Column: Information-Driven Final Assembly Pull Production Mode, Technologies, and Applications

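  • About the authors: YANG Zehao, male, born in 2002. His main research interests include robotic machining and production simulation. E-mail: 2391646254@qq.com
    DONG Wei, male, born in 2002. His main research interest is intelligent simulation and optimization. E-mail: 1635611987@qq.com
    HUANG Sihan (corresponding author), male, born in 1991, PhD, distinguished research fellow, doctoral supervisor. His main research interests include embodied-intelligence reconfigurable manufacturing, human-centric smart manufacturing, and digital twin. E-mail: hsh@bit.edu.cn
  • Funding:
    Supported by the Key Research Topic Program of the Beijing Natural Science Foundation (L243009) and the National Natural Science Foundation of China (52405530).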

Real-time Scheduling Simulation Optimization Method for Smart Production Lines Based on Digital Twin and Reinforcement Learning

YANG Zehao1, DONG Wei1, HUANG Sihan1,2, YIN Yanchao3, DONG Liyang4, ZHENG Zujie5   

  1. School of Mechanical Engineering, Beijing Institute of Technology, Beijing 100081;
    2. Key Laboratory of Industry Knowledge & Data Fusion Technology and Application, Ministry of Industry and Information Technology, Beijing Institute of Technology, Beijing 100081;
    3. Faculty of Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650550;
    4. COSMOPlat Industrial Intelligence Research Institute (Qingdao) Co., Ltd., Qingdao 266000;
    5. Shanghai Spaceflight Precision Machinery Institute, Shanghai 201600
  • Received:2025-08-05 Revised:2025-12-05 Published:2026-04-23

Abstract: Production scheduling has long been a research hotspot in manufacturing and acts as the metronome for efficient production line operation. As intelligent manufacturing advances, intelligent production scheduling has become a research frontier. During dynamic production, smart production lines face multi-source uncertain disturbances such as production task changes and the coupling of manufacturing resources, so balancing scheduling efficiency against accuracy is the core challenge. To address it, a real-time scheduling simulation optimization method for smart production lines based on digital twin (DT) and reinforcement learning (RL) is proposed. DT technology is used to construct high-fidelity geometry-function-state models of the production elements, which are assembled into a hierarchical, high-fidelity virtual production simulation environment. An improved Q-Learning algorithm is designed to build the scheduling optimization agent; it combines three-tuple state-space reconstruction, a multi-dimensional reward function, and a dual exploration strategy to overcome the curse of dimensionality and the limited robustness of traditional algorithms. A hierarchical execution control architecture then integrates the DT with the agent, ensuring closed-loop perception-decision-execution coordination throughout the production simulation. The method is validated on the final assembly line of an aerospace product. The results show that the execution distances produced by five classical scheduling rules are 6.38% to 16.50% longer than that of the proposed method, indicating a substantial improvement in the collaborative efficiency of manufacturing resources.

Key words: smart production lines, production scheduling, digital twin, reinforcement learning, simulation optimization
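
The agent design named in the abstract (a three-element state tuple, a multi-dimensional reward, and a dual exploration strategy on top of Q-Learning) can be illustrated with a minimal tabular sketch. The abstract does not specify the state fields, the reward terms and weights, or the two exploration mechanisms, so everything below is an assumption chosen for illustration: the state fields `(pending_jobs, agv_position, station_busy)`, the reward terms in `reward`, and the pairing of epsilon-random choice with Boltzmann sampling in `choose_action` are hypothetical and are not the authors' implementation. Only the update in `q_update` is the standard Q-Learning rule.

```python
import math
import random
from collections import defaultdict

# Assumed action set: index of the dispatching choice (hypothetical).
ACTIONS = (0, 1, 2)


def reward(distance, waiting, utilization, weights=(-1.0, -0.5, 2.0)):
    """Weighted sum of several objectives; terms and weights are assumptions."""
    w_d, w_w, w_u = weights
    return w_d * distance + w_w * waiting + w_u * utilization


def choose_action(q, state, epsilon, temperature, rng):
    """One assumed reading of 'dual exploration':
    with probability epsilon pick uniformly at random,
    otherwise sample from a Boltzmann (softmax) distribution over Q-values."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    values = [q[(state, a)] for a in ACTIONS]
    top = max(values)  # subtract the max for numerical stability
    prefs = [math.exp((v - top) / temperature) for v in values]
    return rng.choices(ACTIONS, weights=prefs, k=1)[0]


def q_update(q, state, action, r, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-Learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])


# Usage: one decision step on a hypothetical state
# (pending_jobs, agv_position, station_busy).
q_table = defaultdict(float)
rng = random.Random(0)
s = (3, 0, 1)
a = choose_action(q_table, s, epsilon=0.1, temperature=1.0, rng=rng)
q_update(q_table, s, a, reward(2.0, 1.0, 0.5), (2, 1, 0))
```

Keying the table on a small tuple rather than on the full line configuration keeps the number of distinct states manageable, which is the usual motivation for this kind of state reconstruction in tabular methods; in the setting the abstract describes, the transition and reward values would come from the digital twin simulation rather than from hand-written numbers.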

CLC number: