Yard Crane Scheduling Method Based on Deep Reinforcement Learning for the Automated Container Terminal

doi:10.3901/JME.2024.06.044

Journal of Mechanical Engineering, 2024, Vol. 60, Issue (6): 44-57.

Special column: Data-Knowledge Hybrid-Driven Intelligent Manufacturing Systems


WANG Wuyin¹, HUANG Zizhao¹, ZHUANG Zilong¹, FANG Huaijin², QIN Wei¹

1. Institute of Industrial Engineering and Management, Shanghai Jiao Tong University, Shanghai 200240;
2. Shanghai International Port (Group) Co., Ltd., Shanghai 200080

Received: 2023-07-07 Revised: 2023-11-21 Online: 2024-03-20 Published: 2024-06-07
Corresponding author: QIN Wei, male, born in 1982, PhD, associate professor and doctoral supervisor. His main research interests include modeling, control and optimization of complex systems, and the theory, methods and applications of machine intelligence. E-mail: wqin@sjtu.edu.cn
About the authors:
WANG Wuyin, female, born in 1999. Her main research interest is intelligent scheduling algorithms. E-mail: yiiiiiiner@sjtu.edu.cn
HUANG Zizhao, male, born in 1996, master's degree. His main research interests include production planning and scheduling control, and intelligent optimization algorithms. E-mail: huangzz96@sjtu.edu.cn
ZHUANG Zilong, male, born in 1995, PhD candidate. His main research interests include modeling and optimization of complex systems, machine learning and industrial intelligence. E-mail: zhuangzl@sjtu.edu.cn
FANG Huaijin, male, born in 1963, master's degree, senior economist, vice president of Shanghai International Port (Group) Co., Ltd., responsible for engineering technology and scientific and technological innovation. E-mail: fanghj@portshanghai.cn
Funding:
Supported by the National Key Research and Development Program of China (2019YFB1704401).

Abstract: Yard cranes are the core handling equipment in the yard of an automated container terminal, and scheduling them well is key to improving container handling efficiency. The yard crane scheduling problem exhibits complex spatio-temporal coupling and is highly dynamic. To address it, a mathematical programming model is built with the objective of minimizing the waiting time of automatic guided vehicles (AGVs) and external container trucks, and a novel deep reinforcement learning method is proposed to solve it. The algorithm defines an agent that closely reflects the actual yard operating environment, and in the interaction between the agent and the environment it uses a pointer network, an attention mechanism and an actor-critic (A-C) architecture to improve the extraction of hidden patterns in the state. Experiments are carried out on instances of different scales generated from actual data of the Yangshan Phase IV Automated Terminal. The results show that the proposed algorithm produces an approximately optimal yard crane scheduling scheme in a relatively short time, with performance about 17% better than several heuristic rule algorithms. The proposed scheduling method is therefore effective and superior, and it can provide dynamic decision support for yard operations in practice.

Key words: automated container terminal, yard, yard crane scheduling, deep reinforcement learning
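To make the optimization objective in the abstract concrete, the following is a minimal sketch of how total vehicle waiting time could be measured for a single yard crane under two simple heuristic dispatching rules. It is not the paper's mathematical model or its deep reinforcement learning method. The `Task` fields, the `simulate` function, the rule names `fcfs` and `nearest_first`, the one-dimensional bay layout, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Task:
    """One handling request from an AGV or external truck (hypothetical fields)."""
    name: str
    arrival: float   # time the vehicle reaches the block
    bay: int         # bay position the crane must move to
    handling: float  # crane handling time once positioned


Rule = Callable[[List[Task], int], Task]


def simulate(tasks: List[Task], rule: Rule,
             start_bay: int = 0, speed: float = 1.0) -> Tuple[List[str], float]:
    """Serve tasks one at a time with one crane; return (service order, total waiting time)."""
    pending = list(tasks)
    clock, bay, total_wait = 0.0, start_bay, 0.0
    order: List[str] = []
    while pending:
        # Only vehicles that have already arrived can be served.
        ready = [t for t in pending if t.arrival <= clock]
        if not ready:
            # Crane idles until the next vehicle arrives.
            clock = min(t.arrival for t in pending)
            continue
        task = rule(ready, bay)
        clock += abs(task.bay - bay) / speed  # crane travel time
        total_wait += clock - task.arrival    # vehicle waits until handling starts
        clock += task.handling
        bay = task.bay
        pending.remove(task)
        order.append(task.name)
    return order, total_wait


def fcfs(ready: List[Task], bay: int) -> Task:
    """First come, first served: pick the earliest arrival."""
    return min(ready, key=lambda t: t.arrival)


def nearest_first(ready: List[Task], bay: int) -> Task:
    """Pick the task whose bay is closest to the crane's current bay."""
    return min(ready, key=lambda t: abs(t.bay - bay))


# Illustrative instance only; not data from the paper.
tasks = [
    Task("agv-1", arrival=0.0, bay=10, handling=2.0),
    Task("truck-1", arrival=1.0, bay=1, handling=2.0),
    Task("agv-2", arrival=2.0, bay=9, handling=2.0),
]

for rule in (fcfs, nearest_first):
    order, wait = simulate(tasks, rule)
    print(rule.__name__, order, wait)
```

On this toy instance the two rules serve the vehicles in different orders and accumulate different total waiting times, which is the quantity a learned scheduling policy would try to reduce. A real block would additionally involve multiple cranes, interference between them, and dynamic arrivals, none of which this sketch models.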

CLC number: U691

WANG Wuyin, HUANG Zizhao, ZHUANG Zilong, FANG Huaijin, QIN Wei. Yard Crane Scheduling Method Based on Deep Reinforcement Learning for the Automated Container Terminal[J]. Journal of Mechanical Engineering, 2024, 60(6): 44-57.

