Interactive Operation Agent Scheduling Method for Job Shop Based on Deep Reinforcement Learning

doi:10.3901/JME.2023.12.078

Abstract

High-quality solutions to the job shop scheduling problem (JSSP) are difficult to obtain quickly because the problem is NP-hard, and random disturbances in production make rescheduling frequent. A novel interactive operation agent (IOA) scheduling framework based on deep reinforcement learning is proposed. By analyzing the process-route and processing-equipment constraints among operations, the operations in the job shop are modeled as operation agents. An interaction mechanism is designed through which each agent interacts with related agents and updates its own feature vector according to their relationships. A deep neural network that takes the operation features and the earliest processing time as input is then constructed to fit the action-value function, so that the scheduling model can generate a scheduling strategy from the system state and the features of each operation agent. The IOA scheduling model is trained with the Double DQN algorithm, and an experience replay mechanism breaks the correlation between sequential training samples. The trained model can quickly generate high-quality schedules and can effectively reschedule production when a machine fails. Experimental results show that the proposed IOA scheduling method outperforms a greedy algorithm and heuristic scheduling rules, and that it has good robustness and generalization ability.

Key words: job shop scheduling, deep reinforcement learning, operation agents, machine failure, Double DQN
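
The training scheme named in the abstract (Double DQN with experience replay) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ReplayBuffer` and `double_dqn_targets`, the discount factor, and the plain-list representation of Q-values are assumptions made for illustration only. The sketch shows the two generic ingredients: uniform random sampling from a buffer of stored transitions, which decorrelates consecutive samples, and the Double DQN target, in which the online network selects the next action and the target network evaluates it.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of transitions (hypothetical helper, not from the paper)."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are dropped first
        self.rng = random.Random(seed)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutively collected transitions.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.99):
    """Compute Double DQN regression targets for a batch.

    q_online_next[i] and q_target_next[i] hold the Q-values of the candidate
    actions in the next state of sample i (in a scheduling setting these
    could be the schedulable operations; that mapping is an assumption here).
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapped future value.
            targets.append(r)
            continue
        # Action selection uses the online network ...
        best_action = max(range(len(q_on)), key=lambda a: q_on[a])
        # ... while its value is read from the target network.
        targets.append(r + gamma * q_tg[best_action])
    return targets


# Usage with made-up numbers (illustrative only, not experimental data).
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(("state", step))
batch = buffer.sample(2)

targets = double_dqn_targets(
    rewards=[1.0, 2.0],
    dones=[False, True],
    q_online_next=[[1.0, 3.0], [5.0, 2.0]],
    q_target_next=[[10.0, 20.0], [30.0, 40.0]],
    gamma=0.5,
)
```

Separating action selection from action evaluation is what distinguishes Double DQN from standard DQN, which takes the maximum over the target network's own values and tends to overestimate them.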

CLC Number: TH166

CHEN Ruiqi, LI Wenxin, WANG Chuanyang, YANG Hongbing. Interactive Operation Agent Scheduling Method for Job Shop Based on Deep Reinforcement Learning[J]. Journal of Mechanical Engineering, 2023, 59(12): 78-88.


URL: http://www.cjmenet.com.cn/EN/10.3901/JME.2023.12.078

http://www.cjmenet.com.cn/EN/Y2023/V59/I12/78
