Combining Reinforcement Learning Algorithms with Graph Neural Networks to Solve Dynamic Job Shop Scheduling Problems

Bibliographic Details
Published in: Processes, Vol. 11, No. 5, p. 1571
Main Authors: Yang, Zhong; Bi, Li; Jiao, Xiaogang
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 21 May 2023
ISSN: 2227-9717
DOI: 10.3390/pr11051571

More Information
Summary: Smart factories have attracted considerable attention from scholars working on intelligent scheduling problems, owing to the complexity and dynamics of their production processes. The dynamic job shop scheduling problem (DJSP), one such intelligent scheduling problem, aims to produce an optimized sequence of scheduling decisions based on the real-time state of a dynamic job shop. Traditional reinforcement learning (RL) methods model the scheduling problem as a Markov decision process and rely on a hand-designed reward to obtain scheduling sequences under different real-time shop states. However, the definition of the shop state often depends on the scheduling experience of the model's designer, which limits the optimization capability of the RL model. In this paper, we combine a graph neural network (GNN) with a deep reinforcement learning (DRL) algorithm to solve the DJSP. An agent model that maps the job shop state graph directly to scheduling rules is constructed, thereby avoiding the hand-crafted state feature vectors that traditional RL methods require. In addition, a new reward function is defined, and experimental results show that the proposed reward method is more effective. The effectiveness and feasibility of the model are demonstrated by comparison with general DRL algorithms on minimizing earliness and tardiness, which also lays a foundation for further work on the DJSP.
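
To make the pipeline described in the summary concrete, the following is a minimal PyTorch sketch (not the authors' code) of the general idea: a GNN encodes the job shop state graph into an embedding, and a policy head maps that embedding to a choice among candidate dispatching rules. All names, feature dimensions, the mean-aggregation scheme, and the four-rule action set (e.g., SPT, LPT, FIFO, EDD) are illustrative assumptions.

import torch
import torch.nn as nn

class GraphEncoder(nn.Module):
    """One round of mean-aggregation message passing over the shop graph."""
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.lin_self = nn.Linear(in_dim, hid_dim)
        self.lin_neigh = nn.Linear(in_dim, hid_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # x: (num_ops, in_dim) node features (e.g., processing time, status)
        # adj: (num_ops, num_ops) adjacency of the job shop state graph
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
        neigh = adj @ x / deg                  # mean over graph neighbors
        return torch.relu(self.lin_self(x) + self.lin_neigh(neigh))

class RulePolicy(nn.Module):
    """Maps the pooled graph embedding to a distribution over dispatching rules."""
    def __init__(self, in_dim: int, hid_dim: int, num_rules: int):
        super().__init__()
        self.encoder = GraphEncoder(in_dim, hid_dim)
        self.head = nn.Sequential(
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, num_rules),
        )

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x, adj)
        g = h.mean(dim=0)                      # pool nodes into a state embedding
        return torch.softmax(self.head(g), dim=-1)

# Usage: 20 operations, 6 node features, 4 candidate dispatching rules
# (the rule set and dimensions are assumptions for illustration only).
policy = RulePolicy(in_dim=6, hid_dim=64, num_rules=4)
x = torch.rand(20, 6)
adj = (torch.rand(20, 20) < 0.2).float()
probs = policy(x, adj)
rule = torch.multinomial(probs, 1).item()      # sampled dispatching rule

In a DRL training loop, the sampled rule would dispatch the next operation in the simulated shop, and the resulting reward (the paper defines its own reward function, not reproduced here) would update the policy.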
ISSN: 2227-9717
DOI: 10.3390/pr11051571