Reinforcement learning and stochastic dynamic programming for jointly scheduling jobs and preventive maintenance on a single machine to minimise earliness-tardiness

Bibliographic Details
Published in	International journal of production research Vol. 62; no. 3; pp. 705 - 719
Main Authors	Sabri, Abderrazzak; Allaoui, Hamid; Souissi, Omar
Format	Journal Article
Language	English
Published	London: Taylor & Francis (Taylor & Francis LLC), 01.02.2024
Subjects	Advanced planning and scheduling systems; Algorithms; Computer Science; Computing time; Deep learning; Dynamic programming; Lateness; Machine learning; Preventive maintenance; Run time (computers); Scheduling; Stochastic programming
ISSN	0020-7543 1366-588X
DOI	10.1080/00207543.2023.2172472

Summary:	This paper addresses the stochastic joint scheduling of resumable jobs and preventive maintenance on a single machine subject to random breakdowns, with the goal of minimising the earliness-tardiness cost. The main aim is to compare recent machine learning-based methods with stochastic optimisation approaches. We propose one method from each field and solve the same problem first approximately with a stochastic dynamic programming model, then with an attention-based deep reinforcement learning model. We conduct a detailed experimental study of solution quality, run time, and robustness, and analyse the performance of both methods against an existing approach from the literature as a baseline. Both algorithms outperform the baseline. Moreover, the machine learning-based algorithm outperforms the stochastic dynamic programming-based heuristic: we report up to 30.5% savings in total cost, a reduction in computation time from 67 min to less than 1 s on large instances, and better robustness. These results clearly highlight its potential for solving such problems.
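
For readers unfamiliar with the objective named in the summary, the following is a minimal sketch of how an earliness-tardiness cost might be computed for a fixed job sequence on a single machine. It is an illustration only, not the authors' model: it ignores breakdowns and preventive maintenance, and the `Job` class, the function name, and the per-unit penalties `alpha` (earliness) and `beta` (tardiness) are assumptions that do not appear in the record.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Job:
    """Hypothetical job description (not taken from the paper)."""
    processing_time: float
    due_date: float


def earliness_tardiness_cost(
    jobs: Sequence[Job],
    alpha: float = 1.0,  # assumed penalty per unit of earliness
    beta: float = 1.0,   # assumed penalty per unit of tardiness
    start: float = 0.0,
) -> float:
    """Total cost of processing `jobs` in the given order without idle time.

    Simplifying assumption: no breakdowns and no maintenance, so each job
    completes exactly `processing_time` after the previous one finishes.
    """
    clock = start
    total = 0.0
    for job in jobs:
        clock += job.processing_time               # completion time of this job
        earliness = max(0.0, job.due_date - clock)  # finished before the due date
        tardiness = max(0.0, clock - job.due_date)  # finished after the due date
        total += alpha * earliness + beta * tardiness
    return total


if __name__ == "__main__":
    sequence = [Job(processing_time=2, due_date=5), Job(processing_time=3, due_date=4)]
    print(earliness_tardiness_cost(sequence, alpha=1.0, beta=2.0))
```

In this toy sequence the first job finishes 3 time units early and the second 1 unit late. In the stochastic setting described in the summary, completion times would also depend on random breakdowns and on when maintenance is scheduled, so a cost of this kind would typically be evaluated in expectation.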