A self-learning whale optimization algorithm based on reinforcement learning for a dual-resource flexible job shop scheduling problem

Bibliographic Details
Published in Applied soft computing Vol. 180; p. 113436
Main Authors Manafi, Ehsan, Domenech, Bruno, Tavakkoli-Moghaddam, Reza, Ranaboldo, Matteo
Format Journal Article
Language English
Published Elsevier B.V. 01.08.2025
ISSN 1568-4946
DOI 10.1016/j.asoc.2025.113436

More Information
Summary: A key research area in production systems is the development of advanced optimization algorithms for efficiently scheduling activities in manufacturing systems, which requires more sophisticated models with increased computational complexity. As a result, there is growing interest in improving the performance of meta-heuristics by incorporating reinforcement learning approaches. This paper deals with a dual-resource flexible job shop scheduling (DRFJSS) problem, in which each operation requires two resources, a reconfigurable machine tool (RMT) and a worker, to be processed. A mixed-integer linear programming (MILP) model is formulated to minimize the makespan. Since the proposed model cannot be solved to optimality for most medium-sized instances, a self-learning whale optimization algorithm (SLWOA) is developed to deal efficiently with this difficult problem. In the proposed SLWOA, an agent is trained by the state–action–reward–state–action (SARSA) algorithm to balance exploration and exploitation. The results show that the SLWOA has a stronger global search ability and a faster convergence speed than the original whale optimization algorithm.
• Studying dual-resource scheduling in shop floors with reconfigurable machine tools.
• Formulating a position-based MILP model for scheduling optimization.
• Proposing a self-learning whale algorithm for large instance problems.
• Designing states, actions, and rewards for reinforcement learning integration.
• Developing a variable neighbourhood search to improve the local search.
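
The summary states that an agent is trained with SARSA to balance exploration and exploitation, but this record does not give the state, action, or reward design. The following is only a minimal sketch of a generic tabular SARSA update in Python, not the authors' implementation. The action names ("explore", "exploit"), the state labels, the reward value, and the parameters alpha, gamma, and epsilon are all hypothetical placeholders chosen for illustration.

```python
import random
from collections import defaultdict

# Hypothetical action set: which kind of search move the optimizer applies next.
ACTIONS = ("explore", "exploit")


def epsilon_greedy(q, state, epsilon, rng):
    """Return a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def sarsa_update(q, s, a, reward, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy SARSA step: Q(s,a) += alpha * (r + gamma * Q(s',a') - Q(s,a))."""
    td_target = reward + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (td_target - q[(s, a)])
    return q[(s, a)]


# One illustrative transition with made-up state labels and reward.
q_table = defaultdict(float)          # unseen (state, action) pairs start at 0.0
rng = random.Random(0)
state = "stagnating"                  # hypothetical state label
action = epsilon_greedy(q_table, state, epsilon=0.2, rng=rng)
reward = 1.0                          # placeholder, e.g. the makespan improved
next_state = "improving"              # hypothetical state label
next_action = epsilon_greedy(q_table, next_state, epsilon=0.2, rng=rng)
sarsa_update(q_table, state, action, reward, next_state, next_action)
```

SARSA is on-policy: the update uses the action actually selected in the next state rather than the greedy maximum, which is what distinguishes it from Q-learning. In a metaheuristic setting, the selected action would typically decide which search operator is applied in the next iteration; how the paper maps this onto the whale optimization operators is not specified here.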