Recurrent Reinforcement Learning Strategy with a Parameterized Agent for Online Scheduling of a State Task Network Under Uncertainty

This study presents a framework for developing reinforcement learning hybrid agents that can build online schedules for state task networks under epistemic and aleatoric uncertainty. The hybrid agent can perform multiple discrete or continuous decisions at every time interval. To approach the uncert...

Bibliographic Details
Published in	Industrial & Engineering Chemistry Research Vol. 64; no. 13; pp. 7126-7140
Main Authors	Rangel-Martinez, Daniel; Ricardez-Sandoval, Luis A.
Format	Journal Article
Language	English
Published	American Chemical Society, 2 April 2025
Subjects	chemistry; hybrids; Process Systems Engineering; uncertainty
ISSN	0888-5885 1520-5045
DOI	10.1021/acs.iecr.4c04900

Abstract	This study presents a framework for developing hybrid reinforcement learning agents that build online schedules for state task networks under epistemic and aleatoric uncertainty. The hybrid agent can make multiple discrete or continuous decisions at every time interval. To address the uncertainty in the scheduling process, the hybrid agent is augmented with a set of LSTM layers that integrate a sequence of observations. This feature allows the agent to use previous information when making decisions as uncertainties are realized and propagate through the plant. The techniques required for efficient training oriented toward the objective function are also described. The method is implemented in two case studies to validate and test the agent under epistemic and aleatoric uncertainty, with a similar hybrid agent without recurrence used as a benchmark. The proposed hybrid agent accumulated larger rewards while minimizing the number of constraint violations in the process under uncertainty, which makes this online scheduling agent attractive for industrial-scale applications.
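Illustrative note	The abstract describes a hybrid agent whose LSTM layers integrate a sequence of observations before discrete and continuous decisions are made. The sketch below is only a rough illustration of that idea and is not the authors' implementation: the class name RecurrentHybridPolicy, the single LSTM cell without biases, the "task" and "batch fraction" outputs, and the random untrained weights are all assumptions introduced here. It uses only the Python standard library.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class RecurrentHybridPolicy:
    """Hypothetical sketch: one LSTM cell feeding a discrete and a continuous head."""

    def __init__(self, obs_dim, hidden_dim, n_tasks, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [observation, hidden state].
        self.gates = {g: rand_matrix(hidden_dim, obs_dim + hidden_dim, rng)
                      for g in "ifog"}
        # Assumed discrete head: which task to start in this interval.
        self.W_task = rand_matrix(n_tasks, hidden_dim, rng)
        # Assumed continuous head: batch size as a fraction of unit capacity.
        self.W_batch = rand_matrix(1, hidden_dim, rng)

    def step(self, obs, h, c):
        # Standard LSTM update (biases omitted for brevity).
        x = list(obs) + h
        i = [sigmoid(z) for z in matvec(self.gates["i"], x)]
        f = [sigmoid(z) for z in matvec(self.gates["f"], x)]
        o = [sigmoid(z) for z in matvec(self.gates["o"], x)]
        g = [math.tanh(z) for z in matvec(self.gates["g"], x)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c

    def act(self, obs_sequence):
        # Integrate the whole observation history into the hidden state.
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for obs in obs_sequence:
            h, c = self.step(obs, h, c)
        # Discrete decision: softmax over tasks, greedy pick.
        logits = matvec(self.W_task, h)
        m = max(logits)
        exps = [math.exp(z - m) for z in logits]
        total = sum(exps)
        probs = [e / total for e in exps]
        task = max(range(len(probs)), key=probs.__getitem__)
        # Continuous decision: squashed into the open interval (0, 1).
        batch_fraction = sigmoid(matvec(self.W_batch, h)[0])
        return task, batch_fraction, probs


policy = RecurrentHybridPolicy(obs_dim=3, hidden_dim=4, n_tasks=2, seed=1)
history = [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.4, 0.4, 0.2]]
task, batch_fraction, probs = policy.act(history)
```

In a trained agent the weights would be learned from rewards rather than drawn at random, and the benchmark mentioned in the abstract would correspond to dropping the recurrent state and acting on the latest observation only.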
Author	Rangel-Martinez, Daniel; Ricardez-Sandoval, Luis A.
Author_xml	– sequence: 1 givenname: Daniel surname: Rangel-Martinez fullname: Rangel-Martinez, Daniel – sequence: 2 givenname: Luis A. orcidid: 0000-0001-9867-6778 surname: Ricardez-Sandoval fullname: Ricardez-Sandoval, Luis A. email: laricard@uwaterloo.ca
ContentType	Journal Article
Copyright	2025 American Chemical Society
Copyright_xml	– notice: 2025 American Chemical Society
DBID	AAYXX CITATION 7S9 L.6
DOI	10.1021/acs.iecr.4c04900
DatabaseName	CrossRef AGRICOLA AGRICOLA - Academic
DatabaseTitle	CrossRef AGRICOLA AGRICOLA - Academic
DatabaseTitleList	AGRICOLA
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering Chemistry
EISSN	1520-5045
EndPage	7140
ExternalDocumentID	10_1021_acs_iecr_4c04900 a130066387
GroupedDBID	-~X .DC .K2 4.4 55A 5GY 5VS 6TJ 7~N AABXI ABMVS ABQRX ABUCX ACGFO ACJ ACS ADHLV AEESW AENEX AFEFF AGXLV AHGAQ ALMA_UNASSIGNED_HOLDINGS AQSVZ BAANH CS3 CUPRZ DU5 EBS ED~ F5P GGK GNL IH9 JG~ LG6 P2P ROL TAE TN5 UI2 VF5 VG9 W1F WH7 ~02 AAYXX ABBLG ABLBI CITATION 7S9 L.6
ID	FETCH-LOGICAL-a268t-5b9cb5f1d50e988759b47f4a19e461334d23b55510bcb2ac55f9ff134675dde3
IEDL.DBID	ACS
ISSN	0888-5885 1520-5045
IngestDate	Wed Jul 02 03:18:44 EDT 2025 Wed Oct 01 06:35:31 EDT 2025 Thu Apr 03 03:22:56 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	13
Language	English
License	https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 https://doi.org/10.15223/policy-045
LinkModel	DirectLink
MergedId	FETCHMERGED-LOGICAL-a268t-5b9cb5f1d50e988759b47f4a19e461334d23b55510bcb2ac55f9ff134675dde3
Notes	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23
ORCID	0000-0001-9867-6778
PQID	3200280198
PQPubID	24069
PageCount	15
ParticipantIDs	proquest_miscellaneous_3200280198 crossref_primary_10_1021_acs_iecr_4c04900 acs_journals_10_1021_acs_iecr_4c04900
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2025-04-02
PublicationDateYYYYMMDD	2025-04-02
PublicationDate_xml	– month: 04 year: 2025 text: 2025-04-02 day: 02
PublicationDecade	2020
PublicationTitle	Industrial & engineering chemistry research
PublicationTitleAlternate	Ind. Eng. Chem. Res
PublicationYear	2025
Publisher	American Chemical Society
Publisher_xml	– name: American Chemical Society
SSID	ssj0005544
Score	2.4783015
Snippet	This study presents a framework for developing reinforcement learning hybrid agents that can build online schedules for state task networks under epistemic and...
SourceID	proquest crossref acs
SourceType	Aggregation Database Index Database Publisher
StartPage	7126
SubjectTerms	chemistry; hybrids; Process Systems Engineering; uncertainty
Title	Recurrent Reinforcement Learning Strategy with a Parameterized Agent for Online Scheduling of a State Task Network Under Uncertainty
URI	http://dx.doi.org/10.1021/acs.iecr.4c04900 https://www.proquest.com/docview/3200280198
Volume	64
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVABC databaseName: American Chemical Society Journals customDbUrl: eissn: 1520-5045 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0005544 issn: 0888-5885 databaseCode: ACS dateStart: 19870101 isFulltext: true titleUrlDefault: https://pubs.acs.org/action/showPublications?display=journals providerName: American Chemical Society