Recurrent Reinforcement Learning Strategy with a Parameterized Agent for Online Scheduling of a State Task Network Under Uncertainty
| Published in | Industrial & Engineering Chemistry Research, Vol. 64, no. 13, pp. 7126-7140 |
|---|---|
| Main Authors | Rangel-Martinez, Daniel; Ricardez-Sandoval, Luis A. |
| Format | Journal Article |
| Language | English |
| Published | American Chemical Society, 02.04.2025 |
| ISSN | 0888-5885 (print), 1520-5045 (electronic) |
| DOI | 10.1021/acs.iecr.4c04900 |
| Abstract | This study presents a framework for developing reinforcement learning hybrid agents that can build online schedules for state task networks under epistemic and aleatoric uncertainty. The hybrid agent can make multiple discrete or continuous decisions at every time interval. To handle uncertainty in the scheduling process, the hybrid agent is augmented with a set of LSTM layers that integrate a sequence of observations. This feature allows previous information to be considered when making decisions in view of the realization and propagation of uncertainties throughout the plant. Moreover, the techniques required for efficient training oriented toward the objective function are described. The method is implemented in two case studies for validation and testing of the agent subject to epistemic and aleatoric uncertainty. A similar hybrid agent without recurrence is used as a benchmark. The proposed hybrid agent accumulated larger rewards while minimizing the number of constraint violations in the process under uncertainty, making this online scheduling agent attractive for industrial-scale applications. |
    
|---|---|
| Author | Rangel-Martinez, Daniel; Ricardez-Sandoval, Luis A. |
    
| Author details | 1. Rangel-Martinez, Daniel; 2. Ricardez-Sandoval, Luis A. (ORCID: 0000-0001-9867-6778; email: laricard@uwaterloo.ca) |
    
| ContentType | Journal Article | 
    
| Copyright | 2025 American Chemical Society | 
    
| Discipline | Engineering Chemistry  | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| License | https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 https://doi.org/10.15223/policy-045  | 
    
| ORCID | 0000-0001-9867-6778 | 
    
| PageCount | 15 | 
    
| PublicationTitleAlternate | Ind. Eng. Chem. Res | 
    
| SubjectTerms | chemistry hybrids Process Systems Engineering uncertainty  | 
    