graphenv: a Python library for reinforcement learning on graph search spaces
Many important and challenging problems in combinatorial optimization (CO) can be expressed as graph search problems, in which graph vertices represent full or partial solutions and edges represent decisions that connect them. Graph structure not only introduces strong relational inductive biases fo...
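The graph-search framing in the abstract can be illustrated with a minimal sketch. It is plain Python and does not use the graphenv API. The class `TourVertex`, the function `greedy_rollout`, and the four-city distance table are hypothetical names and data introduced only for this illustration. Each vertex is a partial TSP tour, each edge is the decision of which city to visit next, and a greedy rule stands in for the learned RL policy.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical symmetric distances for a 4-city TSP instance (illustrative data only).
DISTANCES: Dict[Tuple[int, int], float] = {
    (0, 1): 1.0, (0, 2): 4.0, (0, 3): 3.0,
    (1, 2): 2.0, (1, 3): 5.0, (2, 3): 1.0,
}
NUM_CITIES = 4


def distance(a: int, b: int) -> float:
    """Look up the symmetric distance between two distinct cities."""
    return DISTANCES[(min(a, b), max(a, b))]


@dataclass(frozen=True)
class TourVertex:
    """A search-graph vertex: a partial or complete tour that starts at city 0."""
    tour: Tuple[int, ...] = (0,)

    @property
    def is_terminal(self) -> bool:
        # A full solution has visited every city exactly once.
        return len(self.tour) == NUM_CITIES

    def children(self) -> List["TourVertex"]:
        """Each outgoing edge is one decision: which unvisited city comes next."""
        if self.is_terminal:
            return []
        return [
            TourVertex(self.tour + (city,))
            for city in range(NUM_CITIES)
            if city not in self.tour
        ]

    def reward_to(self, child: "TourVertex") -> float:
        """Reward for following the edge to `child`: the negative travel distance,
        plus the return leg to the start city when the tour becomes complete."""
        step = -distance(self.tour[-1], child.tour[-1])
        if child.is_terminal:
            step -= distance(child.tour[-1], self.tour[0])
        return step


def greedy_rollout(root: TourVertex) -> Tuple[Tuple[int, ...], float]:
    """Walk the search graph, always taking the edge with the highest immediate reward.
    An RL agent would replace this greedy rule with a learned policy."""
    vertex, total = root, 0.0
    while not vertex.is_terminal:
        best = max(vertex.children(), key=vertex.reward_to)
        total += vertex.reward_to(best)
        vertex = best
    return vertex.tour, total


if __name__ == "__main__":
    print(greedy_rollout(TourVertex()))
```

Only the forward simulation (children and rewards) is written down here. No algebraic formulation of the tour constraints is needed, which is the "model-free" property the abstract describes.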
| Published in | Journal of open source software Vol. 7; no. 77; p. 4621 |
|---|---|
| Main Authors | Biagioni, David; Tripp, Charles Edison; Clark, Struan; Duplyakin, Dmitry; Law, Jeffrey; St. John, Peter C. |
| Format | Journal Article |
| Language | English |
| Published | United States: Open Source Initiative - NumFOCUS, 05.09.2022 |
| ISSN | 2475-9066 |
| DOI | 10.21105/joss.04621 |
| Abstract | Many important and challenging problems in combinatorial optimization (CO) can be expressed as graph search problems, in which graph vertices represent full or partial solutions and edges represent decisions that connect them. Graph structure not only introduces strong relational inductive biases for learning (Battaglia et al., 2018) - in this context, by providing a way to explicitly model the value of transitioning (along edges) between one search state (vertex) and the next - but also lends itself to problems both with and without clearly defined algebraic structure. For example, classic CO problems on graphs such as the Traveling Salesman Problem (TSP) can be expressed as either pure graph search or integer programs. Other problems, however, such as molecular optimization, do not have concise algebraic formulations and yet are readily implemented as a graph search (V. et al., 2022; Zhou et al., 2019). Such "model-free" problems constitute a large fraction of modern reinforcement learning (RL) research because it is often much easier to write a forward simulation that expresses all of the state transitions and rewards than to write down the precise mathematical expression of the full optimization problem. In the case of molecular optimization, for example, one can use domain knowledge alongside existing software libraries to model the effect of adding a single bond or atom to an existing but incomplete molecule, and let the RL algorithm build a model of how good a given decision is by "experiencing" the simulated environment many times. In contrast, a model-based mathematical formulation that fully expresses all the chemical and physical constraints is intractable. In recent years, RL has emerged as an effective paradigm for optimizing searches over graphs and has led to state-of-the-art heuristics for games like Go and chess, as well as for classical CO problems such as the TSP. 
This combination of graph search and RL, while powerful, requires non-trivial software to execute, especially when combining advanced state representations such as Graph Neural Networks (GNNs) with scalable RL algorithms. |
|---|---|
| Author | Duplyakin, Dmitry; Law, Jeffrey; Tripp, Charles Edison; Biagioni, David; Clark, Struan; St. John, Peter C. |
| Author_xml | – sequence: 1 givenname: David orcidid: 0000-0001-6140-1957 surname: Biagioni fullname: Biagioni, David – sequence: 2 givenname: Charles Edison orcidid: 0000-0002-5867-3561 surname: Tripp fullname: Tripp, Charles Edison – sequence: 3 givenname: Struan orcidid: 0000-0003-0078-6560 surname: Clark fullname: Clark, Struan – sequence: 4 givenname: Dmitry orcidid: 0000-0001-5132-0168 surname: Duplyakin fullname: Duplyakin, Dmitry – sequence: 5 givenname: Jeffrey orcidid: 0000-0003-2828-1273 surname: Law fullname: Law, Jeffrey – sequence: 6 givenname: Peter C. St orcidid: 0000-0002-7928-3722 surname: John fullname: John, Peter C. St |
| BackLink | https://www.osti.gov/servlets/purl/1886880$$D View this record in Osti.gov |
| CitedBy_id | crossref_primary_10_1021_jacsau_2c00540 |
| Cites_doi | 10.48550/arXiv.1606.01540 10.1038/s41598-019-47148-x 10.1016/j.patter.2021.100361 10.1038/s41467-020-16201-z 10.1038/s42256-022-00506-3 10.1039/d1sc02770k 10.1007/978-3-030-50426-7_33 10.48550/arXiv.1806.01261 10.48550/arXiv.2011.06069 10.1038/s41597-020-00588-x 10.48550/arXiv.1712.09381 |
| ContentType | Journal Article |
| CorporateAuthor | National Renewable Energy Lab. (NREL), Golden, CO (United States) |
| CorporateAuthor_xml | – name: National Renewable Energy Lab. (NREL), Golden, CO (United States) |
| DBID | AAYXX CITATION OIOZB OTOTI ADTOC UNPAY |
| DatabaseName | CrossRef OSTI.GOV - Hybrid OSTI.GOV Unpaywall for CDI: Periodical Content Unpaywall |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: UNPAY name: Unpaywall url: https://proxy.k.utb.cz/login?url=https://unpaywall.org/ sourceTypes: Open Access Repository |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 2475-9066 |
| ExternalDocumentID | 10.21105/joss.04621 1886880 10_21105_joss_04621 |
| GroupedDBID | AAFWJ AAYXX ADBBV AFPKN ALMA_UNASSIGNED_HOLDINGS BCNDV CITATION GROUPED_DOAJ M~E OK1 OIOZB OTOTI ADTOC UNPAY |
| IEDL.DBID | UNPAY |
| ISSN | 2475-9066 |
| IngestDate | Wed Oct 29 12:11:26 EDT 2025 Mon Sep 19 16:17:50 EDT 2022 Tue Jul 01 04:04:43 EDT 2025 Thu Apr 24 22:52:56 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 77 |
| License | http://creativecommons.org/licenses/by/4.0 cc-by |
| LinkModel | DirectLink |
| Notes | USDOE Advanced Research Projects Agency - Energy (ARPA-E) NREL/JA-2700-83459 AC36-08GO28308; AR0001205 |
| ORCID | 0000-0002-7928-3722 0000-0003-0078-6560 0000-0001-6140-1957 0000-0001-5132-0168 0000-0003-2828-1273 0000-0002-5867-3561 |
| OpenAccessLink | https://proxy.k.utb.cz/login?url=https://joss.theoj.org/papers/10.21105/joss.04621.pdf |
| ParticipantIDs | unpaywall_primary_10_21105_joss_04621 osti_scitechconnect_1886880 crossref_citationtrail_10_21105_joss_04621 crossref_primary_10_21105_joss_04621 |
| ProviderPackageCode | CITATION AAYXX |
| PublicationCentury | 2000 |
| PublicationDate | 2022-09-05 |
| PublicationDateYYYYMMDD | 2022-09-05 |
| PublicationDate_xml | – month: 09 year: 2022 text: 2022-09-05 day: 05 |
| PublicationDecade | 2020 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Journal of open source software |
| PublicationYear | 2022 |
| Publisher | Open Source Initiative - NumFOCUS |
| Publisher_xml | – name: Open Source Initiative - NumFOCUS |
| References | Battaglia (2018). Relational inductive biases, deep learning, and graph networks. arXiv preprint arXiv:1806.01261. doi:10.48550/arXiv.1806.01261; Brockman (2016). OpenAI Gym. arXiv preprint arXiv:1606.01540. doi:10.48550/arXiv.1606.01540; Liang (2018). RLlib: Abstractions for distributed reinforcement learning. International Conference on Machine Learning. doi:10.48550/arXiv.1712.09381; Pandey (2021). Predicting energy and stability of known and hypothetical crystals using graph neural network. Patterns, 2(11). doi:10.1016/j.patter.2021.100361; Prouvost (2020). Ecole: A gym-like library for machine learning in combinatorial optimization solvers. Learning Meets Combinatorial Algorithms at NeurIPS2020. doi:10.48550/arXiv.2011.06069; S. V. (2021). A quantitative metric for organic radical stability and persistence using thermodynamic and kinetic features. Chemical Science, 12(39). doi:10.1039/d1sc02770k; St. John (2020). Prediction of organic homolytic bond dissociation enthalpies at near chemical accuracy with sub-second computational cost. Nature Communications, 11(1). doi:10.1038/s41467-020-16201-z; St. John (2020). Quantum chemical calculations for over 200,000 organic radical species and 40,000 associated closed-shell molecules. Scientific Data, 7(1). doi:10.1038/s41597-020-00588-x; V. (2022). Multi-objective goal-directed optimization of de novo stable organic radicals for aqueous redox flow batteries. Nature Machine Intelligence. doi:10.1038/s42256-022-00506-3; Zheng (2020). OpenGraphGym: A parallel reinforcement learning framework for graph optimization problems. Lecture Notes in Computer Science. doi:10.1007/978-3-030-50426-7_33; Zhou (2019). Optimization of molecules via deep reinforcement learning. Scientific Reports, 9(1). doi:10.1038/s41598-019-47148-x |
| SourceID | unpaywall osti crossref |
| SourceType | Open Access Repository Enrichment Source Index Database |
| StartPage | 4621 |
| SubjectTerms | machine learning; MATHEMATICS AND COMPUTING; reinforcement learning; software engineering |
| Title | graphenv: a Python library for reinforcement learning on graph search spaces |
| URI | https://www.osti.gov/servlets/purl/1886880 https://joss.theoj.org/papers/10.21105/joss.04621.pdf |
| UnpaywallVersion | publishedVersion |
| Volume | 7 |
| journalDatabaseRights | – providerCode: PRVAON databaseName: DOAJ Directory of Open Access Journals customDbUrl: eissn: 2475-9066 dateEnd: 99991231 omitProxy: true ssIdentifier: ssj0001862611 issn: 2475-9066 databaseCode: DOA dateStart: 20160101 isFulltext: true titleUrlDefault: https://www.doaj.org/ providerName: Directory of Open Access Journals – providerCode: PRVHPJ databaseName: ROAD: Directory of Open Access Scholarly Resources customDbUrl: eissn: 2475-9066 dateEnd: 99991231 omitProxy: true ssIdentifier: ssj0001862611 issn: 2475-9066 databaseCode: M~E dateStart: 20160101 isFulltext: true titleUrlDefault: https://road.issn.org providerName: ISSN International Centre |