Scalable multi-region perimeter metering control for urban networks: A multi-agent deep reinforcement learning approach

Bibliographic Details
Published in	Transportation research. Part C, Emerging technologies Vol. 148; p. 104033
Main Authors	Zhou, Dongqin; Gayah, Vikash V.
Format	Journal Article
Language	English
Published	Elsevier Ltd 01.03.2023
Subjects	Macroscopic Fundamental Diagram (MFD); Model-free multi-agent reinforcement learning (MARL); Multi-region perimeter metering control
ISSN	0968-090X 1879-2359
DOI	10.1016/j.trc.2023.104033

Abstract	• A model-free control scheme proposed for multi-region perimeter metering control
• Scalability demonstrated via numerical simulations on a seven-region urban network
• Learning and control efficacy illustrated via comparisons to the MPC method
• Resilience and transferability shown considering environment uncertainties
Perimeter metering control based on macroscopic fundamental diagrams has attracted increasing research interest over the past decade. This strategy provides a convenient way to mitigate urban congestion by manipulating vehicular movements across homogeneous regions without modeling the detailed behaviors and interactions of individual vehicles. In particular, multi-region perimeter metering control holds promise for efficient traffic management in large-scale urban networks. However, most existing methods for multi-region control require knowledge of either the environment traffic dynamics or network properties (i.e., the critical accumulations), whereas such information is generally difficult to obtain and subject to significant estimation errors. The recently developed model-free techniques, on the other hand, have not yet been shown to be scalable or applicable to large urban networks. To fill this gap, this paper proposes a scalable model-free scheme based on multi-agent deep reinforcement learning. The proposed scheme features value function decomposition in the paradigm of centralized training with decentralized execution, coupled with critical advances of single-agent deep reinforcement learning and problem reformulation guided by domain expertise. Comprehensive experimental results on a seven-region urban network suggest the scheme is: (a) effective, with consistent convergence to final control outcomes that are comparable to the model predictive control method; (b) resilient, with superior learning and control efficacy in the presence of inaccurate input information from the environment; and (c) transferable, with sufficient implementation prospects as well as real-time applicability to unencountered environments featuring increased uncertainty.
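As a rough illustration of the "value function decomposition" and "centralized training with decentralized execution" ideas named in the abstract, the sketch below uses a simple additive decomposition, in which the joint value is the sum of per-agent values. This is an assumption made for illustration only: the abstract does not state which decomposition the authors use, and the function names, the per-agent value tables, and the discount factor are all hypothetical.

```python
from typing import List

# Hypothetical setup: q_values[i][a] is agent i's estimated value of taking
# local action a, computed from agent i's own local observation only.


def greedy_actions(q_values: List[List[float]]) -> List[int]:
    """Decentralized execution: every agent picks its own best local action."""
    return [max(range(len(q)), key=q.__getitem__) for q in q_values]


def joint_value(q_values: List[List[float]], actions: List[int]) -> float:
    """Additive decomposition (an assumption): Q_tot = sum of per-agent values."""
    return sum(q[a] for q, a in zip(q_values, actions))


def td_target(reward: float,
              next_q_values: List[List[float]],
              gamma: float = 0.99) -> float:
    """Centralized training target built from a single shared team reward."""
    next_actions = greedy_actions(next_q_values)
    return reward + gamma * joint_value(next_q_values, next_actions)
```

Under an additive decomposition, the joint action formed from each agent's local greedy choice also maximizes the joint value, which is what allows execution to stay decentralized even though training uses the joint value.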
ArticleNumber	104033
Author	Zhou, Dongqin; Gayah, Vikash V.
Author_xml	– sequence: 1; givenname: Dongqin; surname: Zhou; fullname: Zhou, Dongqin; email: dongqin.zhou@psu.edu
– sequence: 2; givenname: Vikash V.; orcidid: 0000-0002-0648-3360; surname: Gayah; fullname: Gayah, Vikash V.; email: gayah@engr.psu.edu
ContentType	Journal Article
Copyright	2023 Elsevier Ltd
Discipline	Economics; Engineering
EISSN	1879-2359
ExternalDocumentID	10_1016_j_trc_2023_104033 S0968090X23000220
IsPeerReviewed	true
IsScholarly	true
Keywords	Multi-region perimeter metering control; Model-free multi-agent reinforcement learning (MARL); Macroscopic Fundamental Diagram (MFD)
ORCID	0000-0002-0648-3360
PublicationDate	March 2023 2023-03-00
Proc., ITSC – volume: 36 start-page: 480 year: 2013 end-page: 497 ident: b0340 article-title: Urban network gridlock: Theory, characteristics, and dynamics publication-title: Transp. Res. Part C Emerg. Technol. – reference: Terry, J.K., Grammel, N., Hari, A., Santos, L., 2020. Parameter Sharing is Surprisingly Useful for Multi-Agent Deep Reinforcement Learning. – reference: Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., van Hasselt, H., Silver, D., 2018. Distributed Prioritized Experience Replay. – volume: 80 start-page: 99 year: 2013 end-page: 118 ident: b0290 article-title: Estimating MFDs in Simple Networks with Route Choice publication-title: Procedia - Soc. Behav. Sci. – volume: 71 start-page: 184 year: 2016 end-page: 197 ident: b0015 article-title: Data fusion algorithm for macroscopic fundamental diagram estimation publication-title: Transp. Res. Part C Emerg. Technol. – volume: 41 start-page: 49 year: 2007 end-page: 62 ident: b0065 article-title: Urban gridlock: Macroscopic modeling and mitigation approaches publication-title: Transp. Res. Part B Methodol. – volume: 45 start-page: 643 year: 2011 end-page: 655 ident: b0115 article-title: Clockwise hysteresis loops in the Macroscopic Fundamental Diagram: An effect of network instability publication-title: Transp. Res. Part B Methodol. – volume: 114 start-page: 1 year: 2020 end-page: 19 ident: b0500 article-title: Evaluation of analytical approximation methods for the macroscopic fundamental diagram publication-title: Transp. Res. Part C Emerg. Technol. – year: 2016 ident: b0155 article-title: Deep Learning – volume: 10642 LNAI start-page: 66 year: 2017 end-page: 83 ident: b0160 article-title: Cooperative Multi-agent Control Using Deep Reinforcement Learning publication-title: Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. 
Notes Bioinformatics) – volume: 2018 start-page: 3215 year: 2017 end-page: 3222 ident: b0230 article-title: Rainbow: Combining Improvements in Deep Reinforcement Learning publication-title: 32nd AAAI Conf. Artif. Intell. AAAI – reference: Tan, M., 1993. Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, in: 10th International Conference on Machine Learning Proceedings. Elsevier, pp. 330–337. https://doi.org/10.1016/B978-1-55860-307-3.50049-6. – reference: Wen, Y., Yang, Y., Luo, R., Wang, J., Pan, W., 2019. Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning. 7th Int. Conf. Learn. Represent. https://doi.org/10.48550/arxiv.1901.09207. – volume: 46 start-page: 1639 year: 2012 end-page: 1656 ident: b0245 article-title: On the spatial partitioning of urban transportation networks publication-title: Transp. Res. Part B Methodol. – reference: Christianos, F., Papoudakis, G., Rahman, A., Albrecht, S. V., 2021. Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing, in: 38th International Conference on Machine Learning. https://doi.org/10.48550/arxiv.2102.07475. – volume: 46 start-page: 1393 year: 2012 end-page: 1403 ident: b0255 article-title: Exploiting the fundamental diagram of urban networks for feedback-based gating publication-title: Transp. Res. Part B Methodol. – volume: 1–12 year: 2019 ident: b0295 article-title: Data-Driven Model Free Adaptive Perimeter Control for Multi-Region Urban Traffic Networks With Route Choice publication-title: IEEE Trans. Intell. Transp. Syst. – reference: Geroliminis, N., Levinson, D.M., 2009. Cordon Pricing Consistent with the Physics of Overcrowding, Transportation and Traffic Theory 2009: Golden Jubilee. https://doi.org/10.1007/978-1-4419-0820-9_11. – volume: 124 year: 2021 ident: b0580 article-title: Model-free perimeter metering control for two-region urban networks using deep reinforcement learning publication-title: Transp. Res. Part C Emerg. Technol. 
– volume: 204 start-page: 148 year: 1979 end-page: 151 ident: b0220 article-title: A two-fluid approach to town traffic publication-title: Science (80-.) – volume: 118 start-page: 106 year: 2018 end-page: 123 ident: b0560 article-title: Hierarchical control of heterogeneous large-scale urban road networks via path assignment and regional route guidance publication-title: Transp. Res. Part B Methodol. – year: 2018 ident: b0415 article-title: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning publication-title: International Conference of Machine Learning – volume: 4 start-page: 26 year: 2012 end-page: 31 ident: b0495 article-title: Lecture 6.5-rmsprop Divide the Gradient by a Running Average of Its Recent Magnitude publication-title: COURSERA Neural Networks Mach. Learn. – start-page: 67 year: 1982 end-page: 70 ident: b0330 article-title: Scats: The Sydney coordinated adaptive traffic system - principles, methodology, algorithms publication-title: International Conference of Road Traffic Signal. – volume: 137 start-page: 133 year: 2020 end-page: 153 ident: b0205 article-title: Adaptive perimeter control for multi-region accumulation-based models with state delays publication-title: Transp. Res. Part B Methodol. – volume: 46 start-page: 1159 year: 2012 end-page: 1176 ident: b0185 article-title: On the stability of traffic perimeter control in two-region urban cities publication-title: Transp. Res. Part B Methodol. – volume: 32 start-page: 289 year: 2008 end-page: 353 ident: b0380 article-title: Optimal and Approximate Q-value Functions for Decentralized POMDPs publication-title: J. Artif. Intell. Res. – volume: 14 start-page: 348 year: 2013 end-page: 359 ident: b0140 article-title: Optimal perimeter control for two urban regions with macroscopic fundamental diagrams: A model predictive approach publication-title: IEEE Trans. Intell. Transp. Syst. 
– volume: 142 year: 2022 ident: b0040 article-title: Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics publication-title: Transp. Res. Part C Emerg. Technol. – volume: 36 start-page: 177 year: 2013 end-page: 195 ident: b0525 article-title: Max pressure control of a network of signalized intersections publication-title: Transp. Res. Part C Emerg. Technol. – volume: 55 start-page: 265 year: 2013 end-page: 281 ident: b0010 article-title: Perimeter and boundary flow control in multi-reservoir heterogeneous networks publication-title: Transp. Res. Part B Methodol. – volume: 20 start-page: 3224 year: 2019 end-page: 3234 ident: b0005 article-title: Analytical Optimal Solution of Perimeter Traffic Flow Control Based on MFD Dynamics: A Pontryagin’s Maximum Principle Approach publication-title: IEEE Trans. Intell. Transp. Syst. – volume: 103 start-page: 261 year: 2017 end-page: 285 ident: b0020 article-title: Optimal design of sustainable transit systems in congested urban networks: A macroscopic approach publication-title: Transp. Res. Part E Logist. Transp. Rev. – reference: Son, K., Kim, D., Kang, W.J., Hostallero, D., Yi, Y., 2019. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, in: 36th International Conference on Machine Learning. International Machine Learning Society (IMLS), pp. 5887–5896. https://doi.org/10.48550/arxiv.1905.05408. – volume: 23 start-page: 464 year: 2015 end-page: 478 ident: b0210 article-title: Optimal hybrid perimeter and switching plans control for urban traffic networks publication-title: IEEE Trans. Control Syst. Technol. – volume: 70 start-page: 255 year: 2014 end-page: 268 ident: b0120 article-title: On the impacts of locally adaptive signal control on urban network stability and the macroscopic fundamental diagram publication-title: Transp. Res. Part B Methodol. 
– volume: 45 start-page: 278 year: 2011 end-page: 288 ident: b0070 article-title: Macroscopic relations of urban traffic variables: Bifurcations, multivaluedness and instability publication-title: Transp. Res. Part B Methodol. – volume: 113 start-page: 164 year: 2020 end-page: 175 ident: b0375 article-title: City-wide traffic control: Modeling impacts of cordon queues publication-title: Transp. Res. Part C Emerg. Technol. – volume: 117 start-page: 687 year: 2018 end-page: 707 ident: b0570 article-title: Robust perimeter control for two urban regions with macroscopic fundamental diagrams: A control-Lyapunov function approach publication-title: Transp. Res. Part B Methodol. – year: 2016 ident: b0440 article-title: Prioritized experience replay publication-title: 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings. International Conference on Learning Representations – year: 2021 ident: b0400 article-title: FACMAC: Factored Multi-Agent Centralised Policy Gradients publication-title: In: The 35th Conference on Neural Information Processing Systems – reference: van Hasselt, H., Doron, Y., Strub, F., Hessel, M., Sonnerat, N., Modayil, J., 2018. Deep Reinforcement Learning and the Deadly Triad. – volume: 33 start-page: 74 year: 2013 end-page: 87 ident: b0260 article-title: Urban congestion gating control based on reduced operational network fundamental diagrams publication-title: Transp. Res. Part C Emerg. Technol. – volume: 126 year: 2021 ident: b0305 article-title: Robust perimeter control with cordon queues and heterogeneous transfer flows publication-title: Transp. Res. Part C Emerg. Technol. – volume: 1112 start-page: 78 year: 1987 end-page: 88 ident: b0550 article-title: Urban traffic network flow models publication-title: Transp. Res. Rec. 
– volume: 77 start-page: 495 year: 2017 end-page: 515 ident: b0190 article-title: Coordinated distributed adaptive perimeter control for large-scale urban road networks publication-title: Transp. Res. Part C Emerg. Technol. – volume: 68 start-page: 315 year: 2014 end-page: 332 ident: b0200 article-title: Robust perimeter control design for an urban region publication-title: Transp. Res. Part B Methodol. – volume: 45 start-page: 605 year: 2011 end-page: 617 ident: b0145 article-title: Properties of a well-defined macroscopic fundamental diagram for urban traffic publication-title: Transp. Res. Part B Methodol. – reference: Mahmassani, H., Herman, R., 1984. Dynamic User Equilibrium Departure Time and Route Choice on Idealized Traffic Arterials. 18, 362–384. https://doi.org/10.1287/TRSC.18.4.362. – volume: 109 year: 2021 ident: b0450 article-title: Stabilization of city-scale road traffic networks via macroscopic fundamental diagram-based model predictive perimeter control publication-title: Control Eng. Pract. – volume: 61 start-page: 134 year: 2017 end-page: 148 ident: b0170 article-title: Optimal coupled and decoupled perimeter control in one-region cities publication-title: Control Eng. Pract. – start-page: 2613 year: 2010 end-page: 2621 ident: b0510 article-title: Double Q-learning publication-title: Adv. Neural Inf. Proces. Syst. – reference: Choi, S., Yeung, D.Y., Zhang, N., 1999. An Environment Model for Nonstationary Reinforcement Learning, in: Advances in Neural Information Processing Systems 12. – reference: Iqbal, S., Sha, F., 2019. Actor-attention-critic for multi-agent reinforcement learning, in: 36th International Conference on Machine Learning, ICML 2019. International Machine Learning Society (IMLS), pp. 5261–5270. – reference: Foerster, J.N., Farquhar, G., Afouras, T., Nardelli, N., Whiteson, S., 2017. Counterfactual Multi-Agent Policy Gradients. 32nd AAAI Conf. Artif. Intell. AAAI 2018 2974–2982. https://doi.org/10.48550/arxiv.1705.08926. 
doi: 10.1016/j.trc.2022.103961 – start-page: 1 year: 2014 ident: 10.1016/j.trc.2023.104033_b0370 article-title: Accuracy of Networkwide Traffic States Estimated from Mobile Probe Data publication-title: Transp. Res. Rec. J. Transp. Res. Board doi: 10.3141/2421-01 – ident: 10.1016/j.trc.2023.104033_b0095 doi: 10.1609/aaai.v32i1.11794 – ident: 10.1016/j.trc.2023.104033_b0390 doi: 10.1080/21680566.2017.1337528 – volume: 10642 LNAI start-page: 66 year: 2017 ident: 10.1016/j.trc.2023.104033_b0160 article-title: Cooperative Multi-agent Control Using Deep Reinforcement Learning publication-title: Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics) – year: 2016 ident: 10.1016/j.trc.2023.104033_b0155 – year: 2021 ident: 10.1016/j.trc.2023.104033_b0530 article-title: Off-Policy Multi-Agent Decomposed Policy Gradients – volume: 105 start-page: 193 year: 2017 ident: 10.1016/j.trc.2023.104033_b0435 article-title: Dynamic clustering and propagation of congestion in heterogeneously congested urban traffic networks publication-title: Transp. Res. Part B Methodol. doi: 10.1016/j.trb.2017.08.021 – volume: 83 start-page: 120 year: 2017 ident: 10.1016/j.trc.2023.104033_b0060 article-title: Network traffic flow optimization under performance constraints publication-title: Transp. Res. Part C Emerg. Technol. doi: 10.1016/j.trc.2017.08.002 – volume: 116 year: 2020 ident: 10.1016/j.trc.2023.104033_b0470 article-title: Neuro-dynamic programming for optimal control of macroscopic fundamental diagram systems publication-title: Transp. Res. Part C Emerg. Technol. doi: 10.1016/j.trc.2020.102628 – ident: 10.1016/j.trc.2023.104033_b0235 – volume: 1–12 year: 2019 ident: 10.1016/j.trc.2023.104033_b0295 article-title: Data-Driven Model Free Adaptive Perimeter Control for Multi-Region Urban Traffic Networks With Route Choice publication-title: IEEE Trans. Intell. Transp. Syst.
StartPage	104033
SubjectTerms	Macroscopic Fundamental Diagram (MFD) Model-free multi-agent reinforcement learning (MARL) Multi-region perimeter metering control
Title	Scalable multi-region perimeter metering control for urban networks: A multi-agent deep reinforcement learning approach
URI	https://dx.doi.org/10.1016/j.trc.2023.104033
Volume	148