Scalable multi-region perimeter metering control for urban networks: A multi-agent deep reinforcement learning approach

Bibliographic Details
Published in Transportation research. Part C, Emerging technologies Vol. 148; p. 104033
Main Authors Zhou, Dongqin, Gayah, Vikash V.
Format Journal Article
Language English
Published Elsevier Ltd 01.03.2023
ISSN 0968-090X
1879-2359
DOI 10.1016/j.trc.2023.104033

Abstract
• A model-free control scheme proposed for multi-region perimeter metering control
• Scalability demonstrated via numerical simulations on a seven-region urban network
• Learning and control efficacy illustrated via comparisons to the MPC method
• Resilience and transferability shown considering environment uncertainties

Perimeter metering control based on macroscopic fundamental diagrams has attracted increasing research interest over the past decade. This strategy provides a convenient way to mitigate urban congestion by manipulating vehicular movements across homogeneous regions without modeling the detailed behaviors and interactions of individual vehicles. In particular, multi-region perimeter metering control holds promise for efficient traffic management in large-scale urban networks. However, most existing methods for multi-region control require knowledge of either the environment traffic dynamics or network properties (i.e., the critical accumulations), whereas such information is generally difficult to obtain and subject to significant estimation errors. The recently developed model-free techniques, on the other hand, have not yet been shown to be scalable or applicable to large urban networks. To fill this gap, this paper proposes a scalable model-free scheme based on multi-agent deep reinforcement learning. The proposed scheme features value function decomposition in the paradigm of centralized training with decentralized execution, coupled with critical advances in single-agent deep reinforcement learning and problem reformulation guided by domain expertise.
Comprehensive experimental results on a seven-region urban network suggest the scheme is: (a) effective, with consistent convergence to final control outcomes that are comparable to those of the model predictive control method; (b) resilient, with superior learning and control efficacy in the presence of inaccurate input information from the environment; and (c) transferable, with good implementation prospects and real-time applicability to unencountered environments featuring increased uncertainty.
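
The idea of value function decomposition under centralized training with decentralized execution can be illustrated with a small sketch. The abstract does not state which decomposition the authors use, so the code below assumes the simplest additive form (in the spirit of value-decomposition networks), replaces deep networks with lookup tables, and uses hypothetical names throughout (`RegionAgent`, `joint_q`, `centralized_update`, integer-coded observations and metering actions). It is an illustration of the general technique, not the paper's implementation.

```python
class RegionAgent:
    """One agent per region, holding a tabular utility Q_i(observation, action).

    Hypothetical stand-in for a per-agent deep Q-network.
    """

    def __init__(self, n_obs, n_actions):
        self.q = [[0.0] * n_actions for _ in range(n_obs)]

    def greedy(self, obs):
        # Decentralized execution: the agent picks its action from its own
        # local observation only, with no access to other agents' tables.
        row = self.q[obs]
        return max(range(len(row)), key=row.__getitem__)


def joint_q(agents, obs, acts):
    # Assumed additive decomposition: Q_tot = sum_i Q_i(o_i, u_i).
    return sum(a.q[o][u] for a, o, u in zip(agents, obs, acts))


def centralized_update(agents, obs, acts, reward, next_obs, lr=0.1, gamma=0.9):
    """One centralized TD step on the shared (team) reward.

    With an additive decomposition, the maximum of Q_tot over joint actions
    equals the sum of the per-agent maxima, so the target stays cheap to
    compute as the number of regions grows.
    """
    target = reward + gamma * sum(max(a.q[o2]) for a, o2 in zip(agents, next_obs))
    td_error = target - joint_q(agents, obs, acts)
    # The gradient of the squared TD error w.r.t. each Q_i entry is the same
    # td_error, so every agent's chosen entry moves by lr * td_error.
    for a, o, u in zip(agents, obs, acts):
        a.q[o][u] += lr * td_error
    return td_error
```

A training loop would call `centralized_update` with the joint transition of all seven regions, while a deployed controller would only call `agent.greedy(local_obs)` for each region. Because the joint value is a sum, the tuple of per-agent greedy actions is also the greedy joint action, which is what makes decentralized execution consistent with centralized training in this assumed form.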
ArticleNumber 104033
Author Zhou, Dongqin
Gayah, Vikash V.
Author_xml – sequence: 1
  givenname: Dongqin
  surname: Zhou
  fullname: Zhou, Dongqin
  email: dongqin.zhou@psu.edu
– sequence: 2
  givenname: Vikash V.
  orcidid: 0000-0002-0648-3360
  surname: Gayah
  fullname: Gayah, Vikash V.
  email: gayah@engr.psu.edu
ContentType Journal Article
Copyright 2023 Elsevier Ltd
Discipline Economics
Engineering
EISSN 1879-2359
IsScholarly true
Keywords Multi-region perimeter metering control
Model-free multi-agent reinforcement learning (MARL)
Macroscopic Fundamental Diagram (MFD)
References Haddad, Mirkin (b0190) 2017; 77
Keyvan-Ekbatani, Papageorgiou, Knoop (b0265) 2015; 59
Rashid, T., Farquhar, G., Peng, B., Whiteson, S., 2020. Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, in: Advances in Neural Information Processing Systems. Neural information processing systems foundation, pp. 10199–10210. https://doi.org/10.48550/arxiv.2006.10800.
Saeedmanesh, Geroliminis (b0430) 2016; 91
Lopez, Krishnakumari, Leclercq, Chiabaut, van Lint (b0320) 2017; 2623
Yildirimoglu, Ramezani, Geroliminis (b0555) 2015; 59
Lei, Hou, Ren (b0295) 2019; 1–12
Son, K., Kim, D., Kang, W.J., Hostallero, D., Yi, Y., 2019. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, in: 36th International Conference on Machine Learning. International Machine Learning Society (IMLS), pp. 5887–5896. https://doi.org/10.48550/arxiv.1905.05408.
Wang, Schaul, Hessel, van Hasselt, Lanctot, de Freitas (b0535) 2015; ICML 2016 4
Goodfellow, Bengio, Courville (b0155) 2016
Choi, S., Yeung, D.Y., Zhang, N., 1999. An Environment Model for Nonstationary Reinforcement Learning, in: Advances in Neural Information Processing Systems 12.
Sirmatel, Tsitsokas, Kouvelas, Geroliminis (b0455) 2021; 128
Yildirimoglu, Sirmatel, Geroliminis (b0560) 2018; 118
Saeedmanesh, Geroliminis (b0435) 2017; 105
Hernandez-Leal, Kartal, Taylor (b0225) 2018; 33
Tsitsiklis, Roy (b0505) 1997
Ni, Cassidy (b0375) 2020; 113
Gupta, Egorov, Kochenderfer (b0160) 2017; 10642 LNAI
Geroliminis, Sun (b0145) 2011; 45
Hajiahmadi, Haddad, De Schutter, Geroliminis (b0210) 2015; 23
Daganzo, Gayah, Gonzales (b0070) 2011; 45
Keyvan-Ekbatani, Yildirimoglu, Geroliminis, Papageorgiou (b0270) 2015; 16
Aboudolas, Geroliminis (b0010) 2013; 55
Chen, Huang, Lam, Pan, Hsu, Sumalee, Zhong (b0040) 2022; 142
Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., van Hasselt, H., Silver, D., 2018. Distributed Prioritized Experience Replay.
Gayah, Daganzo (b0115) 2011; 45
Gayah, Gao, Nagle (b0120) 2014; 70
Lowrie (b0330) 1982
Fu, Wang, Tang, Zheng, Geroliminis (b0100) 2020; 118
Sirmatel, Geroliminis (b0445) 2018; 19
Chang, Y.H., Ho, T., Kaelbling, L., 2003. All learning is local: Multi-agent learning in global reward games, in: Advances in Neural Information Processing Systems 16.
Ren, Hou, Sirmatel, Geroliminis (b0420) 2020; 115
Geroliminis, N., Levinson, D.M., 2009. Cordon Pricing Consistent with the Physics of Overcrowding, Transportation and Traffic Theory 2009: Golden Jubilee. https://doi.org/10.1007/978-1-4419-0820-9_11.
Godfrey (b0150) 1969; 11
Geroliminis, Haddad, Ramezani (b0140) 2013; 14
Leclercq, Geroliminis (b0290) 2013; 80
Zhong, Chen, Huang, Sumalee, Lam, Xu (b0570) 2018; 117
Zheng, Waraich, Axhausen, Geroliminis (b0565) 2012; 46
Hessel, Modayil, van Hasselt, Schaul, Ostrovski, Dabney, Horgan, Piot, Azar, Silver (b0230) 2017; 2018
Aalipour, Kebriaei, Ramezani (b0005) 2019; 20
Haddad, J., Ramezani, M., Geroliminis, N., 2012. Model predictive perimeter control for urban areas with macroscopic fundamental diagrams, in: Proceedings of the American Control Conference. pp. 5757–5762. https://doi.org/10.1109/acc.2012.6314693.
Rashid, Samvelyan, de Witt, Farquhar, Foerster, Whiteson (b0415) 2018
van Hasselt, Guez, Silver (b0520) 2015; 2016
Araghi, Khosravi, Johnstone, Creighton (b0025) 2013
Mahmassani, H., Herman, R., 1984. Dynamic User Equilibrium Departure Time and Route Choice on Idealized Traffic Arterials. 18, 362–384. https://doi.org/10.1287/TRSC.18.4.362.
Sunehag, P., Lever, G., Gruslys, A., Marian Czarnecki, W., Zambaldi, V., Jaderberg, M., Lanctot, M., Sonnerat, N., Leibo, J.Z., Tuyls, K., Graepel, T., 2018. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, in: 17th International Conference on Autonomous Agents and MultiAgent Systems. pp. 2085–2087. https://doi.org/10.5555/3237383.3238080.
Haddad, Geroliminis (b0185) 2012; 46
Ji, Geroliminis (b0245) 2012; 46
Du, Rakha, Gayah (b0090) 2016; 66
Gao, Shirley, Gayah (b0105) 2018; 117
Ambühl, Menendez (b0015) 2016; 71
Daganzo, Lehe (b0075) 2015; 75
Laval, Castrillón (b0285) 2015; 81
Ramezani, Haddad, Geroliminis (b0405) 2015; 74
Chu, X., Ye, H., 2017. Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning.
OroojlooyJadid, A., Hajinezhad, D., 2019. A Review of Cooperative Multi-Agent Deep Reinforcement Learning.
Haddad, Shraiber (b0200) 2014; 68
Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., Mordatch, I., 2017. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. Adv. Neural Inf. Process. Syst. 2017-Decem, 6380–6391.
Schaul, Quan, Antonoglou, Silver (b0440) 2016
Williams, Mahmassani, Herman (b0550) 1987; 1112
Oliehoek, Spaan, Vlassis (b0380) 2008; 32
Moshahedi, Kattan (b0365) 2023; 146
Jin, C., Allen-Zhu, Z., Bubeck, S., Jordan, M.I., 2018. Is Q-Learning Provably Efficient?, in: Advances in Neural Information Processing Systems, 31.
Keyvan-Ekbatani, Kouvelas, Papamichail, Papageorgiou (b0255) 2012; 46
Peng, Rashid, de Witt, Kamienny, Torr, Böhmer, Whiteson (b0400) 2021
Paipuri, Xu, González, Leclercq (b0395) 2020; 118
Iqbal, S., Sha, F., 2019. Actor-attention-critic for multi-agent reinforcement learning, in: 36th International Conference on Machine Learning, ICML 2019. International Machine Learning Society (IMLS), pp. 5261–5270.
Small, Chu (b0460) 2003; 37
Haddad (b0175) 2017; 96
Foerster, J.N., Farquhar, G., Afouras, T., Nardelli, N., Whiteson, S., 2017. Counterfactual Multi-Agent Policy Gradients. 32nd AAAI Conf. Artif. Intell. AAAI 2018 2974–2982. https://doi.org/10.48550/arxiv.1705.08926.
Mnih, Kavukcuoglu, Silver, Rusu, Veness, Bellemare, Graves, Riedmiller, Fidjeland, Ostrovski, Petersen, Beattie, Sadik, Antonoglou, King, Kumaran, Wierstra, Legg, Hassabis (b0355) 2015; 518
Sutton, Barto (b0480) 2018
Csikós, Charalambous, Farhadi, Kulcsár, Wymeersch (b0060) 2017; 83
Varaiya (b0525) 2013; 36
Wen, Y., Yang, Y., Luo, R., Wang, J., Pan, W., 2019. Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning. 7th Int. Conf. Learn. Represent. https://doi.org/10.48550/arxiv.1901.09207.
Daganzo, Lehe (b0080) 2016; 90
Christianos, F., Papoudakis, G., Rahman, A., Albrecht, S. V., 2021. Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing, in: 38th International Conference on Machine Learning. https://doi.org/10.48550/arxiv.2102.07475.
Menelaou, Timotheou, Kolios, Panayiotou (b0350) 2021
Mazloumian, A., Geroliminis, N., Helbing, D., 2010. The spatial variability of vehicle densities as determinant of urban network capacity 368, 4627–4647. https://doi.org/10.1098/rsta.2010.0099.
Tan, M., 1993. Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, in: 10th International Conference on Machine Learning Proceedings. Elsevier, pp. 330–337. https://doi.org/10.1016/B978-1-55860-307-3.50049-6.
Sirmatel, Geroliminis (b0450) 2021; 109
Keyvan-Ekbatani, Papageorgiou, Papamichail (b0260) 2013; 33
van Hasselt, H., Doron, Y., Strub, F., Hessel, M., Sonnerat, N., Modayil, J., 2018. Deep Reinforcement Learning and the Deadly Triad.
Gayah, V., Daganzo, C., 2012. Analytical Capacity Comparison of One-Way and Two-Way Signalized Street Networks 2301, 76–85. https://doi.org/10.3141/2301-09.
DePrator, Hitchcock, Gayah (b0085) 2017; 2622
Henderson, Islam, Bachman, Pineau, Precup, Meger (b0215) 2017; 2018
Haddad, Zheng (b0205) 2020; 137
Li, Ramezani (b0300) 2022; 145
Tieleman, Hinton (b0495) 2012; 4
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D., 2016. Continuous control with deep reinforcement learning, in: 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings. International Conference on Learning Representations, ICLR.
Genser, Kouvelas (b0125) 2022; 134
Robertson, Bretherton (b0425) 1991; 40
Mahmassani, Saberi, Zockaie (b0340) 2013; 36
Ortigosa, J., Gayah, V. V., Menendez, M., 2017. Analysis of one-way and two-way street configurations on urban grid networks. 7, 61–81. https://doi.org/10.1080/21680566.2017.1337528.
Daganzo (b0065) 2007; 41
Geroliminis, Daganzo (b0135) 2008; 42
Haddad (b0170) 2017; 61
Tilg, Amini, Busch (b0500) 2020; 114
Herman, Prigogine (b0220) 1979; 204
Watkins, Dayan (b0540) 1992; 8
Lin (b0315) 1992; 8
Buisson, Ladier (b0030) 2009; 2124
Su, Chow, Zheng, Huang, Liang, Zhong (b0470) 2020; 116
Zhong, Huang, Chen, Lam, Xu, Sumalee (b0575) 2018; 111
Zhou, Gayah (b0580) 2021; 124
van Hasselt (b0510) 2010
Nagle, Gayah (b0370) 2014
Li, Yildirimoglu, Ramezani (b0305) 2021; 126
Terry, J.K., Grammel, N., Hari, A., Santos, L., 2020. Parameter Sharing is Surprisingly Useful for Multi-Agent Deep Reinforcement Learning.
Mohajerpoor, Saberi, Vu, Garoni, Ramezani (b0360) 2020; 137
Amirgholy, Shahabi, Gao (b0020) 2017; 103
Wang, Han, Wang, Dong, Zhang (b0530) 2021
Haddad, Ramezani, Geroliminis (b0195) 2013; 54
Koller, D., Parr, R., 1999. Computing factored value functions for policies in structured MDPs, in: 16th International Joint Conference on Artificial Intelligence. pp. 1332–1339.
Haddad (b0165) 2015; 59
Lauer, M., Riedmiller, M.A., 2000. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems, in: 17th International Conference on Machine Learning. pp. 535–542.
References_xml – reference: Chang, Y.H., Ho, T., Kaelbling, L., 2003. All learning is local: Multi-agent learning in global reward games, in: Advances in Neural Information Processing Systems 16.
– volume: 42
  start-page: 759
  year: 2008
  end-page: 770
  ident: b0135
  article-title: Existence of urban-scale macroscopic fundamental diagrams: Some experimental findings
  publication-title: Transp. Res. Part B Methodol.
– volume: 118
  year: 2020
  ident: b0395
  article-title: Estimating MFDs, trip lengths and path flow distributions in a multi-region setting using mobile phone data
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 128
  year: 2021
  ident: b0455
  article-title: Modeling, estimation, and control in large-scale urban road networks with remaining travel distance dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
– reference: Ortigosa, J., Gayah, V. V., Menendez, M., 2017. Analysis of one-way and two-way street configurations on urban grid networks. 7, 61–81. https://doi.org/10.1080/21680566.2017.1337528.
– start-page: 1
  year: 2014
  end-page: 11
  ident: b0370
  article-title: Accuracy of Networkwide Traffic States Estimated from Mobile Probe Data
  publication-title: Transp. Res. Rec. J. Transp. Res. Board
– volume: 59
  start-page: 404
  year: 2015
  end-page: 420
  ident: b0555
  article-title: Equilibrium analysis and route guidance in large-scale networks with MFD dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 8
  start-page: 279
  year: 1992
  end-page: 292
  ident: b0540
  article-title: Q-learning
  publication-title: Mach. Learn.
– volume: 145
  year: 2022
  ident: b0300
  article-title: Quasi revenue-neutral congestion pricing in cities: Crediting drivers to avoid city centers
  publication-title: Transp. Res. Part C Emerg. Technol.
– reference: Rashid, T., Farquhar, G., Peng, B., Whiteson, S., 2020. Weighted QMIX: Expanding Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning, in: Advances in Neural Information Processing Systems. Neural information processing systems foundation, pp. 10199–10210. https://doi.org/10.48550/arxiv.2006.10800.
– volume: 116
  year: 2020
  ident: b0470
  article-title: Neuro-dynamic programming for optimal control of macroscopic fundamental diagram systems
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 2018
  start-page: 3207
  year: 2017
  end-page: 3214
  ident: b0215
  article-title: Deep Reinforcement Learning that Matters
  publication-title: 32nd AAAI Conf. Artif. Intell. AAAI
– volume: 59
  start-page: 308
  year: 2015
  end-page: 322
  ident: b0265
  article-title: Controller design for gating traffic control in presence of time-delay in urban road networks
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 40
  start-page: 11
  year: 1991
  end-page: 15
  ident: b0425
  article-title: Optimizing Networks of Traffic Signals in Real Time—The SCOOT Method
  publication-title: IEEE Trans. Veh. Technol.
– volume: 146
  year: 2023
  ident: b0365
  article-title: Alpha-fair large-scale urban network control: A perimeter control based on a macroscopic fundamental diagram
  publication-title: Transp. Res. Part C Emerg. Technol.
– reference: Koller, D., Parr, R., 1999. Computing factored value functions for policies in structured MDPs, in: 16th International Joint Conference on Artificial Intelligence. pp. 1332–1339.
– reference: Sunehag, P., Lever, G., Gruslys, A., Marian Czarnecki, W., Zambaldi, V., Jaderberg, M., Lanctot, M., Sonnerat, N., Leibo, J.Z., Tuyls, K., Graepel, T., 2018. Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward, in: 17th International Conference on Autonomous Agents and MultiAgent Systems. pp. 2085–2087. https://doi.org/10.5555/3237383.3238080.
– volume: 33
  start-page: 750
  year: 2018
  end-page: 797
  ident: b0225
  article-title: A Survey and Critique of Multiagent Deep Reinforcement Learning
  publication-title: Auton. Agent. Multi. Agent. Syst.
– volume: 137
  start-page: 47
  year: 2020
  end-page: 73
  ident: b0360
  article-title: H∞ robust perimeter flow control in urban networks with partial information feedback
  publication-title: Transp. Res. Part B Methodol.
– volume: 54
  start-page: 17
  year: 2013
  end-page: 36
  ident: b0195
  article-title: Cooperative traffic control of a mixed network with two urban regions and a freeway
  publication-title: Transp. Res. Part B Methodol.
– year: 2021
  ident: b0530
  article-title: Off-Policy Multi-Agent Decomposed Policy Gradients
  publication-title: International Conference on Learning Representations
– volume: 83
  start-page: 120
  year: 2017
  end-page: 133
  ident: b0060
  article-title: Network traffic flow optimization under performance constraints
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 2623
  start-page: 98
  year: 2017
  end-page: 107
  ident: b0320
  article-title: Spatiotemporal Partitioning of Transportation Network Using Travel Time Data
  publication-title: Transp. Res. Rec. J. Transp. Res. Board
– reference: OroojlooyJadid, A., Hajinezhad, D., 2019. A Review of Cooperative Multi-Agent Deep Reinforcement Learning.
– year: 1997
  ident: b0505
  article-title: An Analysis of Temporal-Difference Learning with Function Approximation
  publication-title: IEEE Trans. Autom. Control
– volume: 37
  start-page: 319
  year: 2003
  end-page: 352
  ident: b0460
  article-title: Hypercongestion. J. Transp. Econ
  publication-title: Policy
– year: 2018
  ident: b0480
  article-title: Reinforcement learning: An introduction
– reference: Jin, C., Allen-Zhu, Z., Bubeck, S., Jordan, M.I., 2018. Is Q-Learning Provably Efficient?, in: Advances in Neural Information Processing Systems, 31.
– reference: Mazloumian, A., Geroliminis, N., Helbing, D., 2010. The spatial variability of vehicle densities as determinant of urban network capacity 368, 4627–4647. https://doi.org/10.1098/rsta.2010.0099.
– volume: 46
  start-page: 1291
  year: 2012
  end-page: 1303
  ident: b0565
  article-title: A dynamic cordon pricing scheme combining the Macroscopic Fundamental Diagram and an agent-based traffic model
  publication-title: Transp. Res. Part A Policy Pract.
– volume: 118
  year: 2020
  ident: b0100
  article-title: Empirical analysis of large-scale multimodal traffic with multi-sensor data
  publication-title: Transp. Res. Part C Emerg. Technol.
– reference: Lauer, M., Riedmiller, M.A., 2000. An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems, in: 17th International Conference on Machine Learning. pp. 535–542.
– volume: 105
  start-page: 193
  year: 2017
  end-page: 211
  ident: b0435
  article-title: Dynamic clustering and propagation of congestion in heterogeneously congested urban traffic networks
  publication-title: Transp. Res. Part B Methodol.
– volume: 91
  start-page: 250
  year: 2016
  end-page: 269
  ident: b0430
  article-title: Clustering of heterogeneous networks with directional flows based on “Snake” similarities
  publication-title: Transp. Res. Part B Methodol.
– volume: 518
  start-page: 529
  year: 2015
  end-page: 533
  ident: b0355
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
– volume: 115
  year: 2020
  ident: b0420
  article-title: Data driven model free adaptive iterative learning perimeter control for large-scale urban road networks
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 19
  start-page: 1112
  year: 2018
  end-page: 1121
  ident: b0445
  article-title: Economic Model Predictive Control of Large-Scale Urban Road Networks via Perimeter Control and Regional Route Guidance
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 75
  start-page: 89
  year: 2015
  end-page: 99
  ident: b0075
  article-title: Distance-dependent congestion pricing for downtown zones
  publication-title: Transp. Res. Part B Methodol.
– reference: Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D., 2016. Continuous control with deep reinforcement learning, in: 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings.
– volume: 134
  year: 2022
  ident: b0125
  article-title: Dynamic optimal congestion pricing in multi-region urban networks by application of a Multi-Layer-Neural network
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 90
  start-page: 56
  year: 2016
  end-page: 69
  ident: b0080
  article-title: Traffic flow on signalized streets
  publication-title: Transp. Res. Part B Methodol.
– volume: 111
  start-page: 327
  year: 2018
  end-page: 355
  ident: b0575
  article-title: Boundary conditions and behavior of the macroscopic fundamental diagram based network traffic dynamics: A control systems perspective
  publication-title: Transp. Res. Part B Methodol.
– reference: Gayah, V., Daganzo, C., 2012. Analytical Capacity Comparison of One-Way and Two-Way Signalized Street Networks. Transp. Res. Rec. 2301, 76–85. https://doi.org/10.3141/2301-09.
– volume: 96
  start-page: 1
  year: 2017
  end-page: 25
  ident: b0175
  article-title: Optimal perimeter control synthesis for two urban regions with aggregate boundary queue dynamics
  publication-title: Transp. Res. Part B Methodol.
– volume: 81
  start-page: 904
  year: 2015
  end-page: 916
  ident: b0285
  article-title: Stochastic approximations for the macroscopic fundamental diagram of urban networks
  publication-title: Transp. Res. Part B Methodol.
– volume: 8
  start-page: 293
  year: 1992
  end-page: 321
  ident: b0315
  article-title: Self-improving reactive agents based on reinforcement learning, planning and teaching
  publication-title: Mach. Learn.
– volume: 74
  start-page: 1
  year: 2015
  end-page: 19
  ident: b0405
  article-title: Dynamics of heterogeneity in urban networks: Aggregated traffic modeling and hierarchical control
  publication-title: Transp. Res. Part B Methodol.
– volume: 11
  start-page: 323
  year: 1969
  end-page: 327
  ident: b0150
  article-title: The mechanism of a road network
  publication-title: Traffic Eng. Control
– reference: Lowe, R., Wu, Y., Tamar, A., Harb, J., Abbeel, P., Mordatch, I., 2017. Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments. Adv. Neural Inf. Process. Syst. 2017-December, 6380–6391.
– reference: Haddad, J., Ramezani, M., Geroliminis, N., 2012. Model predictive perimeter control for urban areas with macroscopic fundamental diagrams, in: Proceedings of the American Control Conference. pp. 5757–5762. https://doi.org/10.1109/acc.2012.6314693.
– reference: Chu, X., Ye, H., 2017. Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning.
– volume: 16
  start-page: 2141
  year: 2015
  end-page: 2154
  ident: b0270
  article-title: Multiple concentric gating traffic control in large-scale urban networks
  publication-title: IEEE Trans. Intell. Transp. Syst.
– year: 2021
  ident: b0350
  article-title: Joint Route Guidance and Demand Management for Real-Time Control of Multi-Regional Traffic Networks
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 66
  start-page: 136
  year: 2016
  end-page: 149
  ident: b0090
  article-title: Deriving macroscopic fundamental diagrams from probe data: Issues and proposed solutions
  publication-title: Transp. Res. Part C Emerg. Technol.
– start-page: 1261
  year: 2013
  end-page: 1265
  ident: b0025
  article-title: Q-learning method for controlling traffic signal phase time in a single intersection
  publication-title: IEEE Conf. Intell. Transp. Syst. Proc., ITSC
– volume: 36
  start-page: 480
  year: 2013
  end-page: 497
  ident: b0340
  article-title: Urban network gridlock: Theory, characteristics, and dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
– reference: Terry, J.K., Grammel, N., Hari, A., Santos, L., 2020. Parameter Sharing is Surprisingly Useful for Multi-Agent Deep Reinforcement Learning.
– reference: Horgan, D., Quan, J., Budden, D., Barth-Maron, G., Hessel, M., van Hasselt, H., Silver, D., 2018. Distributed Prioritized Experience Replay.
– volume: 80
  start-page: 99
  year: 2013
  end-page: 118
  ident: b0290
  article-title: Estimating MFDs in Simple Networks with Route Choice
  publication-title: Procedia - Soc. Behav. Sci.
– volume: 71
  start-page: 184
  year: 2016
  end-page: 197
  ident: b0015
  article-title: Data fusion algorithm for macroscopic fundamental diagram estimation
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 41
  start-page: 49
  year: 2007
  end-page: 62
  ident: b0065
  article-title: Urban gridlock: Macroscopic modeling and mitigation approaches
  publication-title: Transp. Res. Part B Methodol.
– volume: 45
  start-page: 643
  year: 2011
  end-page: 655
  ident: b0115
  article-title: Clockwise hysteresis loops in the Macroscopic Fundamental Diagram: An effect of network instability
  publication-title: Transp. Res. Part B Methodol.
– volume: 114
  start-page: 1
  year: 2020
  end-page: 19
  ident: b0500
  article-title: Evaluation of analytical approximation methods for the macroscopic fundamental diagram
  publication-title: Transp. Res. Part C Emerg. Technol.
– year: 2016
  ident: b0155
  article-title: Deep Learning
– volume: 10642 LNAI
  start-page: 66
  year: 2017
  end-page: 83
  ident: b0160
  article-title: Cooperative Multi-agent Control Using Deep Reinforcement Learning
  publication-title: Lect. Notes Comput. Sci. (including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics)
– volume: 2018
  start-page: 3215
  year: 2017
  end-page: 3222
  ident: b0230
  article-title: Rainbow: Combining Improvements in Deep Reinforcement Learning
  publication-title: 32nd AAAI Conf. Artif. Intell. AAAI
– reference: Tan, M., 1993. Multi-Agent Reinforcement Learning: Independent vs. Cooperative Agents, in: 10th International Conference on Machine Learning Proceedings. Elsevier, pp. 330–337. https://doi.org/10.1016/B978-1-55860-307-3.50049-6.
– reference: Wen, Y., Yang, Y., Luo, R., Wang, J., Pan, W., 2019. Probabilistic Recursive Reasoning for Multi-Agent Reinforcement Learning. 7th Int. Conf. Learn. Represent. https://doi.org/10.48550/arxiv.1901.09207.
– volume: 46
  start-page: 1639
  year: 2012
  end-page: 1656
  ident: b0245
  article-title: On the spatial partitioning of urban transportation networks
  publication-title: Transp. Res. Part B Methodol.
– reference: Christianos, F., Papoudakis, G., Rahman, A., Albrecht, S. V., 2021. Scaling Multi-Agent Reinforcement Learning with Selective Parameter Sharing, in: 38th International Conference on Machine Learning. https://doi.org/10.48550/arxiv.2102.07475.
– volume: 46
  start-page: 1393
  year: 2012
  end-page: 1403
  ident: b0255
  article-title: Exploiting the fundamental diagram of urban networks for feedback-based gating
  publication-title: Transp. Res. Part B Methodol.
– volume: 1–12
  year: 2019
  ident: b0295
  article-title: Data-Driven Model Free Adaptive Perimeter Control for Multi-Region Urban Traffic Networks With Route Choice
  publication-title: IEEE Trans. Intell. Transp. Syst.
– reference: Geroliminis, N., Levinson, D.M., 2009. Cordon Pricing Consistent with the Physics of Overcrowding, Transportation and Traffic Theory 2009: Golden Jubilee. https://doi.org/10.1007/978-1-4419-0820-9_11.
– volume: 124
  year: 2021
  ident: b0580
  article-title: Model-free perimeter metering control for two-region urban networks using deep reinforcement learning
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 204
  start-page: 148
  year: 1979
  end-page: 151
  ident: b0220
  article-title: A two-fluid approach to town traffic
  publication-title: Science
– volume: 118
  start-page: 106
  year: 2018
  end-page: 123
  ident: b0560
  article-title: Hierarchical control of heterogeneous large-scale urban road networks via path assignment and regional route guidance
  publication-title: Transp. Res. Part B Methodol.
– year: 2018
  ident: b0415
  article-title: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
  publication-title: International Conference on Machine Learning
– volume: 4
  start-page: 26
  year: 2012
  end-page: 31
  ident: b0495
  article-title: Lecture 6.5-rmsprop Divide the Gradient by a Running Average of Its Recent Magnitude
  publication-title: COURSERA Neural Networks Mach. Learn.
– start-page: 67
  year: 1982
  end-page: 70
  ident: b0330
  article-title: Scats: The Sydney coordinated adaptive traffic system - principles, methodology, algorithms
  publication-title: International Conference of Road Traffic Signal.
– volume: 137
  start-page: 133
  year: 2020
  end-page: 153
  ident: b0205
  article-title: Adaptive perimeter control for multi-region accumulation-based models with state delays
  publication-title: Transp. Res. Part B Methodol.
– volume: 46
  start-page: 1159
  year: 2012
  end-page: 1176
  ident: b0185
  article-title: On the stability of traffic perimeter control in two-region urban cities
  publication-title: Transp. Res. Part B Methodol.
– volume: 32
  start-page: 289
  year: 2008
  end-page: 353
  ident: b0380
  article-title: Optimal and Approximate Q-value Functions for Decentralized POMDPs
  publication-title: J. Artif. Intell. Res.
– volume: 14
  start-page: 348
  year: 2013
  end-page: 359
  ident: b0140
  article-title: Optimal perimeter control for two urban regions with macroscopic fundamental diagrams: A model predictive approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 142
  year: 2022
  ident: b0040
  article-title: Data efficient reinforcement learning and adaptive optimal perimeter control of network traffic dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 36
  start-page: 177
  year: 2013
  end-page: 195
  ident: b0525
  article-title: Max pressure control of a network of signalized intersections
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 55
  start-page: 265
  year: 2013
  end-page: 281
  ident: b0010
  article-title: Perimeter and boundary flow control in multi-reservoir heterogeneous networks
  publication-title: Transp. Res. Part B Methodol.
– volume: 20
  start-page: 3224
  year: 2019
  end-page: 3234
  ident: b0005
  article-title: Analytical Optimal Solution of Perimeter Traffic Flow Control Based on MFD Dynamics: A Pontryagin’s Maximum Principle Approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 103
  start-page: 261
  year: 2017
  end-page: 285
  ident: b0020
  article-title: Optimal design of sustainable transit systems in congested urban networks: A macroscopic approach
  publication-title: Transp. Res. Part E Logist. Transp. Rev.
– reference: Son, K., Kim, D., Kang, W.J., Hostallero, D., Yi, Y., 2019. QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning, in: 36th International Conference on Machine Learning. International Machine Learning Society (IMLS), pp. 5887–5896. https://doi.org/10.48550/arxiv.1905.05408.
– volume: 23
  start-page: 464
  year: 2015
  end-page: 478
  ident: b0210
  article-title: Optimal hybrid perimeter and switching plans control for urban traffic networks
  publication-title: IEEE Trans. Control Syst. Technol.
– volume: 70
  start-page: 255
  year: 2014
  end-page: 268
  ident: b0120
  article-title: On the impacts of locally adaptive signal control on urban network stability and the macroscopic fundamental diagram
  publication-title: Transp. Res. Part B Methodol.
– volume: 45
  start-page: 278
  year: 2011
  end-page: 288
  ident: b0070
  article-title: Macroscopic relations of urban traffic variables: Bifurcations, multivaluedness and instability
  publication-title: Transp. Res. Part B Methodol.
– volume: 113
  start-page: 164
  year: 2020
  end-page: 175
  ident: b0375
  article-title: City-wide traffic control: Modeling impacts of cordon queues
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 117
  start-page: 687
  year: 2018
  end-page: 707
  ident: b0570
  article-title: Robust perimeter control for two urban regions with macroscopic fundamental diagrams: A control-Lyapunov function approach
  publication-title: Transp. Res. Part B Methodol.
– year: 2016
  ident: b0440
  article-title: Prioritized experience replay
  publication-title: 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings
– year: 2021
  ident: b0400
  article-title: FACMAC: Factored Multi-Agent Centralised Policy Gradients
  publication-title: The 35th Conference on Neural Information Processing Systems
– reference: van Hasselt, H., Doron, Y., Strub, F., Hessel, M., Sonnerat, N., Modayil, J., 2018. Deep Reinforcement Learning and the Deadly Triad.
– volume: 33
  start-page: 74
  year: 2013
  end-page: 87
  ident: b0260
  article-title: Urban congestion gating control based on reduced operational network fundamental diagrams
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 126
  year: 2021
  ident: b0305
  article-title: Robust perimeter control with cordon queues and heterogeneous transfer flows
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 1112
  start-page: 78
  year: 1987
  end-page: 88
  ident: b0550
  article-title: Urban traffic network flow models
  publication-title: Transp. Res. Rec.
– volume: 77
  start-page: 495
  year: 2017
  end-page: 515
  ident: b0190
  article-title: Coordinated distributed adaptive perimeter control for large-scale urban road networks
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 68
  start-page: 315
  year: 2014
  end-page: 332
  ident: b0200
  article-title: Robust perimeter control design for an urban region
  publication-title: Transp. Res. Part B Methodol.
– volume: 45
  start-page: 605
  year: 2011
  end-page: 617
  ident: b0145
  article-title: Properties of a well-defined macroscopic fundamental diagram for urban traffic
  publication-title: Transp. Res. Part B Methodol.
– reference: Mahmassani, H., Herman, R., 1984. Dynamic User Equilibrium Departure Time and Route Choice on Idealized Traffic Arterials. Transp. Sci. 18, 362–384. https://doi.org/10.1287/TRSC.18.4.362.
– volume: 109
  year: 2021
  ident: b0450
  article-title: Stabilization of city-scale road traffic networks via macroscopic fundamental diagram-based model predictive perimeter control
  publication-title: Control Eng. Pract.
– volume: 61
  start-page: 134
  year: 2017
  end-page: 148
  ident: b0170
  article-title: Optimal coupled and decoupled perimeter control in one-region cities
  publication-title: Control Eng. Pract.
– start-page: 2613
  year: 2010
  end-page: 2621
  ident: b0510
  article-title: Double Q-learning
  publication-title: Adv. Neural Inf. Process. Syst.
– reference: Choi, S., Yeung, D.Y., Zhang, N., 1999. An Environment Model for Nonstationary Reinforcement Learning, in: Advances in Neural Information Processing Systems 12.
– reference: Iqbal, S., Sha, F., 2019. Actor-attention-critic for multi-agent reinforcement learning, in: 36th International Conference on Machine Learning, ICML 2019. International Machine Learning Society (IMLS), pp. 5261–5270.
– reference: Foerster, J.N., Farquhar, G., Afouras, T., Nardelli, N., Whiteson, S., 2017. Counterfactual Multi-Agent Policy Gradients. 32nd AAAI Conf. Artif. Intell. AAAI 2018 2974–2982. https://doi.org/10.48550/arxiv.1705.08926.
– volume: 2622
  start-page: 58
  year: 2017
  end-page: 69
  ident: b0085
  article-title: Improving urban street network efficiency by prohibiting conflicting left turns at signalized intersections
  publication-title: Transp. Res. Rec.
– volume: ICML 2016 4
  start-page: 2939
  year: 2015
  end-page: 2947
  ident: b0535
  article-title: Dueling Network Architectures for Deep Reinforcement Learning
  publication-title: 33rd Int. Conf. Mach. Learn.
– volume: 2124
  start-page: 127
  year: 2009
  end-page: 136
  ident: b0030
  article-title: Exploring the Impact of Homogeneity of Traffic Measurements on the Existence of Macroscopic Fundamental Diagrams
  publication-title: Transp. Res. Rec. J. Transp. Res. Board
– volume: 117
  start-page: 660
  year: 2018
  end-page: 675
  ident: b0105
  article-title: An analytical framework to model uncertainty in urban network dynamics using Macroscopic Fundamental Diagrams
  publication-title: Transp. Res. Part B Methodol.
– volume: 59
  start-page: 323
  year: 2015
  end-page: 339
  ident: b0165
  article-title: Robust constrained control of uncertain macroscopic fundamental diagram networks
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 2016
  start-page: 2094
  year: 2015
  end-page: 2100
  ident: b0520
  article-title: Deep Reinforcement Learning with Double Q-learning
  publication-title: 30th AAAI Conf. Artif. Intell. AAAI
– volume: 46
  start-page: 1393
  year: 2012
  ident: 10.1016/j.trc.2023.104033_b0255
  article-title: Exploiting the fundamental diagram of urban networks for feedback-based gating
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2012.06.008
– volume: 36
  start-page: 480
  year: 2013
  ident: 10.1016/j.trc.2023.104033_b0340
  article-title: Urban network gridlock: Theory, characteristics, and dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2013.07.002
– volume: 68
  start-page: 315
  year: 2014
  ident: 10.1016/j.trc.2023.104033_b0200
  article-title: Robust perimeter control design for an urban region
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2014.06.010
– ident: 10.1016/j.trc.2023.104033_b0485
  doi: 10.1016/B978-1-55860-307-3.50049-6
– volume: 45
  start-page: 278
  year: 2011
  ident: 10.1016/j.trc.2023.104033_b0070
  article-title: Macroscopic relations of urban traffic variables: Bifurcations, multivaluedness and instability
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2010.06.006
– start-page: 1261
  year: 2013
  ident: 10.1016/j.trc.2023.104033_b0025
  article-title: Q-learning method for controlling traffic signal phase time in a single intersection
  publication-title: IEEE Conf. Intell. Transp. Syst. Proc., ITSC
– ident: 10.1016/j.trc.2023.104033_b0345
  doi: 10.1098/rsta.2010.0099
– volume: 23
  start-page: 464
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0210
  article-title: Optimal hybrid perimeter and switching plans control for urban traffic networks
  publication-title: IEEE Trans. Control Syst. Technol.
  doi: 10.1109/TCST.2014.2330997
– volume: 103
  start-page: 261
  year: 2017
  ident: 10.1016/j.trc.2023.104033_b0020
  article-title: Optimal design of sustainable transit systems in congested urban networks: A macroscopic approach
  publication-title: Transp. Res. Part E Logist. Transp. Rev.
  doi: 10.1016/j.tre.2017.03.006
– start-page: 2613
  year: 2010
  ident: 10.1016/j.trc.2023.104033_b0510
  article-title: Double Q-learning
  publication-title: Adv. Neural Inf. Process. Syst.
– volume: 33
  start-page: 750
  year: 2018
  ident: 10.1016/j.trc.2023.104033_b0225
  article-title: A Survey and Critique of Multiagent Deep Reinforcement Learning
  publication-title: Auton. Agent. Multi. Agent. Syst.
  doi: 10.1007/s10458-019-09421-1
– ident: 10.1016/j.trc.2023.104033_b0515
– volume: 204
  start-page: 148
  issue: 4389
  year: 1979
  ident: 10.1016/j.trc.2023.104033_b0220
  article-title: A two-fluid approach to town traffic
  publication-title: Science
  doi: 10.1126/science.204.4389.148
– volume: 46
  start-page: 1291
  year: 2012
  ident: 10.1016/j.trc.2023.104033_b0565
  article-title: A dynamic cordon pricing scheme combining the Macroscopic Fundamental Diagram and an agent-based traffic model
  publication-title: Transp. Res. Part A Policy Pract.
  doi: 10.1016/j.tra.2012.05.006
– year: 1997
  ident: 10.1016/j.trc.2023.104033_b0505
  article-title: An Analysis of Temporal-Difference Learning with Function Approximation
  publication-title: IEEE Trans. Autom. Control
  doi: 10.1109/9.580874
– volume: 117
  start-page: 660
  year: 2018
  ident: 10.1016/j.trc.2023.104033_b0105
  article-title: An analytical framework to model uncertainty in urban network dynamics using Macroscopic Fundamental Diagrams
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2017.08.015
– volume: ICML 2016 4
  start-page: 2939
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0535
  article-title: Dueling Network Architectures for Deep Reinforcement Learning
  publication-title: 33rd Int. Conf. Mach. Learn.
– ident: 10.1016/j.trc.2023.104033_b0410
– volume: 54
  start-page: 17
  year: 2013
  ident: 10.1016/j.trc.2023.104033_b0195
  article-title: Cooperative traffic control of a mixed network with two urban regions and a freeway
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2013.03.007
– volume: 137
  start-page: 133
  year: 2020
  ident: 10.1016/j.trc.2023.104033_b0205
  article-title: Adaptive perimeter control for multi-region accumulation-based models with state delays
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2018.05.019
– ident: 10.1016/j.trc.2023.104033_b0250
– ident: 10.1016/j.trc.2023.104033_b0275
– volume: 118
  start-page: 106
  year: 2018
  ident: 10.1016/j.trc.2023.104033_b0560
  article-title: Hierarchical control of heterogeneous large-scale urban road networks via path assignment and regional route guidance
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2018.10.007
– volume: 59
  start-page: 323
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0165
  article-title: Robust constrained control of uncertain macroscopic fundamental diagram networks
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2015.05.014
– volume: 118
  year: 2020
  ident: 10.1016/j.trc.2023.104033_b0395
  article-title: Estimating MFDs, trip lengths and path flow distributions in a multi-region setting using mobile phone data
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2020.102709
– ident: 10.1016/j.trc.2023.104033_b0035
– ident: 10.1016/j.trc.2023.104033_b0310
– volume: 126
  year: 2021
  ident: 10.1016/j.trc.2023.104033_b0305
  article-title: Robust perimeter control with cordon queues and heterogeneous transfer flows
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2021.103043
– ident: 10.1016/j.trc.2023.104033_b0465
– volume: 111
  start-page: 327
  year: 2018
  ident: 10.1016/j.trc.2023.104033_b0575
  article-title: Boundary conditions and behavior of the macroscopic fundamental diagram based network traffic dynamics: A control systems perspective
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2018.02.016
– ident: 10.1016/j.trc.2023.104033_b0545
– volume: 46
  start-page: 1639
  year: 2012
  ident: 10.1016/j.trc.2023.104033_b0245
  article-title: On the spatial partitioning of urban transportation networks
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2012.08.005
– volume: 37
  start-page: 319
  issue: 3
  year: 2003
  ident: 10.1016/j.trc.2023.104033_b0460
  article-title: Hypercongestion
  publication-title: J. Transp. Econ. Policy
– ident: 10.1016/j.trc.2023.104033_b0050
– volume: 518
  start-page: 529
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0355
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
  doi: 10.1038/nature14236
– year: 2021
  ident: 10.1016/j.trc.2023.104033_b0400
  article-title: FACMAC: Factored Multi-Agent Centralised Policy Gradients
– volume: 71
  start-page: 184
  year: 2016
  ident: 10.1016/j.trc.2023.104033_b0015
  article-title: Data fusion algorithm for macroscopic fundamental diagram estimation
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2016.07.013
– volume: 74
  start-page: 1
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0405
  article-title: Dynamics of heterogeneity in urban networks: Aggregated traffic modeling and hierarchical control
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2014.12.010
– volume: 4
  start-page: 26
  year: 2012
  ident: 10.1016/j.trc.2023.104033_b0495
  article-title: Lecture 6.5-rmsprop Divide the Gradient by a Running Average of Its Recent Magnitude
  publication-title: COURSERA Neural Networks Mach. Learn.
– volume: 2622
  start-page: 58
  year: 2017
  ident: 10.1016/j.trc.2023.104033_b0085
  article-title: Improving urban street network efficiency by prohibiting conflicting left turns at signalized intersections
  publication-title: Transp. Res. Rec.
  doi: 10.3141/2622-06
– volume: 59
  start-page: 404
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0555
  article-title: Equilibrium analysis and route guidance in large-scale networks with MFD dynamics
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2015.05.009
– volume: 20
  start-page: 3224
  year: 2019
  ident: 10.1016/j.trc.2023.104033_b0005
  article-title: Analytical Optimal Solution of Perimeter Traffic Flow Control Based on MFD Dynamics: A Pontryagin’s Maximum Principle Approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2018.2873104
– volume: 16
  start-page: 2141
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0270
  article-title: Multiple concentric gating traffic control in large-scale urban networks
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2015.2399303
– volume: 2018
  start-page: 3215
  year: 2017
  ident: 10.1016/j.trc.2023.104033_b0230
  article-title: Rainbow: Combining Improvements in Deep Reinforcement Learning
  publication-title: 32nd AAAI Conf. Artif. Intell. AAAI
– volume: 40
  start-page: 11
  year: 1991
  ident: 10.1016/j.trc.2023.104033_b0425
  article-title: Optimizing Networks of Traffic Signals in Real Time—The SCOOT Method
  publication-title: IEEE Trans. Veh. Technol.
  doi: 10.1109/25.69966
– volume: 14
  start-page: 348
  year: 2013
  ident: 10.1016/j.trc.2023.104033_b0140
  article-title: Optimal perimeter control for two urban regions with macroscopic fundamental diagrams: A model predictive approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2012.2216877
– volume: 66
  start-page: 136
  year: 2016
  ident: 10.1016/j.trc.2023.104033_b0090
  article-title: Deriving macroscopic fundamental diagrams from probe data: Issues and proposed solutions
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2015.08.015
– ident: 10.1016/j.trc.2023.104033_b0180
  doi: 10.1109/ACC.2012.6314693
– volume: 80
  start-page: 99
  year: 2013
  ident: 10.1016/j.trc.2023.104033_b0290
  article-title: Estimating MFDs in Simple Networks with Route Choice
  publication-title: Procedia - Soc. Behav. Sci.
  doi: 10.1016/j.sbspro.2013.05.008
– volume: 113
  start-page: 164
  year: 2020
  ident: 10.1016/j.trc.2023.104033_b0375
  article-title: City-wide traffic control: Modeling impacts of cordon queues
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2019.04.024
– year: 2018
  ident: 10.1016/j.trc.2023.104033_b0415
  article-title: QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
– volume: 115
  year: 2020
  ident: 10.1016/j.trc.2023.104033_b0420
  article-title: Data driven model free adaptive iterative learning perimeter control for large-scale urban road networks
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2020.102618
– volume: 2016
  start-page: 2094
  year: 2015
  ident: 10.1016/j.trc.2023.104033_b0520
  article-title: Deep Reinforcement Learning with Double Q-learning
  publication-title: 30th AAAI Conf. Artif. Intell. AAAI
– volume: 70
  start-page: 255
  year: 2014
  ident: 10.1016/j.trc.2023.104033_b0120
  article-title: On the impacts of locally adaptive signal control on urban network stability and the macroscopic fundamental diagram
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2014.09.010
– volume: 109
  year: 2021
StartPage 104033
SubjectTerms Macroscopic Fundamental Diagram (MFD)
Model-free multi-agent reinforcement learning (MARL)
Multi-region perimeter metering control
Title Scalable multi-region perimeter metering control for urban networks: A multi-agent deep reinforcement learning approach
URI https://dx.doi.org/10.1016/j.trc.2023.104033
Volume 148