Impact of Deep RL-based Traffic Signal Control on Air Quality

Bibliographic Details
Published in: IEEE Vehicular Technology Conference, pp. 1-6
Main Authors: Haydari, Ammar; Zhang, Michael; Chuah, Chen-Nee; Ghosal, Dipak
Format: Conference Proceeding
Language: English
Published: IEEE, 01.04.2021
ISSN: 2577-2465
DOI: 10.1109/VTC2021-Spring51267.2021.9448639

Summary: One major source of air pollution in urban areas is automobile emissions. Although hybrid and fully electric vehicles have started to gain popularity, the majority of vehicles are still fuel-based. With the rapid advancement of artificial intelligence (AI) and automation-based controllers, numerous studies have applied such learning-based techniques to Intelligent Transportation Systems (ITS). Combining deep neural networks with reinforcement learning (RL) models, called deep reinforcement learning (DRL), has shown promising results when applied to urban Traffic Signal Control (TSC) for adaptive adjustment of traffic light schedules. Centralized and decentralized DRL-based controller models have been proposed in the literature to optimize total system travel time. However, the impact of such learning-based TSCs on air quality remains unexplored. In this paper, we examine the impact of DRL-based TSCs on the environment in terms of fuel consumption and CO2 emission. We study a major DRL approach, advantage actor-critic (A2C), in multi-agent settings on a synthetic multi-intersection network and on a real traffic network of downtown San Francisco with a 24-hour traffic dataset. Our initial results indicate that learning-based DRL methods achieve the lowest air pollution levels on the synthetic networks even with a simple delay-based reward function. However, DRL-based TSC performs slightly worse than rule-based adaptive TSC (max-pressure control) on the San Francisco network.
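
The summary contrasts a DRL controller trained with a simple delay-based reward against the rule-based max-pressure baseline. As a rough illustration of those two ingredients only (not the authors' implementation; the function names, phase/movement layout, and queue inputs below are hypothetical, and the queue and waiting-time data would come from a traffic simulator such as SUMO), a minimal sketch in Python:

    # Sketch of the two control ingredients named in the summary:
    # a delay-based RL reward and rule-based max-pressure phase selection.
    # Everything here is illustrative, not the paper's code.

    from typing import Dict, List, Tuple

    # A movement is (inbound lane, outbound lane); a phase serves several movements.
    Phase = List[Tuple[str, str]]

    def delay_reward(prev_total_wait: float, curr_total_wait: float) -> float:
        """Delay-based reward: the reduction in cumulative vehicle waiting
        time at the intersection between two consecutive control steps."""
        return prev_total_wait - curr_total_wait

    def pressure(phase: Phase, queue: Dict[str, int]) -> int:
        """Pressure of a phase: upstream minus downstream queue length,
        summed over the movements the phase would release."""
        return sum(queue.get(src, 0) - queue.get(dst, 0) for src, dst in phase)

    def max_pressure_phase(phases: List[Phase], queue: Dict[str, int]) -> int:
        """Rule-based max-pressure control: activate the phase with the
        highest pressure at each decision step."""
        return max(range(len(phases)), key=lambda i: pressure(phases[i], queue))

    if __name__ == "__main__":
        # Two through phases at a toy four-leg intersection (hypothetical layout).
        phases = [
            [("N_in", "S_out"), ("S_in", "N_out")],  # north-south through
            [("E_in", "W_out"), ("W_in", "E_out")],  # east-west through
        ]
        queue = {"N_in": 7, "S_in": 5, "E_in": 2, "W_in": 3,
                 "S_out": 1, "N_out": 0, "W_out": 4, "E_out": 2}
        print(max_pressure_phase(phases, queue))   # -> 0 (north-south phase wins)
        print(delay_reward(120.0, 95.0))           # -> 25.0 (delay decreased)

In a DRL setting the delay-based quantity would serve as the per-step reward signal for the A2C agents, while max-pressure selects phases directly from queue measurements with no learning.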