Hybrid FPA: a fairness-, pressure-, and emission-aware deep reinforcement learning framework for adaptive traffic signal control
Mihir R. Panchal1 and Pankaj P. Prajapati2
Associate Professor, Department of Electronics and Communication Engineering, Vishwakarma Government Engineering College, Gujarat Technological University, Ahmedabad, Gujarat, India2
Corresponding Author: Mihir R. Panchal
Received: 22-August-2025; Revised: 19-April-2026; Accepted: 20-April-2026
Abstract
In urban networks, signalized intersections are critical bottlenecks that cause fuel wastage, prolonged delays, and increased emissions of carbon dioxide (CO₂), carbon monoxide (CO), nitrogen oxides (NOx), hydrocarbons (HC), and particulate matter (PMx). Conventional fixed-time and actuated controllers degrade under nonstationary demand, heterogeneous traffic composition, and recurrent traffic surges, worsening both congestion and environmental impact. This study proposes the hybrid flow pressure-aware (Hybrid FPA) framework, a fairness-, pressure-, and emission-aware deep reinforcement learning (DRL) approach that integrates switching penalties, emission-weighted pressure, and fairness regularization into a unified reward function. The framework couples a double dueling deep Q-network (D3QN) with prioritized experience replay (PER) for training stability and a compact one-dimensional convolutional (Conv1D) state encoder. Using traffic data collected from unmanned aerial vehicles (UAVs), the framework is evaluated on a Simulation of Urban MObility (SUMO)-based reconstruction of the Panjarapol Cross Road intersection in Ahmedabad and compared with state-of-the-art methods, including PressLight, convolutional block attention module D3QN (CBAM-D3QN), multi-directional D3QN (MD3QN), and partial detection D3QN (PD3QN). Experimental results over 100 training episodes show that Hybrid FPA reduces average waiting time by approximately 78% and queue length by about 65%, while decreasing fuel consumption by nearly 27% and CO₂ emissions by 29%. PMx emissions fall by nearly 20%, HC by over 20%, and NOx by approximately 25%, with comparable reductions in the remaining pollutants. The phase-switching frequency is also lowered by more than one-third, contributing to smoother traffic flow and further indirect emission reductions. These findings indicate that reinforcement learning can jointly achieve mobility and sustainability objectives by explicitly incorporating fairness, pressure balancing, and multi-pollutant optimization. Hybrid FPA therefore offers a dynamic and efficient paradigm for adaptive traffic signal control with strong potential for real-world deployment.
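To make the reward design concrete, the sketch below illustrates how a composite reward of this kind can be assembled in Python. It is a minimal illustration only: the weights (w_pressure, w_emission, w_fair, w_switch), the per-lane inputs, and the exact functional form are assumptions for exposition, not the paper's tuned formulation.

import numpy as np

def hybrid_fpa_reward(queue_in, queue_out, emissions, waits, switched,
                      w_pressure=1.0, w_emission=0.5, w_fair=0.3, w_switch=0.2):
    """Hedged sketch of a fairness-, pressure-, and emission-aware reward.

    queue_in / queue_out : per-lane vehicle counts on incoming/outgoing lanes
    emissions            : per-lane emission rates (e.g., CO2 in mg/s)
    waits                : per-lane accumulated waiting times (s)
    switched             : True if the agent changed phase this step
    All weights are illustrative placeholders, not the paper's values.
    """
    # Max-pressure-style term: imbalance between incoming and outgoing queues.
    pressure = abs(np.sum(queue_in) - np.sum(queue_out))

    # Emission-weighted pressure: penalize queues on high-emission approaches.
    emission_term = np.sum(emissions * queue_in)

    # Fairness regularizer: variance of per-lane waiting times discourages
    # starving any single approach.
    fairness = np.var(waits)

    # Switching penalty: discourage rapid phase flipping.
    switch_penalty = w_switch if switched else 0.0

    return -(w_pressure * pressure
             + w_emission * emission_term
             + w_fair * fairness
             + switch_penalty)

# Example: four incoming lanes, one step after a phase switch.
r = hybrid_fpa_reward(
    queue_in=np.array([8, 3, 5, 2]),
    queue_out=np.array([2, 1, 4, 1]),
    emissions=np.array([1.2, 0.4, 0.9, 0.3]),  # hypothetical CO2 mg/s per lane
    waits=np.array([40.0, 12.0, 25.0, 5.0]),
    switched=True)

In this sketch the agent is penalized for queue imbalance (pressure), for pressure concentrated on high-emission approaches, for unequal per-lane waiting (fairness), and for flipping phases too often, mirroring the four reward components named in the abstract.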
Keywords
Hybrid flow pressure-aware (Hybrid FPA), Deep reinforcement learning (DRL), Traffic signal control, Emission reduction, Intelligent transportation systems, SUMO simulation.
Cite this article
Panchal MR, Prajapati PP. Hybrid FPA: a fairness-, pressure-, and emission-aware deep reinforcement learning framework for adaptive traffic signal control. International Journal of Advanced Technology and Engineering Exploration. 2026;13(137):541-565. DOI: 10.19101/IJATEE.2025.121221179
