 
Real-time Lane Configuration with Coordinated Reinforcement Learning
Presenter: Udesh Gunarathna
Authors: Udesh Gunarathna, Hairuo Xie, Egemen Tanin, Shanika Karunasekara, Renata Borovica-Gajic
University of Melbourne
Udesh Gunarathna Real-time Lane Configuration 1 / 12

How Often Have You Been Stuck in Traffic Like This?
Figure: Directionally imbalanced traffic. One direction is congested while the opposite direction carries far less traffic. (Image source: https://i.dailymail.co.uk)

Real-time Lane-direction Configuration with Connected Autonomous Vehicles
What is real-time lane-direction configuration?
  Changing the travelling direction of lanes in road segments at short time intervals, based on real-time traffic information.
Why consider this problem now?
  The capabilities of connected autonomous vehicles!
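
A minimal sketch of what a lane-direction configuration could look like as data, assuming a two-way road segment with a fixed total number of lanes. The class name `RoadSegment`, its fields, and the rule that each direction keeps at least one lane are illustrative assumptions, not details from the talk.

```python
from dataclasses import dataclass


@dataclass
class RoadSegment:
    """A two-way road segment whose lanes can be reassigned between directions."""
    name: str
    lanes_forward: int   # lanes currently serving the forward direction
    lanes_backward: int  # lanes currently serving the opposite direction

    def shift_lane(self, to_forward: bool) -> bool:
        """Reassign one lane to the chosen direction.

        Assumption: each direction must keep at least one lane, so the
        change is refused (returns False) if the donor side has only one.
        """
        donor = self.lanes_backward if to_forward else self.lanes_forward
        if donor <= 1:
            return False
        if to_forward:
            self.lanes_forward += 1
            self.lanes_backward -= 1
        else:
            self.lanes_forward -= 1
            self.lanes_backward += 1
        return True


seg = RoadSegment("A", lanes_forward=2, lanes_backward=2)
changed = seg.shift_lane(to_forward=True)   # heavier forward traffic gets a lane
refused = seg.shift_lane(to_forward=True)   # backward side is down to one lane
print(seg, changed, refused)
```

The total lane count never changes; only the split between the two directions does, which is the quantity a real-time controller would adjust.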

Difficult to Compute? Yes!
A lane-direction change in one road segment may affect traffic flow in neighboring road segments.
Figure: Three road segments A, B, C with different lane-configurations.
What makes lane configuration computation difficult in real time?
  The computation needs to be lightweight.

Proposed Architecture: Coordinated Learning-based Lane Allocation
We propose an efficient, scalable multi-agent solution.
Figure: Architecture of CLLA, which consists of RL Agents that operate at the intersection level and Coordinating Agents that evaluate the global impact of local lane-direction changes. Diagram labels: Road network; RL Agents; Coordinating Agents; lane-directions proposed by RL Agents; coordinated lane-directions by Coordinating Agents.

Why Do Existing Methods Fail?
Existing approaches use mathematical programming to compute lane-direction allocations based on pre-known traffic patterns.
Why can't existing methods compute real-time lane-direction allocations?
  Inability to work with real-time data
  Very high computation cost
  Gap between microscopic and macroscopic simulation

Why Multi-agent Reinforcement Learning?
Why reinforcement learning?
  Real-time control
  Lack of lane-changing traffic models
Why not a single reinforcement learning agent?
  Exponential growth of state-space
  Difficulty of learning
Coordination is the key!
  Network-level impact of changes needs to be considered
  Distributed RL Agents' actions may conflict with each other
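
One way to see the exponential growth is a back-of-the-envelope count, assuming each road segment contributes a fixed number of possible values to the joint state. The figures `K = 3` and `SEGMENTS_PER_AGENT = 4` are hypothetical and chosen only for illustration; the talk does not give them.

```python
def joint_space_size(values_per_segment: int, num_segments: int) -> int:
    """Number of joint states when one agent tracks all of the given segments."""
    return values_per_segment ** num_segments


K = 3                   # assumed number of values per road segment (hypothetical)
SEGMENTS_PER_AGENT = 4  # assumed size of one RL Agent's area (hypothetical)

for total_segments in (4, 10, 100):
    # A single agent must cover every segment in the network.
    single = joint_space_size(K, total_segments)
    # A local agent only covers its own area, whatever the network size.
    local = joint_space_size(K, min(SEGMENTS_PER_AGENT, total_segments))
    print(f"{total_segments:>3} segments: single agent {single:.3e}, local agent {local}")
```

Under these assumptions the single agent's space grows with every added segment, while each local agent's space stays fixed. The cost of splitting the problem this way is that local agents can propose conflicting changes, which is why coordination is needed.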

Coordinated Learning-based Lane Allocation
Figure: Architecture of CLLA, which consists of RL Agents that operate at the intersection level and Coordinating Agents that evaluate the global impact of local lane-direction changes. Diagram labels:
  Upper Layer: Coordinating Agents; aggregated traffic information; coordinated lane-directions by Coordinating Agents
  Bottom Layer: RL Agents; road network; traffic information; lane-direction changes proposed by RL Agents; area controlled by an RL Agent

CLLA Algorithm
Figure: Flowchart of the CLLA algorithm. Diagram labels:
  At every time step: Pre-trained RL Agents; Incoming Vehicles; Outgoing Vehicles; Increase/Decrease; LLC: list of changes
  After every t_a time steps: Coordinating Agent; Vehicle Paths; Queue lengths; Current lane-configurations; Global Impact Evaluation; CLC: coordinated changes; Clear LLC; Reset t
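
The flowchart labels suggest a two-rate loop, sketched below under that reading: RL Agents add proposals to LLC at every time step, and after every t_a time steps a coordinating step turns LLC into CLC, clears LLC, and resets t. The function names, the `(segment, +1/-1)` encoding of a change, and the toy `drop_conflicts` rule are all assumptions for illustration; in particular, `drop_conflicts` is a stand-in and not the talk's Global Impact Evaluation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical encoding: (road segment id, +1 to increase / -1 to decrease lanes).
Change = Tuple[str, int]


def run_clla(num_steps: int, t_a: int,
             agents: List[Callable[[int], List[Change]]],
             coordinate: Callable[[List[Change]], List[Change]]) -> List[List[Change]]:
    """Collect proposals every step; coordinate them every t_a steps."""
    llc: List[Change] = []             # LLC: list of proposed changes
    applied: List[List[Change]] = []   # one CLC per coordination round
    t = 0
    for step in range(num_steps):
        for propose in agents:         # every time step: agents propose
            llc.extend(propose(step))
        t += 1
        if t == t_a:                   # after every t_a time steps
            applied.append(coordinate(llc))  # CLC: coordinated changes
            llc.clear()                # clear LLC
            t = 0                      # reset t
    return applied


# Toy stand-ins for pre-trained RL Agents (assumptions, not trained policies).
def agent_a(step: int) -> List[Change]:
    return [("A", +1)]


def agent_b(step: int) -> List[Change]:
    return [("A", -1)] if step % 2 == 0 else [("B", +1)]


def drop_conflicts(llc: List[Change]) -> List[Change]:
    """Toy coordination: net out opposing proposals on the same segment."""
    net: Dict[str, int] = {}
    for seg, delta in llc:
        net[seg] = net.get(seg, 0) + delta
    return sorted((seg, 1 if d > 0 else -1) for seg, d in net.items() if d != 0)


rounds = run_clla(num_steps=4, t_a=2, agents=[agent_a, agent_b],
                  coordinate=drop_conflicts)
print(rounds)
```

Separating the two rates keeps the per-step work local to each agent and defers the network-level check to the less frequent coordination round.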

Global Impact Evaluation: Complexity
Complexity: O(m × n)
  m: number of proposed changes from RL Agents
  n: number of neighbors per road segment
n does not increase with the network size, so O(m × n) → O(m).
Worst case: O(|E|), where |E| is the total number of road segments.
Distributed Version
  A distributed version with a communication layer can reduce the complexity further.
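
The O(m × n) bound can be illustrated with a loop that visits each of the m proposed changes and checks its n neighboring road segments. The acceptance rule used here (reject a change if any neighbor's queue exceeds a threshold) is a hypothetical placeholder; only the loop structure is meant to mirror the complexity argument.

```python
from typing import Dict, List, Tuple


def evaluate_changes(proposed: List[str],
                     neighbors: Dict[str, List[str]],
                     queue_len: Dict[str, int],
                     threshold: int) -> Tuple[List[str], int]:
    """Return accepted changes and the number of neighbor checks performed."""
    accepted: List[str] = []
    checks = 0
    for seg in proposed:                   # m proposed changes
        ok = True
        for nb in neighbors.get(seg, []):  # n neighbors per road segment
            checks += 1
            if queue_len.get(nb, 0) > threshold:  # placeholder rule (assumption)
                ok = False
        if ok:
            accepted.append(seg)
    return accepted, checks


# Three mutually adjacent segments, as in the A, B, C example.
neighbors = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
queue_len = {"A": 3, "B": 12, "C": 4}

accepted, checks = evaluate_changes(["A", "B"], neighbors, queue_len, threshold=10)
print(accepted, checks)  # checks equals m * n = 2 * 2
```

Because each segment has a bounded number of neighbors regardless of how large the network is, the inner loop is constant-size and the total work scales with the number of proposed changes.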

Results from Manhattan Road Network
Simulated using SMARTS [1], a microscopic simulator
Using one hour of New York taxi data on the Manhattan road network

Baseline   Travel Time (s)   % of Vehicles with DFFT > 6
no-LA      604.32            45.9
LLA        585.83            48.6
DLA        496.12            50.7
CLLA       471.28            45.87

Table: Performance of baselines evaluated using New York taxi data. no-LA is a baseline with no lane-direction allocations; LLA is similar to CLLA but without the upper-layer coordination; DLA is a baseline algorithm that allocates lane-directions based on aggregated traffic demand.
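
To put the travel times in the table in relative terms, the short calculation below expresses each method's travel time as a percentage reduction against no-LA. The input values are copied from the table; the percentage metric itself is a derived figure and does not appear in the slides.

```python
# Travel times in seconds, copied from the results table.
travel_time = {
    "no-LA": 604.32,
    "LLA": 585.83,
    "DLA": 496.12,
    "CLLA": 471.28,
}

baseline = travel_time["no-LA"]

# Percentage reduction in travel time relative to no-LA.
reduction = {
    name: 100.0 * (baseline - seconds) / baseline
    for name, seconds in travel_time.items()
    if name != "no-LA"
}

for name, pct in reduction.items():
    print(f"{name}: {pct:.1f}% lower travel time than no-LA")
```

By this calculation CLLA's travel time is about 22% below no-LA, compared with about 18% for DLA and about 3% for LLA.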

Thank you
Q & A