SLIDE 1 Heterogeneous Stream Processing and Crowdsourcing for Urban Traffic Management
Alexander Artikis (1), Matthias Weidlich (2), Francois Schnitzler (3), Ioannis Boutsis (4), Thomas Liebig (5), Nico Piatkowski (5), Christian Bockermann (5), Katharina Morik (5), Vana Kalogeraki (4), Jakub Marecek (6), Avigdor Gal (3), Shie Mannor (3), Dermot Kinane (7) and Dimitrios Gunopulos (8)
(1) NCSR Demokritos, Greece, (2) Imperial College London, UK, (3) Technion, Israel, (4) Athens University of Economics and Business, Greece, (5) Technical University Dortmund, Germany, (6) IBM Research, Ireland, (7) Dublin City Council, Ireland, (8) University of Athens, Greece
http://www.insight-ict.eu/
SLIDE 2 Urban Traffic Management Challenges
◮ Volume
◮ Increasing number of data sources.
◮ Variety
◮ Heterogeneous data sources.
◮ Veracity
◮ Inaccurate measurements, network failures, interference of
mediators.
◮ Sparsity
◮ Traffic in several locations is never/infrequently monitored.
SLIDE 3 Addressing the Challenges
◮ Volume
◮ Stream processing.
◮ Variety
◮ Complex event processing.
◮ Veracity
◮ Crowdsourcing.
◮ Sparsity
◮ Traffic modelling.
SLIDE 4
The Insight System
SLIDE 5 Complex Event Processing
◮ Data variety problem: heterogeneous event sources
◮ Buses: position, direction, route, congestion.
◮ SCATS sensors: traffic flow, traffic density.
◮ Solution: complex event processing
◮ Compute bus punctuality, bus driving quality, traffic congestion
trend, traffic congestion.
◮ Engine: Event Calculus for Run-Time reasoning (RTEC)
◮ Formal, declarative semantics.
◮ Interval-based reasoning.
◮ Succinct & intuitive representation of complex event patterns.
◮ Highly efficient (for complex event hierarchies).
◮ Machine learning support for automated construction of complex event patterns.
SLIDE 6
Complex Event Processing
Buses reporting congestion at some location (Lon, Lat) of interest:
busCongestion(Lon, Lat) initiated iff
    move(Bus, LonB, LatB, 1) happens,
    close(LonB, LatB, Lon, Lat)
busCongestion(Lon, Lat) terminated iff
    move(Bus, LonB, LatB, 0) happens,
    close(LonB, LatB, Lon, Lat)
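The initiation and termination conditions above can be turned into intervals during which busCongestion holds. The following is a minimal Python sketch of that idea, not RTEC itself: the tuple layout of `moves`, the `radius` used by `close`, and the function names are assumptions made for illustration.

```python
from math import hypot


def close(lon_b, lat_b, lon, lat, radius=0.001):
    """Hypothetical proximity test: within `radius` degrees of (lon, lat)."""
    return hypot(lon_b - lon, lat_b - lat) <= radius


def bus_congestion_intervals(moves, lon, lat):
    """Return (start, end) intervals during which congestion is reported.

    `moves` holds (time, bus, lon_b, lat_b, congestion) tuples, where
    congestion is 1 or 0. An interval that is still open has end = None.
    """
    intervals = []
    start = None
    for t, _bus, lon_b, lat_b, congestion in sorted(moves, key=lambda m: m[0]):
        if not close(lon_b, lat_b, lon, lat):
            continue  # the report is about some other location
        if congestion == 1 and start is None:
            start = t  # initiation point
        elif congestion == 0 and start is not None:
            intervals.append((start, t))  # termination point closes the interval
            start = None
    if start is not None:
        intervals.append((start, None))
    return intervals


moves = [
    (1, "b1", -6.26, 53.35, 1),
    (5, "b2", -6.26, 53.35, 0),
    (7, "b1", -6.26, 53.35, 1),
    (3, "b3", -6.00, 53.00, 0),  # too far away to count
]
intervals = bus_congestion_intervals(moves, -6.26, 53.35)
```

Repeated initiations while the fluent already holds are ignored, so each interval starts at the first congestion report after the previous termination.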
SLIDE 7
Complex Event Processing
Identifying mismatches among different streams:
disagree(Bus, LonI, LatI, 1) happens if
    move(Bus, LonB, LatB, 1) happens,
    close(LonB, LatB, LonI, LatI),
    not (scatsCongestion(LonI, LatI) = true holds)
disagree(Bus, LonI, LatI, 0) happens if
    move(Bus, LonB, LatB, 0) happens,
    close(LonB, LatB, LonI, LatI),
    scatsCongestion(LonI, LatI) = true holds
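A rough Python sketch of the two disagree rules follows. It assumes that SCATS congestion is already available as a list of (start, end) intervals per intersection; that representation, the `radius` in `close`, and all function names are illustrative assumptions.

```python
from math import hypot


def close(lon_a, lat_a, lon_b, lat_b, radius=0.001):
    """Hypothetical proximity test: within `radius` degrees."""
    return hypot(lon_a - lon_b, lat_a - lat_b) <= radius


def holds_at(intervals, t):
    """True if t falls inside one of the (start, end] intervals."""
    return any(start < t <= end for start, end in intervals)


def disagree_events(moves, scats_congestion):
    """Emit (time, bus, lon_i, lat_i, bus_value) whenever a bus report
    contradicts the SCATS view of a nearby intersection."""
    events = []
    for t, bus, lon_b, lat_b, congestion in moves:
        for (lon_i, lat_i), intervals in scats_congestion.items():
            if not close(lon_b, lat_b, lon_i, lat_i):
                continue
            scats_congested = holds_at(intervals, t)
            if congestion == 1 and not scats_congested:
                events.append((t, bus, lon_i, lat_i, 1))
            elif congestion == 0 and scats_congested:
                events.append((t, bus, lon_i, lat_i, 0))
    return events


scats = {(-6.26, 53.35): [(10, 20)]}  # SCATS reports congestion during (10, 20]
moves = [
    (5, "b1", -6.26, 53.35, 1),   # bus says congested, SCATS does not
    (15, "b1", -6.26, 53.35, 1),  # both agree
    (16, "b2", -6.26, 53.35, 0),  # bus says clear, SCATS says congested
    (25, "b2", -6.26, 53.35, 0),  # both agree
]
events = disagree_events(moves, scats)
```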
SLIDE 8
Complex Event Processing
Dealing with event source disagreement:
noisy(Bus) = true initiatedAt T iff
    disagree(Bus, LonI, LatI, BusVal) happensAt T,
    crowd(LonI, LatI, CrowdVal) happensAt T′,
    BusVal ≠ CrowdVal, 0 < T′ − T < threshold
noisy(Bus) = true terminated if
    agree(Bus) happens
noisy(Bus) = true terminatedAt T if
    disagree(Bus, LonI, LatI, BusVal) happensAt T,
    crowd(LonI, LatI, CrowdVal) happensAt T′,
    BusVal = CrowdVal, 0 < T′ − T < threshold
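The sketch below illustrates how crowd answers could settle a disagreement. It assumes a bus is flagged as noisy when the crowd contradicts its report and cleared when the crowd confirms it, with the crowd answer arriving within `threshold` time units after the disagreement. The tuple layouts and the function name are hypothetical, and the agree(Bus) termination is left out for brevity.

```python
def noisy_transitions(disagreements, crowd_answers, threshold):
    """Return sorted (time, bus, kind) tuples, kind being 'initiate' or 'terminate'.

    disagreements: (t, bus, lon, lat, bus_val) tuples
    crowd_answers: (t_crowd, lon, lat, crowd_val) tuples
    """
    transitions = []
    for t, bus, lon, lat, bus_val in disagreements:
        for t_crowd, c_lon, c_lat, crowd_val in crowd_answers:
            if (c_lon, c_lat) != (lon, lat):
                continue  # answer concerns another intersection
            if not 0 < t_crowd - t < threshold:
                continue  # answer did not arrive soon enough after the disagreement
            kind = "terminate" if bus_val == crowd_val else "initiate"
            transitions.append((t, bus, kind))
    return sorted(transitions)


disagreements = [(5, "b1", -6.26, 53.35, 1), (16, "b2", -6.26, 53.35, 0)]
crowd = [(8, -6.26, 53.35, 0), (18, -6.26, 53.35, 0), (100, -6.26, 53.35, 1)]
transitions = noisy_transitions(disagreements, crowd, threshold=10)
```

Here the crowd contradicts bus b1 (initiating noisy) and confirms bus b2 (terminating noisy); the answer at time 100 arrives too late to count for either.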
SLIDE 9
Self-Adaptive Complex Event Processing
Discarding temporarily unreliable event sources:
busReportedCongestion(Lon, Lat) initiated iff
    move(Bus, LonB, LatB, 1) happens,
    ¬ (noisy(Bus) holds),
    close(LonB, LatB, Lon, Lat)
busReportedCongestion(Lon, Lat) terminated iff
    move(Bus, LonB, LatB, 0) happens,
    ¬ (noisy(Bus) holds),
    close(LonB, LatB, Lon, Lat)
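In procedural terms, the extra ¬ (noisy(Bus) holds) condition amounts to filtering out reports from buses that are flagged as noisy at the time of the report. A minimal Python sketch, assuming noisy periods are given as (start, end) intervals per bus (a hypothetical representation):

```python
def holds_at(intervals, t):
    """True if t falls inside one of the (start, end] intervals."""
    return any(start < t <= end for start, end in intervals)


def reliable_moves(moves, noisy_intervals):
    """Keep only the move events whose bus is not noisy when it reports.

    moves: (time, bus, lon, lat, congestion) tuples
    noisy_intervals: dict mapping bus -> list of (start, end) intervals
    """
    return [
        m for m in moves
        if not holds_at(noisy_intervals.get(m[1], []), m[0])
    ]


moves = [
    (5, "b1", -6.26, 53.35, 1),
    (12, "b1", -6.26, 53.35, 0),  # discarded: b1 is noisy during (10, 20]
    (12, "b2", -6.26, 53.35, 1),
]
kept = reliable_moves(moves, {"b1": [(10, 20)]})
```

Because the filter depends on the time of each report, a bus is ignored only while it is considered noisy and contributes again afterwards.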
SLIDE 10 Complex Event Processing in Dublin
[Figure: event recognition time (sec) against working memory size, for static and self-adaptive event recognition. Working memory sizes: 10 min = 12.5K SDE, 30 min = 40.5K SDE, 50 min = 67K SDE, 70 min = 94.5K SDE, 90 min = 124K SDE, 110 min = 152K SDE.]
SLIDE 11 Crowdsourcing
◮ Data veracity problem: Inaccurate measurements, network
failures, interference of mediators.
◮ Solution: Query human volunteers (imperfect experts) close to
the location of event source disagreement.
◮ Model the reliability of each participant
◮ Online Expectation-Maximisation.
◮ Use participant reliability to improve the aggregation of
answers.
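The sketch below shows one way participant reliability could weight the aggregation of answers and then be updated online. It is a simplified stand-in rather than the Online Expectation-Maximisation procedure used in the system: it assumes each participant answers correctly with probability p and otherwise picks a wrong label uniformly, and it updates p with a running average of the posterior. All names are hypothetical.

```python
def aggregate(answers, reliability, labels=(0, 1)):
    """Posterior over the true label, given answers and a uniform prior.

    answers: dict participant -> label
    reliability: dict participant -> probability of a correct answer
    """
    k = len(labels)
    scores = {}
    for label in labels:
        likelihood = 1.0
        for participant, answer in answers.items():
            p = reliability[participant]
            likelihood *= p if answer == label else (1.0 - p) / (k - 1)
        scores[label] = likelihood
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}


def update_reliability(answers, posterior, reliability, counts):
    """Move each reliability towards the posterior probability that the
    participant's answer was correct, using a 1 / (n + 1) step size."""
    for participant, answer in answers.items():
        counts[participant] = counts.get(participant, 0) + 1
        step = 1.0 / (counts[participant] + 1)
        reliability[participant] += step * (posterior[answer] - reliability[participant])


reliability = {"a": 0.9, "b": 0.6, "c": 0.6}
answers = {"a": 1, "b": 0, "c": 0}
posterior = aggregate(answers, reliability)
counts = {}
update_reliability(answers, posterior, reliability, counts)
```

In this example the single reliable participant outweighs the two less reliable ones, which a plain majority vote would not allow.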
SLIDE 12 Crowdsourcing
[Figure: relative estimation error of p_i against the number of queries to participant i.]
SLIDE 13
Crowdsourcing—Query Execution Engine
◮ Communicate queries to participants reliably & efficiently.
◮ MapReduce is used.
◮ Each participant registers using a mobile device.
◮ Select a list of participants based on reliability, location, etc.
◮ Disseminate the query.
◮ Aggregate answers.
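The participant selection step can be sketched as follows, assuming each registered participant is a record with a position and a reliability estimate. The field names, the 1 km radius and the cut-off `k` are assumptions for illustration; the slides do not specify the actual selection criteria beyond reliability and location.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two points."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def select_participants(participants, lon, lat, max_km=1.0, k=3):
    """Pick up to k participants within max_km, most reliable first."""
    nearby = [
        p for p in participants
        if haversine_km(p["lon"], p["lat"], lon, lat) <= max_km
    ]
    return sorted(nearby, key=lambda p: p["reliability"], reverse=True)[:k]


participants = [
    {"name": "ann", "lon": -6.2603, "lat": 53.3498, "reliability": 0.7},
    {"name": "bob", "lon": -6.2610, "lat": 53.3500, "reliability": 0.9},
    {"name": "cat", "lon": -6.4000, "lat": 53.3000, "reliability": 0.95},  # too far
]
selected = select_participants(participants, -6.2603, 53.3498, k=2)
names = [p["name"] for p in selected]
```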
SLIDE 14
Crowdsourcing—Query Execution Engine
[Figure: latency (ms) over 2G, 3G and WiFi connections, broken down into Trigger Task, Send Push Notification and Communication Time.]
SLIDE 15 Traffic Modelling
◮ Data sparsity problem: Several parts of the city are
never/infrequently monitored.
◮ Solution: Generalise observations of monitored locations to
produce estimates for locations without sensors.
◮ Scalability to city-sized areas is achieved by modelling the usual
case.
◮ Traffic network is represented with a Gaussian Process regression framework.
◮ SCATS intersections: observed traffic flow values.
◮ Variables are highly correlated if they are adjacent in the traffic network.
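To illustrate how flow at unmonitored junctions can be estimated from monitored neighbours, the sketch below conditions a Gaussian model on the observed values. It assumes a precision matrix Q = L + alpha I, where L is the graph Laplacian of the street network, so that adjacent junctions are strongly correlated; that kernel choice, the noise-free observations, the constant prior mean and every name in the code are assumptions, not details given in the slides.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def estimate_flows(nodes, edges, observed, alpha=0.01):
    """Posterior mean flow at unobserved nodes: mean - Q_uu^{-1} Q_uo r_o,
    where r_o are the observed flows minus their average."""
    mean = sum(observed.values()) / len(observed)
    neighbours = {n: set() for n in nodes}
    for u, v in edges:
        neighbours[u].add(v)
        neighbours[v].add(u)
    unobserved = [n for n in nodes if n not in observed]
    # Block of Q = L + alpha * I restricted to the unobserved nodes.
    q_uu = [
        [len(neighbours[u]) + alpha if u == v
         else (-1.0 if v in neighbours[u] else 0.0)
         for v in unobserved]
        for u in unobserved
    ]
    # -Q_uo r_o: sum of the residuals of each node's observed neighbours.
    rhs = [
        sum(observed[v] - mean for v in neighbours[u] if v in observed)
        for u in unobserved
    ]
    return {u: mean + r for u, r in zip(unobserved, solve(q_uu, rhs))}


# Four junctions in a row; only the two end junctions have sensors.
estimates = estimate_flows(
    nodes=[0, 1, 2, 3],
    edges=[(0, 1), (1, 2), (2, 3)],
    observed={0: 10.0, 3: 40.0},
)
```

With a small `alpha` the estimates for junctions 1 and 2 are close to a linear interpolation between the two sensors (about 20 and 30); a larger `alpha` pulls them towards the average observed flow.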
SLIDE 16
Traffic Modelling: Map of Dublin
SLIDE 17
Traffic Modelling: Street Network & SCATS locations
SLIDE 18
Traffic Modelling: Traffic Flow Estimates
SLIDE 19 Summary
Insight solution to Urban Traffic Management:
◮ Volume
◮ Stream processing.
◮ Variety
◮ Complex event processing.
◮ Veracity
◮ Crowdsourcing.
◮ Sparsity
◮ Traffic modelling.