
PULSE: A Real Time System for Crowd Flow Prediction at Metropolitan Subway Stations


  1. PULSE: A Real Time System for Crowd Flow Prediction at Metropolitan Subway Stations • Ermal Toto, Prof. Elke A. Rundensteiner, Prof. Yanhua Li

  2. Outline • Introduction • Challenges • State of the art • Proposed Solution • Experimental Evaluation Worcester Polytechnic Institute

  3. Urban Population Growth • 1960: 1 Billion • 2014: 3.9 Billion • Source: United Nations. (2014). World Urbanization Prospects 2014: Highlights. United Nations Publications.

  4. Public Transportation • Growing size and complexity of transportation networks • Vital to the urbanization process • Heterogeneous modes of transport • Need for better coordination • Source: Annez, P. C., & Buckley, R. M. (2009). Urbanization and growth: setting the context. Urbanization and growth, 1, 1-45.

  5. Subway Transaction Data Model • ID • Timestamp • Location • Action [Figure panels: Morning, Evening]
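The four fields of the transaction data model can be sketched as a small record type. This is only an illustration: the field types, the names `Transaction` and `Action`, and the two action values are assumptions, since the slide lists the field names and nothing more.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class Action(Enum):
    # Hypothetical values: the slide only names an "Action" field.
    ENTER = "enter"
    EXIT = "exit"


@dataclass(frozen=True)
class Transaction:
    card_id: str         # "ID" on the slide
    timestamp: datetime  # "Timestamp"
    station: str         # "Location"
    action: Action       # "Action"


# One made-up record, for illustration only.
tx = Transaction("card-001", datetime(2014, 6, 2, 8, 15), "station-A", Action.ENTER)
```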

  6. Outline • Introduction • Challenges • State of the art • Proposed Solution • Experimental Evaluation

  7. Different Traffic Patterns Different models are needed for different: • Stations. • Days of the week. • Prediction Horizons.

  8. High Dimensionality Of Features • ML prediction models perform differently depending on traffic characteristics at a location and time. • Each location needs a different model. • Each model needs streams from multiple locations. [Figure labels: Subway Stations, Bus Stations]

  9. Problem Statement • To generate high-accuracy localized predictions of human mobility, custom models are needed for each location. This process is complicated by the high dimensionality of data streams in urban transportation networks.

  10. Outline • Introduction • Challenges • State of the art • Proposed Solution • Experimental Evaluation

  11. Common Prediction Models • ARIMA: Stathopoulos, A., & Karlaftis, M. G. (2003). A multivariate state space approach for urban traffic flow modeling and prediction. Transportation Research Part C: Emerging Technologies, 11(2), 121-135. • Univariate method. • Can be made multivariate by handcrafting transfer functions that model the interaction between different variables in a regression model. • Does not scale to many nodes. [Figure: observations at t-3, t-2, t-1 and the current time t feed a prediction at t+1]

  12. Common Prediction Models • Linear Models: Sun, H., Liu, H. X., Xiao, H., He, R. R., & Ran, B. (2003, January). Short term traffic forecasting using the local linear regression model. In 82nd Annual Meeting of the Transportation Research Board, Washington, DC. • K-Nearest Neighbors (KNN): Clark, S. (2003). Traffic prediction using multivariate nonparametric regression. Journal of transportation engineering, 129(2), 161-168. • Features are selected manually, therefore models are not scalable. [Figure: Features (Local Streams, Remote Streams, Weather, Time Information, etc.) → Prediction Model → Prediction at t+i]

  13. Common Prediction Models • Random Forest: Hamner, B. (2010, December). Predicting travel times with context-dependent random forests by modeling local and aggregate traffic flow. In Data Mining Workshops (ICDMW), 2010 IEEE International Conference on (pp. 1357-1359). IEEE. • Artificial Neural Networks (ANN): Vlahogianni, E. I., Karlaftis, M. G., & Golias, J. C. (2005). Optimized and meta-optimized neural networks for short-term traffic flow prediction: a genetic approach. Transportation Research Part C: Emerging Technologies, 13(3), 211-234. • Features are selected manually, therefore models are not scalable. [Figure: Features (Local Streams, Remote Streams, Weather, Time Information, etc.) → Prediction Model → Prediction at t+i]

  14. Generalized

  15. Hand Crafted

  16. Stream Selection • Local Streams • All Streams • Hand-picked streams. • Downstream traffic to upstream locations. • No automated methods for stream selection.

  17. Outline • Introduction • Challenges • State of the art • Proposed Solution • Experimental Evaluation

  18. PULSE: Framework • Multi-layered framework. • Customized Models: ─ Temporal Localization ─ Spatial Localization ─ Prediction horizon

  19. Streaming Features • Time Interval – The operating hours are divided into 15-minute time intervals [1 – 64]. • Day of the week [1 – 7]. • Weather – Temperature and humidity during each Time Interval. • Local Station arrivals and departures during each Time Interval. • Remote Station arrivals and departures during each Time Interval.
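The two time features above can be derived from a transaction timestamp roughly as follows. This is a sketch under stated assumptions: the slide gives 64 intervals of 15 minutes (16 hours of service) but not the opening time, so `SERVICE_START_HOUR = 6` is a placeholder, and Monday = 1 is an assumed convention for the day index.

```python
from datetime import datetime

INTERVAL_MINUTES = 15
INTERVALS_PER_DAY = 64      # from the slide: intervals are numbered 1 to 64
SERVICE_START_HOUR = 6      # assumption: the opening hour is not given in the slides


def time_interval(ts: datetime) -> int:
    """Return the 1-based 15-minute interval index of ts, in [1, 64]."""
    minutes = (ts.hour - SERVICE_START_HOUR) * 60 + ts.minute
    index = minutes // INTERVAL_MINUTES + 1
    if not 1 <= index <= INTERVALS_PER_DAY:
        raise ValueError(f"{ts} is outside the assumed operating hours")
    return index


def day_of_week(ts: datetime) -> int:
    """Return the day of the week in [1, 7]; Monday = 1 is an assumed convention."""
    return ts.isoweekday()
```

With these assumptions, 06:00 falls in interval 1 and 21:59 in interval 64; timestamps outside that window raise an error instead of silently wrapping.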

  20. Personality Features • Average number of arrivals to a station per (15 min) interval. • Average time duration of trips arriving to a station (from any station).

  21. Personality Features • Peak scores capture peak traffic behaviors during mornings and evenings, for both arrivals and departures. They are defined by the number of local outliers. • Attrition Rate is the ratio of trips that are departing only (do not have a matching return trip).
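Three of the personality features can be sketched from a list of completed trips. Everything here is illustrative: the `Trip` record, the function names, and the one-day window are assumptions. In particular, a "matching return trip" is read as the same card later arriving back at the station, which is one possible interpretation of the slide. Peak scores are left out because the slides do not say how local outliers are detected.

```python
from collections import defaultdict
from typing import NamedTuple


class Trip(NamedTuple):
    card_id: str
    origin: str
    destination: str
    depart_min: int   # minutes since start of service (assumed unit)
    arrive_min: int


def avg_arrivals_per_interval(trips, station, n_intervals=64):
    """Average arrivals at `station` per 15-minute interval over one day."""
    arrivals = sum(1 for t in trips if t.destination == station)
    return arrivals / n_intervals


def avg_trip_duration(trips, station):
    """Average duration (minutes) of trips arriving at `station` from anywhere."""
    durations = [t.arrive_min - t.depart_min for t in trips if t.destination == station]
    return sum(durations) / len(durations) if durations else 0.0


def attrition_rate(trips, station):
    """Share of departures from `station` with no later arrival by the same card."""
    arrivals_by_card = defaultdict(list)
    for t in trips:
        if t.destination == station:
            arrivals_by_card[t.card_id].append(t.arrive_min)
    departures = [t for t in trips if t.origin == station]
    if not departures:
        return 0.0
    unmatched = sum(
        1 for t in departures
        if not any(a > t.depart_min for a in arrivals_by_card[t.card_id])
    )
    return unmatched / len(departures)


# Made-up trips, for illustration only.
trips = [
    Trip("a", "S1", "S2", 0, 20),
    Trip("a", "S2", "S1", 500, 520),   # card "a" returns to S1
    Trip("b", "S1", "S2", 10, 40),     # card "b" never returns
    Trip("c", "S3", "S2", 5, 35),
]
```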

  22. Stream Selection: TBSS = 4 Horizon • Time Based Stream Selection is based on the assumption that future arrivals at a station will come from departures of other stations that are within the prediction horizon.

  23. Stream Selection: TBSS =12 Horizon • Time Based Stream Selection is based on the assumption that future arrivals at a station will come from departures of other stations that are within the prediction horizon.
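The time-based selection idea can be sketched as a filter over pairwise travel times. This is not the authors' implementation: the `travel_minutes` table, the function name, and the optional cap `k` on the number of selected streams are all assumptions made for illustration. The slides do not state whether the TBSS value (4 or 12) is a horizon or a stream count.

```python
def tbss_select(travel_minutes, target, horizon_minutes, k=None):
    """Pick origin stations whose travel time to `target` fits in the horizon.

    travel_minutes maps (origin, destination) pairs to a travel time in minutes.
    Stations are returned nearest first; `k` optionally caps how many are kept.
    """
    candidates = [
        (minutes, origin)
        for (origin, dest), minutes in travel_minutes.items()
        if dest == target and origin != target and minutes <= horizon_minutes
    ]
    candidates.sort()
    selected = [origin for _, origin in candidates]
    return selected if k is None else selected[:k]


# Made-up travel times, for illustration only.
travel = {("A", "T"): 10, ("B", "T"): 25, ("C", "T"): 70, ("A", "B"): 15}
```

With a 60-minute horizon, stations A and B qualify for target T while C is too far away; shrinking the horizon to 15 minutes leaves only A.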

  24. Stream Selection: FBSS • Flow Based Stream Selection is based on the assumption that future arrivals at a target station will come from stations with high historical traffic to that station.
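A minimal sketch of flow-based selection, assuming historical trips are available as (origin, destination) pairs: count the trips that ended at the target and keep the `k` busiest origins. The function name and the parameter `k` are hypothetical; the slides do not specify how the flow ranking is computed.

```python
from collections import Counter


def fbss_select(od_pairs, target, k):
    """Return the k origin stations with the most historical trips into `target`."""
    flow = Counter(
        origin for origin, dest in od_pairs
        if dest == target and origin != target
    )
    return [station for station, _ in flow.most_common(k)]


# Made-up origin/destination history, for illustration only.
history = [("A", "T")] * 5 + [("B", "T")] * 2 + [("C", "T")] * 9 + [("A", "B")] * 20
```

Here station C sends the most traffic to T, followed by A, so `fbss_select(history, "T", 2)` keeps those two; the heavy A-to-B flow is ignored because it does not end at the target.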

  25. Stream Selection: FBSS [Figure: labels Prediction, Arrivals, Actual, Flow, Time Interval, Departures; numeric labels 268006, 261017, 262018]

  26. Stream Selection: PFDBSSMin

  27. Stream Selection

  28. Model Selection – Temporal Localization

  29. Model Selection – Temporal Localization • Personality Features are computed separately for weekdays and weekends. • Other temporal localizations exist: ─ Days and Nights ─ Holidays, Fridays, etc. • Temporal localization is treated as if it were a different spatial location. ─ A location's behavior during weekends is assumed to have no relation to its behavior during weekdays, unless otherwise described by the personality features.

  30. Model Selection – Spatial Localization • Brute Force: 118 Stations x 2 Time Periods x 6 prediction horizons x 118 TBSS Values x 118 FBSS Values x 118 PFBSS Values ~ 2.3 Billion Models • 1 to 15 seconds to train and test each model (6 seconds on average) ~ 443.7 years (for each ML method, therefore x 5). • Our system works in parallel and utilizes 30 cores, so it would only need 14.7 years :). • What about a network with 10K nodes? 126 Million Years
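The brute-force estimate can be reproduced with a few lines of arithmetic. The constants are taken from the slide; the 365-day year is an assumption. The result lands close to the slide's figures (about 2.3 billion models and on the order of 440 years serially), with small differences presumably due to rounding in the original.

```python
STATIONS = 118
TIME_PERIODS = 2             # weekdays and weekends
HORIZONS = 6
SELECTION_VALUES = 118       # per method: TBSS, FBSS, PFBSS
AVG_SECONDS_PER_MODEL = 6
CORES = 30
SECONDS_PER_YEAR = 365 * 24 * 3600   # assumption: 365-day year

# One model per combination of station, period, horizon and the three selection values.
models = STATIONS * TIME_PERIODS * HORIZONS * SELECTION_VALUES ** 3
serial_years = models * AVG_SECONDS_PER_MODEL / SECONDS_PER_YEAR
parallel_years = serial_years / CORES

print(f"{models:,} models")
print(f"{serial_years:.1f} years serially, {parallel_years:.1f} years on {CORES} cores")
```

These figures are per ML method; the slide notes that covering five methods multiplies the cost by five.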

  31. Model Selection • After an initial set of models has been discovered, future models can be quickly looked up from existing models. • Further gradient search of stream selection parameters only slightly improves the performance when using a lookup model. • Post-hoc: Using KNN to decide the ML method (classification task) gives 80% accuracy when trained with 50% of the discovered models and tested on the other 50%. [Figure: Personality Features, Horizon, Day of the Week → Classification Model Using KNN → KNN, RF, ANN, LM …]
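The lookup step can be illustrated with a small k-nearest-neighbors classifier that maps a feature vector to the ML method that worked best for similar, already discovered models. This is a sketch only: the feature layout, the scaling to [0, 1], the Euclidean distance, and `k=3` are assumptions, and the training rows are made up.

```python
import math
from collections import Counter


def knn_predict(discovered, query, k=3):
    """Majority vote over the k discovered models nearest to `query`.

    discovered: list of (feature_vector, ml_method_label) pairs.
    Features are assumed to be scaled to comparable ranges beforehand.
    """
    nearest = sorted(discovered, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical rows: (scaled arrival rate, attrition rate, scaled horizon) -> best method.
discovered = [
    ((0.90, 0.10, 0.2), "RF"),
    ((0.80, 0.20, 0.2), "RF"),
    ((0.85, 0.15, 0.4), "RF"),
    ((0.10, 0.70, 0.8), "LM"),
    ((0.20, 0.80, 0.8), "LM"),
    ((0.15, 0.75, 1.0), "LM"),
]
```

A new station whose features resemble the first group would be assigned "RF" without training any candidate models, which is the saving the lookup approach is after.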

  32. Outline • Introduction • Challenges • State of the art • Proposed Solution • Experimental Evaluation

  33. Overall Results

  34. Select Stations

  35. Future Work [Figure labels: Subway Stations, Bus Stations]

  36. Questions?
