
Big Data Analytics for extracting mobility patterns in a large urban center, ICIST 2019, Paulo Figueiras



  1. Big Data Analytics for extracting mobility patterns in a large urban center ICIST 2019 - Paulo Figueiras (paf@uninova.pt)

  2. Summary
     - Motivation
     - UNINOVA Big Data Architecture
     - Processing & Analytics
     - Performance
     - Visualization
     - Conclusions

  3. Motivation
     Public Transportation in Lisbon, Portugal
     - Independent public/private operators
     - One association (OTLIS) handles data coming from all operators
       - Ticket validations
       - Station/stop locations and information
       - Users
     - Data sharing between operators is a challenge
       - Legal/business advantage issues
       - Privacy concerns
     Analytics performed with traditional techniques
     - Data gathered through questionnaires and human observations
     - Difficulty obtaining meaningful insights with traditional data warehouse (DW) approaches

  4. Research question
     Which technologies can be used to provide useful insights about mobility patterns in large urban centers, considering large volumes of ticketing data from different operators?

  5. UNINOVA Big Data Architecture

  6. Processing & Analytics
     - Clean original data
       - Duplicates
       - Erroneous validations (e.g. consecutive entry validations at the same station)
       - Validations without location information
       - Consecutive entry and/or exit validations with less than 5 minutes between them
     - Harmonize original data into three distinct formats: Validations, Users, Locations
     - Provide semantics via GTFS mappings of locations and routes
     - Create new knowledge/insights from collected data (about connections, transfers and pendular, i.e. commuting, movements)
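
The cleaning rules on this slide can be illustrated with a minimal Python sketch. It is not the pipeline used in the presentation: the `Validation` record, its field names, and the exact reading of the rules (drop a validation when it repeats the previous validation type for the same card, either at the same station or less than 5 minutes later) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Validation:
    """Hypothetical ticket validation record (field names are assumptions)."""
    card_id: str
    station_id: Optional[str]  # None when location information is missing
    kind: str                  # "entry" or "exit"
    timestamp: datetime


# Threshold taken from the slide: less than 5 minutes between validations.
MIN_GAP = timedelta(minutes=5)


def clean_validations(records: Iterable[Validation]) -> List[Validation]:
    """Apply the cleaning rules from the slide, under the stated assumptions."""
    # Rules 1 and 3: drop exact duplicates (the frozen dataclass is hashable)
    # and validations without location information.
    unique = {r for r in records if r.station_id is not None}

    # Order by card and time so consecutive validations can be compared.
    ordered = sorted(
        unique,
        key=lambda r: (r.card_id, r.timestamp, r.kind, r.station_id),
    )

    cleaned: List[Validation] = []
    last_kept: Dict[str, Validation] = {}
    for rec in ordered:
        prev = last_kept.get(rec.card_id)
        if prev is not None and prev.kind == rec.kind:
            same_station = prev.station_id == rec.station_id
            too_close = rec.timestamp - prev.timestamp < MIN_GAP
            # Rules 2 and 4: a repeated entry (or exit) at the same station,
            # or within 5 minutes of the previous one, is treated as erroneous.
            if same_station or too_close:
                continue
        cleaned.append(rec)
        last_kept[rec.card_id] = rec
    return cleaned
```

Comparing each record against the last kept validation, rather than the last seen one, means a burst of repeated validations collapses to its first record. A production pipeline over tens of millions of records would apply the same rules in a distributed engine instead of in-memory Python.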

  7. Processing & Analytics

  8. Performance
     - Test:
       - One month of data (May 2018): more than 55 million records
     - Before:
       - Oracle Cloud with traditional DW processes
       - Only pre-processing and visualization
       - Time span: from a few days to one week
     - With UNINOVA Big Data Architecture:
       - Single node (AMD Ryzen 5 1600 with 12 logical CPUs, 32 GB RAM (Corsair Vengeance LPX), 120 GB SSD + 1 TB HDD)
       - Pre-processing + analytics
       - Time span: 4 hours (reading/writing to MongoDB on each stage, no indexes)
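
To put the reported figures in perspective, the short Python sketch below works out the implied throughput and speedup. The record count is the slide's lower bound of 55 million, and reading "a few days" as two days is an assumption, so the results are rough orders of magnitude rather than measured values.

```python
RECORDS = 55_000_000      # lower bound from the slide ("more than 55 million")
NEW_HOURS = 4             # reported time span on the single node
OLD_HOURS_LOW = 2 * 24    # assumption: "a few days" taken as two days
OLD_HOURS_HIGH = 7 * 24   # one week, as stated on the slide


def throughput_per_second(records: int, hours: float) -> float:
    """Average number of records handled per second over the whole run."""
    return records / (hours * 3600)


def speedup(old_hours: float, new_hours: float) -> float:
    """How many times shorter the new run is compared with the old one."""
    return old_hours / new_hours


print(round(throughput_per_second(RECORDS, NEW_HOURS)))  # 3819
print(speedup(OLD_HOURS_LOW, NEW_HOURS))                 # 12.0
print(speedup(OLD_HOURS_HIGH, NEW_HOURS))                # 42.0
```

The comparison is not like for like: the earlier setup covered only pre-processing and visualization, while the 4-hour run also includes analytics, so the figures above understate the difference in work done.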

  9. Visualization

  10. Conclusions
      - Novel Big Data architecture for efficiently performing processing and analytics on public transportation data
      - The architecture spans the whole life cycle of Big Data
      - Development of an unsupervised approach to collect and process data, and to produce meaningful insights
      - Compared with traditional DW processes, the architecture enables much better performance, even on a single machine
        - Lower costs (with dedicated Cloud services)
        - Better knowledge and insights
        - Possibility of an efficient in-house solution
