Big Data Analytics for extracting mobility patterns in a large urban center
ICIST 2019 - Paulo Figueiras (paf@uninova.pt)
Summary
  Motivation
  UNINOVA Big Data Architecture
  Processing & Analytics
  Performance
  Visualization
  Conclusions
Public Transportation in Lisbon, Portugal
  Independent public/private operators
  One association (OTLIS) handles data coming from all operators
    Ticket validations
    Station/stop locations and information
    Users
  Data sharing between operators is a challenge
    Legal/business advantage issues
    Privacy concerns
  Analytics performed with traditional techniques
    Data gathered through questionnaires and human observations
    Meaningful insights are hard to obtain with traditional data warehouse (DW) approaches
Which technologies can provide useful insights about mobility patterns in large urban centers, given large volumes of ticketing data from different operators?
Clean original data
  Duplicates
  Erroneous validations (e.g. consecutive entry validations at the same station)
  Validations without location information
  Consecutive entry and/or exit validations less than 5 minutes apart
Harmonize original data into three distinct formats: Validations, Users, Locations
Provide semantics via GTFS mappings of locations and routes
Create new knowledge/insights from collected data (about connections, transfers ("transhipments") and commuting ("pendular") movements)
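The cleaning rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline used in the presented architecture: the `Validation` record, its field names and the `clean` function are hypothetical, and the slides do not say which record of a conflicting pair is discarded, so the sketch keeps the earlier one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional

MIN_GAP = timedelta(minutes=5)  # the 5-minute threshold named on the slide


@dataclass(frozen=True)
class Validation:
    """Hypothetical ticket-validation record; field names are assumptions."""
    card_id: str
    station: Optional[str]  # None when the location is missing
    kind: str               # "entry" or "exit"
    timestamp: datetime


def clean(validations: List[Validation]) -> List[Validation]:
    # Rules 1 and 3: drop exact duplicates (frozen dataclasses are hashable)
    # and records that carry no location information.
    unique = {v for v in validations if v.station}

    cleaned: List[Validation] = []
    last_kept: Dict[str, Validation] = {}  # card_id -> last validation kept

    ordered = sorted(
        unique, key=lambda v: (v.card_id, v.timestamp, v.kind, v.station)
    )
    for v in ordered:
        prev = last_kept.get(v.card_id)
        if prev is not None and prev.kind == v.kind:
            same_station = prev.station == v.station
            too_close = v.timestamp - prev.timestamp < MIN_GAP
            # Rules 2 and 4: consecutive validations of the same kind at the
            # same station, or less than 5 minutes apart, are treated as
            # erroneous. Keeping the earlier record is an assumption.
            if same_station or too_close:
                continue
        cleaned.append(v)
        last_kept[v.card_id] = v
    return cleaned
```

Calling `clean(records)` returns the surviving validations ordered by card and time, which is a convenient starting point for the later harmonization step.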
Test:
  One month of data (May 2018): over 55 million records
Before:
  Oracle Cloud with traditional DW processes
  Only pre-processing and visualization
  Time span: from a few days to one week
With UNINOVA Big Data Architecture:
  Single node (AMD Ryzen 5 1600 with 12 logical CPUs, 32 GB RAM (Corsair Vengeance LPX), 120 GB SSD + 1 TB HDD)
  Pre-processing + analytics
  Time span: 4 hours (reading from and writing to MongoDB at each stage, no indexes)
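To put these figures in perspective, the short calculation below derives the average throughput and a rough speedup range. It is a back-of-the-envelope sketch: "some days" is not quantified on the slide, so the 2-day lower bound is an assumption, and the two setups differ in both hardware and scope (the earlier one covered only pre-processing and visualization).

```python
RECORDS = 55_000_000   # "+55 million records", used here as a lower bound
NEW_HOURS = 4          # single-node run reported on the slide

# "Some days" to "one week": 2 days is an assumed lower bound, 7 days the upper.
OLD_HOURS_RANGE = (2 * 24, 7 * 24)


def throughput(records: int, hours: float) -> float:
    """Average number of records processed per second."""
    return records / (hours * 3600)


def speedup_range(old_range, new_hours):
    """Ratio of old to new run time for each end of the old range."""
    return tuple(old / new_hours for old in old_range)


if __name__ == "__main__":
    print(f"{throughput(RECORDS, NEW_HOURS):.0f} records/s")
    low, high = speedup_range(OLD_HOURS_RANGE, NEW_HOURS)
    print(f"roughly {low:.0f}x to {high:.0f}x faster")
```

Under these assumptions the single node averages about 3,800 records per second, and the run is roughly 12 to 42 times shorter than the earlier process.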
Novel Big Data architecture for efficient processing and analytics
  The architecture spans the whole life cycle of Big Data
  Development of an unsupervised approach to collect and process data, and to produce meaningful insights
Compared with traditional DW processes, the architecture delivers much better performance, even on a single machine
  Lower costs (compared with dedicated Cloud services)
  Better knowledge and insights
  Possibility of an efficient in-house solution