Deutsche Bank
COO Chief Data Office
Estimating Large Scale Population Movement ML Dublin Meetup
John Doyle PhD
Assistant Vice President CDO Research & Development Science & Innovation
john.doyle@db.com https://www.db.com/ireland/
Deutsche Bank COO - Chief Data Office
Estimating Large Scale Population Movement: Presentation Outline
– Introduction: Research Motivation & Data
– Population: Density Estimates
– Mobility: Trajectories & Large Scale Movement
– Application: How to Use the Data
– Conclusions: Summary of the Research
– Approximately 1 million customers generating over 1.5 billion records
[Figure: mobile operator CDR collection server, base stations BS1 and BS2, users U1 and U2]
CDR Spatiotemporal Data Types
– Trajectory Information
– Cell Activities
– User Social / Cell Network
Sampling rate
Spatial Resolution
coverage areas
[Figure: proportion of the population visible (0 to 1) versus time of day]
Voronoi cells
[Figure: four map panels, each plotting Easting (2.5 to 3.4, x 10^5) against Northing (2.6 to 3.6, x 10^5)]
Within each 15-minute temporal window, the location estimate is the last servicing cell tower recorded for that subscriber during that period.
CDR trajectory state sequence sampling, giving the output sequence S = {S1, S1, S3, S3, S4}. Smaller yellow circles represent actual regional transitions within a sample period, and larger yellow circles represent the observed output transition sequence before resampling.
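The windowing rule above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the research: the function name `resample_last_tower`, the record layout (timestamp, tower id), and the example timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def resample_last_tower(records, start, window_minutes=15, n_windows=4):
    """Return one tower id per window: the last tower seen in that window,
    or None when the subscriber generated no record in it."""
    window = timedelta(minutes=window_minutes)
    states = [None] * n_windows
    # Process records in time order so later records overwrite earlier ones.
    for timestamp, tower in sorted(records):
        index = (timestamp - start) // window  # integer window index
        if 0 <= index < n_windows:
            states[index] = tower
    return states


# Hypothetical records for one subscriber (assumed data, for illustration only).
start = datetime(2024, 1, 1, 8, 0)
records = [
    (datetime(2024, 1, 1, 8, 2), "S1"),
    (datetime(2024, 1, 1, 8, 10), "S2"),  # transition hidden by the sampling
    (datetime(2024, 1, 1, 8, 14), "S1"),
    (datetime(2024, 1, 1, 8, 20), "S3"),
    (datetime(2024, 1, 1, 8, 50), "S4"),
]
example_states = resample_last_tower(records, start)
```

In this example the short visit to S2 falls inside the first window and is overwritten, which is the kind of unobserved transition the smaller circles in the figure describe. The third window has no record at all, so it is left as None rather than filled in.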
population metrics, which include, among others, population count, religious status, marital status and household occupancy.
infrastructure and public services.
cost of carrying out a census is prohibitively high. As a result, a census may only be carried out every 5-10 years.
information on the current status of a population.
where W is a matrix with identical rows w, and all components of w sum to 1.
probability of observing that subscriber at a region in space over a long period of time.
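The description of W matches the standard limit result for a regular Markov chain, in which repeated application of the transition matrix converges to a single stationary row vector w. The sketch below assumes that reading; the name `P` for the transition matrix, the function name, and the two-region example values are assumptions, not taken from the slides.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate the stationary row vector w of a row-stochastic matrix P
    by repeatedly applying w <- w P, starting from the uniform distribution."""
    n = len(P)
    w = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(w[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, w)) < tol:
            return nxt
        w = nxt
    return w  # best estimate if the tolerance was not reached


# Hypothetical two-region chain: rows are "from" regions, columns are "to" regions.
P = [[0.9, 0.1],
     [0.5, 0.5]]
w = stationary_distribution(P)
```

For this example the balance condition 0.1 * w[0] = 0.5 * w[1] gives w = (5/6, 1/6), so the subscriber would be expected in the first region about 83% of the time over a long period.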
where Q is the modified Markov chain, R is the number of states, J is an R x R matrix of
random transition probabilities introduced by the term J/R
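The equation itself did not survive in the text, so the following is only a sketch under assumptions: it takes J to be an all-ones matrix and mixes the original chain with the uniform term J/R using a weight `alpha`, as in PageRank-style smoothing. The symbol `alpha`, its value, and the function name are hypothetical.

```python
def smooth_chain(P, alpha=0.15):
    """Return Q = (1 - alpha) * P + alpha * J / R, where J is assumed to be
    an R x R matrix of ones, so every transition gets probability >= alpha / R."""
    R = len(P)
    return [
        [(1.0 - alpha) * P[i][j] + alpha / R for j in range(R)]
        for i in range(R)
    ]


# Hypothetical chain with an absorbing second state (assumed data).
P_absorbing = [[0.0, 1.0],
               [0.0, 1.0]]
Q = smooth_chain(P_absorbing)
```

Under these assumptions every entry of Q is strictly positive and each row still sums to 1, so the modified chain has no absorbing or unreachable states.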
data and maximum weighting approach is approximately 98.4%.
data and aggregated approach is approximately 97.7%.
restricted by its ability to measure population proportions in different areas but not to estimate counts. The effectiveness of such techniques for inferring census-type data needs further research and is the subject of future work.