Demography, meet Big Data; Big Data, meet Demography: Reflections on the Data-Rich Future of Population Science


UN EGM on Strengthening the Demographic Evidence Base For The Post-2015 Development Agenda, New York, 5-6 October 2015 Demography, meet Big Data; Big Data, meet Demography: Reflections on the Data-Rich Future of Population Science Emmanuel


SLIDE 1

Demography, meet Big Data; Big Data, meet Demography:

Reflections on the Data-Rich Future of Population Science

Emmanuel Letouzé

Director & Co-Founder, Data-Pop Alliance

UNITED NATIONS EXPERT GROUP MEETING ON STRENGTHENING THE DEMOGRAPHIC EVIDENCE BASE FOR THE POST-2015 DEVELOPMENT AGENDA

Session on ‘Complementing traditional data sources with alternative acquisition, analytic and visualization approaches to ensure better utilization of data for sustainable development’

United Nations HQ, New York | October 6, 2015

UN EGM on Strengthening the Demographic Evidence Base For The Post-2015 Development Agenda, New York, 5-6 October 2015 Session 5. Complementing traditional data sources with alternative acquisition, analytic and visualization approaches: Emmanuel Letouzé (Data-Pop Alliance) – New data sources for population sciences 1

SLIDE 2

1—What are we talking about? 2—What has been done? 3—What could / should be done?

SLIDE 3

1—What are we talking about? 2—What has been done? 3—What could / should be done?

SLIDE 4

1—What are we talking about? 2—What has been done? 3—What could / should be done?

SLIDE 5

2009

SLIDE 6

Demography is [finally] entering the ‘Data Revolution’ conversation

SLIDE 7

Source: Martinho and Letouzé (2015)

SLIDE 8

Demography (from the Greek demo, for people, and graphy, for writing, or analysis, or field of study) is “the study of changes (such as the number of births, deaths, marriages, and illnesses) that occur over a period of time in human populations”, or just “science of population”

  • 1. http://www.merriam-webster.com/dictionary/demography
  • 2. p://www.demogr.mpg.de/En/education_career/what_is_demography_1908/default.htm

First things first…what is demography?

SLIDE 9

SLIDE 10

This is demography

SLIDE 11

This is demography

SLIDE 12

This is demography

SLIDE 13

Demography as / is a (vibrant, complex) discipline

Survey, official stats, administrative data..

Demographic tools, methods, principles…

Demography community

Characterize / measure population dynamics (life tables, indirect estimates,..)

Understand / affect population dynamics (hypothesis, instrumental variables,.. to influence policies)
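
The life table is one of the measurement tools named on this slide. As a rough illustration only, the sketch below computes life expectancy at birth from age-specific death rates. The function name, the interval widths and the death rates are all hypothetical, and the mid-interval assumption for deaths is a textbook simplification, not something stated in the presentation.

```python
def life_expectancy_at_birth(widths, rates):
    """Period life expectancy at birth from age-specific death rates.

    widths: length in years of each closed age interval, in order
    rates:  death rate per person-year for each closed interval,
            followed by one extra rate for the final open-ended interval
    """
    survivors = 1.0      # l_x with a radix of 1
    person_years = 0.0   # running sum of L_x
    # zip stops at the shorter list, so the last (open-interval) rate is skipped here
    for n, m in zip(widths, rates):
        # Convert the rate to a probability of dying in the interval,
        # assuming deaths occur on average at mid-interval (simplification).
        q = n * m / (1.0 + 0.5 * n * m)
        deaths = survivors * q
        person_years += n * (survivors - deaths) + 0.5 * n * deaths
        survivors -= deaths
    # Open-ended last interval: everyone left dies, so L = l / m.
    person_years += survivors / rates[-1]
    return person_years


# Hypothetical schedule: ages 0-1, 1-5, 5-15, 15-65, then 65+.
example_widths = [1, 4, 10, 50]
example_rates = [0.03, 0.002, 0.001, 0.004, 0.1]
e0 = life_expectancy_at_birth(example_widths, example_rates)
```

With no closed intervals and a single constant death rate, the function reduces to the reciprocal of that rate, which is a quick sanity check on the logic.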

SLIDE 14

What is Big Data?

From the 3 Vs of big data…

Lots of it..

From many sources

High frequency == fine (low-level) temporal & geographical granularity

SLIDE 15
…to the 3 Cs of Big Data as an ecosystem

Crumbs
  • 1. Exhaust data
  • 2. Web content
  • 3. Sensing data

Capacities
  • 1. Soft & hardware
  • 2. Methods & tools
  • 3. Institutional & human

Community
  • 1. Organizations
  • 2. Objectives
  • 3. Outputs/venues..

SLIDE 16
…to the 3 Cs of Big Data as an ecosystem

Crumbs Capacities Community

Crumbs
  • 1. Exhaust data
  • 2. Web content
  • 3. Sensing data

As data, big data is not primarily about size; it is: “the digital translation of human behaviors and beliefs passively emitted and/or picked up by digital devices”

SLIDE 17

…to the 3 Cs of Big Data as an ecosystem

Crumbs Capacities Community

Big data are “non-sampled data, characterized by the creation of databases from electronic sources whose primary purpose is something other than statistical inference.”

Michael Horrigan, US BLS

SLIDE 18

Source: McKinsey Global Institute

SLIDE 19

The new data ecosystem

[Figure labels: UN, World Bank, Telefónica, Orange, Vodafone, MIT, Harvard]

Source: McKinsey Global Institute

SLIDE 20
  • 1. Exhaust data

SLIDE 21
  • 1. Exhaust data: the example of CDRs

Call Detail Records (CDRs) are metadata (data about data) that capture subscribers’ use of their cell-phones, including an identification code and, at a minimum, the location of the phone tower that routed the call for both caller and receiver, and the time and duration of the call. Large operators collect over six billion CDRs per day.

Source: http://www.unglobalpulse.org/Mobile_Phone_Network_Data-for-Dev

Note: these are structured data (answers actively sought by the collector), but the emitter emits them passively, i.e. as a by-product, typically without full knowledge / informed consent / choice.
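
To make the CDR description concrete, here is a minimal sketch of what one record might look like as a data structure, together with a simple per-tower aggregation. The field names, tower identifiers and sample values are hypothetical, since operators' actual schemas differ and are not given in the slides.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CallDetailRecord:
    """One CDR: metadata about a call, not its content (hypothetical schema)."""
    caller_id: str        # pseudonymised subscriber identification code
    receiver_id: str
    caller_tower: str     # tower that routed the call on the caller side
    receiver_tower: str   # tower that routed the call on the receiver side
    start: datetime       # time of the call
    duration_s: int       # duration of the call, in seconds


def calls_per_tower(records):
    """Count outgoing calls routed by each caller-side tower."""
    return Counter(r.caller_tower for r in records)


# Three made-up records for illustration.
records = [
    CallDetailRecord("a1", "b2", "T001", "T007", datetime(2015, 10, 6, 9, 0), 120),
    CallDetailRecord("a1", "c3", "T001", "T002", datetime(2015, 10, 6, 9, 30), 45),
    CallDetailRecord("b2", "a1", "T007", "T001", datetime(2015, 10, 6, 18, 5), 300),
]
tower_counts = calls_per_tower(records)
```

Aggregating to the tower level, as here, is also one common way to reduce the re-identification risk that comes with individual records.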

SLIDE 22

Category 1: Exhaust data

Types | Examples | Opportunities
Mobile-based | Call Details Records (CDRs); GPS (Fleet tracking, Bus AVL) | Estimate population distribution and socioeconomic status in places as diverse as the U.K. and Rwanda
Financial transactions | Electronic ID; E-licenses (e.g. insurance); Transportation cards (including airplane fidelity cards); Credit/debit cards | Provide critical information on population movements and behavioural response after a disaster
Transportation | GPS (Fleet tracking, Bus AVL); EZ passes | Provide early assessment of damage caused by hurricanes and earthquakes
Online traces | Cookies; IP addresses | Mitigate impacts of infectious diseases through more timely monitoring using access logs from the online encyclopedia Wikipedia

  • 2. Web content
  • 3. Sensing data

SLIDE 23

…to the 3 Cs of Big Data as an ecosystem

Capacities Community Crumbs

  • Machine learning
  • Statistical machine learning
  • New measures / concepts, e.g. radius of gyration, entropy….
  • Visualizations…

Capacities
  • 1. Soft & hardware
  • 2. Methods & tools
  • 3. Institutional & human
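
Radius of gyration and entropy are named here as new measures. The sketch below shows one common way to compute each for a single phone user, under stated assumptions: positions are planar (x, y) coordinates in km rather than tower latitude/longitude, entropy is the plain Shannon entropy of visit shares, and the sample trajectory is invented.

```python
import math
from collections import Counter


def radius_of_gyration(points):
    """Root-mean-square distance of observed positions from their centre of mass.

    points: list of (x, y) positions in km, one per observation
    (assumption: planar coordinates, not latitude/longitude).
    """
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / n)


def visit_entropy(locations):
    """Shannon entropy, in bits, of the share of observations at each location."""
    counts = Counter(locations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical user observed at two towers 6 km apart, twice at each.
positions = [(0.0, 0.0), (0.0, 0.0), (6.0, 0.0), (6.0, 0.0)]
rg = radius_of_gyration(positions)   # 3.0 km: each point is 3 km from the centre
h = visit_entropy(positions)         # 1.0 bit: two equally visited locations
```

A larger radius of gyration indicates a user who ranges further from their usual centre, while higher entropy indicates visits spread more evenly over more places.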

SLIDE 24

Example: Dynamic Population Mapping Using Mobile Phone Data

France and Portugal (2014)

Source: Deville, Linard et al (2014), PNAS, vol. 111 no. 45. http://www.pnas.org/content/111/45/15888.abstract
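
One generic way to turn phone activity into population estimates is to fit a power-law relation between phone-user density and census density in log-log space, then apply it where census data are stale or missing. The sketch below illustrates that idea only; whether it matches the exact specification used in the cited paper should be checked against the paper. The function names and the synthetic data (generated with alpha = 2 and beta = 0.8) are assumptions for illustration.

```python
import math


def fit_power_law(sigma, rho):
    """Least-squares fit of log(rho) = log(alpha) + beta * log(sigma)."""
    xs = [math.log(s) for s in sigma]
    ys = [math.log(r) for r in rho]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    beta = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    alpha = math.exp(my - beta * mx)
    return alpha, beta


def predict_density(alpha, beta, sigma):
    """Estimated population density for a given phone-user density."""
    return alpha * sigma ** beta


# Synthetic training areas (not real data): user density per km^2 and a
# "census" density generated from a known power law so the fit can be checked.
user_density = [10.0, 100.0, 1000.0]
census_density = [2.0 * s ** 0.8 for s in user_density]
alpha, beta = fit_power_law(user_density, census_density)
estimate = predict_density(alpha, beta, 500.0)
```

Fitting in log space keeps very dense urban areas from dominating the regression, which is the usual reason for this choice.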

SLIDE 25

Example: Tea Party vs. Occupy Wall Street on twitter

USA (2011)

Not demography? What if pro-life vs. pro-choice?

SLIDE 26

…to the 3 Cs of Big Data as an ecosystem

Crumbs Capacities Community

Big Data communities: → What for, with and by whom?

  • Machine learning
  • Statistical machine learning
  • New measures / concepts == radius of gyration, entropy….
  • Visualizations…

SLIDE 27

…to the 3 Cs of Big Data as an ecosystem

Crumbs Capacities Community

Big Data community: → What for, with and by whom?

Big Data can have four main roles or functions:

1. Descriptive (e.g. maps);
2. Predictive, includes what has been called ‘now-casting’ or inference as well as forecasting
3. Prescriptive (or diagnostic), by establishing causal relations (=
4. Discursive (or engagement), concerns spurring and shaping dialogue within and between communities == “datadvocacy”, like DHS?

SLIDE 28

Demography meets Big Data, Big Data meets demography

[Diagram labels: Crumbs; Capacities; Community; Tools, methods, principles…; Community; Capacities; Survey, official stats, administrative data..]

SLIDE 29

Demography meets Big Data, Big Data meets demography

[Diagram labels: Crumbs; Capacities; Community; Data; Tools, methods, principles…; Community; Capacities]

SLIDE 30

1—What are we talking about? 2—What has been done? (a few cases) 3—What could / should be done?

SLIDE 31

Predicting Population Density from Cell-Phone Activity

Senegal (2015)

SLIDE 32

Post-Earthquake Population Movement

Nepal (2015)

SLIDE 33

Predicting Crime Hotspots from Cell Phone Data

London (2013-14)

  • Binary classification task using Random Forest

Source: Moves on the Streets: Predicting Crime Hotspots Using Aggregated Anonymized Data on People Dynamics. Bogomolov A., Lepri B., Staiano J., Letouzé E., Oliver N., Pentland A., Pianesi F.

SLIDE 34

Risk Sharing in Natural Disasters through “Mobile Money”

Rwanda (2010)

Source: Blumenstock J., Eagle N., Fafchamps M., Risk Sharing and Mobile Phones: Evidence in the Aftermath of Natural Disasters, 2014

SLIDE 35

1—What are we talking about? 2—What has been done? 3—What could / should be done? (to build the data-rich future of population science)?

SLIDE 36

Learning.

SLIDE 37

Source: Lazer, David, Ryan Kennedy, Gary King, and Alessandro Vespignani. 2014. The Parable of Google Flu: Traps in Big Data Analysis. Science 343, no. 14 March: 1203-1205.

1—Learn from past mistakes (evolution)

SLIDE 38

Then: blending of hypothesis-based and supervised machine learning methods to model bias

2—From old recipes (using new ingredients)

Modeling and correcting sample bias in non-sampled data

Zagheni & Weber, 2012

Letouzé, Zagheni & al, Weber, 2015
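
As a hedged illustration of what correcting sample bias in non-sampled data can mean, the sketch below applies plain post-stratification: each group of users is reweighted so that its share matches its known population share. This is the simplest textbook correction, swapped in for illustration. It does not reproduce the method of the works cited on this slide, and the group names, counts and rates are invented.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / share in the observed data."""
    total = sum(sample_counts.values())
    return {g: population_shares[g] / (sample_counts[g] / total) for g in sample_counts}


def weighted_mean(values_by_group, sample_counts, weights):
    """Mean of a group-level quantity after reweighting each group's users."""
    num = sum(weights[g] * sample_counts[g] * values_by_group[g] for g in sample_counts)
    den = sum(weights[g] * sample_counts[g] for g in sample_counts)
    return num / den


# Hypothetical platform data: young users are over-represented.
users = {"young": 800, "old": 200}
population = {"young": 0.5, "old": 0.5}       # e.g. shares from a census
mobility_rate = {"young": 0.10, "old": 0.02}  # rate observed within each group

weights = poststratification_weights(users, population)
raw = sum(users[g] * mobility_rate[g] for g in users) / sum(users.values())
corrected = weighted_mean(mobility_rate, users, weights)
```

In this made-up example the raw rate is 0.084 and the corrected rate is 0.06. The correction only works for characteristics that are observed in both the platform data and the reference population, which is where demographic data sources remain essential.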

SLIDE 39

3—From and for others

https://c1.staticflickr.com/9/8570/16070333273_5139661ba4.jpg

SLIDE 40

Modeling and projecting world population of data?

[Figure labels: 1; 1000000000..; 50 years; 90 days; Circa 2000; Circa 2050; Data not yet known]

SLIDE 41

4—From good quotes

SLIDE 42

5—From innovation and imagination

SLIDE 43

Conclusions:

1. Demography has been slow to enter the data revolution, but it is well positioned to catch up fast and has a lot to contribute

2. What is at play and at stake is not just new kinds of data; an entirely new ecosystem of data, tools and actors is emerging

3. Demography should and probably will reinvent itself by becoming population science, a science that should strive to both measure and understand, in order to positively affect, old and new population processes, based on old, sound principles

4. This should happen gradually in the next 15 years, but it will require significant efforts and investments to build new mindsets, systems and capacities

SLIDE 44

Thank you

eletouze@datapopalliance.org
