Emergent Geospatial Data and Measurement Issues Michael F. - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Emergent Geospatial Data and Measurement Issues Michael F. Goodchild University of California Santa Barbara

slide-2
SLIDE 2

New data sources: VGI

  • Volunteered and therefore free
  • Abundant
  • Timely

– time-critical community mapping

  • Multidimensional

– if people can map anything, what do they choose to map?

  • graffiti, potholes, shortcuts, cemeteries
  • No guarantees

– metadata, data quality
– three approaches

slide-3
SLIDE 3

http://www.directrelief.org/Flash/HaitiShipments/Index.html

slide-4
SLIDE 4

Crandall et al. 2009. Mapping the world’s photos. http://www.cs.cornell.edu/~crandall/papers/mapping09www.pdf

slide-5
SLIDE 5

Density of geo-located tweets in Los Angeles, Jan 1 to Feb 25, 2011

slide-6
SLIDE 6

The crowd solution

  • Linus’s Law

– the more eyes to review, the more accurate
– works for popular facts
– in emergencies confidence is based on the number of identical reports

  • Geographic facts may be obscure

– little-known areas of the world

  • or not so obscure

– in emergencies a single report may be crucial
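The idea of basing confidence on the number of identical reports can be sketched as follows. This is only an illustration: the function names, the rounding precision, the threshold, and the sample reports are all assumptions made up for the example, not anything shown on the slide.

```python
from collections import Counter

def corroboration_counts(reports, precision=3):
    """Count reports that agree on the same (rounded lat, rounded lon, category).

    Rounding to 3 decimal places groups points within roughly 100 m in
    latitude; this grid is an assumption and can split nearby reports
    that fall on either side of a cell boundary.
    """
    keys = [
        (round(lat, precision), round(lon, precision), category)
        for lat, lon, category in reports
    ]
    return Counter(keys)

def confirmed(reports, min_reports=2, precision=3):
    """Return the report keys backed by at least min_reports matching reports."""
    counts = corroboration_counts(reports, precision)
    return {key for key, n in counts.items() if n >= min_reports}

# Hypothetical reports: (latitude, longitude, category)
reports = [
    (18.5392, -72.3364, "collapsed building"),
    (18.5391, -72.3363, "collapsed building"),
    (18.5940, -72.3070, "blocked road"),
]

crowd_confirmed = confirmed(reports)
```

Here the two matching reports are confirmed and the single "blocked road" report is not, which is exactly the weakness noted above: a lone report may still be the crucial one.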

slide-7
SLIDE 7

The social solution

  • Who can be trusted?
  • A hierarchy of moderators and gate-keepers

– all volunteered facts referred up the hierarchy

  • A social structure

– promotion based on track record
– heavy, accurate contributors promoted
– e.g., Wikipedia, OSM
– top levels of Google MapMaker reserved for Google staff

slide-8
SLIDE 8

The geographic solution

  • How can we know if a purported geographic fact is false?

– because it violates the rules by which the geographic world is constructed
– the syntactic rules
– compare language rules, the sentence structure of English

  • What are those rules?

– essential, fundamental geographic knowledge

slide-9
SLIDE 9

Some sample rules

  • Tobler’s First Law

– “…but nearby things are more similar than distant things”
– horizontal context
– a geographic fact should be consistent with its surroundings

  • “All things are related…”

– vertical context
– a geographic fact should be consistent with other things that are known about that location
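One way to turn the horizontal-context rule into a check is to compare a reported value with an estimate built from its surroundings. The sketch below uses inverse-distance weighting, which is a swapped-in technique chosen for illustration and not a method named on the slide; the function names, the tolerance, and the sample elevations are assumptions.

```python
import math

def idw_estimate(target, neighbors, power=2.0):
    """Inverse-distance-weighted mean of neighbor values at target (x, y).

    neighbors is a list of (x, y, value) tuples in the same planar units.
    """
    if not neighbors:
        raise ValueError("at least one neighbor is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in neighbors:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0:
            return float(value)  # a neighbor sits exactly on the target
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total

def is_consistent(reported_value, target, neighbors, tolerance):
    """True if the reported value lies within tolerance of the neighborhood estimate."""
    return abs(reported_value - idw_estimate(target, neighbors)) <= tolerance

# Hypothetical elevations (metres) at four points around the origin
neighbors = [(0, 1, 100), (1, 0, 104), (0, -1, 98), (-1, 0, 102)]

plausible = is_consistent(103, (0, 0), neighbors, tolerance=10)
implausible = is_consistent(900, (0, 0), neighbors, tolerance=10)
```

A failed check does not prove the report false; it only flags a fact that disagrees with its surroundings and may deserve review.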

slide-10
SLIDE 10

Census issues

  • Traditionally the primary source of data for spatial demography
  • The American Community Survey

– replacement for the Long Form
– a Republican target
– a rolling monthly sample

  • 1-year, 3-year, 5-year estimates

– sacrificing spatial detail for temporal detail

  • For spatial demography?

– good for coarse analysis of rapid change
– poor for detailed analysis

slide-11
SLIDE 11

Administrative data

  • Tax returns, social programs, local government records
  • In some countries a replacement for the traditional census
  • Little progress in the US

– lack of coordination between agencies and levels of government
slide-12
SLIDE 12

Private-sector data

  • Google, Facebook, etc.
  • Vast amounts of social data of potential relevance to social demography

– no regular sampling, no quality control
– “soft” data
– but soft data has value in science

  • exploratory research
  • hypothesis generation
  • In-house research

– Facebook’s analyses of network linkages
– 4.74 degrees of separation (New York Times 21 Nov 2011)

slide-13
SLIDE 13

Privacy and confidentiality

  • Many data types of great interest to spatial demography are off limits to researchers

– tracks of individuals
– administrative records
– detailed census records

  • The Census Data Center solution

– requires physical presence

  • The virtual Census Data Center

– a firewall preventing unacceptable queries
– many unresolved technical issues
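The slide does not say which queries would count as unacceptable. As one hypothetical rule such a firewall might enforce, the sketch below suppresses any count smaller than a minimum cell size. The function name, the threshold of 10, and the sample records are assumptions for illustration only.

```python
def guarded_count(records, predicate, min_cell_size=10):
    """Return the number of matching records, or None if the count is too small.

    Suppressing small cells is one common disclosure-control rule; the
    threshold used here is an arbitrary assumption.
    """
    count = sum(1 for record in records if predicate(record))
    if count < min_cell_size:
        return None  # suppressed: too few individuals to report safely
    return count

# Hypothetical microdata: one dict per person
records = (
    [{"tract": "T1", "age": 30}] * 25
    + [{"tract": "T2", "age": 82}] * 3
)

tract_1_total = guarded_count(records, lambda r: r["tract"] == "T1")
tract_2_total = guarded_count(records, lambda r: r["tract"] == "T2")
```

A threshold on single queries is not sufficient on its own, because two permitted queries can be differenced to recover a small cell. That kind of gap is one example of the unresolved technical issues.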

slide-14
SLIDE 14

Reporting-zone geometry

  • Data must be aggregated to protect confidentiality
  • Reporting zones change through time
  • Reporting zones may not meet the needs of specific projects
  • Adopting standard reporting zones leads to distortion

– e.g., defining an individual’s neighborhood by the containing census tract

slide-15
SLIDE 15
slide-16
SLIDE 16
slide-17
SLIDE 17
slide-18
SLIDE 18

Possible approaches

  • Re-aggregation of smaller zones
  • Make available all reporting-zone geometries

– NHGIS (National Historical Geographic Information System)

  • all historic Census geometries

– SABINS

  • all school catchment areas by grade
  • Areal interpolation
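Re-aggregation of smaller zones amounts to summing the counts of small units into whatever larger zones a project needs. A minimal sketch, assuming each small zone falls wholly inside one project zone; the zone identifiers and populations are made up for the example.

```python
from collections import defaultdict

def reaggregate(small_zone_counts, assignment):
    """Sum counts of small zones into the larger zones they are assigned to.

    small_zone_counts maps small-zone id to count; assignment maps
    small-zone id to the id of the larger zone that contains it.
    """
    totals = defaultdict(int)
    for zone_id, count in small_zone_counts.items():
        totals[assignment[zone_id]] += count
    return dict(totals)

# Hypothetical block populations and a project-specific grouping
block_population = {"b1": 120, "b2": 80, "b3": 200, "b4": 50}
block_to_neighborhood = {"b1": "north", "b2": "north", "b3": "south", "b4": "south"}

neighborhood_population = reaggregate(block_population, block_to_neighborhood)
```

When the small zones do not nest inside the zones of interest, this approach no longer applies and some form of areal interpolation is needed instead.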
slide-19
SLIDE 19

1 target zone, 4 source zones (A, B, C, D)

Share of each source zone falling inside the target zone: 10% of A, 15% of B, 5% of C, 50% of D

Pop_TARGET = 0.10 Pop_A + 0.15 Pop_B + 0.05 Pop_C + 0.50 Pop_D
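The weighted sum on this slide can be sketched in code. The overlap fractions are the ones given on the slide. The source-zone populations are made up, and the function name is an assumption. Area weighting of this kind assumes population is spread evenly within each source zone.

```python
def areal_interpolate(source_populations, overlap_fractions):
    """Estimate a target-zone population by area weighting.

    overlap_fractions maps each source zone to the share of that zone
    lying inside the target zone.
    """
    return sum(
        fraction * source_populations[zone]
        for zone, fraction in overlap_fractions.items()
    )

# Fractions from the slide: 10% of A, 15% of B, 5% of C, 50% of D
overlap_fractions = {"A": 0.10, "B": 0.15, "C": 0.05, "D": 0.50}

# Hypothetical source-zone populations
source_populations = {"A": 1000, "B": 2000, "C": 4000, "D": 600}

target_population = areal_interpolate(source_populations, overlap_fractions)
```

With these invented populations the four contributions are 100, 300, 200 and 300, so the estimate is about 900. The estimate is only as good as the even-distribution assumption within each source zone.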

slide-20
SLIDE 20
slide-21
SLIDE 21

Concluding points

  • A very dynamic area

– many new data sources
– powerful new technologies
– the modern era of taxpayer-financed, rigorously controlled data sets is clearly losing ground
– a post-modern era of disparate data sets is emerging
– we do not yet understand the implications

  • quality control, synthesis
  • what new kinds of social science are enabled

– some important issues for discussion