historical map polygon and feature extractor (mauricio giraldo)



slide-1
SLIDE 1

historical map polygon and feature extractor

mauricio giraldo arteaga NYPL Labs @mgiraldo NYGeoCon 2013

slide-2
SLIDE 2

background

slide-3
SLIDE 3
slide-4
SLIDE 4
slide-5
SLIDE 5
slide-6
SLIDE 6
slide-7
SLIDE 7
slide-8
SLIDE 8
slide-9
SLIDE 9
slide-10
SLIDE 10
slide-11
SLIDE 11
slide-12
SLIDE 12
slide-13
SLIDE 13
slide-14
SLIDE 14
slide-15
SLIDE 15

~120k polygons produced in three years by staff and volunteers

(NYPL ♥ volunteers)

slide-16
SLIDE 16

building =

slide-17
SLIDE 17

building =

not paper-colored

slide-18
SLIDE 18

building =

not paper-colored
completely enclosed by black lines

slide-19
SLIDE 19

building =

not paper-colored
completely enclosed by black lines
dashed lines are not walls

slide-20
SLIDE 20

building =

not paper-colored
completely enclosed by black lines
dashed lines are not walls
> 20 m² (~215 ft²)

slide-21
SLIDE 21

building =

not paper-colored
completely enclosed by black lines
dashed lines are not walls
> 20 m² (~215 ft²)
< 3,000 m² (~32,000 ft²)

slide-22
SLIDE 22

building =

not paper-colored
completely enclosed by black lines
dashed lines are not walls
> 20 m² (~215 ft²)
< 3,000 m² (~32,000 ft²)
+ attributes (color, dots, crosses...)
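
As a rough illustration of how two of the criteria above could be checked in code, here is a minimal Python sketch. It covers only the area limits (20 and 3,000 square meters, from the slides) and the "not paper-colored" test. The names `polygon_area`, `looks_like_building`, `paper_rgb` and `tolerance`, and the color values themselves, are assumptions made for this example and are not taken from the talk or from map-vectorizer. The enclosure and dashed-line rules are not modeled.

```python
MIN_AREA_M2 = 20.0     # lower area limit from the slides
MAX_AREA_M2 = 3000.0   # upper area limit from the slides


def polygon_area(ring):
    """Shoelace area of a simple polygon given as a list of (x, y) in meters."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def looks_like_building(ring, mean_rgb, paper_rgb=(236, 226, 198), tolerance=30):
    """Apply the area limits and a crude 'not paper-colored' test.

    paper_rgb and tolerance are hypothetical values chosen for illustration.
    """
    area = polygon_area(ring)
    if not (MIN_AREA_M2 < area < MAX_AREA_M2):
        return False
    # Largest per-channel difference between the polygon color and the paper color.
    difference = max(abs(a - b) for a, b in zip(mean_rgb, paper_rgb))
    return difference > tolerance
```

For example, a 10 m by 10 m pink polygon passes, while the same polygon filled with the assumed paper color, or a 2 m by 2 m one, does not.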

slide-23
SLIDE 23

process

slide-24
SLIDE 24
slide-25
SLIDE 25

https://github.com/NYPL/map-vectorizer

try it!

slide-26
SLIDE 26

gdal_polygonize.py

generates polygons automagically!

slide-27
SLIDE 27
slide-28
SLIDE 28

$ gdal_polygonize.py test.tif -f "ESRI Shapefile" test.shp test

slide-29
SLIDE 29

$ gdal_polygonize.py test.tif -f "ESRI Shapefile" test.shp test
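
To make clear what the command above does conceptually: gdal_polygonize.py builds one polygon for each connected region of pixels that share the same value. The sketch below is not GDAL code. It is a small pure-Python stand-in (the function name `label_regions` is made up for this example) that groups 4-connected pixels of equal value, assuming 4-connectedness is the mode being used.

```python
from collections import deque


def label_regions(grid):
    """Group 4-connected pixels that share a value.

    Returns a list of (value, set_of_(row, col)) in scan order.
    """
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            value = grid[r][c]
            cells = set()
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                cells.add((y, x))
                # Visit the four direct neighbours (no diagonals).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append((value, cells))
    return regions
```

Every distinct speck of color becomes its own region, so a noisy scan yields far more polygons than there are buildings.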

slide-30
SLIDE 30

gdal_polygonize.py

generates polygons automagically! (not really)

slide-31
SLIDE 31

we need to optimize the input

slide-32
SLIDE 32

differences in resampling

cubic vs. nearest neighbor

slide-33
SLIDE 33

differences in resampling

cubic vs. nearest neighbor
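
One way to see why the resampling method matters for polygonizing: nearest neighbor only reuses pixel values that already exist, while cubic interpolation invents intermediate (and overshooting) values along edges. The sketch below shows this on a single row of pixels. It uses a Catmull-Rom cubic as a stand-in, so the exact kernel may differ from the one used in the talk, and both function names are made up for this example.

```python
def upsample_nearest(values, factor):
    """Integer-factor nearest neighbor upsampling: repeat each sample."""
    return [v for v in values for _ in range(factor)]


def catmull_rom(p0, p1, p2, p3, t):
    """Catmull-Rom cubic between p1 and p2 for t in [0, 1]."""
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t * t
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3
    )


def upsample_cubic(values, factor):
    """Cubic upsampling of a 1-D row, clamping neighbours at both ends."""
    n = len(values)
    out = []
    for i in range((n - 1) * factor + 1):
        pos = i / factor
        k = min(int(pos), n - 2)   # index of the sample left of pos
        t = pos - k
        p0 = values[max(k - 1, 0)]
        p3 = values[min(k + 2, n - 1)]
        out.append(catmull_rom(p0, values[k], values[k + 1], p3, t))
    return out


row = [0, 0, 255, 255]             # a hard edge between two colors
nearest = upsample_nearest(row, 2)
cubic = upsample_cubic(row, 2)
```

Here `nearest` contains only 0 and 255, while `cubic` also contains values such as 127.5 at the edge, which a polygonizer would treat as extra regions.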

slide-34
SLIDE 34
slide-35
SLIDE 35
slide-36
SLIDE 36
slide-37
SLIDE 37
slide-38
SLIDE 38
slide-39
SLIDE 39

we need to simplify the output

(for those polygons that we care about)

slide-40
SLIDE 40
slide-41
SLIDE 41
slide-42
SLIDE 42

pts = spsample(polygon, n=1000, type="hexagonal")

slide-43
SLIDE 43

pts = spsample(polygon, n=1000, type="hexagonal")
pts = spsample(polygon, n=1000, type="regular")

slide-44
SLIDE 44

pts = spsample(polygon, n=1000, type="hexagonal")
pts = spsample(polygon, n=1000, type="regular")
pts = spsample(polygon, n=1000, type="random")
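
The three sampling types above fill the polygon with points in different patterns. The following Python sketch imitates the idea only. It is not a port of `spsample`: it takes a point spacing instead of a target count for the grid patterns, and the names `contains`, `sample_grid` and `sample_random` are assumptions for this example.

```python
import math
import random


def contains(polygon, x, y):
    """Ray-casting point-in-polygon test for a list of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(polygon, polygon[1:] + polygon[:1]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def bounds(polygon):
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def sample_grid(polygon, spacing, hexagonal=False):
    """Regular grid, or a hexagonal one with shifted, tighter rows."""
    xmin, ymin, xmax, ymax = bounds(polygon)
    row_step = spacing * math.sqrt(3) / 2 if hexagonal else spacing
    points, row = [], 0
    y = ymin + row_step / 2
    while y < ymax:
        shift = spacing / 2 if (hexagonal and row % 2) else 0.0
        x = xmin + spacing / 2 + shift
        while x < xmax:
            if contains(polygon, x, y):
                points.append((x, y))
            x += spacing
        y += row_step
        row += 1
    return points


def sample_random(polygon, n, seed=0):
    """Uniform random points by rejection sampling in the bounding box."""
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = bounds(polygon)
    points = []
    while len(points) < n:
        x, y = rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)
        if contains(polygon, x, y):
            points.append((x, y))
    return points
```

For the same spacing, the hexagonal pattern packs rows closer together, so it places more points in the polygon than the regular grid.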

slide-45
SLIDE 45

x.as = ashape(pts@coords,alpha=2.0)

slide-46
SLIDE 46

x.as = ashape(pts@coords,alpha=2.0)

lower alpha produces more concave shapes (good) but holes may start appearing (bad)
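
To illustrate the role of alpha, here is a brute-force Python sketch of one common alpha-shape definition: two points are joined by a boundary edge if some disk of radius alpha passes through both and has no other sample point inside it. This is an assumption about the definition, not a port of the R `ashape` call above (presumably from the alphahull package), and it is far too slow for 1,000 points. The function name `alpha_shape_edges` is made up for this example.

```python
import math
from itertools import combinations


def alpha_shape_edges(points, alpha):
    """Return index pairs (i, j), i < j, that form alpha-shape boundary edges."""
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        (px, py), (qx, qy) = points[i], points[j]
        d = math.hypot(qx - px, qy - py)
        if d == 0 or d > 2 * alpha:
            continue  # no disk of radius alpha can touch both points
        # Distance from the edge midpoint to the two candidate disk centers.
        h = math.sqrt(max(alpha ** 2 - (d / 2) ** 2, 0.0))
        mx, my = (px + qx) / 2, (py + qy) / 2
        nx, ny = -(qy - py) / d, (qx - px) / d  # unit normal to the edge
        for sign in (1, -1):
            cx, cy = mx + sign * h * nx, my + sign * h * ny
            empty = all(
                math.hypot(x - cx, y - cy) >= alpha - 1e-9
                for k, (x, y) in enumerate(points)
                if k not in (i, j)
            )
            if empty:
                edges.add((i, j))
                break
    return edges
```

With a large alpha the result approaches the convex hull. With a very small alpha no disk can span neighbouring points at all and the shape falls apart, which is the "holes" problem taken to its extreme.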

slide-47
SLIDE 47
slide-48
SLIDE 48
slide-49
SLIDE 49
slide-50
SLIDE 50

Ramer–Douglas–Peucker and other point reduction algorithms can be considered
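
For reference, a minimal Python sketch of Ramer–Douglas–Peucker follows. It keeps the end points of a line, finds the intermediate point farthest from the straight segment between them, and recurses only if that distance exceeds a tolerance. The function names and the `epsilon` parameter name are choices made for this example, and the talk does not say which implementation, if any, was used.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0:
        # start and end coincide (e.g. a closed ring): use point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def rdp(points, epsilon):
    """Simplify a polyline, dropping points closer than epsilon to the result."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # the split point appears in both halves
```

A jittery straight wall collapses to its two end points, while a real corner survives because it lies far from the segment joining its neighbours.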

slide-51
SLIDE 51
slide-52
SLIDE 52
slide-53
SLIDE 53
slide-54
SLIDE 54
slide-55
SLIDE 55
slide-56
SLIDE 56

66,056 polygons produced in one day (as opposed to years)

slide-57
SLIDE 57

but:

adjacency is not being enforced
false positives/negatives
buildings may also overlap

slide-58
SLIDE 58

we need to validate the output

http://buildinginspector.nypl.org

*not included in the paper

slide-59
SLIDE 59
slide-60
SLIDE 60
slide-61
SLIDE 61
slide-62
SLIDE 62

2 weeks later...

slide-63
SLIDE 63

341,005 flags for 66,055 unique polygons
62,402 polygons with consensus

Yes 84.2%
Fix 6.4%
No 9.4%

“consensus” = 75%+ agreement of 3+ flags
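
The consensus rule above can be written down in a few lines. This Python sketch assumes each polygon's flags arrive as a list of strings such as "yes", "fix" and "no". The function name `consensus` and that data layout are assumptions for this example, not details of how Building Inspector stores or evaluates flags.

```python
from collections import Counter


def consensus(flags, min_flags=3, min_agreement=0.75):
    """Return the agreed label, or None if there is no consensus.

    Consensus means at least min_flags flags, of which at least
    min_agreement (as a fraction) carry the same label.
    """
    if len(flags) < min_flags:
        return None
    label, count = Counter(flags).most_common(1)[0]
    if count / len(flags) >= min_agreement:
        return label
    return None
```

Three matching flags out of four (exactly 75%) count as consensus, while two out of three do not, and two flags are never enough whatever they say.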

slide-64
SLIDE 64

no sleep till Brooklyn

14k+ more polygons

slide-65
SLIDE 65

thank you

mauricio giraldo arteaga NYPL Labs @mgiraldo NYGeoCon 2013