A General Approach to Discovering, Registering, and Extracting Features from Raster Maps

SLIDE 1

Information Sciences Institute

Agent of Innovation: from visionary to viable

A General Approach to Discovering, Registering, and Extracting Features from Raster Maps

Craig Knoblock
University of Southern California & Geosemble Technologies

Joint work with Ching-Chien Chen, Yao-Yi Chiang, Aman Goel, Matthew Michelson, and Cyrus Shahabi

SLIDE 2

Introduction

  • Raster maps are a rich source of geospatial data:
    — Easily accessible
    — Many different types of information
    — Often contain information that cannot be found elsewhere

Figures: USGS topographic map of St. Louis, MO; travel map of Tehran, Iran

SLIDE 3

Challenges

  • Maps have lots of useful information, but…
    — They have overlapping features
    — There is limited access to the metadata
    — They are often only available in raster format
  • How do we find, register, extract, and recognize the features in a raster map?

SLIDE 4

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 5

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 6

Map Discovery

  • Collect candidate maps from the Web
    — Standalone maps
      • Found using an image search engine
    — Maps embedded in PDF documents
      • Found using a general search engine and then extracting the images
  • Classify the images
    — Extract features from the images
    — Identify similar images using Content-Based Image Retrieval (CBIR)
    — Classify the image using k-Nearest Neighbor

SLIDE 7

Identifying Maps

Diagram labels: Image Server; Map Server; Non-Map Image Repository; Map Image Repository

Our approach:

  • Extract features
  • Find similar images
  • Classify image
SLIDE 8

Extract Features

  • Water-filling features
    — Zhou, X.S. et al., "Water-filling: A novel way for image structure feature extraction," Intl. Conference on Image Processing, 1999
    — Works well on images with strong edges
    — Works on standard Canny edge maps of original images
  • Color invariant
SLIDE 9

Water-Filling Features

Example segments from the figure:
  • Fork Count: 6, Filling Time: 57, Water Amount: 68
  • Fork Count: 0, Filling Time: 45, Water Amount: 45

  • Features computed for each segment
  • Normalized histogram (size invariant)
  • Number of segments
  • 3 features x 8 buckets = 24-element feature vector
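The 24-element vector described above can be sketched as follows. This is a minimal illustration, not the authors' code: the bucket boundaries in `BUCKET_EDGES` are hypothetical (the slides only state that 8 buckets are used), and the per-segment values are assumed to have been computed already by a water-filling pass over the edge map.

```python
from bisect import bisect_left

# Hypothetical bucket upper bounds (assumption): 7 edges give 8 buckets.
BUCKET_EDGES = [2, 4, 8, 16, 32, 64, 128]


def normalized_histogram(values, edges=BUCKET_EDGES):
    """Histogram of values over len(edges) + 1 buckets, normalized to sum to 1."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_left(edges, v)] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def water_filling_vector(segments):
    """segments: list of (fork_count, filling_time, water_amount), one per segment.

    Returns 3 normalized histograms concatenated into one 24-element vector.
    """
    vector = []
    for feature_index in range(3):
        vector.extend(normalized_histogram([s[feature_index] for s in segments]))
    return vector


example = water_filling_vector([(6, 57, 68), (0, 45, 45)])
```

Because each histogram is divided by the number of segments, the vector does not depend on how many segments the image has, which is what makes it size invariant.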
SLIDE 10

Content-Based Image Retrieval (CBIR)

  • Built on top of the Lire system
  • Query image feature vector: 0.20 0.12 0.15 . . . 0.12 0.20 0.07
  • CBIR* finds the 5 most similar images in the map repository and the non-map repository (in the figure: Map12, Map75, Map36, Non-map23, Non-map139)

* In our experiment we used 9 similar images

SLIDE 11

k-Nearest Neighbor Classification

Figure: the retrieved images (Map12, Map75, Map36, Non-map23, Non-map139) are checked for a majority of maps; if yes, the image is labeled as a map
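The retrieve-then-vote step can be sketched in a few lines. This is only an illustration under stated assumptions: the repository is a plain in-memory list, similarity is Euclidean distance (the Lire system's actual similarity measure is not given in the slides), and `classify_knn` is a hypothetical name.

```python
import math
from collections import Counter


def classify_knn(query, repository, k=9):
    """Label a query feature vector by majority vote of its k nearest neighbors.

    repository: list of (feature_vector, label) pairs, label is "map" or "non-map".
    Euclidean distance is an assumption made for this sketch.
    """
    nearest = sorted(repository, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


repository = [
    ([0.20, 0.12], "map"),
    ([0.22, 0.10], "map"),
    ([0.18, 0.15], "map"),
    ([0.80, 0.70], "non-map"),
    ([0.75, 0.90], "non-map"),
]
label = classify_knn([0.21, 0.11], repository, k=5)
```

With an odd k and two classes the vote cannot tie, which is one practical reason to pick values such as 5 or 9.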

SLIDE 12

Results

Data set     Images   Maps    Non-maps
All images   8,000    4,000   4,000
Repository   4,000    2,000   2,000
Test set     4,000    2,000   2,000

Precision   Recall   F1-measure
77.39%      71.20%   74.17%

Results are averaged over 10 runs
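As a quick consistency check, the F1-measure is the harmonic mean of precision and recall. The sketch below applies that standard formula to the reported averages; since the slide reports averages over 10 runs, exact agreement is not guaranteed in general.

```python
def f1_measure(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions)."""
    return 2 * precision * recall / (precision + recall)


f1_percent = round(100 * f1_measure(0.7739, 0.7120), 2)
```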

SLIDE 13

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 14

Background Removal

  • Use the Triangle method (Zack, 1977) to locate clusters in the grayscale histogram
  • Remove the background clusters

Figure labels: Input Map; Grayscale Histogram; Binary Map
  • Background colors have the dominant number of pixels
  • Remove the dominant cluster (background pixels)
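A simplified sketch of this idea is shown below. It assumes an 8-bit grayscale image stored as a list of rows and uses a single triangle-method threshold to cut off the dominant (background) cluster; the slides describe locating and removing clusters, so treat this as an approximation rather than the presented implementation. The function names are hypothetical.

```python
def triangle_threshold(hist):
    """Return the bin picked by the triangle method for a 1-D histogram."""
    peak = max(range(len(hist)), key=lambda i: hist[i])
    nonzero = [i for i, count in enumerate(hist) if count > 0]
    low, high = nonzero[0], nonzero[-1]
    # Work on the longer tail, where the non-background pixels are spread out.
    end = low if (peak - low) > (high - peak) else high
    if end == peak:
        return peak
    x1, y1, x2, y2 = peak, hist[peak], end, hist[end]
    step = 1 if end > peak else -1
    best_bin, best_dist = peak, -1
    for i in range(peak, end + step, step):
        # Numerator of the point-to-line distance; the denominator is constant.
        dist = abs((y2 - y1) * i - (x2 - x1) * hist[i] + x2 * y1 - y2 * x1)
        if dist > best_dist:
            best_bin, best_dist = i, dist
    return best_bin


def remove_background(gray, levels=256):
    """Binarize a grayscale image: 1 = foreground, 0 = background."""
    hist = [0] * levels
    for row in gray:
        for value in row:
            hist[value] += 1
    peak = max(range(levels), key=lambda i: hist[i])
    threshold = triangle_threshold(hist)
    # Pixels on the peak's side of the threshold belong to the dominant cluster.
    if peak >= threshold:
        return [[1 if v < threshold else 0 for v in row] for row in gray]
    return [[1 if v > threshold else 0 for v in row] for row in gray]


image = [[250] * 6 for _ in range(6)]
image[2][1:5] = [30, 30, 30, 30]  # a short dark line on a bright background
binary = remove_background(image)
```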

SLIDE 15

Text/Graphics Separation

  • Separate linear structures from text (Cao and Tan, 02)

Figure labels: Detect small connected objects (characters); Group small connected objects (strings); Remove small connected object groups; Add up the removed objects; Text Layer; Road Layer
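The core of this separation is finding connected objects and sorting them by size. The sketch below is a simplified stand-in, not the Cao and Tan method itself: it measures size as a pixel count and uses a hypothetical cutoff `max_text_size`, and it skips the string-grouping step.

```python
from collections import deque


def connected_components(grid):
    """Return a list of 8-connected foreground components, each a list of (row, col)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            component, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            components.append(component)
    return components


def split_text_and_lines(grid, max_text_size=4):
    """Small components go to the text layer, the rest to the line (road) layer."""
    rows, cols = len(grid), len(grid[0])
    text = [[0] * cols for _ in range(rows)]
    lines = [[0] * cols for _ in range(rows)]
    for component in connected_components(grid):
        target = text if len(component) <= max_text_size else lines
        for y, x in component:
            target[y][x] = 1
    return text, lines


binary_map = [[0] * 8 for _ in range(5)]
binary_map[0] = [1] * 8          # a long road line
binary_map[3][1:3] = [1, 1]      # a small character-like blob
text_layer, road_layer = split_text_and_lines(binary_map)
```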

SLIDE 16

Parallel-Pattern Tracing

  • Assuming we know the road width is 3 pixels, if the yellow pixel is on a double-line layer, we can find:
    — At least 1 pixel on the original road line
    — At least 1 corresponding pixel on the other road line

Figure: pixelated view of a segment of double-line streets showing the parallel pattern at a 3-pixel road width (black cells: road pixels; white cells: background). Labels: construct the first line; corresponding pixel on the second line.

SLIDE 17

Road Format and Road Width Detection

  • Apply parallel-pattern tracing (PPT) iteratively on different sizes of road width
  • If it is a double-line road layer, the actual road width
    — Has the maximum percentage of parallel-pattern pixels
    — The percentage is larger than a threshold
  • Apply PPT using the detected road width to remove non-parallel lines
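The width search can be illustrated with a much-simplified version of parallel-pattern tracing. The sketch below only handles horizontal and vertical lines, treats "road width" as the pixel offset between the two lines, and uses a hypothetical threshold of 0.5; the actual PPT also handles other orientations, so this is an assumption-laden toy rather than the presented algorithm.

```python
def has_parallel_pattern(grid, r, c, width):
    """True if pixel (r, c) lies on a line and a parallel line sits `width` pixels away."""
    rows, cols = len(grid), len(grid[0])

    def fg(y, x):
        return 0 <= y < rows and 0 <= x < cols and grid[y][x] == 1

    # (along, across) direction pairs: horizontal lines, then vertical lines.
    for (ay, ax), (py, px) in (((0, 1), (1, 0)), ((1, 0), (0, 1))):
        if not (fg(r + ay, c + ax) or fg(r - ay, c - ax)):
            continue  # the pixel does not continue as a line in this direction
        for sign in (1, -1):
            gap_clear = all(
                not fg(r + sign * k * py, c + sign * k * px) for k in range(1, width)
            )
            if gap_clear and fg(r + sign * width * py, c + sign * width * px):
                return True
    return False


def parallel_ratio(grid, width):
    """Fraction of foreground pixels that show the parallel pattern at this width."""
    pixels = [(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v == 1]
    if not pixels:
        return 0.0
    hits = sum(1 for r, c in pixels if has_parallel_pattern(grid, r, c, width))
    return hits / len(pixels)


def detect_road_width(grid, widths, threshold=0.5):
    """Return the width with the highest ratio, or None for a single-line layer."""
    best = max(widths, key=lambda w: parallel_ratio(grid, w))
    return best if parallel_ratio(grid, best) > threshold else None


double_line = [[0] * 10 for _ in range(8)]
double_line[2] = [1] * 10
double_line[5] = [1] * 10
single_line = [[0] * 10 for _ in range(8)]
single_line[2] = [1] * 10
```

Returning `None` when no width clears the threshold corresponds to deciding that the layer uses single-line roads.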

SLIDE 19
Road Topology Extraction

  • Use morphological operations to reconnect broken lines and generate one-pixel-width roads
  • Morphological operations: dilation, erosion, thinning
  • Use the detected road format and road width to determine the number of iterations
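The reconnecting step can be sketched with plain dilation followed by erosion. The code below is an illustration under assumptions: it uses a 3x3 structuring element, ignores out-of-image neighbors during erosion, and leaves out thinning entirely; `close_gaps` is a hypothetical helper name.

```python
def _window(grid, r, c):
    """Values in the 3x3 neighborhood of (r, c), clipped to the image."""
    rows, cols = len(grid), len(grid[0])
    return [
        grid[y][x]
        for y in range(max(0, r - 1), min(rows, r + 2))
        for x in range(max(0, c - 1), min(cols, c + 2))
    ]


def dilate(grid):
    """A pixel becomes foreground if any pixel in its 3x3 window is foreground."""
    return [[1 if any(_window(grid, r, c)) else 0 for c in range(len(grid[0]))]
            for r in range(len(grid))]


def erode(grid):
    """A pixel stays foreground only if its whole 3x3 window is foreground."""
    return [[1 if all(_window(grid, r, c)) else 0 for c in range(len(grid[0]))]
            for r in range(len(grid))]


def close_gaps(grid, iterations=1):
    """Dilate then erode the same number of times to bridge small breaks."""
    for _ in range(iterations):
        grid = dilate(grid)
    for _ in range(iterations):
        grid = erode(grid)
    return grid


broken = [[0] * 7 for _ in range(5)]
broken[2] = [1, 1, 1, 0, 1, 1, 1]  # a road line with a one-pixel break
repaired = close_gaps(broken, iterations=1)
```

A wider gap needs more iterations, which is why the number of iterations is tied to the detected road width.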

SLIDE 20

Extracted Road and Text Layers

Figures: Text Layer; Road Layer (road topology)

SLIDE 21

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 22

Supervised Map Decomposition

  • What if we cannot automatically remove the background from raster maps?
    — Raster maps usually contain noise from the scanning and compression processes

SLIDE 23

Difficulties

  • Raster maps contain numerous colors
    — Manually examining each color for extracting features is laborious

Figure labels: 285,735 colors; Grayscale histogram

SLIDE 24

Color Segmentation

  • The Mean-shift algorithm
    — Consider distance in the RGB color space and in the image space
    — Preserve object edges
    — From 285,735 to 155,299 colors
  • The K-means algorithm
    — Limit the number of colors to K
    — From 155,299 to 10 colors (K=10)

Figure label: Grayscale histogram
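The K-means step can be sketched as a small color quantizer. This is a minimal pure-Python illustration with a deterministic, hypothetical initialization (evenly spaced picks from the sorted unique colors); it does not include the Mean-shift pass, and a real map would call for an optimized library implementation.

```python
def _dist2(a, b):
    """Squared Euclidean distance between two RGB tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_colors(pixels, k, iterations=10):
    """Cluster RGB tuples into at most k color centers."""
    unique = sorted(set(pixels))
    step = max(1, len(unique) // k)
    centers = [unique[i] for i in range(0, len(unique), step)][:k]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in pixels:
            nearest = min(range(len(centers)), key=lambda i: _dist2(p, centers[i]))
            clusters[nearest].append(p)
        # Move each center to the mean of its members; keep empty clusters in place.
        centers = [
            tuple(sum(ch) / len(members) for ch in zip(*members)) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return centers


def quantize(pixels, centers):
    """Replace every pixel with the (rounded) color of its nearest center."""
    palette = [tuple(round(ch) for ch in center) for center in centers]
    return [
        palette[min(range(len(centers)), key=lambda i: _dist2(p, centers[i]))]
        for p in pixels
    ]


pixels = [(250, 10, 10), (240, 20, 15), (245, 5, 5),
          (10, 10, 250), (20, 15, 240), (5, 5, 245)]
reduced = quantize(pixels, kmeans_colors(pixels, k=2))
```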

SLIDE 25

User Labeling

  • To extract the road layer, the user needs to provide a user label for each road color (at most K colors)
  • A user label should be (approximately) centered at a road intersection or at the center of a road line

SLIDE 26

Label Decomposition

  • Decompose each user label into color images so that every color image contains only one color

Figure: color images 1 to 5 (background is shown in black)
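Label decomposition amounts to building one binary mask per color. The sketch below assumes the user label has already been quantized so that each pixel holds a small color index; `decompose_label` is a hypothetical name.

```python
def decompose_label(label):
    """Split a label image (rows of color indices) into one binary image per color.

    In each returned image, 1 marks pixels of that color and 0 marks everything else.
    """
    colors = sorted({value for row in label for value in row})
    return {
        color: [[1 if value == color else 0 for value in row] for row in label]
        for color in colors
    }


label = [
    [1, 1, 2],
    [4, 4, 4],
    [1, 5, 2],
]
color_images = decompose_label(label)
```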

SLIDE 27

Hough-Line Approach to Identify Road Color

  • Detect Hough lines
  • The center of the user label is the center of a road line
  • The Hough lines that are far away from the image center are NOT constructed by road pixels
  • Identify road colors using
    — The average distance from the Hough lines to the image center

Figure: color images 1 to 5; two of them are marked "Road color". Red Hough lines are within 5 pixels of the image center.
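The distance test can be sketched as below. It assumes the Hough lines for one color image are already available in normal form (rho, theta), for example from a Hough transform run elsewhere, and it uses an average-distance cutoff of 5 pixels taken from the figure note; applying the cutoff to the average is an assumption of this sketch.

```python
import math


def line_distance_to_point(rho, theta, x, y):
    """Distance from (x, y) to the line x*cos(theta) + y*sin(theta) = rho."""
    return abs(x * math.cos(theta) + y * math.sin(theta) - rho)


def is_road_color(hough_lines, center, max_avg_distance=5.0):
    """Decide whether a color image holds road pixels.

    hough_lines: list of (rho, theta) pairs detected in that color image.
    center: (x, y) of the label image center, assumed to lie on a road line.
    """
    if not hough_lines:
        return False
    cx, cy = center
    distances = [line_distance_to_point(rho, theta, cx, cy) for rho, theta in hough_lines]
    return sum(distances) / len(distances) <= max_avg_distance


center = (10, 10)
near_lines = [(10.0, 0.0), (12.0, math.pi / 2)]  # x = 10 and y = 12
far_lines = [(30.0, 0.0)]                        # x = 30
```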

SLIDE 28

Initial Road Template

  • Generate an initial road template using the images of identified road colors from the Hough-line approach

Figure: color images 4 and 5 (background is shown in black) and the resulting template (road pixels are shown in red, background is shown in black)

SLIDE 29

Road Topology Extraction using Identified Road Colors

  • Identify a set of road colors from each user label
  • Use the identified road colors to extract road pixels
  • Apply morphological operations to remove solid areas and reconnect lines

SLIDE 30

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 31

Automatic Map Registration

  • Exploit the pattern of intersections found on a map and compare it to a road vector dataset

SLIDE 32

Road-Intersection Position Detection

  • Corner detector (OpenCV)
    — Find intersection candidates
  • Compute the connectivity to determine real intersections

Figure flow: Corner Detector; Connectivity >= 3: road intersection; Connectivity < 3: discard
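The connectivity filter can be sketched by counting how many road lines cross a small box drawn around a candidate point. The code assumes one-pixel-wide road lines in a binary grid and candidates supplied by some corner detector; the box radius of 2 and the helper names are assumptions of this sketch.

```python
def connectivity(grid, r, c, radius=2):
    """Count road lines leaving the square of the given radius centered at (r, c)."""
    rows, cols = len(grid), len(grid[0])
    top, bottom, left, right = r - radius, r + radius, c - radius, c + radius
    # Walk the square's perimeter once, clockwise, visiting each cell exactly once.
    ring = [(top, x) for x in range(left, right)]
    ring += [(y, right) for y in range(top, bottom)]
    ring += [(bottom, x) for x in range(right, left, -1)]
    ring += [(y, left) for y in range(bottom, top, -1)]
    values = [
        1 if 0 <= y < rows and 0 <= x < cols and grid[y][x] else 0 for y, x in ring
    ]
    # Each background-to-road transition along the closed ring is one crossing line.
    return sum(1 for i in range(len(values)) if values[i] == 1 and values[i - 1] == 0)


def is_intersection(grid, r, c, radius=2):
    """Keep a candidate only if at least three road lines meet there."""
    return connectivity(grid, r, c, radius) >= 3


plus = [[0] * 7 for _ in range(7)]
for i in range(7):
    plus[3][i] = 1  # horizontal road
    plus[i][3] = 1  # vertical road
straight = [[0] * 7 for _ in range(7)]
straight[3] = [1] * 7
```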

SLIDE 33
Road-Intersection Template Extraction

  • Road lines are distorted by the morphological operators
  • The extracted road vector around intersections will not be accurate
  • Extract accurate road-intersection template
    — road intersection position
    — road connectivity
    — road orientation

SLIDE 34

Distortion Correction

Figure labels: Intersection Positions; The blob image; The thinned lines; Intersect the images

  • Use the road width to determine the blob size for covering the distorted lines

SLIDE 35

Accurate Road-Intersection Templates

Figure labels: With distortion; Avoid distortion

SLIDE 36

Point Pattern Matching

  • Distribution of intersections is used to determine the relationship between a map and an image
  • Find the mapping between these points to get a set of control point pairs
    — Find the transformation T between the layout (with relative distances) of the two point sets

Figure labels:
  • Example: (x,y) = (83,22)
  • Example: (lon,lat) = (-118.407088,33.92993)
  • 80 points; 400 points
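One way to picture the search for T is a brute-force matcher: hypothesize a transformation from a pair of map points and a pair of reference points, then count how many map points land on a reference point. The sketch below assumes an axis-aligned transformation (independent x and y scale plus translation, no rotation) and a hypothetical tolerance `tol`; it is far too slow for sets of 80 and 400 points and only illustrates the idea.

```python
from itertools import combinations, permutations


def fit_transform(p1, p2, q1, q2):
    """Scale and translation mapping p1 -> q1 and p2 -> q2, or None if degenerate."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    if dx == 0 or dy == 0:
        return None
    sx, sy = (q2[0] - q1[0]) / dx, (q2[1] - q1[1]) / dy
    if sx == 0 or sy == 0:
        return None
    return sx, sy, q1[0] - sx * p1[0], q1[1] - sy * p1[1]


def apply_transform(t, p):
    sx, sy, tx, ty = t
    return (sx * p[0] + tx, sy * p[1] + ty)


def count_matches(t, map_pts, ref_pts, tol):
    """Number of map points that fall within tol of some reference point."""
    matched = 0
    for p in map_pts:
        x, y = apply_transform(t, p)
        if any(abs(x - qx) <= tol and abs(y - qy) <= tol for qx, qy in ref_pts):
            matched += 1
    return matched


def match_point_patterns(map_pts, ref_pts, tol=1e-6):
    """Return (best transformation, number of matched map points)."""
    best_t, best_n = None, 0
    for p1, p2 in combinations(map_pts, 2):
        for q1, q2 in permutations(ref_pts, 2):
            t = fit_transform(p1, p2, q1, q2)
            if t is None:
                continue
            n = count_matches(t, map_pts, ref_pts, tol)
            if n > best_n:
                best_t, best_n = t, n
    return best_t, best_n


# Map intersections in pixels and reference intersections in (lon, lat).
map_points = [(0, 0), (10, 0), (0, 10), (10, 10), (7, 3)]
ref_points = [(-118.0, 34.0), (-117.99, 34.0), (-118.0, 33.99), (-117.99, 33.99),
              (-117.993, 33.997), (-117.5, 33.5), (-118.2, 34.3)]
transform, matched = match_point_patterns(map_points, ref_points)
```

The negative y scale that comes out of this example reflects that pixel rows grow downward while latitude grows upward. The matched pairs then serve as the control point pairs.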

SLIDE 37

Aligning Maps and Imagery

  • Use the matched point pattern to align maps with imagery by rubber-sheeting

SLIDE 39
SLIDE 40

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 41

Road Vectorization

  • Start from the extracted road intersections to connect the salient points and produce the road vector

SLIDE 42

Example Result

SLIDE 43

Experimental Results

  • Tested on 16 maps from 11 sources
  • Compared with R2V from Able Software
    — An automated raster-to-vector conversion software package specialized for digitizing raster maps
  • Average completeness, correctness, quality, redundancy

System   Completeness   Correctness   Quality   Redundancy
Strabo   96.53%         97.61%        94.41%    0.19%
R2V      94.9%          87.4%         79.73%    42.81%

  • R2V could achieve better results if we tuned R2V with manually specified pre-processing and post-processing functions
    — e.g., manually specify the gap size for reconnecting two lines

SLIDE 44

Text Recognition

SLIDE 45

Group Characters Using the Dilation Operator

SLIDE 46

Split Merge Strings

SLIDE 47

Detect Orientation with Run-Length Smoothing

SLIDE 48

Run Commercial OCR on Extracted Text

Figure label: Commercial OCR System

SLIDE 49

Results

SLIDE 50

Outline

  • Map Discovery
  • Automatic Extraction of Features
  • Feature Extraction from Noisy Maps
  • Automatic Registration of Maps
  • Feature Recognition
  • Related Work and Discussion

SLIDE 51

Related Work

  • Map Feature Extraction Using Map Specification (Samet and Soffer, 94, 96; Myers et al., 96)
    — Require a huge amount of prior information and training
  • Text/Graphics Separation and Text Recognition (Bixler, 00; Li et al., 00; Cao and Tan, 02; Vela, 03; Pouderoux, 07)
    — Require fixed pre-processing steps, e.g., binarization with a fixed threshold
  • Supervised Graphics Extraction (Khotanzad and Zink, 03; Salvatore and Guitton, 04; Chen et al., 06)
    — Laborious training tasks, e.g., labeling all combinations of line and background pixels
  • Road Extraction and Vectorization (Bin, 98; Habib et al., 99; Itonaga et al., 03)
    — Require lots of parameter tuning, e.g., road width
  • Map, Vectors, and Imagery Conflation (Chen et al., 06; Chen et al., 08; Wu et al., 07)
    — Exploit for determining feature locations

SLIDE 52

Conclusion

  • Presented a general approach to discovering, registering, and extracting features from maps
  • Contributions
    — Ability to identify maps
    — Techniques to extract road and text layers from poor-quality maps
    — Algorithms to automatically determine the geocoordinates
    — Automatic feature recognition
      • Intersection templates, road vector data, and text labels
  • Applications
    — Annotating imagery
    — Creating and updating maps
    — Constructing gazetteers

SLIDE 53

Publications

  • Available from: http://www.isi.edu/~knoblock
  • A General Approach to Discovering, Registering, and Extracting Features from Raster Maps. Knoblock, C. A.; Chen, C.; Chiang, Y.; Goel, A.; Michelson, M.; and Shahabi, C., In Proceedings of DRR, 2010.
  • An Approach for Recognizing Text Labels in Raster Maps. Chiang, Y., and Knoblock, C. A., In Proceedings of ICPR, 2010.
  • A Method for Automatically Extracting Road Layers from Raster Maps. Chiang, Y., and Knoblock, C. A., In Proceedings of ICDAR, 2009.
  • Automatic and Accurate Extraction of Road Intersections from Raster Maps. Chiang, Y.; Knoblock, C. A.; Shahabi, C.; and Chen, C., Geoinformatica, 13(2):121-157, 2008.
  • Automatically and Accurately Conflating Raster Maps with Orthoimagery. Chen, C.; Knoblock, C. A.; and Shahabi, C., Geoinformatica, 12(3):377-410, 2008.
  • Automatic Extraction of Road Intersection Position, Connectivity, and Orientation from Raster Maps. Chiang, Y., and Knoblock, C. A., In Proceedings of ACM GIS, 2008.
  • Automatic Extraction of Road Intersections From Raster Maps. Chiang, Y.; Knoblock, C. A.; and Chen, C., In Proceedings of ACM GIS, 2005.
  • Automatically and Accurately Conflating Orthoimagery and Street Maps. Chen, C.; Knoblock, C. A.; Shahabi, C.; Thakkar, S.; and Chiang, Y., In Proceedings of ACM GIS, 2004.