SLIDE 1

MMDB-6 J. Teuhola 2012 141

6. Image databases

Image representations:

  • Digitized (sampled) representation of field-based spatial data
  • ‘Raw’ images ≈ digital images ≈ bitmapped images ≈ raster images
  • m × n matrix of pixels; resolution = sampling rate, in pixels per inch
  • Each pixel is represented by k bits (= accuracy = color depth), giving 2^k possible values.

Image types:

  • Binary (bi-level) images (black = 0, white = 1; e.g. telefax)
  • Grey-scale images (usually k = 8, which enables 256 grey levels)
  • Color images (various representations)

Sources:

  • Devices: scanner, digital camera, electron microscope, medical imaging devices (PET, MRI)
  • Wavelengths: visible light, infrared, X-rays

SLIDE 2

Color images

‘True’ color schemes:

  • Three components per pixel (possibly a 4th for the α-channel = transparency)
  • RGB = Red - Green - Blue (typically 3 × 8 = 24 bits per pixel)
  • CMY = Cyan - Magenta - Yellow (CMYK used in printing, K = black)
  • HSI = Hue - Saturation - Intensity (used in image processing)
  • YUV ≈ YCbCr = Luminance (brightness) + 2 × chrominance (color)
      • Used in image compression (JPEG)
      • Correlations between color components are reduced.
      • Most information is collected in the Y-component.

Indexed color schemes:

  • Palette of e.g. 256 colors
  • Mapping table from color indices to RGB values
  • Saves space, sufficient for many applications
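As a rough illustration of the luminance/chrominance idea listed under the ‘true’ color schemes, the sketch below converts one 8-bit RGB pixel to full-range YCbCr. The coefficients are the ones commonly used with JFIF files; the function name `rgb_to_ycbcr` is only an illustrative choice and does not come from the slides.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr (JFIF-style coefficients)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    # Round and clamp each component to the 0..255 range.
    return tuple(min(255, max(0, round(v))) for v in (y, cb, cr))
```

For any grey pixel (r = g = b) both chrominance values come out as 128, which shows how the brightness information ends up in the Y-component alone.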

SLIDE 3

Image formats

Tens of formats exist for different environments and applications, e.g.

  • BMP = Bitmap image file (MS Windows)
  • GIF = Graphics Interchange Format (indexed colors; includes compression, supported by web browsers)
  • JBIG = Joint Bi-level Image Experts Group file interchange format
  • JPEG = Joint Photographic Experts Group (JFIF = JPEG File Interchange Format)
  • JP2 = JPEG 2000
  • PBM = Portable Bitmap Format (black-and-white)
  • PGM = Portable Greymap Format (grey-scale)
  • PPM = Portable Pixmap Format (color)
  • PNG = Portable Network Graphics
  • TIFF = Tagged Image File Format (large number of options)

SLIDE 4

Image compression

  • Necessary for large image archives: saves space, reduces transmission time.
  • Possible due to redundancy in images
  • Several methods, specialized for different types of images
  • Image formats with compression:
      • JPEG, based on cosine transform
      • JPEG 2000, based on wavelet transform
      • GIF, based on LZW string compression
      • PNG, based on LZ77 string compression
      • JBIG (bi-level images), based on prediction by context

SLIDE 5

Compression method characteristics

  • Lossless / lossy methods: Can the original image be recovered precisely or only approximately? E.g. JPEG is typically lossy.
  • Compression efficiency (bit rate), measured in bits/pixel
  • Speed (separately for compression and decompression)
  • Distortion (for lossy methods):
      • MAE = Mean Absolute Error
      • MSE = Mean Squared Error
      • RMS = Root Mean Square error
      • SNR = Signal to Noise Ratio
      • PSNR = Peak Signal to Noise Ratio
  • Robustness against transmission errors
  • Blockiness, blurring, ... (for lossy methods)
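The distortion measures listed above can be sketched for two grey-scale images of equal size, given as lists of rows. This is a minimal illustration that assumes a peak value of 255 (8-bit pixels); the function name `distortion` is hypothetical, and SNR is left out because its definition varies between texts.

```python
import math

def distortion(original, decoded, peak=255):
    """Return MAE, MSE, RMS error and PSNR (in dB) between two equal-sized images."""
    errors = [o - d
              for row_o, row_d in zip(original, decoded)
              for o, d in zip(row_o, row_d)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rms = math.sqrt(mse)
    # PSNR is infinite when the two images are identical (lossless case).
    psnr = math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
    return {"MAE": mae, "MSE": mse, "RMS": rms, "PSNR": psnr}
```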

SLIDE 6

Searching from an image database

1. Using a hierarchical classification of images:
   The user follows paths in the hierarchy, e.g. Art works → Paintings → France → 18th century

2. Search using keywords in metadata:
   Images can be treated like documents with index terms.

3. Search by content features:
   Pattern matching based on similarity with a query image, shape, color distribution, etc.

SLIDE 7

Feature extraction and indexing of images

  • Extraction of descriptive attributes from images
  • Manually, automatically, or using a hybrid scheme (automatic segmentation & manual assignment of properties).

Manual indexing:

  • Performed by a ‘knowledge worker’, trained on the patterns and vocabulary of the image database application
  • Multiple indexers: strict consistency rules, common glossary.
  • Automatic tools may help in pattern recognition.
  • Each interesting object (spatial structure) is presented manually to the system for indexing, equipped with descriptive attributes.
  • Assistance in selecting index terms: hierarchical dictionaries, cross-referencing systems, domain thesaurus.
  • Time-consuming and costly; community indexing is one possibility, cf. http://gimp-savvy.com/PHOTO-ARCHIVE/

SLIDE 8

Automatic indexing

  • Specialized for various application domains (document recognition, optical character recognition (OCR), engineering drawings, x-rays, ...)
  • The system must first ‘learn’ and categorize domain element objects.
  • A certain amount of uncertainty (fuzziness) must be tolerated.
  • Important area of automatic image analysis and object recognition: transformation of paper documents into digital form, and indexing those documents appropriately (so-called document imaging → digital libraries).

SLIDE 9

Color feature extraction

  • Usually based on color histograms, i.e. the number of pixels of each color (or color component).
  • Separate histograms can be built for various subregions of the image (e.g. top-left, top-right, middle, ...)
  • The quantization can be made coarser than 0..255 by grouping adjacent histogram values, in order to reduce the dimensionality of the resulting feature vectors.

[Figure: three histograms, one each for the RED, GREEN, and BLUE components; horizontal axis 0..255, vertical axis #pixels]
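A coarse per-component histogram of the kind described above might be built as follows. This is a sketch under stated assumptions: pixels are 8-bit (r, g, b) tuples, the input is non-empty, and the name `color_histogram` and the default of 4 bins per component are illustrative choices.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Concatenated, normalized R, G and B histograms with coarse bins."""
    width = 256 // bins_per_channel          # values grouped per bin
    hist = [0] * (3 * bins_per_channel)
    count = 0
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            bin_index = min(value // width, bins_per_channel - 1)
            hist[channel * bins_per_channel + bin_index] += 1
        count += 1
    # Dividing by the pixel count makes images of different sizes comparable.
    return [h / count for h in hist]
```

With 4 bins per component the feature vector has 12 dimensions instead of 3 × 256 = 768.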

SLIDE 10

Image segmentation

  • Detection of interesting regions within images.
  • A segment is a connected region that satisfies a homogeneity predicate.
  • Basis for subsequent search.
  • One of the most difficult tasks in image processing.
  • Several possible (heuristic) methods.

Connected region:

  • For each pair (x_1, y_1), (x_n, y_n) of pixels, there exists a chain of pixels {(x_1, y_1), ..., (x_n, y_n)} in the region such that (x_i, y_i) and (x_{i+1}, y_{i+1}) are adjacent for all i.
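The connectivity definition can be checked with a breadth-first search from any pixel of the region. The sketch below assumes 4-adjacency (horizontal and vertical neighbors only), which the slide does not specify; 8-adjacency would also be a valid reading. The function name `is_connected` is hypothetical.

```python
from collections import deque

def is_connected(region):
    """True if every pair of pixels in the set is linked by a chain of 4-adjacent pixels."""
    if not region:
        return True
    start = next(iter(region))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for neighbor in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if neighbor in region and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    # Connected exactly when the search reached every pixel of the region.
    return len(seen) == len(region)
```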

SLIDE 11

Examples of homogeneity predicates

  • Binary images: at least p % of the pixels of the connected region have the same color (black or white).
  • Classified grey-scales, e.g. 0...9, 10...19, etc.: a connected region is homogeneous if at least p % of its pixels belong to the same class.
  • Dynamic grey-scale classification: class boundaries are not predefined, but the interval size is; at least p % of the pixels should have grey-levels within an interval of δ units.
  • Grey-scale images with a reference function f for homogeneity: the number of pixels in { (x, y) | |grey-level(x, y) - f(x, y)| < δ } should be at least p % of the pixels in the region.
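As an illustration, the classified grey-scale predicate (fixed classes 0...9, 10...19, etc.) could be written as below. The function name `is_homogeneous`, its defaults, and the assumption that the region is passed as a non-empty list of grey levels are all illustrative.

```python
from collections import Counter

def is_homogeneous(grey_levels, p=90.0, class_width=10):
    """True if at least p % of the grey levels fall into the same fixed-width class."""
    classes = Counter(g // class_width for g in grey_levels)
    largest = max(classes.values())          # size of the most populated class
    return 100.0 * largest / len(grey_levels) >= p
```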

SLIDE 12

Miscellaneous segmentation techniques

(a) Regular block segmentation:

  • Example: quadtree or binary tree decomposition until homogeneous regions are obtained.
  • Does not usually satisfy the maximality condition for segmentation: neighboring blocks may together constitute a homogeneous region.
  • Generalization of binary tree segmentation: blocks can be split in any direction (polygon segmentation). Compromise solution: split lines only in the 0°, 45°, 90°, and 135° directions.

(b) Splitting and merging:

  • Augments category (a) methods to satisfy the maximality condition.
  • Merging tests the obtained regions pairwise for homogeneity.
  • Does not usually produce a unique segmentation for an arbitrary homogeneity predicate.
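The quadtree decomposition of category (a) can be sketched recursively. The sketch assumes a square image whose side is a power of two, and takes the homogeneity predicate as a caller-supplied function; the names `quadtree` and `is_homogeneous` are hypothetical.

```python
def quadtree(image, x, y, size, is_homogeneous):
    """Split the size x size block at column x, row y until every block is homogeneous.

    Returns a list of (x, y, size) leaf blocks.
    """
    block = [image[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    if size == 1 or is_homogeneous(block):
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += quadtree(image, x + dx, y + dy, half, is_homogeneous)
    return leaves
```

In the usage below, the two left-hand 2 × 2 blocks are both uniformly 0 yet remain separate leaves, which is the maximality problem that a subsequent merging step is meant to repair.

```python
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 0, 1],
         [0, 0, 1, 1]]
leaves = quadtree(image, 0, 0, 4, lambda block: len(set(block)) == 1)
```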

SLIDE 13

Miscellaneous segmentation techniques (cont.)

(c) Thresholding:

  • Applicable if objects of interest and the background have sufficiently distinct grey-level values.
  • The grey-level histogram of the image has two or more peaks, between which we can choose the threshold grey-level values.
  • Must usually be augmented with more sophisticated techniques.

(d) Region growing:

  • Start from a set of seed points.
  • Include neighboring pixels as long as homogeneity holds.
  • Difficulty: how to choose the seeds?

(e) Edge-following algorithms:

  • Follow a (hopefully closed) path of largest gradients (steepest slope) around the object to be detected.
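Region growing, technique (d), can be sketched with a single seed and a simple homogeneity test. The assumptions here are not from the slides: a pixel is accepted if its grey level differs from the seed's grey level by at most `tolerance`, and neighbors are taken with 4-adjacency. The function name `grow_region` is hypothetical.

```python
from collections import deque

def grow_region(image, seed, tolerance):
    """Return the set of (row, col) pixels reachable from seed within the tolerance."""
    rows, cols = len(image), len(image[0])
    seed_value = image[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(image[nr][nc] - seed_value) <= tolerance):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```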

SLIDE 14

Example: thresholding

Threshold = 128
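A minimal sketch of thresholding a grey-scale image into a binary one, using the threshold value 128 from this example. The convention that grey levels at or above the threshold become 1 (white) is an assumption; the function name `threshold` is illustrative.

```python
def threshold(image, t=128):
    """Map each grey level to 1 (white) if it is >= t, otherwise 0 (black)."""
    return [[1 if grey >= t else 0 for grey in row] for row in image]
```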

SLIDE 15

Examples:

  • Region growing: tolerance = 80
  • Edge detection: convolution kernel

       0  -1   0
      -1   4  -1
       0  -1   0
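Applying the 3 × 3 edge-detection kernel from this example (0 -1 0 / -1 4 -1 / 0 -1 0) to a grey-scale image can be sketched as follows. The handling of border pixels (left at 0) is an assumption, and the names `LAPLACIAN` and `convolve3x3` are hypothetical. Because the kernel is symmetric, the result is the same whether or not the kernel is flipped as strict convolution requires.

```python
LAPLACIAN = [[0, -1, 0],
             [-1, 4, -1],
             [0, -1, 0]]

def convolve3x3(image, kernel=LAPLACIAN):
    """Apply a 3x3 kernel to all interior pixels; border pixels stay 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = sum(kernel[i][j] * image[r + i - 1][c + j - 1]
                            for i in range(3) for j in range(3))
    return out
```

Flat areas give a response of 0, while pixels next to a grey-level step give a large absolute value.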

SLIDE 16

Segment feature extraction from images

Various approaches, e.g.

  • area of the segment
  • eccentricity/circularity
  • shape approximation
  • curvature

Desirable properties of segment features:

  • Invariance to translation
  • Invariance to rotation
  • Invariance to scaling

SLIDE 17

Representing the shape of segments

See: Sven Loncaric: “A Survey of Shape Analysis Techniques”, Pattern Recognition, 31 (8), pp. 983-1001, 1998

  • Example: boundary scalar transform
  • Another possibility: tangent angles at regular intervals
  • Both can be made rotation and scaling invariant.
  • The resulting 1D function is usually Fourier-transformed: the amplitude (magnitude) values are rotation invariant; the phase determines the orientation and starting point.
  • The lower-frequency Fourier coefficients can be used as the feature vector representing the shape. About 20 is often enough.
  • Technical problems: non-convex shapes; shapes with holes.
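The step from a 1D boundary function to a feature vector might look like the sketch below. It assumes the boundary function is a list of positive samples taken at regular intervals along the boundary (for instance distances from the centroid, which is one common choice and not necessarily the one shown on the slide). Dividing by the zero-frequency magnitude is an added normalization for scale invariance; the function name `fourier_descriptor` is hypothetical.

```python
import cmath
import math

def fourier_descriptor(signature, n_coeffs=20):
    """Magnitudes of the lowest non-zero DFT frequencies, scaled by the DC magnitude."""
    n = len(signature)
    magnitudes = []
    for k in range(min(n_coeffs + 1, n)):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(signature))
        magnitudes.append(abs(coeff))
    # Dropping the phase removes the dependence on the starting point.
    return [m / magnitudes[0] for m in magnitudes[1:]]
```

Starting the boundary traversal at a different point circularly shifts the samples, which changes only the phases, so the returned magnitudes stay the same.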

SLIDE 18

Representing the shape of segments (cont.)

  • Syntactic techniques, e.g. encoding of the boundary into symbols representing quantized directions: forward, left, right, forward-left, forward-right, ... String matching techniques (e.g. longest common subsequence) can be used to measure the distance between shapes.

    Example encoding, beginning at the start point: (f,f,f,fr,f,fr,r,f,fr,l,fr,fr,f,r,f)

  • Global scalar transform techniques, e.g. moments:

    $$m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p \, y^q \, f(x, y) \, dx \, dy, \quad \text{where } p, q = 0, 1, \ldots$$

    The (infinite) set of moments contains all the information about the shape; in practice, we take a limited number (say 20) of lower order moments. This is a straightforward technique for feature vector construction for shapes.
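For a digitized image the moment integral becomes a sum over pixels. The sketch below computes the discrete moment m_pq, taking x as the column index, y as the row index, and f(x, y) as the pixel value; the central-moment helper is an addition not shown on the slide, included because measuring coordinates from the centroid gives translation invariance. Both function names are hypothetical.

```python
def raw_moment(image, p, q):
    """Discrete m_pq: sum of x^p * y^q * f(x, y) over all pixels."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))

def central_moment(image, p, q):
    """Moment taken about the centroid, so it does not change under translation."""
    m00 = raw_moment(image, 0, 0)
    cx = raw_moment(image, 1, 0) / m00
    cy = raw_moment(image, 0, 1) / m00
    return sum(((x - cx) ** p) * ((y - cy) ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))
```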

SLIDE 19

Texture feature extraction: some approaches

(a) Pixel neighbourhood features: measure directly the visual patterns occurring in the texture.

  • Color co-occurrence matrix: probability of co-occurring colors at a given distance & direction.
  • Local binary patterns: classify pixel neighbourhood distributions at several distances.

(b) Transform-based features: produce coefficients representing the weights of spatial frequencies.

  • Fourier transform: texture frequencies induce large related coefficients.
  • Wavelet transform: division of the image signal into ‘subbands’ by a low-/high-pass filter bank. High-pass coefficients tend to be close to zero. Various approaches and filters have been suggested.
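A co-occurrence matrix for one distance and direction can be sketched as a dictionary of relative frequencies. The sketch works on a single-component (e.g. grey-scale or indexed) image and assumes that at least one pixel pair exists for the chosen offset; the function name `cooccurrence` and the default offset of one pixel to the right are illustrative.

```python
from collections import Counter

def cooccurrence(image, dx=1, dy=0):
    """Relative frequency of value pairs (a, b), where b lies at offset (dx, dy) from a."""
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}
```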

SLIDE 20

Content-based retrieval from image databases

General property:

  • Retrieval is not 100 % precise.

Query types:

  • Find images having certain features, e.g. color, texture, shape, etc.
  • Find images containing certain types of objects.
  • Find image objects having certain attributes, such as shape (circle, triangle, arc, ...), size, color, etc.
  • Find images where an object of type 1 is located left of an object of type 2 (= spatial relationship)
  • Similarity search: find images (segments) similar to a given query image (segment). Applications: recognition of persons from photographs / fingerprints, recognition of military airplanes, ships, etc.

SLIDE 21

Approaches to similarity-based retrieval

(a) Direct metric approach

  • A distance function is defined for images (segments).
  • Task: find the nearest neighbor (k nearest neighbors) of the query image (segment).
  • Naive distance functions for m × n color images:
      • L1-metric: sum of pairwise Euclidean distances of RGB pixels
      • L2-metric: Euclidean distance in (m × n × 3)-dimensional space.
  • See e.g. http://www.tineye.com
  • Plenty of computation needed.
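The two naive distance functions, as worded on this slide, can be sketched for images stored as lists of rows of (r, g, b) tuples. Both images are assumed to have the same dimensions; the function names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def l1_distance(a, b):
    """Sum, over all pixel positions, of the Euclidean distance between the two RGB triples."""
    return sum(math.dist(p, q)
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b))

def l2_distance(a, b):
    """Euclidean distance between the images viewed as (m x n x 3)-dimensional vectors."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for p, q in zip(row_a, row_b)
                         for x, y in zip(p, q)))
```

Every comparison touches all m × n pixels, which is why a nearest-neighbor search over a large archive needs so much computation with this approach.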

SLIDE 22

Approaches to similarity-based retrieval (cont.)

(b) Feature-based metrics

  • Use feature extraction to reduce dimensionality, e.g.
      • Shapes of segments in the image
      • Color, texture, … features of segments or of the whole images
      • Different types of features are often combined.
  • The ‘true’ distance function d of images/segments and the distance function d’ of extracted feature vectors should approximately satisfy
    d(a, b) < d(a, c) ⇒ d’(a, b) < d’(a, c)
  • Use indexing to accelerate similarity retrieval:
      • Multidimensional indexing for feature groups
      • Inverted indexing for distinct features
      • Mixture of these

SLIDE 23

Approaches to similarity-based retrieval (cont.)

(c) Transformation approach:

  • Subsumes the metric approach.
  • Basic idea: dissimilarity is proportional to the minimum cost of transforming one image (segment) into the other.
  • Choose the image that is least dissimilar to the query image.
  • Examples of transformation operators: translation, rotation, scaling (reduction / magnification), extension, painting, etc.
  • Each operator has an associated cost function. The total cost of a transformation is the sum of the elementary costs.
  • Choose the minimum-cost chain of transformation steps for an image, then the minimum over all images in the target set.
  • More flexible than the metric approach; users can specify their own transformation operators and cost functions.
  • The metric approach supports indexing better.

SLIDE 24

Example system: QBIC (Query By Image Content)

See: http://www.research.ibm.com/topics/popups/deep/manage/html/qbic.html

  • Content-based finding of pictorial information from image & video databases
  • Feature extraction in database loading:
      • Positional color/texture
      • Object identification: manual/semiautomatic/automatic segmentation
  • Graphically expressed queries based on:
      • example images
      • user-constructed sketches and drawings (shape parameters)
      • color (principal color or color histogram)
      • texture patterns (coarseness, contrast, directionality)
      • camera and object motion (in videos)
  • Distance functions between query and image features
  • Fast searching:
      • Filtering and indexing
      • Reducing the dimensionality by transforms

SLIDE 25

Image database structures

  • The storage of the pixel matrix (possibly compressed) is usually sequential, because it spans several disk blocks.
  • Each image can be considered as a file of its own.

(a) Relational representation

  • Image relation: image id and image-level (global) properties.
  • Object relation: objects (segments, rectangles) within images, extracted manually or automatically. Attributes include: image id, object id, MBR (minimum bounding rectangle) coordinates, features.
  • Generalization: probabilistic relations; object x is in image i with probability p.
  • Queries: apply ‘normal’ database techniques using feature values in query conditions.
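The relational representation can be tried out with Python's built-in `sqlite3` module. The table names, column names, feature attributes (`shape`, `area`), and sample rows below are all invented for illustration; only the split into an image relation and an object relation with image id, object id, MBR coordinates, and features follows the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image (iid INTEGER PRIMARY KEY, title TEXT, width INTEGER, height INTEGER);
CREATE TABLE object (
    oid INTEGER PRIMARY KEY,
    iid INTEGER REFERENCES image(iid),
    x_min INTEGER, y_min INTEGER, x_max INTEGER, y_max INTEGER,  -- MBR coordinates
    shape TEXT, area REAL                                        -- example features
);
""")
conn.executemany("INSERT INTO image VALUES (?, ?, ?, ?)", [
    (1, "harbour", 640, 480),
    (2, "forest", 800, 600),
])
conn.executemany("INSERT INTO object VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    (10, 1, 5, 5, 60, 40, "circle", 1500.0),
    (11, 1, 100, 80, 300, 200, "triangle", 9000.0),
    (12, 2, 20, 30, 90, 95, "circle", 3200.0),
])

# 'Normal' database query: images containing a circular object larger than 2000 units.
rows = conn.execute("""
    SELECT DISTINCT i.iid, i.title
    FROM image i JOIN object o ON o.iid = i.iid
    WHERE o.shape = 'circle' AND o.area > 2000
    ORDER BY i.iid
""").fetchall()
```

A probabilistic relation could be approximated in the same schema by adding a probability column to the object table and filtering on it in the query condition.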

SLIDE 26

Image database structures (cont.)

(b) Spatial representation

  • E.g. using R-trees, R*-trees, etc.
  • Build a single R-tree for all images in the database.
  • A leaf page contains a set of closely located objects (their MBRs), with a list of pointers to source images.
  • Each list element contains the additional properties of the object.
  • Separate indexes can be built for properties other than spatial ones.

General observations:

  • In non-spatial respects, images can usually be treated as documents, and retrieved using techniques developed for general information retrieval.
  • Combined usage of spatial and non-spatial criteria in retrieval is achieved simply by combining (union, intersection, etc.) the pointer lists from the related indexes.

SLIDE 27

Scenario for an image database architecture

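[Figure: architecture diagram with the following labeled parts]

  • Indexes for global properties of images
  • Indexes for object features in images
  • Spatial index for objects in images
  • Object table: Oid, Iid, feat1 ... featn, Coord
  • Image table: Iid, prop1 ... propm, Ptr
  • Images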