Image filtering and image features
September 26, 2019
Outline:
- Images as signals
- Color spaces and color features
- 2D convolution
- Matched filters
- Gradient filters
- Separable convolution
Images as signals

An image is a signal y[n1, n2, c], indexed by row n1, column n2, and color plane c. Standard image formats (e.g., PPM) distribute images with three color planes: Red, Green, and Blue (RGB). To make a grayscale image (e.g., of Schwarzenegger's face), we can average the three planes:

    ybar[n1, n2] = (1/3) Σ_{c ∈ {R,G,B}} y[n1, n2, c]
Why three color planes? Light has a continuous spectrum of colors, but the human eye has just three types of color sensors (cones), each sensitive to a different band of frequencies. Because all color perception is encoded through these three sensors, hardware that produces just three discrete colors (R, G, and B) can fool the human eye into thinking that it sees a continuum of colors. That is why image formats code three discrete colors (RGB).

Illustration from Anatomy & Physiology, Connexions Web site. http://cnx.org/content/col11496/1.6/, Jun 19, 2013.
The simplest grayscale image is the average of the three color intensities, i.e.,

    ybar[n1, n2] = (1/3) Σ_{c ∈ {R,G,B}} y[n1, n2, c]

But the eye is not equally sensitive to all three planes: green contributes more to perceived brightness than either red or blue. The standard ITU-R BT.601 therefore defines luminance as

    Y[n1, n2] = 0.299 y[n1, n2, R] + 0.587 y[n1, n2, G] + 0.114 y[n1, n2, B]

The two remaining channels, Pb and Pr, are defined so that they measure the color-shift of the pixel, not its average luminance (luminance is sort of green-based, remember?).
Pb and Pr are computed, together with Y, by a linear transform of (R, G, B):

    [ Y  ]   [  0.299      0.587      0.114    ] [ R ]
    [ Pb ] = [ -0.168736  -0.331264   0.5      ] [ G ]
    [ Pr ]   [  0.5       -0.418688  -0.081312 ] [ B ]

The weight rows for Pb and Pr each sum to zero, sum(w_Pb) = sum(w_Pr) = 0, so both channels are zero for any purely gray pixel (R = G = B).

Figure: the Pr-Pb color plane at Y = 0.5. Simon A. Eugster, own work.
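As a sanity check, the transform above can be applied per-pixel with NumPy. This is a minimal sketch (the function name `rgb_to_ypbpr` is my own; the matrix entries are the BT.601 values from the slide):

```python
import numpy as np

# ITU-R BT.601 analysis matrix: rows give the weights for Y, Pb, Pr.
# The Pb and Pr rows each sum to zero, so they measure color-shift,
# not luminance.
A = np.array([[ 0.299,     0.587,     0.114   ],
              [-0.168736, -0.331264,  0.5     ],
              [ 0.5,      -0.418688, -0.081312]])

def rgb_to_ypbpr(rgb):
    """Convert an (N1, N2, 3) RGB image (floats in [0, 1]) to YPbPr."""
    return rgb @ A.T   # apply the 3x3 matrix to every pixel

# A pure-gray pixel should give Y = 0.5 and Pb = Pr = 0.
gray = np.full((1, 1, 3), 0.5)
print(rgb_to_ypbpr(gray))
```

Verifying that a gray pixel lands at Pb = Pr = 0 is a quick way to confirm the zero-sum property of the Pb and Pr weight rows.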
Some objects are characterized by warm colors (e.g., fire, or wood); others by cool colors (e.g., water, or sky). A pooled Pb value is therefore a good feature for distinguishing between, for example, "fire" versus "water" images.
Average pooling:

    Pb_avg = (1/(N1 N2)) Σ_{n1=0}^{N1-1} Σ_{n2=0}^{N2-1} Pb[n1, n2]

Problem: average pooling is sensitive to every pixel, i.e., some pixels might not be all that bluish; as a result, some "water" images have low average-pooled Pb.
Max pooling:

    Pb_max = max_{n1} max_{n2} Pb[n1, n2]

Problem: max pooling is sensitive to outliers, i.e., in the "fire" image, there might be one or two pixels that are blue, even though all of the rest are not; as a result, some "fire" images have high max-pooled Pb.
A compromise is L2-norm pooling:

    Pb_L2 = ( Σ_{n1=0}^{N1-1} Σ_{n2=0}^{N2-1} Pb^2[n1, n2] )^{1/2}

No single pixel dominates the L2 value; it tends to resemble an average of the largest values. If average pooling fails for your task, consider using L2 pooling or max-pooling instead.
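The three pooling rules above can be sketched in a few lines of NumPy (the 2x2 Pb plane here is a made-up example, not data from the slides):

```python
import numpy as np

def average_pool(pb):
    """Average pooling: sensitive to every pixel, so a few non-bluish
    pixels drag a 'water' image's score down."""
    return pb.mean()

def max_pool(pb):
    """Max pooling: sensitive to outliers, so one stray blue pixel
    in a 'fire' image gives a large score."""
    return pb.max()

def l2_pool(pb):
    """L2-norm pooling: a compromise that resembles an average of
    the largest values."""
    return np.sqrt(np.sum(pb ** 2))

pb = np.array([[0.1, 0.2],
               [0.1, 0.9]])   # hypothetical Pb plane
print(average_pool(pb), max_pool(pb), l2_pool(pb))  # ~0.325, 0.9, ~0.933
```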
The 2D convolution is just like a 1D convolution, but in two dimensions:

    z[n1, n2, c] = y[n1, n2, c] ** h[n1, n2, c]
                 = Σ_{m1=0}^{N1-1} Σ_{m2=0}^{N2-1} y[m1, m2, c] h[n1-m1, n2-m2, c]

Note that we don't convolve over the color plane, just over the rows and columns.

Suppose that y is an N1xN2 image, while h is a filter of size M1xM2. Then there are three possible ways to define the size of the output:

1. "Full": y is zero-padded, and z[n1, n2] is defined wherever the result can be nonzero. This gives z[n1, n2] the size of (N1+M1-1)x(N2+M2-1).
2. "Same": z[n1, n2] has the same size as the input image, which requires some zero-padding.
3. "Valid": z[n1, n2] is computed only at positions for which both y and h are well-defined, with no zero-padding at all. This gives z[n1, n2, c] the size of (N1-M1+1)x(N2-M2+1).
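The three output-size conventions correspond directly to the `mode` argument of `scipy.signal.convolve2d`, which we can check on a small example:

```python
import numpy as np
from scipy.signal import convolve2d

y = np.ones((6, 6))        # an N1 x N2 = 6x6 image
h = np.ones((3, 3)) / 9.0  # an M1 x M2 = 3x3 averaging filter

print(convolve2d(y, h, mode='full').shape)   # (8, 8):  (N1+M1-1, N2+M2-1)
print(convolve2d(y, h, mode='same').shape)   # (6, 6):  zero-padded to input size
print(convolve2d(y, h, mode='valid').shape)  # (4, 4):  (N1-M1+1, N2-M2+1)
```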
Example: suppose we want to calculate the difference between each pixel and its second neighbor:

    z[n1, n2] = y[n1, n2] - y[n1, n2-2]

We can do that as

    z[n1, n2] = Σ_{m1} Σ_{m2} y[m1, m2] h[n1-m1, n2-m2]

where

    h[n1, n2] = {  1   n1 = 0, n2 = 0
                  -1   n1 = 0, n2 = 2
                   0   otherwise

...we often will write this as h = [1, 0, -1].
Example: suppose we want to calculate a weighted sum of each pixel and its two neighbors:

    z[n1, n2] = y[n1, n2] + 2 y[n1, n2-1] + y[n1, n2-2]

We can do that as

    z[n1, n2] = Σ_{m1} Σ_{m2} y[m1, m2] h[n1-m1, n2-m2]

where

    h[n1, n2] = {  1   n1 = 0, n2 ∈ {0, 2}
                   2   n1 = 0, n2 = 1
                   0   otherwise

...we often will write this as h = [1, 2, 1].
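Both example filters act only along one row, so we can try them on a 1D signal with `np.convolve` (the pixel values here are an arbitrary illustration):

```python
import numpy as np

y = np.array([0., 1., 4., 9., 16., 25.])   # one row of pixels

# h = [1, 0, -1]: z[n] = y[n] - y[n-2], the second-neighbor difference.
z_diff = np.convolve(y, [1, 0, -1])
# h = [1, 2, 1]:  z[n] = y[n] + 2*y[n-1] + y[n-2], a weighted local sum.
z_avg = np.convolve(y, [1, 2, 1])

# Full convolution, so the outputs have length 6 + 3 - 1 = 8.
# Check one interior sample of each: z_diff[3] = y[3] - y[1] = 8,
# and z_avg[3] = y[3] + 2*y[2] + y[1] = 18.
print(z_diff)
print(z_avg)
```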
Two particularly useful kinds of filters:

A matched filter is designed to pick out a particular type of object (e.g., a bicycle, or a Volkswagen Beetle). The output of the filter has a large value when the object is present, and a small value otherwise.

Gradient filters come in pairs: one to estimate the horizontal image gradient, Gx[n1, n2, c] = ∂y[n1, n2, c]/∂n2, and one to estimate the vertical image gradient, Gy[n1, n2, c] = ∂y[n1, n2, c]/∂n1.
Matched filters: hypothesis testing

Suppose we have a noisy signal, x[n]. We have two hypotheses:

    H0 (signal absent):  x[n] = w[n], where w[n] is a zero-mean, unit-variance Gaussian white noise signal.
    H1 (signal present): x[n] = s[n] + w[n], where s[n] is a deterministic (non-random) signal that we know in advance.

We want to create a hypothesis test as follows: filter x[n] with some filter h[n]; if the output is above a threshold, conclude that H1 is true (signal present); if it is below the threshold, then conclude that H0 is true (signal absent). Can we design h[n] in order to maximize the probability that this classifier will give the right answer?

Filtering gives

    z[n] = x[n] * h[n] = s[n] * h[n] + w[n] * h[n]

The filtered noise v[n] = w[n] * h[n] is a Gaussian random variable with zero average:

    E[v[n]]   = 0,  because E[w[m]] = 0
    E[v^2[n]] = σ_w^2 Σ_m h^2[n-m] = Σ_m h^2[n-m]   (because σ_w^2 = 1)

So if we constrain the filter to satisfy Σ_m h^2[m] = 1, then v[n] is a zero-mean, unit-variance Gaussian random signal.
Thus

    z[n] = x[n] * h[n] = s[n] * h[n] + v[n]

where v[0] is a zero-mean, unit-variance Gaussian random variable. We have two hypotheses:

    H0 (signal absent):  z[0] = v[0]
    H1 (signal present): z[0] = v[0] + Σ_m s[m] h[-m]

Goal: we know s[m]. We want to design h[m] so that Σ_m s[m] h[-m] is as large as possible, subject to the constraint that Σ_m h^2[m] = 1.

The solution is the matched filter, h[n] ∝ s[-n]. Specifically,

    h[n] = s[-n] / sqrt(Σ_m s^2[m])

Under H0 (signal absent), z[0] is a zero-mean, unit-variance Gaussian (ZMUVG):

    z[0] = v[0]

Under H1 (signal present), z[0] is a ZMUVG plus a constant:

    z[0] = v[0] + Σ_m s[m] h[-m] = v[0] + Σ_m s^2[m] / sqrt(Σ_m s^2[m]) = v[0] + sqrt(Σ_m s^2[m])

The decision rule puts the threshold halfway between the two means:

1. If z[0] > 0.5 sqrt(Σ_m s^2[m]), then conclude that H1 is true (signal present).
2. If z[0] < 0.5 sqrt(Σ_m s^2[m]), then conclude that H0 is true (signal absent).
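The whole detector fits in a few lines. This is a sketch under the slide's assumptions (known s[m], unit-variance white Gaussian noise); the signal values and the helper name `detect` are my own:

```python
import numpy as np

rng = np.random.default_rng(0)
s = np.array([1., 2., 2., 1.])             # the known deterministic signal s[m]
h = s[::-1] / np.sqrt(np.sum(s ** 2))      # matched filter h[m] = s[-m] / sqrt(sum s^2)
threshold = 0.5 * np.sqrt(np.sum(s ** 2))  # halfway between the H0 and H1 means

def detect(x):
    """Return True (H1: signal present) if z[0] = sum_m x[m] h[-m] > threshold."""
    z0 = np.sum(x * s) / np.sqrt(np.sum(s ** 2))  # same as convolving and reading z[0]
    return z0 > threshold

print(detect(np.zeros(4)))  # noise-free H0: False
print(detect(s))            # noise-free H1: True
print(detect(s + 0.1 * rng.standard_normal(4)))  # noisy H1: almost always True
```

Under H1 the mean of z[0] is sqrt(Σ s^2) ≈ 3.16, well above the threshold ≈ 1.58, so small noise rarely flips the decision.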
Matched filter for images: we want the output at the center,

    z_k[0, 0, c] = Σ_{m1} Σ_{m2} y_k[m1, m2, c] h[-m1, -m2, c]

to be as large as possible, on average, over the training images 0 ≤ k ≤ 11, subject to the constraint Σ_{m1} Σ_{m2} h^2[m1, m2, c] = 1.

Solution: the matched filter is proportional to the average training image, flipped in both directions:

    h[n1, n2, c] ∝ (1/12) Σ_k y_k[-n1, -n2, c]

To give the valid output a nonzero size, trim the filter: drop the first two rows, last two rows, first two columns, and last two columns of the average image. Then M1 = N1-4 and M2 = N2-4, so the "valid" output will be 5x5.
Now convolve each image with the matched filter:

    z_k[n1, n2, c] = y_k[n1, n2, c] ** h[n1, n2, c]
                   = Σ_{m1} Σ_{m2} y_k[m1, m2, c] h[n1-m1, n2-m2, c]

Valid output pixels are the values of z_k[n1, n2, c] whose summations include only valid pixels of y_k[m1, m2, c] and h[n1-m1, n2-m2, c]. The result has size (N1-M1+1)x(N2-M2+1) = 5x5.

The best match should occur at n1 = 0, n2 = 0, which is the pixel in the middle of the valid output. However, that middle pixel is rarely the best match. Instead, the best match often occurs a few pixels to the right, left, up, or down, implying that this particular image is shifted relative to the mean image.
The gradient of an image turns each image plane, c, into a pair of image planes:

    ∇y[n1, n2, c] = ( ∂y[n1, n2, c]/∂n1 , ∂y[n1, n2, c]/∂n2 )

We usually divide the gradient into two sub-images, the horizontal gradient Gx and the vertical gradient Gy:

    Gx[n1, n2, c] = ∂y[n1, n2, c]/∂n2
    Gy[n1, n2, c] = ∂y[n1, n2, c]/∂n1

Of course we can't really calculate the derivative of a discrete image, so we approximate it using filters:

    Gx[n1, n2, c] = y[n1, n2, c] ** hx[n1, n2] ≈ ∂y[n1, n2, c]/∂n2
    Gy[n1, n2, c] = y[n1, n2, c] ** hy[n1, n2] ≈ ∂y[n1, n2, c]/∂n1
The Sobel mask is a particularly simple approximation to the gradient: it takes the difference between neighboring pixels, smoothed in the perpendicular direction:

    hx[n1, n2] = [ 1  -1 ]      hy[n1, n2] = [  1   2   1 ]
                 [ 2  -2 ]                   [ -1  -2  -1 ]
                 [ 1  -1 ]

The Sobel mask is very popular, in part, because each of the 2D filters can be separated into a row-filter, followed by a column-filter:

    hx[n1, n2] = [ 1 ]                  hy[n1, n2] = [  1 ] * [ 1  2  1 ]
                 [ 2 ] * [ 1  -1 ]                   [ -1 ]
                 [ 1 ]
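We can build these masks with outer products and confirm that Gx responds to a vertical edge while Gy does not (the 5x5 step image is a made-up test case):

```python
import numpy as np
from scipy.signal import convolve2d

# The masks from the slide: hx = column [1,2,1] times row [1,-1],
# hy = column [1,-1] times row [1,2,1].
hx = np.outer([1, 2, 1], [1, -1])
hy = np.outer([1, -1], [1, 2, 1])

# A test image with one vertical edge (a left-dark, right-bright step).
y = np.zeros((5, 5))
y[:, 3:] = 1.0

Gx = convolve2d(y, hx, mode='valid')
Gy = convolve2d(y, hy, mode='valid')
# The horizontal gradient fires on the edge (peak magnitude 1*1 + 2*1 + 1*1 = 4);
# the vertical gradient is zero because every row is identical.
print(np.abs(Gx).max(), np.abs(Gy).max())  # -> 4.0 0.0
```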
A "separable filter" is one that can be written as the product of a row-filter times a column-filter:

    h[n1, n2] = h1[n1] h2[n2]

If the filter can be separated, then the convolution can also be separated:

    Σ_{m1} Σ_{m2} y[m1, m2] h[n1-m1, n2-m2] = Σ_{m2} ( Σ_{m1} y[m1, m2] h1[n1-m1] ) h2[n2-m2]
The direct form requires a double summation, with a computational complexity per output pixel equal to (# rows) x (# columns) of the filter:

    z[n1, n2] = Σ_{m1} Σ_{m2} y[m1, m2] h[n1-m1, n2-m2]

The separated form replaces it with two single summations. First a column pass, with complexity (# rows):

    w[n1, n2] = Σ_{m1} y[m1, n2] h1[n1-m1]

then a row pass, with complexity (# columns):

    z[n1, n2] = Σ_{m2} w[n1, m2] h2[n2-m2]

The total complexity is (# rows) + (# columns). Usually, a computational complexity of (# rows) + (# columns) is much, much less than (# rows) x (# columns)!
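We can verify that the two-pass separable form produces exactly the same output as the direct 2D convolution (the random image and the [1,2,1] x [1,0,-1] filter are illustrative choices):

```python
import numpy as np
from scipy.signal import convolve2d

h1 = np.array([1., 2., 1.])    # column filter
h2 = np.array([1., 0., -1.])   # row filter
h = np.outer(h1, h2)           # the full, non-separated 2D filter

rng = np.random.default_rng(1)
y = rng.standard_normal((64, 64))

# Direct 2D convolution: cost per output pixel ~ M1 * M2.
z_direct = convolve2d(y, h, mode='full')

# Separated: filter the columns with h1, then the rows with h2.
# Cost per output pixel ~ M1 + M2.
w = np.apply_along_axis(np.convolve, 0, y, h1)       # column pass
z_sep = np.apply_along_axis(np.convolve, 1, w, h2)   # row pass

print(np.allclose(z_direct, z_sep))  # -> True
```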
One-feature classifiers

Test some feature, f[k]. Say that the test image is "class 1" if and only if f[k] exceeds a threshold.

Example: airplanes (blue) vs. skyscrapers (green), with a threshold for the color feature, a threshold for the matched-filtering feature, and a threshold for the gradient feature.

Notice that the only thresholds worth testing are the values of feature[k] that are actually measured for at least one datum. Varying the threshold between one datum and the next makes no change, at all, in the accuracy, so it's not useful to test those thresholds.
The "accuracy spectrum" for a particular feature is the list of all possible accuracies that could be achieved by any one-feature classifier: test, as a threshold, every value of feature[k] observed for any training image, and record the resulting accuracies.

What happens if some accuracies are below 50%? That just means that you should use, instead, a "negative polarity" classifier: call an image "class 0" if feature[k] >= threshold. The accuracy of the "negative polarity" classifier is 1 minus the accuracy of the "positive polarity" classifier, so the best achievable accuracy at each threshold is max(accuracy, 1 - accuracy).
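A sketch of the accuracy spectrum computation (the function name and the six-image feature/label arrays are hypothetical illustrations, not the course's data):

```python
import numpy as np

def accuracy_spectrum(feature, labels):
    """For each measured feature value (the only thresholds worth testing),
    compute the accuracy of the best-polarity one-feature classifier."""
    accs = []
    for t in np.unique(feature):
        pred = feature >= t                  # positive polarity: class 1 if f >= t
        acc = np.mean(pred == labels)
        accs.append(max(acc, 1 - acc))       # negative polarity flips the decision
    return np.array(accs)

# Hypothetical data: airplanes (label 0) vs. skyscrapers (label 1).
feature = np.array([0.1, 0.2, 0.4, 0.6, 0.7, 0.9])
labels  = np.array([0,   0,   0,   1,   1,   1  ], dtype=bool)
print(accuracy_spectrum(feature, labels).max())  # -> 1.0 (threshold 0.6 separates perfectly)
```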
Example: airplanes vs. skyscrapers
Skyscrapers have a lot of vertical edges (Gx is large), while airplanes have a lot of horizontal edges (Gy is large).
Matched-filter features also work well here, because Beetles and bicycles have pretty matchable shapes.