Stephan Kopf
Department of Computer Science IV, University of Mannheim, Germany
Motivation
Part I: Basic Retargeting Operations
- Scaling and cropping
- Regions of interest
- Automatic crop & scale
- Sports video adaptation
Part II: Seam Carving
- Seam carving for images
- Preservation of straight lines
- Fast seam carving for videos
Summary
15.02.2011 Stephan Kopf 2
Mobile phones are multimedia devices that can
- browse the Web
- display images and videos
- support novel input technologies (multi-touch)
But they still have limitations:
- Small screen size
- Wireless connection (bandwidth)
- Computational power (CPU, memory)
- Battery
Typical resolutions of images and videos
- Digital camera: 10 megapixels (3,600 x 2,700 pixels)
- Camcorder: high definition (1,920 x 1,080 pixels)
- Mobile phone: 240 x 320 pixels
Figure panels: HD video (bitrate: 24 Mbit/s), mobile phone
Distortions caused by scaling (aspect ratio)
Goals of media retargeting
Shrink photos and videos for presentation on a mobile phone (this automatically limits the bitrate)
Keep the aspect ratio
Preserve the most important visual content
Algorithms for image and video retargeting
Shrink the image (merge pixels) by a fixed scale factor (uniform scaling)
Different scale factors for each axis change the aspect ratio (non-uniform scaling)
Relevance of image content is ignored
"Letterboxing" is used to preserve the aspect ratio
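As a rough illustration of uniform scaling with letterboxing, the sketch below computes the scaled size and the width of the bars for a given display. The function name `letterbox_size` and the landscape 320 x 240 target are assumptions made for this example, not part of the original slides.

```python
def letterbox_size(src_w, src_h, dst_w, dst_h):
    """Uniformly scale (src_w x src_h) so that it fits inside (dst_w x dst_h).

    Returns (scaled_w, scaled_h, pad_x, pad_y), where pad_x and pad_y are the
    widths of the bars on each side (left/right and top/bottom).
    """
    # A single factor for both axes keeps the aspect ratio unchanged.
    scale = min(dst_w / src_w, dst_h / src_h)
    scaled_w = round(src_w * scale)
    scaled_h = round(src_h * scale)
    pad_x = (dst_w - scaled_w) // 2
    pad_y = (dst_h - scaled_h) // 2
    return scaled_w, scaled_h, pad_x, pad_y


# HD video shown on a hypothetical 320 x 240 display
print(letterbox_size(1920, 1080, 320, 240))
```

Using two different factors (`dst_w / src_w` and `dst_h / src_h`) instead of their minimum would fill the display but distort the content, which is the non-uniform case described above.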
Crop image borders until the aspect ratios of image and display match
Relevance of image content is ignored: important content may be lost
Typically, scaling is then used to convert to the target size
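A minimal sketch of content-agnostic cropping, assuming the borders are removed symmetrically around the image center (the slides do not say which borders are cropped). The helper `center_crop_box` is a hypothetical name.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, width, height) of the largest centered window
    whose aspect ratio matches dst_w:dst_h."""
    # Compare the aspect ratios by integer cross-multiplication.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the display: crop the left and right borders.
        crop_w = src_h * dst_w // dst_h
        crop_h = src_h
    else:
        # Source is taller than (or equal to) the display: crop top and bottom.
        crop_w = src_w
        crop_h = src_w * dst_h // dst_w
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, crop_w, crop_h
```

For a 1,920 x 1,080 frame and a 4:3 display this keeps a 1,440 x 1,080 window, which would then be scaled down to the target size.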
Idea
Identify the most relevant image regions (regions of interest)
Crop borders but preserve regions of interest
Use automatic algorithms to identify regions of interest:
- Saliency maps
- Faces
- Text regions
Assumption: Image regions that are relevant for an observer have a high contrast
Step 1: Contrast map of an image of size n × m:
- color of a pixel: $p_{i,j}$
- pixel in the local neighborhood of $p_{i,j}$: $q$
- distance function: $d(\cdot)$
Step 2: Quantize contrast map
Step 3: Find connected regions
Step 4: Mark region of interest
Source: Ma, Zhang: Contrast-based image attention analysis by using fuzzy growing, ACM Intl. Conf. on Multimedia, 2003
Figure panels: contrast map, quantized contrast map, region of interest, bounding box
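The steps above can be sketched roughly as follows. This is a simplified illustration, not the cited method: the contrast of a pixel is taken as the sum of distances to its 8 neighbors, the distance function is assumed to be the absolute gray-level difference, and the connected-region analysis of Step 3 is skipped (all pixels in the highest quantization level are boxed together). Function names are hypothetical.

```python
def contrast_map(img):
    """Sum of absolute differences between each pixel and its 8 neighbors."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    # Skip the pixel itself and positions outside the image.
                    if (di or dj) and 0 <= ni < h and 0 <= nj < w:
                        out[i][j] += abs(img[i][j] - img[ni][nj])
    return out


def roi_bounding_box(cmap, levels=4):
    """Quantize the contrast map into `levels` bins and return the bounding
    box (top, left, bottom, right) of all pixels in the highest bin.
    Simplification: no connected-region analysis is performed."""
    peak = max(max(row) for row in cmap)
    if peak == 0:
        return None  # flat image, no salient region
    hits = [(i, j) for i, row in enumerate(cmap) for j, c in enumerate(row)
            if c * levels // peak >= levels - 1]
    rows = [i for i, _ in hits]
    cols = [j for _, j in hits]
    return min(rows), min(cols), max(rows), max(cols)
```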
Use automatic face detection algorithms to localize face regions
Frontal face detection algorithms are very robust (in contrast to face recognition)
Characteristic features of text:
- horizontal alignment
- significant luminance difference between text and background
- the character size is within a certain range
- single-colored
- text is visible in consecutive frames (video)
- horizontal or vertical motion is possible (video)
Calculate a horizontal projection profile to detect the boundaries of text lines
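A horizontal projection profile can be sketched as the number of text pixels per image row; runs of rows with a non-zero count are candidate text lines. The binarized input and the `min_pixels` parameter are assumptions made for this example.

```python
def text_line_bounds(binary, min_pixels=1):
    """binary: rows of 0/1 values (1 = candidate text pixel).
    Returns a list of (first_row, last_row) pairs, one per text line."""
    profile = [sum(row) for row in binary]  # horizontal projection profile
    bounds, start = [], None
    for y, count in enumerate(profile):
        if count >= min_pixels and start is None:
            start = y                       # a text line begins
        elif count < min_pixels and start is not None:
            bounds.append((start, y - 1))   # the text line ended above row y
            start = None
    if start is not None:
        bounds.append((start, len(profile) - 1))
    return bounds
```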
Calculate an importance value V for each region of size H:
- minimum perceptible size: $H_{min}$
- maximum reasonable size: $H_{max}$
Find the optimal target region W based on the regions of interest $S_i$
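The exact objective is not recoverable from the slide text, so the following is only one plausible sketch: a window of fixed size is moved over the image, and the position that fully contains the regions of interest with the highest summed importance wins. The scoring rule, the function name, and the tuple layout are assumptions.

```python
def best_window(img_w, img_h, win_w, win_h, regions, step=1):
    """regions: list of (x, y, w, h, importance) boxes.
    Returns (score, left, top) of the best window position."""
    best = (-1.0, 0, 0)
    for top in range(0, img_h - win_h + 1, step):
        for left in range(0, img_w - win_w + 1, step):
            # Only regions that lie completely inside the window count.
            score = sum(v for (x, y, w, h, v) in regions
                        if left <= x and top <= y
                        and x + w <= left + win_w and y + h <= top + win_h)
            if score > best[0]:
                best = (score, left, top)
    return best
```

Trying several window sizes with the display's aspect ratio and scaling the winner to the target size would give an automatic crop & scale along the lines described here.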
Figure panels: selection of one feature; combination of two features; … three features; full image
Figure panels: scaling, cropping, crop & scale
Automatically detect:
- Court lines
- Players
- Ball
Figure panels: scaled video, modified video content
Source: Kopf, Guthier, Farin, Han: Analysis and Retargeting of Ball Sports Video, IEEE Workshop on Applications of Computer Vision, 2011
Step 1: Mark bright pixels (line pixels)
Step 2: Algorithm to detect straight lines (based on RANSAC)
- 1. Randomly select two line pixels and calculate the line parameters
- 2. Count the number N of white pixels located on the line
- 3. If N > threshold, stop
- 4. Otherwise go to 1.
Step 3: Remove line pixels and detect the next line (Step 2)
RANSAC: Fischler, Bolles: Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, Communications of the ACM, vol. 24(6), 1981.
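Step 2 can be sketched as below. The distance tolerance `tol`, the iteration limit `max_iter`, and the seeded random generator are assumptions added so that the example terminates and is reproducible; the slides only specify the loop itself.

```python
import random


def ransac_line(points, threshold, tol=1.0, max_iter=1000, rng=None):
    """points: list of distinct (x, y) line pixels.
    Returns (inliers, (a, b, c)) for a line a*x + b*y + c = 0 that has more
    than `threshold` pixels on it, or None if no such line was found."""
    rng = rng or random.Random(0)
    for _ in range(max_iter):
        # 1. Randomly select two line pixels and calculate line parameters.
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        a, b = y2 - y1, x1 - x2
        norm = (a * a + b * b) ** 0.5
        if norm == 0:
            continue
        c = -(a * x1 + b * y1)
        # 2. Count the pixels located on (close to) the line.
        inliers = [(x, y) for x, y in points
                   if abs(a * x + b * y + c) / norm <= tol]
        # 3. Stop as soon as enough pixels support the line.
        if len(inliers) > threshold:
            return inliers, (a / norm, b / norm, c / norm)
        # 4. Otherwise try another pair.
    return None
```

For Step 3, the returned inliers would be removed from `points` and the function called again until it returns `None`.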
Problem: The position of the lines changes from frame to frame
Solution: Use a reference court model to estimate the camera motion
- Step 1: Calculate intersection points of two lines
- Step 2: Transform lines to court model
How many intersection points do we need for the transformation?
Translation (horizontal/vertical shift): 1 intersection point
Translation and scaling: 2 intersection points
Affine transform (translation, scaling, rotation): 3 intersection points
Perspective transform: 4 intersection points
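One way to see where these counts come from: each intersection point matched to the court model yields two equations (one for x, one for y), so a transform with k free parameters needs at least k/2 points. The parameter counts below are the standard ones; treating "translation and scaling" as having independent scale factors per axis is an assumption.

$$
\begin{array}{lcc}
\text{Transform} & \text{Free parameters } k & \text{Points needed } \lceil k/2 \rceil \\
\text{Translation} & 2 & 1 \\
\text{Translation and scaling} & 4 & 2 \\
\text{Affine} & 6 & 3 \\
\text{Perspective} & 8 & 4
\end{array}
$$

The points must also be in general position; for the perspective case, no three of the four points may lie on one line.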
Figure panels: cropping; scaling; crop & scale (zoom on largest player); modify lines & ball
If important content is located near the image borders, crop & scale is not applicable
Idea of seam carving*
Systematic removal of less important pixels
Use an energy function as a measure of the "importance" of single pixels
*Source: Shai Avidan and Ariel Shamir: Seam Carving for Content-Aware Image Resizing. ACM SIGGRAPH, 2007
Image width should be reduced by 40 percent
Figure panels: original image, energy map
Remove the N pixels with the lowest energy from each row
Figure panels: source image; result after removing N = 200 pixels from each row based on energy values
Sum the energy in each column of the image and remove the N columns with the lowest energy
Figure panels: original image; result after removing 200 columns based on the energy values of columns
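The column-based variant can be sketched in a few lines, assuming the image and its energy map are given as lists of rows (the function name is hypothetical):

```python
def remove_low_energy_columns(image, energy, n):
    """Drop the n columns whose summed energy is lowest."""
    width = len(energy[0])
    col_energy = [sum(row[x] for row in energy) for x in range(width)]
    # Indices of the n cheapest columns.
    drop = set(sorted(range(width), key=lambda x: col_energy[x])[:n])
    return [[v for x, v in enumerate(row) if x not in drop] for row in image]
```

Because whole columns are removed, a column that is cheap on average is deleted even where it crosses an important object; this is the limitation that seams address.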
A vertical seam is an 8-connected path of pixels from top to bottom that contains one and only one pixel in each row.
Formal definition:
$$s^x = \{s_i^x\}_{i=1}^{n} = \{(x(i), i)\}_{i=1}^{n}, \quad \text{subject to } \forall i: |x(i) - x(i-1)| \le 1$$
Horizontal seams are defined in an analogous way.
Advantage of seams compared to columns or rows:
- Pixels of low energy are removed
- Relevant objects are preserved
Remove the vertical seam with the lowest energy
Repeat this step N times
Figure panels: source image; result after removing N = 200 seams with the lowest energy
Seam carving uses an energy function that characterizes the relevance of each pixel (similar to saliency maps).
The optimal seam minimizes the cumulative pixel energy of all seam pixels.
Method to find the optimal seam: dynamic programming
M(i, j) specifies the cost of the optimal (vertical) seam from the upper image border to pixel position (i, j)
Calculate M(i, j) recursively:
$$M(i, j) = e(i, j) + \min\big(M(i-1, j-1),\ M(i-1, j),\ M(i-1, j+1)\big)$$
Example how to calculate the optimal seam:
Figure panels: energy map, cumulated energy map M(i, j)
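The recursion can be sketched as follows, with i as the row index and j as the column index. The 3 x 3 energy map in the usage example is made up for illustration and is not the example from the slide.

```python
def find_vertical_seam(energy):
    """Return (seam, M): seam[i] is the column of the seam pixel in row i,
    M is the cumulated energy map."""
    h, w = len(energy), len(energy[0])
    M = [list(energy[0])] + [[0] * w for _ in range(h - 1)]
    for i in range(1, h):
        for j in range(w):
            # M(i, j) = e(i, j) + minimum of the three neighbors in row i-1
            M[i][j] = energy[i][j] + min(M[i - 1][max(j - 1, 0):j + 2])
    # Backtrack from the cheapest pixel in the bottom row.
    j = min(range(w), key=lambda x: M[h - 1][x])
    seam = [j]
    for i in range(h - 1, 0, -1):
        lo = max(j - 1, 0)
        window = M[i - 1][lo:j + 2]
        j = lo + window.index(min(window))
        seam.append(j)
    seam.reverse()
    return seam, M


def remove_vertical_seam(image, seam):
    """Delete one pixel per row; the image becomes one column narrower."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


energy = [[1, 4, 3],
          [5, 2, 6],
          [7, 8, 1]]
seam, M = find_vertical_seam(energy)
print(seam)   # [0, 1, 2]
print(M[-1])  # [10, 11, 4]
```

Reducing the width by N pixels means recomputing the energy and repeating both functions N times.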
Image gradient: a simple energy function that calculates the luminance difference to adjacent pixels:
$$e(I(x, y)) = \left|\frac{\partial}{\partial x} I(x, y)\right| + \left|\frac{\partial}{\partial y} I(x, y)\right|$$
Assumption: Luminance values do not differ much in image regions of low relevance
This simple energy function gives good results in many cases
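A discrete version of this energy function might look like the sketch below. Forward differences with clamping at the right and bottom borders are an assumption; other discretizations (central differences, Sobel filters) are equally common.

```python
def gradient_energy(lum):
    """lum: rows of luminance values. Returns e = |dI/dx| + |dI/dy| per pixel,
    approximated with forward differences (clamped at the image borders)."""
    h, w = len(lum), len(lum[0])
    return [[abs(lum[y][min(x + 1, w - 1)] - lum[y][x]) +
             abs(lum[min(y + 1, h - 1)][x] - lum[y][x])
             for x in range(w)]
            for y in range(h)]
```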
Problem: The lighthouse is an important region, but its pixel values are very similar
Figure panels: original image, optimal seams, result
Combine the energy function with a saliency map:
$$e_{sal}(I(x, y)) = w_s \cdot saliency(x, y) + e(I(x, y))$$
Figure panels: saliency map, optimal seams, result ($e_{sal}$ is used as energy function)
Source: Hwang and Chien: Content-Aware Image Resizing using Perceptual Seam Carving with Human Attention Model. IEEE Conference on Multimedia and Expo, 2008.
Use results from face detection as additional saliency:
$$e_{sal+face}(I(x, y)) = w_s \cdot saliency(x, y) + w_f \cdot face(x, y) + e(I(x, y))$$
Figure panels: saliency map; face map; seams based on $e_{sal+face}$ as energy function; result
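Read as a weighted sum of gradient energy, saliency map, and face map (this additive reading and the default weights are assumptions), the combination could be sketched as:

```python
def combined_energy(energy, saliency, face, w_s=1.0, w_f=1.0):
    """Per-pixel weighted sum: w_s * saliency + w_f * face + gradient energy.
    All three maps must have the same dimensions."""
    return [[w_s * s + w_f * f + e for e, s, f in zip(e_row, s_row, f_row)]
            for e_row, s_row, f_row in zip(energy, saliency, face)]
```

A large `w_f` makes pixels inside detected faces expensive, so seams route around them even where the gradient energy is low.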
The quality of seam carving drops significantly in the presence of straight lines
Figure panels: original image; seam carving (width reduced by 40%)
Source: Kiess, Kopf, Guthier and Effelsberg: Seam Carving with Improved Edge Preservation. Proc. of IS&T/SPIE Electronic Imaging, 2010.
Problem: Lines become distorted when seams are removed
Figure panels: image section visualizing a straight line; seams intersect a straight line; result after removal of seams
This is especially critical when several seams intersect a line in adjacent pixel positions
Idea: Distribute the intersection points of seams and lines along the line
Figure panels: seams intersect a line in adjacent pixel positions (result: errors are clearly visible); equal distribution of intersection points (result: errors are much less obvious)
Implementation: Modify the energy function before the next optimal seam is calculated
Intersection point of seam and line: increase the energy values within a certain radius
The following seams will avoid these pixels
Figure panels: seam intersects a straight line; modify energy function next to the intersection point; detect next seams and modify energy function for each intersection
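The energy modification can be sketched as adding a penalty to all pixels in a disc around the intersection point. The radius, the penalty value, and the function name are assumptions; the cited paper may choose them differently.

```python
def penalize_intersection(energy, cx, cy, radius=2, penalty=1000):
    """Add `penalty` to every pixel within `radius` of (cx, cy), in place.
    energy is indexed as energy[y][x]."""
    h, w = len(energy), len(energy[0])
    for y in range(max(cy - radius, 0), min(cy + radius + 1, h)):
        for x in range(max(cx - radius, 0), min(cx + radius + 1, w)):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                energy[y][x] += penalty
    return energy
```

Calling this once for every pixel where the removed seam crossed a detected line makes the next seams cross that line somewhere else.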
Figure panels: original image; seam carving; seam carving with line preservation
1. Idea: Use seam carving on each frame separately
Result: the video becomes blurred and shaky
Figure panels: original, adapted
Improvements
Video defines a 3D space-time volume (3D cube)
Remove 2D seam manifolds (seam surfaces) where each seam pixel is connected in 3D
Use graph cuts (max-flow min-cut) to detect the optimal seam manifold
Source: Rubinstein, Shamir, Avidan: Improved seam carving for video retargeting. ACM Trans. Graph. 27, 3, 2008.
Figure: graph spanning frame 1 to frame N along the time axis, with source node, sink node, and edges weighted by the energy between pixels
Computational effort?
Idea
Create one image that aggregates the pixel values / energy values of all frames
Detect a 1D seam in the aggregated image
Map this seam back to all frames
Source: Kopf, Kiess, Lemelson, Effelsberg: FSCAV - Fast Seam Carving for Size Adaptation of Videos, ACM Intl. Conf. on Multimedia, 2009.
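For a static camera, the idea can be sketched as below. Aggregating by the per-pixel maximum is an assumption (a sum or mean would also fit the description), and the seam search is the standard dynamic-programming one; this is not the FSCAV implementation itself.

```python
def aggregate_energy(frame_energies):
    """Per-pixel maximum of the energy maps of all frames."""
    h, w = len(frame_energies[0]), len(frame_energies[0][0])
    return [[max(f[y][x] for f in frame_energies) for x in range(w)]
            for y in range(h)]


def find_seam(energy):
    """Minimal-energy vertical seam via dynamic programming."""
    h, w = len(energy), len(energy[0])
    M = [list(energy[0])]
    for i in range(1, h):
        M.append([energy[i][j] + min(M[i - 1][max(j - 1, 0):j + 2])
                  for j in range(w)])
    j = min(range(w), key=lambda x: M[-1][x])
    seam = [j]
    for i in range(h - 1, 0, -1):
        lo = max(j - 1, 0)
        window = M[i - 1][lo:j + 2]
        j = lo + window.index(min(window))
        seam.append(j)
    return seam[::-1]


def carve_video(frames, frame_energies):
    """Find one seam in the aggregated image and remove it from every frame."""
    seam = find_seam(aggregate_energy(frame_energies))
    return [[row[:x] + row[x + 1:] for row, x in zip(frame, seam)]
            for frame in frames]
```

Since the same seam is removed from every frame, the result cannot jitter between frames, and only one seam search is needed per removed column instead of one per frame.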
Problem: camera motion, zoom, panning
Use image registration techniques to calculate the parameters of the camera model (use a perspective camera model)
Align frames and create a background image
Detect the optimal seam in the background image
Use the inverse camera motion to transform the optimal seam back to all original frames
Example: construct background image from a camera pan
Figure panels: scaling, seam carving
Figure panels: scaling, fast seam carving
The quality of adapted images or videos depends on the visual content. The results of crop & scale might be much better than those of seam carving, or vice versa.
Crop & scale typically works well if the relevant content is located in a small region.
In case of large background areas, many seams with low energy are detected and the results of seam carving are very good.
No technique works well if most of the content is highly relevant.
Would it be possible to find better energy functions for seams?
Would it be possible to preserve other geometric objects similar to straight lines?
Would it be possible to automatically evaluate the quality of adapted images or videos?