SLIDE 1

Time Domain Lapped Transform and its Applications

Jie Liang (jiel@sfu.ca)
School of Engineering Science, Simon Fraser University, Vancouver, BC, Canada
http://www.ensc.sfu.ca/people/faculty/jiel/

BIRS Workshop on Multimedia and Mathematics
Banff, AB, Canada, July 23-28, 2005

Acknowledgements

  • Prof. Trac D. Tran, The Johns Hopkins University
  • Dr. Chengjie Tu, Microsoft Corporation
  • Dr. Lu Gan, The University of Newcastle, Australia
SLIDE 2

Jie Liang

  • Simon Fraser University

Banff July 27, 2005

Outline

  • Introduction: from filter bank to time-domain lapped transform (TDLT)
  • Fast TDLT
  • Pre/post-filtering for 2D and 3D wavelet transform
  • Error resilient TDLT
  • Current Works
  • Summary

SLIDE 3

Filter Bank Fundamental

[Figure: M-channel filter bank. The input x[n] passes through analysis filters H_0(z), H_1(z), ..., H_{M-1}(z), each followed by downsampling by M; after processing, each channel is upsampled by M and passed through synthesis filters F_0(z), F_1(z), ..., F_{M-1}(z) to produce x̂[n].]

Polyphase form:

$$Y(z) = \left( P_0 + P_1 z^{-1} + \cdots + P_{K-1} z^{-K+1} \right) X(z) = P(z)\, X(z)$$

$$\begin{bmatrix} y_m \\ y_{m+1} \\ y_{m+2} \end{bmatrix} = \begin{bmatrix} P_{K-1} & \cdots & P_1 & P_0 & & \\ & P_{K-1} & \cdots & P_1 & P_0 & \\ & & P_{K-1} & \cdots & P_1 & P_0 \end{bmatrix} \begin{bmatrix} x_{m-K+1} \\ \vdots \\ x_m \\ x_{m+1} \\ x_{m+2} \end{bmatrix}$$

$$y_m = P_{K-1} x_{m-K+1} + \cdots + P_1 x_{m-1} + P_0 x_m$$
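
The time-domain relation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lapped_analysis`, the zero initial state before the first block, and the toy 2 x 2 matrices are hypothetical and do not come from the talk.

```python
def matvec(a, v):
    """Multiply matrix a (a list of rows) by vector v."""
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def lapped_analysis(blocks, taps):
    """Compute y_m = sum_k P_k x_{m-k} for a list of input blocks.

    blocks: list of length-M input vectors x_0, x_1, ...
    taps:   list of M x M matrices [P_0, P_1, ..., P_{K-1}]
    Blocks before x_0 are taken to be zero (an assumption of this sketch).
    """
    m = len(blocks[0])
    outputs = []
    for idx in range(len(blocks)):
        y = [0.0] * m
        for k, p_k in enumerate(taps):
            if idx - k >= 0:
                term = matvec(p_k, blocks[idx - k])
                y = [yi + ti for yi, ti in zip(y, term)]
        outputs.append(y)
    return outputs


# Toy example with M = 2 and K = 2 (values chosen only for illustration).
P0 = [[1.0, 0.0], [0.0, 1.0]]
P1 = [[0.0, 1.0], [1.0, 0.0]]
x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
y = lapped_analysis(x, [P0, P1])
```

With K = 1 the loop reduces to an ordinary block transform; each extra tap lets one more past block contribute to the current output block.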

SLIDE 4

Filter Bank Fundamental

Perfect reconstruction: $R(z)P(z) = I$, or $R(z) = P^{-1}(z)$.

Fast implementation: factorization of $P(z)$ into a cascade of stages $G_0(z), G_1(z), \ldots, G_{K-1}(z)$ between $x_m$ and $y_m$.

[Figure: polyphase representation of the filter bank. x(z) enters P(z) to give y(z), and R(z) produces x̂(z), with channels 0 to M-1; shown next to the equivalent M-channel analysis bank H_0(z), ..., H_{M-1}(z) and synthesis bank F_0(z), ..., F_{M-1}(z).]

SLIDE 5

Linear Phase Filter Banks

Linear phase: desired property for image/video coding.

General structure [Vaidyanathan93, Gao01, Gan01]:

[Figure: cascade of stages G_1(z), G_2(z), ..., G_{K-1}(z), each built from delays $z^{-1}$ and matrices $U_i$, $V_i$.]

$U_i, V_i$: invertible matrices. Can be optimized for different applications.

SLIDE 6

Rate-Distortion Optimization

Objective: design the filter bank to minimize the MSE for a given bit rate.

[Figure: x enters P(z), is quantized by Q, and P^{-1}(z) produces x̂.]

$$\hat{x} = x + e, \qquad \sigma_P^2 = \operatorname{trace}\{ R_{ee} \}$$

Design criterion: coding gain, the MSE reduction of transform coding w.r.t. PCM:

$$\gamma = 10 \log_{10} \frac{\sigma_{PCM}^2}{\sigma_P^2}$$
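
As a worked example of the coding-gain criterion, the sketch below evaluates it for the 8-point DCT on a unit-variance AR(1) source. Several things here are assumptions rather than content of the talk: the correlation coefficient 0.95, the function names, and the use of the standard high-resolution result that, for an orthonormal transform with optimal bit allocation, the gain equals the ratio of the arithmetic to the geometric mean of the coefficient variances.

```python
import math


def dct_matrix(m):
    """Orthonormal DCT-II matrix (m x m)."""
    rows = []
    for k in range(m):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows


def coding_gain_db(transform, rho):
    """Coding gain of an orthonormal block transform for a unit-variance AR(1) input."""
    m = len(transform)
    # AR(1) autocorrelation matrix: R[i][j] = rho^|i-j|.
    r = [[rho ** abs(i - j) for j in range(m)] for i in range(m)]
    # Variance of each transform coefficient: the diagonal of T R T^T.
    variances = []
    for row in transform:
        variances.append(sum(row[i] * r[i][j] * row[j]
                             for i in range(m) for j in range(m)))
    arithmetic_mean = sum(variances) / m
    geometric_mean = math.exp(sum(math.log(v) for v in variances) / m)
    return 10.0 * math.log10(arithmetic_mean / geometric_mean)


gain = coding_gain_db(dct_matrix(8), 0.95)
print(f"8-point DCT coding gain: {gain:.2f} dB")
```

Under these assumptions the result should land near the figure quoted for the 8-point DCT on the frequency-response slide. Lapped transforms are not block-orthonormal in this simple sense, so their gain needs the full filter-bank formulation rather than this shortcut.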

SLIDE 7

A Special Case: Block Transform

[Figure: each block is coded independently: x_m and x_{m+1} each pass through P, Q, and P^{-1} to give x̂_m and x̂_{m+1}. Here P(z) = P.]

Karhunen-Loeve Transform (KLT):

  • Optimal block transform
  • Signal dependent, no fast algorithm

Discrete Cosine Transform (DCT):

  • Fast approximation of the KLT for AR(1) signals

Drawbacks of block transform:

  • Blocking artifact
  • Limited compression capability

SLIDE 8

Lapped Transform [Malvar et al. 1985]

Apply post-processing to the DCT output to improve compression efficiency and reduce blocking artifacts.

  • A special case of linear phase filter banks

[Figure: encoder: DCT blocks followed by post-processing V across block boundaries, then quantization Q. Decoder: pre-processing V^{-1} followed by IDCT blocks.]

SLIDE 9

Time-Domain Lapped Transform [Tran-2001]

[Figure: encoder: pre-filter V applied across block boundaries, then DCT blocks, then quantization Q. Decoder: IDCT blocks followed by post-filter V^{-1}.]

  • More compatible with DCT-based schemes
  • Also a special case of linear phase filter banks
  • Adopted by MS WMV-9, SMPTE VC-1, HD-DVD
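
The pre-filter / DCT / IDCT / post-filter chain can be sketched as follows, without the quantizer Q. This is a minimal illustration, not the filter design from the talk: `rotation_prefilter` is a hypothetical orthogonal V built from plane rotations of sample pairs mirrored about the block boundary, and the angle, block size, and test signal are arbitrary.

```python
import math


def dct_matrix(m):
    """Orthonormal DCT-II matrix (m x m)."""
    rows = []
    for k in range(m):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / m)
        rows.append([scale * math.cos(math.pi * (2 * n + 1) * k / (2 * m))
                     for n in range(m)])
    return rows


def matvec(a, v):
    return [sum(aij * vj for aij, vj in zip(row, v)) for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def rotation_prefilter(m, theta):
    """Hypothetical orthogonal M x M pre-filter: rotate each mirrored sample pair."""
    v = [[0.0] * m for _ in range(m)]
    c, s = math.cos(theta), math.sin(theta)
    for i in range(m // 2):
        j = m - 1 - i
        v[i][i], v[i][j] = c, s
        v[j][i], v[j][j] = -s, c
    return v


def filter_boundaries(x, v, m):
    """Apply v to the M samples centred on every interior block boundary."""
    y = list(x)
    half = m // 2
    for b in range(m, len(x), m):
        y[b - half:b + half] = matvec(v, y[b - half:b + half])
    return y


def transform_blocks(x, t, m):
    """Apply the M x M matrix t to each non-overlapping block of x."""
    out = []
    for start in range(0, len(x), m):
        out.extend(matvec(t, x[start:start + m]))
    return out


def tdlt_forward(x, v, c, m):
    return transform_blocks(filter_boundaries(x, v, m), c, m)


def tdlt_inverse(coeffs, v, c, m):
    # Both matrices are orthogonal here, so their inverses are transposes.
    return filter_boundaries(transform_blocks(coeffs, transpose(c), m), transpose(v), m)


M = 8
C = dct_matrix(M)
V = rotation_prefilter(M, 0.1 * math.pi)
signal = [float((3 * i * i + 7 * i) % 11) for i in range(4 * M)]
coeffs = tdlt_forward(signal, V, C, M)
restored = tdlt_inverse(coeffs, V, C, M)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Because the pre-filter windows sit between DCT blocks and do not overlap each other, the DCT stage is untouched, which is what makes the scheme easy to retrofit onto an existing DCT-based codec.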

SLIDE 10

Effect of Prefiltering

A flattened image

SLIDE 11

DCT vs LT: Basis Functions

8-point DCT 8 x 16 TDLT

SLIDE 12

Frequency Responses

[Figure: magnitude response (dB) vs. normalized frequency for the two transforms.]

  • 8-point DCT: DC Att. ≥ 409.0309, Mirr Att. ≥ 320.1639, Stopband ≥ 9.9559, Coding Gain = 8.8259 dB (8.83 dB)
  • 8 x 16 TDLT: DC Att. ≥ 322.1021, Mirr Att. ≥ 305.2548, Stopband ≥ 11.9696, Coding Gain = 9.6115 dB (9.61 dB)

SLIDE 13

Applications & Generalizations

  • Fast TDLT [Liang et al. 2001]
  • Pre/Post-filtering for Wavelet [Liang et al. 2003]
  • Error Resilient TDLT [Tu et al. 2002, Liang et al. 2005]
  • Generalized Lapped Transform [Liang et al. 2002]
  • Adaptive Entropy Coding for TDLT [Tu et al. 2001]
  • Oversampled TDLT [Gan-Ma-2002]
  • Undersampled TDLT [Tu et al. 2004]
  • Regularity Constrained TDLT [Dai et al. 2001]
  • Adaptive TDLT [Dai et al. 2005]
SLIDE 14

Outline

  • Introduction
  • Fast TDLT
  • Pre/post-filtering for 2D and 3D wavelet transform
  • Error resilient TDLT
  • Current Works
  • Summary

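SLIDE 15

Fast Orthogonal TDLT

M x M pre-filter V.

General structure of V: a cascade of plane rotations with angles $\theta_0, \theta_1, \ldots, \theta_5$, so that

$$V^{-1} = V^T$$

Each rotation has the form

$$\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix}$$

[Figure: signal-flow graph of the M x M pre-filter, with sample positions marked up to M/2 and M-1.]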

SLIDE 16

Fast Orthogonal TDLT

  • Quasi-optimal coding gain [Liang-Tran-Tu-01]: 9.26 dB for M = 8
  • Close to optimal filter bank
  • Can be generalized to large block size (e.g., 128)

Rotation angles labeled in the figure:

  • 0.17π
  • 0.12π
  • 0.05π
SLIDE 17

Fast Biorthogonal TDLT

Biorthogonal pre-filter: $V^{-1} \neq V^T$

  • More degrees of freedom, better performance

Fast approximation (lifting steps, LU factorization):

[Figure: fast structure with scaling factors s0, s1, s2, s3 and lifting steps P0, U0, P1, U1, P2, U2.]

Integer solutions: > 0.3 dB higher than orthogonal TDLT

S0 | S1 | S2 | S3 | P0 | U0 | P1 | U1 | P2 | U2 | Gain
3/2 | 9/8 | 9/8 | 9/8 | -1/16 | 1/4 | -1/4 | 1/2 | -3/8 | 3/4 | 9.58
1 | 1 | 1 | 1 | | 1/4 | -1/4 | 1/2 | -1/2 | 3/4 | 9.37
4/3 | 8/7 | 8/7 | 8/7 | -1/16 | 1/4 | -1/4 | 1/2 | -3/8 | 3/4 | 9.59
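
One reason lifting steps with dyadic coefficients are attractive is that they stay exactly invertible even when every step is rounded to an integer. The sketch below shows a single predict/update pair on two integer samples. It is a generic illustration of the lifting idea, not the pre-filter structure from the talk; the function names are hypothetical and the two coefficients are simply dyadic values of the kind that appear in the slide.

```python
import math
from fractions import Fraction


def round_half_up(value):
    """Round a Fraction to the nearest integer (ties go up)."""
    return math.floor(value + Fraction(1, 2))


def lifting_forward(a, b, p, u):
    """One predict step (p) followed by one update step (u), each rounded."""
    b = b + round_half_up(p * a)
    a = a + round_half_up(u * b)
    return a, b


def lifting_inverse(a, b, p, u):
    """Undo the two steps in reverse order by subtracting the same rounded terms."""
    a = a - round_half_up(u * b)
    b = b - round_half_up(p * a)
    return a, b


P = Fraction(-1, 4)  # illustrative dyadic predict coefficient
U = Fraction(1, 2)   # illustrative dyadic update coefficient
pairs = [(a, b) for a in range(-20, 21) for b in range(-20, 21)]
round_trip_ok = all(
    lifting_inverse(*lifting_forward(a, b, P, U), P, U) == (a, b)
    for a, b in pairs
)
```

The inverse works because each step modifies one sample using only the other, unmodified sample, so the identical rounded quantity can be recomputed and subtracted at the decoder.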

SLIDE 18

Image Coding Performance

TDLT vs. wavelet: both coded by SPIHT [Said, Pearlman, 1996] (improved entropy coding in [Tu, Tran, 2001]).

[Figure: PSNR vs. compression ratio for WT and TDLT on the Lena and Barb images; annotated gain: > 1.5 dB.]

SLIDE 19

Image Coding Performance

  • WT, 32:1: 27.58 dB
  • TDLT, 32:1: 28.95 dB

SLIDE 20

Outline

  • Introduction
  • Fast TDLT
  • Pre/post-filtering for 2D and 3D wavelet transform
  • Error resilient TDLT
  • Current Works
  • Summary

SLIDE 21

JPEG 2000 & Tiling Artifact

No blocking artifact if the WT is applied to the entire image:

  • Used by JPEG 2000

Problem:

  • Memory requirement

Tradeoff:

  • Tiling approach, which introduces a tiling artifact

[Figure: example image with tile size 64 x 64 at 0.2 bpp.]

SLIDE 22

Average MSE of All Rows & Columns

[Figure: two plots, MSE vs. X pixel and MSE vs. Y pixel.]

MSE is more than doubled at tile boundaries.

SLIDE 23

Pre/post-filtering for WT

Apply small pre/post filters at tile boundaries.

[Figure: pre-filters P applied across tile boundaries, followed by a WT on each tile and quantization Q; at the decoder, an IWT on each tile followed by post-filters P^{-1}.]

SLIDE 24

Problem Formulation

Effect of pre/post-filters:

  • In LT: affect all subbands
  • In WT: affect some subbands

[Figure: lifting structure of the 5/3 WT with steps $\gamma_1 (1 + z)$ and $\gamma_2 (1 + z^{-1})$ and scaling factors $k_1$, $k_2$, mapping inputs x0 to x7 to outputs y0 to y6 in the LP and HP subbands; the pre-filter P and post-filter P^{-1} act across the boundary, and the reconstructed samples are marked with hats.]

SLIDE 25

Problem Formulation

Boundary filter bank: optimization can be performed similarly to the LT.

[Figure: boundary filter bank built from the pre-filter P, analysis filters F0 and F1, processing, synthesis filters G0 and G1, and the post-filter P^{-1}, with signals labeled x, u, and y.]

SLIDE 26

Fast Structure

[Figure: fast structure with scaling factors S, lifting steps U, and butterflies with coefficients 1/2 and -1/2.]

Examples:

  • S = [5/3, 4/3, 6/5, 9/8], U = [1/8, 1/4, 5/8]
  • S = [2, 1, 1, 1], U = [1/8, 1/4, 1/2] (lossless)

Optimal pre/post-filters for WT and DCT are similar.

SLIDE 27

Examples

9/7 WT, 8 x 8 pre/post filters, 0.2 bits/pixel:

  • JPEG 2000 (Kakadu): 29.87 dB
  • JPEG 2000 & Pre/Post: 29.97 dB

SLIDE 28

Average MSE of All Rows and Columns

[Figure: MSE vs. X pixel and MSE vs. Y pixel, shown for JPEG 2000 (Kakadu) and for JPEG 2000 & Pre/Post.]

SLIDE 29

Applications in 3D WT Video Coding

  • Divide the video sequence into groups of frames
  • Apply 2D WT within each frame
  • Apply another 1D WT in the temporal direction

Advantages:

  • Lower complexity (no motion estimation)
  • Full scalabilities: SNR, resolution, and frame rate

SLIDE 30

Jittering Artifact

Performance degradation at group boundaries.

[Figure: PSNR (dB) vs. frame for 144 frames, 16 frames per GOP, 120:1. Avg. PSNR: 35.32 dB, STD: 0.71 dB.]

Solution 1: Global WT [Xu et al. 2002]

Problems: memory, random access, error resilience …

SLIDE 31

Pre/Post-filtering for 3-D WT

[Figure: groups of frames, each with its own WT, with pre-filters (Pre) applied across the group boundaries.]

  • Apply the pre-filter before the WT, and the post-filter after the IWT.
  • The previous pre/post design can be directly applied.

SLIDE 32

Comparison with GOP and Global WT

  • Group: 16 frames
  • 6-tap pre/post filter
  • 9/7 WT
  • Up to 2.5 dB gain at boundaries

[Figure: PSNR (dB) vs. frame for Claire.qcif at 120:1. Global WT (35.67 dB), filtered GOP WT (35.57 dB), GOP WT (35.32 dB).]

SLIDE 33

Outline

  • Introduction
  • Fast TDLT
  • Pre/post-filtering for 2D and 3D wavelet transform
  • Error resilient TDLT
  • Current Works
  • Summary

SLIDE 34

Error Concealment and Error Resilience

  • Compressed bit stream is sensitive to transmission error/loss.
  • Traditional solutions: channel coding, retransmission. Not always acceptable.
  • Human visual system can tolerate some errors: error concealment at the decoder is preferred.
  • Error resilient encoder: the encoder introduces some redundancies to facilitate concealment at the decoder.
  • Lapped transform is a good candidate…

SLIDE 35

Motivation for Error Resilient LT

[Figure: DCT blocks followed by post-processing V across block boundaries.]

Encoder:

  • Can spread the information of each block into two blocks

Decoder:

  • Prediction of the lost block is easier

Conflicting requirements:

  • Compression
  • Error resilience

Trade-off required.

SLIDE 36

Error Resilient Lapped Transform

[Hemami1996] First error resilient LT design.

Encoder: trading compression for error resilience.

Decoder:

  • 1. Estimate lost blocks by the mean reconstruction method:

$$\hat{y}_n \leftarrow \left( \hat{y}_{n-1} + \hat{y}_{n+1} \right) / 2$$

  • 2. Apply inverse lapped transform.

[Chung-Wang1999, 2002] Multiple description coding; maximal smoothness reconstruction; improved visual quality.
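
The mean reconstruction step of the decoder can be sketched as below. The function name and data layout are hypothetical, and the sketch assumes that every lost block is an interior block whose two neighbours were both received.

```python
def conceal_lost_blocks(blocks, lost):
    """Replace each lost block by the average of its two neighbours.

    blocks: list of equal-length coefficient vectors (lost entries may hold anything)
    lost:   set of indices of lost interior blocks; neighbours are assumed received
    """
    repaired = [list(b) for b in blocks]
    for n in sorted(lost):
        left, right = blocks[n - 1], blocks[n + 1]
        repaired[n] = [(l + r) / 2 for l, r in zip(left, right)]
    return repaired


received = [[4.0, 2.0], [0.0, 0.0], [8.0, 6.0]]  # the middle block was lost
fixed = conceal_lost_blocks(received, {1})
```

The inverse lapped transform would then be applied to `fixed` as if no loss had occurred.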
SLIDE 37

Error Resilient TDLT

[Tu-Tran-Liang-2003] Only need to design the pre/post-filters:

  • Old approach: M x 2M unknowns
  • Pre/post-filter: M/2 x M/2

More flexibility:

  • Biorthogonal filter
  • Non-perfect-reconstruction design

Limitation:

  • Mean reconstruction is still used

[Figure: decoder with mean reconstruction: the neighboring blocks ŝ_{n-1} and ŝ_{n+1} are combined with weights 1/2 and 1/2, followed by IDCT and post-filters P^{-1}.]

SLIDE 38

General Wiener Filter Solution

General framework [Liang et al. 05]: estimate $\hat{x}_n$ and $\hat{x}_{n+1}$ from $\hat{s}_{n-1}$ and $\hat{s}_{n+1}$.

[Figure: existing method: ŝ_{n-1} and ŝ_{n+1} are averaged with weights 1/2 and 1/2, followed by IDCT and post-filters P^{-1} to give x̂_n and x̂_{n+1}. General framework: a matrix H is used to obtain x̂_n and x̂_{n+1} from ŝ_{n-1} and ŝ_{n+1}.]

With $\hat{s}_2$ and $x_2$ denoting the stacked pairs of blocks:

$$R_{ee} = E\left\{ (H \hat{s}_2 - x_2)(H \hat{s}_2 - x_2)^T \right\}$$

$$\min \ \text{MSE} = \operatorname{tr}\{ R_{ee} \}$$

$$H_{2M \times 2M} = R_{x_2 \hat{s}_2} \, R_{\hat{s}_2 \hat{s}_2}^{-1}$$

Expression can be found if $R_{xx}$ is known (e.g., AR(1)).
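
A scalar toy version of the Wiener solution shows the mechanics. It estimates one sample of a unit-variance AR(1) process from its two neighbours, which is a deliberately simplified stand-in for the block problem on this slide; the function names and the value of `rho` are assumptions made for illustration.

```python
def wiener_two_neighbours(rho):
    """Wiener weights for estimating x_n from (x_{n-1}, x_{n+1}) of a unit-variance AR(1) process."""
    # Cross-correlation between the target and the two observations.
    r_xs = [rho, rho]
    # Autocorrelation of the observations (they are two samples apart).
    r_ss = [[1.0, rho ** 2], [rho ** 2, 1.0]]
    det = r_ss[0][0] * r_ss[1][1] - r_ss[0][1] * r_ss[1][0]
    inv = [[r_ss[1][1] / det, -r_ss[0][1] / det],
           [-r_ss[1][0] / det, r_ss[0][0] / det]]
    # H = R_xs * R_ss^{-1}, a 1 x 2 row vector in this scalar case.
    return [r_xs[0] * inv[0][j] + r_xs[1] * inv[1][j] for j in range(2)]


def estimation_mse(weights, rho):
    """MSE of the linear estimate w0 * x_{n-1} + w1 * x_{n+1} of x_n."""
    w0, w1 = weights
    return (1.0 - 2.0 * rho * (w0 + w1)
            + w0 * w0 + w1 * w1 + 2.0 * w0 * w1 * rho ** 2)


rho = 0.5
h = wiener_two_neighbours(rho)
mse_wiener = estimation_mse(h, rho)
mse_mean = estimation_mse([0.5, 0.5], rho)
```

In this toy setting the Wiener weights work out to rho / (1 + rho^2) each, so they approach the mean-reconstruction weights of 1/2 as rho approaches 1. The larger gains reported in the talk come from the block formulation combined with jointly designed pre/post-filters, which this sketch does not model.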

SLIDE 39

Special Cases

Divide H into two stages:

[Figure: H1 estimates ŝ_n from ŝ_{n-1} and ŝ_{n+1}, followed by IDCT and post-filters P^{-1} to give x̂_n and x̂_{n+1}.]

  • H1: M x 2M Wiener filter (near optimal performance)
  • Link to previous approach: H1 = [I I] / 2 is the mean reconstruction method.

SLIDE 40

Design Criteria for Optimization

  • Use Matlab to find the optimal pre-filter and post-filter.
  • Trade coding performance for error concealment.

Objective function for optimization:

$$\text{maximize} \quad \varepsilon = (\text{CG}) - \alpha\,(\text{CR}) + \beta\,(\text{RG})$$

  • Coding Gain (CG): MSE when there is no transmission error
  • Concealment Residual (CR): the MSE after transmission error and error concealment
  • Reconstruction Gain (RG) [Hemami96]: control the distribution of error to improve visual quality

SLIDE 41

Design Example

[Figure: MSE at each pixel (CG: 8.42 dB). Pixel MSE: TDLT 0.18, Old 0.15, New 0.06.]

Cfg 1:

  • Coding gain optimized filters
  • M x 2M Wiener filter

Cfg 2:

  • Joint optimized filters
  • Mean reconstruction [Tu et al 03]

Cfg 3:

  • Joint optimized filters
  • M x 2M Wiener: 60% less than the mean reconstruction method

SLIDE 42

Simulation Results (512 x 512 Lena, 1 bpp, 50% loss)

[Figure: loss pattern and reconstructed images.]

  • Cfg 1: 24.3 / 40.1 dB
  • Cfg 2: 26.0 / 38.3 dB
  • Cfg 3: 30.5 / 39.2 dB

SLIDE 43

Outline

  • Introduction
  • Fast TDLT
  • Pre/post-filtering for 2D and 3D wavelet transform
  • Error resilient TDLT
  • Current Works
  • Summary

SLIDE 44

Multiple Description Coding

An alternative to improve the robustness to transmission error:

  • Create multiple (equally important) output bit streams.
  • Each stream alone can give a coarse reconstruction.
  • Quality can be improved if more descriptions are received.

[Figure: the image is transformed by the TDLT; the blocks of the transformed image are split into Group 1 to Group 4, and each group is entropy coded into its own description (MDC1 to MDC4).]

Error scenarios: 15 cases.
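
One way to arrive at 15 scenarios with four descriptions is to count every non-empty set of descriptions that can reach the decoder. The sketch below enumerates those sets and also shows one possible block-to-description assignment. Both the 2 x 2 interleaving pattern and this reading of the 15 cases are assumptions made for illustration; the talk does not spell them out.

```python
from itertools import combinations


def assign_descriptions(rows, cols):
    """Assign each block to one of four descriptions in a 2 x 2 interleaved pattern."""
    return [[2 * (r % 2) + (c % 2) for c in range(cols)] for r in range(rows)]


def received_scenarios(count=4):
    """All non-empty sets of received descriptions."""
    ids = range(count)
    return [set(combo)
            for size in range(1, count + 1)
            for combo in combinations(ids, size)]


groups = assign_descriptions(4, 4)
scenarios = received_scenarios()
print(len(scenarios))
```

With an interleaved assignment of this kind, every block of a lost description is surrounded by blocks from the other descriptions, which is the situation the neighbour-based concealment in the earlier slides relies on.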

SLIDE 45

2-D Error Concealment

Current method:

  • 1-D prediction
  • Average of row and column results

Ideal method:

  • Joint prediction from 2-D neighbors
  • How to predict?
  • How to design the pre/post-filters?

SLIDE 46

Difficulties

Related work:

  • Edge directed prediction [Li-Orchard-2002]
  • But geometrical structure is disturbed by prefiltering
  • Pixel-by-pixel approach may not work

[Figure: prefiltered image; image after IDCT (with loss).]

SLIDE 47

Summary

Time domain lapped transform:

  • Fast algorithms
  • Applications in 2D and 3D WT
  • Error resilient design

Comparison to JPEG 2000:

  • Lower complexity
  • Competitive performance
  • Promising for handheld devices

[Figure: TDLT structure: pre-filter V, DCT blocks, quantization Q, IDCT blocks, post-filter V^{-1}.]

SLIDE 48

References

  • T. D. Tran, J. Liang, and C. Tu, "Lapped transform via time-domain pre- and post-filtering," IEEE Trans. on Signal Processing, vol. 51, no. 6, pp. 1557-1571, Jun. 2003.
  • J. Liang, C. Tu, and T. D. Tran, "Optimal pre/post-filtering for wavelet-based image and video compression," IEEE Trans. on Image Processing, to appear.
  • J. Liang, C. Tu, T. D. Tran, and L. Gan, "Wiener filtering for generalized error resilient time domain lapped transform," 2005 IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, Philadelphia, PA, Mar. 2005.
  • C. Tu, T. D. Tran, and J. Liang, "Error resilient pre-/post-filtering for DCT-based block coding systems," IEEE Trans. on Image Processing, to appear.
  • C. Tu and T. D. Tran, "Context based entropy coding of block transform coefficients for image compression," IEEE Trans. on Image Processing, vol. 11, pp. 1271-1283, Nov. 2002.