SLIDE 1

M-Channel Filter Banks: Block and Lapped Transforms

Trac D. Tran
ECE Department, The Johns Hopkins University, Baltimore, MD 21218

Outline

  • Block transforms
      DCT: discrete cosine transform
  • Lapped transforms
      LOT: lapped orthogonal transform
      TDLT: time-domain lapped transform
      MLT: modulated lapped transform
  • General filter bank connection
  • Applications
      Audio coding: MP3, AAC
      Image coding: JPEG, HD-Photo or JPEG-XR

SLIDE 2

Filter Bank Interpretation

[Figure: M-channel filter bank. The input $x[n]$ passes through the analysis filters $H_0(z), H_1(z), \dots, H_{M-1}(z)$, each followed by downsampling by M and a quantizer Q. The subbands are then upsampled by M, filtered by the synthesis filters $F_0(z), F_1(z), \dots, F_{M-1}(z)$, and summed to give the reconstruction $\hat{x}[n]$.]

\[
h_k[n] = p_k[n]
\]

  • Local cosine basis: a family of M-channel cosine-modulated filter banks.
  • Efficient block-based implementation.

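Discrete Cosine Transform

\[
\begin{aligned}
\text{Type I DCT:}\quad & p_k[n] = \sqrt{\frac{2}{M}}\, c_k c_n \cos\left[\frac{nk\pi}{M}\right], && 0 \le k,n \le M \\
\text{Type II DCT:}\quad & p_k[n] = \sqrt{\frac{2}{M}}\, c_k \cos\left[\left(n+\frac{1}{2}\right)\frac{k\pi}{M}\right], && 0 \le k,n \le M-1 \\
\text{Type III DCT:}\quad & p_k[n] = \sqrt{\frac{2}{M}}\, c_n \cos\left[\left(k+\frac{1}{2}\right)\frac{n\pi}{M}\right], && 0 \le k,n \le M-1 \\
\text{Type IV DCT:}\quad & p_k[n] = \sqrt{\frac{2}{M}}\, \cos\left[\left(k+\frac{1}{2}\right)\left(n+\frac{1}{2}\right)\frac{\pi}{M}\right], && 0 \le k,n \le M-1
\end{aligned}
\]

\[
\text{where}\quad c_k = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \text{ or } k = M \\ 1, & \text{otherwise} \end{cases}
\]

  • Real, orthogonal block transforms with fast implementations.
  • JPEG and MPEG use the Type-II DCT (symmetric basis) [Rao].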
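The Type-II and Type-IV definitions above can be checked numerically. The following is a minimal NumPy sketch, not taken from the slides: the helper names `dct2_matrix` and `dct4_matrix` are hypothetical, and row k of each matrix is assumed to hold the basis function $p_k[n]$.

```python
import numpy as np

def dct2_matrix(M):
    """Type-II DCT matrix: row k holds the basis function p_k[n]."""
    k = np.arange(M)[:, None]   # frequency index (rows)
    n = np.arange(M)[None, :]   # time index (columns)
    C = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    C[0, :] /= np.sqrt(2.0)     # c_0 = 1/sqrt(2) scales the DC row
    return C

def dct4_matrix(M):
    """Type-IV DCT matrix: no c_k scaling is needed."""
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    return np.sqrt(2.0 / M) * np.cos((k + 0.5) * (n + 0.5) * np.pi / M)

M = 8
C2, C4 = dct2_matrix(M), dct4_matrix(M)

# "Real, orthogonal block transforms": C C^T should be the identity.
orth2 = np.allclose(C2 @ C2.T, np.eye(M))
orth4 = np.allclose(C4 @ C4.T, np.eye(M))

# The Type-IV matrix is also symmetric, so it is its own inverse.
c4_symmetric = np.allclose(C4, C4.T)
print(orth2, orth4, c4_symmetric)
```

Because both matrices are orthogonal, the inverse transform is simply the transpose, which is why the same fast structure serves the encoder and the decoder.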

SLIDE 3

DCT Type-II

\[
X[m] = \sqrt{\frac{2}{M}}\, K_m \sum_{n=0}^{M-1} x[n] \cos\left[\frac{(2n+1)m\pi}{2M}\right],
\qquad
x[n] = \sqrt{\frac{2}{M}} \sum_{m=0}^{M-1} K_m X[m] \cos\left[\frac{(2n+1)m\pi}{2M}\right],
\qquad m, n = 0, 1, \dots, M-1
\]

\[
K_i = \begin{cases} \dfrac{1}{\sqrt{2}}, & i = 0 \\ 1, & i \ne 0 \end{cases}
\]

[Figure: DCT basis images of an 8 x 8 block. Labels: DC, low frequency, middle frequency, high frequency, horizontal edges, vertical edges.]

DCT basis

  • orthogonal
  • real coefficients
  • symmetry
  • near-optimal
  • fast algorithms

DCT Symmetry

\[
\cos\left[\frac{\bigl(2(M-1-n)+1\bigr)m\pi}{2M}\right]
= \cos\left[\frac{\bigl(2M-(2n+1)\bigr)m\pi}{2M}\right]
= \cos\left[\frac{2Mm\pi}{2M} - \frac{(2n+1)m\pi}{2M}\right]
= \pm\cos\left[\frac{(2n+1)m\pi}{2M}\right]
\]

DCT basis functions are either symmetric (even m) or anti-symmetric (odd m).
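As a quick numerical illustration of this symmetry, the sketch below compares each cosine basis function with its time-reversed copy. The block size `M = 8` is an arbitrary choice for the example.

```python
import numpy as np

M = 8
n = np.arange(M)

# Unnormalized DCT-II basis functions: one row per frequency index m.
rows = [np.cos((2 * n + 1) * m * np.pi / (2 * M)) for m in range(M)]

# Time reversal n -> M-1-n multiplies basis function m by (-1)^m:
# even m gives a symmetric row, odd m an anti-symmetric one.
sym_ok = all(np.allclose(r[::-1], (-1) ** m * r) for m, r in enumerate(rows))
print(sym_ok)
```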
SLIDE 4

DCT: Recursive Property

An M-point DCT-II can be implemented via an M/2-point DCT-II and an M/2-point DCT-IV. With the output coefficients ordered even-indexed first, then odd-indexed:

\[
C_M^{II} = \frac{1}{\sqrt{2}}
\begin{bmatrix} C_{M/2}^{II} & 0 \\ 0 & C_{M/2}^{IV} J \end{bmatrix}
\begin{bmatrix} I & J \\ J & -I \end{bmatrix}
\]

[Figure: flow graph. The input samples enter a butterfly stage; one half of the butterfly output feeds $C_{M/2}^{II}$ and the other half feeds $J$ followed by $C_{M/2}^{IV}$, producing the DCT coefficients.]

Fast DCT Implementation

13 multiplications and 29 additions per 8 input samples.
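The recursive property can be verified directly. This is a sketch under stated assumptions: `dct2_matrix` and `dct4_matrix` are hypothetical helper names, and the M-point coefficients are reordered even-indexed first, then odd-indexed, which is the ordering the butterfly factorization produces.

```python
import numpy as np

def dct2_matrix(M):
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    C = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    C[0, :] /= np.sqrt(2.0)
    return C

def dct4_matrix(M):
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    return np.sqrt(2.0 / M) * np.cos((k + 0.5) * (n + 0.5) * np.pi / M)

M = 8
h = M // 2
I = np.eye(h)
J = np.fliplr(I)            # reversal (anti-identity) matrix
Z = np.zeros((h, h))

# Butterfly stage: sums and differences of mirrored input samples.
butterfly = np.block([[I, J], [J, -I]]) / np.sqrt(2.0)

# Half-size transforms applied to the two butterfly outputs.
core = np.block([[dct2_matrix(h), Z], [Z, dct4_matrix(h) @ J]])
fast = core @ butterfly

# Direct M-point DCT-II with rows reordered: even m first, then odd m.
full = dct2_matrix(M)
reordered = np.vstack([full[0::2], full[1::2]])

match = np.allclose(fast, reordered)
print(match)
```

Applying the same split again to the M/2-point DCT-II is what leads to fast algorithms with low operation counts.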

SLIDE 5

Block DCT

\[
\begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_N \end{bmatrix}
=
\begin{bmatrix}
C_M^{II} & & & \\
& C_M^{II} & & \\
& & \ddots & \\
& & & C_M^{II}
\end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}
\]

  • $x_1, \dots, x_N$: input blocks, each of size M
  • $X_1, \dots, X_N$: output blocks of DCT coefficients, each of size M

Block DCT Coding Framework

[Figure: each input block passes independently through DCT, quantizer Q, and IDCT to form the corresponding output block.]

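SLIDE 6

Block Transform

[Figure: consecutive input blocks $\dots, x_m, x_{m+1}, \dots$ are each transformed by $P$ into coefficient blocks $y_m, y_{m+1}$, quantized by Q, and inverse transformed by $P^{-1}$ into the reconstructed blocks $\hat{x}_m, \hat{x}_{m+1}$.]

  • Different blocks are processed independently.
  • This causes discontinuities at block boundaries in the reconstruction.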

Blocking Artifact

[Figure: Lena, JPEG Q factor = 5.]

[Figure: data → 4-point DCT → Q, DeQ → IDCT; the reconstruction shows the blocking artifact at block boundaries.]

\[
X = T^T Y T = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} Y(i,j)\, t_i t_j^T
\]

  • X is the weighted combination of the basis images $t_i t_j^T$, where $t_i^T$ is the i-th row of T.
  • Blocking artifact is generated if the basis images are not smooth at the block boundaries.
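A one-dimensional toy version of the blocking artifact can be reproduced in a few lines. This sketch makes two assumptions that are not in the slides: the input is a smooth ramp, and the quantizer is replaced by a crude stand-in that keeps only the DC coefficient of every 4-point block.

```python
import numpy as np

def dct2_matrix(M):
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    C = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    C[0, :] /= np.sqrt(2.0)
    return C

M = 4
C = dct2_matrix(M)
x = np.arange(16, dtype=float)      # smooth ramp, 4 blocks of 4 samples

blocks = x.reshape(-1, M)
Y = blocks @ C.T                    # independent 4-point DCT per block
Y[:, 1:] = 0.0                      # stand-in for very coarse quantization
x_hat = (Y @ C).reshape(-1)         # inverse DCT per block

# Compare sample-to-sample jumps at block boundaries and inside blocks.
d = np.abs(np.diff(x_hat))
boundary_idx = np.arange(M - 1, d.size, M)
boundary_jumps = d[boundary_idx]
interior_jumps = np.delete(d, boundary_idx)
print(boundary_jumps, interior_jumps.max())
```

The ramp becomes a staircase: each block is flat, and all of the signal variation is pushed into jumps at the block boundaries, which is the one-dimensional analogue of the visible block edges in the image.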

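SLIDE 7

8x8 DCT Basis

  • Has some large coefficients at block boundaries
  • More discontinuities at boundaries
  • Blocking artifact will be created

Filter Bank Connection

[Figure: two equivalent views of the block transform codec. Block view: $x[n]$ is split into blocks by a chain of delays $z^{-1}$ and M-fold downsamplers, processed by the stages labelled $P$, $Q$, $C$, $C^{-1}$, $P^{-1}$, and reassembled by M-fold upsamplers and a delay chain into $\hat{x}[n]$. Filter bank view: analysis filters $H_0(z), H_1(z), \dots, H_{M-1}(z)$ with M-fold downsamplers, quantizer Q, M-fold upsamplers, and synthesis filters $F_0(z), F_1(z), \dots, F_{M-1}(z)$.]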

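SLIDE 8

Lapped Transform Connection

Global Matrix Representation

[Figure: $x_n$ → LT ($T_a$) → $y_n$, and $\hat{y}_n$ → Inverse LT ($T_s$) → $\hat{x}_n$.]

\[
\begin{bmatrix} \vdots \\ y_{n-1} \\ y_n \\ y_{n+1} \\ \vdots \end{bmatrix}
=
\underbrace{\begin{bmatrix}
\ddots & \ddots & & & \\
& E_0 & E_1 & & \\
& & E_0 & E_1 & \\
& & & E_0 & E_1 \\
& & & & \ddots
\end{bmatrix}}_{T_a}
\begin{bmatrix} \vdots \\ x_{n-1} \\ x_n \\ x_{n+1} \\ \vdots \end{bmatrix}
\]

\[
\begin{bmatrix} \vdots \\ \hat{x}_{n-1} \\ \hat{x}_n \\ \hat{x}_{n+1} \\ \vdots \end{bmatrix}
=
\underbrace{\begin{bmatrix}
\ddots & & & & \\
G_1 & G_0 & & & \\
& G_1 & G_0 & & \\
& & G_1 & G_0 & \\
& & & \ddots & \ddots
\end{bmatrix}}_{T_s}
\begin{bmatrix} \vdots \\ \hat{y}_{n-1} \\ \hat{y}_n \\ \hat{y}_{n+1} \\ \vdots \end{bmatrix}
\]

Perfect Reconstruction Condition:

\[
T_s T_a = I
\quad\Longleftrightarrow\quad
G_0 E_0 + G_1 E_1 = I, \qquad G_0 E_1 = G_1 E_0 = 0 .
\]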
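The perfect reconstruction conditions can be tested on a concrete orthogonal lapped transform. The sketch below is an illustration under assumptions: it uses a sine-windowed modulated lapped transform as the example, the helper name `mlt_matrix` is hypothetical, the indexing $y_n = E_0 x_n + E_1 x_{n+1}$ is assumed, and the synthesis blocks are taken as $G_i = E_i^T$, which applies to the orthogonal case only.

```python
import numpy as np

def mlt_matrix(M):
    """M x 2M matrix of sine-windowed, cosine-modulated basis functions."""
    n = np.arange(2 * M)[None, :]   # time index, 2M taps
    k = np.arange(M)[:, None]       # frequency index, M channels
    h = -np.sin((n + 0.5) * np.pi / (2 * M))
    return h * np.sqrt(2.0 / M) * np.cos((n + (M + 1) / 2.0) * (k + 0.5) * np.pi / M)

M = 4
E = mlt_matrix(M)
E0, E1 = E[:, :M], E[:, M:]         # the two M x M analysis blocks
G0, G1 = E0.T, E1.T                 # orthogonal case: synthesis = transpose

pr_identity = np.allclose(G0 @ E0 + G1 @ E1, np.eye(M))
pr_zero = np.allclose(G0 @ E1, 0.0) and np.allclose(G1 @ E0, 0.0)
print(pr_identity, pr_zero)
```

Neither $E_0$ nor $E_1$ is invertible on its own here; reconstruction only works because the contributions of two neighboring coefficient blocks are added together.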

Lapped Transform Basis: A Sneak Preview

  • Longer filters (16-tap)
  • Coefficients are close to zero at boundaries
  • Smoother basis vectors and images
  • How to achieve this?

16x16 images

SLIDE 9

Two LT Families

  • Conventional Lapped Transform
      First proposed in 1985 (Malvar et al.)
      Post-processing of the DCT output
  • Time-Domain Lapped Transform
      First proposed in 2001 (Tran et al.)
      Pre-processing of the DCT input

Conventional Lapped Transform

[Figure: forward path: block DCTs followed by post-processing stages $V$ (butterflies with 1/2 scaling) that span adjacent blocks, then the quantizers Q. Inverse path: pre-processing stages $V^{-1}$ followed by block IDCTs, producing the reconstructed signal.]

  • Block size is 4 here; this can be generalized to other even block sizes.
  • Forward: each coefficient is a function of two input blocks.
  • Inverse: each coefficient contributes to two output blocks.
  • The overlap-add structure smooths the block boundaries.

SLIDE 10

Time-Domain Lapped Transform

[Figure: pre-filters $P$ applied across block boundaries, followed by the existing structure (DCT, Q, IDCT), followed by post-filters $P^{-1}$ applied across block boundaries. Each pre-filter contains $V$ and each post-filter contains $V^{-1}$, placed between butterflies with 1/2 scaling.]

\[
P = W \begin{bmatrix} I & 0 \\ 0 & V \end{bmatrix} W,
\qquad
P^{-1} = W \begin{bmatrix} I & 0 \\ 0 & V^{-1} \end{bmatrix} W,
\qquad
W = \frac{1}{\sqrt{2}} \begin{bmatrix} I & J \\ J & -I \end{bmatrix}
\]

  • Post-filter can be viewed as a de-blocking filter
  • More compatible with DCT-based schemes
  • Used in

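Forward Transform Filters

[Figure: $X_0$ and $X_1$ each pass through a pre-filter $P$, giving $S_0$ and $S_1$; the DCT acts on $S$ to produce $Y$.]

  • Block size: M
  • Prefiltering: $S_0 = P X_0$, $S_1 = P X_1$
  • Input to DCT: $S$, the lower half of $S_0$ and the upper half of $S_1$.

Let:

\[
P = \begin{bmatrix} P_0 \\ P_1 \end{bmatrix}, \qquad P_i : \tfrac{M}{2} \times M
\]

\[
S_i = \begin{bmatrix} S_{i0} \\ S_{i1} \end{bmatrix} = P X_i = \begin{bmatrix} P_0 X_i \\ P_1 X_i \end{bmatrix}
\]

\[
Y = C S = C \begin{bmatrix} S_{01} \\ S_{10} \end{bmatrix}
= C \begin{bmatrix} P_1 X_0 \\ P_0 X_1 \end{bmatrix}
= C_{M \times M} \begin{bmatrix} P_1 & 0 \\ 0 & P_0 \end{bmatrix}_{M \times 2M} \begin{bmatrix} X_0 \\ X_1 \end{bmatrix}_{2M \times 1}
= H \begin{bmatrix} X_0 \\ X_1 \end{bmatrix}
\]

\[
H_{M \times 2M} = C_{M \times M} \begin{bmatrix} P_1 & 0 \\ 0 & P_0 \end{bmatrix}_{M \times 2M}
\]

Forward transform: each row of H is a 2M-tap filter (basis function).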
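The construction of H can be sketched numerically as follows. The helper names `dct2_matrix` and `tdlt_prefilter` are hypothetical, and the choice of V as a 2 x 2 rotation with angle 0.3 rad is purely an assumption for illustration; the slides do not specify V here.

```python
import numpy as np

def dct2_matrix(M):
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    C = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    C[0, :] /= np.sqrt(2.0)
    return C

def tdlt_prefilter(V):
    """Pre-filter P = W diag(I, V) W with W = [[I, J], [J, -I]] / sqrt(2)."""
    h = V.shape[0]
    I = np.eye(h)
    J = np.fliplr(I)
    Z = np.zeros((h, h))
    W = np.block([[I, J], [J, -I]]) / np.sqrt(2.0)
    return W @ np.block([[I, Z], [Z, V]]) @ W

M = 4
theta = 0.3                                   # hypothetical rotation angle
V = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
P = tdlt_prefilter(V)
P0, P1 = P[: M // 2], P[M // 2 :]             # upper and lower halves, M/2 x M

Z = np.zeros((M // 2, M))
H = dct2_matrix(M) @ np.block([[P1, Z], [Z, P0]])   # M x 2M

# With an orthogonal V, P is orthogonal and the rows of H are orthonormal.
p_orthogonal = np.allclose(P @ P.T, np.eye(M))
h_orthonormal = np.allclose(H @ H.T, np.eye(M))
print(H.shape, p_orthogonal, h_orthonormal)
```

Each of the M rows of `H` spans 2M input samples, so the basis functions are twice as long as the block size even though the DCT itself is unchanged.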

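SLIDE 11

Inverse Transform Filters

[Figure: $Y$ → IDCT → two postfilters $T$ → $X_0$, $X_1$.]

  • Each block of coefficients contributes to two reconstruction blocks.
  • Postfilter: $T = P^{-1}$.

After inverse DCT:

\[
S = C^T Y
\]

Input to two postfilters:

\[
\begin{bmatrix} 0_{M/2 \times 1} \\ S \\ 0_{M/2 \times 1} \end{bmatrix}_{2M \times 1}
\]

Reconstructed $X_0$ and $X_1$:

\[
\begin{bmatrix} X_0 \\ X_1 \end{bmatrix}
= \begin{bmatrix} T & 0 \\ 0 & T \end{bmatrix}
\begin{bmatrix} 0 \\ S \\ 0 \end{bmatrix}
\]

Let:

\[
T = \begin{bmatrix} T_0 & T_1 \end{bmatrix}, \qquad T_i : M \times \tfrac{M}{2}
\]

\[
\begin{bmatrix} X_0 \\ X_1 \end{bmatrix}
= \begin{bmatrix} T_1 & 0 \\ 0 & T_0 \end{bmatrix} S
= \begin{bmatrix} T_1 & 0 \\ 0 & T_0 \end{bmatrix} C^T Y
= G Y
= \sum_{i=0}^{M-1} g_i\, y_i,
\qquad g_i : i\text{-th column of } G
\]

\[
G_{2M \times M} = \begin{bmatrix} T_1 & 0 \\ 0 & T_0 \end{bmatrix} C^T
\]

Each column of G is a reconstruction basis function. If P is orthogonal: $G = H^T$.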
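The analysis and synthesis matrices can be tied together on an actual signal. This is a sketch under assumptions: the helper names are hypothetical, the non-orthogonal V below is an arbitrary invertible matrix chosen only for illustration, and the first and last signal boundaries are left unfiltered.

```python
import numpy as np

def dct2_matrix(M):
    k = np.arange(M)[:, None]
    n = np.arange(M)[None, :]
    C = np.sqrt(2.0 / M) * np.cos((n + 0.5) * k * np.pi / M)
    C[0, :] /= np.sqrt(2.0)
    return C

def tdlt_prefilter(V):
    h = V.shape[0]
    I, Z = np.eye(h), np.zeros((h, h))
    J = np.fliplr(I)
    W = np.block([[I, J], [J, -I]]) / np.sqrt(2.0)
    return W @ np.block([[I, Z], [Z, V]]) @ W

def filter_boundaries(x, F, M):
    """Apply the M x M matrix F to the M samples straddling each interior block boundary."""
    y = x.copy()
    for b in range(M, len(x), M):
        y[b - M // 2 : b + M // 2] = F @ x[b - M // 2 : b + M // 2]
    return y

M = 4
V = np.array([[1.4, 0.2], [0.1, 0.9]])        # hypothetical invertible V
P = tdlt_prefilter(V)
T = np.linalg.inv(P)                          # postfilter T = P^{-1}
C = dct2_matrix(M)

# Analysis matrix H (M x 2M) and synthesis matrix G (2M x M).
Zp, Zt = np.zeros((M // 2, M)), np.zeros((M, M // 2))
H = C @ np.block([[P[M // 2 :], Zp], [Zp, P[: M // 2]]])
G = np.block([[T[:, M // 2 :], Zt], [Zt, T[:, : M // 2]]]) @ C.T
biorthogonal = np.allclose(H @ G, np.eye(M))

# Signal-level round trip: prefilter, block DCT, block IDCT, postfilter.
rng = np.random.default_rng(0)
x = rng.standard_normal(4 * M)
s = filter_boundaries(x, P, M)
Y = s.reshape(-1, M) @ C.T
s_hat = (Y @ C).reshape(-1)
x_hat = filter_boundaries(s_hat, T, M)
round_trip = np.allclose(x_hat, x)
print(biorthogonal, round_trip)
```

Perfect reconstruction holds for any invertible V because every stage is inverted individually; orthogonality of V is only needed for the special case $G = H^T$.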

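Toy Example

[Figure: two neighboring samples $x_i$ and $x_{i+1}$ on either side of a block boundary before post-filtering, and $\hat{x}_i$, $\hat{x}_{i+1}$ after post-filtering.]

Post-filtering operator:

\[
\begin{bmatrix} \hat{x}_i \\ \hat{x}_{i+1} \end{bmatrix}
= \frac{1}{2}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & \tfrac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} x_i \\ x_{i+1} \end{bmatrix}
\]

\[
\hat{x}_i = \frac{1}{2}\left[ (x_i + x_{i+1}) + \frac{1}{2}(x_i - x_{i+1}) \right] = x_i + \frac{1}{4}(x_{i+1} - x_i)
\]

\[
\hat{x}_{i+1} = \frac{1}{2}\left[ (x_i + x_{i+1}) - \frac{1}{2}(x_i - x_{i+1}) \right] = x_{i+1} - \frac{1}{4}(x_{i+1} - x_i)
\]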
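The toy post-filter is small enough to evaluate by hand, and the sketch below does the same arithmetic in NumPy. The sample values 0 and 4 are an arbitrary example of a jump across a block boundary.

```python
import numpy as np

B = np.array([[1.0, 1.0], [1.0, -1.0]])       # sum / difference butterfly
post = 0.5 * B @ np.diag([1.0, 0.5]) @ B      # toy post-filtering operator

x = np.array([0.0, 4.0])                      # jump of 4 across the boundary
x_hat = post @ x

jump_before = x[1] - x[0]
jump_after = x_hat[1] - x_hat[0]
print(post, x_hat, jump_before, jump_after)
```

The sum of the two samples is preserved while their difference is halved, so each sample moves a quarter of the jump toward its neighbor: the boundary step shrinks from 4 to 2 without changing the local average.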

SLIDE 12

2D LT Implementation

  • 1. Apply the prefilter to each row, except at image boundaries.
  • 2. Apply the prefilter to each column, except at image boundaries.
  • 3. Apply the 2-D DCT in each block, as in JPEG.

The reverse procedure is applied at the decoder.

[Figure: an image with 9 blocks, each 4x4, with a DCT applied in every block.]

Pre-Filtering Effects

  • pre-filter = flattening operator
  • post-filter = de-blocking operator
  • high-frequency time shifting: creating a piece-wise smooth signal

[Figure: clockwise: original, borrow 1, 2, 3, 4.]

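SLIDE 13

Image Coding Examples

[Figure: DCT: 32:1, 27.23 dB. LT: 32:1, 28.95 dB.]

General Case

\[
y_m = P_{K-1}\, x_{m-K+1} + \cdots + P_1\, x_{m-1} + P_0\, x_m
\]

where $P = [\,P_{K-1} \ \cdots \ P_1 \ P_0\,]$ is $M \times KM$, and $x_m$, $y_m$ are $M \times 1$.

\[
\begin{bmatrix} \vdots \\ y_m \\ y_{m+1} \\ y_{m+2} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
\ddots & & \ddots & \ddots & & & \\
& P_{K-1} & \cdots & P_1 & P_0 & & \\
& & P_{K-1} & \cdots & P_1 & P_0 & \\
& & & P_{K-1} & \cdots & P_1 & P_0 \\
& & & & \ddots & & \ddots
\end{bmatrix}
\begin{bmatrix} \vdots \\ x_m \\ x_{m+1} \\ x_{m+2} \\ \vdots \end{bmatrix}
\]

M-subband, KM-tap Filter Bank:

[Figure: $x[n]$ → analysis filters $H_0(z), H_1(z), \dots, H_{M-1}(z)$, each followed by M-fold downsampling → $y_m$.]

\[
Y(z) = \left( P_0 + P_1 z^{-1} + \cdots + P_{K-1} z^{-(K-1)} \right) X(z) = P(z)\, X(z)
\]

where $P(z)$ is $M \times M$, $X(z)$ and $Y(z)$ are $M \times 1$, and each $P_i$ is an $M \times M$ matrix.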

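SLIDE 14

Polyphase Matrix Representation

[Figure: the analysis bank $H_0(z), H_1(z), \dots, H_{M-1}(z)$ with M-fold downsamplers, mapping $x[n]$ to $y_m$, is equivalent to the polyphase matrix $H_p(z)$ acting on the blocks $x_m$. The synthesis bank $F_0(z), F_1(z), \dots, F_{M-1}(z)$ with M-fold upsamplers, mapping $y_m$ to $\hat{x}[n]$, is equivalent to the polyphase matrix $F_p(z)$ producing the blocks $\hat{x}_m$.]

To get Perfect Reconstruction (PR), need

\[
F_p(z)\, H_p(z) = I
\]

Fast Implementation: Factorization of P(z)

[Figure: $x_m$ → cascade of stages $G_0(z), G_1(z), \dots, G_{K-1}(z)$ → $y_m$.]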

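Special Cases

Block Transform: $H_p(z) = P$

\[
\begin{bmatrix} \vdots \\ y_{m-1} \\ y_m \\ y_{m+1} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
\ddots & & & & \\
& P & & & \\
& & P & & \\
& & & P & \\
& & & & \ddots
\end{bmatrix}
\begin{bmatrix} \vdots \\ x_{m-1} \\ x_m \\ x_{m+1} \\ \vdots \end{bmatrix}
\]

Lapped Transform: $H_p(z) = P_0 + P_1 z^{-1}$

\[
\begin{bmatrix} \vdots \\ y_{m-1} \\ y_m \\ y_{m+1} \\ \vdots \end{bmatrix}
=
\begin{bmatrix}
\ddots & \ddots & & & \\
P_1 & P_0 & & & \\
& P_1 & P_0 & & \\
& & P_1 & P_0 & \\
& & & \ddots & \ddots
\end{bmatrix}
\begin{bmatrix} \vdots \\ x_{m-1} \\ x_m \\ x_{m+1} \\ \vdots \end{bmatrix}
\]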

SLIDE 15

Modulated Construction

[Figure: envelope / window × cosine (or sine) = cosine-modulated function.]

Modulated Lapped Transform

\[
p_k[n] = h[n]\, \sqrt{\frac{2}{M}}\, \cos\left[ \left(n + \frac{M+1}{2}\right)\left(k + \frac{1}{2}\right)\frac{\pi}{M} \right],
\qquad 0 \le k \le M-1, \quad 0 \le n \le 2M-1
\]

  • k: frequency index
  • n: time index
  • h[n]: window
  • cosine: for modulation
  • M: total number of basis functions

Flexible window design. For perfect reconstruction:

\[
h[n] = h[2M-1-n], \qquad h^2[n] + h^2[M-1-n] = 1
\]

Typical window:

\[
h[n] = -\sin\left[ \left(n + \frac{1}{2}\right)\frac{\pi}{2M} \right]
\]

SLIDE 16

Pre- and Post-Filtering Construction

[Figure: pre-filtering stages $P$ across block boundaries, followed by block transforms $C^{IV}$, the coding / communication channel, inverse block transforms $(C^{IV})^T$, and post-filtering stages $P^T$. The pre-filter is built from rotations $R$ whose entries are the window samples $h[n]$ and $h[L-M-1-n]$.]

MP3

  • MP3 = MPEG-1/2 Layer III audio coding
  • Transform: cascade of a 32-channel filter bank and a 6-channel or 18-channel MDCT
  • Quantization: uniform scalar quantizer with a psycho-acoustic model
  • Entropy coding: run-length + Huffman

SLIDE 17

Transformation Stage in MP3

[Figure: $x[n]$ enters a 32-channel 512-tap CMFB ($H_0(z), H_1(z), \dots, H_{31}(z)$, each followed by downsampling by 32). Each subband then feeds either a 6-channel 12-tap MLT/MDCT (transients) or an 18-channel 36-tap MLT/MDCT (steady-state).]

Advanced Audio Coding (AAC)

  • Successor of MP3
  • Better audio quality than MP3 at most bit rates
  • Perceptually lossless at 320 kbps for 5-channel surround sound (64 kbps/channel)
  • Almost CD quality at 96 kbps (48 kbps/channel)
  • AAC is part of the MPEG-4 standard
  • Default audio format of Apple's iPhone, iPod, iTunes; Sony PlayStation 3; Nintendo Wii
  • Pipeline: MDCT → scalar quantization → Huffman coding

SLIDE 18

Transformation Stage in AAC

[Figure: $x[n]$ enters either a 128-channel 256-tap MDCT ($H_0(z), H_1(z), \dots, H_{127}(z)$, each followed by downsampling by 128) for transient signals, or a 1024-channel 2048-tap MDCT ($H_0(z), H_1(z), \dots, H_{1023}(z)$, each followed by downsampling by 1024) for steady-state signals.]

AAC adaptively switches between

  • 8 blocks of 128-point MDCT with 256-point windows
  • 1 block of 1024-point MDCT with a 2048-point window

All windows have 50% overlap.

References

  • K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic Press, San Diego, 1990.
  • H. S. Malvar, Signal Processing with Lapped Transforms, Artech House, 1992.
  • T. D. Tran, J. Liang, and C. Tu, “Lapped transform via time-domain pre- and post-filtering,” IEEE Trans. on Signal Processing, vol. 51, no. 6, pp. 1557-1571, Jun. 2003.