Lecture 5: Compression
Instructor: Kate Ching-Ju Lin (林靖茹)
Wireless Communication Systems
@CS.NCTU
- Chap. 7-8 of “Fundamentals of Multimedia”
Some references are from http://media.ee.ntu.edu.tw/courses/dvt/15F/
Application                       Uncompressed   Compressed
Audio conference                  64 kbps        16-64 kbps
Video conference                  30.41 Mbps     64-768 kbps
Digital video on CD-ROM (30 fps)  60.83 Mbps     1.5-4 Mbps
HDTV (59.94 fps)                  1.33 Gbps      20 Mbps
Remove redundancy!
[Diagram: Source Encoder → Channel Encoder → 01011000… → channel → 01011001… → Channel Decoder → Source Decoder]
- Source coding: compress the data according to a probability model (entropy codes, etc.); the data are reconstructed at the receiver
- Channel coding: protect the data so they can be recovered even with errors and/or losses during transmission (e.g., random loss or burst loss)
Lossy compression introduces distortion that should be imperceptible, or at least not annoying
Compression ratio = size_before / size_after

SNR  = 10 log10(σ_s^2 / σ_n^2)
PSNR = 10 log10(σ_peak^2 / σ_n^2)
σ_n^2 = (1/N) Σ_{i=1}^{N} (x_i − y_i)^2   (mean square error)
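As a small illustrative sketch (not from the slides), these quality metrics can be computed directly from their definitions:

```python
import math

def mse(x, y):
    """Mean square error between original x and reconstructed y."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / len(x)

def snr(x, y):
    """Signal-to-noise ratio in dB: signal power over noise power."""
    signal_power = sum(xi ** 2 for xi in x) / len(x)
    return 10 * math.log10(signal_power / mse(x, y))

def psnr(x, y, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak is the maximum possible
    amplitude (255 for 8-bit pixels)."""
    return 10 * math.log10(peak ** 2 / mse(x, y))

# mse([100, 110, 120], [101, 109, 121]) -> 1.0
```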
Self-information: i(s_i) = log2(1 / p_i)
- Low probability p_i → large amount of information
- High probability p_i → small amount of information
Larger entropy → more disorder (uncertainty). A source can be losslessly encoded with an average number of bits per symbol equal to its entropy.
η = H(S) = Σ_{i=1}^{n} p_i · i(s_i) = Σ_{i=1}^{n} p_i log2(1 / p_i) = − Σ_{i=1}^{n} p_i log2 p_i
Claude Elwood Shannon, “A mathematical theory of communication,” Bell System Technical Journal, vol. 27, pp. 379-423 and 623-656, Jul. and Oct. 1948
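The entropy formula above is a one-liner in code (a minimal sketch):

```python
import math

def entropy(probs):
    """Shannon entropy H(S) = -sum p_i log2 p_i, in bits per symbol.
    Terms with p_i = 0 contribute nothing and are skipped."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# A uniform source over 4 symbols needs 2 bits per symbol:
# entropy([0.25] * 4) -> 2.0
```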
- A code is not uniquely decodable if f(x_i) = f(x_j) = y for some x_i ≠ x_j
- Prefix-free code: no codeword is the prefix of another codeword, i.e., y_i is not the prefix of y_j for all y_i ≠ y_j
- A prefix-free code can be decoded unambiguously from the beginning of the message
x: symbol y: codeword
Example (not uniquely decodable): s1 = 0, s2 = 01, s3 = 11, s4 = 00
The sequence 0011 could be s4s3 or s1s1s3
Example (uniquely decodable, but not instantaneous): s1 = 0, s2 = 01, s3 = 011, s4 = 11
Coded sequence 0111111…11: the decoder cannot resolve the first symbol until receiving all bits
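With a prefix-free code, by contrast, the decoder can emit a symbol the moment its buffer matches a codeword. A minimal sketch (the codebook below is an assumed prefix-free example, not one from the slides):

```python
def decode_prefix(bits, codebook):
    """Decode a bit string with a prefix-free code: scan from the
    beginning and emit a symbol as soon as the buffer is a codeword."""
    inverse = {code: sym for sym, code in codebook.items()}
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in inverse:      # unambiguous thanks to the prefix property
            out.append(inverse[buf])
            buf = ""
    if buf:
        raise ValueError("truncated input")
    return out

# Assumed prefix-free code: no codeword is a prefix of another.
book = {"s1": "0", "s2": "10", "s3": "110", "s4": "111"}
# decode_prefix("0110111", book) -> ["s1", "s3", "s4"]
```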
Lower-probability symbols get longer codeword paths in the code tree.
[Figure: code tree with symbols s1, s2, …, sk ordered from p_max to p_min]
1. Sort all symbols according to their probabilities
2. Repeat until only one symbol is left:
   a) Pick the two symbols with the smallest probabilities
   b) Add the two symbols as child nodes of a new parent node
   c) Remove the two symbols from the list
   d) Assign the sum of the children's probabilities to the parent
   e) Insert the parent node into the list
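The steps above can be sketched with a heap (illustrative; the 0/1 label assigned at each merge is arbitrary, so the exact codewords may differ from the slides even though the codeword lengths match):

```python
import heapq
from itertools import count

def huffman(freqs):
    """Build Huffman codes: repeatedly merge the two lowest-frequency
    nodes until a single tree remains."""
    tiebreak = count()  # breaks ties so dicts are never compared
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two smallest probabilities
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Counts from the example table: A gets a 1-bit codeword,
# B, C, D, E each get 3-bit codewords.
codes = huffman({"A": 15, "B": 7, "C": 7, "D": 6, "E": 5})
```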
Symbol  Count  Probability  Code
A       15     0.375        0
B       7      0.175        100
C       7      0.175        101
D       6      0.150        110
E       5      0.125        111
[Figure: Huffman tree — E(5)+D(6)→P1(11), B(7)+C(7)→P2(14), P1(11)+P2(14)→P3(25), P3(25)+A(20)→P4(45)]
The average codeword length of Huffman coding approaches the entropy of the message: it can be shown that η ≤ E[L] ≤ η + 1
For a very likely symbol, log2(1/p) is close to 0, but we still need at least one bit to encode it → Huffman coding can waste up to one bit per symbol
Arithmetic coding maps the whole message to an interval [a, b) with 0 ≤ a, b ≤ 1
→ more bits to represent a smaller (narrower) interval
→ fewer bits to represent a greater (wider) interval
Sym  Probability  Range
A    0.2          [0, 0.2)
B    0.1          [0.2, 0.3)
C    0.2          [0.3, 0.5)
D    0.05         [0.5, 0.55)
E    0.3          [0.55, 0.85)
F    0.05         [0.85, 0.9)
$    0.1          [0.9, 1)

Encode the message CAEE$:

Symbol  Low      High     Range
        0.0      1.0      1.0
C       0.3      0.5      0.2
A       0.30     0.34     0.04
E       0.322    0.334    0.012
E       0.3286   0.3322   0.0036
$       0.33184  0.33220  0.00036
[Figure: successive subdivision of [0, 1) — each symbol of CAEE$ narrows the current interval according to the symbol ranges, from [0.3, 0.5) down to [0.33184, 0.3322)]
Decoding: find the symbol s whose range contains the received value, i.e., range_min(s) ≤ value < range_max(s), then rescale and repeat
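The interval-narrowing loop of the encoder can be sketched as follows, using the symbol ranges from the CAEE$ example (illustrative; a real coder emits bits incrementally instead of keeping exact floats):

```python
def arith_encode(message, ranges):
    """Narrow [low, high) once per symbol; any value in the final
    interval encodes the whole message."""
    low, high = 0.0, 1.0
    for sym in message:
        span = high - low
        r_lo, r_hi = ranges[sym]
        low, high = low + span * r_lo, low + span * r_hi
    return low, high

ranges = {"A": (0.0, 0.2), "B": (0.2, 0.3), "C": (0.3, 0.5),
          "D": (0.5, 0.55), "E": (0.55, 0.85), "F": (0.85, 0.9),
          "$": (0.9, 1.0)}
low, high = arith_encode("CAEE$", ranges)
# low ≈ 0.33184, high ≈ 0.33220
```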
Run-length coding (RLC): encode the sequence as (number of zeros, next non-zero value) pairs
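A minimal sketch of this idea (in practice, e.g. in JPEG, a trailing end-of-block marker is also needed for runs of zeros at the end):

```python
def rlc_encode(coeffs):
    """Encode as (number of preceding zeros, next non-zero value) pairs."""
    pairs, zeros = [], 0
    for v in coeffs:
        if v == 0:
            zeros += 1
        else:
            pairs.append((zeros, v))
            zeros = 0
    return pairs

# rlc_encode([12, 0, 0, 3, 0, 0, 0, -1]) -> [(0, 12), (2, 3), (3, -1)]
```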
Source: http://jov.arvojournals.org/article.aspx?articleid=2213283
Transform coding: find a more compact representation in a transformed domain
- Original (spatial/time) domain: greater entropy → needs more bits
- Transformed domain: smaller entropy → needs fewer bits
- A periodic signal can be decomposed into an infinite sum of sinusoidal waveforms
- After transformation, the energy concentrates in a few frequency components
- Removing the remaining (high-frequency) components can hardly be detected by human eyes
f(x) = Σ_{u=0}^{∞} F(u) cos(uωx)
http://68.media.tumblr.com/8ab71becbff0e242d0bf8db5b57438ab/tumblr_mio8mkwT1i1s5nl47o1_500.gif
[Figure: f(x) = F(0)·b0(x) + F(1)·b1(x) + F(2)·b2(x) + F(3)·b3(x), with example coefficients 0.4, 0.5, 0.5, 0.1 — F(0) weights the DC basis function, the others weight basis functions of increasing frequency, from low to high]
Low-frequency coefficients typically have large amplitude
DCT (block size 8×8)
[Figure: the 8×8 DCT basis functions — the DC basis at the top left; low, medium, and high frequencies toward the bottom right; the different AC basis functions capture different patterns, e.g., vertical edges]
[Figure: the DCT maps the time-domain signal f(i) to the frequency-domain coefficients F(u); the IDCT maps F(u) back to f(i); f(x) is reconstructed as the weighted sum F(0)·b0(x) + F(1)·b1(x) + F(2)·b2(x) + F(3)·b3(x) of basis functions]
F(u) = C(u) Σ_{x=0}^{N−1} f(x) cos((2x+1)uπ / 2N)
f(x) = Σ_{u=0}^{N−1} C(u) F(u) cos((2x+1)uπ / 2N)
C(0) = √(1/N), C(u) = √(2/N) for u ≠ 0
2D DCT: an image block is decomposed into a weighted sum of 2D cosine functions (basis functions)
F(u, v) = (2/N) C(u) C(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x, y) cos(2π(2x+1)u / 4N) cos(2π(2y+1)v / 4N)
C(u), C(v) = √(1/2) for u, v = 0; 1 otherwise
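A naive 1D sketch of these formulas (O(N²) and purely illustrative; real codecs use fast DCT algorithms):

```python
import math

def dct_1d(f):
    """1D DCT: F(u) = C(u) * sum_x f(x) cos((2x+1)u*pi / 2N),
    with C(0) = sqrt(1/N) and C(u) = sqrt(2/N) otherwise."""
    N = len(f)
    out = []
    for u in range(N):
        c = math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N)
        out.append(c * sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                           for x in range(N)))
    return out

def idct_1d(F):
    """Inverse: f(x) = sum_u C(u) F(u) cos((2x+1)u*pi / 2N)."""
    N = len(F)
    return [sum((math.sqrt(1.0 / N) if u == 0 else math.sqrt(2.0 / N))
                * F[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                for u in range(N))
            for x in range(N)]

# A constant (DC-only) signal has all its energy in F(0):
# dct_1d([1, 1, 1, 1]) -> [2.0, 0.0, 0.0, 0.0] (up to rounding)
```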
After the DCT, the energy concentrates in the DC coefficient and a few AC coefficients (vertical and horizontal patterns). The frequency-domain representation has smaller entropy → need only a few bits for compression.
[Figure: time-domain pixel values vs. frequency-domain response]
Finer frequency resolution and coarser time resolution at low frequencies
[Figure: two-level wavelet decomposition into subbands LL2, HL2, LH2, HH2 (level 2) and HL1, LH1, HH1 (level 1)]
[Diagram:
Lossy:    Image → Transform (for de-correlation & energy compaction) → Quantizer (the only lossy step) → Entropy Coder (lossless compression) → Compressed Bitstream
Lossless: Image → Prediction or Reversible Transform → Entropy Coder → Compressed Bitstream]
Quantization maps sample values onto a discrete set of levels; the number of different levels is m = 2^n for n bits per sample. It exploits perceptual irrelevancy: the introduced error should be imperceptible.
[Figure: uniform quantizer staircase over [0, 1] (ticks at 0.2, 0.4, 0.6, 0.8, 1.0) — the gap between the input and the reconstructed value is the quantization error; finer resolution (more levels) → smaller gap and smaller error; coarser resolution → larger error]
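A minimal sketch of a uniform quantizer on [0, 1], reconstructing at cell midpoints (the midpoint choice is an assumed design detail, not from the slides):

```python
def quantize(x, levels):
    """Uniform quantizer on [0, 1]: map x to the reconstruction point
    (midpoint) of its equal-width cell."""
    step = 1.0 / levels
    idx = min(int(x / step), levels - 1)   # cell index; clamp x = 1.0
    return (idx + 0.5) * step              # reconstruct at cell midpoint

# More levels -> finer resolution -> smaller quantization error:
# quantize(0.34, 4) -> 0.375  (error 0.035)
# quantize(0.34, 8) -> 0.3125 (error 0.0275)
```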
Source: wikipedia
[Figure: quantizing with four levels (larger errors) vs. eight levels (smaller errors)]
represented by the same number of points (e.g., k-means clustering)
Deadzone quantization: values near zero are quantized to 0 → adjusting the deadzone size D gives better rate-distortion optimization
[Figure: deadzone quantizer — uniform steps of size Q on both sides of a wider deadzone D around zero; reconstructed values on the negative and positive sides]
Optimal (non-uniform) quantizers are more complex and must be trained on real (training) data
Reconstructed value: the representative point of each quantization region, usually the centroid
Maximum amplitude 2^(n−1), maximum quantization noise 1/2 ⇒ SQNR = 20 log10(2^(n−1) / (1/2)) = 20 log10(2^n) ≈ 6.02n dB
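Evaluating this formula (a small sketch) shows that each extra bit of resolution buys about 6.02 dB:

```python
import math

def sqnr_db(n):
    """SQNR for an n-bit quantizer: 20 log10(2^(n-1) / (1/2))."""
    return 20 * math.log10(2 ** (n - 1) / 0.5)

# sqnr_db(8) ≈ 48.16 dB, sqnr_db(16) ≈ 96.33 dB
```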
[Diagram: Compression: input → Transform → Quantizer → Entropy Coder; Decompression: Entropy Decoder → Inverse Quantizer → Inverse Transform]