SLIDE 1

Image and Video Coding: Transform Coefficient Coding

18  6  2  0  1  0  0  0
 2  0  1  0  0  0  0  0
 1  2  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

· · · 11101010001101110011110 · · ·

entropy coding

SLIDE 2

Transform Coefficient Coding

18  6  2  0  1  0  0  0
 2  0  1  0  0  0  0  0
 1  2  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

· · · 1110101010011110 · · ·

transform & quantization entropy coding

Entropy Coding of Quantization Indexes (for transform coefficients)

Requirements:
Lossless coding: Reconstruction of the same indexes at the decoder side
Efficient coding: Use as few bits as possible (on average)

Transform and quantization: Quantization indexes have certain statistical properties
Utilize the statistical properties of the quantization indexes for an efficient coding!

Heiko Schwarz (Freie Universität Berlin) — Image and Video Coding: Transform Coefficient Coding 2 / 42

SLIDE 3

Variable-Length Coding / Scalar Codes

Entropy Coding

Lossless Coding / Entropy Coding
Maps a sequence of symbols into a sequence of bits
Reversible mapping: The original sequence of symbols can be reconstructed
Unique decodability: Each sequence of bits can be generated by only one sequence of symbols
Most important class: Prefix codes

Simplest Variant: Scalar Variable-Length Codes
Table that assigns a codeword to each symbol
Examples:

symbol   code 1   code 2    code 3
a        000      1         0000
b        001      01        0001
c        010      001       001
d        011      0001      01
e        100      00001     10
f        101      000001    110
g        110      0000001   1110
h        111      0000000   1111
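As a sanity check on unique decodability, here is a small Python sketch that encodes and decodes with the third example code (a prefix code); the message "deed" is an arbitrary choice, and the function names are hypothetical.

```python
# The third example code above (a prefix code): shorter codewords are meant
# for more probable symbols.
code = {"a": "0000", "b": "0001", "c": "001", "d": "01",
        "e": "10", "f": "110", "g": "1110", "h": "1111"}
inverse = {cw: s for s, cw in code.items()}

def encode(symbols):
    # Concatenate the codewords of the individual symbols.
    return "".join(code[s] for s in symbols)

def decode(bits):
    # Greedy matching is unambiguous because no codeword is a prefix of another.
    out, buf = [], ""
    for bit in bits:
        buf += bit
        if buf in inverse:
            out.append(inverse[buf])
            buf = ""
    assert buf == "", "bitstream ended inside a codeword"
    return "".join(out)

print(encode("deed"))       # 01101001
print(decode("01101001"))   # deed
```

Because the code is a prefix code, the decoder can emit each symbol as soon as its last bit arrives (instantaneous decodability).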

SLIDE 4

Variable-Length Coding / Scalar Codes

Prefix Codes

Property: No codeword is a prefix of any other codeword
Prefix codes can be represented as binary code trees
There are no uniquely decodable codes that are better than the best prefix codes
Prefix codes are uniquely and instantaneously decodable

letter   codeword
a        00
b        010
c        011
d        10
e        1100
f        1101
g        111

[Figure: binary code tree with root node and terminal nodes a (00), b (010), c (011), d (10), e (1100), f (1101), g (111)]

SLIDE 5

Variable-Length Coding / Scalar Codes

Coding Efficiency of Scalar Codes

Design Goal: Minimize Number of Required Bits
Minimize the average codeword length ℓ̄ while retaining unique decodability:

    ℓ̄ = Σk pk · ℓk

Assign shorter codewords to more probable symbols

Minimum Achievable Average Codeword Length
Entropy H: Lower bound on the average codeword length ℓ̄:

    H = H(S) = H(p) = −Σk pk · log2 pk,  with ℓ̄ ≥ H

Redundancy: Increase in average codeword length relative to the entropy:

    absolute:  ϱ  = ℓ̄ − H ≥ 0
    relative:  ϱ′ = ℓ̄ / H − 1 ≥ 0
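These quantities are easy to evaluate numerically; the sketch below computes them for the binary source with p = {0.9, 0.1} that reappears on a later slide (helper names are my own).

```python
from math import log2

def entropy(pmf):
    # H(p) = -sum_k p_k * log2(p_k); terms with p_k = 0 contribute nothing
    return -sum(p * log2(p) for p in pmf if p > 0)

def avg_length(pmf, lengths):
    # average codeword length: sum_k p_k * l_k
    return sum(p * l for p, l in zip(pmf, lengths))

# Binary source used later on the slides: p = {0.9, 0.1}, codeword lengths {1, 1}
pmf = [0.9, 0.1]
H = entropy(pmf)                 # ~0.469 bit
l_bar = avg_length(pmf, [1, 1])  # 1 bit per symbol
rho = l_bar - H                  # absolute redundancy
rho_rel = l_bar / H - 1          # relative redundancy, ~113 %
print(round(H, 3), round(rho_rel, 2))
```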

SLIDE 6

Variable-Length Coding / Scalar Codes

Optimal Prefix Codes

Prefix Codes with Zero Redundancy
Necessary condition: ℓ̄ = H, i.e.

    Σk pk · ℓk = −Σk pk · log2 pk
    ∀k: ℓk = −log2 pk   ⇔   ∀k: pk = 2^(−ℓk)

Only possible if all probability masses are negative integer powers of 2

Optimal Prefix Codes: Huffman Algorithm
Algorithm for constructing prefix codes with minimum redundancy:

1. Construct a binary code tree (for the given probability mass function):
   Select the two least likely symbols and create a parent node
   Consider the combination of the two symbols as a new symbol
   Continue until all symbols are merged into the root node

2. Construct the code by labeling the branches with “0” and “1”

SLIDE 7

Variable-Length Coding / Scalar Codes

Example: Construction of a Huffman Code

given: alphabet A with pmf {pk}

ak   pk     codeword
a    0.16   111
b    0.04   0001
c    0.04   0000
d    0.16   110
e    0.23   01
f    0.07   1001
g    0.06   1000
h    0.09   001
i    0.15   101

first step: assign symbols and probabilities to terminal nodes
following steps: repeatedly merge the two least likely nodes (and re-order for better readability)

[Figure: tree construction — the merged node probabilities are 0.08 (b+c), 0.13 (g+f), 0.17 (h+0.08), 0.28 (i+0.13), 0.32 (a+d), 0.40 (e+0.17), 0.60 (0.28+0.32), and finally the root node with probability 1.00]

final step: label the branches with 0 and 1 and assign the codewords (follow the branches from the root to the terminal nodes)

ℓ̄ = 2.98,  H(p) ≈ 2.94,  ϱ ≈ 0.04,  ϱ′ ≈ 1.34 %
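The construction can be reproduced in a few lines of Python; this is a sketch using a min-heap, with a counter as tie-breaker (any Huffman code for this pmf attains the same minimal average length, even if the individual codewords differ).

```python
import heapq
from math import log2

def huffman_lengths(pmf):
    """Codeword lengths of a Huffman code for a pmf given as {symbol: prob}."""
    heap = [(p, i, [s]) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(pmf, 0)
    tie = len(heap)                          # tie-breaker: tuples never compare lists
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)   # the two least likely nodes ...
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:              # ... merging adds one bit to each symbol
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, tie, syms1 + syms2))
        tie += 1
    return lengths

pmf = {"a": 0.16, "b": 0.04, "c": 0.04, "d": 0.16, "e": 0.23,
       "f": 0.07, "g": 0.06, "h": 0.09, "i": 0.15}
lengths = huffman_lengths(pmf)
l_bar = sum(pmf[s] * lengths[s] for s in pmf)
H = -sum(p * log2(p) for p in pmf.values())
print(round(l_bar, 2), round(H, 2))   # 2.98 2.94
```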

SLIDE 8

Variable-Length Coding / Conditional Codes and Block Codes

Inefficiency of Scalar Huffman Codes

Large Probabilities
Scalar codes are inefficient if a probability mass pk ≫ 0.5
The minimum length of a codeword is 1 bit
The relative redundancy can become very large (> 100 %)

Random Sources with Memory
Strong statistical dependencies between successive symbols
Symbol probabilities depend on the values of previous symbols
Scalar codes cannot exploit these dependencies

binary source:

sk   p(sk)   codeword
a    0.9     0
b    0.1     1

H ≈ 0.469,  ℓ̄ = 1,  ϱ′ ≈ 113 %

Nonstationary Sources
Statistical properties change over time, so the optimal code changes over time
Adjust the code: Send updates for the codeword table (costs bit rate),
or regularly re-construct the code in encoder and decoder based on coded statistics (rather complex)

SLIDE 9

Variable-Length Coding / Conditional Codes and Block Codes

Conditional Codes

Concept:

Switch codeword table depending on preceding symbol (or function of preceding symbols) Construct codeword tables using corresponding conditional probabilities

conditional Huffman code:

     sk−1 = a             sk−1 = b             sk−1 = c
sk   p(sk|a)  codeword    p(sk|b)  codeword    p(sk|c)  codeword
a    0.90     0           0.15     10          0.25     10
b    0.05     10          0.80     0           0.15     11
c    0.05     11          0.05     11          0.60     0

ℓ̄a = 1.1,  ℓ̄b = 1.2,  ℓ̄c = 1.4

scalar Huffman code:

sk   p(sk)    codeword
a    29/45    0
b    11/45    10
c    5/45     11

ℓ̄scal = 61/45 ≈ 1.3556

Average codeword length ℓ̄cond for the conditional Huffman code:

    ℓ̄cond = Σ∀z p(z) · ℓ̄z = (29/45)·1.1 + (11/45)·1.2 + (5/45)·1.4 = 521/450 ≈ 1.1578

Conditioning never decreases coding efficiency (for optimal codes): ℓ̄cond ≤ ℓ̄scal
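The averages above can be verified exactly with rational arithmetic; the codeword lengths below are read off the conditional Huffman code tables on this slide.

```python
from fractions import Fraction as F

# Codeword lengths of the three conditional Huffman codes (one per context)
lengths = {"a": {"a": 1, "b": 2, "c": 2},    # context s_{k-1} = a
           "b": {"a": 2, "b": 1, "c": 2},    # context s_{k-1} = b
           "c": {"a": 2, "b": 2, "c": 1}}    # context s_{k-1} = c
cond = {"a": {"a": F(90, 100), "b": F(5, 100), "c": F(5, 100)},
        "b": {"a": F(15, 100), "b": F(80, 100), "c": F(5, 100)},
        "c": {"a": F(25, 100), "b": F(15, 100), "c": F(60, 100)}}
pz = {"a": F(29, 45), "b": F(11, 45), "c": F(5, 45)}   # p(z) of the contexts

# Average length per context, then weighted by the context probabilities
l_ctx = {z: sum(cond[z][s] * lengths[z][s] for s in "abc") for z in "abc"}
l_cond = sum(pz[z] * l_ctx[z] for z in "abc")
print(l_ctx["a"], l_ctx["b"], l_ctx["c"], l_cond)   # 11/10 6/5 7/5 521/450
```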

SLIDE 10

Variable-Length Coding / Conditional Codes and Block Codes

Block Codes

Concept:

Assign codewords to fixed-size blocks of N consecutive symbols Construct codeword tables using corresponding joint probabilities

conditional pmf:

sk   p(sk|a)   p(sk|b)   p(sk|c)
a    0.90      0.15      0.25
b    0.05      0.80      0.15
c    0.05      0.05      0.60

scalar: ℓ̄scal ≈ 1.3556,  conditional: ℓ̄cond ≈ 1.1578

block Huffman code (N = 2):

sk sk+1   p(sk, sk+1)   codeword
aa        0.5800        0
ab        0.0322        10000
ac        0.0322        10001
ba        0.0367        1010
bb        0.1956        11
bc        0.0122        101100
ca        0.0278        10111
cb        0.0167        101101
cc        0.0667        1001

ℓ̄2 ≈ 2.0188 per block,  ℓ̄ ≈ 1.0094 per symbol

block Huffman coding:

N   ℓ̄        # codewords
1   1.3556   3 (scalar)
2   1.0094   9
3   0.9150   27
4   0.8690   81
5   0.8462   243
6   0.8299   729
7   0.8153   2187
8   0.8027   6561
9   0.7940   19683

Two effects: Large probabilities are reduced, and conditional probabilities can be taken into account
Increasing the block size N never decreases coding efficiency (for optimal codes)
The size of the codeword table increases exponentially with the block size N
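The joint probabilities of the 2-symbol blocks follow from the conditional pmf together with the stationary distribution π = (29/45, 11/45, 5/45) used on the previous slide; a sketch:

```python
from fractions import Fraction as F

cond = {"a": {"a": F(90, 100), "b": F(5, 100), "c": F(5, 100)},
        "b": {"a": F(15, 100), "b": F(80, 100), "c": F(5, 100)},
        "c": {"a": F(25, 100), "b": F(15, 100), "c": F(60, 100)}}
pi = {"a": F(29, 45), "b": F(11, 45), "c": F(5, 45)}   # stationary distribution

# pi is indeed stationary: pi(s) = sum_z pi(z) * p(s|z)
for s in "abc":
    assert pi[s] == sum(pi[z] * cond[z][s] for z in "abc")

# Joint probabilities of 2-symbol blocks: p(s_k, s_k+1) = pi(s_k) * p(s_k+1 | s_k)
joint = {z + s: pi[z] * cond[z][s] for z in "abc" for s in "abc"}
print(float(joint["aa"]), float(joint["bb"]))   # 0.58 0.19555...
```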

SLIDE 11

Variable-Length Coding / Conditional Codes and Block Codes

Efficiency and Bounds for Scalar, Conditional, and Block Huffman Codes

Random process S = {Sn}, alphabet A = {a}, condition Cn = f(Sn−1, Sn−2, · · ·)

Lower Bounds on Average Codeword Length

Scalar codes — marginal entropy:

    H(S) = −Σa∈A p(a) · log2 p(a)

Conditional codes — conditional entropy:

    H(S|C) = −Σa∈A, c∈C p(a, c) · log2 p(a|c) ≤ H(S)

Block codes — block entropy per symbol:

    HN(S)/N = −(1/N) Σa∈A^N p(a) · log2 p(a) ≤ HN−1(S)/(N − 1)

Entropy Rate & Fundamental Lossless Coding Theorem

    entropy rate: H̄(S) = lim N→∞ HN(S)/N
    for all lossless codes: ℓ̄ ≥ H̄(S)
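For the stationary Markov source of the previous slides, the entropy rate equals the average conditional entropy H̄(S) = Σc π(c)·H(S|c); the exact value is not stated on the slides, so the sketch below only checks that it lies below the block-code lengths listed above, as the coding theorem demands.

```python
from math import log2

cond = {"a": [0.90, 0.05, 0.05],   # p(a, b, c | previous = a)
        "b": [0.15, 0.80, 0.05],
        "c": [0.25, 0.15, 0.60]}
pi = {"a": 29 / 45, "b": 11 / 45, "c": 5 / 45}   # stationary distribution

def H(pmf):
    return -sum(p * log2(p) for p in pmf if p > 0)

# Entropy rate of a stationary Markov source: H_bar(S) = sum_c pi(c) * H(S|c)
h_rate = sum(pi[z] * H(cond[z]) for z in "abc")
print(round(h_rate, 3))   # ~0.733, below the block-code lengths 1.3556 ... 0.7940
```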

SLIDE 12

Variable-Length Coding / V2V Codes

V2V Codes

Generalization of Block Codes
Assign variable-length codewords to symbol sequences of variable length (V2V: variable-to-variable)
All messages must be representable by the selected symbol sequences (desirable: redundancy-free set)
Select symbol sequences that form a full M-ary tree (alphabet of size M)

A = {a, b}:             A = {a, b}:             A = {x, y, z}:
sequence  codeword      sequence  codeword      sequence  codeword
aaa       0             aaaaa     0             xxx       0
aab       100           aaaab     10            xxy       100
ab        101           aaab      110           xxz       101
ba        110           aab       1110          xy        1100
bb        111           ab        11110         xz        1101
                        b         11111         y         1110
                                                z         1111

Advantage: Smaller codeword tables than block codes at the same efficiency

SLIDE 13

Variable-Length Coding / V2V Codes

Prefix Codes for Symbol Sequences: V2V Codes as Double Tree

iid source (M = 3):

symbol   probability
a        0.80
b        0.15
c        0.05

entropy rate: H̄(S) = 0.88418 (equal to H(S))
scalar Huffman: ℓ̄ = 1.2 (3 codewords)
2-symbol blocks: ℓ̄ = 0.93375 (9 codewords)
V2V code: ℓ̄ = 0.88934 (7 codewords), ϱ = 0.00516 (ϱ′ = 0.58 %)

[Figure: double tree — the symbol-sequence tree for {aaa, aab, aac, ab, ac, b, c} with sequence probabilities 0.512, 0.096, 0.032, 0.12, 0.04, 0.15, 0.05, and the corresponding binary code tree (codewords include 000, 001, 011, 0101, 01000, 01001)]

SLIDE 14

Run-Level Coding Approaches / What Entropy Coding for Transform Coefficient Levels ?

Typical Properties of Transform Coefficient Levels

Example quantization indexes of two 8×8 transform blocks:

 7  3  1  0 -3 -1  0  0
 1  0 -1  0 -2  0  0  0
 0  0  0 -1  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

-4  0  2  0  0 -1  0  0
-3  1  0  0  0  0  0  0
 2 -1  0  0  0  0  0  0
-2  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

Typical properties at reasonable quantization:
Most quantization indexes are equal to zero (p(0) ≫ 0.5)
The probability of zero is higher for high-frequency components
Non-zero levels are concentrated in the top-left corner
These properties should be exploited in the entropy coding

SLIDE 15

Run-Level Coding Approaches / What Entropy Coding for Transform Coefficient Levels ?

Scalar Variable Length Codes ?

Design Huffman Code
Measure statistics over a large number of pictures
Develop a Huffman code for the estimated pmf

Alternative: Universal Code
Use a universal code that works well for the estimated pmf
Example: Exponential Golomb code

Coding Efficiency?
Requires at least one bit per sample: average codeword length ℓ̄ ≥ 1 ≫ H̄
Maximum compression ratio 8:1 (for 8-bit samples)
Unsuitable for transform coefficient levels in image and video coding

Exp-Golomb code:

q    codeword
1    10 0
−1   10 1
2    110 00
−2   110 01
3    110 10
−3   110 11
4    1110 000
−4   1110 001
5    1110 010
−5   1110 011
6    1110 100
−6   1110 101
...  ...
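The table is consistent with the following construction: a prefix of m ones and a terminating zero selects a class of 2^m values, followed by m suffix bits, with positive and negative levels interleaved via q → 2q − 1 (q > 0) and q → −2q (q < 0). A sketch under this reading of the table:

```python
def signed_exp_golomb(q):
    """Exp-Golomb codeword for a non-zero level q, matching the table above."""
    assert q != 0
    c = 2 * q - 1 if q > 0 else -2 * q    # signed mapping to a code number c >= 1
    m = (c + 1).bit_length() - 1          # class index: m = floor(log2(c + 1))
    suffix = c + 1 - (1 << m)             # position inside the class, coded in m bits
    return "1" * m + "0" + format(suffix, "b").zfill(m)

print(signed_exp_golomb(1), signed_exp_golomb(-2), signed_exp_golomb(4))
# 100 11001 1110000
```

The codeword length grows only logarithmically with |q|, which is why such universal codes work well for heavy-tailed level distributions.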

SLIDE 16

Run-Level Coding Approaches / What Entropy Coding for Transform Coefficient Levels ?

Conditional Codes or Block Codes ?

Conditional Variable-Length Codes
Same problem as for simple scalar codes: need at least one bit per transform coefficient

Block Codes
Block codes for blocks of transform coefficients seem reasonable and should provide very good coding efficiency
But: N×N blocks with K quantization values yield codeword tables of size K^(N²)
8×8 blocks with 255 values: more than 10^154 codewords
8×8 blocks with 2 values: 2^64 codewords (the address space of a 64-bit architecture)

Possible usage:
Very small block sizes (2 or 3 coefficients)
Signal the positions of non-zero coefficients for very small block sizes (e.g., 2×2) and additionally transmit the non-zero levels

SLIDE 17

Run-Level Coding Approaches / What Entropy Coding for Transform Coefficient Levels ?

V2V Codes ?

How to select the symbol sequences?
The problem is the zeros (large probability!)
Could use the following set of symbol sequences:

X
0X
00X
000X
0000X
. . .
00000···0

(X is a placeholder for the non-zero quantization indexes; the all-zero sequence covers the rest of the block)

N×N block and K potential non-zero values: K·N² + 1 codewords
Example: 8×8 block and 255 values: 16321 codewords
Large, but feasible codeword tables (can be further reduced by clustering into categories)

SLIDE 18

Run-Level Coding Approaches / Run-Level Coding

Run-Level Coding

Run-Level Coding = V2V Code with Structural Constraint
Represent the symbol sequences as (run, level) pairs:

X          →  run = 0, level = X
0X         →  run = 1, level = X
00X        →  run = 2, level = X
000X       →  run = 3, level = X
0000X      →  run = 4, level = X
. . .
00000···0  →  end of block (eob)  (run = maximum possible value)

Codewords are assigned to (run, level) pairs:
run: Number of zero levels preceding the next non-zero level
level: Value of the next non-zero level
eob: Special (run, level) symbol indicating that all remaining levels are equal to zero
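The mapping from a scanned level sequence to (run, level) pairs is a few lines of code; a sketch (the input below is the scanned sequence of the zig-zag example on a later slide):

```python
def run_level_pairs(levels):
    """Scanned quantization indexes -> (run, level) pairs + end-of-block marker."""
    pairs, run = [], 0
    for q in levels:
        if q == 0:
            run += 1                # zero: extend the current run
        else:
            pairs.append((run, q))  # non-zero: emit pair, reset the run
            run = 0
    pairs.append("eob")             # all remaining levels are zero
    return pairs

print(run_level_pairs([-4, 0, -3, 2, 0, 2, 0, 0, -1] + [0] * 55))
# [(0, -4), (1, -3), (0, 2), (1, 2), (2, -1), 'eob']
```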

SLIDE 19

Run-Level Coding Approaches / Run-Level Coding

Scanning of Transform Coefficient Levels

Coding Order for Run-Level Coding
The most effective (run, level) pair is the end-of-block symbol (eob)
The coding order should maximize the number of zeros represented by the eob symbol
Define a suitable coding order from low-frequency to high-frequency components

zig-zag scan (JPEG, MPEG-2, H.263, MPEG-4, H.264 | AVC) diagonal scan (HEVC, VVC)

SLIDE 20

Run-Level Coding Approaches / Run-Level Coding

Example: Run-Level Coding with Zig-Zag Scan

-4  0  2  0  0  0  0  0
-3  0  0  0  0  0  0  0
 2 -1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

excerpt of the codeword table:

(run, level)   codeword (s = sign)
(0, ±1)        11s
(0, ±2)        0100 s
(0, ±3)        0010 1s
(0, ±4)        0000 110s
(1, ±1)        011s
(1, ±2)        0001 10s
(1, ±3)        0010 0101 s
(2, ±1)        0101 s
(eob)          10

1. Scanning and conversion into (run, level) pairs:

    (0,−4) (1,−3) (0,2) (1,2) (2,−1) (eob)

2. Conversion into a bitstream (via table look-up):

    00001101 00100101 10100000 01100010 1110 (36 bits)
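The scan and the pair conversion can be reproduced as follows; `zigzag_order` is a helper of my own that generates the zig-zag positions by traversing the anti-diagonals with alternating direction:

```python
def zigzag_order(n):
    # Positions of an n x n block along anti-diagonals, alternating direction.
    order = []
    for d in range(2 * n - 1):
        cells = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        order.extend(cells if d % 2 else reversed(cells))
    return order

# Block from the example above (all-zero rows filled in)
block = [[-4, 0, 2, 0, 0, 0, 0, 0],
         [-3, 0, 0, 0, 0, 0, 0, 0],
         [2, -1, 0, 0, 0, 0, 0, 0]] + [[0] * 8 for _ in range(5)]

levels = [block[i][j] for i, j in zigzag_order(8)]
pairs, run = [], 0
for q in levels:
    if q == 0:
        run += 1
    else:
        pairs.append((run, q))
        run = 0
print(pairs)   # [(0, -4), (1, -3), (0, 2), (1, 2), (2, -1)] -- then (eob)
```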

SLIDE 21

Run-Level Coding Approaches / Run-Level Coding

Run-Level Coding in Image and Video Coding Standards

Run-Level Coding in JPEG
Split the (run, level) information into:
a variable-length code for the (run, category) pair
a fixed-length code for the level inside the category (# bits = category)
Variable-length codeword table for (run, category) pairs:
Maximum run = 15; the special (15, 0) pair represents a long run of 16 zeros
162 codewords (16 run values × 10 categories + 2 special symbols)

category   absolute levels
1          1
2          2 ... 3
3          4 ... 7
4          8 ... 15
5          16 ... 31
6          32 ... 63
7          64 ... 127
8          128 ... 255
9          256 ... 511
10         512 ... 1023

Run-Level Coding in H.262 | MPEG-2 Video
Fixed codeword table with 114 codewords (not counting sign bits):
the 112 most likely (run, level) pairs and the end-of-block symbol,
plus an additional escape symbol for the less likely (run, level) pairs
The escape symbol is followed by a fixed-length code for run (6 bits) and level (12 bits)

SLIDE 22

Run-Level Coding Approaches / Improvements of Run-Level Coding

Run-Level-Last Coding (H.263 and MPEG-4 Visual)

Observation: The last non-zero level typically has a small value

Represent the Sequence as (run, level, last) Triples
run: Number of zero levels preceding the next non-zero level
level: Value of the next non-zero level
last: Whether the next non-zero level is the last non-zero level in the block

Variable-Length Coding Table
103 codewords (not counting sign bits) for the most likely (run, level, last) triples
Includes an escape symbol, followed by 6 bits (run), 8 bits (level), 1 bit (last)

Coded Block Pattern (CBP)
Note: The code cannot represent a block with all levels equal to zero
Need to signal a flag whether a block has any non-zero coefficients
Coded block pattern: Block code for the four 8×8 luma blocks of a 16×16 macroblock
For chroma blocks: Combined with other syntax elements

SLIDE 23

Run-Level Coding Approaches / Improvements of Run-Level Coding

Example: Run-Level-Last Coding

-4  0  2  0  0  0  0  0
-3  0  0  0  0  0  0  0
 2 -1  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0

excerpt of the codeword table:

(run, level, last)   codeword (s = sign)
(0, ±1, 0)           10s
(0, ±2, 0)           1111 s
(0, ±3, 0)           0101 01s
(0, ±4, 0)           0010 111s
(1, ±1, 0)           110s
(1, ±2, 0)           0101 00s
(1, ±3, 0)           0001 1110 s
(2, ±1, 0)           1110 s
(2, ±1, 1)           0011 10s

1. Scanning and conversion into (run, level, last) events:

    (0,−4,0) (1,−3,0) (0,2,0) (1,2,0) (2,−1,1)

2. Conversion into a bitstream (without the coded block pattern):

    00101111 00011110 11111001 01000001 101 (35 bits)

SLIDE 24

Run-Level Coding Approaches / Improvements of Run-Level Coding

Further Improvements of Run-Level Coding

What Improvements are Possible?
Utilize further dependencies between the levels inside a transform block
Utilize dependencies between the levels of neighbouring transform blocks

H.264 | AVC: Context-Adaptive Variable-Length Coding (CAVLC)
Reverse scan order, separate coding of runs and levels, combined syntax elements
Multiple codeword tables per syntax element (selected based on preceding syntax elements)
Advantages:
Exploitation of conditional probabilities in designing the codeword tables
Utilization of more dependencies between quantization indexes
Utilization of dependencies between transform blocks

Note: H.264 | AVC specifies another entropy coding method with significantly improved coding efficiency (using adaptive conditional probabilities): Context-Based Adaptive Binary Arithmetic Coding (CABAC)

SLIDE 25

Arithmetic Coding

Arithmetic Coding of Quantization Indexes

Arithmetic Coding
Block coding without storage of a codeword table
Close to optimal (suboptimal in comparison to a Huffman code for the same block size)
Many advantages:
Iterative codeword construction (e.g., a block code for an entire picture)
Easy incorporation of conditional probabilities
Easy incorporation of adaptive probability models

Arithmetic Coding in Image and Video Coding Standards
Already included in JPEG, H.263, ...
Rarely used in practice (suboptimal design)
Context-based Adaptive Binary Arithmetic Coding (CABAC):
Alternative entropy coding method in H.264 | AVC
Only entropy coding method in H.265 | HEVC and H.266 | VVC

SLIDE 26

Arithmetic Coding / Shannon-Fano-Elias Coding

Basic Idea of Shannon-Fano-Elias Coding

Special Block Code for N Symbols
Order all possible symbol sequences of N symbols: s1, s2, s3, · · ·
Each symbol sequence sk is associated with a half-open interval of the cdf F(s):

    I(sk) = [ L, L+W )  with  L = F(sk−1)  and  W = F(sk) − F(sk−1) = p(sk)

Transmit any number v inside the interval I(sk) as a binary fraction,
e.g., v = 0.01011b → codeword “01011”

[Figure: cdf F(s) over the ordered messages s of N symbols; the interval I(sk) = [ L, L+W ) lies between F(sk−1) and F(sk)]

SLIDE 27

Arithmetic Coding / Shannon-Fano-Elias Coding

Iterative Determination of Interval Boundaries

a   p(a)   c(a)
A   1/2    0
N   1/3    1/2
B   1/6    5/6

[Figure: nested interval refinement for the message “BANANA” — starting from L0 = 0 and L0 + W0 = 1, each symbol selects one of the subintervals A, N, B; the final interval is L = 0.8819.., L+W = 0.8842..]

Algorithm: Interval Refinement

    initialization:  W0 = 1,  L0 = 0
    iteration:       Wn+1 = Wn · p(sn),  Ln+1 = Ln + Wn · c(sn)
    with c(s) = Σ∀a<s p(a)

Interval refinement for “BANANA”:

      init   B      A      N       A       N         A
Wn    1      1/6    1/12   1/36    1/72    1/216     1/432
Ln    0      5/6    5/6    21/24   21/24   127/144   127/144

final interval: LN = 127/144 and WN = 1/432

Which value inside the interval? How to select the codeword?

SLIDE 28

Arithmetic Coding / Shannon-Fano-Elias Coding

Codeword Selection for Given Interval

[Figure: binary fractions n · 2−K on a grid with spacing 2−K; only a fraction v with v ∈ I can serve as a codeword]

Required Number K of Bits
The distance between successive binary fractions of K bits is 2−K
To guarantee that a binary fraction of K bits falls inside an interval of width W, we require

    2−K ≤ W,  choose:  K = ⌈− log2 W⌉

Binary Codeword
Interval representative v ∈ I: Round up the lower interval boundary L to the next binary fraction of K bits
Codeword = binary representation of the integer z = ⌈ L · 2K ⌉ with K bits

SLIDE 29

Arithmetic Coding / Shannon-Fano-Elias Coding

Finalization of Iterative Encoding

a   p(a)   c(a)
A   1/2    0
N   1/3    1/2
B   1/6    5/6

[Figure: interval refinement for “BANANA” with final interval L = 0.8819.., L+W = 0.8842.. and transmitted value v = 0.8828..]

Encoding Algorithm

    initialization:  W0 = 1,  L0 = 0
    iteration:       Wn+1 = Wn · p(sn),  Ln+1 = Ln + Wn · c(sn)
    finalization:    K = ⌈− log2 W⌉,  z = ⌈ L · 2K ⌉
    codeword:        integer z with K bits

Interval refinement for “BANANA”:

      init   B      A      N       A       N         A
Wn    1      1/6    1/12   1/36    1/72    1/216     1/432
Ln    0      5/6    5/6    21/24   21/24   127/144   127/144

Finalization:

    K = ⌈− log2 W⌉ = ⌈log2 432⌉ = 9
    z = ⌈ L · 2K ⌉ = ⌈ (127/144) · 512 ⌉ = 452
    v = 452/512 → codeword "111000100" (z = 452 with K = 9 bits)
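The whole encoding can be verified with exact rational arithmetic; a sketch using Python's `fractions`:

```python
from fractions import Fraction as F
from math import ceil, log2

p = {"A": F(1, 2), "N": F(1, 3), "B": F(1, 6)}
c = {"A": F(0), "N": F(1, 2), "B": F(5, 6)}   # c(s) = sum of p(a) over a < s

def sfe_encode(message):
    W, L = F(1), F(0)
    for s in message:          # interval refinement
        L += W * c[s]          # L_{n+1} = L_n + W_n * c(s_n)
        W *= p[s]              # W_{n+1} = W_n * p(s_n)
    K = ceil(-log2(W))         # number of codeword bits
    z = ceil(L * 2**K)         # round L up to the next K-bit binary fraction
    return format(z, "b").zfill(K)

print(sfe_encode("BANANA"))    # 111000100
```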

SLIDE 30

Arithmetic Coding / Shannon-Fano-Elias Coding

Iterative Decoding

a   p(a)   c(a)
A   1/2    0
N   1/3    1/2
B   1/6    5/6

Decoding: Repeat the interval refinement Wn+1 = Wn · p(·), Ln+1 = Ln + Wn · c(·), starting from L0 = 0, W0 = 1; in each step, select the symbol whose candidate interval contains v

step   (Ln, Wn)           candidate (A)       candidate (N)       candidate (B)       decoded sn
1      (0, 1)             (0, 1/2)            (1/2, 1/3)          (5/6, 1/6)          B
2      (5/6, 1/6)         (5/6, 1/12)         (11/12, 1/18)       (35/36, 1/36)       A
3      (5/6, 1/12)        (5/6, 1/24)         (21/24, 1/36)       (65/72, 1/72)       N
4      (21/24, 1/36)      (21/24, 1/72)       (8/9, 1/108)        (97/108, 1/216)     A
5      (21/24, 1/72)      (21/24, 1/144)      (127/144, 1/216)    (383/432, 1/432)    N
6      (127/144, 1/216)   (127/144, 1/432)    (191/216, 1/648)    (287/324, 1/1296)   A

v = (0.111000100)b = 452/512  →  s = "BANANA"
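The decoder mirrors the encoder: it refines the same intervals and, in each step, picks the symbol whose subinterval contains v; a sketch:

```python
from fractions import Fraction as F

p = {"A": F(1, 2), "N": F(1, 3), "B": F(1, 6)}
c = {"A": F(0), "N": F(1, 2), "B": F(5, 6)}

def sfe_decode(bits, n_symbols):
    v = F(int(bits, 2), 2 ** len(bits))    # v = 0.bits as an exact binary fraction
    W, L, out = F(1), F(0), []
    for _ in range(n_symbols):
        for s in p:                        # candidate subintervals partition [L, L+W)
            L2, W2 = L + W * c[s], W * p[s]
            if L2 <= v < L2 + W2:          # exactly one subinterval contains v
                out.append(s)
                L, W = L2, W2
                break
    return "".join(out)

print(sfe_decode("111000100", 6))   # BANANA
```

Note that the decoder needs to know the number of symbols (or an end-of-message convention) to know when to stop.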

SLIDE 31

Arithmetic Coding / Arithmetic Coding

Arithmetic Coding

Arithmetic Coding = Finite-Precision Variant of Iterative Shannon-Fano-Elias Coding
Represent the probabilities p(·) and the interval width Wn with a finite number of bits (V and U bits)
Round down the interval width Wn in each iteration step (the intervals must not overlap!)
Encoding and decoding can be realized with standard integer arithmetic
The bits of the codeword are written / read during encoding / decoding

Structure of the lower interval boundary:

    Ln = 0. [ aaaaa···a  settled bits ] [ 0111111···1  outstanding bits (cn) ] [ xxxxx···x  active bits (U + V) ] [ 00000···  trailing bits ]

settled bits: cannot be modified by future updates → directly written to the bitstream
outstanding bits: may be modified by a carry from the active bits → stored as a counter
active bits: directly modified by the next update → stored as an integer with U + V bits
trailing bits: bits equal to zero, may change later → not stored

SLIDE 32

CABAC / Basic Design

CABAC: Context-Based Adaptive Binary Arithmetic Coding

[Figure: CABAC block diagram — a binarizer maps each syntax element to a bin string; context modelling assigns each bin a context model (updated using previously coded bins); each bin is coded either by the regular arithmetic coding engine (regular mode) or by the bypass arithmetic coding engine (bypass mode) into the bitstream]

CABAC in Video Coding Standards (H.264 | AVC, H.265 | HEVC, H.266 | VVC)
Binary arithmetic coding: Simplest form of arithmetic coding, requires binarization
Adaptive coding: Adaptive probability models (updated during encoding and decoding)
Context-based coding: Switchable probability models (partly using conditional probabilities)
Includes a fast non-adaptive bypass mode that uses the fixed probabilities p(0) = p(1) = 0.5

SLIDE 33

CABAC / Basic Design

Binarization

Represent syntax elements as sequences of binary decisions (bins)
Any prefix code can be used (note: the entropy does not change)
Use simple structured codes

truncated exponential Rice codes ( R = 0 : unary ) value fixed-length unary unary Golomb R = 1 R = 2 R = 3 000 1 1 1 10 100 1000 1 001 01 01 010 11 101 1001 2 010 001 001 011 010 110 1010 3 011 0001 0001 0010 0 011 111 1011 4 100 0000 1 0000 1 0010 1 0010 0100 1100 5 101 0000 01 0000 01 0011 0 0011 0101 1101 6 110 0000 001 0000 001 0011 1 0001 0 0110 1110 7 111 0000 0001 0000 000 0001 000 0001 1 0111 1111 8 0000 0000 1 0001 001 0000 10 0010 0 0100 0 9 0000 0000 01 0001 010 0000 11 0010 1 0100 1 10 0000 0000 001 0001 011 0000 010 0011 0 0101 0 11 0000 0000 0001 0001 100 0000 011 0011 1 0101 1 12 0000 0000 0000 1 0001 101 0000 0010 0001 00 0110 0 13 0000 0000 0000 01 0001 110 0000 0011 0001 01 0110 1 14 0000 0000 0000 001 0001 111 0000 0001 0 0001 10 0111 0 15 0000 0000 0000 0001 0000 1000 0 0000 0001 1 0001 11 0111 1 · · · · · · · · · · · · · · · · · ·

SLIDE 34

CABAC / Basic Design

Adaptive Probability Models (Example: H.264 | AVC and H.265 | HEVC)

Concept of Adaptive Probability Models
Estimate the binary pmf {p0, 1 − p0} based on the actually coded bins {b} (in encoder and decoder)
Represent the pmf by:
pLPS: Probability of the least probable symbol (LPS), with pLPS ∈ (0; 0.5]
vMPS: Value of the most probable symbol (MPS)
The probability model is updated after each bin b (based on a model of “exponential aging”):

    pLPS ← α · pLPS             if b = vMPS
    pLPS ← 1 − α · (1 − pLPS)   if b ≠ vMPS
    with α = (0.01875 / 0.5)^(1/63) ≈ 0.95

Implementation of Adaptive Probability Models
126 probability states with p ∈ [0.01875; 0.98125]
Probability state pstate represented by 7 bits (6 bits for pLPS, 1 bit for vMPS)
Update via table look-up: pstate = stateTable[pstate][b]
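A floating-point sketch of the update rule (the standards implement it as a table-driven state machine, not with floats; the MPS/LPS swap keeps pLPS ≤ 0.5):

```python
ALPHA = (0.01875 / 0.5) ** (1 / 63)    # ~0.949

def update(p_lps, v_mps, b):
    """One adaptation step for the estimated LPS probability."""
    if b == v_mps:
        p_lps = ALPHA * p_lps              # MPS observed: LPS becomes less likely
    else:
        p_lps = 1 - ALPHA * (1 - p_lps)    # LPS observed: LPS becomes more likely
        if p_lps > 0.5:                    # keep p_lps <= 0.5: swap MPS and LPS
            p_lps, v_mps = 1 - p_lps, 1 - v_mps
    return p_lps, v_mps

# A long run of MPS bins drives the estimate to its minimum value 0.01875
p, m = 0.5, 0
for _ in range(63):
    p, m = update(p, m, m)
print(round(p, 5))   # 0.01875
```

The exponential decay means the model forgets old statistics, which is what makes it track nonstationary bin statistics.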

SLIDE 35

CABAC / Basic Design

Arithmetic Coding Engine

Binary Arithmetic Coding Engine
Simplest variant of arithmetic coding
The decoder search reduces to one comparison (whether the value lies below a threshold)
The design allows low-complexity implementations

CABAC: Two Coding Modes
Regular coding mode: Coding with an adaptive probability model
Bypass mode: Coding with the fixed pmf {0.5, 0.5} (lower complexity)

Design of the Arithmetic Coding for Syntax Elements
Which binarization?
How many probability models (for which bins)?
How to select the probability models (usage of conditional models)?

SLIDE 36

CABAC / Transform Coefficient Coding

H.264 | AVC: Transform Coefficient Coding with CABAC

Example 4×4 block of quantization indexes:

 9  0  0 -1
-6  0  0  0
 3  1  0  0
 0  0  0  0

zig-zag scan levels: 9 0 −6 3 0 −1 1 · · ·
significance map (forward order, interleaved) — sig flag: 1 1 1 1 1 ..., last flag: 1
absolute values and signs (reverse order, interleaved) — abs minus 1: 8 5 2 ..., sign flag: 1 1 ...

Basic Approach of Coefficient Coding

1. Coded Block Flag (CBF): Indicates whether the block includes non-zero levels
2. Significance Map (forward scan order):
   sig flag (whether the level is non-zero)
   last flag (whether the level is the last non-zero level in the block)
3. Actual Values (reverse scan order):
   abs minus 1 (absolute value minus 1)
   sign flag (whether the level is negative)

SLIDE 37

CABAC / Transform Coefficient Coding

H.264 | AVC: Binarization and Context Modeling

Binarization of Absolute Values Minus 1
Prefix part: Truncated unary code
Suffix part: Exponential Golomb code (order 0)

Context Modeling for the Significance Map (1 model per flag)
The probability models for sig flag and last flag depend on the x and y position inside the transform block

Context Modeling for the Actual Values (2 models per level)
first prefix bin: Depends on the number of absolute values equal to 1 and on whether any absolute value is greater than 1
remaining prefix bins: Depend on the number of absolute values greater than 1
suffix bins and sign: Bypass mode of the arithmetic coder

abs minus 1 abs sig prefix suffix 1 1 2 1 10 3 1 110 4 1 1110 5 1 11110 6 1 111110 7 1 1111110 8 1 11111110 9 1 111111110 10 1 1111111110 11 1 11111111110 12 1 111111111110 13 1 1111111111110 14 1 11111111111110 15 1 11111111111111 0 16 1 11111111111111 100 17 1 11111111111111 101 18 1 11111111111111 11000 19 1 11111111111111 11001 20 1 11111111111111 11010 · · · · · · · · · · · ·

SLIDE 38

CABAC / Transform Coefficient Coding

Transform Coefficient Coding in H.265 | HEVC and H.266 | VVC

transform block 4×4 subblock first non-zero

Partitioning of transform blocks into 4×4 subblocks Coding order: Typically diagonal top-right scan of subblocks and coefficients inside subblock Coding of Syntax Elements:

Coded block flag (cbf) x and y coordinate of first non-zero coefficient Coded subblock flag (except for first and last subblock) Coefficient inside non-zero subblocks (multiple passes)

SLIDE 39

CABAC / Transform Coefficient Coding

H.265 | HEVC: Coding of Coefficients inside 4 × 4 Subblocks

Binarization of Absolute Values
Unary part (up to 3 bins)
Rice-Golomb code for the remainder (Rice code + Exp-Golomb code)

Coding Order: Five Passes over the Scanning Positions
1. Significance bins sig for all scan positions (in regular mode)
2. Up to 8 greater-than-one bins gt1 (in regular mode)
3. Up to 1 greater-than-two bin gt2 (in regular mode)
4. Sign flags for non-zero coefficients (in bypass mode)
5. Remainders rem using the Rice-Golomb code (in bypass mode)

Probability Model Selection for the Regular Bins depends on:
Transform block size and location inside the transform block
Values of the coded subblock flags in the direct neighbourhood

|q|   sig   gt1   gt2   rem
1     1     –     –     –
2     1     1     –     –
3     1     1     1     –
4     1     1     1     1
5     1     1     1     2
6     1     1     1     3
7     1     1     1     4
8     1     1     1     5
9     1     1     1     6
10    1     1     1     7
11    1     1     1     8
12    1     1     1     9
13    1     1     1     10
14    1     1     1     11
15    1     1     1     12
...   ...   ...   ...   ...

SLIDE 40

CABAC / Transform Coefficient Coding

H.266 | VVC: Binarization and Coding Order inside 4 × 4 Subblocks

Binarization of Absolute Values
Dedicated parity flag par (useful for the special quantizer TCQ)
Reconstruction of the absolute values: |q| = sig + gt1 + par + 2 · (gt3 + rem)

Coding Order: Three Passes over the Scanning Positions
1. Context-coded bins sig, gt1, par, gt3 (in regular adaptive mode of the arithmetic coding engine)
2. Remainder rem using a parametric Golomb-Rice code (in bypass mode of the arithmetic coding engine)
3. Sign flags for non-zero coefficients (in bypass mode)

|q|   sig   gt1   par   gt3   rem
1     1     –     –     –     –
2     1     1     –     –     –
3     1     1     1     –     –
4     1     1     –     1     –
5     1     1     1     1     –
6     1     1     –     1     1
7     1     1     1     1     1
8     1     1     –     1     2
9     1     1     1     1     2
10    1     1     –     1     3
11    1     1     1     1     3
12    1     1     –     1     4
13    1     1     1     1     4
14    1     1     –     1     5
15    1     1     1     1     5
...   ...   ...   ...   ...   ...
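The decomposition of |q| into the flags can be written down directly from the reconstruction formula above; a sketch (the threshold conditions are derived here from the formula, not quoted from the standard):

```python
def decompose(q_abs):
    """Flags for |q| >= 1 such that |q| = sig + gt1 + par + 2 * (gt3 + rem)."""
    assert q_abs >= 1
    sig = 1
    gt1 = 1 if q_abs >= 2 else 0
    par = (q_abs - 2) % 2 if q_abs >= 2 else 0   # parity of the remaining value
    gt3 = 1 if q_abs >= 4 else 0
    rem = (q_abs - 4 - par) // 2 if q_abs >= 6 else 0
    return sig, gt1, par, gt3, rem

# Round trip: the flags always reconstruct the original absolute value
for q in range(1, 20):
    sig, gt1, par, gt3, rem = decompose(q)
    assert q == sig + gt1 + par + 2 * (gt3 + rem)
print(decompose(7))   # (1, 1, 1, 1, 1)
```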

SLIDE 41

CABAC / Transform Coefficient Coding

H.266 | VVC: Context Modeling

template dclass (luma) dclass (chroma)

Context Model for Binary Decision sig depends on (in total: 60 probability models)

1 Sum of partially reconstructed values inside local template (first pass: q∗ = sig + gt1 + par + 2 · gt3) 2 Class dclass for diagonal position (3 classes for luma, 2 classes for chroma) 3 For special quantizer (TCQ): Quantization state (0..3)

Context Model for Binary Decisions gt1, par, gt3 (in total: 32 probability models per flag) Similar: Depends on diagonal position, values in template, and whether first non-zero coefficient Rice Parameter for Remainder rem Rice parameter = Function of sum of absolute values in template

SLIDE 42

Summary

Summary of Lecture

Run-Level Coding
V2V codes with a specific structure, combined with a suitable scanning (e.g., zig-zag scan)
Used in JPEG, MPEG-2 Video

Extensions of Run-Level Coding
Run-Level-Last coding (H.263, MPEG-4 Visual)
Context-Adaptive Variable-Length Coding (H.264 | AVC)

Context-based Adaptive Binary Arithmetic Coding (CABAC)
Simple binary arithmetic coder (requires binarization)
Two operating modes: Adaptive probability models, bypass mode

Transform Coefficient Coding in Modern Video Coding Standards
Binary arithmetic coding with rather simple binarizations
Sophisticated selection of adaptive probability models (goal: stable conditional probabilities)
