HDR Image and Video Compression
- dr. Francesco Banterle
francesco.banterle@isti.cnr.it
HDR Images and Frames
The main problem with HDR images is that they require floating-point encoding to represent all the intensity values that the HVS can see
Encoding | Color Space | Bpp | Dynamic Range (log10) | Relative Error (%)
IEEE RGB | full RGB | 96 | 79 | 0.000003
RGBE | positive RGB | 32 | 76 | 1.0
LogLuv24 | logY + (u,v) | 24 | 4.8 | 1.1
LogLuv32 | logY + (u,v) | 32 | 38 | 0.3
Half RGB | RGB | 48 | 10.7 | 0.1
RGBE (32-bit per pixel or bpp)
Value: 0 Count: 10; Value: 10 Count: 2; Value: 0 Count: 1; Value: 9 Count: 2
0,0,0 0,0,0 0,0,0 0,10,10 0,9,9
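The value/count pairs above are a run-length encoding of the pixel sequence. A minimal sketch in Python, assuming the pixels arrive as a flat list (the function names are illustrative, not from the slides):

```python
def rle_encode(values):
    """Run-length encode a sequence into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([v, 1])       # start a new run
    return [tuple(r) for r in runs]


def rle_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


# The sequence from the slide: ten 0s, two 10s, one 0, two 9s.
pixels = [0] * 10 + [10, 10, 0, 9, 9]
runs = rle_encode(pixels)
restored = rle_decode(runs)
```

Decoding reproduces the input exactly, so this step loses no information.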
lossless -> no loss of information
edges
pixel values locality and assuming two distributions per block
2 bytes (M0 and M1) + 2 bytes for the block (one bit per pixel) -> 4 bytes. This means 2 bpp instead of 8 bpp (for a grayscale image)
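The byte count can be checked with a small block-coding sketch. It assumes a 16-pixel (4x4) grayscale block, a threshold at the block mean, and M0/M1 taken as the means of the two groups; these choices are assumptions for illustration, not necessarily the exact scheme on the slide.

```python
def btc_encode(block):
    """Encode 16 gray values (0..255) as two means plus a 16-bit mask."""
    mean = sum(block) / len(block)
    low = [p for p in block if p < mean]
    high = [p for p in block if p >= mean]
    # M0 / M1: representative values of the dark and bright groups.
    m0 = round(sum(low) / len(low)) if low else round(mean)
    m1 = round(sum(high) / len(high)) if high else round(mean)
    mask = 0
    for i, p in enumerate(block):
        if p >= mean:
            mask |= 1 << i            # bit i set: pixel i uses M1
    return m0, m1, mask


def btc_decode(m0, m1, mask, n=16):
    """Rebuild the block: each pixel becomes M0 or M1 according to its bit."""
    return [m1 if (mask >> i) & 1 else m0 for i in range(n)]


block = [10] * 8 + [200] * 8
m0, m1, mask = btc_encode(block)
decoded = btc_decode(m0, m1, mask)
bits_per_pixel = (8 + 8 + 16) / len(block)   # M0 + M1 + mask over 16 pixels
```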
differently high and low frequencies
chrominance, and luminance in values
standard
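$$M_{RGB \to YCbCr} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.169 & -0.331 & 0.5 \\ 0.5 & -0.419 & -0.081 \end{bmatrix}$$
$$\begin{bmatrix} Y \\ Cb \\ Cr \end{bmatrix} = \begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix} + M_{RGB \to YCbCr} \begin{bmatrix} R \\ G \\ B \end{bmatrix}$$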
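A minimal sketch of this color transform, assuming 8-bit RGB inputs in [0, 255] and the usual JPEG coefficients, with both chroma rows summing to zero (second row -0.169, -0.331, 0.5). The function name is illustrative.

```python
# Rows produce Y, Cb, Cr in that order.
M = [
    [0.299, 0.587, 0.114],
    [-0.169, -0.331, 0.5],
    [0.5, -0.419, -0.081],
]
OFFSET = [0.0, 128.0, 128.0]  # chroma is shifted so that gray maps to 128


def rgb_to_ycbcr(r, g, b):
    """Apply the 3x3 matrix and the offset to one RGB pixel."""
    return tuple(
        off + row[0] * r + row[1] * g + row[2] * b
        for row, off in zip(M, OFFSET)
    )


gray = rgb_to_ycbcr(128, 128, 128)
```

A neutral gray has no chrominance, so both chroma channels stay at the 128 offset.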
(8x8 in JPEG) into low and high frequency bands.
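$$F(u, v) = \left(\frac{2}{N}\right)^{\frac{1}{2}} \left(\frac{2}{M}\right)^{\frac{1}{2}} \Lambda(u)\,\Lambda(v) \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} \cos\left(\frac{\pi u}{2N}(2i + 1)\right) \cos\left(\frac{\pi v}{2M}(2j + 1)\right) f(i, j)$$
$$\Lambda(x) = \begin{cases} \frac{1}{\sqrt{2}} & \text{if } x = 0 \\ 1 & \text{otherwise} \end{cases}$$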
2D DCT
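A direct (unoptimized) sketch of the orthonormal 2D DCT, with the normalization factor applied per frequency index (u, v). It is meant only to show the structure of the transform; real encoders use fast factorizations.

```python
import math


def lam(k):
    """Normalization factor: 1/sqrt(2) for the DC index, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct2(block):
    """2D DCT of an N x M block given as a list of rows."""
    n, m = len(block), len(block[0])
    scale = math.sqrt(2.0 / n) * math.sqrt(2.0 / m)
    out = [[0.0] * m for _ in range(n)]
    for u in range(n):
        for v in range(m):
            s = 0.0
            for i in range(n):
                for j in range(m):
                    s += (math.cos(math.pi * u * (2 * i + 1) / (2 * n))
                          * math.cos(math.pi * v * (2 * j + 1) / (2 * m))
                          * block[i][j])
            out[u][v] = scale * lam(u) * lam(v) * s
    return out


# A flat 8x8 block has energy only in the lowest-frequency coefficient.
flat = [[10.0] * 8 for _ in range(8)]
F = dct2(flat)
```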
Quantization matrix
16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55
14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62
18 22 37 56 68 109 103 77
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99
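A sketch of how a quantization matrix like this one is typically applied: each DCT coefficient is divided by the matching entry and rounded, so high-frequency coefficients (large divisors) tend to become zero. The rounding rule here is Python's built-in `round`, which is an assumption of this sketch.

```python
Q = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]


def quantize(coeffs, q=Q):
    """Divide each coefficient by its quantization step and round."""
    return [[round(c / step) for c, step in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, q)]


def dequantize(levels, q=Q):
    """Approximate reconstruction: multiply back by the step."""
    return [[lv * step for lv, step in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, q)]


coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 80.0   # low frequency: small step, survives
coeffs[7][7] = 40.0   # high frequency: large step, rounds to zero
levels = quantize(coeffs)
restored = dequantize(levels)
```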
Values are in [-128, 128], then encoded in [0,255]
Similar frequencies are put together
Values are encoded using:
mapped version using HDR [Ward and Simmons 2004]
64Kb
encoding per color channel!
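$$C_e(x) = \frac{\log_2(C(x) + \epsilon) - \log_2(C_{\min} + \epsilon)}{\log_2(C_{\max} + \epsilon) - \log_2(C_{\min} + \epsilon)}, \quad \epsilon > 0$$
$$C'_e(x) = \left\lceil (2^{16} - 1)\, C_e(x) \right\rceil$$
[Figure: the encoded channels R'_e, G'_e, B'_e are passed to the JPEG2000 Encoder, which outputs the Encoded HDR Image]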
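A sketch of the per-channel log encoding to 16-bit integers. It assumes the normalization subtracts the log of the channel minimum, so that the result lies in [0, 1] before scaling; the decoder and the default value of `eps` are also assumptions added for the round trip.

```python
import math


def log_encode_channel(values, eps=1e-6, bits=16):
    """Map one HDR channel to integer codes in [0, 2^bits - 1]."""
    c_min, c_max = min(values), max(values)
    lo, hi = math.log2(c_min + eps), math.log2(c_max + eps)
    levels = (1 << bits) - 1
    if hi == lo:                       # flat channel: nothing to spread out
        return [0] * len(values), c_min, c_max
    codes = [
        min(levels, math.ceil(levels * (math.log2(c + eps) - lo) / (hi - lo)))
        for c in values
    ]
    return codes, c_min, c_max


def log_decode_channel(codes, c_min, c_max, eps=1e-6, bits=16):
    """Invert the mapping; c_min and c_max must be stored with the image."""
    lo, hi = math.log2(c_min + eps), math.log2(c_max + eps)
    levels = (1 << bits) - 1
    return [2.0 ** (code / levels * (hi - lo) + lo) - eps for code in codes]


values = [0.01, 1.0, 100.0]
codes, c_min, c_max = log_encode_channel(values)
decoded = log_decode_channel(codes, c_min, c_max)
```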
histogram and to encode them separately [Wang et al. 2007]
the histogram
[Figure: histogram, x-axis: Bucket, y-axis: Number of Pixels, with dark areas and bright areas marked]
take advantage of spatial coherency? [Boschetti et
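$$E_m = \left\lceil \log_2 \max(R, G, B) + 128 \right\rceil$$
$$R_m = \left\lfloor \frac{256\,R}{2^{E_m - 128}} \right\rfloor \qquad G_m = \left\lfloor \frac{256\,G}{2^{E_m - 128}} \right\rfloor \qquad B_m = \left\lfloor \frac{256\,B}{2^{E_m - 128}} \right\rfloor$$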
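A sketch of RGBE encoding following these formulas. Three details are assumptions added here: black pixels are stored as all zeros, mantissas are clamped to 255 (when the maximum is an exact power of two the formula yields 256), and the decoder reconstructs at the center of each mantissa bin.

```python
import math


def rgbe_encode(r, g, b):
    """Encode one non-negative RGB pixel as (Rm, Gm, Bm, Em)."""
    m = max(r, g, b)
    if m <= 0.0:
        return (0, 0, 0, 0)            # assumed convention for black
    e = math.ceil(math.log2(m) + 128)  # shared exponent
    scale = 256.0 / 2.0 ** (e - 128)
    # Clamp: an exact power-of-two maximum would otherwise give 256.
    rm, gm, bm = (min(255, math.floor(c * scale)) for c in (r, g, b))
    return (rm, gm, bm, e)


def rgbe_decode(rm, gm, bm, e):
    """Approximate inverse (assumed): bin centers times the shared scale."""
    if e == 0:
        return (0.0, 0.0, 0.0)
    scale = 2.0 ** (e - 128) / 256.0
    return tuple((c + 0.5) * scale for c in (rm, gm, bm))


encoded = rgbe_encode(3.0, 1.0, 0.1)
decoded = rgbe_decode(*encoded)
```

The exponent is not clamped to 0..255 in this sketch, so extremely small or large inputs would need extra handling.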
$$E = \operatorname{mean}_{R,G,B}\left( \log_2\left( \frac{I_{HDR}}{I_{TMO}} + 1 + \epsilon \right) \right), \quad \epsilon > 0$$
M = IHDR EE − 1
segment a linear compression factor [Banterle et
strange; i.e. seams and no global contrast
[Figure: encoding pipeline with the blocks Input HDR Image, Segmentation, Tone Mapping, TMO Parameters, Lossy Encoding, and Lossless Encoding]
because the peak can be an outlier
(LDR images) that can be extracted from an HDR image [Munkberg et al. 2006]
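$$\mathrm{MSE}(I, \hat{I}) = \frac{1}{n} \sum_{j=1}^{n} \left( I(x_j) - \hat{I}(x_j) \right)^2$$
$$\mathrm{PSNR}(I, \hat{I}) = 10 \log_{10} \left( \frac{I_{\max}^2}{\mathrm{MSE}(I, \hat{I})} \right)$$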
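A minimal sketch of MSE and PSNR for two images given as flat lists of pixel values. Returning infinity for identical images is a convention chosen here, not something stated on the slide.

```python
import math


def mse(ref, test):
    """Mean squared error between two equally sized pixel lists."""
    return sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)


def psnr(ref, test, i_max):
    """Peak signal-to-noise ratio in dB; i_max is the peak pixel value."""
    err = mse(ref, test)
    if err == 0:
        return math.inf                # identical images (assumed convention)
    return 10.0 * math.log10(i_max ** 2 / err)


ref = [10, 20]
test = [11, 19]
value = psnr(ref, test, 255)           # MSE is 1, so PSNR = 20 * log10(255)
```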
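$$T(v, c) = \left[ 255\,(2^{c} v)^{\frac{1}{\gamma}} \right]_{0}^{255}$$
where $[\cdot]_{0}^{255}$ clamps the value to [0, 255]
$$\mathrm{MSE}(I, \hat{I}) = \frac{1}{n \times p} \sum_{c=1}^{p} \sum_{i=1}^{n} \left( \Delta R_{i,c}^2 + \Delta G_{i,c}^2 + \Delta B_{i,c}^2 \right)$$
$$\mathrm{mPSNR}(I, \hat{I}) = 10 \log_{10} \left( \frac{3 \times 255^2}{\mathrm{MSE}(I, \hat{I})} \right)$$
$$\Delta R_{i,c} = T(R(x_i), c) - T(\hat{R}^{*}(x_i), c)$$
$$\Delta G_{i,c} = T(G(x_i), c) - T(\hat{G}^{*}(x_i), c)$$
$$\Delta B_{i,c} = T(B(x_i), c) - T(\hat{B}^{*}(x_i), c)$$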
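A sketch of mPSNR over a set of exposures. It assumes gamma = 2.2, a clamp of the tone-mapped value to [0, 255], and a caller-supplied list of exposure stops c; none of these specific values come from the slides.

```python
import math


def tone_map(v, c, gamma=2.2):
    """T(v, c): expose by 2^c, gamma-encode, scale to 8 bits, clamp."""
    v = max(v, 0.0)                    # guard against negative input
    return min(255.0, max(0.0, 255.0 * (2.0 ** c * v) ** (1.0 / gamma)))


def mpsnr(ref, test, exposures, gamma=2.2):
    """ref, test: lists of (R, G, B) HDR pixels; exposures: stops c."""
    n, p = len(ref), len(exposures)
    total = 0.0
    for c in exposures:
        for px, qx in zip(ref, test):
            for a, b in zip(px, qx):   # R, G, B differences after tone mapping
                d = tone_map(a, c, gamma) - tone_map(b, c, gamma)
                total += d * d
    err = total / (n * p)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(3 * 255 ** 2 / err)


ref = [(0.5, 0.25, 1.0), (2.0, 0.1, 0.05)]
near = [tuple(ch * 1.01 for ch in px) for px in ref]   # 1% error
far = [tuple(ch * 1.2 for ch in px) for px in ref]     # 20% error
stops = [-2, 0, 2]
score_near = mpsnr(ref, near, stops)
score_far = mpsnr(ref, far, stops)
```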
exacerbate per pixel differences
values influence
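$$\mathrm{RMSE}(I, \hat{I}) = \sqrt{ \frac{1}{n} \sum_{i=1}^{n} \left[ \left( \log_2 \frac{R(x_i)}{\hat{R}(x_i)} \right)^2 + \left( \log_2 \frac{G(x_i)}{\hat{G}(x_i)} \right)^2 + \left( \log_2 \frac{B(x_i)}{\hat{B}(x_i)} \right)^2 \right] }$$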
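A minimal sketch of this log-domain RMSE, assuming both images are lists of strictly positive (R, G, B) tuples so that the ratio and its logarithm are defined.

```python
import math


def log_rmse(ref, test):
    """RMSE of per-channel log2 ratios between two HDR images."""
    total = 0.0
    for px, qx in zip(ref, test):
        for a, b in zip(px, qx):
            total += math.log2(a / b) ** 2
    return math.sqrt(total / len(ref))


# Every channel is off by exactly one stop (a factor of 2).
ref = [(2.0, 2.0, 2.0), (8.0, 0.5, 1.0)]
test = [(1.0, 1.0, 1.0), (4.0, 0.25, 0.5)]
error = log_rmse(ref, test)
```

Because the error is measured on ratios, a one-stop difference costs the same in dark and bright pixels.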
the response of the HVS to luminance values
[Figure: PU metric pipeline. The Reference Image and the Test Image each pass through a Display Model and PU Encoding before a Classic Metric is applied. Accompanying plot: PU and sRGB curves, axes Pixel value and Luminance Value]
10918-1)
(H.262), MPEG-4 Part 2 (H.263), H.264 (AVC), H.265 (HEVC)
encoded using JPEG
encoded and the I-frame
[Figure: frames at time t and t+1, and the difference frame between time t and t+1]
Motion vector (u,v) stored per macroblock (16x16)
[Figure: difference frame between time t and t+1 with motion estimation]
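A sketch of motion estimation for one 16x16 macroblock. It assumes an exhaustive search over a small window and the sum of absolute differences (SAD) as the matching cost; real codecs use faster search strategies, and the function names are illustrative.

```python
def sad(cur, ref, bx, by, u, v, size):
    """Sum of absolute differences between a block and its shifted match."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + v][bx + x + u])
    return total


def estimate_motion(cur, ref, bx, by, size=16, search=4):
    """Return the (u, v) offset into ref that best predicts the block."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            inside = (0 <= bx + u and bx + u + size <= w
                      and 0 <= by + v and by + v + size <= h)
            if not inside:
                continue               # candidate block leaves the frame
            cost = sad(cur, ref, bx, by, u, v, size)
            if cost < best_cost:
                best, best_cost = (u, v), cost
    return best, best_cost


# Synthetic test: frame t+1 is frame t shifted by (2, 3) pixels.
H = W = 32
frame_t = [[(7 * x + 13 * y + x * y) % 256 for x in range(W)] for y in range(H)]
frame_t1 = [[frame_t[min(y + 3, H - 1)][min(x + 2, W - 1)] for x in range(W)]
            for y in range(H)]
vector, cost = estimate_motion(frame_t1, frame_t, bx=8, by=8)
```

With the right vector the residual for this block is zero, which is why the motion-compensated difference frame is cheaper to encode.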
(11bit for encoding luma):
strong edges
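$$\frac{d\Psi(l)}{dl} = \frac{2\,\mathrm{tvi}(\Psi(l))}{f}$$
$$\Psi(0) = 10^{-4}\ \mathrm{cd/m^2} \qquad \Psi(l_{\max}) = 10^{8}\ \mathrm{cd/m^2} \qquad l_{\max} = 2^{n_{bits}} - 1$$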
compatible, why not use a similar idea from JPEG-HDR? [Mantiuk et al. 2006]
perceive!
filtered using the bilateral filter [Lee and Kim 2008]
residuals stream
$$QP_{ratio} = 0.77\,QP_d + 13.42$$
$$R(x) = \log_2 \left( \frac{L_w(x)}{L_d(x)} \right)$$
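A small sketch of the two formulas. It reads L_w as the HDR luminance and L_d as the tone-mapped luminance, both strictly positive, and QP_d as the quantization parameter of the tone-mapped stream; this reading of the symbols is an assumption.

```python
import math


def log_ratio_residual(lw, ld):
    """R(x) = log2(Lw(x) / Ld(x)) for each pixel of two luminance lists."""
    return [math.log2(w / d) for w, d in zip(lw, ld)]


def qp_ratio(qp_d):
    """Quantization parameter for the residuals stream, from QP_d."""
    return 0.77 * qp_d + 13.42


residual = log_ratio_residual([8.0, 1.0], [2.0, 1.0])
qp = qp_ratio(20)
```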
content