A Detection-Theoretic and Computational Framework for Designing Geometrically Resilient Watermarking Systems

SLIDE 1

A Detection-Theoretic and Computational Framework for Designing Geometrically Resilient Watermarking Systems

Pierre Moulin
University of Illinois at Urbana-Champaign
www.ifp.uiuc.edu/~moulin/talks/wacha05-slides.pdf

WaCha, Barcelona
June 8, 2005

SLIDE 2

Outline

  • A communication model for geometric attacks
    – Role of Information Theory and Detection Theory
    – “Complexity” of geometric attacks

  • Example: Unitary Geometric Attack Channels
  • Invariant vs GLRT vs Pilot-based WM schemes

SLIDE 3

An Image Watermarking System

[Figure: block diagram of the watermarking system. The encoder takes the original image S, the secret key k, and the binary representation (1001001101001110100...............101) of the message “Picture taken by Alice on January 1, 2000. This message is going to be embedded forever in this picture. I challenge you to remove the message without substantially altering the picture.” and outputs the watermarked image X. A pirate attacks X (the bit string 11011000...01 is shown next to the pirate). The decoder, which shares the secret key k, outputs the decoded binary message and the decoded message.]

SLIDE 4

Attacks on Images

[Figure: six image panels: Original; JPEG, QF=10; 4 × 4 median filtering; Gaussian filter (σ = 3); Rotated by 10 degrees; Random bending]

SLIDE 5

A Communication Model for Geometric Attacks

  • Attacker maps watermarked $X = (X_1, \dots, X_n)$ into degraded $Y = (Y_1, \dots, Y_n)$ using stochastic mapping $p(y|x)$.
  • Distortion function $d(x, y)$
  • Feasible mappings satisfy a distortion constraint
    – on average: $E[d(X, Y)] \le D_2$
    – or with probability one: $d(X, Y) \le D_2$

  • Would like “geometrically-inspired” d(x, y)

SLIDE 6

Attack Model and Distortion Function [MM’02]

[Figure: attack model. x → memoryless channel A(z|x) → z → geometric attack T_θ(·) → y]

  • Geometric (desynchronization) parameter θ ∈ Θ
  • Tθ(·) smooth, invertible mapping
  • Additive distortion function $d_a(x, z) = \frac{1}{n} \sum_{i=1}^{n} d_a(x_i, z_i)$
  • Distortion function $d(x, y) = \min_{\theta \in \Theta} d_a(x, T_\theta^{-1}(y))$, invariant to geometric attacks in class $\{T_\theta, \theta \in \Theta\}$

  • Maximum distortion level D2 for attacker
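
The invariant distortion above can be sketched numerically. This is a minimal illustration, not the construction from [MM’02]: it assumes that the class of geometric attacks is the set of integer cyclic shifts and that the per-sample distortion is squared error. The function names are hypothetical.

```python
import numpy as np

def additive_distortion(x, z):
    """d_a(x, z) = (1/n) * sum_i (x_i - z_i)^2 (squared error is an assumption)."""
    x, z = np.asarray(x, float), np.asarray(z, float)
    return float(np.mean((x - z) ** 2))

def invariant_distortion(x, y):
    """d(x, y) = min over theta of d_a(x, T_theta^{-1}(y)).

    Assumption: T_theta is a cyclic shift by theta samples, so its inverse
    rolls y back by theta samples.
    """
    n = len(x)
    return min(additive_distortion(x, np.roll(y, -theta)) for theta in range(n))

rng = np.random.default_rng(0)
x = rng.normal(size=64)
y = np.roll(x, 5)  # pure geometric attack, no memoryless noise

print("additive distortion: ", additive_distortion(x, y))
print("invariant distortion:", invariant_distortion(x, y))
```

A pure shift costs the attacker nothing under d(x, y), while the plain additive distortion reports a large value. That is the sense in which d is invariant to the class of geometric attacks.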

SLIDE 7

Information-Theoretic Setup

[Figure: message M and host s → encoder f_n(s, m, k) → x → attack p(y|x) → y → decoder g_n(y, k) → decoded message M̂; the key k is shared by encoder and decoder]

  • Communications with side information (Gel’fand-Pinsker 1980)
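  • $M$ uniformly distributed over message set $\mathcal{M}_n$
    – Coding problem: $R \triangleq \lim_{n \to \infty} \frac{1}{n} \log_2 |\mathcal{M}_n| > 0$
    – Detection problem: $\mathcal{M}_n$ independent of $n$ ⇒ $R = 0$
  • Distortion levels $D_1$ and $D_2$
  • Class of attacks: $\mathcal{P}_n \triangleq \{p_{Y|X}\}$
  • Attacker knows $f_n$, $g_n$, selects $(A_{Z|X}, \theta) \sim p_{Y|X} \in \mathcal{P}_n$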

SLIDE 8

[Figure: message M and host s → encoder f_n(s, m, k) → x → attack p(y|x) → y → decoder g_n(y, k) → decoded message M̂; the key k is shared by encoder and decoder]

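  • Minmax probability of error:
    $P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = \inf_{f_n, g_n} \sup_{p_{Y|X} \in \mathcal{P}_n} P_e(f_n, g_n, p_{Y|X})$
  • Rate $R$ is achievable if $\limsup_{n \to \infty} P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) = 0$
  • Supremum of achievable rates is capacity $C(D_1, D_2)$
  • Error exponent
    $e^*(R, D_1, D_2) = \liminf_{n \to \infty} -\frac{1}{n} \log P_e^*(n, \mathcal{M}_n, \mathcal{P}_n)$, $0 \le R \le C$
  • Write $P_e^*(n, \mathcal{M}_n, \mathcal{P}_n) \doteq 2^{-n\, e^*(R, D_1, D_2)}$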

SLIDE 9
  • Can derive expression for $C(D_1, D_2)$ for various classes of attacks involving additive distortion functions:
    – Memoryless attacks [MO’99]
    – Max-distortion attacks [CL’01, SM’03]
  • Can also derive upper and lower bounds on $e^*(R, D_1, D_2)$ [SM’04] [MW’04]

  • What happens under geometric attacks?

SLIDE 10

Complexity of Geometric Attacks

[Figure: attack model. x → memoryless channel A(z|x) → z → geometric attack T_θ(·) → y]

  • Consider two cases: receiver knows θ or not
  • If receiver knows θ, it can “undo” geometric attacks
  • If receiver doesn’t know θ but Θ is compact,
    – there is no decrease in capacity; $C(D_1, D_2)$ is achieved using traditional decoder, aided by pilot.
    – there is not even a decrease in $e_r^*(R, D_1, D_2)$, i.e., there exists a universal decoder against such geometric attacks

SLIDE 11

Standard WM Codes and Their Limitations

  • Example: standard Quantization Index Modulation codes perform well against additive Gaussian attacks but are vulnerable to scaling attacks, delays, warping, etc.

  • The main culprit is the minimum-Euclidean-distance decoder
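
The vulnerability to scaling can be seen in a small simulation. This is a minimal sketch, assuming scalar QIM (dither modulation with two shifted lattices); the step size DELTA, the noise level, and the 10% gain are hypothetical values chosen only for illustration.

```python
import numpy as np

DELTA = 4.0  # quantizer step size (hypothetical value)

def qim_embed(s, bits, delta=DELTA):
    """Scalar QIM: quantize each host sample onto the lattice delta*Z + bit*delta/2."""
    offset = bits * delta / 2.0
    return delta * np.round((s - offset) / delta) + offset

def qim_decode(y, delta=DELTA):
    """Minimum-Euclidean-distance decoder.

    The nearest point of the union lattice (delta/2)*Z is found by rounding;
    its parity tells which of the two shifted lattices it belongs to.
    """
    return np.round(y / (delta / 2.0)).astype(int) % 2

rng = np.random.default_rng(1)
s = rng.uniform(-100, 100, size=2000)    # host samples
bits = rng.integers(0, 2, size=2000)     # one bit per sample
x = qim_embed(s, bits)

noisy = x + rng.normal(scale=0.2, size=x.size)   # mild additive Gaussian attack
scaled = 1.1 * x                                 # amplitude scaling attack

ber_noise = np.mean(qim_decode(noisy) != bits)
ber_scale = np.mean(qim_decode(scaled) != bits)
print("bit error rate, additive noise:", ber_noise)
print("bit error rate, 10% scaling:   ", ber_scale)
```

The scaling attack moves large-amplitude samples by many quantizer cells, so a decoder that measures Euclidean distance to the original lattices decodes those samples essentially at random.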

SLIDE 12

Unitary Geometric Attack Channels

  • Assume $s, x, y \in \mathbb{R}^n$ and $d_a(x, y) = \|x - y\|^2$
  • $T_\theta$ is a unitary matrix (geometric attack is linear and preserves signal energy)
  • Example: cyclic delay attack
    – Attacker performs bandlimited interpolation of $x$, applies cyclic delay $\theta \in [0, n]$, and resamples signal
  • Assume $S \sim \mathcal{N}(0, \Sigma)$ and $T_\theta \Sigma T_\theta^T$ is independent of $\theta$ ⇒ statistics of $S$ are invariant under $T_\theta$

SLIDE 13

Example: M-ary Watermark Detection in iid Gaussian Noise

  • Code rate R = 0
  • Additive spread-spectrum embedding rule x = s + wm
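  • $M \le n$ orthogonal watermarks $w_m \in \mathbb{R}^n$, each with energy $\|w_m\|^2 = n D_1$
  • Watermark constellation $\mathcal{C} = \{w_m\}$; transformed watermark constellation $\mathcal{C}_\theta = \{T_\theta w_m\}$
  • Total noise at receiver $\sim \mathcal{N}(0, \sigma^2 I_n)$
  • Watermark to Noise ratio: $\mathrm{WNR} = D_1 / \sigma^2$
  • Minimum distance of $\mathcal{C}_\theta$: $d_{\min} = \sqrt{2n\,\mathrm{WNR}}$, same for all θ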

SLIDE 14

Coherent Case: Detector knows θ

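  • Hypothesis test: $H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n)$, $m \in \mathcal{M}$
  • Optimal likelihood ratio test (LRT) is a correlator-detector:
    $\hat{m} = \arg\max_{m \in \mathcal{M}} y^T T_\theta w_m$
  • Error probability:
    $P_e \le \frac{M - 1}{2}\, Q(d_{\min}/2) \doteq e^{-n\,\mathrm{WNR}/4}$
  • Computational complexity: no search, just $|\mathcal{M}|$ correlations ⇒ $|\mathcal{M}|$ ops/sample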
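
The exponent follows from the Gaussian tail bound $Q(t) \le \frac{1}{2} e^{-t^2/2}$: with $d_{\min} = \sqrt{2n\,\mathrm{WNR}}$, $t = d_{\min}/2$ gives $t^2/2 = n\,\mathrm{WNR}/4$. The coherent detector itself can be sketched as follows. This is a minimal illustration, assuming that $T_\theta$ is an integer cyclic shift and that the orthogonal watermarks come from a QR factorization of a random matrix; both choices, and the function names, are assumptions made for the example.

```python
import numpy as np

def make_watermarks(M, n, D1, rng):
    """Return M orthogonal watermarks in R^n (rows), each with energy n*D1."""
    q, _ = np.linalg.qr(rng.normal(size=(n, M)))   # orthonormal columns
    return np.sqrt(n * D1) * q.T

def coherent_detect(y, watermarks, theta):
    """Correlator detector with known theta: argmax_m y^T T_theta w_m.

    Assumption: T_theta is a cyclic shift by theta samples.
    """
    scores = [y @ np.roll(w, theta) for w in watermarks]
    return int(np.argmax(scores))

rng = np.random.default_rng(3)
n, M, D1, sigma, theta = 256, 4, 1.0, 1.0, 17
W = make_watermarks(M, n, D1, rng)

# Send each watermark once through the shift plus Gaussian noise.
decoded = [
    coherent_detect(np.roll(W[m], theta) + sigma * rng.normal(size=n), W, theta)
    for m in range(M)
]
print(decoded)
```

Only M inner products of length n are computed, which matches the count of |M| operations per sample.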

SLIDE 15

Noncoherent Case: Detector doesn’t know θ

  • Hypothesis test: $H_m : Y \sim \mathcal{N}(T_\theta w_m, \sigma^2 I_n)$, $m \in \mathcal{M}$, $\theta \in \Theta$
  • Worst-case error probability $\max_{\theta \in \Theta} P_e(f_n, g_n, \theta)$
  • Can we do (nearly) as well as in the coherent case?
  • What kind of detector gn is (nearly) optimal?
  • What kind of watermark code fn should we use?

SLIDE 16

Taxonomy for Practical WM Schemes

  • Invariant WM schemes
  • Generalized Likelihood Ratio Test (GLRT) detectors
  • Pilot-aided detection

SLIDE 17

Invariant Watermarks

  • Invariant watermark: select embedding domain such that $p(y|\theta, H_m)$ is independent of θ
    – θ is nonidentifiable
  • Detector has same performance as in coherent case (against memoryless attacks in invariant domain)
  • No increase in computational complexity
  • Possible loss of robustness against memoryless attacks in original image domain
  • And an invariant domain does not exist in general!

SLIDE 18

Invariant Detection Tests

  • Construct good detection statistics whose distribution is independent of θ
  • Example: noncoherent detection of sinusoids (M-ary FSK) subject to cyclic delay attacks:
    $w_m(i) = \sqrt{2 D_1}\, \sin(2\pi f_m i)$, $0 \le i < n$, $f_m = (K + m)/n$
  • Detection statistics $z_m = \left| \sum_{i=0}^{n-1} y(i)\, e^{j 2\pi f_m i} \right|^2$, $m \in \mathcal{M}$
  • Detection test: $\hat{m} = \arg\max_{m \in \mathcal{M}} z_m$
  • Error probability $P_e \le (M - 1)\, e^{-n\,\mathrm{WNR}/4}$

  • No loss in error exponent wrt coherent case
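
The noncoherent FSK test can be sketched directly from the formulas above. This is a minimal illustration: the values of n, M, K, D1, and σ are hypothetical, and the cyclic delay is modeled as a phase shift of the sinusoid, which is exact here because each watermark completes an integer number of cycles over n samples.

```python
import numpy as np

def fsk_watermark(m, n, K, D1, delay=0.0):
    """w_m(i) = sqrt(2*D1) * sin(2*pi*f_m*(i - delay)), with f_m = (K + m)/n."""
    i = np.arange(n)
    f_m = (K + m) / n
    return np.sqrt(2 * D1) * np.sin(2 * np.pi * f_m * (i - delay))

def noncoherent_detect(y, M, K):
    """Compute z_m = |sum_i y(i) exp(j*2*pi*f_m*i)|^2 and return argmax_m z_m."""
    n = len(y)
    i = np.arange(n)
    z = [abs(np.sum(y * np.exp(2j * np.pi * (K + m) / n * i))) ** 2
         for m in range(M)]
    return int(np.argmax(z))

rng = np.random.default_rng(5)
n, M, K, D1, sigma = 256, 4, 10, 1.0, 1.0

results = []
for delay in (0.0, 3.7, 100.25):          # unknown to the detector
    for m in range(M):
        y = fsk_watermark(m, n, K, D1, delay) + sigma * rng.normal(size=n)
        results.append(noncoherent_detect(y, M, K) == m)
print(all(results))
```

The delay only rotates the phase of the complex correlation, so the squared magnitude z_m does not depend on it and no search over θ is needed.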

SLIDE 19

Generalized Likelihood Ratio Test (GLRT)

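  • Step 1: Maximum-Likelihood Estimation:
    $\hat{\theta}_m \triangleq \arg\max_{\theta} p(y|\theta, H_m) = \arg\min_{\theta} \|y - T_\theta w_m\|$, $m \in \mathcal{M}$
  • Step 2: Correlator Detector:
    $\hat{m} = \arg\max_{m \in \mathcal{M}} y^T T_{\hat{\theta}_m} w_m$
  • Asymptotic optimality of GLRT: for $\theta \in \mathbb{R}$, still $P_e \doteq e^{-n\,\mathrm{WNR}/4}$!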

  • Computational complexity: mostly |M| full searches
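
The two GLRT steps can be sketched for a discrete search space. This is a minimal illustration with several assumptions: Θ is the set of integer cyclic shifts, the watermarks are pseudo-random ±1 sequences (only approximately orthogonal), and the full search is done with an FFT-based circular correlation. Since $T_\theta$ is unitary, minimizing $\|y - T_\theta w_m\|$ is the same as maximizing $y^T T_\theta w_m$.

```python
import numpy as np

def circular_correlation(y, w):
    """c[theta] = sum_i y[i] * w[(i - theta) mod n] for all integer theta, via FFT."""
    return np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(w))).real

def glrt_detect(y, watermarks):
    """Step 1: ML shift estimate under each H_m. Step 2: correlator over m."""
    corr = np.array([circular_correlation(y, w) for w in watermarks])  # (M, n)
    theta_hat = corr.argmax(axis=1)   # theta_hat_m for each hypothesis
    best = corr.max(axis=1)           # y^T T_{theta_hat_m} w_m
    m_hat = int(best.argmax())
    return m_hat, int(theta_hat[m_hat])

rng = np.random.default_rng(6)
n, M, D1, sigma = 1024, 4, 1.0, 1.0
W = np.sqrt(D1) * rng.choice([-1.0, 1.0], size=(M, n))   # energy n*D1 each

m_true, theta_true = 2, 321
y = np.roll(W[m_true], theta_true) + sigma * rng.normal(size=n)
m_hat, theta_hat = glrt_detect(y, W)
print(m_hat, theta_hat)
```

One correlation over all shifts is computed per hypothesis, which is the "|M| full searches" cost; the FFT is only one way to organize that search.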

SLIDE 20

Pilot-Aided Detection

[Figure: information-bearing signals and pilot shown along the time axis]

  • Pilot known to receiver, conveys info about channel law $p_{Y|X}$
  • Up to $n - 1$ orthogonal WM’s $w_m$, each with energy $n D_w$
  • Assume pilot $p \in \mathbb{R}^n$ is orthogonal to $\{w_m\}$, has energy $n D_p$.
  • Transmit watermarked signal $x = s + w_m + p$
  • Embedding distortion $= n D_1 = n(D_w + D_p)$ ⇒ $D_w < D_1$

SLIDE 21
  • Computational complexity: mostly one full search (match p to y)
  • Reduces effective WNR by a factor of $1 - D_p/D_1$ and therefore decreases error exponent
  • Large estimation errors $\hat{\theta} - \theta$ also contribute to $P_e$ ⇒ optimal $D_p$ results from large-deviations analysis

Detection vs computational-complexity tradeoff
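
Pilot-aided detection can be sketched in the same setting. This is a minimal illustration under stated assumptions: the geometric attack is an integer cyclic shift, the pilot and watermarks are pseudo-random ±1 sequences (approximately rather than exactly orthogonal), the host is absorbed into the Gaussian noise term, and the energy split Dw = Dp is a hypothetical choice, not an optimized one.

```python
import numpy as np

def circular_correlation(y, w):
    """c[theta] = sum_i y[i] * w[(i - theta) mod n] for all integer theta, via FFT."""
    return np.fft.ifft(np.fft.fft(y) * np.conj(np.fft.fft(w))).real

def pilot_detect(y, watermarks, pilot):
    """One full search on the pilot to estimate theta, then |M| correlations."""
    theta_hat = int(circular_correlation(y, pilot).argmax())
    scores = [y @ np.roll(w, theta_hat) for w in watermarks]
    return int(np.argmax(scores)), theta_hat

rng = np.random.default_rng(7)
n, M, Dw, Dp, sigma = 1024, 4, 0.5, 0.5, 1.0
W = np.sqrt(Dw) * rng.choice([-1.0, 1.0], size=(M, n))   # energy n*Dw each
p = np.sqrt(Dp) * rng.choice([-1.0, 1.0], size=n)        # energy n*Dp

m_true, theta_true = 1, 700
y = np.roll(W[m_true] + p, theta_true) + sigma * rng.normal(size=n)
m_hat, theta_hat = pilot_detect(y, W, p)
print(m_hat, theta_hat)
```

Compared with a GLRT over the same shifts, the search is run once instead of once per hypothesis, at the price of spending part of the embedding energy on the pilot.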

SLIDE 22

More General Geometric Attacks

  • Generally, Tθ is not unitary, not even linear

[Figure: received signal y and the sets of transformed watermarks {T_θ w_1}, {T_θ w_2}, {T_θ w_3}, ...]

  • How can we generalize the previous WM design/detection approaches?

SLIDE 23
  • Invariant WM’s: very hard if not impossible to construct
  • GLRT approach:
    – due to invertibility and smoothness of $T_\theta(\cdot)$, GLRT is asymptotically optimal as $n \to \infty$ provided Θ is not “too complex” (e.g., $\Theta \subset \mathbb{R}^d$ where $d \ll n$)
    – Proof is based on notion of competitive minimaxity [FM’02]
  • Pilot-based approach: capacity-achieving, but lower error exponents

SLIDE 24

Fast Search

  • Search for $\hat{\theta}_m = \arg\max_{\theta \in \Theta} p(y|\theta, H_m)$, $m \in \mathcal{M}$
  • Computational cost of full search (for discrete Θ) $\sim n\, |\mathcal{M}|\, |\Theta|$
  • Replace full search by partial search
  • Analogous to classical signal processing problems such as fast motion estimation in video, fast image registration, etc.

SLIDE 25

More General Watermarking Codes [M’03]

  • Gelfand-Pinsker setup

[Figure: codewords u and u′ together with the transformed versions G_{θ0}u′ and G_{θ1}u′ produced by the transformations G_{θ0} and G_{θ1}]

  • How to make it practical?

SLIDE 26

Conclusion

  • Detection performance vs complexity tradeoffs
  • Asymptotics (n → ∞):

    – Assume θ is “low-dimensional” or belongs to compact set
    – Invariant watermarks may not exist...
    – Pilot-based schemes are capacity-achieving but cause loss in error exponents
    – GLRT is asymptotically optimal for our problem

  • In practice:

    – GLRT-type detectors with fast search may be attractive
    – So are pilot-based schemes, if $|\mathcal{M}_n|$ is large

SLIDE 27

References

  • [LN’98] Lapidoth and Narayan (IT 1998)
  • [FL’98] Feder and Lapidoth (IT 1998)
  • [MO’99] Moulin and O’Sullivan 1999 (IT 2003)
  • [MM’02] Moulin and Mihcak (IP 2002)
  • [CL’01] Cohen and Lapidoth 2001 (IT 2002)
  • [FM’02] Feder and Merhav (IT 2002)
  • [SM’03] Somekh-Baruch and Merhav (IT 2003)
  • [SM’04] Somekh-Baruch and Merhav (IT 2004)
  • [M’03] Moulin (SSP 2003)
  • [MW’04] Moulin and Wang (ITW 2004)
