SLIDE 1

Statistics and Steganalysis

CSM25 Secure Information Hiding Dr Hans Georg Schaathun

University of Surrey

Spring 2009 – Week 2

Dr Hans Georg Schaathun Statistics and Steganalysis Spring 2009 – Week 2 1 / 54

SLIDE 2

Learning Outcomes

After this session, everyone should

  • have a basic understanding of statistical hypothesis testing
  • understand how statistical methods apply to steganography
  • be able to implement the basic χ² test of steganalysis
  • be able to interpret output from the χ² test

SLIDE 3

Suggested Reading

Core Reading
  • Cox et al., Chapter 13.

Suggested Reading
  • Gouri K. Bhattacharyya and Richard A. Johnson: Statistical Concepts and Methods (Wiley Series in Probability and Statistics).
  • Jessica Fridrich, Miroslav Goljan, David Soukal: «Higher-order statistical steganalysis of palette images», in Proc. SPIE Electronic Imaging, Jan 2003, pp. 178-190.

SLIDE 5

Visual Steganalysis The LSB plane

Outline

1 Visual Steganalysis
  • The LSB plane
  • The Histogram
  • Limitations

2 Statistics
  • Statistical models
  • Pairs of Values
  • I visual approach
  • Error types

3 Postlogue

SLIDE 6

Visual Steganalysis The LSB plane

The visual attack

Visual inspection is the simplest form of steganalysis
  • Consider the complete image
  • Extract the LSB plane (or other bit planes)
  • Histogram etc.

In Exercise 2, you studied LSB planes
  • What did you see?

These slides present some images I have inspected.
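Extracting a bit plane is a one-line operation per pixel. The following is a minimal sketch in Python rather than the Matlab used in the exercises; it assumes a grayscale image stored as a list of rows of 8-bit integers, and the function names `bit_plane`, `lsb_plane` and `render` are illustrative only.

```python
def bit_plane(image, k):
    """Return bit k (0 = least significant) of every pixel as a 0/1 matrix."""
    return [[(pixel >> k) & 1 for pixel in row] for row in image]


def lsb_plane(image):
    """The LSB plane is simply bit plane 0."""
    return bit_plane(image, 0)


def render(plane):
    """Crude text rendering for visual inspection: '#' for 1, '.' for 0."""
    return "\n".join("".join("#" if bit else "." for bit in row) for row in plane)


# Tiny hypothetical 2x3 grayscale image (values 0..255)
image = [[12, 13, 200],
         [201, 77, 0]]

lsb = lsb_plane(image)
msb = bit_plane(image, 7)
print(render(lsb))
```

For a real image, the rendered plane would be displayed as a black-and-white picture and inspected for structure.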

SLIDE 8

Visual Steganalysis The LSB plane

Structure in the Image (I)

Example from Wayner’s book

SLIDE 9

Visual Steganalysis The LSB plane

Structure in the Image (II)

Example from Wayner’s book

SLIDE 10

Visual Steganalysis The LSB plane

Comparing

Which is the stego-object?

Example uses EzStego (GIF).

SLIDE 11

Visual Steganalysis The LSB plane

A less good example (I)

  • Grayscale
  • Spatial-domain LSB
  • Any structure?

SLIDE 12

Visual Steganalysis The LSB plane

A less good example (II)

SLIDE 13

Visual Steganalysis The LSB plane

Structure in the message

What are the vertical stripes?
  • Structure? Of what?
  • No relation to the image, so it must relate to the message.
  • Conclusion: the plaintext is structured.

What if we had compressed the plaintext?

SLIDE 19

Visual Steganalysis The LSB plane

Why is the message structured?

Message as a 384x213 image
  • The stripes are there. Why?
  • How did we convert text to binary?

What has happened:
  • All first bits come first
  • All seventh bits come last
  • Bits 6-7 are often zero
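The effect of the bit ordering can be reproduced in a few lines. This is a sketch under stated assumptions: the slides do not say which bit width or bit numbering the exercise used, so 8-bit character codes with the least significant bit first are assumed here, and the function names are hypothetical.

```python
def bits_of(ch, width=8):
    """Bits of one character code, least significant bit first (assumed order)."""
    code = ord(ch)
    return [(code >> k) & 1 for k in range(width)]


def plane_by_plane(text, width=8):
    """All first bits of every character, then all second bits, and so on."""
    rows = [bits_of(ch, width) for ch in text]
    return [row[k] for k in range(width) for row in rows]


def char_by_char(text, width=8):
    """All bits of the first character, then all bits of the second, and so on."""
    return [bit for ch in text for bit in bits_of(ch, width)]


message = "hidden message"
planes = plane_by_plane(message)
chars = char_by_char(message)

# With plane-by-plane ordering, the final block holds the top bit of every
# character; for plain ASCII text that bit is always zero.
top_bits = planes[-len(message):]
print(top_bits)
```

Both orderings contain exactly the same bits. Only the plane-by-plane ordering groups the nearly constant high bits into long runs, which is the kind of structure that shows up as stripes.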

SLIDE 26

Visual Steganalysis The LSB plane

Different message structure

Character by character

Same message, ordered character by character. Is there structure?

  • Maybe. Definitely harder to exploit.

SLIDE 30

Visual Steganalysis The LSB plane

Can you detect it?

SLIDE 31

Visual Steganalysis The LSB plane

Conclusion

Message structures are visible in the stego-image. Many kinds of structure:
  • Ratio of 1s versus 0s.
  • Location of 0s and 1s.

Such structures disappear with compression.

Textbooks focus on the LSB plane of the cover image
  • Visible structures in the cover disappear in the stego-image
  • Less obvious if the message is randomly and sparsely distributed.

We are looking for the unusual.

SLIDE 33

Visual Steganalysis The Histogram

A typical image

  • Image histogram made by imhist in Matlab
  • Gives the number of pixels per colour value

SLIDE 34

Visual Steganalysis The Histogram

And a stego-image

SLIDE 36

Visual Steganalysis The Histogram

What happened?

Histogram of the stego-image:
  • More ragged
  • Every other bar sticks out. Why?
  • 50.8% 1s in the binary message.

SLIDE 41

Visual Steganalysis The Histogram

What is characteristic?

Pairs of values

Consider colour 2i (i = 0, 1, . . . , 127)

What happens under LSB embedding?
  • 2i → 2i or 2i + 1
  • Never 2i → 2i − 1.

Likewise 2i + 1 → 2i or 2i + 1

(2i, 2i + 1) is a Pair of Values
  • A pixel in (2i, 2i + 1) before embedding is still a pixel in (2i, 2i + 1) after embedding
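The pair invariance is easy to check by experiment. The sketch below assumes pixel values 0..255 in a flat list and sequential LSB replacement of the first pixels; `embed_lsb` and `pair_index` are illustrative names, not functions from the course material.

```python
def embed_lsb(pixels, bits):
    """Replace the LSB of the first len(bits) pixel values with the message bits."""
    stego = list(pixels)
    for n, bit in enumerate(bits):
        stego[n] = (stego[n] & ~1) | bit  # clear the LSB, then set it to the bit
    return stego


def pair_index(value):
    """Index i of the pair (2i, 2i + 1) that a pixel value belongs to."""
    return value >> 1


cover = [12, 13, 200, 201, 77, 0, 255, 128]
message = [1, 0, 1, 1, 0, 1, 0, 0]
stego = embed_lsb(cover, message)

# Every pixel stays inside its own pair of values
same_pairs = [pair_index(c) == pair_index(s) for c, s in zip(cover, stego)]
print(stego, all(same_pairs))
```

Since only the last bit changes, the upper seven bits (the pair index) are untouched, which is why the pair sums in the histogram survive embedding.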

SLIDE 45

Visual Steganalysis Limitations

Visual methods

Advantages and Limitations

Human perception is very flexible
  • can exploit the unexpected
  • you don’t have to know what you are looking for

Manual work
  • the process cannot be automated or computerised

How do you check a million images?

SLIDE 48

Statistics Statistical models

The remit of statistics

Statistics can estimate ‘normal’ behaviour
  • and compare behaviours

Advantages
  • Automated decisions
  • Extract detail
  • Exact, quantifiable features
  • Aggregate measures

SLIDE 52

Statistics Statistical models

The fundamental question

Wendy the Warden intercepts an image. Is the image a stegogramme?
  • Is it a probable natural image?
  • Is it a probable stegogramme?

The answer depends on a model for natural images
  • Statistical models and probability distributions

With a perfect model, one could design
  • a cipher whose ciphertexts are distributed as natural images

If Wendy has a better model than Alice and Bob,
  • then she can do effective steganalysis

In reality, we do not know what a natural image looks like

SLIDE 58

Statistics Pairs of Values

Pairs of Values

The statistic

Image $X$. Random variable $Y_k = \#\{(x, y) \mid X_{xy} = k\}$
  • The $Y_k$ together form the histogram.

Recall that $(2l, 2l + 1)$ is a pair of values.

The first 7 bits of a pixel are determined by the image colour,
  • i.e. which pair

The last bit (LSB) is determined by the message,
  • i.e. which half of the pair

SLIDE 59

Statistics Pairs of Values

Pairs of Values

Expected behaviour

The sum $Y_{2l} + Y_{2l+1}$ is unaffected by embedding.

For a steganogram with a random message,
  • expect a 50-50 split between $2l$ and $2l + 1$
  • i.e. $E(Y_{2l}) = E(Y_{2l+1}) = \tfrac{1}{2}(Y_{2l} + Y_{2l+1})$

For a natural image, what can we expect?

In a given image, we can observe $Y_{2l}$.
  • Is the observation probable?
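The observed counts and their expectations under embedding can be computed directly from the pixel values. A minimal sketch, assuming a flat list of 8-bit pixel values; the names `histogram` and `expected_under_embedding` are chosen for illustration.

```python
def histogram(pixels, levels=256):
    """Y_k = number of pixels with value k."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    return counts


def expected_under_embedding(counts):
    """For each pair l: E(Y_2l) = E(Y_2l+1) = (Y_2l + Y_2l+1) / 2."""
    return [(counts[2 * l] + counts[2 * l + 1]) / 2
            for l in range(len(counts) // 2)]


# Hypothetical pixel values: pair (4, 5) and pair (10, 11) are both unbalanced
pixels = [4, 4, 4, 5, 10, 11, 11, 11, 11, 11]
y = histogram(pixels)
e = expected_under_embedding(y)
print(y[4], y[5], e[2])    # observed counts and expectation for pair l = 2
print(y[10], y[11], e[5])  # observed counts and expectation for pair l = 5
```

In this toy input the observed counts are far from the expected even split, which is what one would look for in an image without a full random message.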

SLIDE 63

Statistics Pairs of Values

Hypothesis testing

The principle

We have two possible hypotheses

1. $H_0$: the image is a steganogram with a random message
  • Known distribution: $E(Y_{2l}) = E(Y_{2l+1}) = \tfrac{1}{2}(Y_{2l} + Y_{2l+1})$

2. $H_1$: the image is a natural image
  • Unknown distribution

Statistics allows us to answer
  • are the observed $Y_{2l}$ likely under $H_0$?

We cannot ask a similar question under $H_1$.

SLIDE 66

Statistics Pairs of Values

The χ² test

Statistical hypothesis tests exist for many purposes

The χ² test can
  • compare different distributions, i.e. the $H_0$ distribution and the observed distribution
  • aggregate several numbers, i.e. $Y_{2l}$ for every $l$

SLIDE 67

Statistics Pairs of Values

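The χ² statistic

Several random variables $F_0, F_1, \ldots, F_m$

Known expectations $E(F_0), E(F_1), \ldots, E(F_m)$

$$S = \sum_{i=0}^{m} \frac{(F_i - E(F_i))^2}{E(F_i)}$$

Definition

$$S_{\mathrm{PoV}} = \sum_{l=0}^{127} \frac{\left(Y_{2l} - \tfrac{1}{2}(Y_{2l} + Y_{2l+1})\right)^2}{\tfrac{1}{2}(Y_{2l} + Y_{2l+1})} = \sum_{l=0}^{127} \frac{\tfrac{1}{2}(Y_{2l} - Y_{2l+1})^2}{Y_{2l} + Y_{2l+1}}$$

$S$ is χ² distributed with $m$ degrees of freedom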
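The second form of the Pairs-of-Values statistic follows from the first by one line of algebra per term. A short derivation, using only the symbols already defined on this slide:

```latex
\begin{align*}
Y_{2l} - \tfrac{1}{2}(Y_{2l} + Y_{2l+1})
  &= \tfrac{1}{2}(Y_{2l} - Y_{2l+1}) \\
\frac{\left(\tfrac{1}{2}(Y_{2l} - Y_{2l+1})\right)^2}{\tfrac{1}{2}(Y_{2l} + Y_{2l+1})}
  &= \frac{\tfrac{1}{4}(Y_{2l} - Y_{2l+1})^2}{\tfrac{1}{2}(Y_{2l} + Y_{2l+1})}
   = \frac{\tfrac{1}{2}(Y_{2l} - Y_{2l+1})^2}{Y_{2l} + Y_{2l+1}}
\end{align*}
```

Each term therefore depends only on the difference and the sum of the two counts in a pair.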

SLIDE 68

Statistics Pairs of Values

Making Conclusions

If the observed $S$ is likely under the χ² distribution,
  • the assumed distribution (and thus $H_0$) is plausible

If the observed $S$ is unlikely under the χ² distribution,
  • $H_0$ is implausible

SLIDE 70

Statistics Pairs of Values

The χ² PDF

SLIDE 71

Statistics Pairs of Values

The Pairs-of-Values χ² Distribution

  • χ² PDF with 127 degrees of freedom
  • Red: 2% probability; + Green: 5%; + Blue: 10%
  • Cumulative distribution function (CDF): area under the curve

SLIDE 74

Statistics Pairs of Values

The p-value

Let $S$ be a stochastic χ² distributed variable
Let $s$ be the observed χ² statistic
Define the p-value: $p = P(S \ge s)$
I.e. low p-value ⇒ $s$ is unusually large
  • Improbable if the image is a stegogramme.
  • Conclusion: probably a natural image

SLIDE 75

Statistics Pairs of Values

p-value illustrated

  • We read the statistic (χ²) on the x-axis
  • The p-value is the area under the PDF to the right
  • Compute it with chi2cdf

SLIDE 76

Statistics Pairs of Values

Corrections

You may have to
  • exclude pixel values which do not occur
  • have at least four pixels of each pair of values used

This keeps the χ² distribution a good approximation

This reduces the degrees of freedom
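The corrections can be folded into the computation of the statistic. The sketch below is one possible reading, not the course's reference implementation: it assumes that "at least four pixels" refers to the total count of a pair, and that the degrees of freedom are the number of pairs used minus one (which gives 127 when all 128 pairs are used). The name `pov_statistic_corrected` is hypothetical.

```python
def pov_statistic_corrected(counts, min_count=4):
    """Pairs-of-Values statistic over pairs holding at least min_count pixels.

    counts is the histogram Y_0, Y_1, ...; returns (statistic, degrees of freedom).
    """
    statistic = 0.0
    pairs_used = 0
    for l in range(len(counts) // 2):
        even, odd = counts[2 * l], counts[2 * l + 1]
        total = even + odd
        if total < min_count:
            continue  # assumption: sparse or empty pairs are left out entirely
        statistic += 0.5 * (even - odd) ** 2 / total
        pairs_used += 1
    dof = max(pairs_used - 1, 0)  # assumption: one degree of freedom fewer than pairs
    return statistic, dof


# Hypothetical 8-level histogram: two usable pairs, one sparse pair, one empty pair
counts = [10, 10, 30, 10, 1, 0, 0, 0]
s, dof = pov_statistic_corrected(counts)
print(s, dof)
```

Skipping empty pairs also avoids a division by zero in the statistic.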

SLIDE 77

Statistics Pairs of Values

χ² in Matlab

Defined in the Statistics toolbox

Simplified functions available on website:
  • chi2pdf (the PDF)
  • chi2cdf(s,v): $P(S \le s)$ when $S \sim \chi^2(v)$
  • chi2inv(p,v): $s$ such that $P(S \le s) = p$

Note that the p-value is $P(S \ge s) = 1 - P(S \le s)$
  • use chi2cdf to calculate it
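Outside Matlab, the same quantity can be computed from the regularised incomplete gamma function, since a χ² variable with $v$ degrees of freedom has CDF $P(v/2, s/2)$. The following is a sketch using only the Python standard library; `chi2_cdf` and `p_value` are illustrative stand-ins for the role chi2cdf plays on this slide, and the series and continued-fraction evaluation is a standard textbook method, not taken from the course material.

```python
import math


def _gamma_p(a, x):
    """Regularised lower incomplete gamma function P(a, x) for a > 0."""
    if x <= 0.0:
        return 0.0
    log_prefactor = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1.0:
        # Series expansion, converges quickly when x < a + 1
        term = 1.0 / a
        total = term
        n = a
        for _ in range(10000):
            n += 1.0
            term *= x / n
            total += term
            if abs(term) < abs(total) * 1e-15:
                break
        return total * math.exp(log_prefactor)
    # Continued fraction (modified Lentz method) for Q(a, x) = 1 - P(a, x)
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, 10000):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < 1e-15:
            break
    return 1.0 - h * math.exp(log_prefactor)


def chi2_cdf(s, v):
    """P(S <= s) when S is chi-squared distributed with v degrees of freedom."""
    return _gamma_p(v / 2.0, s / 2.0)


def p_value(s, v):
    """P(S >= s) = 1 - P(S <= s), the p-value as defined on this slide."""
    return 1.0 - chi2_cdf(s, v)


print(p_value(127.0, 127))
```

With two degrees of freedom the CDF has the closed form $1 - e^{-s/2}$, which gives a convenient sanity check for any implementation.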

SLIDE 80

Statistics I visual approach

Part-image

The χ² statistic is effective when the image is full of hidden information
  • What happens if only a small part is used?

Basic LSB embedding uses the first N pixels
  • We calculate the χ² statistic and the p-value for every N

The result can be plotted
  • use plot or fplot in Matlab
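The idea of evaluating the statistic on growing prefixes of the image can be sketched as follows. The code assumes the pixels are given as a flat list in embedding order and only computes the χ² statistic; the p-value would follow from the χ² CDF as described on the Matlab slide. Function names and the `step` parameter are illustrative.

```python
def pov_statistic(pixels, levels=256):
    """Pairs-of-Values chi-squared statistic of a list of pixel values."""
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    statistic = 0.0
    for l in range(levels // 2):
        even, odd = counts[2 * l], counts[2 * l + 1]
        if even + odd > 0:  # skip pairs that do not occur
            statistic += 0.5 * (even - odd) ** 2 / (even + odd)
    return statistic


def running_statistic(pixels, step):
    """Statistic of the first N pixels for N = step, 2 * step, ..."""
    return [(n, pov_statistic(pixels[:n]))
            for n in range(step, len(pixels) + 1, step)]


# Hypothetical data: 8 pixels with balanced LSBs followed by 8 unbalanced ones
pixels = [10, 11] * 4 + [10] * 8
curve = running_statistic(pixels, step=8)
print(curve)
```

Recomputing the histogram for every N is wasteful but keeps the sketch short; an incremental histogram update would be the natural optimisation. The resulting (N, statistic) pairs are what would be plotted.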

SLIDE 81

Statistics I visual approach

Plots

No message (figure: χ² statistic and p-value)

SLIDE 82

Statistics I visual approach

Plots

30% of capacity (figure: χ² statistic and p-value)

SLIDE 83

Statistics I visual approach

Plots

60% of capacity (figure: χ² statistic and p-value)

SLIDE 84

Statistics I visual approach

Plots

100% of capacity (figure: χ² statistic and p-value)

SLIDE 86

Statistics Error types

Classification errors

Steganalysis is a binary classification problem
  • identify an unknown object (image) as either suspicious or innocent

Two error types
  • False positive: an innocent image wrongly accused
  • False negative: a «guilty» image not identified

Which type is most severe?
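Given a set of test images with known status, the two error rates can be counted directly. A minimal sketch, assuming the ground truth and the steganalyser's verdicts are lists of booleans (True meaning stegogramme and flagged as suspicious respectively); the name `error_rates` is hypothetical.

```python
def error_rates(truth, verdict):
    """Return (false positive rate, false negative rate).

    truth[i]   is True if image i really is a stegogramme.
    verdict[i] is True if the steganalyser flags image i as suspicious.
    """
    false_pos = sum(1 for t, v in zip(truth, verdict) if not t and v)
    false_neg = sum(1 for t, v in zip(truth, verdict) if t and not v)
    innocent = sum(1 for t in truth if not t)
    guilty = len(truth) - innocent
    fp_rate = false_pos / innocent if innocent else 0.0
    fn_rate = false_neg / guilty if guilty else 0.0
    return fp_rate, fn_rate


# Hypothetical test set: four natural images followed by two stegogrammes
truth = [False, False, False, False, True, True]
verdict = [False, True, False, False, True, False]
fp_rate, fn_rate = error_rates(truth, verdict)
print(fp_rate, fn_rate)
```

Here one of four innocent images is wrongly accused and one of two stegogrammes is missed.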

SLIDE 94

Statistics Error types

Hypothesis testing and errors

Hypothesis testing is a recurring theme in statistics. Typical null hypotheses

Treatment A makes patients recover no more quickly than no treatment. The climate in South-East Britain is as warm/cold today as it was a 100 years ago. The image sent by Alice is a natural (innocent) image.

When the hypothesis has been phrased,

experiments can tell us whether it is plausible or not.

Wrongly accepting the null hypothesis is the least serious error

Dr Hans Georg Schaathun Statistics and Steganalysis Spring 2009 – Week 2 49 / 54

slide-95
SLIDE 95

Statistics Error types

Hypothesis testing and errors

Hypothesis testing is a recurring theme in statistics. Typical null hypotheses

Treatment A makes patients recover no more quickly than no treatment. The climate in South-East Britain is as warm/cold today as it was a 100 years ago. The image sent by Alice is a natural (innocent) image.

When the hypothesis has been phrased,

experiments can tell us whether it is plausible or not.

Wrongly accepting the null hypothesis is the least serious error

Dr Hans Georg Schaathun Statistics and Steganalysis Spring 2009 – Week 2 49 / 54

slide-96
SLIDE 96

Statistics Error types

Asymmetry of hypothesis testing

Alternative hypothesis: Treatment A makes patients recover more quickly than no treatment.

One error is more serious than the other.

Type I: Accepting the alternative hypothesis when it is wrong (rejecting a true H0)

Patients get ineffective (or unhealthy) medicine.

Type II: Rejecting the alternative hypothesis when it is right (retaining a false H0)

More research will be done to optimise the treatment.

            H0 retained     H0 rejected
H0 true     No error        Error Type I
H0 false    Error Type II   No error
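
The decision table above can be sketched in a few lines of Python. This is only an illustration: the function names `outcome` and `error_rates` are hypothetical and not part of the course material, and the trials are assumed to be pairs of booleans (whether H0 is true, whether the test rejected H0).

```python
from collections import Counter


def outcome(h0_true: bool, h0_rejected: bool) -> str:
    """Classify one test decision according to the table of error types."""
    if h0_true and h0_rejected:
        return "Type I error"    # a true H0 was rejected
    if not h0_true and not h0_rejected:
        return "Type II error"   # a false H0 was retained
    return "No error"


def error_rates(trials):
    """Return (Type I rate, Type II rate) for (h0_true, h0_rejected) pairs.

    Each rate is relative to the number of trials where that error was
    possible; it is None when there were no such trials.
    """
    counts = Counter()
    n_true = n_false = 0
    for h0_true, h0_rejected in trials:
        counts[outcome(h0_true, h0_rejected)] += 1
        if h0_true:
            n_true += 1
        else:
            n_false += 1
    type1 = counts["Type I error"] / n_true if n_true else None
    type2 = counts["Type II error"] / n_false if n_false else None
    return type1, type2


if __name__ == "__main__":
    trials = [(True, True), (True, False), (False, False), (False, True)]
    print(error_rates(trials))
```

Which of the two rates corresponds to a "false positive" depends on how H0 is phrased, so the labels should be read against the null hypothesis actually used.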

slide-101
SLIDE 101

Statistics Error types

The weirdness of steganalysis

H0: The message is a steganogram.

We (implicitly) consider it serious to declare the message innocent when it is a steganogram. Why?

This makes for a strong surveillance regime.

It might be appropriate for the prison scenario.

Real reason

The probability distribution is known only for steganograms.

We require a known distribution under H0.
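
A minimal sketch of why the distribution is known under this H0, assuming full-capacity LSB embedding in 8-bit sample values: embedding then makes the two values of each pair (2k, 2k+1) roughly equally frequent, so the expected count for each value is half the pair total. The function names `pov_chi2` and `chi2_sf_approx` are hypothetical. Empty pairs are simply skipped, which is a simplifying assumption (pooling of low-count pairs is omitted), and the tail probability uses the Wilson-Hilferty normal approximation in place of the exact χ2 distribution that a statistics library would provide.

```python
import math
from collections import Counter


def pov_chi2(samples, levels=256):
    """Return (statistic, degrees of freedom) for the pairs-of-values test."""
    hist = Counter(samples)
    stat = 0.0
    pairs_used = 0
    for k in range(levels // 2):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2  # under H0 both values of a pair are equally likely
        if expected > 0:             # assumption: skip pairs that never occur
            stat += (even - expected) ** 2 / expected
            pairs_used += 1
    return stat, pairs_used - 1


def chi2_sf_approx(x, df):
    """Approximate P(X >= x) for a chi-square variable (Wilson-Hilferty)."""
    if df <= 0:
        raise ValueError("need at least one degree of freedom")
    var = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - var)) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    # Toy data: equalised pairs (stego-like) versus unequal pairs.
    stego_like = [0, 1] * 50 + [2, 3] * 50 + [4, 5] * 50
    unequal = [0] * 90 + [1] * 10 + [2] * 20 + [3] * 80 + [4] * 50 + [5] * 50
    for name, data in (("stego-like", stego_like), ("unequal", unequal)):
        stat, df = pov_chi2(data)
        print(name, stat, df, chi2_sf_approx(stat, df))
```

With H0 phrased this way, a p-value close to 1 means H0 cannot be rejected and the message is treated as a steganogram, while a p-value close to 0 rejects H0 and the message is declared innocent.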


slide-102
SLIDE 102

Postlogue

Outline

1. Visual Steganalysis

2. Statistics

3. Postlogue


slide-103
SLIDE 103

Postlogue

Randomised location

PoV assumes embedding in consecutive bits

Generalised χ2 proposes a fix

Fridrich et al. (2003) suggest an implementation

No rigid hypothesis test or statistical theory

works experimentally


slide-104
SLIDE 104

Postlogue

Summary

Steganalysis can be cast as a problem of statistics

standard statistical theory applies

The Pairs-of-Values χ2 test is a simple example

The weekly exercise is to implement and test this steganalysis technique.

See website for detailed assignment.