Walls Have Ears: Traffic-based Side-channel Attack in Video Streaming



slide-1
SLIDE 1

Walls Have Ears: Traffic-based Side-channel Attack in Video Streaming

Jiaxi Gu∗, Jiliang Wang†, Zhiwen Yu∗, Kele Shen†

∗ Northwestern Polytechnical University, P. R. China

† Tsinghua University, P. R. China
slide-2
SLIDE 2

Outline

๏ Background & Motivation
๏ Objective
๏ Methodology
๏ Experiments
๏ Conclusion
slide-3
SLIDE 3

Background & Motivation

slide-4
SLIDE 4

Booming Video Industry

Globally, IP video traffic will be 82 percent of all consumer Internet traffic by 2021, up from 73 percent in 2016.

— Cisco VNI. “Forecast and methodology, 2016-2021, white paper.” (2017)

slide-5
SLIDE 5

“Walls have ears”

[Figure: Server: stores video data. Client: fetches video data by video streaming. Attacker: has video fingerprints and monitors the network traffic.]
slide-6
SLIDE 6

Why does it matter?

  • 1. Users’ watchlists can be obtained by malicious adversaries.
  • 2. ISP or enterprise administrators can spy on their customers or employees.
  • 3. The profits of companies providing streaming services can be damaged.
slide-7
SLIDE 7

What makes it worse!

  • 1. Traffic-based video identification is ubiquitous despite data encryption.
  • 2. Video streaming has a longer life cycle than other online services, e.g., web browsing.
  • 3. Variable bitrate encoding and segmentation make video streams identifiable.
slide-8
SLIDE 8

Objective

slide-9
SLIDE 9

Objective

๏ An eavesdropped traffic trace, normally covering a period of time during video streaming, gives a traffic pattern.
๏ Downloaded video files give the fingerprints of videos.
๏ The traffic pattern is compared with the fingerprints by shape matching, i.e. by calculating a distance (similarity).
slide-10
SLIDE 10

VBR

VBR (Variable Bit-Rate encoding)

The data amount per time slot changes owing to VBR. The bitrate variation trends show similar patterns between different quality levels.
slide-11
SLIDE 11

DASH

DASH (Dynamic Adaptive Streaming over HTTP)

๏ Encoding: Videos are encoded in multiple quality levels.
๏ Segmenting: Video copies of multiple qualities are chunked into segments.
๏ Streaming: Video segments in adaptive quality levels are transmitted in order.
slide-12
SLIDE 12

DASH

[Figure: the client requests and the server replies with segments over time; the quality level of the downloaded segments (low to high) follows the available bandwidth (500, 1000, 1500 kbps).]

๏ 😋 Transmitted segments are fixed-length and in order.
๏ 😕 The quality level is adaptively switched while streaming.
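The adaptive switching described above can be illustrated with a minimal sketch. It assumes a simple rate-based rule (pick the highest quality level that fits the measured bandwidth), which is only one of many possible DASH strategies and is not claimed to be the one used in this work. The names `pick_quality` and `stream` are hypothetical.

```python
# Hypothetical rate-based quality selection; real DASH players use more
# elaborate strategies (buffer-based, hybrid, ...).
QUALITY_LEVELS_KBPS = [500, 1000, 1500]  # levels shown in the slide's figure


def pick_quality(bandwidth_kbps, levels=QUALITY_LEVELS_KBPS):
    """Return the highest level not exceeding the bandwidth, else the lowest level."""
    affordable = [level for level in levels if level <= bandwidth_kbps]
    return max(affordable) if affordable else min(levels)


def stream(bandwidth_trace_kbps):
    """Request segments in order, one per bandwidth sample.

    Returns (segment_index, quality_level) pairs. The segment index always
    increases by one, while the quality level may change between segments.
    """
    return [(i, pick_quality(bw)) for i, bw in enumerate(bandwidth_trace_kbps)]


if __name__ == "__main__":
    # Made-up bandwidth samples, one per segment.
    print(stream([1600, 1200, 400, 2500]))
```

The segment order stays fixed while the level varies, which is the property the two bullets above contrast.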

slide-13
SLIDE 13

Methodology

slide-14
SLIDE 14

Network Traffic of DASH

๏ Video segments are transmitted in order.
๏ Video segment length is fixed while streaming.
๏ Traffic pattern owing to VBR is preserved.

[Figure: network traffic of streaming 3 different videos.]
slide-15
SLIDE 15

Segment Aggregation

[Figure: network traffic data amount per second, throughput (Mbps) over time (s). Bits per second b_t are aggregated into bits per period p_i. Annotations: "Exceed maximum period time τ" and "Data amount less than ε".]

๏ One segment may take several seconds to transmit.
๏ There are gaps between video segment transmissions.
๏ Noise is removed by thresholding.
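A minimal sketch of the aggregation step follows. The slide only names the symbols b_t, p_i, τ and ε, so the exact rule is an assumption here: seconds with a data amount below ε are treated as gaps or noise and close the current period, and a period is also closed once it has lasted τ seconds. The function name `aggregate` is hypothetical.

```python
def aggregate(bits_per_second, epsilon, tau):
    """Group per-second data amounts b_t into per-period totals p_i.

    Assumed rule (not confirmed by the slides):
      * a second with b_t < epsilon is a gap or noise: it is dropped and
        closes the period that is currently open;
      * a period is also closed once it spans tau seconds.
    """
    periods = []
    current, length = 0, 0
    for b in bits_per_second:
        if b < epsilon:
            if length > 0:
                periods.append(current)
            current, length = 0, 0
            continue
        current += b
        length += 1
        if length >= tau:
            periods.append(current)
            current, length = 0, 0
    if length > 0:  # flush a period still open at the end of the trace
        periods.append(current)
    return periods


if __name__ == "__main__":
    # Made-up trace: two bursts separated by a gap.
    print(aggregate([0, 5, 7, 0, 0, 1, 9, 9, 9, 2], epsilon=2, tau=2))
    # -> [12, 18, 11]
```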
slide-16
SLIDE 16

Bitrate Differential

[Figure: an example of the data amount sequence of video segments at 500, 1000, 1500, and 2000 kbps.]

$$ s'_i = \frac{s_i - s_{i-1}}{s_{i-1}} $$
slide-17
SLIDE 17

Normalization

Min-max-normalization:

$$ M(x_i) = \frac{x_i - \min(x)}{\max(x) - \min(x)} $$

Z-normalization:

$$ Z(x_i) = \frac{x_i - \mu}{\sigma} $$

Sigmoid-normalization:

$$ S(x_i) = \frac{1}{1 + e^{-x_i}} $$
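The three normalizations can be sketched directly from their formulas. The slides do not say whether σ is the population or the sample standard deviation; the population form (`statistics.pstdev`) is assumed here. The function names are illustrative.

```python
import math
from statistics import mean, pstdev


def min_max_normalize(x):
    """Map values linearly onto [0, 1]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        raise ValueError("constant sequence: min-max normalization is undefined")
    return [(v - lo) / (hi - lo) for v in x]


def z_normalize(x):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(x), pstdev(x)
    if sigma == 0:
        raise ValueError("constant sequence: z-normalization is undefined")
    return [(v - mu) / sigma for v in x]


def sigmoid_normalize(x):
    """Squash each value into (0, 1) with the logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


if __name__ == "__main__":
    data = [2, 4, 6]
    print(min_max_normalize(data))  # [0.0, 0.5, 1.0]
    print(z_normalize(data))
    print(sigmoid_normalize([0.0]))  # [0.5]
```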
slide-18
SLIDE 18

Video Fingerprinting

[Figure: fingerprinting pipeline with the stages Segmentation, Aggregation, Differential, and Normalization.]
slide-19
SLIDE 19

Distance Calculation

[Figure: video fingerprints compared with a traffic pattern.]

๏ Eavesdropping can hardly start from the beginning.
๏ It is time-consuming to eavesdrop the entire video.
slide-20
SLIDE 20

Dynamic Time Warping (DTW)

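[Figure: DTW matrix of sequence X_{1..M} against sequence Y_{1..N}; cell d(i, j) is reached from d(i, j-1), d(i-1, j-1), or d(i-1, j). Step pattern: insertion, match, deletion.]

$$ d(i, j) = \lVert X_i - Y_j \rVert + \min \left\{ \begin{array}{l} d(i, j-1) \\ d(i-1, j-1) \\ d(i-1, j) \end{array} \right. $$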
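The recurrence can be sketched as a small dynamic program. This is a textbook DTW for one-dimensional sequences, with the absolute difference standing in for the norm; it is an illustration, not the authors' implementation.

```python
def dtw(x, y):
    """Classic DTW distance between two 1-D sequences.

    d[i][j] is the cost of the best alignment of x[:i] with y[:j].
    Both sequences must be matched from their first to their last element.
    """
    m, n = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(x[i - 1] - y[j - 1])
            # the three predecessors of the step pattern
            d[i][j] = cost + min(d[i][j - 1], d[i - 1][j - 1], d[i - 1][j])
    return d[m][n]


if __name__ == "__main__":
    print(dtw([1, 2, 3], [1, 2, 2, 3]))  # 0.0: the repeated 2 is absorbed by warping
    print(dtw([1, 2, 3], [2, 3, 4]))     # 2.0
```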
slide-21
SLIDE 21

Partial Sequence Problem

We need to relax the constraint of matching each pair of elements to support partial matching between sequences.

slide-22
SLIDE 22

P(artial)-DTW

  • Query sequence (traffic pattern): X = (X_1, X_2, …, X_M)
  • Template sequence (video fingerprints): Y = (Y_1, Y_2, …, Y_N)

$$ f_{p\text{-}dtw}(X_{1..M}, Y_{1..N}) = \min_{1 \le p \le q \le N} D(X_{1..M}, Y_{p..q}) $$
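A minimal sketch of this definition, assuming D is the classic DTW distance with the insertion/match/deletion step pattern. `p_dtw_bruteforce` evaluates the minimum over all p ≤ q literally. `p_dtw` obtains the same value in one pass by letting the alignment start and end anywhere in Y, a standard subsequence-DTW technique that the slides do not necessarily use.

```python
INF = float("inf")


def dtw(x, y):
    """Classic DTW distance D(x, y) over full sequences."""
    m, n = len(x), len(y)
    d = [[INF] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                d[i][j - 1], d[i - 1][j - 1], d[i - 1][j])
    return d[m][n]


def p_dtw_bruteforce(x, y):
    """Literal form of the definition: try every contiguous slice y[p..q]."""
    n = len(y)
    return min(dtw(x, y[p:q + 1]) for p in range(n) for q in range(p, n))


def p_dtw(x, y):
    """Same value in O(M*N): free start and free end along the template y."""
    m, n = len(x), len(y)
    d = [[INF] * (n + 1) for _ in range(m + 1)]
    d[0] = [0.0] * (n + 1)   # the match may start at any position of y
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = abs(x[i - 1] - y[j - 1]) + min(
                d[i][j - 1], d[i - 1][j - 1], d[i - 1][j])
    return min(d[m][1:])     # the match may end at any position of y


if __name__ == "__main__":
    query = [5, 6, 7]                 # made-up partial traffic pattern
    template = [1, 2, 5, 6, 7, 3]     # made-up full-length fingerprint
    print(p_dtw(query, template))     # 0.0: the query occurs inside the template
```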
slide-23
SLIDE 23

Normalized Distance

[Figure: DTW matrix of sequence X_{1..M} against sequence Y_{p..q}; step pattern in which d(i, j) is reached from d(i-1, j), d(i-1, j-1), or d(i-1, j-2).]

Normalized distance: d / M
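One way to read this slide is sketched below, as an assumption rather than the authors' code. Every step of the pattern advances the query index i by one, so each warping path contains exactly M cells, and dividing the accumulated distance d by M gives an average cost per query element that is comparable across queries of different lengths. The free start and end along the template are also assumed, and the function name is hypothetical.

```python
INF = float("inf")


def normalized_p_dtw(x, y):
    """Partial matching of query x inside template y, normalized by len(x).

    Assumed step pattern: d(i, j) is reached from d(i-1, j), d(i-1, j-1)
    or d(i-1, j-2). Each step consumes exactly one query element.
    """
    m, n = len(x), len(y)
    # first query element: the match may start at any template position
    prev = [abs(x[0] - yj) for yj in y]
    for i in range(1, m):
        cur = []
        for j in range(n):
            best = min(
                prev[j],                         # template index stays
                prev[j - 1] if j >= 1 else INF,  # template advances by one
                prev[j - 2] if j >= 2 else INF,  # template skips one element
            )
            cur.append(abs(x[i] - y[j]) + best)
        prev = cur
    return min(prev) / m  # free end, then normalize by the path length M


if __name__ == "__main__":
    print(normalized_p_dtw([5, 6, 7], [1, 5, 6, 7, 2]))  # 0.0
    print(normalized_p_dtw([1, 3], [1, 2, 3]))           # 0.0: the 2 is skipped
    print(normalized_p_dtw([1, 1], [2]))                 # 1.0
```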
slide-24
SLIDE 24

Experiments

slide-25
SLIDE 25

Experimental Settings

๏ 200 videos for fingerprinting.
๏ 12 out of 200 videos for streaming.
๏ Quality levels: 500, 1000, 1500, 2000 kbps.
๏ Segment lengths: 4, 6, 8 seconds.
slide-26
SLIDE 26

Discriminability

[Figure: P-DTW distance (dist) of matched and unmatched pairs, plotted against segment length (4, 6, 8 s), eavesdropping time (60 to 180 s), and video index (1 to 12).]

The P-DTW distance separates matched from unmatched videos well across segment lengths, eavesdropping times, and videos.
slide-27
SLIDE 27

Distance Threshold

[Figure: false positive and false negative rates against the distance threshold (0.010 to 0.040), and the normalized similarity distance of matched and unmatched pairs for DTW, MVM, and P-DTW.]

๏ P-DTW separates matched from unmatched pairs more clearly than DTW and MVM.
๏ The threshold is calculated accordingly.
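The slides do not state how the threshold is derived from the false rates. The sketch below assumes one plausible criterion: sweep candidate thresholds and keep the one that minimizes the sum of the false positive and false negative rates. All distances in the example are made up for illustration and are not results from this work.

```python
def false_rates(matched, unmatched, threshold):
    """Return (false_positive_rate, false_negative_rate) for one threshold.

    A pair is declared a match when its distance is <= threshold.
    """
    false_neg = sum(d > threshold for d in matched) / len(matched)
    false_pos = sum(d <= threshold for d in unmatched) / len(unmatched)
    return false_pos, false_neg


def pick_threshold(matched, unmatched, candidates):
    """Assumed criterion: minimize false positive rate + false negative rate."""
    return min(candidates, key=lambda t: sum(false_rates(matched, unmatched, t)))


if __name__ == "__main__":
    # Illustrative distances only.
    matched = [0.010, 0.015, 0.020, 0.030]
    unmatched = [0.035, 0.040, 0.050, 0.060]
    candidates = [0.01, 0.02, 0.03, 0.04]
    print(pick_threshold(matched, unmatched, candidates))  # 0.03
```

Other criteria, such as bounding the false positive rate first, would fit the same sweep by changing the `key` function.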
slide-28
SLIDE 28

Accuracy

๏ Accuracy is greatly influenced by segment length.
๏ Video quality level has a limited impact.
slide-29
SLIDE 29

Different Videos

[Figure: final performance of our method when streaming 12 videos with DASH.]
slide-30
SLIDE 30

Conclusion

slide-31
SLIDE 31

Conclusion

Contributions

๏ A differential bitrate pattern extraction method.
๏ An effective shape matching method for identifying videos.
๏ Considerable accuracy given sufficient eavesdropping time.

Future Work

๏ More work needs to be done with various encoders and DASH strategies.
๏ Countermeasures that take network efficiency and video QoE into account are worth studying.
slide-32
SLIDE 32

Fin.

gujiaxi@mail.nwpu.edu.cn