Walls Have Ears: Traffic-based Side-channel Attack in Video Streaming
Jiaxi Gu∗, Jiliang Wang†, Zhiwen Yu∗, Kele Shen†
∗ Northwestern Polytechnical University, P.R. China
† Tsinghua University, P.R. China
Outline
๏ Background & Motivation
๏ Objective
๏ Methodology
๏ Experiments
๏ Conclusion
Background & Motivation
Booming Video Industry
Globally, IP video traffic will be 82 percent of all consumer Internet traffic by 2021, up from 73 percent in 2016.
— Cisco VNI. “Forecast and methodology, 2016-2021, white paper.” (2017)
“Walls have ears”
Why does it matter?
malicious adversaries.
their customers or employees.
services can be damaged.
What makes it worse!
ubiquitous despite data encryption.
make video streams identifiable.
Objective
An eavesdropped traffic trace
A traffic pattern
Downloaded video files
Fingerprints
Shape matching by calculating distance (similarity)
Normally for a period
VBR
VBR (Variable Bit-Rate encoding)
Data amount per time slot changes owing to VBR. The bitrate variation trends show similar patterns between different quality levels.
DASH
DASH (Dynamic Adaptive Streaming over HTTP)
๏ Encoding: Videos are encoded in multiple quality levels.
๏ Segmenting: Video copies of multiple qualities are chunked into segments.
๏ Streaming: Video segments in adaptive quality levels are transmitted in order.
DASH
[Figure: client requests and server replies over time; bandwidth (500, 1000, 1500 kbps) and quality levels (low to high) of the downloaded segments.]
๏ 😋 Transmitted segments are fixed-length and in order.
๏ 😕 The quality level is switched adaptively while streaming.
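The adaptive switching can be illustrated with a toy rate selector. This is only a minimal sketch: it assumes a simple throughput-based rule (pick the highest quality level whose bitrate does not exceed the measured bandwidth), whereas real DASH players use their own, often more elaborate, adaptation logic. The names `QUALITY_LEVELS_KBPS` and `pick_quality` and the bandwidth trace are hypothetical.

```python
# Hypothetical quality ladder, mirroring the kbps values shown on the slide.
QUALITY_LEVELS_KBPS = [500, 1000, 1500]


def pick_quality(bandwidth_kbps, levels=QUALITY_LEVELS_KBPS):
    """Return the highest bitrate not exceeding the bandwidth (assumed rule)."""
    affordable = [b for b in sorted(levels) if b <= bandwidth_kbps]
    # Fall back to the lowest level when even that exceeds the bandwidth.
    return affordable[-1] if affordable else min(levels)


# Segments are still requested in order; only the quality level changes.
bandwidth_trace = [1600, 1200, 400, 900, 1500]  # illustrative values
print([pick_quality(bw) for bw in bandwidth_trace])
```

Whatever rule the player uses, the eavesdropper sees fixed-length segments arriving in order, which is what the rest of the method relies on.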
Methodology
Network Traffic of DASH
๏ Video segments are transmitted in order.
๏ Video segment length is fixed while streaming.
๏ Traffic pattern owing to VBR is preserved.
Network traffic of streaming 3 different videos.
Segment Aggregation
[Figure: traffic over time (s) with a threshold line (Thr).]
๏ One segment may take several seconds to transmit.
๏ There are gaps between video segment transmissions.
๏ Noise is removed by a threshold.
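The aggregation step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the eavesdropped trace has already been binned into byte counts per fixed time slot, that any slot at or below the threshold counts as noise or a gap, and that consecutive above-threshold slots belong to one segment. The function name and the sample values are hypothetical.

```python
def aggregate_segments(slots, thr):
    """Sum per-slot byte counts into per-segment totals.

    slots: byte counts per fixed time slot (e.g. one value per second).
    thr:   slots at or below this value are treated as noise or gaps.
    """
    segments, current = [], 0
    for amount in slots:
        if amount > thr:
            current += amount          # still inside one segment burst
        elif current:
            segments.append(current)   # a gap closes the current segment
            current = 0
    if current:
        segments.append(current)       # the trace ended in mid-burst
    return segments


trace = [0, 900, 1100, 20, 0, 700, 800, 600, 10, 1500, 0]  # illustrative
print(aggregate_segments(trace, thr=50))  # [2000, 2100, 1500]
```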
Bitrate Differential
[Figure: data amount per segment at 500, 1000, 1500, and 2000 kbps.]
An example of data amount sequence in video segments.
$$ s'_i = \frac{s_i - s_{i-1}}{s_{i-1}} $$
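One reading of the differential is the relative change between consecutive segment sizes, s'_i = (s_i - s_{i-1}) / s_{i-1}. The sketch below follows that reading. It also assumes, purely for illustration, that a higher quality level scales every segment by the same factor, in which case the differential sequence is unchanged. The function name and the sample values are hypothetical.

```python
def differential(seq):
    """Relative change between consecutive segment sizes."""
    return [(cur - prev) / prev for prev, cur in zip(seq, seq[1:])]


low = [400.0, 500.0, 450.0, 600.0]   # illustrative segment sizes
high = [2 * s for s in low]          # same video, idealized higher quality

print(differential(low))
print(differential(high))
```

In real encodings the scaling is only approximate, so the two sequences would be similar rather than identical.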
Normalization
Min-max-normalization: $M(x_i) = \frac{x_i - \min(x)}{\max(x) - \min(x)}$
Z-normalization: $Z(x_i) = \frac{x_i - \mu}{\sigma}$
Sigmoid-normalization: $S(x_i) = \frac{1}{1 + e^{-x_i}}$
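The three normalizations can be written directly from their definitions. This is a minimal sketch using only the standard library. It assumes a non-constant input sequence (otherwise min-max and Z-normalization divide by zero) and uses the population standard deviation for σ, which the slide does not specify.

```python
import math
from statistics import mean, pstdev


def min_max_norm(x):
    """Scale values linearly into [0, 1]."""
    lo, hi = min(x), max(x)
    return [(v - lo) / (hi - lo) for v in x]


def z_norm(x):
    """Shift to zero mean and scale to unit (population) std, an assumption."""
    mu, sigma = mean(x), pstdev(x)
    return [(v - mu) / sigma for v in x]


def sigmoid_norm(x):
    """Squash each value into (0, 1) with the logistic function."""
    return [1.0 / (1.0 + math.exp(-v)) for v in x]


diffs = [-0.5, 0.0, 0.5, 1.0]  # illustrative differential values
print(min_max_norm(diffs))
print(z_norm(diffs))
print(sigmoid_norm(diffs))
```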
Video Fingerprinting
[Diagram: processing blocks labeled Segmentation, Normalization, Aggregation, and Differential.]
Distance Calculation
[Figure: video fingerprints compared against a traffic pattern.]
๏ Eavesdropping can hardly start from the beginning of a video.
๏ It is time-consuming to eavesdrop on the entire video.
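Dynamic Time Warping (DTW)
[Figure: DTW matrix aligning Sequence X_{1..M} with Sequence Y_{1..N}; step pattern from d(i-1, j), d(i-1, j-1), and d(i, j-1) to d(i, j).]
$$ d(i, j) = \lVert X_i - Y_j \rVert + \min \{\, d(i, j-1),\; d(i-1, j-1),\; d(i-1, j) \,\} $$
The three terms correspond to insertion, match, and deletion, respectively.
Partial Sequence Problem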
We need to relax the constraint of matching each pair of elements to support partial matching between sequences.
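To make the constraint concrete, classic DTW can be sketched as below: the cost at each cell is the local distance plus the cheapest of the three neighbouring cells (insertion, match, deletion), and the path must run from the first pair of elements to the last. This is a minimal sketch that assumes scalar sequences with absolute difference as the local distance; the sample sequences are hypothetical.

```python
def dtw_distance(x, y):
    """Classic DTW: every element of x and of y must be aligned."""
    m, n = len(x), len(y)
    inf = float("inf")
    d = [[inf] * (n + 1) for _ in range(m + 1)]
    d[0][0] = 0.0                      # the path must start at the first pair
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(x[i - 1] - y[j - 1])
            d[i][j] = cost + min(d[i][j - 1],      # insertion
                                 d[i - 1][j - 1],  # match
                                 d[i - 1][j])      # deletion
    return d[m][n]                     # ...and end at the last pair


fingerprint = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]  # illustrative values
partial = fingerprint[2:5]                     # a trace starting mid-video

print(dtw_distance(fingerprint, fingerprint))
print(dtw_distance(partial, fingerprint))
```

Even though `partial` is an exact excerpt of `fingerprint`, the second distance is nonzero, because the unobserved head and tail of the fingerprint still have to be matched to something.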
P(artial)-DTW
$$ f_{p\text{-}dtw}(X_{1..M}, Y_{1..N}) = \min_{1 \le p \le q \le N} D(X_{1..M}, Y_{p..q}) $$
Normalized Distance
[Figure: matrix aligning Sequence X_{1..M} with subsequence Y_{p..q}; step pattern from d(i-1, j), d(i-1, j-1), and d(i-1, j-2) to d(i, j).]
Normalized distance: d / M
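Partial matching and the d / M normalization can be sketched together. This is one interpretation of the slides, not the authors' code: X must be matched in full, the match may start and end anywhere in Y, each step advances X by exactly one element (from d(i-1, j), d(i-1, j-1), or d(i-1, j-2)), and the final cost is divided by M. It assumes non-empty scalar sequences with absolute difference as the local distance; the sample values are hypothetical.

```python
def p_dtw_distance(x, y):
    """Normalized distance between x and its best-matching stretch y[p..q]."""
    m, n = len(x), len(y)
    inf = float("inf")
    # prev[j]: cheapest alignment of x[0..i-1] with a stretch ending at y[j].
    prev = [abs(x[0] - yj) for yj in y]        # free start: any p is allowed
    for i in range(1, m):
        cur = [inf] * n
        for j in range(n):
            best = prev[j]                     # d(i-1, j)
            if j >= 1:
                best = min(best, prev[j - 1])  # d(i-1, j-1)
            if j >= 2:
                best = min(best, prev[j - 2])  # d(i-1, j-2)
            cur[j] = abs(x[i] - y[j]) + best
        prev = cur
    return min(prev) / m                       # free end: any q; divide by M


fingerprint = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7]  # illustrative values
partial = fingerprint[2:5]                     # a trace starting mid-video
other = [0.5, 0.5, 0.5]                        # a trace from another video

print(p_dtw_distance(partial, fingerprint))
print(p_dtw_distance(other, fingerprint))
```

Because every step consumes one element of X, each warping path has exactly M cells, so dividing by M gives an average per-element cost that is comparable across traces of different lengths.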
Experiments
Experimental Settings
๏ 200 videos for fingerprinting.
๏ 12 out of 200 videos for streaming.
๏ Quality levels: 500, 1000, 1500, 2000 kbps.
๏ Segment lengths: 4, 6, 8 seconds.
Discriminability
[Figures: distance (dist) of matched and unmatched videos versus segment length (s), eavesdropping time (s), and video index.]
Distance calculation by P-DTW has good discriminability across multiple variables.
Distance Threshold
[Figures: false positive and false negative rates versus threshold; normalized similarity distance of matched and unmatched videos for DTW, MVM, and P-DTW.]
๏ P-DTW shows greater discriminability than DTW and MVM.
๏ The threshold is calculated accordingly.
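The slides do not say how the threshold is computed. One plausible approach, sketched here as an assumption, is to sweep candidate thresholds and keep the one with the lowest combined false positive and false negative rate. The distances below are toy values for illustration only, not measurements from the experiments; the candidate thresholds mirror the axis range of the figure.

```python
def false_rates(matched, unmatched, thr):
    """Return (false positive rate, false negative rate) at a threshold."""
    fn = sum(d > thr for d in matched) / len(matched)       # missed matches
    fp = sum(d <= thr for d in unmatched) / len(unmatched)  # wrong matches
    return fp, fn


def pick_threshold(matched, unmatched, candidates):
    """Pick the candidate minimizing FP rate + FN rate (assumed criterion)."""
    return min(candidates,
               key=lambda t: sum(false_rates(matched, unmatched, t)))


# Toy distances, NOT experimental results.
matched = [0.008, 0.012, 0.015, 0.021]
unmatched = [0.028, 0.035, 0.040, 0.052]
candidates = [0.010, 0.015, 0.020, 0.025, 0.030, 0.035, 0.040]

best = pick_threshold(matched, unmatched, candidates)
print(best, false_rates(matched, unmatched, best))
```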
Accuracy
๏ Accuracy is greatly influenced by segment length.
๏ Video quality level has a limited impact.
Different Videos
Final performance of our method by streaming 12 videos with DASH.
Conclusion
Contributions
๏ A differential bitrate pattern extraction method.
๏ An effective shape matching method for identifying videos.
๏ Considerable accuracy with enough eavesdropping.
Future Work
๏ More work needs to be done with various encoders and DASH strategies.
๏ Countermeasures considering network efficiency and video QoE are worth studying.
Fin.
gujiaxi@mail.nwpu.edu.cn