Feature Selection in Website Fingerprinting
Junhua Yan
Advisor: Prof. Jasleen Kaur
July 24, 2019

Website Fingerprinting
Goal: determine the visited website by inspecting network traffic on the client side
Applications:
• network managers: protect enterprise networks
• Internet Service Providers: gauge user interests
• malicious entities: exploit private user data
• ...
Figure: IP packet (TCP/IP header, payload).
Figure: Attacker scenario in website fingerprinting (client, web).

Website Fingerprinting Methodology
1. Deep Packet Inspection
Figure: Unencrypted payload over HTTP.
Figure: Encrypted payload over HTTPS.
2. TCP/IP signature-based identification
• Extract features from TCP/IP headers
• Apply supervised machine learning algorithm
Figure: IP packet (TCP/IP header, payload).
Figure: Attacker scenario in website fingerprinting (client, web).

| TCP/IP Header Field | Function |
| --- | --- |
| Total Length | Total length of IP datagram |
| Source Address | The IP address of the original source of the IP datagram |
| Destination Address | The IP address of the final destination of the IP datagram |
| Source Port | TCP port of sending host |
| Destination Port | TCP port of destination host |

Table: Five key fields in TCP/IP header.
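
As a rough illustration of TCP/IP signature-based identification, the sketch below models a packet by the five header fields from the table and derives a few simple per-trace features. The `PacketHeader` class, the `extract_features` function, the rule for inferring direction from the client address, and the example addresses are all assumptions made for illustration; they are not the feature set used in this work.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PacketHeader:
    """Hypothetical container for the five header fields in the table."""
    total_length: int  # total length of the IP datagram, in bytes
    src_addr: str      # IP address of the original source
    dst_addr: str      # IP address of the final destination
    src_port: int      # TCP port of the sending host
    dst_port: int      # TCP port of the destination host


def extract_features(trace, client_addr):
    """Summarize one page load (a list of PacketHeader) as a feature dict.

    Direction is inferred by comparing the source address with the
    client's address, which is an assumed convention.
    """
    outgoing = [p for p in trace if p.src_addr == client_addr]
    incoming = [p for p in trace if p.src_addr != client_addr]
    servers = {p.dst_addr for p in outgoing} | {p.src_addr for p in incoming}
    return {
        "num_packets": len(trace),
        "num_incoming": len(incoming),
        "num_outgoing": len(outgoing),
        "bytes_incoming": sum(p.total_length for p in incoming),
        "bytes_outgoing": sum(p.total_length for p in outgoing),
        "ratio_incoming": len(incoming) / len(trace) if trace else 0.0,
        "num_server_addrs": len(servers),
        "size_counts": Counter(p.total_length for p in trace),
    }


# Hypothetical three-packet trace; addresses come from documentation ranges.
trace = [
    PacketHeader(60, "192.0.2.10", "198.51.100.7", 50000, 443),
    PacketHeader(1500, "198.51.100.7", "192.0.2.10", 443, 50000),
    PacketHeader(1500, "198.51.100.7", "192.0.2.10", 443, 50000),
]
features = extract_features(trace, client_addr="192.0.2.10")
```

The resulting dictionary could be flattened into a fixed-length vector and passed to any supervised classifier. None of these features needs the payload, so they remain available when the payload is encrypted.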

Related Work & Limitations

| Author | Scenario | Features | Classifier |
| --- | --- | --- | --- |
| Liberatore et al. 2006 (L) | SSH | packet size count | Naive Bayes |
| Herrmann et al. 2009 (H) | SSH, Tor | packet size frequency | Multinomial Bayes |
| Panchenko et al. 2011 (P) | SSH, Tor | burst markers, HTML markers, # of markers, ratio of incoming packets, occurring packet sizes, transmitted bytes, # of packets | SVM |
| Dyer et al. 2012 (Vng++) | SSH | per-direction bandwidth, transmission time, burst markers | Naive Bayes |
| Wang et al. 2013 (FLSVM) | Tor | Tor cell instances | Distance-based SVM |
| Feghhi et al. 2016 (DTW) | SSH | uplink timing information | Dynamic Time Warping |
| Panchenko et al. 2016 (CUMUL) | Tor | # of incoming & outgoing packets, sum of incoming & outgoing packet sizes, interpolant of cumulative packet size | SVM |
| Hayes et al. 2016 (k-FP) | Tor | # of packets, ratio of incoming & outgoing packets, packet ordering, concentration of outgoing packets, # of packets per second, inter-arrival time, transmission time | Random Forests |
| Trevisan et al. 2016 (T) | HTTP | server IP address count, hostname count | * |

Table: Summary of prior work evaluated in our work.
• Limited set of features studied
What's the extent of website fingerprint-ability?
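
To make one row of the table concrete, here is a minimal sketch of CUMUL-style features: the incoming and outgoing packet counts, the incoming and outgoing byte sums, and a few samples of the interpolated cumulative packet size. The function name `cumul_features`, the sign convention (positive for incoming, negative for outgoing), and sampling along the packet index are assumptions made to keep the example short; this is not a reproduction of the original implementation.

```python
def cumul_features(sizes, n_points=5):
    """Build a CUMUL-style feature vector from signed packet sizes.

    sizes: packet sizes in arrival order, positive for incoming and
    negative for outgoing (an assumed convention).
    n_points: how many samples to take from the cumulative curve.
    """
    if not sizes:
        raise ValueError("trace must contain at least one packet")

    n_in = sum(1 for s in sizes if s > 0)
    n_out = sum(1 for s in sizes if s < 0)
    bytes_in = sum(s for s in sizes if s > 0)
    bytes_out = -sum(s for s in sizes if s < 0)

    # Running sum of signed sizes: the cumulative curve of the trace.
    cumulative = []
    total = 0
    for s in sizes:
        total += s
        cumulative.append(total)

    # Sample the curve at equidistant positions along the packet index,
    # interpolating linearly between neighbouring packets.
    last = len(cumulative) - 1
    samples = []
    for i in range(n_points):
        pos = last * i / (n_points - 1) if n_points > 1 else 0.0
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        samples.append(cumulative[lo] + frac * (cumulative[hi] - cumulative[lo]))

    return [n_in, n_out, bytes_in, bytes_out] + samples


# Hypothetical five-packet trace.
vector = cumul_features([-100, 500, 500, -100, 1000], n_points=3)
```

Sampling a fixed number of points gives every trace a vector of the same length regardless of how many packets it contains, which is what lets a fixed-dimension classifier such as an SVM consume it.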

What is the extent of website fingerprint-ability?
• Are there other features that can be used to achieve accuracy comparable to the state of the art?
  ◦ Extract a comprehensive list of TCP/IP header features
• What if we hide some of the informative features, e.g., packet size?
  ◦ Consider eight different communication scenarios
• Can features that are informative in one scenario (e.g., Tor) be used to accurately identify websites in another scenario (e.g., SSH)?
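
The second question, what happens when an informative feature is hidden, can be sketched as a small ablation: drop a feature from every sample and compare classification accuracy before and after. The helper names, the 1-nearest-neighbour classifier, and the four-sample data set below are all hypothetical and chosen only for brevity; the data is synthetic and says nothing about real traffic or about the classifiers evaluated in this work.

```python
def mask_features(samples, hidden):
    """Return copies of the samples (dicts) without the hidden feature names."""
    return [{k: v for k, v in s.items() if k not in hidden} for s in samples]


def nearest_label(query, train, labels):
    """Predict with 1-nearest-neighbour under squared Euclidean distance."""
    def dist(a, b):
        return sum((a[k] - b[k]) ** 2 for k in a)
    best = min(range(len(train)), key=lambda i: dist(query, train[i]))
    return labels[best]


def leave_one_out_accuracy(samples, labels):
    """Classify each sample using all the others as training data."""
    correct = 0
    for i, query in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if nearest_label(query, rest, rest_labels) == labels[i]:
            correct += 1
    return correct / len(samples)


# Synthetic, hand-made samples for two hypothetical sites.
samples = [
    {"total_bytes": 1000, "num_packets": 10},
    {"total_bytes": 1100, "num_packets": 12},
    {"total_bytes": 5000, "num_packets": 11},
    {"total_bytes": 5200, "num_packets": 13},
]
labels = ["site_a", "site_a", "site_b", "site_b"]

acc_all = leave_one_out_accuracy(samples, labels)
acc_hidden = leave_one_out_accuracy(mask_features(samples, {"total_bytes"}), labels)
```

Comparing `acc_all` with `acc_hidden` shows how much the classifier depended on the hidden feature. The features here are left unscaled, so `total_bytes` dominates the distance; a more careful version would normalize each feature first.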
