CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples


1. CloudLeak: Large-Scale Deep Learning Models Stealing Through Adversarial Examples
Honggang Yu¹, Kaichen Yang¹, Teng Zhang², Yun-Yun Tsai³, Tsung-Yi Ho³, Yier Jin¹ (¹University of Florida, ²University of Central Florida, ³National Tsing Hua University). Email: yier.jin@ece.ufl.edu. April 3, 2020 – NDSS 2020

2. Outline
Background and Motivation
§ AI Interface API in Cloud
§ Existing Attacks and Defenses
Adversarial Example based Model Stealing
§ Adversarial Examples
§ Adversarial Active Learning
§ FeatureFool
§ MLaaS Model Stealing Attacks
Case Study
§ Commercial APIs hosted by Microsoft, Face++, IBM, Google and Clarifai
Defenses
Conclusions

3. Success of DNN
Revolution of DNN structure: "Perceptron" → "Multi-Layer Perceptron" → "Deep Convolutional Neural Network". DNN based systems are widely used in various applications.
[Chart: parameter counts and depths of representative networks. AlexNet: ~61M parameters, 8 layers; VGG-16: ~138M parameters, 16 layers; GoogLeNet: ~7M parameters, 22 layers; ResNet-152: ~60M parameters, 152 layers.]

4. Commercialized DNN
Machine Learning as a Service (MLaaS)
§ Google Cloud Platform, IBM Watson Visual Recognition, and Microsoft Azure
Intelligent Computing System (ICS)
§ TensorFlow Lite, Pixel Visual Core (in Pixel 2), and Nvidia Jetson TX

5. Machine Learning as a Service
[Figure: overview of the MLaaS working flow. Suppliers train a model on a sensitive dataset through the training API; users send inputs to the black-box prediction API and receive outputs at $$$ per query. Goal 1: a rich prediction API. Goal 2: model confidentiality.]

6. Machine Learning as a Service
Services | Products and Solutions | Model Type | Function | Black-box | Confidence Scores | Customization | Monetize
Microsoft | Custom Vision | NN | Traffic Recognition | √ | √ | √ | √
Microsoft | Custom Vision | NN | Flower Recognition | √ | √ | √ | √
Face++ | Emotion Recognition API | NN | Face Emotion Verification | √ | √ | × | √
IBM | Watson Visual Recognition | NN | Face Recognition | √ | √ | √ | √
Google | AutoML Vision | NN | Flower Recognition | √ | √ | √ | √
Clarifai | Not Safe for Work (NSFW) | NN | Offensive Content Moderation | √ | √ | × | √

7. Model Stealing Attacks
Various model stealing attacks have been developed, but none of them achieves a good tradeoff among query count, accuracy, cost, etc.
Proposed Attacks | Parameter Size | Queries | Accuracy | Black-box? | Stealing Cost
F. Tramer (USENIX'16) | ~45k | ~102k | High | √ | Low
Juuti (EuroS&P'19) | ~10M | ~111k | High | √ | -
Correia-Silva (IJCNN'18) | ~200M | ~66k | High | √ | High
Papernot (AsiaCCS'17) | ~100M | ~7k | Low | √ | -

8. Adversarial Example based Model Stealing

9. Adversarial Examples in DNN
Adversarial examples are model inputs generated by an adversary to fool deep learning models: an adversarial example is the source example plus an adversarial perturbation, crafted so that the model predicts the target label (Goodfellow et al., 2014). A minimal sketch of this idea follows below.
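A minimal single-step sketch in the spirit of the cited Goodfellow et al. (2014) method, written in its targeted form to match the "target label" wording above. The model, input tensors, and epsilon value are illustrative placeholders rather than the paper's setup.

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target_label, epsilon=0.03):
    """Return x_adv = x + delta, with delta chosen to push the prediction
    toward target_label (target_label is a tensor of class indices)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target_label)
    loss.backward()
    delta = -epsilon * x.grad.sign()     # step that decreases the target-class loss
    x_adv = (x + delta).clamp(0.0, 1.0)  # keep pixels in a valid range
    return x_adv.detach()
```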

10. Adversarial Examples
Non-feature-based attacks (a PGD sketch follows below):
§ Projected Gradient Descent (PGD) attack
§ C&W attack (Carlini et al., 2017)
Feature-based attacks:
§ Feature adversary attack
§ FeatureFool
[Figure: source, guide, and adversarial images together with the corresponding adversarial perturbations.]
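As one concrete instance of the non-feature-based attacks listed above, a minimal PGD sketch: repeated signed-gradient steps, each projected back into an L-infinity ball of radius epsilon around the source example. Step size, radius, and iteration count are illustrative, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def pgd_example(model, x, true_label, epsilon=0.03, alpha=0.007, steps=10):
    """Untargeted PGD: maximize the loss while staying within epsilon of x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), true_label)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()               # gradient ascent step
            x_adv = x + (x_adv - x).clamp(-epsilon, epsilon)  # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                     # valid pixel range
    return x_adv.detach()
```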

11. A Simplified View of Adversarial Examples
[Figure: high-level illustration of adversarial example generation. On the f(x) < 0 side of the decision boundary: source example, medium-confidence legitimate example, minimum-confidence legitimate example. On the f(x) > 0 side: minimum-confidence adversarial example, medium-confidence adversarial example, maximum-confidence adversarial example.]

12. Adversarial Active Learning
We gather a set of "useful examples" to train a substitute model whose performance is similar to that of the black-box model (a selection sketch follows below).
[Figure: illustration of the margin-based uncertainty sampling strategy. The "useful examples" are the minimum-confidence legitimate and minimum-confidence adversarial examples closest to the decision boundary f(x) = 0; the source, medium-confidence, and maximum-confidence examples lie farther from it.]
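A minimal sketch of margin-based uncertainty sampling as described above: rank a pool of candidates by the gap between their two largest predicted class probabilities and keep the smallest-margin ones as the "useful examples". Function and variable names are illustrative.

```python
import torch

def select_useful_examples(probs, budget):
    """probs: (N, C) softmax outputs for a pool of candidate queries."""
    top2 = probs.topk(2, dim=1).values   # two largest class probabilities per example
    margin = top2[:, 0] - top2[:, 1]     # small margin = close to the decision boundary
    return margin.argsort()[:budget]     # indices of the most uncertain examples
```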

13. FeatureFool: Margin-based Adversarial Examples
To reduce the scale of the perturbation, we further propose a feature-based attack to generate more robust adversarial examples.
§ Attack goal: a low confidence score for the true class (the margin k controls the confidence score).
§ We formally define the attack as: minimize Loss_f(x') such that x' = x + δ ∈ [0,1]^n, with the triplet loss Loss_f(x') = max( D(φ(x'), φ(x^g)) − D(φ(x'), φ(x^s)) + k, 0 ), where φ is the chosen-layer feature mapping, x^s the source example, x^g the guide example, and D a feature-space distance.
§ To solve the reformulated optimization problem above, we apply box-constrained L-BFGS to find a minimum of the loss function (a loss sketch follows below).
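A sketch of the triplet loss above, under the reading of the reconstructed formula: phi is the chosen-layer feature extractor, x_src the source, x_guide the guide image from the target class, and k the confidence margin. The paper minimizes this with box-constrained L-BFGS; the sketch below only evaluates the loss, and all names are illustrative.

```python
import torch

def featurefool_loss(phi, x_adv, x_src, x_guide, k=0.5):
    """Hinge-style triplet loss: pull x_adv toward the guide's features and
    away from the source's features, with confidence margin k."""
    d_guide = torch.norm(phi(x_adv) - phi(x_guide))  # distance to guide features
    d_src = torch.norm(phi(x_adv) - phi(x_src))      # distance to source features
    return torch.clamp(d_guide - d_src + k, min=0.0)
```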

14. FeatureFool: A New Adversarial Attack
[Figure: attack pipeline. (a) source image, (b) adversarial perturbation, (c) guide image, (d) feature extractor, (e) salient features; the perturbation is searched with L-BFGS.]
(1) Input an image and extract the corresponding n-th layer feature mapping using the feature extractor.
(2) Compute the class saliency map to decide which points of the feature mapping should be modified.
(3) Search for the minimum perturbation that satisfies the optimization formula.

15. FeatureFool: A New Adversarial Attack
[Figure: triplets of source, guide, and adversarial face images with their predicted labels and confidence scores (source "Neutral": 0.99 √; guide "Happy": 0.98 √; adversarial "Happy": 0.01 ×).]

16. MLaaS Model Stealing Attacks
Our attack approach (a sketch of the query step follows below):
§ Use the adversarial example attacks (PGD, C&W, FeatureFool) to generate the malicious inputs;
§ Obtain input-output pairs by querying the black-box APIs with the malicious inputs;
§ Retrain a substitute model chosen from a candidate Model Zoo (AlexNet, VGGNet, ResNet).
[Figure: illustration of the proposed MLaaS model stealing attacks, with the adversary's candidate library on one side and the MLaaS inputs/outputs on the other.]
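A minimal sketch of the query step above: send the crafted malicious examples to the black-box prediction API and keep the returned labels as the transfer set for retraining the substitute. `query_api` is a hypothetical stand-in for the particular MLaaS client (Microsoft, Face++, IBM, Google, or Clarifai), not an API from the paper.

```python
def build_transfer_set(malicious_examples, query_api):
    """Collect (input, stolen label) pairs by querying the black-box API."""
    transfer_set = []
    for x in malicious_examples:
        label, confidence = query_api(x)   # one paid query per malicious example
        transfer_set.append((x, label))
    return transfer_set
```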

17. MLaaS Model Stealing Attacks
[Figure: overview of the transfer framework for the model theft attack. (a) Unlabeled synthetic dataset drawn from the source and problem domains; (b) MLaaS query; (c) synthetic dataset with stolen labels; (d) feature transfer, with reused layers copied from the teacher and retrained layers trained by the student (adversary); (e) prediction.]
(1) Generate an unlabeled dataset.
(2) Query the MLaaS model.
(3) Use a transfer learning method to retrain the substitute model (a minimal sketch follows below).
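A minimal sketch of step (3), assuming a torchvision VGG-16 as the candidate picked from the Model Zoo: the convolutional ("reused") layers are frozen and only the final classifier layer is replaced and retrained on the stolen-label dataset. This is an illustrative setup, not the exact architecture or training recipe from the paper.

```python
import torch.nn as nn
from torchvision import models

def make_substitute(num_classes):
    """Frozen pretrained backbone + retrained final layer (feature transfer)."""
    substitute = models.vgg16(weights="IMAGENET1K_V1")   # candidate from the model zoo
    for param in substitute.features.parameters():
        param.requires_grad = False                      # reused layers: frozen
    in_features = substitute.classifier[-1].in_features
    substitute.classifier[-1] = nn.Linear(in_features, num_classes)  # retrained layer
    return substitute
```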

18. Example: Emotion Classification
Procedure to extract a copy of the Emotion Classification model:
1) Choose a more complex/relevant network, e.g., VGGFace.
2) Generate/collect images relevant to the classification problem in the source domain and in the problem domain (relevant queries).
3) Query the MLaaS API.
4) Train the local model based on the cloud query results.
[Figure: architecture choice for stealing the Face++ Emotion Classification API (A = 0.68k; B = 1.36k; C = 2.00k).]

19. Experimental Results
Adversarial perturbations result in a more successful transfer set. In most cases, our FeatureFool (FF) method achieves the same level of accuracy with fewer queries than the other methods.
Comparison of performance between the victim models (Microsoft) and their local substitute models:
Service | Dataset | Queries | Price ($) | RS | PGD | CW | FA | FF
Microsoft | Traffic | 0.43k | 0.43 | 10.21% | 10.49% | 12.10% | 11.64% | 15.96%
Microsoft | Traffic | 1.29k | 1.29 | 45.30% | 59.91% | 61.25% | 49.25% | 66.91%
Microsoft | Traffic | 2.15k | 2.15 | 70.03% | 72.20% | 74.94% | 71.30% | 76.05%
Microsoft | Flower | 0.51k | 1.53 | 26.27% | 27.84% | 29.41% | 28.14% | 31.86%
Microsoft | Flower | 1.53k | 4.59 | 64.02% | 68.14% | 69.22% | 68.63% | 72.35%
Microsoft | Flower | 2.55k | 7.65 | 79.22% | 83.24% | 89.20% | 84.12% | 88.14%

20. Comparison with Existing Attacks
Our attack framework can steal large-scale deep learning models with high accuracy, few queries and low cost simultaneously. The same trend appears when we use different transfer architectures to steal the black-box target models.
Comparison to prior works:
Proposed Attacks | Parameter Size | Queries | Accuracy | Black-box? | Stealing Cost
F. Tramer (USENIX'16) | ~45k | ~102k | High | √ | Low
Juuti (EuroS&P'19) | ~10M | ~111k | High | √ | -
Correia-Silva (IJCNN'18) | ~200M | ~66k | High | √ | High
Papernot (AsiaCCS'17) | ~100M | ~7k | Low | √ | -
Our Method | ~200M | ~3k | High | √ | Low

21. Evading Defenses
Evasion of PRADA detection (a sketch of a PRADA-style detector follows below):
§ Our attacks can easily bypass the defense by carefully selecting the perturbation parameter in the range 0.1σ to 0.8σ.
§ Other types of adversarial attacks can also bypass the PRADA defense if the detection threshold δ is small.
Queries made until detection:
Model (δ value) | FF (0.8σ) | FF (0.5σ) | FF (0.1σ) | PGD | CW | FA
Traffic (δ = 0.92) | missed | missed | missed | missed | 150 | 130
Traffic (δ = 0.97) | 110 | 110 | 110 | 110 | 110 | 110
Flower (δ = 0.87) | 110 | missed | 220 | missed | 290 | 140
Flower (δ = 0.90) | 110 | 340 | 220 | 350 | 120 | 130
Flower (δ = 0.94) | 110 | 340 | 220 | 350 | 120 | 130
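For context, a simplified sketch of a PRADA-style detector, assuming (as in the original PRADA defense) that it tracks the minimum distance of each new query to the client's previous queries and flags the client when the Shapiro-Wilk normality statistic of those distances falls below the threshold δ. This is a reconstruction for illustration, not the authors' code; spreading perturbation magnitudes over a range such as 0.1σ to 0.8σ keeps the distance distribution close to normal, which is why detection is missed.

```python
import numpy as np
from scipy.stats import shapiro

def prada_flags_attack(queries, delta):
    """queries: list of flattened query images (np.ndarray) from one client.
    Per-class bookkeeping from the original defense is omitted for brevity."""
    min_dists = []
    for i in range(1, len(queries)):
        dists = [np.linalg.norm(queries[i] - q) for q in queries[:i]]
        min_dists.append(min(dists))   # distance to the closest earlier query
    if len(min_dists) < 3:             # Shapiro-Wilk needs at least 3 samples
        return False
    w_statistic, _ = shapiro(min_dists)
    return w_statistic < delta         # non-normal distances => flag as attack
```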
