

slide-1
SLIDE 1

Dawn Song

UC Berkeley

AI and Security: Lessons, Challenges & Future Directions

slide-2
SLIDE 2

AI and Security

AI and security are mutual enablers:

  • AI enables security applications
  • Security enables better AI
  • Integrity: produces intended/correct results (adversarial machine learning)
  • Confidentiality/Privacy: does not leak users’ sensitive data (secure, privacy-preserving machine learning)

  • Preventing misuse of AI
slide-3
SLIDE 3

AI and Security: AI in the presence of attacker

slide-4
SLIDE 4

AI and Security: AI in the presence of attacker

  • Important to consider the presence of an attacker
  • History has shown that attackers always follow the footsteps of new technology development (or sometimes even lead it)
  • The stakes are even higher with AI
  • As AI controls more and more systems, attackers will have higher and higher incentives
  • As AI becomes more and more capable, the consequences of misuse by attackers will become more and more severe

slide-5
SLIDE 5

AI and Security: AI in the presence of attacker

  • Attack AI
  • Cause learning system to not produce intended/correct results
  • Cause learning system to produce targeted outcome designed by attacker
  • Learn sensitive information about individuals
  • Need security in learning systems
  • Misuse AI
  • Misuse AI to attack other systems
  • Find vulnerabilities in other systems
  • Target attacks
  • Devise attacks
  • Need security in other systems
slide-6
SLIDE 6

AI and Security: AI in the presence of attacker

  • Attack AI:
  • Cause learning system to not produce intended/correct results
  • Cause learning system to produce targeted outcome designed by attacker
  • Learn sensitive information about individuals
  • Need security in learning systems
  • Misuse AI
  • Misuse AI to attack other systems
  • Find vulnerabilities in other systems
  • Target attacks
  • Devise attacks
  • Need security in other systems
slide-7
SLIDE 7

Deep Learning Systems Are Easily Fooled

Szegedy, C., Zaremba, W., Sutskever, I., Bruna, J., Erhan, D., Goodfellow, I., & Fergus, R. Intriguing properties of neural networks. ICLR 2014.

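For reference (not from the slide itself), the fast gradient sign method introduced in follow-up work by Goodfellow et al. is the canonical way such fooling examples are constructed: a small perturbation in the direction of the sign of the loss gradient.

```latex
x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_{x}\, \ell(f_\theta(x),\, y)\big),
\qquad \lVert x_{\mathrm{adv}} - x \rVert_{\infty} \le \epsilon
```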
slide-8
SLIDE 8
slide-9
SLIDE 9

STOP Signs in Berkeley

slide-10
SLIDE 10

Adversarial Examples in Physical World


Can we generate adversarial examples in the physical world that remain effective under different viewing conditions and viewpoints, including viewing distances and angles?

slide-11
SLIDE 11

Adversarial Examples in Physical World


Subtle Perturbations

Evtimov, Ivan, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, and Dawn Song. “Robust Physical-World Attacks on Machine Learning Models.” arXiv preprint arXiv:1707.08945 (2017).

slide-12
SLIDE 12

Adversarial Examples in Physical World


Subtle Perturbations

slide-13
SLIDE 13


Camouflage Perturbations

Adversarial Examples in Physical World

slide-14
SLIDE 14


Camouflage Perturbations

slide-15
SLIDE 15

Adversarial Examples in Physical World

Adversarial perturbations are possible in the physical world under different viewing conditions and viewpoints, including viewing distances and angles.
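One way to formalize robustness to different viewing conditions (a hedged sketch in the spirit of the Evtimov et al. paper cited earlier, not necessarily the exact loss shown on the original slide) is to optimize the perturbation in expectation over a distribution of physical transformations such as distance, angle, and lighting:

```latex
\delta^{*} \;=\; \arg\min_{\delta}\;\; \lambda \lVert \delta \rVert_{p}
  \;+\; \mathbb{E}_{t \sim T}\;
  \ell\big(f_\theta\big(t(x + \delta)\big),\, y^{*}\big)
```

where T is the distribution over viewing conditions and y* is the attacker's target label.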

slide-16
SLIDE 16

Adversarial Examples Prevalent in Deep Learning Systems

  • Most existing work on adversarial examples:
  • Image classification task
  • Target model is known
  • Our investigation on adversarial examples:
  • Weaker threat models: black-box attacks (target model is unknown)
  • Other tasks and model classes: generative models, deep reinforcement learning, visual QA / image-to-code
  • New attack methods: providing more diversity of attacks

slide-17
SLIDE 17

Generative models

  • VAE-like models (VAE, VAE-GAN) use an intermediate latent representation
  • An encoder: maps a high-dimensional input into a lower-dimensional latent representation z.
  • A decoder: maps the latent representation back to a high-dimensional reconstruction (see the sketch below).
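As a concrete point of reference (a minimal PyTorch sketch, not the models evaluated in the paper), an encoder/decoder pair of this kind looks roughly like:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Maps a high-dimensional input x to a lower-dimensional latent code z."""
    def __init__(self, input_dim=784, latent_dim=20):
        super().__init__()
        self.hidden = nn.Sequential(nn.Linear(input_dim, 400), nn.ReLU())
        self.mu = nn.Linear(400, latent_dim)       # mean of q(z|x)
        self.logvar = nn.Linear(400, latent_dim)   # log-variance of q(z|x)

    def forward(self, x):
        h = self.hidden(x)
        return self.mu(h), self.logvar(h)

class Decoder(nn.Module):
    """Maps a latent code z back to a reconstruction of the input."""
    def __init__(self, latent_dim=20, output_dim=784):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 400), nn.ReLU(),
            nn.Linear(400, output_dim), nn.Sigmoid(),
        )

    def forward(self, z):
        return self.net(z)

def reparameterize(mu, logvar):
    """Sample z ~ q(z|x) with the reparameterization trick."""
    std = torch.exp(0.5 * logvar)
    return mu + std * torch.randn_like(std)

# Usage: x -> (mu, logvar) -> z -> reconstruction
encoder, decoder = Encoder(), Decoder()
x = torch.rand(16, 784)                      # a batch of flattened images
mu, logvar = encoder(x)
x_hat = decoder(reparameterize(mu, logvar))
```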

slide-18
SLIDE 18

Adversarial Examples in Generative Models

  • An example attack scenario:
  • Generative model used as a compression scheme
  • Attacker’s goal: for the decompressor to reconstruct a different image from the one that the compressor sees (one formulation is sketched below).
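One way to formalize that goal (a hedged sketch; the paper studies several attack variants) is a latent-space attack: find a small perturbation whose latent code matches the latent code of the attacker's chosen target image, so the decoder reconstructs the target rather than the input.

```latex
\min_{\delta}\;\; \lVert \delta \rVert_{2}
  \;+\; \lambda\,\big\lVert \operatorname{enc}(x + \delta) \;-\; \operatorname{enc}(x_{\mathrm{target}}) \big\rVert_{2}^{2}
```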

slide-19
SLIDE 19

Adversarial Examples for VAE-GAN in MNIST

Target Image

Jernej Kos, Ian Fischer, Dawn Song: Adversarial Examples for Generative Models

Figure rows: original images, reconstruction of original images, adversarial examples, reconstruction of adversarial examples

slide-20
SLIDE 20

Adversarial Examples for VAE-GAN in SVHN

Target Image

Jernej Kos, Ian Fischer, Dawn Song: Adversarial Examples for Generative Models

Original images Reconstruction of original images Adversarial examples Reconstruction of adversarial examples

slide-21
SLIDE 21

Target Image

Jernej Kos, Ian Fischer, Dawn Song: Adversarial Examples for Generative Models

Original images Reconstruction of original images Adversarial examples Reconstruction of adversarial examples

Adversarial Examples for VAE-GAN in SVHN

slide-22
SLIDE 22

Deep Reinforcement Learning Agent (A3C) Playing Pong

Original Frames

Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop 2017].

slide-23
SLIDE 23

Adversarial Examples on A3C Agent on Pong

Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop, 2017]

Plot: score vs. number of steps.

slide-24
SLIDE 24

Blindly injecting adversarial perturbations every 10 frames.

Attacks Guided by Value Function

Injecting adversarial perturbations guided by the value function.

Plots: score vs. number of steps for each strategy.
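A minimal sketch of the two injection strategies (illustrative only; a `policy` that returns action logits plus a value estimate, and the threshold, are assumptions rather than the paper's code):

```python
import torch

def fgsm(policy, frame, epsilon=0.005):
    """One fast-gradient-sign step that pushes probability mass away
    from the agent's currently preferred action."""
    frame = frame.clone().detach().requires_grad_(True)
    logits, _value = policy(frame)
    loss = -torch.log_softmax(logits, dim=-1).max()  # -log p(best action)
    loss.backward()
    return (frame + epsilon * frame.grad.sign()).detach()

def maybe_perturb(policy, frame, step, guided=True, every_n=10, value_threshold=1.0):
    """Blind strategy: perturb every `every_n` frames.
    Value-guided strategy: perturb only when the value function indicates
    the current state matters (high expected future reward)."""
    with torch.no_grad():
        _logits, value = policy(frame)
    attack = value.item() > value_threshold if guided else step % every_n == 0
    return fgsm(policy, frame) if attack else frame
```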

slide-25
SLIDE 25

Agent in Action

Original frames; with FGSM perturbations (ε = 0.005) injected in every frame; with FGSM perturbations (ε = 0.005) injected based on the value function.

Jernej Kos and Dawn Song: Delving into adversarial attacks on deep policies [ICLR Workshop 2017].

slide-26
SLIDE 26

Visual Q&A

Given a question and an image, predict the answer.

slide-27
SLIDE 27

Studied VQA Models

Model 1: MCB ( https://arxiv.org/abs/1606.01847 )

  • Uses Multimodal Compact Bilinear pooling to combine the image feature and question embedding.

slide-28
SLIDE 28

Studied VQA Models

Model 2: NMN ( https://arxiv.org/abs/1704.05526 )

  • A representative of neural module networks
  • First predicts a network layout according to the question, then predicts the answer using the obtained network.

slide-29
SLIDE 29

Question: What color is the sky? Original answer: MCB - blue, NMN - blue. Target: gray. Answer after attack: MCB - gray, NMN - gray.

benign image adversarial image for MCB adversarial image for NMN

Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song: Can you fool AI with adversarial examples on a visual Turing test?

slide-30
SLIDE 30

Question: Is it raining? Original answer: MCB - no, NMN - no. Target: yes. Answer after attack: MCB - yes, NMN - yes.

benign image adv image for MCB adv image for NMN

slide-31
SLIDE 31

Question: What is on the ground? Original answer: MCB - sand, NMN - sand. Target: snow. Answer after attack: MCB - snow, NMN - snow.

benign image adv image for MCB adv image for NMN

slide-32
SLIDE 32

Question: Where is the plane? Original answer: MCB - runway, NMN - runway. Target: sky. Answer after attack: MCB - sky, NMN - sky.

benign image adv image for MCB adv image for NMN

slide-33
SLIDE 33

Question: What color is the traffic light? Original answer: MCB - green, NMN - green. Target: red. Answer after attack: MCB - red, NMN - red. benign image adv image for MCB adv image for NMN

slide-34
SLIDE 34

Question: What does the sign say? Original answer: MCB - stop, NMN - stop. Target: one way. Answer after attack: MCB - one way, NMN - one way.

benign image adv image for MCB adv image for NMN

slide-35
SLIDE 35

Question: How many cats are there? Original answer: MCB - 1, NMN - 1. Target: 2. Answer after attack: MCB - 2, NMN - 2.

benign image adv image for MCB adv image for NMN

slide-36
SLIDE 36

Adversarial Examples Prevalent in Deep Learning Systems

  • Most existing work on adversarial examples:
  • Image classification task
  • Target model is known
  • Our investigation on adversarial examples:
  • Weaker threat models: black-box attacks (target model is unknown)
  • Other tasks and model classes: generative models, deep reinforcement learning, visual QA / image-to-code
  • New attack methods: providing more diversity of attacks

slide-37
SLIDE 37

A General Framework for Black-box attacks

  • Zero-Query Attack (Previous methods)
  • Random perturbation
  • Difference of means
  • Transferability-based attack
  • Practical Black-Box Attacks against Machine Learning [Papernot et al. 2016]
  • Ensemble transferability-based attack [Yanpei Liu, Xinyun Chen, Chang Liu, Dawn Song: Delving into Transferable Adversarial Examples and Black-box Attacks, ICLR 2017]

  • Query Based Attack (new method)
  • Finite difference gradient estimation
  • Query reduced gradient estimation
  • A general active query game model

The zero-query attack can be viewed as a special case of the query-based attack in which the number of queries made is zero.

slide-38
SLIDE 38

Query Based attacks

  • Finite difference gradient estimation
  • Given a d-dimensional vector x, we can make 2d queries to estimate the gradient as below:

FD_x(g(x), δ)_i = ( g(x + δ·e_i) − g(x − δ·e_i) ) / (2δ),   i = 1, …, d

  • An example of approximate FGS with finite differences:

x_adv = x + ε · sign( FD_x(ℓ_f(x, y), δ) )

  • Query reduced gradient estimation
  • Random grouping
  • PCA

Similarly, we can also approximate a logit-based loss by making 2d queries [Bhagoji, Li, He, Song, 2017]
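A minimal NumPy sketch of the finite-difference estimator and the resulting black-box FGS step (`loss_fn` stands in for query-only access to the target model's loss; it and the constants are illustrative assumptions):

```python
import numpy as np

def fd_gradient(loss_fn, x, delta=1e-3):
    """Estimate the gradient of loss_fn at x with 2d queries:
    g_i ~= (loss(x + delta*e_i) - loss(x - delta*e_i)) / (2*delta)."""
    flat = x.reshape(-1).astype(np.float64)
    grad = np.zeros_like(flat)
    for i in range(flat.size):
        e = np.zeros_like(flat)
        e[i] = delta
        grad[i] = (loss_fn((flat + e).reshape(x.shape))
                   - loss_fn((flat - e).reshape(x.shape))) / (2 * delta)
    return grad.reshape(x.shape)

def fd_fgs(loss_fn, x, epsilon=0.1):
    """Black-box analogue of the fast gradient sign step:
    x_adv = x + epsilon * sign(FD_x(loss))."""
    return x + epsilon * np.sign(fd_gradient(loss_fn, x))

# Query reduction (sketch): instead of one query pair per coordinate,
# perturb random groups of coordinates (or top PCA directions) together,
# cutting the number of queries from 2d down to 2 * (number of groups).
```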

slide-39
SLIDE 39

Query Based Attacks

The finite differences method outperforms other black-box attacks and achieves an attack success rate similar to the white-box attack. The gradient estimation method with query reduction performs approximately as well as without query reduction.

slide-40
SLIDE 40

Black-box Attack on Clarifai

The Gradient-Estimation black-box attack on Clarifai’s Content Moderation Model

Original image: classified as “drug” with a confidence of 0.99. Adversarial example: classified as “safe” with a confidence of 0.96.

slide-41
SLIDE 41

Adversarial Examples Prevalent in Deep Learning Systems

  • Most existing work on adversarial examples:
  • Image classification task
  • Target model is known
  • Our investigation on adversarial examples:
  • Weaker threat models: black-box attacks (target model is unknown)
  • Other tasks and model classes: generative models, deep reinforcement learning, visual QA / image-to-code
  • New attack methods: providing more diversity of attacks

slide-42
SLIDE 42

Generating Adversarial Examples with Adversarial Networks

L = L_adv^f + α·L_GAN + β·L_hinge

L_GAN = E_{x∼p_data(x)}[ log D(x) ] + E_{x∼p_data(x)}[ log(1 − D(x + G(x))) ]

Black-box attacks can be performed here via distillation

[Xiao, Li, Zhu, He, Liu, Song, 2017]
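A hedged sketch of how the three terms combine when training the generator (G produces the perturbation, D is the discriminator assumed to output a probability, f is the target model; a simplified rendering rather than the authors' code, and the soft L2 bound c is an assumption):

```python
import torch
import torch.nn.functional as F

def advgan_generator_loss(G, D, f, x, target_class, alpha=1.0, beta=10.0, c=0.3):
    """L = L_adv^f + alpha * L_GAN + beta * L_hinge (targeted attack)."""
    perturbation = G(x)
    x_adv = x + perturbation

    # L_adv^f: push the target model f toward the attacker's chosen class.
    l_adv = F.cross_entropy(f(x_adv), target_class)

    # L_GAN: encourage perturbed inputs to look like real data to D
    # (the log D(x) term is constant w.r.t. G and is omitted here).
    l_gan = torch.log(1 - D(x_adv)).mean()

    # L_hinge: softly bound the L2 norm of the perturbation at c.
    l_hinge = torch.clamp(perturbation.flatten(1).norm(p=2, dim=1) - c, min=0).mean()

    return l_adv + alpha * l_gan + beta * l_hinge
```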

slide-43
SLIDE 43

Semi-white-box attack on MNIST and black-box attack on MNIST: the perturbed images are very close to the original ones; the original images lie on the diagonal.

slide-44
SLIDE 44

The perturbed images are very close to the original ones. The original images lie on the diagonal.

slide-45
SLIDE 45

Numerous Defenses Proposed

  • Input processing
  • Gaussian blur, median blur
  • Quantization
  • Adversarial re-training
  • Re-train with generated adversarial examples
  • Detecting adversarial examples
  • Detecting anomalous high-frequency patterns in input
  • Detecting anomalous activations
  • Detecting low confidence output
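A minimal sketch of the input-processing defenses listed above (median blur plus quantization, assuming pixel values in [0, 1]); illustrative only, and the following slides explain why such defenses on their own are not sufficient.

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess_defense(image, blur_size=3, levels=8):
    """Try to squash small adversarial perturbations before classification:
    1) a median blur removes high-frequency noise,
    2) quantization collapses nearby pixel values onto a few levels.
    Assumes `image` is a float array with values in [0, 1]."""
    blurred = median_filter(image, size=blur_size)
    return np.round(blurred * (levels - 1)) / (levels - 1)

# Usage: classify(preprocess_defense(x)) instead of classify(x).
```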
slide-46
SLIDE 46

Numerous Defenses Proposed

Taxonomy of proposed defenses, grouped into detection and prevention: ensemble methods, normalization, distributional detection, PCA-based detection, secondary classification, stochastic and generative approaches, changes to the training process and architecture, retraining, and input pre-processing.

slide-47
SLIDE 47

No Sufficient Defense Today

  • Strong, adaptive attackers can easily evade today’s defenses
  • An ensemble of weak defenses does not (by default) lead to a strong defense [Warren He, James Wei, Xinyun Chen, Nicholas Carlini, Dawn Song, WOOT 2017]
  • Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods [Nicholas Carlini and David Wagner]
slide-48
SLIDE 48

Adversarial Machine Learning

  • Adversarial machine learning:
  • Learning in the presence of adversaries
  • Inference time: adversarial example fools learning system
  • Evasion attacks
  • Evade malware detection; fraud detection
  • Training time:
  • Attacker poisons the training dataset (e.g., poisons labels) to fool the learning system into learning a wrong model
  • Poisoning attacks: e.g., Microsoft’s Tay Twitter chatbot
  • Attacker selectively shows the learner training data points (even with correct labels) to fool the learning system into learning a wrong model
  • Data poisoning is particularly challenging with crowd-sourcing & insider attacks
  • Difficult to detect when the model has been poisoned
  • Adversarial machine learning is particularly important for security-critical systems
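To make the training-time threat concrete, a toy label-flipping experiment (a sketch using scikit-learn's digits dataset, not from the slides) shows how a growing fraction of poisoned labels degrades the learned model:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
rng = np.random.RandomState(0)

def poison_labels(labels, fraction):
    """Flip a fraction of training labels to random incorrect classes."""
    poisoned = labels.copy()
    idx = rng.choice(len(labels), size=int(fraction * len(labels)), replace=False)
    poisoned[idx] = (poisoned[idx] + rng.randint(1, 10, size=len(idx))) % 10
    return poisoned

for fraction in (0.0, 0.1, 0.3):
    clf = LogisticRegression(max_iter=2000)
    clf.fit(X_train, poison_labels(y_train, fraction))
    print(f"poisoned {fraction:.0%}: test accuracy = {clf.score(X_test, y_test):.3f}")
```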
slide-49
SLIDE 49

Security will be one of the biggest challenges in deploying AI

slide-50
SLIDE 50

Security of Learning Systems

  • Software level
  • Learning level
  • Distributed level
slide-51
SLIDE 51

Challenges for Security at Software Level

  • No software vulnerabilities (e.g., buffer overflows & access control issues)
  • Attackers can take control of learning systems by exploiting software vulnerabilities

slide-52
SLIDE 52

Challenges for Security at Software Level

  • No software vulnerabilities (e.g., buffer overflows & access control issues)
  • Existing software security/formal verification techniques apply

Progression of approaches to software security over the last 20 years: reactive defense (e.g., automatic worm detection & signature/patch generation, automatic malware detection & analysis), proactive defense via bug finding, and proactive defense via secure-by-construction systems.

slide-53
SLIDE 53

Security of Learning Systems

  • Software level
  • Learning level
  • Distributed level
slide-54
SLIDE 54

Challenges for Security at Learning Level

  • Evaluate system under adversarial events, not just normal events
slide-55
SLIDE 55

Regression Testing vs. Security Testing in Traditional Software System

Regression testing: run the program on normal inputs; the goal is to prevent normal users from encountering errors.
Security testing: run the program on abnormal/adversarial inputs; the goal is to prevent attackers from finding exploitable errors.

slide-56
SLIDE 56

Regression Testing vs. Security Testing in Learning System

Regression testing: train on noisy training data to estimate resiliency against noisy training inputs; test on normal inputs to estimate generalization error.
Security testing: train on poisoned training data to estimate resiliency against poisoned training inputs; test on abnormal/adversarial inputs to estimate resiliency against adversarial inputs.

slide-57
SLIDE 57

Challenges for Security at Learning Level

  • Evaluate system under adversarial events, not just normal events
  • Regression testing vs. security testing
  • Reason about complex, non-symbolic programs
slide-58
SLIDE 58

Decades of Work on Reasoning about Symbolic Programs

  • Symbolic programs:
  • E.g., OS, File system, Compiler, web application, mobile application
  • Semantics defined by logic
  • Decades of techniques & tools developed for logic/symbolic reasoning
  • Theorem provers, SMT solvers
  • Abstract interpretation
slide-59
SLIDE 59

Era of Formally Verified Systems

IronClad/IronFleet FSCQ CertiKOS EasyCrypt CompCert miTLS/Everest

Verified: Micro-kernel, OS, File system, Compiler, Security protocols, Distributed systems

slide-60
SLIDE 60

Powerful Formal Verification Tools + Dedicated Teams

Coq

Why3 Z3

slide-61
SLIDE 61

No Sufficient Tools to Reason about Non-Symbolic Programs

  • Symbolic programs:
  • Semantics defined by logic
  • Decades of techniques & tools developed for logic/symbolic reasoning
  • Theorem provers, SMT solvers
  • Abstract interpretation
  • Non-symbolic programs:
  • No precisely specified properties & goals
  • No good understanding of how learning system works
  • Traditional symbolic reasoning techniques do not apply
slide-62
SLIDE 62

Challenges for Security at Learning Level

  • Evaluate system under adversarial events, not just normal events
  • Regression testing vs. security testing
  • Reason about complex, non-symbolic programs
  • Design new architectures & approaches with stronger generalization & security guarantees

slide-63
SLIDE 63

Neural Program Synthesis

Can we teach computers to write code?

“Software is eating the world” --- a16z. Program synthesis can automate this & democratize idea realization.

Example applications:

  • End-user programming
  • Performance optimization of code
  • Virtual assistant

Diagram: Intent → Program Synthesizer → Program

slide-64
SLIDE 64

Neural Program Synthesis

Training data (Input → Output): 452 345 → 797; 123 234 → 357; 612 367 → 979

slide-65
SLIDE 65

Neural Program Synthesis

Neural Program Architecture → Learned neural program

Training data (Input → Output): 452 345 → 797; 123 234 → 357; 612 367 → 979

Test input: 50 70 → Test output: 120

slide-66
SLIDE 66

Neural Program Architectures

Neural Turing Machine (Graves et al) Neural Programmer (Neelankatan et al) Neural Programmer-Interpreter (Reed et al) Neural GPU (Kaiser et al) Stack Recurrent Nets (Joulin et al) Learning Simple Algorithms from Examples (Zaremba et al) Differentiable Neural Computer (Graves et al)

Neural Program Synthesis Tasks: Copy, Grade-school addition, Sorting, Shortest Path

Timeline of these architectures from Nov 2014 to Oct 2016; also Reinforcement Learning Neural Turing Machines (Zaremba et al)

slide-67
SLIDE 67

Challenge 1: Generalization

Training data (length = 3, Input → Output): 452 345 → 797; 123 234 → 357; 612 367 → 979

Neural Program Architecture → Learned neural program

Test input (length = 5): 54321 34216 → Test output: 24320 (incorrect)

slide-68
SLIDE 68

Challenge 2: No Proof of Generalization

Training data (length = 3, Input → Output): 452 345 → 797; 123 234 → 357; 612 367 → 979

Neural Program Architecture → Learned neural program

Test input (length = 5): 34216 24320 → Test output: 58536 (correct, but with no proof that the learned program generalizes)

slide-69
SLIDE 69

Our Approach: Introduce Recursion

Learn recursive neural programs

Jonathon Cai, Richard Shin, Dawn Song: Making Neural Programming Architectures Generalize via Recursion [ICLR 2017, Best Paper Award ]

slide-70
SLIDE 70

Recursion

Quicksort

  • Fundamental concept in Computer Science and Math
  • Solve whole problem by reducing it to smaller subproblems (reduction rules)
  • Base cases (smallest subproblems) are easier to reason about
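A minimal sketch (ordinary Python, not a neural program) of what the base cases and reduction rules look like for quicksort; the point of the next slide is that a learned recursive program only has to be verified on this finite set of cases:

```python
def quicksort(a):
    # Base case (smallest subproblem): lists of length 0 or 1 are sorted.
    if len(a) <= 1:
        return a
    # Reduction rule: partition around a pivot, recurse on the strictly
    # smaller subproblems, then combine the results.
    pivot, rest = a[0], a[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

assert quicksort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
```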
slide-71
SLIDE 71
Our Approach: Making Neural Programming Architectures Generalize via Recursion

  • Proof of generalization:
  • Recursion enables provable guarantees about neural programs
  • Prove perfect generalization of a learned recursive program via a verification procedure
  • Explicitly testing on all possible base cases and reduction rules (verification set)
  • Learn & generalize faster as well
  • Trained on the same data, non-recursive programs do not generalize well

Jonathon Cai, Richard Shin, Dawn Song: Making Neural Programming Architectures Generalize via Recursion [ICLR 2017, Best Paper Award]

Figure: accuracy on random inputs for quicksort

slide-72
SLIDE 72

Lessons

  • Program architecture impacts generalization & provability
  • Recursive, modular neural architectures are easier to reason about, prove, and generalize
  • Explore new architectures and approaches enabling strong generalization & security properties for broader tasks

slide-73
SLIDE 73

Challenges for Security at Learning Level

  • Evaluate system under adversarial events, not just normal events
  • Reason about complex, non-symbolic programs
  • Design new architectures & approaches with stronger generalization & security guarantees

  • Reason about how to compose components
slide-74
SLIDE 74

Compositional Reasoning

  • Building large, complex systems requires compositional reasoning
  • Each component provides abstraction
  • E.g., pre/post conditions
  • Hierarchical, compositional reasoning proves properties of whole system
  • How to do abstraction, compositional reasoning for non-symbolic programs?
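For symbolic components, abstraction via pre/post conditions looks roughly like the sketch below (illustrative Python contracts expressed as assertions); the open question on this slide is what the analogous contracts and composition rules are for learned, non-symbolic components.

```python
def binary_search(sorted_list, target):
    """Precondition: sorted_list is sorted in ascending order.
    Postcondition: returns an index i with sorted_list[i] == target,
    or -1 if target is not present."""
    assert all(sorted_list[i] <= sorted_list[i + 1]
               for i in range(len(sorted_list) - 1)), "precondition violated"
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid
        if sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1

# Compositional reasoning: a caller that establishes the precondition may
# rely on the postcondition without re-examining the implementation.
```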
slide-75
SLIDE 75

Security of Learning Systems

  • Software level
  • Learning level
  • Evaluate system under adversarial events, not just normal events
  • Reason about complex, non-symbolic programs
  • Design new architectures & approaches with stronger generalization & security guarantees
  • Reason about how to compose components
  • Distributed level
  • Each agent makes local decisions; how can good local decisions combine to achieve a good global decision?
slide-76
SLIDE 76

AI and Security: AI in the presence of attacker

  • Attack AI
  • Integrity:
  • Cause learning system to not produce intended/correct results
  • Cause learning system to produce targeted outcome designed by attacker
  • Confidentiality:
  • Learn sensitive information about individuals
  • Need security in learning systems
  • Misuse AI
  • Misuse AI to attack other systems
  • Find vulnerabilities in other systems
  • Target attacks
  • Devise attacks
  • Need security in other systems
slide-77
SLIDE 77

Misused AI can make attacks more effective

Deep Learning Empowered Bug Finding Deep Learning Empowered Phishing Attacks

slide-78
SLIDE 78

Misused AI for large-scale, automated, targeted manipulation

slide-79
SLIDE 79

Future of AI and Security

  • How to better understand what security means for AI and learning systems?
  • How to detect when a learning system has been fooled/compromised?
  • How to build more resilient systems with stronger guarantees?
  • How to build privacy-preserving learning systems?

slide-80
SLIDE 80

Security will be one of the biggest challenges in deploying AI

slide-81
SLIDE 81