Abuses and misuses of AI: prevention vs reaction. Red Teaming in the AI world (PowerPoint presentation)




SLIDE 1-2

Cristian Canton Ferrer, Research Manager (AI Red Team @ Facebook)

Abuses and misuses of AI: prevention vs reaction

Red Teaming in the AI world

...with Manipulated Media as an example

SLIDE 3

Outline

Introduction Abuses Misuses Prevention Reaction and Mitigation

SLIDE 4

Introduction

SLIDE 5

What is the current situation of AI?

Credits: Nicolas Carlini for the graph (https://nicholas.carlini.com/)

Research on adversarial attacks has grown rapidly since the advent of DNNs

SLIDE 6

Adversarial attack ⇏ GAN

SLIDE 7

Input image (category: Panda, 57.7% confidence) + adversarial noise = attacked image (category: Gibbon, 99.3% confidence)

Credit: Goodfellow et al. "Explaining and harnessing adversarial examples", ICLR 2015.

Abuse of an AI system to force it to make a calculated mistake
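The calculated mistake above can be sketched numerically. Below is a minimal FGSM-style sketch against a toy logistic "classifier" (the model, numbers, and 4-pixel "image" are invented for illustration; this is not Goodfellow et al.'s actual setup): nudging the input by a tiny step in the sign of the loss gradient flips the prediction.

```python
import numpy as np

def fgsm_attack(x, w, b, y_true, eps):
    """One-step FGSM against a logistic classifier.

    For the logistic loss, dLoss/dx = (sigmoid(w.x + b) - y) * w.
    Stepping the input by eps in the sign of that gradient raises the
    loss while keeping the perturbation's max-norm at exactly eps."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))  # model confidence for class 1
    grad = (p - y_true) * w                 # gradient of the loss w.r.t. the input
    return x + eps * np.sign(grad)

def predict(x, w, b):
    return int(w @ x + b > 0)

# A toy 4-pixel "image" sitting just on the class-0 side of the boundary.
w = np.ones(4)
b = 0.0
x = np.array([-0.1, 0.0, 0.0, 0.0])

x_adv = fgsm_attack(x, w, b, y_true=0, eps=0.1)
print(predict(x, w, b), predict(x_adv, w, b))  # 0 1
```

The perturbation is bounded by eps per pixel, yet the decision flips: the same mechanism, scaled up, turns a panda into a gibbon.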

SLIDE 8

What is a Red Team?

SLIDE 9

What is a Red Team?

"A Red Team is a group that helps organizations to improve themselves by providing opposition to the point of view of the organization that they are helping." (Wikipedia)

SLIDE 10

What is a Red Team?

Pope Sixtus V (1521-1590)

At the origin, everything started with the: "Advocatus Diaboli"

SLIDE 12

What is a Red Team?

The advent of Red Teaming in the modern era: the Yom Kippur War and the 10th Man Rule

Bryce G. Hoffman, "Red Teaming", 2017. Micah Zenko, "Red Team", 2015.

SLIDE 13-20

What does an AI Red Team do?

  • Bring the "loyal" adversarial mentality into the AI world, especially for systems in production
  • Understand the risk landscape of your company
  • Identify, evaluate, and prioritize risks and feasible attacks
  • Conceive worst-case scenarios derived from abuses and misuses of AI
  • Form a group of experts across all involved aspects of a real system
  • Convince stakeholders of the importance and potential impact of a worst-case scenario and ideate solutions: preventions or mitigations
  • Define iterative and periodic interactions with stakeholders
  • Defenses? No: that's for the blue team!
SLIDE 21

Red Queen Dynamics

"...it takes all the running you can do, to keep in the same place. If you want to get somewhere else, you must run at least twice as fast as that!"

Lewis Carroll, Through the Looking-Glass

SLIDE 23

Risk estimation

AI Risk = Severity x Likelihood

SLIDE 24

Risk estimation

AI Risk = Severity x Likelihood

Severity factors:
  • Core metrics for your company
  • Financial
  • Data leakage, privacy
  • PR
  • Human
  • Mitigation cost, response time
  • ...
SLIDE 25

Risk estimation

AI Risk = Severity x Likelihood

Likelihood factors:
  • Discoverability
  • Implementation cost / Feasibility
  • Motivation
  • ...
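The Severity x Likelihood formula can be turned into a tiny triage helper. A hedged sketch — the scenarios and their 1-5 scores below are invented purely for illustration:

```python
# Hypothetical attack scenarios with illustrative (made-up) 1-5 scores.
scenarios = {
    "adversarial noise evades a content classifier": (4, 4),  # (severity, likelihood)
    "poisoning of a training-data pipeline": (5, 2),
    "synthetic profile photos created at scale": (3, 5),
}

def prioritize(scenarios):
    """Rank scenarios by risk = severity x likelihood, highest first."""
    return sorted(
        ((name, sev * lik) for name, (sev, lik) in scenarios.items()),
        key=lambda item: item[1],
        reverse=True,
    )

for name, risk in prioritize(scenarios):
    print(f"risk={risk:2d}  {name}")
```

Note how the multiplicative score demotes the highest-severity item (poisoning) because its likelihood is low: that is exactly the trade-off the formula encodes.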
SLIDE 27

A first (real) example

This is "objectionable content" (99%)

SLIDE 28

A first (real) example

This is safe content (95%)

SLIDE 29

Abuses

Maximum speed 60 MPH

Eykholt et al. "Robust Physical-World Attacks on Deep Learning Visual Classification", 2018.

SLIDE 30

Tabassi et al., "A Taxonomy and Terminology of Adversarial Machine Learning", 2019.

SLIDE 32

Tabassi et al., "A Taxonomy and Terminology of Adversarial Machine Learning", 2019.
Sitawarin et al., "DARTS: Deceiving Autonomous Cars with Toxic Signs", 2018.

SLIDE 33

Tabassi et al., "A Taxonomy and Terminology of Adversarial Machine Learning", 2019.
Wu et al., "Making an Invisibility Cloak: Real World Adversarial Attacks on Object Detectors", 2020.

SLIDE 34

Tabassi et al., "A Taxonomy and Terminology of Adversarial Machine Learning", 2019.

Original

Alberti et al., "Are You Tampering With My Data?", 2018.

SLIDE 36

Attacking dataset biases

De Vries et al., "Does Object Recognition Work for Everyone?", 2019.

SLIDE 38

Attacking dataset biases

De Vries et al., "Does Object Recognition Work for Everyone?", 2019.

Geographical distribution of classification accuracy

SLIDE 39

Tabassi et al., "A Taxonomy and Terminology of Adversarial Machine Learning", 2019.

Original vs. poisoned samples

Alberti et al., "Are You Tampering With My Data?", 2018.

SLIDE 41

Misuses

SLIDE 42

Example case: Synthetic people

Karras et al., "A Style-Based Generator Architecture for Generative Adversarial Networks", 2019.
Karras et al., "Analyzing and Improving the Image Quality of StyleGAN", 2020.

StyleGAN

Disclaimer: None of these individuals exist!

SLIDE 43

Example case: Synthetic people

Plenty of potential good uses:

  • Creative purposes
  • Virtual characters
  • Semantic face editing

Karras et al., "A Style-Based Generator Architecture for Generative Adversarial Networks", 2019.
Karras et al., "Analyzing and Improving the Image Quality of StyleGAN", 2020.

Smile editing

Shen et al. "Interpreting the Latent Space of GANs for Semantic Face Editing", 2020.

Disclaimer: None of these individuals exist!

SLIDE 45

Example case: Synthetic people

Karras et al., "A Style-Based Generator Architecture for Generative Adversarial Networks", 2019.
Karras et al., "Analyzing and Improving the Image Quality of StyleGAN", 2020.

Disclaimer: None of these individuals exist!

Potentially "easy" to spot:

  • Generator residuals (in the image)
  • Patterns in the frequency domain

Wang et al. "CNN-generated images are surprisingly easy to spot... for now", 2020.
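One way such frequency-domain patterns can be probed is by checking where an image's spectral energy sits. A minimal sketch (the checkerboard "residue" below is a crude, invented stand-in for a real generator's upsampling artifact, not an actual StyleGAN output):

```python
import numpy as np

def high_freq_energy_ratio(img, band=8):
    """Fraction of spectral energy outside a central low-frequency band.

    Upsampling layers in many image generators leave periodic residues
    that appear as off-center peaks in the 2D Fourier spectrum."""
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    c = img.shape[0] // 2
    low = spec[c - band:c + band, c - band:c + band].sum()
    return 1.0 - low / spec.sum()

n = 64
yy, xx = np.mgrid[0:n, 0:n]
# A smooth, natural-looking blob: energy concentrated at low frequencies.
smooth = np.exp(-((xx - n / 2) ** 2 + (yy - n / 2) ** 2) / (2 * (n / 4) ** 2))
# The same blob plus a faint checkerboard at the Nyquist frequency,
# mimicking a generator's periodic upsampling residue.
fake = smooth + 0.05 * (-1.0) ** (xx + yy)

print(high_freq_energy_ratio(smooth) < high_freq_energy_ratio(fake))  # True
```

Real detectors along the lines of Wang et al. learn far subtler cues, but the principle is the same: synthetic images tend to carry extra energy in places natural images do not.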

SLIDE 47

Example case: Synthetic people

Disclaimer: None of these individuals exist!

Andrew Waltz Katie Jones Matilda Romero

"Real" profile pictures from fake social media users

SLIDE 49

Example case: Synthetic people

Disclaimer: None of these individuals exist!

Carlini and Farid "Evading Deepfake-Image Detectors with White- and Black-Box Attacks", 2020.

87% Fake + adversarial noise (magnified x1000) = 1% Fake

SLIDE 51

Example case: DeepFakes

Pairwise: swap the faces of two individuals; the face of person A is placed on the body of person B. Requires many photos of both A and B.

Identity-free: with only a few reference photos of person A, place their face onto any other person. Many methods use GANs.

SLIDE 53

Prevention

SLIDE 54

Ask the experts

Example - DFDC competition

SLIDE 56

Ask the experts

Example - DFDC competition - Dataset

SLIDE 61

Domain gap + Distribution shift

Figure: the test distribution you constructed to validate your algorithm, your algorithm's goal, and the real distribution.
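The gap between the distribution you validated on and the real one can be simulated in a few lines. A toy sketch with synthetic "detector scores" (all three distributions are invented for illustration; real score distributions are far messier):

```python
import numpy as np

rng = np.random.default_rng(0)

# Detection scores for fakes under two distributions: the dev set you
# built, and the harder fakes actually found in the wild.
real_scores = rng.normal(0.0, 1.0, 10_000)       # real videos
fake_dev_scores = rng.normal(3.0, 1.0, 10_000)   # fakes in your test set
fake_wild_scores = rng.normal(1.0, 1.0, 10_000)  # fakes in the real world

threshold = 1.5  # tuned on the dev set (midpoint of the two dev means)

dev_recall = (fake_dev_scores > threshold).mean()
wild_recall = (fake_wild_scores > threshold).mean()
print(f"dev recall: {dev_recall:.2f}, in-the-wild recall: {wild_recall:.2f}")
```

The threshold that looks excellent on the dev set misses most of the shifted, in-the-wild fakes: the algorithm optimized for the distribution you constructed, not the real one.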

SLIDE 62

Domain gap + Distribution shift

Dolhansky et al., "The DeepFake Detection Challenge Dataset", https://arxiv.org/abs/2006.07397

SLIDE 66

Domain gap + Distribution shift

Dolhansky et al., "The DeepFake Detection Challenge Dataset", https://arxiv.org/abs/2006.07397

(and know your metrics!)

In general, classification metrics cannot tell the whole story for detection problems. Detecting DeepFakes in a large pool of real videos is a problem with extreme class imbalance: even with an extremely small false-positive rate (which accuracy alone does not capture), the detector will flag far more real videos than actual DeepFakes.
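This base-rate effect is easy to quantify. A back-of-the-envelope sketch with illustrative numbers (not DFDC figures):

```python
def flag_precision(prevalence, tpr, fpr):
    """Precision of a detector: of everything flagged, how much is really fake?

    tpr and fpr are per-class rates; prevalence is the fraction of
    DeepFakes in the pool being scanned."""
    tp = prevalence * tpr
    fp = (1.0 - prevalence) * fpr
    return tp / (tp + fp)

# Suppose 1 in 10,000 videos is a DeepFake and the detector catches 95%
# of fakes with only a 0.1% false-positive rate (illustrative numbers).
p = flag_precision(prevalence=1e-4, tpr=0.95, fpr=1e-3)
print(f"{p:.1%} of flagged videos are actually fake")
```

Under these assumptions fewer than one in ten flagged videos is a real DeepFake, even though the detector's accuracy on a balanced benchmark would look superb.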

SLIDE 69

A practical case: Risk-a-thons

  • What is a risk-a-thon? Why is it necessary?
  • For DeepFakes detection:
  • Generalization attacks
  • Adversarial noise
  • Sub-population attacks (burns, vitiligo, skin conditions, ...)
  • Make-up, scarves, hats, etc.
SLIDE 71

Open vs Closed sourcing

Pros: protection that is only as good as how well you can keep it secret. Cons: underestimation of the adversarial agent.

Neekhara et al. "Adversarial Deepfakes: Evaluating Vulnerability of Deepfake Detectors to Adversarial Examples", 2020.

Open source DeepFake detectors: XceptionNet and MesoNet

SLIDE 72

Reaction

Duct tape fix on Apollo 17 mission

SLIDE 73-76

Mitigation

  • Sometimes, being preventive about every potential adversity is unfeasible!
  • Define mitigations for the riskiest (unaddressed) scenarios
  • Build defensive systems that are able to rapidly incorporate new adversarial samples, even if there are few of them
  • Define coordination strategies (if possible) to mitigate potential AI-centric attacks across multiple surfaces

Yang et al., "One-Shot Domain Adaptation For Face Generation", 2020.

SLIDE 77

Conclusions

SLIDE 78

Conclusions

  • Assume an adversarial mindset when developing systems built on top of AI.
  • Understand your risk manifold, quantify it, and make informed decisions to prioritize defenses and mitigation strategies.
  • The scope of an AI Red Team is very broad; focus on the areas relevant to your industry.
  • Stress test mercilessly. Develop a strategy to convince stakeholders of the value of being ready for a worst-case scenario.
  • The more you sweat in training, the less you bleed in battle.
SLIDE 79

Cristian Canton (@cristiancanton) Research Manager (AI Red Team), Facebook AI

Thanks! Q&A