SLIDE 1

Perceptual Ad-Blocking: Meet Adversarial Machine Learning

Florian Tramèr
Palo Alto Networks, February 22nd 2019
Joint work with Pascal Dupré, Gili Rusak, Giancarlo Pellegrino and Dan Boneh

SLIDE 2

The Future of Ad-Blocking?

easylist.txt …markup… …URLs…

???

This is an ad

Human distinguishability of ads

> Legal requirement (U.S. FTC, EU E-Commerce)
> Industry self-regulation on ad-disclosure

SLIDE 3

Perceptual Ad-Blocking

§ Ad Highlighter by Storey et al.

> Visually detects ad-disclosures
> Traditional computer vision techniques
> Simplified version implementable in Adblock Plus

§ Sentinel by Adblock Plus

> Locates ads in Facebook screenshots using neural networks
> Not yet deployed

SLIDE 5

How Secure is Perceptual Ad-Blocking?

Jerry uploads malicious content … so that Tom’s post gets blocked

SLIDE 6

Outline

§ Perceptual ad-blockers: how they work
§ Attacking perceptual ad-blockers
§ Why defending is hard

SLIDE 8

How does a Perceptual Ad-Blocker Work?

[Diagram: a webpage (https://www.example.com) containing an ad with an ad disclosure passes through the pipeline: Data Collection and Training; Page Segmentation; Classification; Action]

Page segmentation:

Ø Element-based (e.g., find all <img> tags) [Storey et al. 2017]
Ø Frame-based (segment rendered webpage into “frames”)
Ø Page-based (unsegmented screenshots à la Sentinel)

Classification: template matching, OCR, DNNs, object detector networks
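
The stages named on this slide (page segmentation, classification, action) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the code of any ad-blocker mentioned in the talk: `Frame`, `segment_page`, `classify_frame`, and `run_ad_blocker` are hypothetical names, and the classifier is a toy keyword rule standing in for template matching, OCR, or a DNN.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """A rectangular page region plus the text extracted from it (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int
    text: str


def segment_page(frames: List[Frame]) -> List[Frame]:
    """Segmentation stand-in: keep regions large enough to be worth classifying."""
    return [f for f in frames if f.width >= 50 and f.height >= 50]


def classify_frame(frame: Frame) -> bool:
    """Toy classifier: flag a frame whose text contains an ad-disclosure cue."""
    cues = ("sponsored", "adchoices", "advertisement")
    return any(cue in frame.text.lower() for cue in cues)


def run_ad_blocker(frames: List[Frame],
                   classifier: Callable[[Frame], bool] = classify_frame) -> List[Frame]:
    """Segmentation -> classification -> action (here: return the frames to hide)."""
    return [f for f in segment_page(frames) if classifier(f)]


page = [
    Frame(0, 0, 728, 90, "Sponsored: buy now"),
    Frame(0, 100, 728, 400, "Article text"),
    Frame(0, 0, 10, 10, "Advertisement"),  # too small, dropped by segmentation
]
blocked = run_ad_blocker(page)
print([f.text for f in blocked])
```

Swapping `classifier` for a different callable is where an element-based, frame-based, or page-based design would differ in practice.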

SLIDE 9

Building a Page-Based Perceptual Ad-Blocker

§ Sentinel is not yet deployed, so we rolled our own
§ Let’s aim bigger than just Facebook!

(data collection on FB is a pain / privacy issue anyway...)

§ We trained an object detector neural network (YOLO-v3)
§ News websites from all G20 nations

> Use filter lists to create a labelled dataset for training
> Crop & replace ads for data augmentation (increase data diversity)
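
The "crop & replace" augmentation can be sketched with NumPy. This is a rough illustration, not the authors' training code: `replace_ad` is a hypothetical helper, the screenshot and ad are synthetic arrays, and the nearest-neighbour resize is an assumption made to keep the example dependency-free.

```python
import numpy as np


def replace_ad(page, box, new_ad):
    """Paste `new_ad` into `page` at `box` = (x, y, w, h), resizing by nearest neighbour."""
    x, y, w, h = box
    rows = np.arange(h) * new_ad.shape[0] // h   # source row for each target row
    cols = np.arange(w) * new_ad.shape[1] // w   # source column for each target column
    out = page.copy()
    out[y:y + h, x:x + w] = new_ad[rows][:, cols]
    return out


rng = np.random.default_rng(0)
page = np.zeros((600, 800, 3), dtype=np.uint8)   # stand-in for a page screenshot
ad_box = (100, 50, 300, 250)                     # stand-in for a filter-list-derived label
other_ad = rng.integers(0, 256, size=(125, 150, 3), dtype=np.uint8)

augmented = replace_ad(page, ad_box, other_ad)
print(augmented.shape)
```

The bounding-box label is unchanged by the swap, so each labelled screenshot can yield several training samples with different ad content.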

SLIDE 10

Our Ad-Detector in Action

Video taken from 5 websites not used during training

SLIDE 11

Outline

§ Perceptual ad-blockers: how they work
§ Attacking perceptual ad-blockers
§ Why defending is hard

SLIDE 12

The Current State of ML

ML works well on average (as long as there is no adversary)

≠

ML works well on adversarial data

SLIDE 13

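Adversarial Examples

Szegedy et al., 2014; Goodfellow et al., 2015

[Figure: an image and its adversarially perturbed copy look nearly identical (≈); perturbation magnitude 2/255]

§ How?

> Training ⟹ “tweak model parameters such that f(image) = panda”
> Attacking ⟹ “tweak input pixels such that f(image) = gibbon”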
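
The "tweak input pixels" step can be sketched as a single fast-gradient-sign step, in the style of Goodfellow et al., 2015. This is a toy under stated assumptions: the model is a random logistic-regression classifier rather than a real image network, the bias is chosen so the clean input sits near the decision boundary, and the step size reuses the 2/255 shown on the slide.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 1024                                  # number of "pixels" in the toy input
w = rng.normal(size=d)                    # fixed (already trained) model weights
x = rng.uniform(0.1, 0.9, size=d)         # clean input, pixel values in [0, 1]
b = 0.5 - w @ x                           # bias chosen so the clean logit is +0.5


def predict(v):
    """Return 1 if the logit is positive, else 0."""
    return int(w @ v + b > 0)


def fgsm(v, label, eps):
    """One fast-gradient-sign step that increases the logistic loss for `label`."""
    p = 1.0 / (1.0 + np.exp(-(w @ v + b)))      # model confidence for class 1
    grad = (p - label) * w                      # d(loss)/d(input) for logistic loss
    return np.clip(v + eps * np.sign(grad), 0.0, 1.0)


eps = 2 / 255
x_adv = fgsm(x, label=1, eps=eps)
print(predict(x), predict(x_adv), np.abs(x_adv - x).max())
```

No pixel moves by more than `eps`, yet the many small changes all push the logit in the same direction, which is why the prediction can flip.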

SLIDE 14

(Meaningful) Defenses

SLIDE 15

Adversarial Examples for Perceptual Ad-Blockers

SLIDE 16

Attacking Page-Based Classifiers

§ Challenge 1: Input of model is a webpage screenshot

> Attack must be implemented in HTML

§ Challenge 2: Adversary can’t control or predict the full input

> E.g., publishers can’t modify or know contents of ad frames

[Diagram: page elements (<div>, <p>, <img>, <div>, <img>) are rendered and fed to the classifier]

SLIDE 17

Ad-Block Evasion

§ Goal: Make ads unrecognizable by ad-blocker
§ Adversary = Website publisher

> Abilities:
Ø Inspect ad-blocker classifier(s) offline
Ø Change page DOM, CSS, JavaScript...
Ø Cannot modify content of ad frames

§ Adversary = Ad network (or advertisers)

> Abilities:
Ø Inspect ad-blocker classifier(s) offline
Ø Arbitrary changes to content of ad frames

SLIDE 18

Evasion 1: Universal Transparent Overlay

§ Web publisher perturbs every rendered pixel

> Use HTML tiling to minimize perturbation size (20 KB)

Ø 100% success rate on 20 webpages not used to create the overlay
Ø The attack is universal: the overlay is computed once and works for all (or most) websites
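
The tiling idea can be sketched as follows: a small pattern is repeated to cover the page and blended at low opacity over every pixel. This is a hedged illustration only. The tile here is random noise, whereas the attack on the slide optimizes the overlay against the detector; `apply_tiled_overlay`, the 64x64 tile size, and `alpha=0.01` are assumptions, not values from the talk.

```python
import numpy as np


def apply_tiled_overlay(screenshot, tile, alpha):
    """Alpha-blend a small tile, repeated to cover the page, over a screenshot."""
    h, w = screenshot.shape[:2]
    reps_y = -(-h // tile.shape[0])            # ceiling division
    reps_x = -(-w // tile.shape[1])
    overlay = np.tile(tile, (reps_y, reps_x, 1))[:h, :w]
    blended = (1.0 - alpha) * screenshot + alpha * overlay
    return np.clip(blended, 0.0, 1.0)


rng = np.random.default_rng(0)
screenshot = rng.uniform(size=(416, 416, 3))   # stand-in for a rendered page
tile = rng.uniform(size=(64, 64, 3))           # stand-in for the optimized pattern
perturbed = apply_tiled_overlay(screenshot, tile, alpha=0.01)
print(np.abs(perturbed - screenshot).max())
```

Only the small tile has to be shipped to the browser, which is what keeps the perturbation payload small.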

SLIDE 19

Evasion 2: Perturbed Ads

§ Ad network perturbs served ads

> Creating a single perturbation that works for every ad on every website is hard
> Target a specific domain

Ø 100% success rate for ads served on BBC.com
Ø No CSS: Ad image is directly perturbed on the server
Ø The perturbation is universal: It works for all ads (on this domain)

[Figure: page screenshots, “Original” vs. “With adversarial ad”]

Alternative attack

Ø Publisher perturbs background below ad frame
Ø 100% success in evading ads

SLIDE 20

Ad-Block Detection

§ Goal: Trigger ad-blocker on “honeypot” content

> Detect ad-blocking in client-side JavaScript or on server
> Applicability of these attacks depends on ad-blocker type

§ Adversary = Website publisher

> Abilities:
Ø Inspect ad-blocker classifier(s) offline
Ø Change page DOM, CSS
Ø Use client-side JavaScript to detect DOM changes

SLIDE 21

Detection: Perturb fixed page layout

§ Publisher adds honeypot in page-region with fixed layout

> E.g., page header

[Figure: page screenshots, “Original” vs. “With honeypot header”]

SLIDE 22

New Threats: Privilege Abuse

§ Ad-block evasion & detection is a well-known arms race. But there’s more!

Jerry uploads malicious content … so that Tom’s post gets blocked

What happened?

Ø Object detector model generates box predictions from full page inputs
Ø Content from one user can affect predictions anywhere on page
Ø Model’s segmentation is not aligned with web-security boundaries

SLIDE 23

Outline

§ Perceptual ad-blockers: how they work
§ Attacking perceptual ad-blockers
§ Why defending is hard

SLIDE 24

A Challenging Threat Model

Ø Adversary has white-box access to ad-blocker
Ø Adversary can exploit False Negatives and False Positives in classification pipeline
Ø Adversary prepares attacks offline ⇔ the ad-blocker must defend against attacks in real-time in the user’s browser
Ø Adversary can take part in crowd-sourced data collection

[Diagram: the ad-blocker pipeline (Data Collection and Training; Page Segmentation; Classification; Action), annotated with attacks: Data Poisoning, DOM Obfuscation, Resource Exhaustion, Adversarial Examples, Privilege Abuse]

SLIDE 25

Defense Strategy 1: Obfuscate the Model

§ Attacks are easy if the adversary has access to the ML model

> Hide model from adversary?

§ Obfuscate the ad-blocker?

> It isn’t hard to create adversarial examples for black-box classifiers

§ Randomize the ad-blocker?

> Deploy different models
Ø Adversarial examples that work against multiple models
> Randomly change page before classifying
Ø Adversarial examples robust to random transformations
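
Why deploying several models may not help can be sketched with a toy ensemble attack: one perturbation follows the summed input gradients of all models. The assumptions are heavy and should be read as such: the three "models" are independent random linear classifiers, not real ad-detectors, and the bias of each is set so the clean input sits near its decision boundary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, eps = 1024, 3, 2 / 255
x = rng.uniform(0.1, 0.9, size=d)               # clean input, values in [0, 1]
W = rng.normal(size=(k, d))                     # k independent toy linear models
b = 0.5 - W @ x                                 # each model gives the clean input logit +0.5


def predictions(v):
    """Class (1 or 0) assigned to input `v` by each of the k models."""
    return (W @ v + b > 0).astype(int)


# For logistic loss with label 1, each model's input gradient is a negative
# multiple of its weight vector; summing over models gives one shared direction.
p = 1.0 / (1.0 + np.exp(-(W @ x + b)))
grad = ((p - 1.0)[:, None] * W).sum(axis=0)
x_adv = np.clip(x + eps * np.sign(grad), 0.0, 1.0)

print(predictions(x), predictions(x_adv))
```

The same pattern, averaging gradients over random page transformations instead of over models, is the usual way to build perturbations that survive randomized preprocessing.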

SLIDE 26

Defense Strategy 2: Anticipate and Adapt

§ If ad-blocker is attacked (evasion or detection), collect adversarial samples and re-train the model

> Or train on adversarial examples proactively

§ This is called Adversarial Training

> New arms-race: the adversary finds new attacks and ad-blocker re-trains
> Mounting a new attack is much easier than updating the model
> On-going research: so far the adversary always wins!

[Figure annotations: “1600 citations, 800 in 2018!”; “Broke 7 defenses, a few days after they were accepted for publication”]
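
The attack-then-retrain loop can be sketched on a toy problem. This is a minimal illustration, assuming a logistic-regression model on synthetic data and a fast-gradient-sign attacker; the data, `eps`, learning rate, and step count are all arbitrary choices, not settings from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, eps, lr = 200, 20, 0.1, 0.5
X = rng.normal(size=(n, d))
y = (X[:, 0] > 0).astype(float)           # toy labels: sign of the first feature
w = np.zeros(d)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm_batch(X, y, w, eps):
    """Worst-case L-infinity perturbation of each row for a linear model."""
    grad_x = (sigmoid(X @ w) - y)[:, None] * w[None, :]
    return X + eps * np.sign(grad_x)


for step in range(200):
    X_adv = fgsm_batch(X, y, w, eps)       # attacker's move against the current model
    X_train = np.vstack([X, X_adv])        # train on clean and adversarial samples
    y_train = np.concatenate([y, y])
    grad_w = X_train.T @ (sigmoid(X_train @ w) - y_train) / len(y_train)
    w -= lr * grad_w                       # defender's move: update the model

clean_acc = np.mean((X @ w > 0) == (y == 1))
adv_acc = np.mean((fgsm_batch(X, y, w, eps) @ w > 0) == (y == 1))
print(clean_acc, adv_acc)
```

Each loop iteration costs the defender a model update, while the attacker only needs one gradient computation, which mirrors the asymmetry noted on the slide.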

SLIDE 27

Defense Strategy 3: Simplify the Problem

§ Storey et al.: recognize ad-disclosures

> Simpler computer vision problem than full-page ad-detection
> Light-weight and mature techniques (OCR, perceptual hashing, SIFT)

§ Adversarial Examples still exist
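
As a concrete instance of perceptual hashing, here is a sketch of an average hash with Hamming-distance matching. It is one simple variant chosen for illustration, not necessarily the hash used by Storey et al.; the synthetic "logo", the 8x8 hash size, and the match threshold of 5 bits are assumptions.

```python
import numpy as np


def average_hash(gray, size=8):
    """Block-average a grayscale image down to size x size, then threshold at the mean."""
    h, w = gray.shape
    bh, bw = h // size, w // size
    blocks = gray[:bh * size, :bw * size].reshape(size, bh, size, bw)
    small = blocks.mean(axis=(1, 3))
    return (small > small.mean()).ravel()


def hamming(a, b):
    """Number of differing hash bits."""
    return int(np.count_nonzero(a != b))


rng = np.random.default_rng(0)
logo = np.zeros((64, 64))
logo[:, 32:] = 1.0                                   # toy disclosure logo: dark left, bright right
noisy = logo + rng.uniform(-0.05, 0.05, size=logo.shape)
inverted = 1.0 - logo

reference = average_hash(logo)
is_match = hamming(reference, average_hash(noisy)) <= 5      # threshold is an arbitrary choice
is_other = hamming(reference, average_hash(inverted)) <= 5
print(is_match, is_other)
```

The hash tolerates benign noise, but each bit is still a threshold on block averages, so an adversary who can shift those averages can in principle force a mismatch or a false match.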

SLIDE 28

Take Away

§ Emulating human-like detection of ads could be the end-game for ad-blockers
§ But very hard with current computer vision techniques

> Resisting adversarial examples is one of the most challenging open problems in ML security

§ Perceptual ad-blockers have to survive a strong threat model

> Evasion & detection with adversarial examples
> Privilege abuse attacks from arbitrary content providers
> Similar threats for other ML-based ad-blockers (e.g., AdGraph?)

Paper: http://arxiv.org/abs/1811.03194
Code: https://github.com/ftramer/ad-versarial

Ø Train a page-based ad-blocker
Ø Download pre-trained models
Ø Attack demos
