Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
Mathias Lécuyer, with Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu
Machine Learning (ML) introduces a dangerous double standard for data protection

Example: messaging app
[Diagram: a growing database of messages, likes, and clicks feeds two paths. Traditional code accesses a user's messages through an API, per access-control restrictions. An ML platform (e.g. TFX) trains recommendation, auto-complete, and ad-targeting models, whose models and/or predictions are based on everyone's messages, likes, and clicks.]

ML should only capture general trends from the data, but it often captures specific information about individual entries in the dataset:
- Language models over users' emails leak secrets. (Carlini+ '18)
- Recommenders leak information across users. (Calandrino+ '11)
- Membership in a training set can be inferred through prediction APIs. (Shokri+ '17)

- Making individual training algorithms Differentially Private (DP) is good but insufficient, because old data is reused many times.
- No system exists for managing multiple DP training algorithms to enforce a global DP guarantee.
Can we make Differential Privacy practical for ML applications?
Sage:
- Enforces a global (εg, δg)-DP guarantee across all models ever released from a growing database.
- Tackles in practical ways two difficult DP challenges: 1. "running out of budget" and 2. the "privacy-utility tradeoff."

[Diagram: Sage access control mediates between the growing database (messages, likes, clicks...) and the ML platform (e.g. TFX), enforcing the global (εg, δg)-DP guarantee across all released (ε, δ)-DP models.]
Outline
- Motivation
- Differential Privacy
- Two practical challenges
- Sage design
- Evaluation
Differential Privacy (DP) (Dwork+ '06)
- Developed to allow privacy-preserving statistical analyses on sensitive datasets (e.g., census, drug purchases, …).
- The first (and only) rigorous definition of privacy suitable for this use case.
Definition
- A randomized computation f: D → O is (ε, δ)-DP if, for any pair of datasets D and D' differing in one entry, and for any output set S ⊆ O:

  P(f(D) ∈ S) ≤ e^ε · P(f(D') ∈ S) + δ

- DP is a stability constraint on computations running on datasets: it requires that no single data point in an input dataset has a significant influence on the output.
- To achieve stability, randomness is added into the computation.
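To make the definition concrete, here is a minimal sketch (illustrative, not Sage code) of the classic Laplace mechanism: a counting query has sensitivity 1 (changing one entry changes the count by at most 1), so adding Laplace(1/ε) noise satisfies the inequality above with δ = 0.

```python
import numpy as np

def dp_count(data, predicate, epsilon, rng):
    """(epsilon, 0)-DP count of the records matching `predicate`.

    Changing one entry changes the true count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon makes the output (epsilon, 0)-DP.
    """
    true_count = sum(1 for x in data if predicate(x))
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon -> more noise -> more privacy, less utility.
rng = np.random.default_rng(0)
noisy = dp_count([0, 1, 1, 0, 1], lambda x: x == 1, epsilon=0.1, rng=rng)
```

The noisy answer is unbiased, so repeated runs average out to the true count of 3, while any single release hides whether a given user clicked.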
DP in ML
- Approach: make training algorithms DP.
- This prevents membership inference and reconstruction attacks (Steinke-Ullman '14; Dwork+ '15; Carlini+ '18).
- DP versions exist for most ML training algorithms:
  - Stochastic gradient descent (SGD) (Abadi+ '16, Yu+ '19).
  - Various regressions (Chaudhuri+ '08, Kifer+ '12, Nikolaenko+ '13, Talwar+ '15).
  - Collaborative filtering (McSherry+ '09).
  - Language models (McMahan+ '18).
  - Feature and model selection (Chaudhuri+ '13, Smith+ '13).
  - Model evaluation (Boyd+ '15).
- Tensorflow/privacy implements several of these algorithms (McMahan+ '19).
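The core of DP-SGD (Abadi+ '16) fits in a few lines. This is a minimal NumPy sketch of one update step (not the tensorflow/privacy implementation): clip each example's gradient to bound any single entry's influence, average, then add Gaussian noise calibrated to the clipping norm.

```python
import numpy as np

def dp_sgd_step(w, per_example_grads, lr, clip_norm, noise_mult, rng):
    """One DP-SGD update: per-example clipping + Gaussian noise.

    Clipping bounds each example's influence on the update to clip_norm;
    the Gaussian noise (scaled by noise_mult) is what yields the
    (eps, delta)-DP guarantee, accounted across steps.
    """
    clipped = [g * min(1.0, clip_norm / max(np.linalg.norm(g), 1e-12))
               for g in per_example_grads]
    avg_grad = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, noise_mult * clip_norm / len(clipped), size=w.shape)
    return w - lr * (avg_grad + noise)
```

With noise_mult = 0 this reduces to ordinary clipped SGD; increasing noise_mult buys a tighter ε at the cost of utility.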
Challenge 1 - Running out of privacy budget

[Diagram: an ML platform (e.g. TFX) trains (ε, δ)-DP models on a fixed dataset under a global (εg, δg)-DP guarantee; the privacy loss accumulates over time as models are released.]

Most DP work focuses on a fixed-database model:
- Each model consumes some privacy budget.
- When the budget is exhausted, the data cannot be used anymore: the system can "run out of budget".
Challenge 2 - Privacy/utility trade-off

[Plots: MSE (x10^-3, 2.0-4.0) vs. training samples (10,000 to 100,000,000) for a linear regression and a deep neural network, each trained non-DP, DP (ε=1.0), and DP (ε=0.1). More privacy (smaller ε) means less utility at a given number of training samples; more data recovers utility.]
Sage block composition (challenge 1)

Key realization: ML platforms operate on a growing database.

[Diagram: the growing database is split into blocks D1, D2, ..., Dk; models 1-3 make adaptive choices of data blocks and privacy parameters (ε1, ε2, ε3) under Sage access control and the global (εg, δg)-DP guarantee.]

Interaction model:
- Split the growing database into time-based blocks.
- Models can adaptively combine blocks to form larger datasets.
- Account for privacy loss only against the blocks used by each model.
- Models can influence future data and privacy budgets.

Theorem: |PrivacyLoss(stream)| ≤ max_k |PrivacyLoss(Dk)|

Why is this important?
- Controlling each block's privacy loss controls the global privacy loss.
- New blocks arrive with zero loss and constantly renew the budget.
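A minimal sketch of block-level privacy accounting (class and method names are hypothetical, not Sage's actual API): each block tracks its own spent budget, a training run charges only the blocks it reads, and by the theorem above the global guarantee holds as long as no single block exceeds (εg, δg).

```python
class BlockAccountant:
    """Per-block privacy accounting under a global (eps_g, delta_g) cap.

    Hypothetical sketch of Sage-style block composition: privacy loss is
    tracked per time-based block, and a training run is admitted only if
    every block it touches stays within the global budget (basic
    composition: losses charged to the same block add up).
    """

    def __init__(self, eps_g, delta_g):
        self.eps_g, self.delta_g = eps_g, delta_g
        self.spent = {}  # block id -> [eps_spent, delta_spent]

    def new_block(self, block_id):
        # New blocks arrive with zero privacy loss, renewing the budget.
        self.spent[block_id] = [0.0, 0.0]

    def try_charge(self, block_ids, eps, delta):
        """Charge (eps, delta) against every requested block, or refuse."""
        for b in block_ids:
            e, d = self.spent[b]
            if e + eps > self.eps_g or d + delta > self.delta_g:
                return False  # this block would run out of budget
        for b in block_ids:
            self.spent[b][0] += eps
            self.spent[b][1] += delta
        return True
```

Note the key property: a model that uses only recent blocks can still train after older blocks are exhausted, which is exactly what a traditional whole-database accountant cannot offer.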
Iterative training (challenge 2)

[Diagram: under Sage access control and the global (εg, δg)-DP guarantee, the ML platform (e.g. TFX) trains an (ε, δ)-DP model, asks "Good?", and either releases the model or retries with more data and/or privacy budget.]

- Adaptively trains on growing data and/or privacy budgets.
- Releases a model when, with high probability, its accuracy surpasses a target.
- Accounts for the impact of DP noise in TFX-evaluate to give a high-probability assessment of model accuracy.

Statistical test for evaluation: P(acc < τ) ≤ η over the sampling of the test set and the DP noise.

Each iteration splits its budget between (ε/2, δ/2)-DP model training and (ε/2, δ/2)-DP model validation.
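A minimal sketch of a DP-aware release test in the spirit of the statistical test above (function and parameter names are hypothetical, not Sage's actual API): lower-bound the true accuracy from a noisy DP estimate by budgeting the error probability η across both the test-set sampling and the Laplace noise, and release only if the lower bound clears the target τ.

```python
import math, random

def dp_release_test(correct, n_test, epsilon, tau, eta, rng=None):
    """Release only if P(true accuracy < tau) <= eta, accounting for
    both test-set sampling error and the DP noise on the estimate.

    Hypothetical sketch: eta is split evenly between a Laplace-noise
    tail bound and a Hoeffding sampling bound.
    """
    rng = rng or random.Random(0)
    # (epsilon, 0)-DP accuracy estimate: the correct-prediction count has
    # sensitivity 1; a difference of two Exp(epsilon) draws is Laplace(1/epsilon).
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    noisy_acc = (correct + noise) / n_test
    # Laplace tail: P(|noise| > t) = exp(-epsilon * t), solved at level eta/2.
    noise_err = math.log(2.0 / eta) / (epsilon * n_test)
    # Hoeffding bound on test-set sampling, also at level eta/2.
    sample_err = math.sqrt(math.log(2.0 / eta) / (2.0 * n_test))
    return noisy_acc - noise_err - sample_err >= tau
```

Because both error terms shrink with n_test, a model that narrowly misses the target can simply be retried later on a larger (grown) test set, which is what makes the retry loop converge.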
Sage Architecture

[Diagram: a (dataset, ε, δ) request passes through traditional access control to fetch the dataset; the ML platform (e.g. TFX) runs (ε/2, δ/2)-DP model training and (ε/2, δ/2)-DP model validation under Sage access control and the global (εg, δg)-DP guarantee; models that meet the quality goal w.h.p. are released, others are retried or rejected on timeout.]
Evaluation:
1. Benefits of block composition versus traditional DP composition.
2. Importance of iterative training and DP-aware performance tests.
3. Continuous operation on multiple models and a growing database.
- 1. Benefits of block composition versus traditional DP composition

[Plot: required sample size (data points used to reach target, 10,000 to 100,000,000) vs. MSE target (x10^-3, 2-7; lower is a better model), for traditional DP composition and Sage.]
- 2. Importance of iterative training and DP-aware performance tests

Test methodology:           Non DP   DP + UB   Sage
Failure rate at 1% proba.:  0.2%     1.7%      0.3%

[Plot: required sample size (10,000 to 100,000,000) vs. MSE target (x10^-3, 2-7), for the non-DP, DP + UB, and Sage test methodologies.]
- 3. Continuous operation on multiple models and a growing database

[Plot: average model release time (in blocks, 25-100) vs. arrival rate per block (0.1-0.7), for traditional DP composition and Sage.]
Summary
- The DP literature has mostly focused on individual ML algorithms running on static databases (which don't incorporate new data).
- ML workloads operate on growing databases: models incorporate new data and (adaptively) reuse old data.
- Sage is the first to adapt DP theory and practice to ML workloads on growing databases, for data protection.
- This opens an exciting design space for efficient privacy-resource allocation!