

slide-1
SLIDE 1

How Far Can Robust Learning Go?

Mohammad Mahmoody

based on joint works from NeurIPS-18, AAAI-19, ALT-19 with

Dimitrios Diochnos Saeed Mahloujifar

1

slide-2
SLIDE 2

2

slide-3
SLIDE 3

Success of Machine Learning

  • Machine learning (ML) has changed our lives
  • Health
  • Language processing
  • Finance/Economy
  • Vision and image classification
  • Computer Security
  • Etc.

Not primarily designed for adversarial contexts!

3

slide-4
SLIDE 4

4

slide-5
SLIDE 5

5

slide-6
SLIDE 6

Secure (Adversarially Robust) Machine Learning

  • Is achieving low risk still possible in the presence of malicious adversaries?
  • Subverting a spam filter by poisoning its training data [Nelson et al. 2008]
  • Evading PDF malware detectors [Xu et al. 2016]
  • Making image classifiers misclassify by adding small perturbations [Szegedy et al. 2014]

[Image: "Dog" → "Camel!"]

6

slide-7
SLIDE 7

Arms Race of Attacks vs. Defenses

  • A repeated cycle of new attacks followed by new defenses:

Nelson et al. 2008, Rubinstein et al. 2009, Kloft et al. 2010, Biggio et al. 2012, Xiao et al. 2012, Kloft et al. 2012, Biggio et al. 2014, Newell et al. 2014, Xiao et al. 2015, Mei et al. 2015, Burkard et al. 2017, Koh et al. 2017, Laishram et al. 2018, Munoz-Gonz et al. 2018, …

Wittel et al. 2004, Dalvi et al. 2004, Lowd et al. 2005, Globerson et al. 2006, Globerson et al. 2008, Dekel et al. 2010, Biggio et al. 2013, Szegedy et al. 2013, Srndic et al. 2014, Goodfellow et al. 2014, Kurakin et al. 2016, Sharma et al. 2017, Kurakin et al. 2016, Carlini et al. 2017, Papernot et al. 2017, Carlini et al. 2017, Tramer et al. 2018, Madry et al. 2018, Raghunathan et al. 2018, Sinha et al. 2018, Na et al. 2018, Gou et al. 2018, Dhillon et al. 2018, Xie et al. 2018, Song et al. 2018, Madry et al. 2018, Samangouei et al. 2018, Athalye et al. 2018, …

7

slide-8
SLIDE 8

8

slide-9
SLIDE 9

Are there inherent reasons enabling adversarial examples and poisoning attacks?

Candidate reason: Concentration of Measure!

9

slide-10
SLIDE 10

Are there inherent reasons enabling Polynomial-time attacks?

Candidate reason: Computational Concentration of Measure! Related to certain polynomial-time attacks on coin-tossing protocols.

10

slide-11
SLIDE 11

Talk Outline

  • 1a. Defining evasion attacks formally
  • 1b. Evasion attacks from measure concentration of instances
  • 2a. Defining poisoning attacks formally
  • 2b. Poisoning attacks from measure concentration of products
  • 3a. Poly-time attacks from computational concentration of products
  • 3b. Connections to attacks on coin-tossing protocols

11

slide-12
SLIDE 12

Talk Outline

  • 1a. Defining evasion attacks formally
  • 1b. Evasion attacks from measure concentration of instances
  • 2a. Defining poisoning attacks formally
  • 2b. Poisoning attacks from measure concentration of products
  • 3a. Poly-time attacks from computational concentration of products
  • 3b. Connections to attacks on coin-tossing protocols

12

slide-13
SLIDE 13

Evasion Attacks: Finding Adversarial Examples

  • Metric M over the instance space X
  • x̃ is "close" to x w.r.t. M, i.e., x̃ ∈ Ball_b(x) for small b
  • Standard risk:  Risk(h) = Pr_{x←D}[ h(x) ≠ c(x) ]
  • Error-region adversarial risk:
    AdvRisk_b(h) = Pr_{x←D}[ ∃ x̃ ∈ Ball_b(x) : h(x̃) ≠ c(x̃) ],  and  AdvRisk_0(h) = Risk(h)

[Diagram: a labeled example d = (x, c(x)) with x ← D is given to the learning algorithm, which outputs h; the adversary then perturbs the test input x to x̃.]

13

[Figure: the ball Ball_b(x) of radius b around x inside the instance space X, containing the perturbed point x̃.]
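As a toy illustration of the definition above (not part of the slides), the sketch below estimates standard risk and error-region adversarial risk on the Boolean hypercube with Hamming distance, using a brute-force search over the Hamming ball. The functions h and c are made-up Boolean functions; brute force only works for tiny n and b.

```python
import itertools
import random

n, b = 12, 2

def c(x):   # ground truth: majority of all n bits
    return int(sum(x) >= n // 2)

def h(x):   # hypothesis that ignores the last bit
    return int(sum(x[: n - 1]) >= n // 2)

def hamming_ball(x, radius):
    """All points within Hamming distance `radius` of x (exponential in radius; toy sizes only)."""
    for k in range(radius + 1):
        for idx in itertools.combinations(range(len(x)), k):
            y = list(x)
            for i in idx:
                y[i] ^= 1
            yield tuple(y)

def estimate_risks(samples=3000):
    risk = adv_risk = 0
    for _ in range(samples):
        x = tuple(random.getrandbits(1) for _ in range(n))   # x <- D (uniform on {0,1}^n)
        risk += h(x) != c(x)                                  # h(x) != c(x)
        adv_risk += any(h(y) != c(y) for y in hamming_ball(x, b))  # exists x~ in Ball_b(x)
    return risk / samples, adv_risk / samples

print(estimate_risks())  # AdvRisk_b(h) >= Risk(h), and it grows quickly with b
```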

slide-14
SLIDE 14

Comparing Definitions of Adversarial Examples

Given an instance x perturbed to a nearby x̃, the definitions differ in which label x̃ is compared against: the corrupted-inputs definition counts x̃ as adversarial when h(x̃) ≠ c(x) (the label of the original point), while the error-region definition requires h(x̃) ≠ c(x̃) (the true label of the perturbed point). Both notions are written out formally after the reference lists below.

Corrupted inputs

  • [Feige Mansour Schapire 15]
  • [Madry et al., 17]
  • [Feige Mansour Schapire 18]
  • [Attias Kontorovich Mansour 19]

Error region

  • [Diochnos Mahloujifar M 18]
  • [Gilmer et al., 18]
  • [Bubeck Price Razenshteyn 18]
  • [Degwekar Vaikuntanathan, 19]
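For reference (my phrasing, not on the slide), the two risks can be written side by side, with Ball_b and the rest of the notation taken from the previous slide:

```latex
\mathrm{AdvRisk}^{\mathrm{CI}}_{b}(h) \;=\; \Pr_{x \leftarrow D}\!\big[\exists\, \tilde{x} \in \mathrm{Ball}_b(x) :\; h(\tilde{x}) \neq c(x)\big]
\qquad \text{(corrupted inputs)}

\mathrm{AdvRisk}^{\mathrm{ER}}_{b}(h) \;=\; \Pr_{x \leftarrow D}\!\big[\exists\, \tilde{x} \in \mathrm{Ball}_b(x) :\; h(\tilde{x}) \neq c(\tilde{x})\big]
\qquad \text{(error region)}
```

The two coincide whenever c is constant over Ball_b(x) (i.e., far from the true decision boundary), but they can differ near it.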
slide-15
SLIDE 15

Adversarial Examples from Expansion of Error Region

  • Error region E = {x : h(x) ≠ c(x)}
  • Risk(h) = Pr[E]
  • AdvRisk_b(h) = Pr[E_b], where E_b is the b-expansion of E

[Figure: two classes A and B; the error region E and its b-expansion.]

Adversarial examples almost always exist if the b-expansion of E covers almost all inputs.

15

slide-16
SLIDE 16

Concentration of Measure

  • Metric probability space (M, D) over a set X
  • Example: the n-dimensional Gaussian with ℓ2 distance
  • b-expansion of a set S ⊆ X:   S_b = { x ∈ X : min_{s∈S} M(x, s) ≤ b }
  • For any set S of constant probability, Pr[S_b] converges to 1 very fast as b grows,
    i.e., Pr[S_b] ≈ 1 already for small b ≪ Diam_M(X)

[Figure: the set S inside X, its b-expansion S_b, and the radius b.]

16
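A toy numerical check of this phenomenon (my addition, not part of the slides): on the Boolean hypercube under Hamming distance, take S to be a Hamming ball around 0 with Pr[S] ≈ 1%; for this particular S the distance from x to S is simply max(0, |x| − t), so Pr[S_b] is easy to estimate by sampling only the Hamming weight |x| ~ Binomial(n, 1/2).

```python
import numpy as np

n = 10_000
t = int(n / 2 - 2.33 * np.sqrt(n) / 2)        # threshold giving Pr[|x| <= t] ~ 0.01

rng = np.random.default_rng(0)
weights = rng.binomial(n, 0.5, size=200_000)  # |x| for 200k samples x <- uniform on {0,1}^n
dist_to_S = np.maximum(0, weights - t)        # Hamming distance from each x to S = {|x| <= t}

for b in (0, 50, 100, 200, 300):
    print(f"b = {b:3d}  (b/n = {b/n:.3f})   Pr[S_b] ~ {(dist_to_S <= b).mean():.3f}")

# Pr[S] ~ 0.01, yet already for b on the order of sqrt(n) = 100..300 the expansion
# S_b covers almost the whole cube, while the diameter of the cube is n = 10,000.
```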

slide-17
SLIDE 17

Examples of Concentrated Distributions

  • Normal Lévy families are concentrated distributions [Lévy 1951]
  • with dimension and diameter n,
  • such that for any set S with Pr[S] = 0.01,
  • and for b ≈ √n, we have Pr[S_b] ≥ 0.99
  • Examples [Amir & Milman 1980], [Ledoux 2001]:
  • n-dimensional isotropic Gaussian with Euclidean distance
  • n-dimensional spheres with geodesic distance
  • Any product distribution with Hamming distance (e.g. uniform over Hypercube)
  • And many more…

17
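A concrete quantitative instance of the Gaussian example above (my addition, using the Gaussian isoperimetric inequality of Borell and Sudakov-Tsirelson for the n-dimensional standard Gaussian with Euclidean distance):

```latex
% If \Pr[S] = \Phi(a), then the b-expansion satisfies \Pr[S_b] \ge \Phi(a + b).
\Pr[S] = 0.01 = \Phi(-2.33)
\quad\Longrightarrow\quad
\Pr[S_b] \;\ge\; \Phi(-2.33 + b) \;\ge\; 0.99 \quad\text{for } b \ge 4.66 .
% A perturbation of Euclidean length about 4.66 suffices for any such S, while a
% typical sample has norm about \sqrt{n}: the required b does not grow with n at all.
```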

slide-18
SLIDE 18

Main Theorem 1: Adversarial Examples for Lévy Families

If (D, M) is a normal Lévy family with dimension and "typical norm" n, then the adversary can add "small" perturbations of size b ≈ √n and push any classifier with non-negligible original risk, Risk(h) ≈ 1/100, up to adversarial risk AdvRisk_b(h) ≈ 1.

[Diagram: the learner outputs h from labeled examples (x, c(x)) with x ← D; the adversary then perturbs the test input x to x̃.]

18

slide-19
SLIDE 19

Previous Work on Provable Evasion Attacks

  • Similar attacks using isoperimetric inequalities:
  • [Gilmer et al. 2017]: use the isoperimetric inequality on n-dimensional spheres
  • [Fawzi et al. 2018]: use the isoperimetric inequality on the Gaussian
  • [Diochnos, Mahloujifar, M 2018]: use the isoperimetric inequality on the hypercube
  • Our (normal Lévy) theorem generalizes these works as special cases and covers many more distributions.

19

slide-20
SLIDE 20

Talk Outline

  • 1a. Defining evasion attacks formally
  • 1b. Evasion attacks from measure concentration of instances
  • 2a. Defining poisoning attacks formally
  • 2b. Poisoning attacks from measure concentration of products
  • 3a. Poly-time attacks from computational concentration of products
  • 3b. Connections to attacks on coin-tossing protocols

20

slide-21
SLIDE 21

Poisoning Attacks: Definition

  • Hypothesis space H
  • Bad set H̃ ⊆ H: the "bad" hypotheses (e.g., those that give me the loan)
  • The adversary wants to change the training set S = (d_1, …, d_n) into a "close" (in Hamming distance) S̃ such that the learned hypothesis h̃ ∈ H̃
  • The adversary can depend on D and c (but not on h̃, which has not been produced yet)

[Diagram: training examples d_i = (x_i, c(x_i)) with x_i ← D are fed to the learning algorithm, which outputs h̃.]

21

slide-22
SLIDE 22

Why is concentration also relevant to poisoning?

[Diagram: the learner maps the space of all training sets to the space of all hypotheses; the preimage of the bad set H̃ is a set of "bad" training sets, and the attacker only needs to start from a training set that is b-close (in Hamming distance) to a bad one.]

The distribution from which a training set S is sampled is an n-fold product: each example d_i = (x_i, c(x_i)) with x_i ← D is drawn independently.

22

slide-23
SLIDE 23

Recall: Examples of Concentrated Distributions

  • Normal Lévy families are concentrated distributions [Lévy 1951]
  • with dimension and diameter n,
  • such that for any set S with Pr[S] = 0.01,
  • and for b ≈ √n, we have Pr[S_b] ≈ 1
  • Examples [Amir & Milman 1980], [Ledoux 2001]:
  • n-dimensional isotropic Gaussian with Euclidean distance
  • n-dimensional spheres with geodesic distance
  • Any product distribution with Hamming distance
  • And many more…

23

slide-24
SLIDE 24

Main Theorem 2: Poisoning Attacks from Concentration of Products

  • For any deterministic learner L and any bad set H̃ with Pr[h̃ ∈ H̃] = 1/100, the adversary can change only ≈ √n of the n training examples (a vanishing fraction of the data) and make the probability of getting h̃ ∈ H̃ ≈ 1, while the poisoned examples are still correctly labeled!

[Diagram: training examples d_i = (x_i, c(x_i)) with x_i ← D are fed to the learning algorithm, which outputs h̃.]

24
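Plugging in concrete numbers (my own example, just instantiating the theorem as stated above):

```latex
% With n = 10^4 training examples, the attack replaces about \sqrt{n} = 100 of them
% (roughly 1% of the data), all still correctly labeled, and boosts
\Pr\big[\tilde{h} \in \widetilde{H}\big] : \quad \tfrac{1}{100} \;\longrightarrow\; \approx 1 .
```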

slide-25
SLIDE 25

Other works on “clean label” poisoning attacks:

  • [Mahloujifar, M TCC-2017] defined p-tampering poisoning attacks: Valiant's malicious noise, but restricted to correct/clean labels.
  • [Mahloujifar, Diochnos, M ALT-2018] positive and negative results for PAC learning under p-tampering attacks.
  • [Shafahi et al., NeurIPS-2018] practical attacks using clean labels.
  • [Turner et al., ICLR-2018] backdoor attacks using clean labels.

25

slide-26
SLIDE 26

Talk Outline

  • 1a. Defining evasion attacks formally
  • 1b. Evasion attacks from measure concentration of instances
  • 2a. Defining poisoning attacks formally
  • 2b. Poisoning attacks from measure concentration of products
  • 3a. Poly-time attacks from computational concentration of products
  • 3b. Connections to attacks on coin-tossing protocols

26

slide-27
SLIDE 27

Concentration of Products -- a Closer Look

Proposition 2.1.1 in [Talagrand 1994]

  • Let HD(⋅,⋅) be Hamming distance and HD(x, S) = min_{s∈S} HD(x, s)
  • Let D be any distribution and D^n its n-fold product
  • Let S be any target set of probability μ = Pr[D^n ∈ S]
  • Then the probability of being b-far from S is bounded:

    Pr_{x←D^n}[ HD(x, S) ≥ b ] ≤ e^(−b²/n) / μ

  • Example: if μ = 1/poly(n), then 99% of the samples from D^n are within ≈ Õ(√n) Hamming distance of some point in S

27
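To see where the Õ(√n) in the example (and on the next slides) comes from, here is the one-line calculation, my addition, using the bound above with μ = 1/poly(n):

```latex
\Pr_{x \leftarrow D^n}\!\big[\mathrm{HD}(x,S) \ge b\big]
\;\le\; \frac{e^{-b^2/n}}{\mu} \;\le\; 0.01
\quad\Longleftrightarrow\quad
b \;\ge\; \sqrt{\,n \ln(100/\mu)\,}.
% With \mu = 1/\mathrm{poly}(n), \ln(100/\mu) = O(\log n), so
% b = O\!\big(\sqrt{n \log n}\big) = \tilde{O}(\sqrt{n}) already suffices.
```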

slide-28
SLIDE 28

Algorithmically finding such points in S?

  • Recall the formal setting:
  • Let D be any distribution and D^n its n-fold product
  • Let S be any target set of probability μ = Pr[D^n ∈ S] ≥ 1/poly(n)
  • Suppose algorithm A runs in poly(n) time and has
  • oracle access to membership in S and to a sampler for D
  • Question: given input x ← D^n, can A find (with high probability over x) a "close" point s ∈ S such that HD(x, s) = Õ(√n)?

[Figure: the set S inside the space of all D^n samples, with radius b.]

Can we compute the arrow mapping efficiently?

28

slide-29
SLIDE 29

Main Theorem 3: Computational Concentration of Products

  • Yes we can! We can compute the arrow mapping efficiently in product distributions under Hamming distance.
  • More formally: if Pr[D^n ∈ S] ≥ 1/poly(n), then there is a poly(n)-time algorithm A that finds, with high probability over the input x ← D^n, a "close" point s ∈ S with HD(x, s) = Õ(√n).

[Figure: the set S inside the space of all D^n samples, and the mapping from x to a nearby point of S.]

29
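The sketch below is my own simplified toy illustration of the interface in the theorem, not the actual algorithm from the papers: given x ← D^n, a sampler for D, and a membership oracle for S, it walks over the coordinates and resamples a coordinate only when a Monte Carlo estimate of Pr[S | prefix] clearly improves. The names (sample_D, in_S, greedy_push_into_S) and the toy instance (uniform bits, a Hamming-weight threshold set) are made up for illustration; the real algorithm is far more careful about sample complexity and the acceptance threshold.

```python
import math
import random

n = 40

def sample_D():
    """Sampler oracle for D: a uniform bit."""
    return random.getrandbits(1)

def in_S(x):
    """Membership oracle for a toy target set S (Pr[S] is a few percent)."""
    return sum(x) >= n // 2 + 6

def estimate_prob(prefix, trials=1500):
    """Monte Carlo estimate of Pr over a fresh suffix <- D^(n-|prefix|) that prefix+suffix lands in S."""
    hits = 0
    for _ in range(trials):
        suffix = [sample_D() for _ in range(n - len(prefix))]
        hits += in_S(prefix + suffix)
    return hits / trials

def greedy_push_into_S(x, margin=0.02, candidates=3):
    """Greedy coordinate-by-coordinate resampling toward S; change a coordinate only on a clear estimated gain."""
    x, changed = list(x), 0
    for i in range(n):
        best_v, best_p = x[i], estimate_prob(x[: i + 1])
        for _ in range(candidates):              # try a few values freshly sampled from D
            v = sample_D()
            if v == best_v:
                continue
            p = estimate_prob(x[:i] + [v])
            if p > best_p + margin:              # accept only a clear estimated improvement
                best_v, best_p = v, p
        changed += (best_v != x[i])
        x[i] = best_v
    return x, changed

x0 = [sample_D() for _ in range(n)]
x1, changed = greedy_push_into_S(x0)
print(f"started in S: {bool(in_S(x0))}  ended in S: {bool(in_S(x1))}  "
      f"coordinates changed: {changed}  (sqrt(n) ~ {math.sqrt(n):.0f})")
```

On this toy instance the greedy walk typically reaches S while changing a number of coordinates on the order of √n, which is the qualitative behavior the theorem guarantees in general.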

slide-30
SLIDE 30

Talk Outline

  • 1a. Defining evasion attacks formally
  • 1b. Evasion attacks from measure concentration of instances
  • 2a. Defining poisoning attacks formally
  • 2b. Poisoning attacks from measure concentration of products
  • 3a. Poly-time attacks from computational concentration of products
  • 3b. Connections to attacks on coin-tossing protocols

30

slide-31
SLIDE 31

A Stronger Result: Attacking Single-Message Coin Tossing Protocols

  • Let P_1, …, P_n run a coin-tossing protocol in which P_i sends the i-th message m_i
  • Suppose Pr[ f(m_1, …, m_n) = heads ] ≥ 1/poly(n)
  • If the adversary can corrupt up to ≈ √n of the parties, and it can decide whether to corrupt P_i by looking at its locally prepared message m_i,
  • then the adversary can make Pr[ f(m_1, …, m_n) = heads ] ≈ 1
  • The model is the strong adaptive corruption model of [Goldwasser, Kalai, Park 2015], who proved a similar exponential-time attack for 1-round protocols.

31
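A toy sanity check of the strong adaptive corruption model (my addition, not the attack behind the theorem above): each party locally prepares a fair bit, the adversary inspects the bit before it is sent and may corrupt that party on the spot, and f is majority. Corrupting at most about 3·√n parties already pushes Pr[f = heads] from about 1/2 to about 1.

```python
import math
import random

def run_protocol(n, budget):
    remaining, total = budget, 0
    for _ in range(n):
        m = random.getrandbits(1)          # party's locally prepared message
        if m == 0 and remaining > 0:       # adversary sees m before it is sent, corrupts on the spot
            m, remaining = 1, remaining - 1
        total += m
    return total > n // 2                  # f(m_1, ..., m_n) = majority ("heads")

n = 2500
budget = 3 * int(math.sqrt(n))
trials = 2000
honest = sum(run_protocol(n, 0) for _ in range(trials)) / trials
attacked = sum(run_protocol(n, budget) for _ in range(trials)) / trials
print(f"Pr[heads] honest ~ {honest:.2f}, with {budget} adaptive corruptions ~ {attacked:.3f}")
```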

slide-32
SLIDE 32

Conclusion

  • Formalizing security notions in adversarial ML is important: different definitions (though equivalent in some cases) behave differently.
  • The concentration of measure phenomenon can potentially lead to both evasion and poisoning attacks.
  • Product distributions are even computationally concentrated under Hamming distance, thanks to techniques from certain polynomial-time attacks on coin-tossing protocols.

32