Split Learning: A resource-efficient distributed deep learning method without sensitive data sharing (PowerPoint presentation)




SLIDE 1

Split Learning

A resource-efficient distributed deep learning method without sensitive data sharing. Praneeth Vepakomma, vepakom@mit.edu

SLIDE 2

‘Invisible’ Health Image Data

‘Small Data’ at each site (figure: multiple small data silos)

SLIDE 3
  • a. Distributed Data
  • b. Patient privacy
  • c. Incentives
  • d. ML Expertise
  • e. Efficiency

Low bandwidth, low compute, ‘small’ data

ML for Health Images

SLIDE 4

Gupta, Raskar ‘Distributed training of deep neural network over several agents’, 2017

No exchange of raw patient images. Train neural nets.
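The split training idea from Gupta and Raskar can be sketched end to end. This is a minimal illustrative numpy version (the toy task, layer sizes, and learning rate are my own assumptions, not from the slides): the client computes up to a cut layer and transmits only the "smashed" activations, the server finishes the forward and backward pass, and only the gradient at the cut travels back, so raw data never leaves the client.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data held only by the client; it never crosses the network.
X = rng.normal(size=(32, 8))
y = (X.sum(axis=1, keepdims=True) > 0).astype(float)

# The network is cut at a "split layer": one weight matrix per party.
W_client = rng.normal(scale=0.1, size=(8, 16))   # client side, up to the cut
W_server = rng.normal(scale=0.1, size=(16, 1))   # server side, after the cut
lr = 0.1

def train_step(W_client, W_server):
    # Client: forward to the split layer; only "smashed" activations are sent.
    z = X @ W_client
    smashed = np.maximum(z, 0.0)                  # ReLU at the cut

    # Server: finish the forward pass and compute the loss.
    logits = smashed @ W_server
    pred = 1.0 / (1.0 + np.exp(-logits))          # sigmoid
    loss = -np.mean(y * np.log(pred + 1e-9) + (1 - y) * np.log(1 - pred + 1e-9))

    # Server: backpropagate down to the cut; send that gradient to the client.
    d_logits = (pred - y) / len(y)
    dW_server = smashed.T @ d_logits
    d_smashed = d_logits @ W_server.T

    # Client: finish backpropagation locally on its own layers.
    d_z = d_smashed * (z > 0)
    dW_client = X.T @ d_z

    return loss, W_client - lr * dW_client, W_server - lr * dW_server

losses = []
for _ in range(200):
    loss, W_client, W_server = train_step(W_client, W_server)
    losses.append(loss)
```

Note that the only per-step traffic is one activation tensor up and one gradient tensor back, which is the source of the bandwidth savings claimed later in the talk.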

SLIDE 5

Intelligent Computing Security, Privacy & Safety

SLIDE 6

  • GDPR: General Data Protection Regulation
  • HIPAA: Health Insurance Portability and Accountability Act, 1996
  • SOX: Sarbanes-Oxley Act, 2002
  • PCI: Payment Card Industry Data Security Standard, 2004
  • SHIELD: Stop Hacks and Improve Electronic Data Security Act, Jan 1 2019

Regulations

SLIDE 7

SLIDE 8

SLIDE 9

  • Distributed data
  • Multi-modal, incomplete data
  • Resource constraints: memory, compute, bandwidth, convergence, synchronization, leakage
  • Regulations
  • Incentives: cooperation, ease
  • Ledgering: smart contracts, maintenance

Challenges for Distributed Data + AI + Health

SLIDE 10

SLIDE 11

Training Deep Networks No sharing of Raw Images

Server Client Invisible Data / Data Friction

AI: Bringing it all together

SLIDE 12

Ease Incentive Trust Regulation

Blockchain AI/ SplitNN

Overcoming Data Friction

SLIDE 13

Protect Data: Anonymize, Obfuscate, Encrypt

SLIDE 14

SLIDE 15

The trade-off: Protect Data vs. Data Utility

  • Protect data (hide raw data): encrypt, smash, obfuscate, add noise (differentially private), anonymize
  • Data utility (share wisdom): train models, infer statistics

SLIDE 16

  • Federated Learning: nets trained at clients, merged at server
  • Differential Privacy: obfuscate with noise, hide unique samples
  • Homomorphic Encryption: basic math over encrypted data (+, ×)
  • Split Learning (MIT): nets split over the network, trained at both

SLIDE 17

Federated Learning

Server Client1 Client2 Client3 ..
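The federated pattern on this slide (local training at each client, merging at the server) is commonly realized as federated averaging. A minimal numpy sketch on an assumed toy linear task; all helper names and hyperparameters are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# Each client keeps its own private samples of the same linear task.
w_true = np.array([2.0, -1.0, 0.5])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 3))
    clients.append((X, X @ w_true + 0.01 * rng.normal(size=50)))

def local_update(w, X, y, lr=0.05, steps=20):
    # Plain gradient descent on the client's private data;
    # raw data never leaves the client, only the updated weights do.
    for _ in range(steps):
        w = w - lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

w_global = np.zeros(3)
for _ in range(10):
    # Server broadcasts the global model, clients train locally,
    # and the server "merges" by averaging the returned weights.
    local_ws = [local_update(w_global.copy(), X, y) for X, y in clients]
    w_global = np.mean(local_ws, axis=0)
```

The contrast with split learning is that here every client trains and transmits a full copy of the model, whereas a split client holds and updates only the layers before the cut.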

SLIDE 18

Protect data:
  • Differential Privacy: partial leakage
  • Homomorphic Encryption, Oblivious Transfer, Garbled Circuits: inference but no training

Distributed training:
  • Federated Learning, Split Learning

Praneeth Vepakomma, Tristan Swedish, Otkrist Gupta, Abhimanyu Dubey, Ramesh Raskar, 2018

SLIDE 19

Federated vs. Split, compared on memory, compute, bandwidth, and convergence

When to use split learning?

Large number of clients: Split learning shows positive results

Project Page and Papers: https://splitlearning.github.io/

SLIDE 20

SLIDE 21

Label sharing vs. no label sharing
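The two configurations differ in where the loss is computed. With label sharing, the client also transmits labels so the server can finish training; in the no-label-sharing (U-shaped) configuration, the server's activations return to the client, which applies its own output head and computes the loss against labels that never leave it. A schematic numpy sketch of the forward passes only; shapes and names are my own illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 8))            # raw data: stays at the client
labels = rng.integers(0, 2, size=4)    # labels: stay at the client (U-shaped case)

W_in = rng.normal(scale=0.1, size=(8, 16))    # client layers before the first cut
W_mid = rng.normal(scale=0.1, size=(16, 16))  # server layers between the cuts
W_out = rng.normal(scale=0.1, size=(16, 2))   # client head (U-shaped variant only)

def client_lower(X):
    # Client -> server: only these "smashed" activations cross the network.
    return np.maximum(X @ W_in, 0.0)

def server_body(smashed):
    # The server never sees X, and in the U-shaped case never sees labels either.
    return np.maximum(smashed @ W_mid, 0.0)

# Label sharing: the client would also send `labels`, and the server
# would attach the head and compute the loss itself.
server_act = server_body(client_lower(X))

# No label sharing (U-shaped): server activations return to the client,
# which applies its own head and computes the loss privately.
logits = server_act @ W_out
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
client_loss = -np.mean(np.log(probs[np.arange(4), labels]))
```

The U-shape costs one extra activation/gradient round-trip per batch but removes the last piece of sensitive information (the labels) from the server's view.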

SLIDE 22

Gupta, Otkrist, and Raskar, Ramesh. "Secure Training of Multi-Party Deep Neural Network." U.S. Patent Application No. 15/630,944.

SLIDE 23

Distribution of parameters in AlexNet

SLIDE 24

Versatile Configurations of Split Learning

Split learning for health: Distributed deep learning without sharing raw patient data, Praneeth Vepakomma, Otkrist Gupta, Tristan Swedish, Ramesh Raskar, (2019)

SLIDE 25

Reducing leakage in distributed deep learning for sensitive health data, Praneeth Vepakomma, Otkrist Gupta, Abhimanyu Dubey, Ramesh Raskar (2019)

NoPeek SplitNN: Reducing Leakage in Distributed Deep Learning

SLIDE 26

NoPeek deep learning with a conditioning variable. Setup:

Ideal goal: find a conditioning variable Z within the deep learning framework such that the following are approximately satisfied: 1. Y ⫫ X | Z (utility property: given Z, X can be thrown away and the prediction obtained as E(Y|Z)) 2. X ⫫ Z (one-way property: prevents proper reconstruction of the raw data X from Z) Note: ⫫ denotes statistical independence
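One way to pursue the one-way property in practice, as NoPeek does, is to penalize the distance correlation between the raw input X and the split-layer output Z alongside the task loss. A minimal numpy sketch; the regularization weight alpha, function names, and toy data are illustrative assumptions:

```python
import numpy as np

def _centered_dists(M):
    # Pairwise Euclidean distances, double-centered (Szekely-style).
    D = np.sqrt(((M[:, None, :] - M[None, :, :]) ** 2).sum(-1))
    return D - D.mean(0, keepdims=True) - D.mean(1, keepdims=True) + D.mean()

def distance_correlation(X, Z):
    # Sample distance correlation: near 0 under independence, 1 when Z
    # mirrors X, so minimizing it pushes Z toward independence from X.
    A, B = _centered_dists(X), _centered_dists(Z)
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(max(dcov2, 0.0) / denom) if denom > 0 else 0.0

def nopeek_loss(task_loss, X, Z, alpha=0.1):
    # Total objective: utility term plus a leakage penalty at the split layer.
    return task_loss + alpha * distance_correlation(X, Z)

rng = np.random.default_rng(3)
X = rng.normal(size=(16, 4))
leaky = distance_correlation(X, X.copy())                    # Z fully leaks X
private = distance_correlation(X, rng.normal(size=(16, 4)))  # Z independent of X
total = nopeek_loss(0.5, X, X.copy())
```

In training, the penalty is differentiated through Z like any other loss term, trading a little task accuracy for much lower reconstructability of X, which is the trade-off the later slides quantify.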

SLIDE 27
Possible measures of non-linear dependence:
  • COCO: Constrained Covariance
  • HSIC: Hilbert-Schmidt Independence Criterion
  • DCOR: Distance Correlation
  • MMD: Maximum Mean Discrepancy
  • KTA: Kernel Target Alignment
  • MIC: Maximal Information Coefficient
  • TIC: Total Information Coefficient

SLIDE 28

Why is it called distance correlation?

SLIDE 29

Praneeth Vepakomma, Chetan Tonde, Ahmed Elgammal, Electronic Journal of Statistics, 2018

SLIDE 30

Colorectal histology image dataset (Public data)

SLIDE 31

Leakage Reduction in Action

Reduced leakage during training over colorectal histology image data: from 0.96 with a traditional CNN to 0.19 with NoPeek SplitNN.

Reduced leakage during training over colorectal histology image data: from 0.92 with a traditional CNN to 0.33 with NoPeek SplitNN.

Reducing leakage in distributed deep learning for sensitive health data, Praneeth Vepakomma, Otkrist Gupta, Abhimanyu Dubey, Ramesh Raskar (2019)

SLIDE 32

Similar validation performance

SLIDE 33

Effect of leakage reduction on convergence

SLIDE 34

Robustness to reconstruction

SLIDE 35

Proof of the one-way property: we show that minimizing the regularized distance covariance minimizes the difference of Kullback-Leibler divergences.

SLIDE 36

SLIDE 37

SLIDE 38

Project Page and Papers: https://splitlearning.github.io/

Thanks and acknowledgements to: Otkrist Gupta (MIT/LendBuzz), Ramesh Raskar (MIT), Jayashree Kalpathy-Cramer (Martinos/Harvard), Rajiv Gupta (MGH), Brendan McMahan (Google), Jakub Konečný (Google), Abhimanyu Dubey (MIT), Tristan Swedish (MIT), Sai Sri Sathya (S20.ai), Vitor Pamplona (MIT/EyeNetra), Rodmy Paredes Alfaro (MIT), Kevin Pho (MIT), Elsa Itambo (MIT)

SLIDE 39

THANK YOU