Brokered Agreements in Multi-Party Machine Learning
10th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys 2019)
Clement Fung, Ivan Beschastnikh
University of British Columbia
1
○ Good quality data is vital to the health of ML ecosystems
2
○ Owners of potentially private datasets ○ Contribute data to the ML process
○ Define model task and goals ○ Deploy and profit from trained model
○ Host training process and model ○ Expose APIs for training and prediction
3
○ Manage infrastructure to host computation ○ Provide privacy and security for data providers ○ Use the model for profit once training is complete
4
Information Transfer
5
[1] Wired 2016. [2] Apple. “Learning with Privacy at Scale” Apple Machine Learning Journal V1.8 2017. [3] Wired 2017.
6
○ Data providers want to keep their data as private as possible ○ Model owners want to extract as much value from the data as possible ○ No incentives to provide fairness [1] ○ Need solutions that can work without cooperation from the system provider and are deployed from outside the system itself
7
[1] Overdorf et al. “Questioning the assumptions behind fairness solutions.” NeurIPS 2018.
We cannot trust model owners to control the ML incentive tradeoff!
○ Manage infrastructure to host computation ○ Provide privacy and security for data providers ○ Use the model for profit once training is complete
9
Information Transfer
○ Manage infrastructure to host ML computation ○ Provide privacy and security for data providers and model owners
11
[Diagram: data providers and model owners exchange information through a neutral broker, under a brokered agreement]
12
○ Send model updates over network ○ Aggregate updates across multiple clients ○ Client-side differential privacy [2] ○ Better speed, no data transfer ○ State of the art in multi-party ML ○ Brokered learning builds on federated learning
[1] McMahan et al. “Communication-Efficient Learning of Deep Networks from Decentralized Data” AISTATS 2017. [2] Geyer et al. “Differentially Private Federated Learning: A Client Level Perspective” NIPS 2017.
[Diagram: clients send model updates ΔM to the shared model M]
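The federated learning flow above (clients compute updates locally and send only model deltas ΔM for aggregation) can be sketched as a minimal federated-averaging step. The logistic-gradient form, learning rate, and function names here are illustrative assumptions, not TorMentor's actual implementation.

```python
import numpy as np

def local_update(model, X, y, lr=0.1):
    """One local SGD step on a logistic model; returns only the delta ΔM,
    so raw training data never leaves the client."""
    preds = 1.0 / (1.0 + np.exp(-X @ model))
    grad = X.T @ (preds - y) / len(y)
    return -lr * grad

def aggregate(model, deltas):
    """Federated averaging: apply the mean of the client deltas to M."""
    return model + np.mean(deltas, axis=0)
```

Note that the server only ever sees the deltas, which is what makes both the speed benefit (no data transfer) and the attack surface discussed next possible.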
○ Providers can maximize privacy, give zero utility, or attack the system ○ Providers can attack the ML model, compromising integrity [1] ○ Providers can attack other providers, compromising privacy [2]
13
[1] Bagdasaryan et al. “How To Backdoor Federated Learning” arXiv 2018. [2] Hitaj et al. “Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning” CCS 2017.
We also cannot trust data providers to control the ML incentive tradeoff!
○ Gives too much control to model owners ○ Not privacy focused and vulnerable
○ Require trust in model owners or data providers ○ But there is no incentive for either to do so
○ Security and system overkill ○ Much too slow for these use cases
15
[1] Hynes et al. “A Demonstration of Sterling: A Privacy-Preserving Data Marketplace” VLDB 2018.
16
More Centralized Less Private/Secure Less Centralized More Private/Secure Centralized Parameter Server Federated Learning Blockchain-based Multi-party ML Brokered Learning
21
Brokered Learning: A new standard for incentives in secure ML
22
○ Communicate with model owner ○ Trust that model owner is not malicious ○ Model owners have full control over model and process
23
○ Communicate with neutral broker ○ Broker executes model owner’s validation services ○ Decouple model and infrastructure
○ Interface for model owners (“curators”)
○ Interface for data providers
○ Host ML deployments ○ Collect and aggregate model updates ○ Same as federated learning
24
[1] Szabo, Nick. “Formalizing and Securing Relationships on Public Networks” 1997.
○ curate(): Launch curator deployment ■ Set provider verifier parameters ○ fetch(): Access to model once trained; no other access for the curator during training
25
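A hedged sketch of the curator-facing interface above: curate() launches a deployment with verifier parameters, and fetch() only succeeds once training is complete. The class and method bodies are hypothetical illustrations of these bullets, not TorMentor's actual API.

```python
class CuratorAPI:
    """Hypothetical curator-facing broker interface."""

    def __init__(self):
        self._deployment = None
        self._model = None
        self._training_done = False

    def curate(self, model_spec, verifier_params):
        """Launch a curator deployment with provider verifier parameters."""
        self._deployment = {"spec": model_spec, "verifiers": verifier_params}
        return self._deployment

    def finish(self, trained_model):
        # Called by the broker once training completes.
        self._model = trained_model
        self._training_done = True

    def fetch(self):
        """Return the model only once training has finished; the curator
        gets no access to the in-progress model."""
        if not self._training_done:
            raise PermissionError("model not available during training")
        return self._model
```

The key design point this sketch illustrates is the decoupling: the curator defines the deployment but cannot observe intermediate state, which the broker enforces.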
○ Defined by curator ○ join(): Verify identity and allow a provider to join ○ update(): Verify and allow a model update
26
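The join() and update() verifiers above could be enforced by the broker along these lines: the curator supplies the verification predicates, and the broker applies them without revealing provider identities to the curator. All names here are hypothetical, not TorMentor's real interface.

```python
class Broker:
    """Hypothetical neutral broker: runs the curator-defined verifiers
    while keeping provider identities hidden from the curator."""

    def __init__(self, join_verifier, update_verifier):
        self.join_verifier = join_verifier      # curator-defined predicate
        self.update_verifier = update_verifier  # curator-defined predicate
        self.providers = set()
        self.updates = []

    def join(self, provider_id, credentials):
        """Admit a provider only if it passes the curator's join verifier."""
        if self.join_verifier(credentials):
            self.providers.add(provider_id)
            return True
        return False

    def update(self, provider_id, delta):
        """Accept a model update only from admitted providers whose
        update passes the curator's update verifier."""
        if provider_id in self.providers and self.update_verifier(delta):
            self.updates.append(delta)
            return True
        return False
```

Because the broker, not the curator, executes these checks, the curator never learns which provider sent which update.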
○ Define model and provide deployment parameters ○ Define verification services
○ Define personal privacy preferences (ε) ○ Pass verification on join ○ Iterative model updates ○ Pass verification on model update
○ Return model to curator
30
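The provider's personal privacy preference (ε) from the flow above could be applied client-side before each iterative update is sent. This Laplace-noise sketch is purely illustrative; the clipping bound and sensitivity assumption here are not the system's exact mechanism.

```python
import numpy as np

def privatize(delta, epsilon, clip=1.0, rng=None):
    """Clip an update's L1 norm to `clip`, then add Laplace noise with
    scale 2*clip/epsilon (illustrative sensitivity bound). A smaller
    epsilon means more noise, i.e. a stronger privacy preference."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(delta, 1)
    clipped = delta * min(1.0, clip / max(norm, 1e-12))
    noise = rng.laplace(0.0, 2 * clip / epsilon, size=delta.shape)
    return clipped + noise
```

Since each provider chooses its own ε, providers with stricter privacy needs simply contribute noisier updates.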
○ Broker honours verifier parameters ○ Users adhere to the given APIs for joining and model updates ○ Curators and data providers can collaborate based on their incentives: broker is neutral to the ML incentive trade-off ○ If broker attacks clients or violates curator specifications, its reputation is lost ○ Governments, large organizations, blockchains
31
32
We build the first anonymous ML system: ○ Further support privacy in multi-party ML ○ Data provider and curator identities are hidden: ○ From each other and from the broker
○ Compared to WAN federated learning baseline
33
34
○ Hide source and destination of messages by communicating through chain of random nodes in system ○ Hide identity of users in distributed ML! ○ Deploy broker as hidden Tor service
[1] Dingledine et al. “Tor: The Second-Generation Onion Router” USENIX Security 2004.
○ 1500 LOC Python, 600 LOC Go
○ Logistic classifier ○ 30000 examples, 24 features (14 MB / client)
○ Deploy curators and data providers as users over wide area network
35
[Plot: model convergence over time, with Tor vs. without Tor]
36
37
TorMentor is within 4-10x of the baseline, and still converges while serving 200 clients on a WAN.
○ Reject datasets with negative impact on “influence” metric ■ Typically, just use validation error
○ Evaluate influence of model updates instead of data ○ Use a curator-provided validation set ○ Tune using data provider proof-of-work [2]
38
[1] Barreno et al. “The Security of Machine Learning.” Machine Learning 81:2, 2010. [2] Nakamoto, Satoshi. “Bitcoin: A peer-to-peer electronic cash system” 2008.
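The reject-on-negative-influence idea above can be sketched as a validation-error check applied to each model update before it is aggregated. The logistic model and the tolerance parameter are illustrative assumptions, not the paper's exact defense.

```python
import numpy as np

def validation_error(model, X, y):
    """0/1 error of a logistic classifier on the curator's validation set."""
    preds = (1.0 / (1.0 + np.exp(-X @ model))) > 0.5
    return float(np.mean(preds != y))

def roni_accept(model, delta, X_val, y_val, tol=0.0):
    """Accept an update only if it does not raise validation error by
    more than tol -- i.e. reject updates with negative influence."""
    before = validation_error(model, X_val, y_val)
    after = validation_error(model + delta, X_val, y_val)
    return after <= before + tol
```

Because the check only needs the candidate update and a validation set, the broker can run it as a curator-defined verification service without seeing any provider's raw data.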
39
40
The curator can define a service through the broker that rejects attacks under certain conditions.
○ Blockchain-based data marketplaces ○ Standardizing “ML as a service” ○ GDPR Compliance
○ Moving from 2 actors to 3 ○ Adoption from big players
41
○ Incentives, privacy, security ○ Proposed brokered learning as an alternative to federated learning ○ APIs to protect the process from model owners and data providers
○ Supports anonymous ML between data providers and curators ○ Allows a curator-defined process to reject malicious data providers
42
https://github.com/DistributedML/TorML