Fast & Faster Privacy-Preserving ML in Secure Hardware Enclaves


slide-1
SLIDE 1

Nick Hynes, Raymond Cheng, Dawn Song | UC Berkeley & Oasis Labs with support from the TVM team and community!

Fast & Faster Privacy-Preserving ML
 in Secure Hardware Enclaves

slide-3
SLIDE 3

Ideal: data providers pool data to train a large, complex model

credit scoring model: Experian, Equifax, TransUnion

slide-4
SLIDE 4

health diagnosis model: UCSF Medical, Mass. General Hospital, Kaiser Permanente

Ideal: data providers pool data to train a large, complex model

slide-5
SLIDE 5

truly personal assistant: me, you, your neighbor

Ideal: data providers pool data to train a large, complex model

slide-6
SLIDE 6

data theft, non-payment, inappropriate use

Reality: data providers are mutually distrusting!

slide-7
SLIDE 7

Solution: providers cooperate via a virtual trusted third party

slide-8
SLIDE 8

Compared on: performance, support for practical ML models, security mechanisms

Security mechanisms by technique:
  • Trusted Execution Env. (TEE): secure hardware
  • Secure multi-party computation: cryptography, distributed trust
  • Zero-knowledge proof: cryptography, local computation
  • Fully homomorphic encryption: cryptography

Secure Computation Techniques

slide-11
SLIDE 11

Secure Enclaves

Integrity, Confidentiality, Remote Attestation

Secure enclave
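
As a rough illustration of what remote attestation buys a data provider, the toy sketch below has an "enclave" report a hash (measurement) of the code it runs, bound to a fresh nonce, and a verifier accept only the expected measurement. Everything here is an assumption for illustration: the names `measure`, `make_quote`, `verify_quote`, and `DEVICE_KEY` are hypothetical, and the shared-key HMAC stands in for the hardware-backed asymmetric signatures that real TEEs use.

```python
import hashlib
import hmac
import os

# Hypothetical stand-in for a hardware-held attestation key. Real TEEs sign
# quotes with asymmetric keys; a shared HMAC key keeps this sketch stdlib-only.
DEVICE_KEY = b"toy-device-key-not-a-real-secret"


def measure(enclave_code: bytes) -> bytes:
    """Measurement = hash of the code loaded into the enclave."""
    return hashlib.sha256(enclave_code).digest()


def make_quote(enclave_code: bytes, nonce: bytes):
    """Enclave side: report the measurement, authenticated and bound to the nonce."""
    measurement = measure(enclave_code)
    tag = hmac.new(DEVICE_KEY, measurement + nonce, hashlib.sha256).digest()
    return measurement, tag


def verify_quote(expected_code: bytes, nonce: bytes, quote) -> bool:
    """Verifier side: check the tag, then check the code is the one expected."""
    measurement, tag = quote
    expected_tag = hmac.new(DEVICE_KEY, measurement + nonce, hashlib.sha256).digest()
    return (hmac.compare_digest(tag, expected_tag)
            and hmac.compare_digest(measurement, measure(expected_code)))


nonce = os.urandom(16)  # fresh per request, so old quotes cannot be replayed
good = make_quote(b"train_model_v1", nonce)
bad = make_quote(b"train_model_v1_with_backdoor", nonce)
print(verify_quote(b"train_model_v1", nonce, good))  # True
print(verify_quote(b"train_model_v1", nonce, bad))   # False
```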

slide-14
SLIDE 14

TEE Implementations

  • Intel SGX: in your laptop, Azure, Alibaba Cloud, and IBM Cloud
  • Keystone: the first open-source end-to-end secure enclave
  • runs on RISC-V chips and FPGAs
  • keystone-enclave/keystone
  • Ginseng: a drop-in enclave framework for FPGA ML accelerators
slide-15
SLIDE 15
  • 1. Privacy-Preserving ML & Secure Enclaves
  • 2. Myelin: Efficient Private ML in CPU Enclaves
  • 3. Ginseng: Accelerated Private ML in FPGA Enclaves
  • 4. Sterling: A Privacy-Preserving Data Marketplace
slide-16
SLIDE 16

Myelin: Efficient Private ML in CPU Enclaves

dmlc/tvm/apps/sgx
 dmlc/tvm/rust

slide-17
SLIDE 17

Myelin: Efficient Private ML in CPU Enclaves

[3] Efficient Per-Example Gradient Computations. Goodfellow. 2015

slide-18
SLIDE 18

Step 1: Get the ML in the Enclave

slide-28
SLIDE 28

Step 2: Add Differential Privacy

  • DP offers a strong, formal definition of privacy
  • privacy risk to any individual is nearly the same whether or not they contributed data
  • adds noise so that models trained on neighboring datasets are indistinguishable
  • slow in standard frameworks

add noise
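
For reference, the usual textbook statement behind these bullets can be written as follows. The slide does not say which variant of DP is used, so the (ε, δ) form and the symbols M, D, D', S are assumptions taken from the standard definition, not from the talk. A randomized training mechanism M is (ε, δ)-differentially private if, for all datasets D and D' that differ in one individual's record and for every set S of possible outputs:

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon}\,\Pr[\,M(D') \in S\,] + \delta
```

Smaller ε and δ mean the output distributions on neighboring datasets are harder to tell apart, which is the sense in which an individual's risk is nearly unchanged by contributing data.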

slide-30
SLIDE 30

Step 3: Make it Fast

Differentially Private SGD

  • 1. compute forward pass for mini-batch of m examples
  • 2. compute per-example gradients
  • 3. rescale each example’s gradient to have unit norm
  • 4. average them up
  • 5. add noise
  • 6. take gradient step

add a pass to fuse steps 4 to 6

slide-31
SLIDE 31

Step 3: Make it Fast

Differentially Private SGD

  • 1. compute forward pass for mini-batch of m examples
  • 2. compute per-example gradients
  • 3. rescale each example’s gradient to have unit norm
  • 4. average + noise + gradient step

per-example gradients: autograd takes O(m) backward passes [4],
 O(1) with custom IR ops

[4] Efficient Per-Example Gradient Computations. Goodfellow. 2015
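
The steps listed on this slide can be sketched in plain Python for a least-squares linear model. This is a minimal illustration under stated assumptions, not Myelin's implementation: the function name `dp_sgd_step` and its parameters are hypothetical, the per-example loop is the slow O(m) form rather than the fused custom ops, and step 3 uses the common clipping variant (scale a gradient down only when its norm exceeds `clip`) instead of forcing every gradient to exactly unit norm.

```python
import math
import random


def dp_sgd_step(w, batch, lr=0.1, clip=1.0, noise_std=0.5, rng=random):
    """One DP-SGD step for least-squares linear regression.

    w: list of weights; batch: list of (x, y) pairs with x a feature list.
    """
    m, d = len(batch), len(w)
    summed = [0.0] * d
    for x, y in batch:
        # Steps 1-2: forward pass, then this example's gradient of 0.5*(w.x - y)^2.
        err = sum(wi * xi for wi, xi in zip(w, x)) - y
        grad = [err * xi for xi in x]
        # Step 3: scale down so the gradient's L2 norm is at most `clip`.
        norm = math.sqrt(sum(g * g for g in grad))
        scale = min(1.0, clip / norm) if norm > 0 else 1.0
        for j in range(d):
            summed[j] += scale * grad[j]
    # Fused final step: average, add Gaussian noise, take the gradient step.
    return [
        w[j] - lr * (summed[j] + rng.gauss(0.0, noise_std * clip)) / m
        for j in range(d)
    ]


# Usage: fit y = 2x + 1 on synthetic data with a seeded generator.
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(64)]
data = [([x, 1.0], 2.0 * x + 1.0) for x in xs]
w = [0.0, 0.0]
for _ in range(200):
    w = dp_sgd_step(w, data, rng=rng)
print(w)
```

Clipping bounds how much any single example can move the weights, which is what lets a fixed amount of noise mask each individual's contribution.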

slide-32
SLIDE 32

Step 4: Benchmark

Model                   1 Myelin Enclave   non-private CPU   related work
VGG-9 (training)        21.3 img/s         27.2 img/s        Chiron (4 enclaves) [5]: 24.7 img/s
ResNet-32 (training)    12.4 img/s         13.6 img/s        –
MobileNet (inference)   32.4 img/s         –                 Slalom (enclave+GPU) [6]: 35.7 img/s

Performance on CIFAR-10
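
To put the training figures on this slide in relative terms, the short calculation below divides enclave throughput by non-private CPU throughput. The throughputs are copied from the slide; the 50,000-image training set size is the standard CIFAR-10 split, and the variable names are only illustrative.

```python
# Training throughputs in img/s: (Myelin enclave, non-private CPU).
results = {
    "VGG-9 (training)": (21.3, 27.2),
    "ResNet-32 (training)": (12.4, 13.6),
}

relative = {model: enclave / cpu for model, (enclave, cpu) in results.items()}
for model, ratio in relative.items():
    print(f"{model}: enclave runs at {ratio:.0%} of non-private CPU throughput")

# CIFAR-10 has 50,000 training images.
epoch_minutes = 50_000 / 12.4 / 60
print(f"ResNet-32: about {epoch_minutes:.0f} minutes per training epoch in the enclave")
```

This prints 78% for VGG-9, 91% for ResNet-32, and about 67 minutes per ResNet-32 epoch.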

[5] Chiron: Privacy-preserving machine learning as a service. Hunt, Song, Shokri, Shmatikov, and Witchel. 2018
 [6] Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware. Tramer and

slide-33
SLIDE 33

State of the Art Performance for
 ML in Single CPU Enclave

  • but a CPU is a CPU: ½ day to train a ResNet is emotionally unsatisfying

  • no GPU TEEs (yet), but we can do FPGAs!
slide-34
SLIDE 34
  • 1. Privacy-Preserving ML & Secure Enclaves
  • 2. Myelin: Efficient Private ML in CPU Enclaves
  • 3. Ginseng: Accelerated Private ML in FPGA Enclaves
  • 4. Sterling: A Privacy-Preserving Data Marketplace
slide-35
SLIDE 35

Ginseng, the Learning TEE

  • Main idea: an FPGA can be programmed with an ML accelerator (VTA) and the components required to make a TEE
  • memory encryption
  • key generation
  • remote attestation
  • TEEs are general-purpose; ML is very particular

We get big efficiency wins from specializing the TEE to ML workloads

slide-39
SLIDE 39

Ginseng = VTA + Tensor Encryption + Secure OS

  • Tensor Encryption Core (TEC) safeguards the tensors in memory
  • protects entire models’ tensors with virtually no overhead
  • Ginseng Secure OS protects the end-to-end workflow
  • built atop formally verified components
  • minimal trusted computing base
  • side-channel resistant
  • End result: an end-to-end secure, speedy ML pipeline
slide-41
SLIDE 41
  • 1. Privacy-Preserving ML & Secure Enclaves
  • 2. Myelin: Efficient Private ML in CPU Enclaves
  • 3. Ginseng: Accelerated Private ML in FPGA Enclaves
  • 4. Sterling: A Privacy-Preserving Data Marketplace
slide-42
SLIDE 42

Sterling: A Privacy-Preserving Data Marketplace

[1] A Demonstration of Sterling: A Privacy-Preserving Data Marketplace. VLDB 2018.
[2] Ekiden: A Platform for Confidentiality-Preserving, Trustworthy, and Performant Smart Contract Execution. 2018

built on the Oasis blockchain and TVM

slide-48
SLIDE 48

Sterling workflow

  • 1. data provider encrypts data and uploads it to the Oasis blockchain; access to the data is controlled by a confidential smart contract
  • 2. data consumer uploads a model-training smart contract that satisfies the constraints of the provider contract
  • 3. consumer contract requests data from the provider contract, sending over payment and credentials
  • 4. provider contract checks that the consumer contract satisfies its constraints and sends back the data
  • 5. consumer contract trains a privacy-preserving model and returns it to the data consumer
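
The request-and-check part of this workflow (steps 3 to 5) can be mocked up in a few lines of Python. This is only a sketch under loud assumptions: `ProviderContract`, `ConsumerContract`, and `run_marketplace` are hypothetical names, the three constraints checked are invented examples, encryption and enclave execution are left out entirely, and the "model" is a plain mean standing in for real privacy-preserving training.

```python
from dataclasses import dataclass, field


@dataclass
class ConsumerContract:
    """Hypothetical model-training contract uploaded by a data consumer."""
    payment: int
    credentials: str
    uses_differential_privacy: bool

    def train(self, records):
        # Placeholder "model": a mean. A real contract would run DP training.
        return sum(records) / len(records)


@dataclass
class ProviderContract:
    """Hypothetical contract guarding one provider's data."""
    records: list
    price: int
    allowed_credentials: set = field(default_factory=set)

    def request_data(self, consumer: ConsumerContract):
        # Step 4: release data only if every constraint is satisfied.
        if consumer.payment < self.price:
            raise PermissionError("insufficient payment")
        if consumer.credentials not in self.allowed_credentials:
            raise PermissionError("unknown credentials")
        if not consumer.uses_differential_privacy:
            raise PermissionError("training must be differentially private")
        return list(self.records)


def run_marketplace(provider: ProviderContract, consumer: ConsumerContract):
    data = provider.request_data(consumer)  # steps 3-4: request, check, release
    return consumer.train(data)             # step 5: train and return the model


provider = ProviderContract(records=[1.0, 2.0, 3.0], price=10,
                            allowed_credentials={"lab-42"})
consumer = ConsumerContract(payment=10, credentials="lab-42",
                            uses_differential_privacy=True)
print(run_marketplace(provider, consumer))  # 2.0
```

Putting the checks inside the provider's contract means the provider never has to trust the consumer directly, only the code that enforces the constraints.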

slide-52
SLIDE 52

Sterling & TVM to the Moon

  • Sterling facilitates a distributed, trustless, uncoordinated data marketplace
  • builds on the efficiency of TVM with the portability and security of WebAssembly
  • also uses the new TVM Rust runtime!
  • TVM modules run in secure enclaves provided by the Oasis blockchain

slide-57
SLIDE 57

Roadmap

  • Training on VTA and CPU! Super excited for Relay autograd
  • Much better than the FExpandCompute kludge pass we’re using now
  • Deploy Ginseng to AWS F1 once VTA Chisel port is ready
  • automatically checking TVM models for differential privacy (on the blockchain, of course)

slide-58
SLIDE 58

Thanks!