
slide-1
SLIDE 1

Machine Learning Classification over Encrypted Data

Raphaël Bost (Université Rennes 1, MIT)

Raluca Ada Popa (ETH Zürich, MIT)

Stephen Tu (MIT)

Shafi Goldwasser (MIT)

slide-2
SLIDE 2

Classification

(Machine Learning)

  • Supervised learning (training)
  • Classification

[Diagram]

Server: data set → training phase → model → classification phase

Client: data → classification phase → prediction

slide-4
SLIDE 4

Problem

  • The provider’s model is sensitive

financial model, genetic sequences, …

  • Client’s private data

medical records, credit history, …

MPC / 2PC

slide-6
SLIDE 6

Using General 2PC ?

  + Works for every circuit
  + Constant number of interactions

  − Have to build circuits
  − Hard to ‘compose’
  − Not easily reusable

➡ Ad Hoc protocols

slide-7
SLIDE 7

Goal

  • Enable classification without sacrificing privacy
  • Secure classification, no learning

the model is already known

  • Practical performance
slide-8
SLIDE 8

Approach

  • Classifiers as specialized 2PC
  • Identify and construct reusable building blocks
  • Threat model: passive (honest-but-curious) adversary

slide-9
SLIDE 9

Insight

ML Algorithm                  Classifier
Perceptron                    Linear
Least squares                 Linear
Fisher linear discriminant    Linear
Support vector machine        Linear
Naïve Bayes                   Naïve Bayes
ID3/C4.5                      Decision trees

slide-10
SLIDE 10

Insight

  • Identify core operations
  • Construct reusable/composable building blocks
  • Choose the best fitted primitives

Homomorphic Encryption, FHE, Garbled Circuits, …

slide-11
SLIDE 11

Related Work

  • Privacy-preserving training
      • Using FHE, linear means classifier [GLN12]
      • Specific techniques for Naïve Bayes [VKC08], decision trees [BDMN05, LP00], linear discriminant [DHC04], kernel methods [LLM06]
  • Privacy-preserving classification
      • Using FHE, outsource computation [BLN13]
      • Secure branching programs [BFK+09, BFL+09]
      • Specific classifiers (face recognition/detection) [SSW09, AB07]
slide-12
SLIDE 12

Building Blocks

  • Dot product
  • Encrypted Comparison
  • Encrypted (arg)max
  • Decision trees
  • Encryption scheme switching
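
The dot-product block can be understood through the two identities of an additively homomorphic scheme. The sketch below writes Enc for such a scheme (Paillier is one example; the slides do not say which scheme backs each block, so that choice is an assumption), and each equality should be read as "decrypts to the same value", since ciphertexts are randomized:

```latex
\begin{align*}
\mathrm{Enc}(a) \cdot \mathrm{Enc}(b) &= \mathrm{Enc}(a + b) \\
\mathrm{Enc}(a)^{k} &= \mathrm{Enc}(k \cdot a) \\
\prod_{i=1}^{n} \mathrm{Enc}(w_i)^{x_i} &= \mathrm{Enc}\Big(\sum_{i=1}^{n} w_i x_i\Big) = \mathrm{Enc}(\langle w, x \rangle)
\end{align*}
```

A party holding the encrypted weights Enc(w_i) and plaintext features x_i can therefore compute an encryption of the dot product without learning w.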
slide-13
SLIDE 13

Classifiers from blocks

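[Diagram: how the classifiers are assembled from the building blocks]

Classifiers: Linear Classifier, Naïve Bayes Classifier, Decision Tree Classifier

Building blocks: Dot Product, Enc. Compare, Enc. Argmax, Private Decision Trees, ES Switching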

slide-14
SLIDE 14

Classifiers

  • Linear Classifier
  • Naïve Bayes Classifier
  • Decision Trees

In Practice

slide-15
SLIDE 15

Linear Classifier

  • Separate two sets of points
  • Very common classifier
  • Dot product + Encrypted compare
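
To make the "dot product" half of this slide concrete, here is a minimal sketch of an encrypted dot product using a toy Paillier-style scheme. Everything in it is an assumption for illustration: the scheme choice, the tiny primes 251 and 241 (picked so all arithmetic fits in 64 bits), the fixed randomness values, and the helper names `encode`, `encrypt`, `decrypt` and `encrypted_dot_product`. It is not the library's API, and it omits the encrypted comparison entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;
using i64 = std::int64_t;

// Toy key (assumption): tiny primes so every product below fits in 64 bits.
constexpr u64 P = 251, Q = 241;
constexpr u64 N = P * Q;                // public modulus, also the plaintext space
constexpr u64 N2 = N * N;               // ciphertexts live modulo N^2
constexpr u64 PHI = (P - 1) * (Q - 1);  // secret exponent used for decryption

u64 mul_mod(u64 a, u64 b, u64 m) { return (a % m) * (b % m) % m; }

u64 pow_mod(u64 base, u64 exp, u64 m) {
    u64 result = 1 % m;
    for (base %= m; exp > 0; exp >>= 1) {
        if (exp & 1) result = mul_mod(result, base, m);
        base = mul_mod(base, base, m);
    }
    return result;
}

// Modular inverse by the extended Euclidean algorithm; assumes gcd(a, m) == 1.
u64 inv_mod(u64 a, u64 m) {
    i64 old_r = static_cast<i64>(a % m), r = static_cast<i64>(m);
    i64 old_s = 1, s = 0;
    while (r != 0) {
        const i64 q = old_r / r;
        i64 tmp = old_r - q * r; old_r = r; r = tmp;
        tmp = old_s - q * s; old_s = s; s = tmp;
    }
    const i64 mm = static_cast<i64>(m);
    return static_cast<u64>(((old_s % mm) + mm) % mm);
}

// Map signed values into [0, N) and back (values above N/2 read as negative).
u64 encode(i64 v) {
    const i64 n = static_cast<i64>(N);
    return static_cast<u64>(((v % n) + n) % n);
}
i64 decode(u64 m) {
    return m > N / 2 ? static_cast<i64>(m) - static_cast<i64>(N) : static_cast<i64>(m);
}

// Enc(m; r) = (1 + m*N) * r^N mod N^2, with r coprime to N.
u64 encrypt(u64 m, u64 r) { return mul_mod(1 + (m % N) * N, pow_mod(r, N, N2), N2); }

// Dec(c) = L(c^PHI mod N^2) * PHI^{-1} mod N, where L(u) = (u - 1) / N.
u64 decrypt(u64 c) {
    const u64 l = (pow_mod(c, PHI, N2) - 1) / N;
    return mul_mod(l, inv_mod(PHI % N, N), N);
}

// Combine encrypted weights with plaintext, non-negative features:
// the product of enc_w[i]^x[i] decrypts to the sum of w[i] * x[i].
u64 encrypted_dot_product(const std::vector<u64>& enc_w, const std::vector<u64>& x) {
    assert(enc_w.size() == x.size());
    u64 acc = 1;  // 1 equals Enc(0) with randomness r = 1
    for (std::size_t i = 0; i < x.size(); ++i)
        acc = mul_mod(acc, pow_mod(enc_w[i], x[i], N2), N2);
    return acc;
}
```

For instance, encrypting the weights (3, −2, 5) and combining them with the features (4, 7, 1) gives a ciphertext that decrypts to 3, so the classifier's sign test would be positive. A real deployment would use a modulus of thousands of bits, fresh random `r` per encryption, and a big-integer type such as GMP's `mpz_class`.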

slide-16
SLIDE 16

Linear Classifier

Model size   Dot Product time   Enc. Comp. time   Total time   Comm.      Interactions
30           <0.01 s            0.194 s           0.204 s      35.84 kB   7
47           0.024 s            0.194 s           0.217 s      40.19 kB   7

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

slide-17
SLIDE 17

Naïve Bayes Classifier

$$\operatorname*{argmax}_{i \in [k]} \; p(C = c_i) \prod_{j=1}^{d} p(X_j = x_j \mid C = c_i)$$

slide-19
SLIDE 19

Naïve Bayes Classifier

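$$\operatorname*{argmax}_{i \in [k]} \; p(C = c_i) \prod_{j=1}^{d} p(X_j = x_j \mid C = c_i)$$

$$= \operatorname*{argmax}_{i \in [k]} \; \left( \log p(C = c_i) + \sum_{j=1}^{d} \log p(X_j = x_j \mid C = c_i) \right)$$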

  • Additive homomorphism + Encrypted argmax
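
The move to logarithms can be sketched in plaintext as follows. This is only an illustration of why sums suffice: the `NaiveBayesModel` layout, the `scaled_log` helper and its scale factor of 10^6 are assumptions (additive schemes operate on integers, so the logs are rounded to fixed point here), and no encryption is performed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical model layout: prior[i] = p(C = c_i),
// cond[i][j][v] = p(X_j = v | C = c_i).
struct NaiveBayesModel {
    std::vector<double> prior;
    std::vector<std::vector<std::vector<double>>> cond;
};

// Fixed-point logarithm: scale and round so that scores are integers.
// The scale factor is an arbitrary choice for this sketch.
long long scaled_log(double p, double scale = 1e6) {
    return std::llround(scale * std::log(p));
}

// Returns the index i maximizing log p(C = c_i) + sum_j log p(X_j = x_j | C = c_i).
std::size_t classify(const NaiveBayesModel& m, const std::vector<std::size_t>& x) {
    assert(m.prior.size() == m.cond.size());
    std::size_t best = 0;
    long long best_score = 0;
    for (std::size_t i = 0; i < m.prior.size(); ++i) {
        assert(m.cond[i].size() == x.size());
        long long score = scaled_log(m.prior[i]);
        for (std::size_t j = 0; j < x.size(); ++j)
            score += scaled_log(m.cond[i][j][x[j]]);  // only additions are needed
        if (i == 0 || score > best_score) {
            best = i;
            best_score = score;
        }
    }
    return best;
}
```

In a secure version, each per-class score would be accumulated under encryption by the additive homomorphism, and the final loop's comparison would be replaced by the encrypted argmax block.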
slide-20
SLIDE 20

Naïve Bayes Classifier

# Cat.   # Features   Argmax time   Total time   Comm.      Interactions
2        9            0.40 s        0.48 s       72.47 kB   14
5        9            1.33 s        1.42 s       150.7 kB   42
24       70           3.38 s        3.81 s       1911 kB    166

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

slide-21
SLIDE 21

Decision Trees

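[Figure: a two-dimensional feature space with axes x and y, split by the thresholds x1, x2, y1 and y2 into regions A to E, shown next to the corresponding decision tree. The tree's branches are labelled x < x1 / x ≥ x1, x < x2 / x ≥ x2, y < y1 and y > y2, and its leaves are A to E.]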

slide-22
SLIDE 22

Decision Tree

  • Combination of other classifiers
  • In this example, linear classifiers
  • Linear classifier + ES Switching + Decision Trees
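
As a plaintext illustration of a tree whose nodes are linear classifiers, consider the sketch below. The `Node` layout, the child encoding (non-negative values are node indices, negative values encode leaves) and the choice to evaluate every node test before walking the tree are all assumptions made for this example; the slides do not describe the library's internal representation, and no encryption or scheme switching is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// One internal node: the linear test <weights, x> + bias > 0.
// Children: a value >= 0 is a node index, a value < 0 encodes leaf -(value + 1).
struct Node {
    std::vector<long long> weights;
    long long bias;
    int if_true;
    int if_false;
};

std::string classify(const std::vector<Node>& tree,
                     const std::vector<std::string>& leaves,
                     const std::vector<long long>& x) {
    // Step 1: run every node's linear classifier up front, so the set of
    // comparisons performed does not depend on the path taken.
    std::vector<bool> bit(tree.size());
    for (std::size_t i = 0; i < tree.size(); ++i) {
        assert(tree[i].weights.size() == x.size());
        bit[i] = std::inner_product(tree[i].weights.begin(), tree[i].weights.end(),
                                    x.begin(), tree[i].bias) > 0;
    }
    // Step 2: follow the precomputed bits from the root down to a leaf.
    int cur = 0;
    while (cur >= 0) {
        const std::size_t idx = static_cast<std::size_t>(cur);
        cur = bit[idx] ? tree[idx].if_true : tree[idx].if_false;
    }
    return leaves[static_cast<std::size_t>(-(cur + 1))];
}
```

A two-node tree that tests x < 5 at the root and y < 3 below it can be written as `{{{-1, 0}, 5, 1, -3}, {{0, -1}, 3, -1, -2}}` with leaves `{"A", "B", "C"}`.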
slide-23
SLIDE 23

Decision Tree

Tree specs.       Time / protocol                                   Total   Comm.     Interactions
Nodes   Depth     Lin. Class.   ES Switch   Decision Tree (FHE)
4       4         0.45 s        1.64 s      0.27 s                  2.3 s   2639 kB   30
6       4         1.41 s        7.41 s      0.93 s                  9.8 s   3555 kB   44

Evaluation on UC Irvine ML databases
40 ms network latency
2.66 GHz Intel Core i7

slide-24
SLIDE 24

Decision Tree

Tree specs.       Time / protocol                                   Total   Comm.     Interactions
Nodes   Depth     Lin. Class.   ES Switch   Decision Tree (FHE)
4       4         0.45 s        1.64 s      0.27 s                  2.3 s   2639 kB   30
6       4         1.41 s        7.41 s      0.93 s                  9.8 s   3555 kB   44

Run sequentially, can be parallelized

slide-25
SLIDE 25

Building blocks library

  • Designed to be modular

Easy composition

  • Easy to construct new secure classifiers

Face detection algorithm (Viola & Jones)

slide-26
SLIDE 26

Building blocks library

[Diagram: Client and Server each run Dot Product, then Enc. Compare. Labels in the figure: v, w, SK, PK, the encrypted dot product ⟦⟨v, w⟩⟧, and the comparison result ⟨v, w⟩ > 0.]

E.g.: Linear Classifier

slide-27
SLIDE 27

Building blocks library

Client:

    bool Linear_Classifier_Client::run()
    {
        exchange_keys();

        // values_ is a vector of integers
        // compute the dot product
        mpz_class v = compute_dot_product(values_);
        mpz_class w = 1; // encryption of 0

        // compare the dot product with 0
        return enc_comparison(v, w, bit_size_, false);
    }

Server:

    void Linear_Classifier_Server_session::run_session()
    {
        exchange_keys();

        // enc_model_ is the encrypted model vector
        // compute the dot product
        help_compute_dot_product(enc_model_, true);

        // help the client to get
        // the sign of the dot product
        help_enc_comparison(bit_size_, false);
    }

E.g.: Linear Classifier

slide-28
SLIDE 28

In conclusion

  • Composable building blocks for secure classifiers
  • Library with practical performance

Future work:

  • Fewer round trips (work on the protocols)
  • More parallelism (work on the implementation)
slide-29
SLIDE 29

Questions?