Midterm Review – Li Xiong, Department of Mathematics and Computer Science – PowerPoint PPT Presentation



slide-1
SLIDE 1

CS573 Data Privacy and Security Midterm Review

Li Xiong

Department of Mathematics and Computer Science Emory University

slide-2
SLIDE 2

Principles of Data Security – CIA Triad

  • Confidentiality

– Prevent the disclosure of information to unauthorized users

  • Integrity

– Prevent improper modification

  • Availability

– Make data available to legitimate users

slide-3
SLIDE 3

Privacy vs. Confidentiality

  • Confidentiality

– Prevent disclosure of information to unauthorized users

  • Privacy

– Prevent disclosure of personal information to unauthorized users
– Control of how personal information is collected and used
– Prevent identification of individuals

11/8/2016 3

slide-4
SLIDE 4

Data Privacy and Security Measures

  • Access control

– Restrict access to the (subset or view of) data to authorized users

  • Cryptography

– Use encryption to encode information so it can only be read by authorized users (protected in transit and storage)

  • Inference control

– Restrict inference from accessible data to sensitive (non-accessible) data

slide-5
SLIDE 5
Inference Control

  • Inference control: Prevent inference from accessible information to individual information (not accessible)
  • Technologies

– De-identification and Anonymization (input perturbation)
– Differential Privacy (output perturbation)

slide-6
SLIDE 6

Traditional De-identification and Anonymization

[Diagram: Original Data → De-identification / Anonymization → Sanitized Records]

  • Attribute suppression, encoding, perturbation, generalization
  • Subject to re-identification and disclosure attacks
slide-7
SLIDE 7

Statistical Data Sharing with Differential Privacy

[Diagram: Original Data → Differentially Private Data Sharing → Statistics / Models / Synthetic Records]

  • Macro data (versus micro data)
  • Output perturbation (versus input perturbation)
  • More rigorous guarantee
slide-8
SLIDE 8

Cryptography

  • Encoding data in a way that only authorized users can read it

[Diagram: Original Data → Encryption → Encrypted Data]

slide-9
SLIDE 9


Applications of Cryptography

  • Secure data outsourcing

– Support computation and queries on encrypted data

[Diagram: client issues computation / queries against encrypted data held by the server]

slide-10
SLIDE 10


Applications of Cryptography

  • Multi-party secure computations (secure function evaluation)

– Securely compute a function without revealing private inputs

[Diagram: parties with private inputs x1, x2, x3, …, xn jointly compute f(x1, x2, …, xn)]

slide-11
SLIDE 11


Applications of Cryptography

  • Private information retrieval (access privacy)

– Retrieve data without revealing query (access pattern)

slide-12
SLIDE 12

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Local differential privacy
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-13
SLIDE 13

k-Anonymity

Original data (race, zipcode, diagnosis):

Caucas  78712  Flu
Asian   78705  Shingles
Caucas  78754  Flu
Asian   78705  Acne
AfrAm   78705  Acne
Caucas  78705  Flu

Anonymized data:

Caucas       787XX  Flu
Asian/AfrAm  78705  Shingles
Caucas       787XX  Flu
Asian/AfrAm  78705  Acne
Asian/AfrAm  78705  Acne
Caucas       787XX  Flu

Quasi-identifiers (QID) = race, zipcode. Sensitive attribute = diagnosis.
k-anonymity: the size of each QID group is at least k.
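The group-size condition above can be checked mechanically. A minimal Python sketch (the helper name and table literal are illustrative, not from the slides):

```python
from collections import Counter

def is_k_anonymous(records, qid_indices, k):
    """True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[i] for i in qid_indices) for r in records)
    return all(count >= k for count in groups.values())

# Anonymized table from the slide: QID = (race, zipcode), sensitive = diagnosis
table = [
    ("Caucas", "787XX", "Flu"),
    ("Asian/AfrAm", "78705", "Shingles"),
    ("Caucas", "787XX", "Flu"),
    ("Asian/AfrAm", "78705", "Acne"),
    ("Asian/AfrAm", "78705", "Acne"),
    ("Caucas", "787XX", "Flu"),
]
```

Here both QID groups have size 3, so the table is 3-anonymous but not 4-anonymous.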

slide-14
SLIDE 14


Problem of k-anonymity

External information links an individual to a quasi-identifier group:

Rusty Shackleford  Caucas  78705

Anonymized data:

Caucas       787XX  Flu
Asian/AfrAm  78705  Shingles
Caucas       787XX  Flu
Asian/AfrAm  78705  Acne
Asian/AfrAm  78705  Acne
Caucas       787XX  Flu

Problem: sensitive attributes are not “diverse” within each quasi-identifier group

slide-15
SLIDE 15

l-Diversity

Caucas       787XX  Flu
Caucas       787XX  Shingles
Caucas       787XX  Acne
Caucas       787XX  Flu
Caucas       787XX  Acne
Caucas       787XX  Flu
Asian/AfrAm  78XXX  Flu
Asian/AfrAm  78XXX  Flu
Asian/AfrAm  78XXX  Acne
Asian/AfrAm  78XXX  Shingles
Asian/AfrAm  78XXX  Acne

Entropy l-diversity: the entropy of the sensitive attribute within each quasi-identifier group must be at least log(l)

[Machanavajjhala et al. ICDE ‘06]
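The entropy condition can be sketched in Python (hypothetical helper; entropy l-diversity holds when the value returned below is at least log2(l)):

```python
import math
from collections import Counter, defaultdict

def min_group_entropy(records, qid_indices, sensitive_index):
    """Smallest Shannon entropy (in bits) of the sensitive attribute over QID groups."""
    groups = defaultdict(list)
    for r in records:
        groups[tuple(r[i] for i in qid_indices)].append(r[sensitive_index])

    def entropy(values):
        n = len(values)
        return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

    return min(entropy(v) for v in groups.values())
```

For a group with diagnoses {Flu, Flu, Shingles, Acne}, the entropy is 1.5 bits.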

slide-16
SLIDE 16

Problem with l-diversity

Original dataset (99% have HIV-):

HIV-  HIV-  HIV-  HIV-  HIV-  HIV+  HIV-  HIV-  HIV-  HIV-  HIV-  HIV-

Anonymization A:

Q1: HIV+  HIV-  HIV+  HIV-  HIV+  HIV-
Q2: HIV-  HIV-  HIV-  HIV-  HIV-  HIV-

Anonymization B:

Q1: HIV-  HIV-  HIV-  HIV+  HIV-  HIV-
Q2: HIV-  HIV-  HIV-  HIV-  HIV-  HIV-

50% HIV- ⇒ the quasi-identifier group is “diverse” … yet this leaks a ton of information
99% HIV- ⇒ the quasi-identifier group is not “diverse” … yet the anonymized database does not leak anything

slide-17
SLIDE 17

t-Closeness [Li et al. ICDE ‘07]

Caucas       787XX  Flu
Caucas       787XX  Shingles
Caucas       787XX  Acne
Caucas       787XX  Flu
Caucas       787XX  Acne
Caucas       787XX  Flu
Asian/AfrAm  78XXX  Flu
Asian/AfrAm  78XXX  Flu
Asian/AfrAm  78XXX  Acne
Asian/AfrAm  78XXX  Shingles
Asian/AfrAm  78XXX  Acne

The distribution of sensitive attributes within each quasi-identifier group should be “close” to their distribution in the entire original database

slide-18
SLIDE 18

Problems with Syntactic Privacy Notions

  • Syntactic

– Focuses on data transformation, not on what can be learned from the anonymized dataset

  • “Quasi-identifier” fallacy

– Assumes a priori that the attacker will not know certain information about his target
– The attacker may know the records in the database or external information

slide-19
SLIDE 19

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-20
SLIDE 20

Differential Privacy

  • Statistical outcome is indistinguishable regardless of whether a particular user (record) is included in the data


slide-22
SLIDE 22

[Figure: Original records and the original histogram computed from them]

Statistical Data Release: disclosure risk

slide-23
SLIDE 23

[Figure: Original records, the original histogram, and the perturbed histogram released with differential privacy]

Statistical Data Release: differential privacy

slide-24
SLIDE 24

Differential Privacy

A privacy mechanism A gives ε-differential privacy if for all neighboring databases D, D’ and for any possible output S ∈ Range(A):

Pr[A(D) = S] ≤ exp(ε) × Pr[A(D’) = S]

  • D and D’ are neighboring databases if they differ in one record

slide-25
SLIDE 25

Laplace Mechanism

  • Add Laplace noise to the true output: A(D) = f(D) + Laplace(Δf / ε)
  • Global sensitivity: Δf = max over neighboring D, D’ of |f(D) - f(D’)|

slide-26
SLIDE 26

Example: Laplace Mechanism

  • For a single counting query Q over a dataset D, returning Q(D) + Laplace(1/ε) gives ε-differential privacy.
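A minimal sketch of the Laplace mechanism for a counting query, using inverse-CDF sampling for the noise (function names are illustrative, not from the slides):

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of Laplace(0, scale)
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def dp_count(records, predicate, epsilon):
    """epsilon-DP count: true count plus Laplace(1/epsilon) noise.

    The sensitivity of a counting query is 1 (adding/removing one record
    changes the count by at most 1), so the noise scale is 1/epsilon.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)
```

Smaller ε means a larger noise scale and stronger privacy.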

slide-27
SLIDE 27

Exponential Mechanism

Sample an output r from the output space with probability weighted by a utility score function u(D, r)

slide-28
SLIDE 28

Exponential Mechanism

For a database D, output space R, and a utility score function u : D × R → ℝ, the algorithm A with

Pr[A(D) = r] ∝ exp(ε × u(D, r) / (2Δu))

satisfies ε-differential privacy, where Δu is the sensitivity of the utility score function:

Δu = max over r and neighboring D, D’ of |u(D, r) - u(D’, r)|

slide-29
SLIDE 29

Example: Exponential Mechanism

  • Scoring/utility function u: Inputs × Outputs → ℝ
  • D: nationalities of a set of people
  • f(D): most frequent nationality in D
  • u(D, O) = #(D, O), the number of people with nationality O

[Tutorial: Differential Privacy in the Wild, Module 2]
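The nationality example can be sketched with a generic exponential-mechanism sampler (all names here are illustrative; a real deployment should use cryptographically sound randomness):

```python
import math
import random

def exponential_mechanism(data, outputs, utility, sensitivity, epsilon):
    """Sample output r with probability proportional to exp(eps * u(D, r) / (2 * du))."""
    scores = [utility(data, r) for r in outputs]
    max_s = max(scores)  # subtract the max score for numerical stability
    weights = [math.exp(epsilon * (s - max_s) / (2 * sensitivity)) for s in scores]
    total = sum(weights)
    pick = random.random() * total
    acc = 0.0
    for r, w in zip(outputs, weights):
        acc += w
        if pick <= acc:
            return r
    return outputs[-1]

# Utility from the slide: number of people with nationality O (sensitivity 1,
# since one person's record changes any count by at most 1).
def count_utility(data, r):
    return sum(1 for x in data if x == r)
```

With a large ε the sampler almost always returns the true most frequent value; with small ε all outputs become nearly equally likely.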

slide-30
SLIDE 30

Composition theorems

Sequential composition: running mechanisms with budgets εi on the same data gives (∑i εi)-differential privacy.
Parallel composition: applying mechanisms to disjoint subsets of the data gives max(εi)-differential privacy.
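Sequential composition in practice: a sketch that splits a total budget ε across two counting queries on the same data, each answered with Laplace(2/ε) noise (illustrative names; the Laplace sampler is repeated here so the block is self-contained):

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling of Laplace(0, scale)
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1 - 2 * abs(u)), u)

def dp_count(records, predicate, epsilon):
    return sum(1 for r in records if predicate(r)) + laplace_noise(1.0 / epsilon)

def two_queries(ages, total_epsilon):
    # Sequential composition: two queries on the SAME data each get eps/2,
    # so the combined release satisfies (eps/2 + eps/2) = eps differential privacy.
    eps = total_epsilon / 2
    n_young = dp_count(ages, lambda a: a < 30, eps)
    n_old = dp_count(ages, lambda a: a >= 30, eps)
    return n_young, n_old
```

(These two predicates happen to partition the data, so parallel composition would in fact allow each query the full budget ε; the ε/2 split shown is the conservative sequential accounting.)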

slide-31
SLIDE 31

Differential Privacy

  • Differential privacy ensures an attacker can’t infer the presence or absence of a single record in the input based on any output
  • Building blocks

– Laplace mechanism, exponential mechanism

  • Composition rules help build complex algorithms from the building blocks

slide-32
SLIDE 32

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-33
SLIDE 33

Baseline: Laplace Mechanism

  • For the counting query Q on each histogram bin, returning Q(D) + Laplace(1/ε) gives ε-differential privacy.

slide-34
SLIDE 34

DPCube [SecureDM 2010, ICDE 2012 demo]

Original records:

Name   Age  Income  HIV+
Frank  42   30K     Y
Bob    31   60K     Y
Mary   28   20K     Y
…      …    …       …

[Diagram: Original Records → DP Interface → DP unit histogram (ε/2) → multi-dimensional partitioning → DP V-optimal histogram (ε/2)]

  • Compute unit histogram counts with differential privacy (ε/2)
  • Use the DP unit histogram for partitioning
  • Compute V-optimal histogram counts with differential privacy (ε/2)

slide-35
SLIDE 35

Private Spatial Decompositions [CPSSY 12]

  • quadtree, kd-tree
  • Need to ensure both the partitioning boundaries and the counts of each partition are differentially private

slide-36
SLIDE 36

Histogram methods vs parametric methods

  • Non-parametric methods: learn the empirical distribution through histograms (e.g., PSD, Privelet, FP, P-HP); only work well for low-dimensional data
  • Parametric methods: fit the data to a distribution and make inferences about the parameters (e.g., PrivacyOnTheMap); the joint distribution is difficult to model

[Diagram: Original data → Perturbation → Histogram → Synthetic data]

slide-37
SLIDE 37

DPCopula

  • A semi-parametric method

– Non-parametric estimation for each dimension: DP marginal histograms (Age, Hours/week, Income)
– Parametric estimation for the dependence structure

Original data set:

Age  Hours/week  Income
42   64          30K
31   82          60K
28   40          20K
43   36          80K
…    …           …

[Diagram: Original data set → DP marginal histograms + dependence structure → DP synthetic data set with the same schema]

slide-38
SLIDE 38
PrivBayes: Bayesian Network

  • Marginal distributions + Bayesian network

age  workclass  education  title  income

Pr[age]  Pr[work | age]  Pr[edu | age]  Pr[title | work]  Pr[income | work]

slide-39
SLIDE 39
Outline of the Algorithm

  • STEP 1: Choose a suitable Bayesian network N

– must be done in a differentially private way
– add edges with the highest mutual information (exponential mechanism)

  • STEP 2: Compute the conditional distributions implied by N

– straightforward to do under differential privacy: inject noise (Laplace mechanism)

  • STEP 3: Generate synthetic data by sampling from N

– post-processing: no privacy issues

slide-40
SLIDE 40

Evaluation for DP Histograms

  • Metrics: random range-count queries with random query predicates covering all attributes
  • Relative error and absolute error of the noisy query answers

slide-41
SLIDE 41

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-42
SLIDE 42

Frequent sequence mining (FSM)

Database D:

ID   Record
100  a→c→d
200  b→c→d
300  a→b→c→e→d
400  d→b
500  a→d→c→d

  • Scan D → C1 (candidate 1-seqs) with supports {a}: 3, {b}: 3, {c}: 4, {d}: 4, {e}: 1 → F1 (frequent 1-seqs): {a}: 3, {b}: 3, {c}: 4, {d}: 4
  • Scan D → C2 (candidate 2-seqs: all pairs {a→a} … {d→d}) → F2 (frequent 2-seqs): {a→c}: 3, {a→d}: 3, {c→d}: 4
  • Scan D → C3 (candidate 3-seqs): {a→c→d} → F3 (frequent 3-seqs): {a→c→d}: 3

slide-43
SLIDE 43

Baseline: Laplace Mechanism

Database D (as above). At each level k, perturb every candidate k-sequence’s support with Laplace(|Ck| / εk) noise, then keep the noisily frequent sequences:

  • Scan D → C1 with noisy supports → F1 (freq 1-seqs, noisy sup): {a}: 3.2, {c}: 4.4, {d}: 3.5
  • Scan D → C2 with noisy supports → F2 (freq 2-seqs, noisy sup): {a→c}: 3.3, {a→d}: 3.2, {c→d}: 4.2, {d→c}: 3.1
  • Scan D → C3 = {a→c→d}, {a→d→c} with noisy supports → F3 (freq 3-seqs, noisy sup): {a→c→d}: 3

slide-44
SLIDE 44

PFS2 Algorithm

  • Sensitivity impacted by two factors:

– Candidate size
– Sequence length

  • Basic idea: reduce sensitivity

– Use the kth sample database for pruning candidate k-sequences → reduces candidate size
– Shrink sequences by transformation while maintaining the frequent patterns → reduces sequence length

[Diagram: Original Database → Partition → 1st, 2nd, …, mth sample databases]

slide-45
SLIDE 45

DP Frequent Sequence Mining Evaluation

  • Metrics

– F-score = 2 × precision × recall / (precision + recall)
– Relative error: RE = median over x ∈ X of |sup′(x) - sup(x)| / sup(x)
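The two metrics can be sketched directly (hypothetical helper names; mined patterns are compared as sets, supports as dictionaries):

```python
from statistics import median

def f_score(true_patterns, found_patterns):
    """Harmonic mean of precision and recall of the mined pattern set."""
    true_set, found_set = set(true_patterns), set(found_patterns)
    if not true_set or not found_set:
        return 0.0
    hits = len(true_set & found_set)
    precision = hits / len(found_set)
    recall = hits / len(true_set)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def relative_error(true_support, noisy_support):
    """Median over patterns of |noisy_sup - true_sup| / true_sup."""
    return median(abs(noisy_support[p] - true_support[p]) / true_support[p]
                  for p in true_support)
```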

slide-46
SLIDE 46

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Local differential privacy
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-47
SLIDE 47

Local Differential Privacy

[Diagram: users reporting values about visited sites (Finance.com, Fashion.com, WeirdStuff.com, …) to an aggregating server]

  • No trusted server
  • Each user applies local perturbation before submitting the value to the server
  • The server only aggregates the values
  • Google Chrome deployment

slide-48
SLIDE 48

Randomized Response

D (true values of Disease Y/N): Y, Y, N, Y, N, N

  • With probability p, report the true value
  • With probability 1-p, report the flipped value

O (reported values): Y, N, N, N, Y, N

[Warner 1965]
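A sketch of randomized response with the standard debiasing estimator (illustrative names; keeping the true bit with probability p satisfies ε-local differential privacy for ε = ln(p/(1-p)), e.g., ε = ln 3 at p = 0.75):

```python
import random

def randomized_response(value, p):
    """Report the true bit (0/1) with probability p, the flipped bit otherwise."""
    return value if random.random() < p else 1 - value

def estimate_proportion(reports, p):
    """Debiased estimate of the true proportion pi of 1-bits.

    E[observed] = p*pi + (1-p)*(1-pi), so pi = (observed - (1-p)) / (2p - 1).
    """
    observed = sum(reports) / len(reports)
    return (observed - (1 - p)) / (2 * p - 1)
```

The server never sees any individual's true bit, yet the population proportion is recoverable up to sampling noise.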

slide-49
SLIDE 49

Differential Privacy Analysis

  • Consider 2 databases D, D’ (of size M) that differ in the jth value
  • D[j] ≠ D’[j]. But, D[i] = D’[i], for all i ≠ j
  • Consider some output O
slide-50
SLIDE 50

Course Topics

  • Inference control

– De-identification and anonymization
– Differential privacy foundations
– Differential privacy applications

  • Histograms
  • Data mining
  • Local differential privacy
  • Location privacy
  • Cryptography
  • Access control
  • Applications


slide-51
SLIDE 51

Individual Location Sharing: Existing Solutions and Challenges

  • Private information retrieval

– Computationally expensive

  • Spatial cloaking

– Syntactic privacy notion
– Temporal correlations due to road constraints and moving patterns
slide-52
SLIDE 52
Event-level differential privacy

  • Protect an event (the exact location of a single user at a given time)
  • Challenges:

– Large input domain (locations on the map) makes the output useless
– Temporal correlations

Definition (Differential Privacy): At any timestamp t, a randomized mechanism A satisfies ε-differential privacy if, for any output zt and any two locations x1 and x2:

Pr[A(x1) = zt] ≤ exp(ε) × Pr[A(x2) = zt]

Intuition: the released location zt (observed by the adversary) will not help an adversary to differentiate any input locations

slide-53
SLIDE 53
Geo-indistinguishability [CCS 13]

Definition (Geo-indistinguishability): At any timestamp t, a randomized mechanism A satisfies ε-geo-indistinguishability if, for any output zt and any two locations x1 and x2 within a circle of radius r:

Pr[A(x1) = zt] ≤ exp(εr) × Pr[A(x2) = zt]

Intuition: the released location zt (observed by the adversary) will not help an adversary to differentiate any two input locations that are close to each other

  • Challenges:

– Distance does not capture location semantics
– Temporal correlations

slide-54
SLIDE 54

Differential privacy on δ-location set under temporal correlations [CCS 15]

Definition (Differential Privacy on δ-location set): At any timestamp t, a randomized mechanism A satisfies ε-differential privacy on the δ-location set if, for any output zt and any two locations x1 and x2 in the δ-location set:

Pr[A(x1) = zt] ≤ exp(ε) × Pr[A(x2) = zt]

Intuition: the released location zt (observed by the adversary) will not help an adversary to differentiate any two locations in the δ-location set, the set of possible locations where the user might appear

slide-55
SLIDE 55

Course Topics

  • Inference control/Differential Privacy
  • Cryptography

– Foundations
– Applications

  • Secure outsourcing
  • Secure multiparty computations
  • Private information retrieval
  • Access control
  • Applications


slide-56
SLIDE 56


[Diagram: plaintext m → E with encryption key k → ciphertext Ek(m) → D with decryption key k’ → Dk’(Ek(m)) = m; an attacker observes the ciphertext]

Operational model of encryption

  • Kerckhoff’s assumption:

– the attacker knows E and D
– the attacker doesn’t know the (decryption) key

  • Attacker’s goal:

– to systematically recover plaintext from ciphertext
– to deduce the (decryption) key

  • Attack models:

– ciphertext-only
– known-plaintext
– (adaptive) chosen-plaintext
– (adaptive) chosen-ciphertext

slide-57
SLIDE 57


Cryptography Primitives

  • Symmetric encryption
  • Public-key encryption
  • Encryption schemes with different properties

– Homomorphic encryption
– Probabilistic encryption vs deterministic encryption
– Order-preserving encryption
– Commutative encryption

slide-58
SLIDE 58

Symmetric Key Cryptography

Symmetric key crypto: Bob and Alice share the same (symmetric) key KA-B. Example: AES

[Diagram: plaintext m → encryption algorithm with KA-B → ciphertext c = KA-B(m) → decryption algorithm with KA-B → plaintext m = KA-B(c)]

slide-59
SLIDE 59

Public-Key Cryptography

Public key for encryption and secret key for decryption Examples: RSA

slide-60
SLIDE 60

Course Topics

  • Inference control/Differential Privacy
  • Cryptography

– Foundations
– Applications

  • Secure multiparty computations
  • Secure outsourcing
  • Access control
  • Applications


slide-61
SLIDE 61

Multi-party secure computations (secure function evaluation)

– Securely compute a function without revealing private inputs

[Diagram: parties with private inputs x1, x2, x3, …, xn jointly compute f(x1, x2, …, xn)]

slide-62
SLIDE 62


Security Model

  • A protocol is secure if it emulates an ideal setting where the parties hand their inputs to a “trusted party,” who locally computes the desired outputs and hands them back to the parties

[Goldreich-Micali-Wigderson 1987]

[Diagram: A with input x1 receives f1(x1, x2); B with input x2 receives f2(x1, x2)]

slide-63
SLIDE 63


Properties of the Definition

  • Correctness

– All honest participants should receive the correct result of evaluating function f

  • Privacy

– All corrupt participants should learn no more from the protocol than what they would learn in the ideal model
– That is, only their own private inputs (obviously) and the result of f

slide-64
SLIDE 64


Adversary Models

  • Semi-honest (aka passive; honest-but-curious)

– Follows protocol, but tries to learn more from received messages than he would learn in the ideal model

  • Malicious

– Deviates from the protocol in arbitrary ways, lies about his inputs, may quit at any point

slide-65
SLIDE 65

Security proof tools

  • Real/ideal model: the real model can be simulated in the ideal model
  • Key idea

– Show that whatever can be computed by a party participating in the protocol can be computed based on its input and output only
– ∃ polynomial time S such that {S(x, f(x,y))} ≡ {View(x,y)}

  • Composition theorem

– If a protocol is secure in the hybrid model where the protocol uses a trusted party that computes the (sub)functionalities, and we replace the calls to the trusted party by calls to secure protocols, then the resulting protocol is secure
– Prove that component protocols are secure, then prove that the combined protocol is secure

slide-66
SLIDE 66

General protocols

  • Primitives

– Oblivious transfer (OT)
– Random shares

slide-67
SLIDE 67


Oblivious Transfer (OT)

  • Fundamental SMC primitive
  • 1-out-of-2 Oblivious Transfer (OT)

– S inputs two bits m0, m1; R inputs the index σ = 0 or 1 of one of S’s bits
– R learns his chosen bit mσ; S learns nothing
– S does not learn which bit R has chosen; R does not learn the value of the bit that he did not choose

[Rabin 1981]

slide-68
SLIDE 68

Secret Sharing Scheme

  • Splitting

– Encode the secret as an integer S
– Give to each player i (except one) a random integer ri
– Give to the last player the number S - ∑(j=1 to n-1) rj
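The splitting scheme above, sketched over a prime modulus so every share is uniformly distributed (illustrative names; any n-1 shares reveal nothing about the secret, while all n reconstruct it):

```python
import random

MODULUS = 2**61 - 1  # a Mersenne prime; all arithmetic is mod MODULUS

def split_secret(secret, n):
    """Additive sharing: n-1 random shares, plus one that makes the sum the secret."""
    shares = [random.randrange(MODULUS) for _ in range(n - 1)]
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    return sum(shares) % MODULUS
```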

slide-69
SLIDE 69

(t, n) threshold scheme

  • Shamir’s scheme (1979)

– It takes t points to define a polynomial of degree t-1
– Create a degree-(t-1) polynomial with the secret as the first coefficient and the remaining coefficients picked at random
– Find n points on the curve and give one to each of the players; at least t points are required to fit the polynomial
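A compact sketch of Shamir's (t, n) scheme over a prime field (illustrative names; requires Python 3.8+ for the modular inverse via three-argument pow):

```python
import random

PRIME = 2**61 - 1  # all arithmetic over GF(PRIME)

def make_shares(secret, t, n):
    """(t, n) Shamir sharing: random degree-(t-1) polynomial with f(0) = secret."""
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(t - 1)]

    def f(x):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation mod PRIME
            y = (y * x + c) % PRIME
        return y

    return [(x, f(x)) for x in range(1, n + 1)]

def recover(shares):
    """Lagrange interpolation at x = 0 from any t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

Any t of the n shares recover the secret; fewer than t reveal nothing about it.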

slide-70
SLIDE 70

General protocols

  • Passively-secure computation for the semi-honest model

– Yao’s garbled circuit for two parties (OT and symmetric encryption)
– GMW protocol for multiple parties (random shares and OT)

  • From passively-secure protocols to actively-secure protocols for the malicious model

– Use zero-knowledge proofs to force parties to behave in a way consistent with the passively-secure protocol

slide-71
SLIDE 71

Specialized protocols

  • Using secret sharing, special encryption schemes, or randomized responses

– May reveal some information
– Tradeoff of security and efficiency

  • Examples

– Secure sum by random shares
– Secure union (using commutative encryption)

  • Build complex protocols from primitive protocols

slide-72
SLIDE 72

Course Topics

  • Inference control/Differential Privacy
  • Cryptography

– Foundations
– Applications

  • Secure multiparty computations
  • Secure outsourcing
  • Access control
  • Applications


slide-73
SLIDE 73


Secure data outsourcing

  • Support computation and queries on encrypted data

[Diagram: client issues computation / queries against encrypted data held by the server]

slide-74
SLIDE 74

Secure data outsourcing

  • Crypto primitives

– Homomorphic encryption

  • General protocols based on fully homomorphic encryption: computationally prohibitive
  • Specialized protocols based on partially homomorphic encryption

– Property-preserving encryption

  • Deterministic encryption vs probabilistic encryption
  • Order-preserving encryption
slide-75
SLIDE 75

Homomorphic Encryption

[Diagram: Eval takes Enc[x] and a function f, and outputs Enc[f(x)] without decrypting]

slide-76
SLIDE 76

Homomorphic encryption schemes

  • Partially homomorphic encryption:

– homomorphic scheme where only one type of operation is possible (× or +)
– multiplicative homomorphic, e.g., RSA
– additive homomorphic, e.g., Paillier

  • Somewhat homomorphic encryption:

– homomorphic scheme that can perform a limited number of additions and multiplications

  • Fully homomorphic encryption (FHE) (Gentry, 2010)

– Can perform an unlimited number of additions and multiplications
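The multiplicative homomorphism of textbook RSA can be seen with a deliberately tiny (insecure) toy key; the parameters below are illustrative only:

```python
# Toy RSA with tiny primes, for illustration only (NOT secure).
p, q = 61, 53
n = p * q               # modulus n = 3233
phi = (p - 1) * (q - 1)
e = 17                  # public exponent
d = pow(e, -1, phi)     # private exponent: modular inverse of e mod phi

def enc(m):
    return pow(m, e, n)

def dec(c):
    return pow(c, d, n)

# Multiplicative homomorphism: Enc(a) * Enc(b) mod n is an encryption of a * b,
# because (a^e)(b^e) = (ab)^e (mod n).
a, b = 7, 9
assert dec(enc(a) * enc(b) % n) == (a * b) % n
```

Paillier has the analogous additive property: the product of two ciphertexts decrypts to the sum of the plaintexts.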

slide-77
SLIDE 77


slide-78
SLIDE 78

Using Partially Homomorphic Encryption with two servers

  • Two-server setting

– C1 holds encrypted data E(a), E(b)
– C2 holds the decryption key sk

  • Security goal:

– C1 and C2 do not learn anything about the data or the result

  • Basic idea

– Utilize the additive homomorphic property
– Use random shares to ensure C2 only has access to decrypted data masked with random shares

  • Primitive protocols: secure multiplication, secure comparison, …
  • Build complex protocols from primitive protocols, e.g., secure kNN queries

slide-79
SLIDE 79


slide-80
SLIDE 80

Secure data outsourcing

  • Crypto primitives

– Homomorphic encryption

  • General protocols based on fully homomorphic encryption: computationally prohibitive
  • Specialized protocols based on partially homomorphic encryption

– Property-preserving encryption

  • Deterministic encryption vs probabilistic encryption
  • Order-preserving encryption
slide-81
SLIDE 81


CryptDB

  • Use layers of encryption
  • Decrypt as needed