CryptDB: Protecting Confidentiality with Encrypted Query Processing



SLIDE 1

Li Xiong

CryptDB: Protecting Confidentiality with Encrypted Query Processing

CS573 Data Privacy and Security

Slides credit: Raluca Ada Popa, Catherine M. S. Redfield, Nickolai Zeldovich, and Hari Balakrishnan MIT CSAIL

SLIDE 2

Problem

  • Confidential data leaks from databases
  • 8M medical records compromised, 2009-2011 [Homeland Sec. News Wire, '11]
  • 2012: hackers extracted 6.5 million hashed passwords from the LinkedIn DB
  • Sony PlayStation Network: 77 million user profiles accessed

Figure: User 1, User 2, User 3 → Application → SQL → DB Server
Attackers shown: system administrator, hackers
Threat 1: passive DB server attacks
Threat 2: any attacks on all servers

SLIDE 3

CryptDB in a nutshell

Goal: protect confidentiality of data

1. Process SQL queries on encrypted data
2. Use fine-grained keys; chain these keys to user passwords based on access control

Figure: User 1, User 2, User 3 → Application → SQL → DB Server
Threat 1: passive DB server attacks
Threat 2: any attacks on all servers

SLIDE 4

Contributions

1. First practical DBMS to process most SQL queries on encrypted data
  • Hide DB from sys. admins, outsource DB
2. Modest overhead: 26% throughput loss for TPC-C
3. No changes to DBMS (e.g., Postgres, MySQL) and no changes to applications

SLIDE 5

Figure: comparison of approaches

  • Unencrypted databases: fast, insecure. Example: plaintext salary column (60, 100, 800, 100, 100) with an index; equality query.
  • FHE [Gentry'09], [BV'11,12], [GHS'12], ...: slow, strong security. Example: encrypted salary column (xa32601, x8199f3, x62d03b, xcef3f7, …); query input x24ab1c is fed to circuit C, which produces the output.
  • CryptDB: fast, high degree of security. Example: encrypted salary column (x4be2, x95c6, x2ea8, x17ce, x98aa) with an index over ciphertexts; equality query.

Most SQL uses a limited set of operations.
Security: reveal only relations among data that are required by classes of queries issued.

SLIDE 6

Figure: comparison of approaches

  • Unencrypted databases: fast, insecure
  • FHE: slow, strong security
  • CryptDB: fast, high degree of security. Example: encrypted salary column (x4be2, x95c6, x2ea8, x17ce, x98aa) with an index over ciphertexts; equality query.

Other work: weaker security, functionality, and/or efficiency:

  • Search on encrypted data (e.g., [Song et al., '00])
  • Systems proposals (e.g., [Hacigumus et al., '02])
  • Rewrite the DBMS, significant client-side processing

SLIDE 7

System Setup

Goal: process SQL queries on encrypted data

Figure: Application → (plain query) → Proxy → (transformed query) → DB Server holding the encrypted DB; encrypted results return to the proxy, decrypted results return to the application.

Proxy (trusted):
  • Stores schema, master key
  • No data storage
  • No query execution

DB Server (under attack):
  • Processes queries completely at the DBMS, on the encrypted database
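
A minimal Python sketch of the proxy's query transformation, under stated assumptions: the schema map, the master key, and the functions `det_token` and `rewrite_equality_query` are hypothetical names invented for illustration, and an HMAC tag stands in for the real encryption of the constant. It only shows the idea that the proxy anonymizes names and replaces constants before the query reaches the DB server.

```python
import hashlib
import hmac

# Hypothetical proxy state: a schema map and a master key (both invented here).
SCHEMA = {"emp": ("table1", {"rank": "col1", "name": "col2", "salary": "col3"})}
MASTER_KEY = b"demo-master-key"


def det_token(constant: str) -> str:
    """Deterministic stand-in for an encrypted constant (HMAC tag, not AES)."""
    tag = hmac.new(MASTER_KEY, constant.encode(), hashlib.sha256).hexdigest()
    return "x" + tag[:6]


def rewrite_equality_query(table: str, column: str, constant: str) -> str:
    """Anonymize the table and column names and replace the constant."""
    anon_table, columns = SCHEMA[table]
    return (
        f"SELECT * FROM {anon_table} "
        f"WHERE {columns[column]} = {det_token(constant)}"
    )


transformed = rewrite_equality_query("emp", "salary", "100")
print(transformed)
```

The DB server only ever sees the transformed string; the schema map and key stay at the proxy.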

SLIDE 8

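Example: equality query

Application issues: SELECT * FROM emp WHERE salary = 100
Proxy rewrites to:  SELECT * FROM table1 WHERE col3 = x5a8c34

Names are anonymized: emp is stored as table1; rank, name, and salary are stored as col1, col2, and col3.
Plaintext salaries (col3): 60, 100, 800, 100

  • Randomized encryption: ciphertexts x4be219, x95c623, x2ea887, x17cea7. No two are equal, so the server cannot tell which rows match (?).
  • Deterministic encryption: ciphertexts x934bc1, x5a8c34, x84a21c. Both rows with salary 100 encrypt to x5a8c34, so the server returns those two rows.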
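
The difference between the two modes can be sketched in Python. This is a toy built from the standard library only: a SHA-256 keystream stands in for AES, and the names `rnd_encrypt`, `det_encrypt`, and `decrypt` are assumptions made for this example, not CryptDB's API. The only difference between the two functions is where the nonce comes from: fresh randomness for RND, a keyed hash of the plaintext for DET.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (a stand-in for AES)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _xor(data: bytes, pad: bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, pad))


def rnd_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Randomized: a fresh nonce makes every ciphertext different."""
    nonce = secrets.token_bytes(16)
    return nonce + _xor(plaintext, _keystream(key, nonce, len(plaintext)))


def det_encrypt(key: bytes, plaintext: bytes) -> bytes:
    """Deterministic: the nonce is a keyed hash of the plaintext itself."""
    nonce = hmac.new(key, plaintext, hashlib.sha256).digest()[:16]
    return nonce + _xor(plaintext, _keystream(key, nonce, len(plaintext)))


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """Works for both modes, since the nonce travels with the ciphertext."""
    nonce, body = ciphertext[:16], ciphertext[16:]
    return _xor(body, _keystream(key, nonce, len(body)))


key = secrets.token_bytes(32)
salaries = [b"60", b"100", b"800", b"100"]
det_column = [det_encrypt(key, s) for s in salaries]
rnd_column = [rnd_encrypt(key, s) for s in salaries]

# The server can match rows on the DET column without knowing the key.
det_matches = [i for i, c in enumerate(det_column) if c == det_encrypt(key, b"100")]
```

With the DET column, `det_matches` contains the two rows whose salary is 100; with the RND column the same comparison finds nothing, which is the trade-off the slide illustrates.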

SLIDE 9

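Example: range query

Application issues: SELECT * FROM emp WHERE salary ≥ 100
Proxy rewrites to:  SELECT * FROM table1 WHERE col3 ≥ x638e54

table1 (emp) has columns col1/rank, col2/name, col3/salary. Plaintext salaries: 60, 100, 800, 100

  • Deterministic encryption (x934bc1, x5a8c34, x84a21c): supports equality, but the ciphertexts do not preserve order.
  • OPE (order) encryption: 60 → x1eab81, 100 → x638e54, 800 → x922eb4. The ciphertexts keep the plaintext order, so the server returns the rows x638e54, x922eb4, x638e54.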

SLIDE 10

Two techniques

1. Use a SQL-aware set of encryption schemes
  • Most SQL uses a limited set of operations
  • Use encryption schemes that cover the most common SQL operations
2. Adjust encryption of the database based on queries
  • Different queries require the data to be encrypted with different encryption schemes

SLIDE 11

Encryption schemes

Scheme   Function      Construction                                   SQL operations
RND      none          AES in CBC                                     (highest security)
HOM      +, *          Paillier                                       e.g., sum
DET      equality      AES in CMC                                     e.g., =, !=, IN, COUNT, GROUP BY, DISTINCT
SEARCH   word search   Song et al., '00                               restricted ILIKE; full-word matching
JOIN     join          our new scheme                                 equality matches between 2 columns
OPE      order         Boldyreva et al., '09 (first implementation)   e.g., >, <, ORDER BY, SORT, MAX, MIN
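
To illustrate how HOM supports a SUM, here is a toy Paillier sketch in Python. It is only a sketch under stated assumptions: the primes are far too small for real use, the generator is fixed to g = n + 1, and the function names are invented for this example. Multiplying two ciphertexts yields an encryption of the sum of their plaintexts, so the server can aggregate without decrypting. It needs Python 3.8 or later for `pow(x, -1, n)`.

```python
import math
import secrets

# Toy parameters: two small known primes (insecure, for illustration only).
P, Q = 104729, 1299709
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)  # valid shortcut because the generator is g = N + 1


def encrypt(m: int) -> int:
    """Paillier encryption: c = (N + 1)^m * r^N mod N^2 with random r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    return pow(N + 1, m, N2) * pow(r, N, N2) % N2


def decrypt(c: int) -> int:
    """Paillier decryption: m = L(c^LAM mod N^2) * MU mod N, L(u) = (u - 1) / N."""
    u = pow(c, LAM, N2)
    return (u - 1) // N * MU % N


def add(c1: int, c2: int) -> int:
    """Homomorphic addition: multiply ciphertexts modulo N^2."""
    return c1 * c2 % N2


salaries = [60, 100, 800, 100]
encrypted_sum = encrypt(0)
for ciphertext in [encrypt(s) for s in salaries]:
    encrypted_sum = add(encrypted_sum, ciphertext)  # done by the server
total = decrypt(encrypted_sum)                      # done by the proxy
```

Here `total` equals the plaintext sum of the salary column, although the server only multiplied ciphertexts.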

SLIDE 12

JOIN

  • Equality checks between two columns
  • Do not know columns to be joined a priori!

Algorithms:
  • KeyGen(sec. param): SK
  • Encrypt(SK, m, col i): C_m^i (deterministic)
  • Token(SK, col i, col j): (t_i, t_j)
  • Adjust(t_i, C_m^i): C_m

  • Correctness: adjustment yields correct join relations

Figure: the proxy holds the join key for col i and col j

SLIDE 13

JOIN (cont'd)

  • Security: do not learn join relations without token
  • Implementation: 192 bits long, 0.52 ms encrypt, 0.56 ms adjust

Figure: the proxy holds the join key for col i and col j
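
The KeyGen / Encrypt / Token / Adjust interface can be illustrated with a toy Python sketch. This is not the scheme from the slides: it assumes plain modular exponentiation over the integers modulo a Mersenne prime, an arbitrary base, and HMAC-derived per-column exponents, all chosen here only to make the adjustment idea concrete. Each column is encrypted under its own key, and the tokens re-key both columns to a shared join key so that equal plaintexts become equal ciphertexts.

```python
import hashlib
import hmac
import math
import secrets

P = 2**127 - 1   # a Mersenne prime; toy group of integers modulo P
G = 3            # fixed base, chosen arbitrarily for this sketch
ORDER = P - 1    # exponents are reduced modulo P - 1 (Fermat's little theorem)


def _prf(key: bytes, label: bytes) -> int:
    return int.from_bytes(hmac.new(key, label, hashlib.sha256).digest(), "big")


def _exponent(sk: bytes, name: str) -> int:
    """Derive a key exponent that is invertible modulo ORDER."""
    counter = 0
    while True:
        k = _prf(sk, f"key|{name}|{counter}".encode()) % ORDER
        if math.gcd(k, ORDER) == 1:
            return k
        counter += 1


def keygen() -> bytes:
    return secrets.token_bytes(32)


def encrypt(sk: bytes, m: bytes, col: str) -> int:
    """Deterministic per-column ciphertext G^(k_col * PRF(m)) mod P."""
    return pow(G, _exponent(sk, col) * _prf(sk, b"val|" + m) % ORDER, P)


def token(sk: bytes, col_i: str, col_j: str):
    """Tokens that move both columns to a shared join key."""
    k_join = _exponent(sk, f"join|{col_i}|{col_j}")
    t_i = k_join * pow(_exponent(sk, col_i), -1, ORDER) % ORDER
    t_j = k_join * pow(_exponent(sk, col_j), -1, ORDER) % ORDER
    return t_i, t_j


def adjust(t: int, c: int) -> int:
    """Re-key a ciphertext without decrypting it."""
    return pow(c, t, P)


sk = keygen()
c_i = encrypt(sk, b"alice", "col_i")
c_j = encrypt(sk, b"alice", "col_j")
t_i, t_j = token(sk, "col_i", "col_j")
joined = adjust(t_i, c_i) == adjust(t_j, c_j)
```

After adjustment, `joined` is true for equal plaintexts, which is the correctness property stated on the slide; the server never needs the secret key, only the tokens.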

SLIDE 14

OPE

  • Preserves order: ciphertexts sort in the same order as their plaintexts.
  • For example, for any secret key K, if x < y, then OPE_K(x) < OPE_K(y).
  • If a column is encrypted with OPE, the server can perform range queries when given encrypted constants OPE_K(c1) and OPE_K(c2) corresponding to the range [c1, c2].
  • OPE is a weaker encryption scheme than DET because it reveals order.
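
A toy Python sketch of the order-preserving property and of a server-side range query. This is not the Boldyreva et al. construction: it assumes a small non-negative integer domain and builds a keyed, strictly increasing map by summing pseudo-random positive gaps. The name `toy_ope_encrypt` and the sample salaries are assumptions for illustration.

```python
import hashlib
import hmac
import secrets


def _gap(key: bytes, i: int) -> int:
    """Keyed pseudo-random gap, always at least 1."""
    digest = hmac.new(key, i.to_bytes(8, "big"), hashlib.sha256).digest()
    return 1 + int.from_bytes(digest[:2], "big")


def toy_ope_encrypt(key: bytes, x: int) -> int:
    """Strictly increasing in x, because every added gap is positive."""
    if x < 0:
        raise ValueError("toy OPE only supports non-negative integers")
    return sum(_gap(key, i) for i in range(x + 1))


key = secrets.token_bytes(32)
salaries = [60, 100, 800, 100]
encrypted_column = [toy_ope_encrypt(key, s) for s in salaries]

# The proxy sends encrypted constants for the range [100, 800].
low, high = toy_ope_encrypt(key, 100), toy_ope_encrypt(key, 800)

# The server filters by comparing ciphertexts only.
matching_rows = [i for i, c in enumerate(encrypted_column) if low <= c <= high]
```

The server selects the rows holding 100, 800, and 100 by comparing ciphertexts alone, and it learns the relative order of all values in the process, which is the leakage noted in the last bullet.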

SLIDE 15

Encryption schemes

Scheme   Function      Construction
RND      none          AES in CBC
HOM      +, *          Paillier
DET      equality      AES in CMC
SEARCH   word search   Song et al., '00
JOIN     join          our new scheme
OPE      order         Boldyreva et al., '09 + our new scheme

Figure labels: Highest Security, Functionality

SLIDE 16

How to encrypt each data item?

  • Encryption schemes needed depend on queries
  • May not know queries ahead of time

Figure: should the rank column ('CEO', 'worker') be stored under ALL schemes (col1-RND, col1-HOM, col1-SEARCH, col1-DET, col1-JOIN, col1-OPE)? The OPE copy leaks order!

SLIDE 17

Onions of encryptions

Idea: stack encryption schemes into an onion of encryption.

Each value is stored in the following onions (outermost layer first):
  • Onion Equality: RND, DET, JOIN, value
  • Onion Order: RND, OPE, OPE-JOIN, value
  • Onion Search: SEARCH, text value
  • Onion Add: HOM, int value

  • Same key for all items in a column for same onion layer
  • Start out the database with the most secure encryption scheme

SLIDE 18

Onions of encryptions

  • Novel way to compactly store multiple ciphertexts within each other in the database and avoid expensive re-encryptions
  • Each value is dressed in layers of increasingly stronger encryption
  • Each layer of each onion enables certain kinds of functionality
  • For each layer of each onion, the proxy uses the same key for encrypting values in the same column, and different keys across tables, columns, onions, and onion layers

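
One way to picture the key assignment is a small Python sketch that derives a separate key for every (table, column, onion, layer) combination from a single master key. The HMAC-based derivation and the name `derive_key` are assumptions made for this example; the slides only state which values share a key and which do not.

```python
import hashlib
import hmac
import secrets


def derive_key(master_key: bytes, table: str, column: str,
               onion: str, layer: str) -> bytes:
    """Hypothetical per-layer key: a keyed hash of the layer's coordinates."""
    label = "|".join([table, column, onion, layer]).encode()
    return hmac.new(master_key, label, hashlib.sha256).digest()


master_key = secrets.token_bytes(32)

# Every value in the same column, onion, and layer shares one key ...
k1 = derive_key(master_key, "table1", "col1", "Eq", "RND")
k2 = derive_key(master_key, "table1", "col1", "Eq", "RND")

# ... while other layers, onions, and columns get unrelated keys.
k_det = derive_key(master_key, "table1", "col1", "Eq", "DET")
k_ord = derive_key(master_key, "table1", "col1", "Order", "RND")
k_col2 = derive_key(master_key, "table1", "col2", "Eq", "RND")
```

Because only the master key is stored, handing the server one layer key (for example the RND key of one column) reveals nothing about the keys of other layers or columns.
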
SLIDE 19

Adjust encryption

  • Dynamically adjusts the layer of encryption on the DBMS server
  • Strip off layers of the onions
  • Proxy gives keys to server using a SQL UDF ("user-defined function")
  • Proxy remembers onion layer for columns
  • Do not put back onion layer

SLIDE 20

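Example:

SELECT * FROM emp WHERE rank = 'CEO';

emp (rank, name, salary), with rank values 'CEO' and 'worker', is stored as table 1 with columns col1-OnionEq, col1-OnionOrder, col1-OnionSearch, col2-OnionEq, …

Onion Equality for 'CEO' (outermost layer first): RND, DET, JOIN

Initially every onion is at its outermost layer: RND for the equality and order onions, SEARCH for the search onion.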

SLIDE 21

Example (cont'd)

Application query:
SELECT * FROM emp WHERE rank = 'CEO';

Step 1: the proxy has the server strip the RND layer of col1-OnionEq, exposing DET:
UPDATE table1 SET col1-OnionEq = Decrypt_RND(key, col1-OnionEq);

Step 2: the proxy issues the rewritten query with the DET ciphertext of 'CEO':
SELECT * FROM table1 WHERE col1-OnionEq = xda5c0407;

After the adjustment, col1-OnionEq is at DET; col1-OnionOrder (RND), col1-OnionSearch (SEARCH), and col2-OnionEq (RND) are unchanged.
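
The two steps can be simulated end to end with Python's built-in `sqlite3` module, which allows registering a UDF. This is a sketch under stated assumptions: SQLite stands in for the DBMS, a SHA-256 keystream stands in for AES, the column is named `col1_OnionEq` (with an underscore so it is a valid identifier), and the helper names are invented for illustration.

```python
import hashlib
import hmac
import secrets
import sqlite3


def _pad(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (a stand-in for AES)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def _seal(key: bytes, nonce: bytes, data: bytes) -> bytes:
    return nonce + bytes(a ^ b for a, b in zip(data, _pad(key, nonce, len(data))))


def rnd_encrypt(key: bytes, data: bytes) -> bytes:
    return _seal(key, secrets.token_bytes(16), data)


def det_encrypt(key: bytes, data: bytes) -> bytes:
    return _seal(key, hmac.new(key, data, hashlib.sha256).digest()[:16], data)


def rnd_decrypt(key: bytes, blob: bytes) -> bytes:
    """Strips one layer; registered below as the SQL UDF decrypt_rnd."""
    nonce, body = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(key, nonce, len(body))))


det_key, rnd_key = secrets.token_bytes(32), secrets.token_bytes(32)
db = sqlite3.connect(":memory:")
db.create_function("decrypt_rnd", 2, rnd_decrypt)
db.execute("CREATE TABLE table1 (col1_OnionEq BLOB)")
for rank in [b"CEO", b"worker", b"worker"]:
    onion = rnd_encrypt(rnd_key, det_encrypt(det_key, rank))  # RND over DET
    db.execute("INSERT INTO table1 VALUES (?)", (onion,))

query = "SELECT COUNT(*) FROM table1 WHERE col1_OnionEq = ?"
constant = det_encrypt(det_key, b"CEO")
matches_before = db.execute(query, (constant,)).fetchone()[0]

# Step 1: the proxy hands over the RND key; the server strips the layer.
db.execute("UPDATE table1 SET col1_OnionEq = decrypt_rnd(?, col1_OnionEq)", (rnd_key,))

# Step 2: the rewritten equality query now works on DET ciphertexts.
matches_after = db.execute(query, (constant,)).fetchone()[0]
```

Before the adjustment the equality query finds no rows; afterwards it finds the single 'CEO' row, and the server still holds only DET ciphertexts rather than plaintext.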

SLIDE 22

Confidentiality level

Queries                          Encryption scheme exposed   Amount of leakage
no filter on a column            RND                         nothing
aggregation on a column          HOM                         nothing
equality predicate on a column   DET                         repeats

  • The query types above are common in practice
  • Never reveals plaintext
  • Encryption schemes exposed for each column are the most secure ones that enable the queries
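
The table can be read as a lookup from query class to exposed scheme. A short Python sketch, assuming a hypothetical workload description of (column, query class) pairs, computes what each column ends up exposing and what that leaks; the dictionary contents mirror the table, while the function name and workload format are invented for illustration.

```python
from collections import defaultdict

# Mirrors the table: query class -> scheme exposed -> leakage.
SCHEME_FOR_QUERY = {
    "no filter": "RND",
    "aggregation": "HOM",
    "equality predicate": "DET",
}
LEAKAGE = {"RND": "nothing", "HOM": "nothing", "DET": "repeats"}


def exposure(workload):
    """Map each column to the schemes its queries expose and their leakage."""
    exposed = defaultdict(set)
    for column, query_class in workload:
        exposed[column].add(SCHEME_FOR_QUERY[query_class])
    return {
        column: {scheme: LEAKAGE[scheme] for scheme in schemes}
        for column, schemes in exposed.items()
    }


workload = [
    ("name", "no filter"),
    ("salary", "aggregation"),
    ("rank", "equality predicate"),
]
report = exposure(workload)
```

In this workload only the rank column reveals anything (repeated values); name and salary stay at schemes that leak nothing.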

SLIDE 23

Implementation

Figure: Application → (query, via the SQL interface) → CryptDB Proxy → (transformed query) → Server: unmodified DBMS plus CryptDB SQL UDFs (user-defined functions); encrypted results return to the proxy, results return to the application.

  • No change to the DBMS
  • Portable: from Postgres to MySQL with 86 lines
  • No change to applications

SLIDE 24

Evaluation

1. Does it support real queries/applications?
2. What is the resulting confidentiality?
3. What is the performance overhead?

SLIDE 25

Queries not supported

  • More complex operators, e.g., trigonometry
  • Operations that require combining incompatible encryption schemes, e.g., T1.a + T1.b > T2.c

Extensions: split queries, precompute columns, or add new encryption schemes

SLIDE 26

Real queries/applications

Application   Total columns   Encrypted columns
phpBB         563             23
HotCRP        204             22
grad-apply    706             103
TPC-C         92              92
sql.mit.edu   128,840         128,840

# cols not supported: 1,094

Examples of unsupported queries:
SELECT 1/log(series_no+1.2) …
… WHERE sin(latitude + PI()) …

SLIDE 27

Resulting confidentiality

Application   Total columns   Encrypted columns   Min level is RND   Min level is DET   Min level is OPE
phpBB         563             23                  21                 1                  1
HotCRP        204             22                  18                 1                  2
grad-apply    706             103                 95                 6                  2
TPC-C         92              92                  65                 19                 8
sql.mit.edu   128,840         128,840             80,053             34,212             13,131

  • Most columns at RND
  • Most columns at OPE analyzed were less sensitive

SLIDE 28

Performance

Measured: DB server throughput and latency

  • MySQL: Application 1, Application 2 → plain database
  • CryptDB: Application 1 → CryptDB Proxy, Application 2 → CryptDB Proxy → encrypted database
  • Hardware: 2.4 GHz Intel Xeon E5620, 8 cores, 12 GB RAM

SLIDE 29

TPC-C performance

  • Max. throughput loss: 26%
  • Latency: 0.10 ms/query for MySQL vs. 0.72 ms/query for CryptDB

SLIDE 30

TPC-C microbenchmarks

  • Encrypted DBMS is practical
  • No cryptography at the DB server in the steady state!
  • Figure label: homomorphic addition

SLIDE 31

Conclusions

CryptDB:

1. The first practical DBMS for running most standard queries on encrypted data
2. Protects data of users logged out during attack even when all servers are compromised
3. Modest overhead and no changes to DBMS

Website: http://css.csail.mit.edu/cryptdb/