SQL on Structurally-Encrypted Databases Seny Kamara Tarik Moataz Q - - PowerPoint PPT Presentation



slide-1
SLIDE 1

SQL on Structurally-Encrypted Databases

Seny Kamara Tarik Moataz

slide-2
SLIDE 2

Q: What is a relational database?

slide-3
SLIDE 3

Relational DB

[Figure: DB shown as two tables T1 and T2 over attributes Att1 through Att7, with labels marking a row (record), a table (relation), and a column (attribute).]

slide-4
SLIDE 4

Structured Query Language

  • SQL is a language for querying relational DBs
  • Example:

Select (name, gender, height) From (T2, T8) Where (age = 36 AND zip = 10040 AND gender = F)

  • SQL is the standard way to query a relational DB
  • Standard ANSI/ISO since 1986/1987
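
A query of the same conjunctive shape can be tried with Python's built-in sqlite3 module. This is only an illustrative sketch: the table name `people`, its rows, and the single-table layout are assumptions, since the contents of T2 and T8 are not given on the slide.

```python
import sqlite3

# In-memory database with one hypothetical table (not from the slides).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (name TEXT, gender TEXT, height INTEGER, age INTEGER, zip TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("Alice", "F", 170, 36, "10040"),
        ("Bob", "M", 180, 36, "10040"),
        ("Carol", "F", 165, 41, "10040"),
    ],
)

# Conjunctive query: every condition in the WHERE clause is joined by AND.
rows = conn.execute(
    "SELECT name, gender, height FROM people "
    "WHERE age = ? AND zip = ? AND gender = ?",
    (36, "10040", "F"),
).fetchall()
print(rows)
```

Only the row matching all three conditions is returned.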
slide-5
SLIDE 5

Q: What is Structured Encryption (STE)?

slide-6
SLIDE 6

Structured Encryption (STE) [CK10]

Setup(1^k, DS) ⟾ (K, EDS)

Token(K, q) ⟾ tk

Query(EDS, tk) ⟾ ct

[Figure: diagram showing DS, EDS, tk, and ct flowing between Setup, Token, and Query.]

slide-7
SLIDE 7

Structured Encryption (STE) [CK10]

Setup(1^k, DS) ⟾ (K, EDS)

Token(K, q) ⟾ tk

Query(EDS, tk) ⟾ ans

  • Setup leakage: LS(DS)
  • Query leakage: LQ(DS, q)

slide-8
SLIDE 8

Structured Encryption (STE) [CK10]

We say that an STE scheme is (LS, LQ)-secure if:

  • It reveals no information about the structure beyond LS
  • It reveals no information about the structure and queries beyond LQ
slide-9
SLIDE 9

Encrypted Multi-Maps [CK10]

Single-keyword SSE: [SWP00], [Goh03], [CGKO06], [CK10], [KPR12], [KP13], [CJJKRS13], [CJJJKRS14], [Bost16], [BMO17], [AKM19], …

Encrypted Multi-Map / Encrypted Inverted Index

slide-10
SLIDE 10

Q: How can we encrypt a relational DB?

slide-11
SLIDE 11

Efficiency Leakage Functionality

slide-12
SLIDE 12

Tradeoffs: Efficiency vs. Security

[Figure: plot of efficiency against leakage, positioning the STE/SSE-based, PPE-based, FHE-based, ORAM-based, skFE-based, and pkFE-based approaches.]
slide-13
SLIDE 13

Tradeoffs: Functionality vs. Efficiency

[Figure: plot of functionality against efficiency, positioning the SK-FE-based, STE/SSE-based, PPE-based, FHE-based, ORAM-based, and PK-FE-based approaches, with SQL and NoSQL marked.]
slide-14
SLIDE 14

Q: Can we design an STE-based Relational EDB?

slide-15
SLIDE 15

Challenges

  • No PPE so no plug-and-play solutions
  • SQL is a declarative language
  • Where do we even start?
  • SQL is complex
  • Combination of many basic query types
  • Most STE schemes handle a single type of query
  • SQL is “constructive”
  • STE has been optimized for “lookup-type” queries
slide-16
SLIDE 16

Ch. #1: Declarative => Procedural

  • Relational algebra [Codd70]
  • Set of operations on relations/tables
  • Union
  • Difference
  • Selection
  • Projection
  • Cross product
  • Join (many kinds)

SQL => RA

slide-17
SLIDE 17

Ch. #2: Complex => Simple

  • SPC algebra [Chandra-Merlin77]
  • Selection, Projection, Cross product
  • Equivalent to Conjunctive SQL queries
  • Any SPC query can be written in a Normal Form:

SQL => RA => SPC

slide-18
SLIDE 18

Select, Project, Cross Product

[Figure: three panels. Select (σΨ) keeps some rows of a table over Att1, Att2, Att3; project (π) keeps the columns Att2 and Att3; cross product combines a table over Att1, Att2, Att3 with a table over Att4, Att5, Att6.]
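
The three SPC operators can be sketched in a few lines of Python. This is an illustrative plaintext model only: tables are lists of dicts, the helper names are hypothetical, and the sample tables T1 and T2 are borrowed from the value-representation example later in the deck. The last lines evaluate a query in the project-select-cross shape.

```python
from itertools import product


def select(rows, predicate):
    """Selection: keep the rows that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def project(rows, attrs):
    """Projection: keep only the listed attributes of every row."""
    return [{a: r[a] for a in attrs} for r in rows]


def cross(left, right):
    """Cross product: concatenate every left row with every right row."""
    return [{**a, **b} for a, b in product(left, right)]


T1 = [{"Att1": 1, "Att2": "CS"}, {"Att1": 2, "Att2": "Math"}]
T2 = [
    {"Att3": 1, "Att4": 45, "Att5": "CS"},
    {"Att3": 2, "Att4": 45, "Att5": "Math"},
    {"Att3": 2, "Att4": 60, "Att5": "CS"},
]

# Project Att3 from the rows of T1 x T2 where Att2 equals Att5.
result = project(
    select(cross(T1, T2), lambda r: r["Att2"] == r["Att5"]),
    ["Att3"],
)
print(result)
```

Evaluating the query this way materializes all six rows of T1 × T2 before filtering.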
slide-19
SLIDE 19

Our Goal

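SQL => SPC => NF

[Figure: tables over Att1, Att2 and Att1, Att3 encrypted under STEK; a token tk derived from the query; a result over Att2, Att3 encrypted under EncK.]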

slide-20
SLIDE 20

Our Results

  • SPX: Encrypted Relational Database
  • First STE scheme for relational DBs
  • Handles non-trivial subset of SQL
  • Sub-linear search and storage complexity (optimal under certain conditions)
  • from any single-keyword SSE
  • SPX+: dynamic SPX
  • Only row addition and deletion
  • from any dynamic single-keyword SSE
  • Sub-linear search and storage complexity (optimal under certain conditions)
  • FP-SPX+: forward-private dynamic SPX
  • poly-logarithmic overhead for updates
slide-21
SLIDE 21

A: Naïve STE-based Relational EDB

slide-22
SLIDE 22

Naïve SPC Algorithm

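[Figure: the cross product of a table over Att1, Att2, Att3 and a table over Att4, Att5, Att6 is built in full over Att1 through Att6, then reduced to a result over Att2 and Att6.]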
slide-23
SLIDE 23

Sub-Linear SPC Algorithm

  • Ideally linear in output size:
  • Less than cross product size:
[Figure: result table over Att2 and Att6 shown next to the input tables over Att1 through Att6.]
slide-24
SLIDE 24

Q: Can we achieve sub-linear STE-based EDB?

slide-25
SLIDE 25

SPX Overview

  • Step 1. Heuristic normal form (HNF) instead of the standard normal form
  • Avoid naïve Cartesian product by a “push select through product” method
  • Step 2. New (plaintext) data structure that supports HNF
  • Different representations of the database to handle different SPC operators
  • Step 3. Encrypted structure that supports HNF queries
  • Chaining technique with a better control of leakage
  • From any single-keyword SSE scheme
slide-26
SLIDE 26

Step 1: Heuristic Normal Form (1)

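[Figure: σΨ applied to the cross product of a table over Att1, Att2, Att3 and a table over Att4, Att5, Att6 is replaced by σΨ1 and σΨ2 applied to the two tables separately.]

Ψ = Ψ1 ∧ Ψ2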

More complicated

  • Correlated/non-correlated
  • Different types of select

Push Select through Product
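
The simplest instance of pushing a select through a product can be checked directly in Python. The sketch assumes the easy, non-correlated case in which Ψ1 mentions only attributes of the first table and Ψ2 only attributes of the second; the predicates and sample rows are chosen for illustration and are not taken from the slide.

```python
from itertools import product


def select(rows, predicate):
    """Selection over a list of dict rows."""
    return [r for r in rows if predicate(r)]


def cross(left, right):
    """Cross product of two lists of dict rows."""
    return [{**a, **b} for a, b in product(left, right)]


T1 = [{"Att1": 1, "Att2": "CS"}, {"Att1": 2, "Att2": "Math"}]
T2 = [
    {"Att3": 1, "Att4": 45, "Att5": "CS"},
    {"Att3": 2, "Att4": 45, "Att5": "Math"},
    {"Att3": 2, "Att4": 60, "Att5": "CS"},
]


def psi1(r):
    return r["Att2"] == "CS"  # touches only T1


def psi2(r):
    return r["Att4"] == 45  # touches only T2


# Naive: build the whole product, then filter with psi1 AND psi2.
naive = select(cross(T1, T2), lambda r: psi1(r) and psi2(r))

# Pushed: filter each table first, then take the (smaller) product.
pushed = cross(select(T1, psi1), select(T2, psi2))

print(naive == pushed, len(cross(T1, T2)), len(pushed))
```

Both strategies return the same rows, but the pushed version only ever builds a product of the already-filtered tables. Correlated predicates such as Att2 = Att5 do not split this way, which is the "more complicated" case.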

slide-27
SLIDE 27

Step 1: Heuristic Normal Form (2)

[Figure: σΨ1 and σΨ2 applied to the tables over Att1, Att2, Att3 and Att4, Att5, Att6; the intermediate tables over Att1 through Att6 illustrate the size overhead.]

Size Overhead

slide-28
SLIDE 28

Step 2: Database representations

DB = (T1, T2), over attributes Att1 through Att5

  • Row representation
  • Column representation
  • Value representation
  • Cross-value representation
slide-29
SLIDE 29

Step 2: Row / Column representation

Tables: T1 over (Att1, Att2) and T2 over (Att3, Att4, Att5)

Row Multi-map MMR, labels: (T1, 1), (T1, 2), (T2, 1), (T2, 2), (T2, 3)

Column Multi-map MMC, labels: (T1, Att1), (T1, Att2), (T2, Att3), (T2, Att4), (T2, Att5)
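
A plaintext sketch of the row and column multi-maps, assuming each row label (table, row number) maps to that row and each column label (table, attribute) maps to that column. The `TABLES` layout and function names are hypothetical, and the cell values are taken from the example tables on the following slides.

```python
# Hypothetical in-memory layout for the example database.
TABLES = {
    "T1": {"attrs": ["Att1", "Att2"], "rows": [(1, "CS"), (2, "Math")]},
    "T2": {
        "attrs": ["Att3", "Att4", "Att5"],
        "rows": [(1, 45, "CS"), (2, 45, "Math"), (2, 60, "CS")],
    },
}


def row_multimap(tables):
    """MMR: (table, row number) -> the cells of that row (rows count from 1)."""
    return {
        (name, i): row
        for name, t in tables.items()
        for i, row in enumerate(t["rows"], start=1)
    }


def column_multimap(tables):
    """MMC: (table, attribute) -> the cells of that column."""
    return {
        (name, attr): tuple(row[j] for row in t["rows"])
        for name, t in tables.items()
        for j, attr in enumerate(t["attrs"])
    }


mm_r = row_multimap(TABLES)
mm_c = column_multimap(TABLES)
print(sorted(mm_r))
print(mm_c[("T2", "Att4")])
```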
slide-30
SLIDE 30

Step 2: Value representation

T1:
Att1  Att2
1     CS
2     Math

T2:
Att3  Att4  Att5
1     45    CS
2     45    Math
2     60    CS

Value Multi-map MMv, labels: (1, T1, Att1), (2, T1, Att1), (CS, T1, Att2), (Math, T1, Att2), (1, T2, Att3), (2, T2, Att3), (45, T2, Att4), (60, T2, Att4), (CS, T2, Att5), (Math, T2, Att5)

Each label maps to the coordinates of the rows that hold that value in that column; for example, (1, T1, Att1) maps to (T1, 1).
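
The value multi-map can be built from the example tables with a single pass over the cells. This is a plaintext sketch; the `TABLES` layout and the function name are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory layout for the example database.
TABLES = {
    "T1": {"attrs": ["Att1", "Att2"], "rows": [(1, "CS"), (2, "Math")]},
    "T2": {
        "attrs": ["Att3", "Att4", "Att5"],
        "rows": [(1, 45, "CS"), (2, 45, "Math"), (2, 60, "CS")],
    },
}


def value_multimap(tables):
    """MMv: (value, table, attribute) -> coordinates of rows holding the value."""
    mm = defaultdict(list)
    for name, t in tables.items():
        for i, row in enumerate(t["rows"], start=1):
            for attr, value in zip(t["attrs"], row):
                mm[(value, name, attr)].append((name, i))
    return dict(mm)


mm_v = value_multimap(TABLES)
print(len(mm_v))
print(mm_v[("CS", "T2", "Att5")])
```

A selection of the form Att5 = CS then becomes a single lookup on the label (CS, T2, Att5) rather than a scan of T2.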
slide-31
SLIDE 31

Step 2: Cross-Value representation

T1 (Att1, Att2): (1, CS), (2, Math)
T2 (Att3, Att4, Att5): (1, 45, CS), (2, 45, Math), (2, 60, CS)

Cross-Value Multi-map MMAtt1, label ((T1, Att1), (T2, Att3)):
(T1, 1), (T2, 1)
(T1, 2), (T2, 2)
(T1, 2), (T2, 3)

Cross-Value Multi-map MMAtt2, label ((T1, Att2), (T2, Att5)):
(T1, 1), (T2, 1)
(T1, 2), (T2, 3)
(T1, 2), (T2, 2)
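
A cross-value multi-map for one pair of columns can be sketched as follows, assuming it stores every pair of row coordinates whose cells in the two columns are equal. The nested loop is a deliberately simple choice for illustration (a hash-based join would scale better), and the `TABLES` layout and function name are hypothetical.

```python
# Hypothetical in-memory layout for the example database.
TABLES = {
    "T1": {"attrs": ["Att1", "Att2"], "rows": [(1, "CS"), (2, "Math")]},
    "T2": {
        "attrs": ["Att3", "Att4", "Att5"],
        "rows": [(1, 45, "CS"), (2, 45, "Math"), (2, 60, "CS")],
    },
}


def cross_value_multimap(tables, left, right):
    """Map the label (left, right) to the row pairs with equal cell values."""
    (lt, la), (rt, ra) = left, right
    lj = tables[lt]["attrs"].index(la)
    rj = tables[rt]["attrs"].index(ra)
    pairs = []
    for i, lrow in enumerate(tables[lt]["rows"], start=1):
        for k, rrow in enumerate(tables[rt]["rows"], start=1):
            if lrow[lj] == rrow[rj]:
                pairs.append(((lt, i), (rt, k)))
    return {(left, right): pairs}


label = (("T1", "Att1"), ("T2", "Att3"))
mm_att1 = cross_value_multimap(TABLES, *label)
print(mm_att1[label])
```

The precomputed pairs let an equality predicate between two columns be answered with one lookup, without forming the cross product at query time.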
slide-32
SLIDE 32

Step 3: SPX Setup

SetupSPX(1^k, DB), where DB consists of the tables over Att1 through Att5
slide-33
SLIDE 33

Step 3: SPX Setup

SetupSPX outputs a key and the following encrypted structures, which mirror the plaintext multi-maps of Step 2:

  • Encrypted Row Multi-map EMMR
  • Encrypted Column Multi-map EMMC
  • Encrypted Value Multi-map EMMv
  • Encrypted Cross-Values Multi-maps EMMAtt1 and EMMAtt2
  • Encrypted dictionary EDX, with labels Att1 and Att2
slide-34
SLIDE 34

Step 3: SPX Token (1)

TokenSPX(K, Select Att3 From (T1, T2) Where T1.Att2 = T2.Att5)

slide-35
SLIDE 35

Step 3: SPX Token (2)

πatt3(σatt2=att5(T1 × T2))

  • 1. Rewrite SQL query to Normal Form
  • 2. Rewrite Normal Form to Heuristic Normal Form
  • 3. Generate the token

The token consists of three sub-tokens:

  • Dictionary sub-token: Att2
  • Select sub-token: ((T1, Att2), (T2, Att5))
  • Projection sub-token: 3
slide-36
SLIDE 36

Step 3: SPX Query (1)

QuerySPX(EDB, tk), with:

  • EDB: Encrypted Column Multi-map EMMC, Encrypted Row Multi-map EMMR, Encrypted Value Multi-map EMMv, Encrypted dictionary EDX
  • tk: Att2, ((T1, Att2), (T2, Att5)), 3

slide-37
SLIDE 37

Step 3: SPX Query (2)

Get(EDX, Att2) ⟾ EMMAtt2

The encrypted dictionary EDX stores EMMAtt1 under Att1 and EMMAtt2 under Att2. The dictionary sub-token Att2 retrieves the Encrypted Cross-Values Multi-map EMMAtt2.

slide-38
SLIDE 38

Step 3: SPX Query (3)

Get(EMMAtt2, ((T1, Att2), (T2, Att5))) ⟾

(T1, 1), (T2, 1)
(T1, 2), (T2, 3)
(T1, 2), (T2, 2)

slide-39
SLIDE 39

Step 3: SPX Query (4)

Row pairs to fetch: (T1, 1), (T2, 1); (T1, 2), (T2, 3); (T1, 2), (T2, 2)

  • Get(EMMR, (T1, 1))
  • Get(EMMR, (T2, 1))

The two fetched rows form the first row of the Temporary Result Table.

slide-40
SLIDE 40

Step 3: SPX Query (5)

Row pairs to fetch: (T1, 1), (T2, 1); (T1, 2), (T2, 3); (T1, 2), (T2, 2)

  • Get(EMMR, (T1, 2))
  • Get(EMMR, (T2, 3))
  • Get(EMMR, (T1, 2))
  • Get(EMMR, (T2, 2))

The fetched rows complete the Temporary Result Table.

slide-41
SLIDE 41

Step 3: SPX Query (6)

π with projection sub-token 3, applied to the Temporary Result Table ⟾ Final Result
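
The query walk-through can be replayed on plaintext structures to see the order of lookups. This is a sketch under several assumptions: nothing is encrypted, `get` is a plain dictionary lookup standing in for the Get operations, the projection sub-token is written as the attribute name Att3 rather than 3, and the row pairs are recomputed from the example tables instead of being copied from the figure.

```python
# Plaintext stand-in for the row multi-map: coordinate -> row.
MM_R = {
    ("T1", 1): {"Att1": 1, "Att2": "CS"},
    ("T1", 2): {"Att1": 2, "Att2": "Math"},
    ("T2", 1): {"Att3": 1, "Att4": 45, "Att5": "CS"},
    ("T2", 2): {"Att3": 2, "Att4": 45, "Att5": "Math"},
    ("T2", 3): {"Att3": 2, "Att4": 60, "Att5": "CS"},
}

SELECT_LABEL = (("T1", "Att2"), ("T2", "Att5"))

# Row pairs whose Att2 and Att5 cells are equal, computed from MM_R.
PAIRS = [
    (a, b)
    for a in sorted(MM_R)
    for b in sorted(MM_R)
    if a[0] == "T1" and b[0] == "T2" and MM_R[a]["Att2"] == MM_R[b]["Att5"]
]

# Plaintext stand-in for the dictionary of cross-value multi-maps.
DX = {"Att2": {SELECT_LABEL: PAIRS}}


def get(structure, label):
    """Stand-in for a Get on an encrypted structure."""
    return structure[label]


def query_plain(dx, mm_r, tk):
    dict_label, select_label, proj_attr = tk
    mm_att = get(dx, dict_label)          # 1. fetch the cross-value multi-map
    pairs = get(mm_att, select_label)     # 2. fetch the matching row pairs
    temp = [{**get(mm_r, a), **get(mm_r, b)} for a, b in pairs]  # 3. fetch rows
    return [row[proj_attr] for row in temp]  # 4. project


final = query_plain(DX, MM_R, ("Att2", SELECT_LABEL, "Att3"))
print(final)
```

The work done is proportional to the number of matching row pairs, not to the size of T1 × T2.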

slide-42
SLIDE 42

Leakage: SPX-OPT vs. PPE-based

  • Query leakage of SPX
  • Cross product pattern
  • Projection pattern
  • Selection pattern
  • Query leakage of PPE-based schemes
  • Cross product pattern
  • Projection pattern
  • Selection pattern
  • Frequency pattern
  • Persistent
  • Very strong attacks already exist

slide-43
SLIDE 43

Modularity: SPX-Obliv vs. SPX-OPT

Query leakage of SPX-OPT

Query leakage of SPX-Obliv

  • When the EMMs are oblivious [GO96,SvDS+13,GMP16,KMO18]
  • But comes with extra overhead
slide-44
SLIDE 44

SPX-OPT Asymptotics

  • Worst-case query complexity with
  • Assuming optimal MM and DX encryption schemes [CK10,CJJ+14]
  • $h$ projected attributes
  • $t$ tables with size $s_i$ each
  • is the size of the result on a plaintext database
  • Mild condition
  • If is constant in , then query time is optimal

$h^{-1} \cdot \sum_{i=1}^{t} s_i$

slide-45
SLIDE 45

SPX-OPT Asymptotics

  • Communication complexity is optimal
  • Storage depends on the data distribution
  • Concretely, the big-O hides a multiplicative factor of 3

$O\big(\#DB + \sum_{att \in S} \#MM_{att}\big)$

slide-46
SLIDE 46

Takeaways and Future Work

  • First STE-based encrypted relational database
  • Sub-linear query time (optimal under certain conditions)
  • First (forward-private) dynamic STE-based encrypted relational database

  • Better leakage profile than PPE-based
  • Future research questions:
  • Extend SPX to handle relational algebra
  • Remove interaction in SPX+
  • Extend SPX to handle range sub-predicates
  • Design of new encrypted range solutions is required [KKNO16, LMP17]
slide-47
SLIDE 47

Thank you!

https://eprint.iacr.org/2016/453
