Frequency-hiding Dependency-preserving Encryption for Outsourced Databases

ICDE’17
Boxiang Dong (Montclair State University, Montclair, NJ)
Wendy Wang (Stevens Institute of Technology, Hoboken, NJ)
April 20, 2017

Data-Management-as-a-Service (DMaS)
[Figure: the data owner outsources dataset $D$ to the server.]
• The data owner has limited computational resources.
• The server (e.g., a cloud) is computationally powerful.
• Outsourcing provides a cost-effective solution for data management.

Functional Dependency (FD)
Definition
An FD $X \to Y$ states that for any two records $r_1$ and $r_2$, $r_1[X] = r_2[X]$ implies $r_1[Y] = r_2[Y]$.
Applications
• Data schema improvement via normalization
• Data inconsistency repair
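The definition can be checked mechanically on a small table. The sketch below is illustrative only: the function name `fd_holds` and the sample records are assumptions, not part of the talk.

```python
def fd_holds(records, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on a list of dict records."""
    seen = {}  # maps each observed X-value to the first Y-value seen with it
    for r in records:
        x_val = tuple(r[a] for a in lhs)
        y_val = tuple(r[a] for a in rhs)
        # setdefault stores y_val the first time, then returns the stored value
        if seen.setdefault(x_val, y_val) != y_val:
            return False  # two records agree on X but differ on Y
    return True

# Hypothetical sample data
records = [
    {"A": "a1", "B": "b1"},
    {"A": "a1", "B": "b1"},
    {"A": "a2", "B": "b1"},
]
print(fd_holds(records, ["A"], ["B"]))  # True: equal A always gives equal B
print(fd_holds(records, ["B"], ["A"]))  # False: b1 appears with a1 and a2
```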

Outsourcing Requirement
[Figure: data owner and malicious server.]
Privacy Concern
• Protect the sensitive information from the untrusted server.
• Encrypt the dataset before outsourcing.
Utility Concern
• Support FD-based applications.
• The encryption scheme should preserve FDs.

Challenges
Directly applying deterministic encryption (e.g., RSA) is vulnerable to the frequency-analysis attack (FA attack) [N+15].

FA-Attack($P$, $E$)
1. compute $\pi \leftarrow \mathrm{vSort}(\mathrm{Hist}(P))$
2. compute $\phi \leftarrow \mathrm{vSort}(\mathrm{Hist}(E))$
3. foreach $e \in E$, output $p$ if $\mathrm{Rank}_{\phi}(e) = \mathrm{Rank}_{\pi}(p)$

(a) Base table $D$ ($A \to B$, $A \not\to C$, $B \not\to C$)
ID      A       B       C
$r_1$   $a_1$   $b_1$   $c_1$
$r_2$   $a_1$   $b_1$   $c_2$
$r_3$   $a_1$   $b_1$   $c_4$
$r_4$   $a_1$   $b_1$   $c_3$
$r_5$   $a_2$   $b_2$   $c_3$
$r_6$   $a_2$   $b_2$   $c_4$

(b) $\hat{D}_1$: deterministic encryption
ID      A             B             C
$r_1$   $\hat{a}_1$   $\hat{b}_1$   $\hat{c}_1$
$r_2$   $\hat{a}_1$   $\hat{b}_1$   $\hat{c}_2$
$r_3$   $\hat{a}_1$   $\hat{b}_1$   $\hat{c}_4$
$r_4$   $\hat{a}_1$   $\hat{b}_1$   $\hat{c}_3$
$r_5$   $\hat{a}_2$   $\hat{b}_2$   $\hat{c}_3$
$r_6$   $\hat{a}_2$   $\hat{b}_2$   $\hat{c}_4$
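The three steps of FA-Attack can be sketched as follows. This is a minimal illustration, not the formulation from [N+15]: the helper names, the tie-breaking rule, and the ciphertext tokens `x7` and `q3` are assumptions made for the example.

```python
from collections import Counter

def rank_by_frequency(values):
    """Distinct values ordered by descending frequency (vSort of the histogram).

    Ties are broken by the string form of the value so the output is
    deterministic; a real attacker cannot resolve ties reliably.
    """
    hist = Counter(values)
    return sorted(hist, key=lambda v: (-hist[v], str(v)))

def fa_attack(plaintexts, ciphertexts):
    """Guess a plaintext for each ciphertext value by matching frequency ranks."""
    pi = rank_by_frequency(plaintexts)
    phi = rank_by_frequency(ciphertexts)
    return dict(zip(phi, pi))  # the ciphertext at rank i maps to the plaintext at rank i

# Column A of the base table, and a hypothetical deterministic encryption of it
column_a = ["a1", "a1", "a1", "a1", "a2", "a2"]
encrypted_a = ["x7", "x7", "x7", "x7", "q3", "q3"]
print(fa_attack(column_a, encrypted_a))  # {'x7': 'a1', 'q3': 'a2'}
```

Because deterministic encryption maps equal plaintexts to equal ciphertexts, the ciphertext histogram has the same shape as the plaintext histogram, which is what the rank matching exploits.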

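Challenges
Applying probabilistic encryption may destroy original FDs or introduce false positive FDs.

(c) $\hat{D}_2$: probabilistic encryption on $A$, $B$, $C$ individually. Original FD $A \to B$ destroyed.
ID      A                 B                 C
$r_1$   $\hat{a}_1^{1}$   $\hat{b}_1^{1}$   $\hat{c}_1^{1}$
$r_2$   $\hat{a}_1^{2}$   $\hat{b}_1^{2}$   $\hat{c}_2^{1}$
$r_3$   $\hat{a}_1^{3}$   $\hat{b}_1^{3}$   $\hat{c}_4^{2}$
$r_4$   $\hat{a}_1^{4}$   $\hat{b}_1^{4}$   $\hat{c}_3^{1}$
$r_5$   $\hat{a}_2^{1}$   $\hat{b}_2^{1}$   $\hat{c}_3^{2}$
$r_6$   $\hat{a}_2^{1}$   $\hat{b}_2^{2}$   $\hat{c}_4^{1}$

(d) $\hat{D}_3$: probabilistic encryption on $(A, B, C)$. False positive FD $A \to C$ introduced.
ID      A                 B                 C
$r_1$   $\hat{a}_1^{1}$   $\hat{b}_1^{1}$   $\hat{c}_1^{1}$
$r_2$   $\hat{a}_1^{2}$   $\hat{b}_1^{2}$   $\hat{c}_2^{2}$
$r_3$   $\hat{a}_1^{3}$   $\hat{b}_1^{3}$   $\hat{c}_4^{3}$
$r_4$   $\hat{a}_1^{4}$   $\hat{b}_1^{4}$   $\hat{c}_3^{4}$
$r_5$   $\hat{a}_2^{5}$   $\hat{b}_2^{5}$   $\hat{c}_3^{5}$
$r_6$   $\hat{a}_2^{6}$   $\hat{b}_2^{6}$   $\hat{c}_4^{6}$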
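Both failure modes can be verified on the tables of this slide. In the sketch below the ciphertext tables are transcribed as read from the slide, with a token such as `a1#3` standing for the third ciphertext instance of $a_1$; the token format and the function name are assumptions. The check relies on the fact that $X \to Y$ holds exactly when projecting on $X$ and on $X \cup Y$ yields the same number of distinct tuples.

```python
COLS = "ABC"

def fd_holds(rows, lhs, rhs):
    """X -> Y holds iff |distinct X-projections| == |distinct XY-projections|."""
    project = lambda attrs: {tuple(r[COLS.index(a)] for a in attrs) for r in rows}
    return len(project(lhs)) == len(project(lhs + rhs))

# Base table D
D = [("a1", "b1", "c1"), ("a1", "b1", "c2"), ("a1", "b1", "c4"),
     ("a1", "b1", "c3"), ("a2", "b2", "c3"), ("a2", "b2", "c4")]

# D2: each attribute encrypted individually
D2 = [("a1#1", "b1#1", "c1#1"), ("a1#2", "b1#2", "c2#1"), ("a1#3", "b1#3", "c4#2"),
      ("a1#4", "b1#4", "c3#1"), ("a2#1", "b2#1", "c3#2"), ("a2#1", "b2#2", "c4#1")]

# D3: the whole record (A, B, C) encrypted, so every ciphertext is unique
D3 = [("a1#1", "b1#1", "c1#1"), ("a1#2", "b1#2", "c2#2"), ("a1#3", "b1#3", "c4#3"),
      ("a1#4", "b1#4", "c3#4"), ("a2#5", "b2#5", "c3#5"), ("a2#6", "b2#6", "c4#6")]

print(fd_holds(D, "A", "B"), fd_holds(D2, "A", "B"))  # True False  (FD destroyed)
print(fd_holds(D, "A", "C"), fd_holds(D3, "A", "C"))  # False True  (false positive)
```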

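Challenges
The FD-preserving property introduces a new inference attack [PR12].
1. The adversary submits $(D_0, FD_0)$ and $(D_1, FD_1)$.
2. The challenger picks $b \xleftarrow{\$} \{0, 1\}$ and encrypts $D_b$ with an FD-preserving, CPA-secure cipher, producing $\hat{D}_b$.
3. The adversary outputs $b' = 0$ if $FD_0$ holds on $\hat{D}_b$, and $b' = 1$ otherwise.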

Our Contributions
Security Definition
• $\alpha$-security against the FA attack
• Indistinguishability against FD-preserving chosen plaintext attack (IND-FCPA)
Encryption Scheme
We design $F^2$, a frequency-hiding, FD-preserving encryption scheme based on probabilistic encryption.

Outline
1 Introduction
2 Related Work
3 Security Model
4 Encryption Scheme
• Step 1: Identifying Maximal Attribute Sets
• Step 2: Splitting-and-Scaling Encryption
• Step 3: Conflict Resolution
• Step 4: Eliminating False Positive FDs
5 Experiments
6 Conclusion

Related Work
Privacy-preserving outsourced computing
• Data encoding [H+02a, H+02b]
• Data encryption [S+00, P+12]
• Property-preserving encryption [Ker15, B+11, G+06, B+09]
Inference attack
• FA attack [N+15]
• Query-recovery attack [I+12]
FD applications
• Data cleaning [T+11]
• Schema design [BFFR05, B+07]

Security Model
Experiment $\mathrm{Exp}^{FA}_{\Pi}(\mathcal{A})$
    $p' \leftarrow \mathcal{A}^{\mathrm{freq}_E(e),\, \mathrm{freq}(P)}$
    Return 1 if $p' = \mathrm{Decrypt}(k, e)$
    Return 0 otherwise
$\mathrm{Adv}^{FA}_{\Pi}(\mathcal{A}) = \mathrm{Prob}(\mathrm{Exp}^{FA}_{\Pi}(\mathcal{A}) = 1)$ measures the success rate of the FA attack.
Definition ($\alpha$-security against FA Attack)
An encryption scheme $\Pi$ is $\alpha$-secure against FA if for every adversary $\mathcal{A}$ it holds that $\mathrm{Adv}^{FA}_{\Pi}(\mathcal{A}) \le \alpha$, where $\alpha \in (0, 1]$ is user-specified.

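Security Model
The server may exploit the FDs to break the cipher.
Experiment $\mathrm{Exp}^{FCPA}_{\Pi}(\mathcal{A})$
1. The adversary submits $(D_0, FD)$ and $(D_1, FD)$ with $|D_0| = |D_1|$.
2. The challenger picks $b \xleftarrow{\$} \{0, 1\}$ and encrypts $D_b$ with the encryption scheme $\Pi$, producing $\hat{D}_b$.
3. The adversary receives $\hat{D}_b$ and outputs $b'$.
4. Return 1 if $b = b'$, and 0 otherwise.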

Security Model
$\mathrm{Adv}^{FCPA}_{\Pi}(\mathcal{A}) = \mathrm{Prob}(\mathrm{Exp}^{FCPA}_{\Pi}(\mathcal{A}) = 1) - 1/2$ measures the advantage of the FCPA attack over a random guess.
Definition (Indistinguishability against FD-preserving Chosen Plaintext Attack (IND-FCPA))
An encryption scheme $\Pi$ is IND-FCPA if for any polynomial-time adversary $\mathcal{A}$ the advantage is negligible in $\lambda$, i.e., $\mathrm{Adv}^{FCPA}_{\Pi}(\mathcal{A}) = \mathrm{negl}(\lambda)$, where $\lambda$ is a pre-defined security parameter.

$F^2$ Encryption Scheme - Overview
$F^2$, a frequency-hiding, FD-preserving encryption scheme, consists of four steps.
Step 1. Identifying Maximal Attribute Sets
Step 2. Splitting-and-Scaling Encryption
Step 3. Conflict Resolution
Step 4. Eliminating False Positive FDs
[Figure: pipeline diagram with labels $D$, $\bar{D}$, $\Delta D$, and $\hat{D}$.]

Step 1 - Identifying Maximal Attribute Sets
Theorem
Given a dataset $D$ and an FD $X \to Y$, if we apply a probabilistic encryption scheme to attribute set $A$ and obtain $\hat{D}$, then $\hat{D}$ preserves $X \to Y$ if $(X \cup Y) \subseteq A$.

Step 1 - Identifying Maximal Attribute Sets
Definition (Maximal Attribute Set (MAS))
Given a dataset $D$, an attribute set $A$ is a MAS if:
(1) at least one instance of $A$ occurs more than once in $D$; and
(2) no superset of $A$ satisfies condition (1).

Step 1 - Identifying Maximal Attribute Sets
Lemma
Given a dataset $D$ and an FD $X \to Y$, there must exist at least one MAS $M$ such that $(X \cup Y) \subseteq M$.

Step 1 - Identifying Maximal Attribute Sets
• To preserve FDs, we need to find the MASs of the dataset.
• We adapt an efficient solution named Ducc [H+13].
• The complexity is much lower than that of FD discovery.

ID      A       B       C
$r_1$   $a_2$   $b_1$   $c_1$
$r_2$   $a_1$   $b_1$   $c_1$
$r_3$   $a_1$   $b_1$   $c_2$
$r_4$   $a_3$   $b_1$   $c_2$
$r_5$   $a_4$   $b_2$   $c_2$
$r_6$   $a_5$   $b_2$   $c_3$

FD: $A \to B$
MAS = {$AB$, $BC$}
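The MAS definition can be applied to the example table by brute force. The sketch below enumerates every attribute subset, which is exponential in the number of attributes; it is not the Ducc-based method the slide refers to, and the function names are illustrative assumptions.

```python
from itertools import combinations

def has_repeated_instance(rows, attrs):
    """True if some value combination on attrs occurs in more than one row."""
    seen = set()
    for r in rows:
        key = tuple(r[a] for a in attrs)
        if key in seen:
            return True
        seen.add(key)
    return False

def find_mas(rows, attributes):
    """Return the attribute sets with a repeated instance that have no such superset."""
    non_unique = [frozenset(c)
                  for k in range(1, len(attributes) + 1)
                  for c in combinations(attributes, k)
                  if has_repeated_instance(rows, c)]
    # keep only the sets that are not strictly contained in another non-unique set
    maximal = [s for s in non_unique if not any(s < t for t in non_unique)]
    return sorted("".join(sorted(s)) for s in maximal)

table = [
    {"A": "a2", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a1", "B": "b1", "C": "c2"},
    {"A": "a3", "B": "b1", "C": "c2"},
    {"A": "a4", "B": "b2", "C": "c2"},
    {"A": "a5", "B": "b2", "C": "c3"},
]
print(find_mas(table, ["A", "B", "C"]))  # ['AB', 'BC']
```

$AC$ and $ABC$ are excluded because every one of their instances is unique, and $A$, $B$, $C$ are excluded because each is contained in a larger set with a repeated instance.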

Step 2 - Splitting-and-Scaling Encryption
for all MAS do
    Construct equivalence classes (ECs)
end for

Example for MAS $BC$:
ID      B       C       EC
$r_1$   $b_1$   $c_1$   $C_1$
$r_2$   $b_1$   $c_1$   $C_1$
$r_3$   $b_1$   $c_2$   $C_2$
$r_4$   $b_1$   $c_2$   $C_2$
$r_5$   $b_2$   $c_2$   $C_3$
$r_6$   $b_2$   $c_3$   $C_4$
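Constructing the equivalence classes amounts to grouping records by their values on the MAS. A minimal sketch for the $BC$ example follows; the function name and the dictionary layout are assumptions made for illustration.

```python
from collections import defaultdict

def equivalence_classes(rows, mas):
    """Group record IDs by their value combination on the attributes in mas."""
    classes = defaultdict(list)
    for rid, record in rows.items():
        classes[tuple(record[a] for a in mas)].append(rid)
    return dict(classes)

rows = {
    "r1": {"B": "b1", "C": "c1"},
    "r2": {"B": "b1", "C": "c1"},
    "r3": {"B": "b1", "C": "c2"},
    "r4": {"B": "b1", "C": "c2"},
    "r5": {"B": "b2", "C": "c2"},
    "r6": {"B": "b2", "C": "c3"},
}
ecs = equivalence_classes(rows, ["B", "C"])
for value, members in ecs.items():
    print(value, members)
# ('b1', 'c1') ['r1', 'r2']
# ('b1', 'c2') ['r3', 'r4']
# ('b2', 'c2') ['r5']
# ('b2', 'c3') ['r6']
```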

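Step 2 - Splitting-and-Scaling Encryption
for all MAS do
    Construct equivalence classes (ECs)
    Organize ECs into collision-free groups of size at least $1/\alpha$
end for

Example for MAS $BC$ with $\alpha = 1/2$:
ID      B       C       EC
$r_1$   $b_1$   $c_1$   $C_1$
$r_2$   $b_1$   $c_1$   $C_1$
$r_3$   $b_1$   $c_2$   $C_2$
$r_4$   $b_1$   $c_2$   $C_2$
$r_5$   $b_2$   $c_2$   $C_3$
$r_6$   $b_2$   $c_3$   $C_4$
[Figure: the ECs are organized into two EC groups, $ECG_1$ and $ECG_2$.]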

Step 2 - Splitting-and-Scaling Encryption
for all MAS do
    Construct equivalence classes (ECs)
    Organize ECs into collision-free groups of size at least $1/\alpha$
    Apply splitting and scaling to reach the same frequency
end for

Splitting: Split an EC into $\omega$ copies with the same frequency.
Scaling: Duplicate an EC to reach frequency homogenization.
We design an algorithm that decides the splitting and scaling strategy so as to minimize the amount of duplication.

ID      B       C       EC      After splitting
$r_1$   $b_1$   $c_1$   $C_1$   $\hat{b}_1^{1}$, $\hat{c}_1^{1}$
$r_2$   $b_1$   $c_1$   $C_1$   $\hat{b}_1^{2}$, $\hat{c}_1^{2}$
$r_3$   $b_1$   $c_2$   $C_2$   $\hat{b}_1^{3}$, $\hat{c}_2^{1}$
$r_4$   $b_1$   $c_2$   $C_2$   $\hat{b}_1^{4}$, $\hat{c}_2^{2}$
$r_5$   $b_2$   $c_2$   $C_3$
$r_6$   $b_2$   $c_3$   $C_4$
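The interplay of splitting and scaling can be illustrated with a simplified calculation. The sketch below is not the authors' optimization algorithm: it assumes a caller-supplied common frequency `target` (a hypothetical parameter) and, for each EC in a group, computes how many ciphertext copies the EC is split into and how many duplicate records must be added so that every copy has exactly that frequency. The EC names and sizes are made up.

```python
import math

def split_and_scale(ec_sizes, target):
    """For each EC, return (copies, duplicates) so every copy has frequency `target`.

    copies     -- number of distinct ciphertext instances the EC is split into
    duplicates -- artificial records added (scaling) to fill the last copy
    """
    plan = {}
    for name, size in ec_sizes.items():
        copies = math.ceil(size / target)      # splitting
        duplicates = copies * target - size    # scaling
        plan[name] = (copies, duplicates)
    return plan

# Hypothetical EC group: X has 5 records, Y has 2 records
group = {"X": 5, "Y": 2}
print(split_and_scale(group, target=2))  # {'X': (3, 1), 'Y': (1, 0)}
```

With `target=2`, X is split into three ciphertext instances and needs one duplicate record, while Y is left as a single instance, so all four ciphertext values occur exactly twice. Choosing the split so that the total number of duplicates is minimal is the part this sketch leaves out.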
