Secure Data Types: A Simple Abstraction for Confidentiality-Preserving Data Analytics

Secure Data Types: A Simple Abstraction for Confidentiality-Preserving Data Analytics. Savvas Savvides, Julian Stephen, Masoud Saeida Ardekani, Vinaitheerthan Sundaram, Patrick Eugster. Purdue


  1. Secure Data Types: A Simple Abstraction for Confidentiality-Preserving Data Analytics
     Savvas Savvides, Julian Stephen, Masoud Saeida Ardekani, Vinaitheerthan Sundaram, Patrick Eugster
     Purdue University

  2. Introduction
     (Diagram labels: Query, Data, Leakage, Results)
     • Requirement: confidentiality-preserving query execution

  3. Preserving Confidentiality
     • Fully homomorphic encryption (FHE)
       • Can express arbitrary computations
       • High overhead for complex queries
     • Partially homomorphic encryption (PHE)
       • Allows specific operations over encrypted data
       • E.g., addition, multiplication, comparisons, pattern matching
       • Schemes are mutually incompatible (limited expressiveness)

  4. Current PHE-based Solutions: Drawbacks
     (Diagram labels: Trusted Client Side, Untrusted Cloud, Trusted Service)
     1. Compilation transparent to data constraints
     2. Compilation largely ignores encryption scheme properties
                       ASHE [OSDI’16]    Paillier [EUROCRYPT’99]
        E(x) + E(y)    ✓                 ✓
        E(x) + y                         ✓
        E(x) × y                         ✓
        Performance    symmetric         asymmetric
        Security       high              high
     3. No/limited use of trusted service
        a) Give up (CryptDB [SOSP’11])
        b) Split execution (Monomi [VLDB’13])
        c) Re-encryption (Crypsis [ASE’14])

  5. Cuttlefish
     (Diagram labels: Trusted Client Side, Untrusted Cloud, Compilation Techniques, Planner Engine, Trusted Service)
     • Secure data types (SDTs)
       • Capture constraints and structure of data
     • Encryption scheme properties
       • Capture supported operations, performance and security guarantees of encryption schemes
     → Compilation techniques
       • More optimized queries
     → Planner engine
       • More efficient deployment
       • Can utilize trusted hardware

  6. Secure Data Types
     • Sensitivity levels
       • high, low, public
       • Accounts for different security guarantees offered by cryptosystems
     • Data range
       • +/- numbers
       • Fixed ranges, e.g., 100-200
     • Composite types
       • Values containing multiple parts, e.g., dates, addresses, phones
       • E.g., composite[(4:int[+])-(2:int[range(1-12)])-(2:int[range(1-31)])]
     • Also: decimal accuracy, uniqueness, tokenization, enumerated types, etc.
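The composite date annotation on slide 6 can be modelled with a few small classes. This is only an illustrative sketch, not Cuttlefish's actual API: the names `Sensitivity`, `IntPart`, `Composite`, and `DATE` are hypothetical, and reading `int[+]` as "no negative values" is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Sensitivity(Enum):
    """Sensitivity levels named on the slide."""
    HIGH = "high"
    LOW = "low"
    PUBLIC = "public"


@dataclass(frozen=True)
class IntPart:
    """One fixed-width integer part of a composite value (hypothetical name)."""
    width: int                 # number of digits in the textual encoding
    lo: Optional[int] = None   # inclusive lower bound, if known
    hi: Optional[int] = None   # inclusive upper bound, if known

    def check(self, text: str) -> int:
        # Reject parts with the wrong width or non-digit characters.
        if len(text) != self.width or not text.isdigit():
            raise ValueError(f"expected {self.width} digits, got {text!r}")
        value = int(text)
        if self.lo is not None and value < self.lo:
            raise ValueError(f"{value} is below the declared range")
        if self.hi is not None and value > self.hi:
            raise ValueError(f"{value} is above the declared range")
        return value


@dataclass(frozen=True)
class Composite:
    """A value made of several parts joined by a separator (hypothetical name)."""
    parts: Tuple[IntPart, ...]
    separator: str
    sensitivity: Sensitivity

    def split(self, text: str) -> List[int]:
        # Split the value into its parts and validate each against its range.
        pieces = text.split(self.separator)
        if len(pieces) != len(self.parts):
            raise ValueError(f"expected {len(self.parts)} parts in {text!r}")
        return [part.check(piece) for part, piece in zip(self.parts, pieces)]


# composite[(4:int[+])-(2:int[range(1-12)])-(2:int[range(1-31)])]
# Assumption: int[+] is modelled as lo=0, i.e. no negative values.
DATE = Composite(
    parts=(IntPart(4, lo=0), IntPart(2, lo=1, hi=12), IntPart(2, lo=1, hi=31)),
    separator="-",
    sensitivity=Sensitivity.HIGH,
)

year, month, day = DATE.split("2010-03-15")
```

Once a date is split this way, each part can be encrypted and compared separately, which is what lets a compiler rewrite predicates over the whole date into predicates over a single part such as the year.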

  7. Compilation Techniques
     • Expression rewriting
       • Simplify expressions involving composite types
       • E.g., d ≥ 2010-01-01 AND d < 2011-01-01 → y > 2010 OR (y == 2010 AND m ≥ 01) ... → y == 2010
     • Condition expansion
       • Expand conditions to aggressively filter rows, based on range information
       • E.g., x + y > c → y > (c − max(x)) AND x + y > c, where the added conjunct y > (c − max(x)) short-circuits evaluation
       • Similarly for [+, -, ×, /] and [==, >, ≥, <, ≤]
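Both rewrites on slide 7 can be checked on plaintext values. The sketch below is not Cuttlefish code: the function names are hypothetical, and it only verifies that the rewritten predicates select the same rows as the originals, under the assumption that the declared ranges (months 1-12, days 1-31, x ≤ max(x)) hold.

```python
from typing import List, Tuple

Row = Tuple[int, int]


def in_2010_original(y: int, m: int, d: int) -> bool:
    """d >= 2010-01-01 AND d < 2011-01-01, compared part by part."""
    # Tuple comparison in Python is lexicographic, like date comparison.
    return (y, m, d) >= (2010, 1, 1) and (y, m, d) < (2011, 1, 1)


def in_2010_rewritten(y: int) -> bool:
    """Expression rewriting: the range test collapses to a test on the year."""
    return y == 2010


def naive_filter(rows: List[Row], c: int) -> List[Row]:
    """Original predicate: x + y > c, evaluated on every row."""
    return [(x, y) for x, y in rows if x + y > c]


def expanded_filter(rows: List[Row], c: int, x_max: int) -> Tuple[List[Row], int]:
    """Condition expansion: y > (c - max(x)) AND x + y > c.

    Returns the kept rows and how many rows reached the two-column check.
    """
    kept: List[Row] = []
    full_checks = 0
    for x, y in rows:
        # Single-column pre-filter; it is implied by x + y > c because
        # x <= x_max, so no matching row is lost by short-circuiting here.
        if not y > c - x_max:
            continue
        full_checks += 1
        if x + y > c:
            kept.append((x, y))
    return kept, full_checks


# Example data: x is declared to lie in the fixed range 100-200.
rows = [(x, y) for x in range(100, 201, 25) for y in range(0, 301, 50)]
kept, full_checks = expanded_filter(rows, c=350, x_max=200)
```

The point of the expansion is that the pre-filter touches a single column, so fewer rows reach the check that combines two columns. The slide gives no figures for the saving; the count returned here is only for the toy data above.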

  8. Compilation Techniques (cont.)
     • Selective encryption
       • Choose an encryption scheme that does not require use of the trusted service
       • E.g., (x + y) × z where z is public: (ashe(x) + ashe(y)) × z → (paillier(x) + paillier(y)) × z
                       ASHE [OSDI’16]    Paillier [EUROCRYPT’99]
        E(x) + E(y)    ✓                 ✓
        E(x) + y                         ✓
        E(x) × y                         ✓
        Performance    symmetric         asymmetric
     • See paper for more compilation techniques
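To make the (x + y) × z example concrete, here is a toy Paillier implementation showing why that scheme can evaluate the whole expression on ciphertexts when z is public. It is a sketch for illustration only: the primes are far too small to be secure, the function names are hypothetical, and it is not the implementation used by Cuttlefish. It needs Python 3.8+ for `pow(a, -1, n)`.

```python
import math
import secrets

# Toy parameters. Real deployments use primes of 1024 bits or more.
P, Q = 1009, 1013
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # modular inverse of PHI modulo N


def encrypt(m: int) -> int:
    """Encrypt 0 <= m < N with generator g = N + 1 and fresh randomness r."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (N + 1)^m mod N^2 equals 1 + m*N, which avoids one exponentiation.
    return ((1 + m * N) * pow(r, N, N2)) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext from a ciphertext."""
    u = pow(c, PHI, N2)
    return ((u - 1) // N * MU) % N


def add_ciphertexts(c1: int, c2: int) -> int:
    """E(x) + E(y): multiplying ciphertexts adds the plaintexts."""
    return (c1 * c2) % N2


def mul_by_plain(c: int, k: int) -> int:
    """E(x) × k for a public k: exponentiation multiplies the plaintext."""
    return pow(c, k, N2)


# (x + y) × z with x, y encrypted and z public, computed without decrypting.
x, y, z = 20, 22, 3
encrypted_result = mul_by_plain(add_ciphertexts(encrypt(x), encrypt(y)), z)
result = decrypt(encrypted_result)
```

Results are only correct while the plaintext stays below N, so a real system has to size the modulus for the largest intermediate value.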

  9. Planner Engine
     (Figure: the same query plan with operators A-E shown under three strategies: greedy split execution, greedy re-encryption, and the Cuttlefish heuristic. Legend: requires trusted service, split execution, re-encryption.)
     • Cuttlefish heuristic
       • Use a cost model to choose between re-encryption and split execution at each step
       • Utilize trusted hardware, if available, to deploy an in-cloud re-encryption service
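The per-step choice between re-encryption and split execution can be sketched as below. The slides do not describe the actual cost model, so everything here is an assumption made for illustration: the `Step` class, the per-row cost constants, the linear cost formulas, and the simplification that a plan is a straight chain of operators.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    name: str
    rows_in: int         # estimated rows entering the operator
    needs_trusted: bool  # True if the cloud cannot run it on current ciphertexts


# Hypothetical per-row costs in arbitrary units (not taken from the slides).
REENCRYPT_PER_ROW = 1.0  # round trip to the trusted service plus re-encryption
TRANSFER_PER_ROW = 0.2   # ship one row to the client
CLIENT_OP_PER_ROW = 0.5  # run one operator on the client
CLOUD_OP_PER_ROW = 0.1   # run one operator in the cloud


def plan(steps: List[Step]) -> List[Tuple[str, str]]:
    """Pick 'cloud', 're-encrypt', or 'client' for each step of a linear plan."""
    decisions: List[Tuple[str, str]] = []
    for i, step in enumerate(steps):
        if not step.needs_trusted:
            decisions.append((step.name, "cloud"))
            continue
        remaining_rows = sum(s.rows_in for s in steps[i:])
        # Option 1: re-encrypt the input, then keep running in the cloud.
        reencrypt_cost = (step.rows_in * REENCRYPT_PER_ROW
                          + remaining_rows * CLOUD_OP_PER_ROW)
        # Option 2: split execution, moving the rest of the plan to the client.
        split_cost = (step.rows_in * TRANSFER_PER_ROW
                      + remaining_rows * CLIENT_OP_PER_ROW)
        if split_cost <= reencrypt_cost:
            decisions.extend((s.name, "client") for s in steps[i:])
            break
        decisions.append((step.name, "re-encrypt"))
    return decisions


EXAMPLE = [
    Step("filter", 1_000_000, needs_trusted=False),
    Step("join", 100_000, needs_trusted=True),
    Step("group_by", 100_000, needs_trusted=False),
    Step("project", 50_000, needs_trusted=False),
    Step("sort", 100, needs_trusted=True),
]
decisions = plan(EXAMPLE)
```

With these made-up costs the sketch re-encrypts before the large join, where moving the rest of the plan to the client would be expensive, and hands the small final sort to the client. That mix of the two strategies is what a per-step decision makes possible, in contrast to always splitting or always re-encrypting.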

  10. Evaluation
      • Cuttlefish
        • Apache Spark 2.1
        • Cuttlefish-TH: trusted service deployed using trusted hardware (Intel SGX)
        • Cuttlefish-CS: trusted service deployed using remote client side
      • Setup
        • TPC-H and TPC-DS (subset) at scale 100
        • Cloud: 20 AWS m4.xlarge instances (4 CPUs and 16GB memory)
        • Client: 1 AWS c4.2xlarge instance (8 CPUs and 15GB memory)

  11. System Performance
      (Figure: latency in seconds for TPC-H queries Q01-Q22. Series: Plaintext, Cuttlefish-TH, Cuttlefish-CS, Monomi, Crypsis.)
      • Average overhead compared to plaintext
        • Cuttlefish-TH: 2.34×
        • Cuttlefish-CS: 3.05×
      • Average performance gains
        • 3.35× faster than Monomi
        • 3.71× faster than Crypsis

  12. Compilation Techniques Performance
      (Figure: latency in seconds for TPC-DS queries Q03, Q07, Q19, Q27, Q34, Q42, Q43, Q46, Q52, Q53, Q55, Q59, Q63, Q65, Q68, Q73, Q79, Q89, Q98. Legend: Plaintext, Cuttlefish-TH, - Expression rewriting, - Condition expansion, - Selective encryption, - Efficient encryption.)
      • Average overhead compared to plaintext
        • With compilation techniques: 1.69×
        • Without compilation techniques: 4.23×

  13. Conclusion
      • Cuttlefish enables efficient data analytics in public clouds
      • Secure data types
        • Capture constraints and structure of data
      • Compilation techniques
        • Enable more efficient queries
      • Planner engine
        • Optimized use of trusted service

  14. Thank you!
