Improved reconstruction attacks using range query leakage

Marie-Sarah Lacharité, Brice Minaud, Kenny Paterson
Information Security Group

Application Setting

Storing Records in the Cloud
• R records, each with a unique record identifier.
• Each record has a value (N possible values).

Application Scenario
• The client asks the server: "give me all records with values in the range [1975, 1979]".

Access Pattern Leakage
• The client asks: "give me all records with values in the range [1975, 1979]".
• The matching record identifiers are leaked.
• Examples: OPE, ORE schemes, POPE, [HK16], Blind seer, [Lu12], [FJKNRS15], …

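Access Pattern Leakage and Rank Leakage
• The client asks: "give me all records with values in the range [1975, 1979]".
• The matching record identifiers are leaked, together with the fact that the matching records have rank in the range [a+1, b].
• Examples: FH-OPE, Lewi-Wu, Arx, Cipherbase, EncKV, …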

Assumptions
1. Data is dense: all values appear in at least one record.
2. Queries are uniformly distributed.
Target: full reconstruction, i.e. find the value associated with each record.
Best previous result (Kellaris et al., CCS 2016): full reconstruction by analysing access pattern leakage from O(N^2 log N) queries.

Our Main Results (eprint 2017/701)
• Full reconstruction with O(N log N) queries
  – in fact, expected N · (3 + log N).
• Approximate reconstruction with relative accuracy ε from O(N · log(1/ε)) queries
  – in fact, expected 5/4 · N · log(1/ε) + O(N).
• Approximate reconstruction using an auxiliary distribution and rank leakage
  – more efficient in practice, evaluation via simulation.
  – applies in the non-dense case too, giving a new attack on OPE/ORE schemes.

Uniform Queries: Uniform Endpoints vs. Uniform Ranges (N = 10)
[Figure: triangular grid of all ranges (x, y) with 1 ≤ x ≤ y ≤ 10, from (1, 1) to (10, 10), contrasting uniform ranges with uniform endpoints.]

Distribution of Left Endpoints: Uniform Endpoints vs. Uniform Ranges (N = 10)
[Figure: distribution of the left endpoint over the values 1 to 10, for uniform endpoints and for uniform ranges.]
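The skew of the left endpoint under uniform ranges can be checked with a few lines of Python. This is a minimal sketch, assuming a "uniform range" means one of the N(N+1)/2 pairs (x, y) with 1 ≤ x ≤ y ≤ N drawn with equal probability; the function name is illustrative only.

```python
from fractions import Fraction

def left_endpoint_distribution(n):
    """P(left endpoint = x) when a range [x, y], 1 <= x <= y <= n, is drawn uniformly."""
    total = n * (n + 1) // 2  # number of distinct ranges
    # Exactly n - x + 1 ranges start at x (one for each y in x..n).
    return {x: Fraction(n - x + 1, total) for x in range(1, n + 1)}

dist = left_endpoint_distribution(10)
for x, p in dist.items():
    print(x, p)
```

Under this assumption the left endpoint N is the rarest one, so it is the hardest "coupon" to collect.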

Coupon Collector's Problem
[Figure: expected number of draws against the number of coupons (N), for N up to 125, comparing N · H(N) with N · (1 + log N).]
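The two curves in the plot can be recomputed directly. The sketch below assumes log is the natural logarithm and H(N) is the N-th harmonic number; the function names are illustrative.

```python
import math

def expected_draws(n):
    """Exact expected number of uniform draws needed to see all n coupons: n * H(n)."""
    return n * sum(1.0 / k for k in range(1, n + 1))

def upper_bound(n):
    """The bound n * (1 + log n), which follows from H(n) <= 1 + ln(n)."""
    return n * (1 + math.log(n))

for n in (10, 50, 125):
    print(n, round(expected_draws(n), 1), round(upper_bound(n), 1))
```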

Attack 1: Full Reconstruction

Motivating Example (with Rank Leakage)
• Suppose left endpoints of query intervals are chosen uniformly at random.
• Wish to observe at least 1 query with each of the N possible left endpoints.
• Expected number of queries needed is at most N · (1 + log N).

hidden      leaked
[x,y]       a = rank(x-1)   b = rank(y)   matching IDs
[20,25]     1300            1500          M_20
[1,18]      0               1200          M_1
[55,125]    3100            4400          M_55
[2,10]      500             800           M_2
[7,98]      700             4200          M_7
(matching ID sets relabelled for convenience)
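One way to see why collecting every left endpoint helps: with dense data, rank(x-1) is strictly increasing in x, so the N distinct leaked values of a, once sorted, line up with x = 1, …, N. The sketch below illustrates only this observation and is not taken from the slides; the function name and the toy input are assumptions.

```python
def label_left_endpoints(leaked_a, n):
    """Map each leaked a = rank(x-1) to its hidden left endpoint x.

    Assumes dense data, so distinct left endpoints give distinct values of a.
    Returns None until all n left endpoints have been observed.
    """
    distinct = sorted(set(leaked_a))
    if len(distinct) < n:
        return None  # some left endpoint has not been queried yet
    return {a: x for x, a in enumerate(distinct, start=1)}

# Toy example with N = 3 values: queries leaked a = 40, 0, 40, 15.
labels = label_left_endpoints([40, 0, 40, 15], 3)
print(labels)
```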

Motivating Example (with Rank Leakage)
[Figure: records laid out by rank from 1 to 4400, with the sets M_1, M_2, M_3, … and the differences M_1 – ∪_{i>1} M_i, M_2 – ∪_{i>2} M_i, …, M_{N-1} – M_N, M_N.]

Full Reconstruction (with Rank Leakage)
• Now suppose queries have ranges chosen uniformly at random.
• We present a data-optimal algorithm (fails ⇒ full reconstruction is impossible).
• Expected number of sufficient queries is at most N · (2 + log N) for N ≥ 27.
• Main idea: partition, then sort (easy with rank leakage, harder without).
• Expected number of necessary queries is at least 1/2 · N · log N – O(N) for any algorithm.

Full Reconstruction (with Rank Leakage)
[Figure: expected number of queries against N, for N up to 125, comparing the O(N^2 log N) bound of KKNO16 with this work.]

Full Reconstruction (with Rank Leakage): Partitioning Step

record ID   matched query?
            1   2   3   4   5   6   7
20          ✓   ✓   ✗   ✗   ✓   ✗   ✗
23          ✓   ✗   ✗   ✓   ✓   ✓   ✓
29          ✗   ✓   ✓   ✗   ✗   ✓   ✗
89          ✗   ✓   ✓   ✗   ✓   ✓   ✗
193         ✓   ✓   ✗   ✗   ✓   ✓   ✓
…

• Equality of matching defines a partition of records.
• Records in same class of partition cannot be distinguished.
• For complete reconstruction, we need N classes – one class per value.

Full Reconstruction (with Rank Leakage): Partitioning Step

record ID   matched query?
            1   2   3   4   5   6   7
20          ✓   ✓   ✗   ✗   ✓   ✗   ✗
23          ✓   ✓   ✗   ✗   ✓   ✓   ✓
29          ✗   ✓   ✓   ✗   ✗   ✓   ✗
89          ✗   ✓   ✓   ✗   ✓   ✓   ✗
193         ✓   ✓   ✗   ✗   ✓   ✓   ✓
…

Rank intervals: [1,100], [18,82], [16,96], [16,30], [21,61]

Can also deduce from rank leakage that, e.g., records 23 and 193 have ranks in [21,30], by intersecting rank intervals.
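The partitioning step and the rank-interval intersection can be sketched as follows. The match table is the one on this slide; reading the five rank intervals as those of the five queries that match records 23 and 193 is an assumption, and the function names are illustrative.

```python
from collections import defaultdict

# Record ID -> which of queries 1..7 returned it (1 = matched, 0 = not matched).
matches = {
    20:  (1, 1, 0, 0, 1, 0, 0),
    23:  (1, 1, 0, 0, 1, 1, 1),
    29:  (0, 1, 1, 0, 0, 1, 0),
    89:  (0, 1, 1, 0, 1, 1, 0),
    193: (1, 1, 0, 0, 1, 1, 1),
}

def partition(matches):
    """Group records whose match patterns are identical across all queries."""
    classes = defaultdict(list)
    for record, pattern in matches.items():
        classes[pattern].append(record)
    return sorted(classes.values())

def intersect(intervals):
    """Intersect closed integer intervals; None if the intersection is empty."""
    lo = max(a for a, _ in intervals)
    hi = min(b for _, b in intervals)
    return (lo, hi) if lo <= hi else None

classes = partition(matches)
rank_bounds = intersect([(1, 100), (18, 82), (16, 96), (16, 30), (21, 61)])
print(classes)      # records 23 and 193 fall in the same class
print(rank_bounds)  # (21, 30)
```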

Full Reconstruction (with Rank Leakage): Partitioning Step
• Order partition into N classes by rank.
[Figure: classes 1 to 6 ordered by rank; the class with ranks [21,30] contains records 23 and 193 (and more).]

Full Reconstruction (with Rank Leakage): Proof Intuition
• Hard part is to show that O(N log N) queries suffice with a small constant.
• Proof consists of showing that if certain favourable queries are made, then partitioning succeeds in constructing N classes.
• Roughly speaking, for our proof we hope for queries on ranges:
  1. [x,*] for all 1 ≤ x ≤ N/2 (left coupons)
  2. [*,y] for all N/2+1 ≤ y ≤ N (right coupons)
  3. [N/2+1,y] and [x,N] for some y ≥ x.
• Assuming these all arise, then a combinatorial argument establishes the success of the partitioning step.
• First two cases are essentially a pair of coupon collector problems – success with high probability with O(N log N) queries.
• Third case is a high probability event: 1 – e^(–Q/(2N+2)) for Q queries.

Full Reconstruction (without Rank Leakage)
• Can only recover values up to reflection.
• Data-optimal algorithm (fails ⇒ full reconstruction is impossible).
• Expected number of sufficient queries is at most N · (3 + log N) for N ≥ 26.
• Partition (as before), then sort*.
• Expected number of necessary queries is at least 1/2 · N · log N – O(N) for any algorithm.
*Not quite.

Full Reconstruction (without Rank Leakage): Sorting Step
[Figure: all records, with sets M_7, M_39, M_72, M_36, M_93, M_58, M_28, M_9, M_40, M_18; labels "1 or N" and "Interval of size N-1".]

Full Reconstruction (without Rank Leakage): Sorting Step – Extending
[Figure: all records, with sets M_25, M_36, M_22, M_17, M_62, M_81, …; the last three are marked T.]

Full Reconstruction (without Rank Leakage): Sorting Step
[Figure: all records, with sets M_3, M_39, M_27, M_13, M_52, M_99, …; the last three are marked T.]

Full Reconstruction (without Rank Leakage): Proof Intuition
• Hard part is again to show that O(N log N) queries suffice, with a small constant.
• Proof again consists of showing that if certain favourable queries are made, then partitioning succeeds in constructing N classes.
• Coupon collecting bounds then establish that O(N log N) queries are enough.

Attack 2: Approximate Reconstruction

Approximate Reconstruction Attack (without Rank Leakage)
• Recover values up to reflection and with relative error ε.
• Expected number of sufficient queries is 5/4 · N · log(1/ε) + O(N).
• Expected number of necessary queries is at least 1/2 · N · log(1/ε) – O(N) for any algorithm.
• Not data-optimal without rank leakage (but is with it).

Coupon Collection (N = 125): Collecting n of 125 coupons
[Figure: expected number of draws against coupon number (n), for n up to 125.]

Coupon Collection (N = 125): Collecting fraction (1 – ε) of 125 coupons
[Figure: expected number of draws against coupon number (n), for n up to 125, marked at ε = 0.04, 0.08, 0.12, 0.16 and 0.2.]
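The saving from stopping early can be reproduced numerically. The sketch below uses the standard fact that, with i coupons already held, the expected wait for a new one is N/(N – i); it assumes log is the natural logarithm, and the function name is illustrative.

```python
import math

def expected_draws_partial(n_total, n_collect):
    """Expected uniform draws to see n_collect distinct coupons out of n_total."""
    # With i distinct coupons in hand, a draw is new with probability (n_total - i) / n_total.
    return sum(n_total / (n_total - i) for i in range(n_collect))

N = 125
for eps in (0.04, 0.08, 0.12, 0.16, 0.2):
    n = round((1 - eps) * N)
    print(eps, n, round(expected_draws_partial(N, n), 1), round(N * math.log(1 / eps), 1))
print("all", N, round(expected_draws_partial(N, N), 1))
```

Collecting all but a fraction ε of the coupons costs at most N · log(1/ε) draws in expectation, against roughly N · log N for the full set.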

Approximate Reconstruction: Old Partitioning Method Doesn't Work
[Figure: all records, with sets M_7, M_39, M_72, M_36, M_93, M_58, M_28, M_9, M_40, M_18.]

Approximate Reconstruction: Partitioning Step
1. Pick any record r.
2. Intersect all queries matching r to get M.
3. Find q_L and q_R: q_L ∩ q_R = M and |q_L ∪ q_R| maximised.
4. Find q'_L: q_L ∩ q'_L ≠ ∅, q'_L ∩ q_R ⊆ M, |q_L ∪ q'_L| maximised.
5. Find q'_R: q_R ∩ q'_R ≠ ∅, q'_R ∩ q_L ⊆ M, |q_R ∪ q'_R| maximised.
6. Start over if not every record is in q_L ∪ q'_L ∪ q_R ∪ q'_R.
[Figure: the intervals q'_L, q_L, q_R and q'_R laid out around M.]
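Steps 2 and 3 can be sketched with plain set operations. This is a brute-force illustration under the assumption that each observed query is available as a frozenset of matching record identifiers; it is not the authors' implementation, and the function name and toy data are hypothetical.

```python
from itertools import combinations

def core_and_cover(queries, r):
    """Steps 2-3: M = intersection of all queries matching r, then the pair of
    queries whose intersection is exactly M and whose union is largest."""
    matching = [q for q in queries if r in q]
    if not matching:
        return None, None
    m = frozenset.intersection(*matching)
    best = None
    # Any pair with q1 & q2 == m must contain r, so only matching queries are candidates.
    for q1, q2 in combinations(matching, 2):
        if q1 & q2 == m and (best is None or len(q1 | q2) > len(best[0] | best[1])):
            best = (q1, q2)
    return m, best

# Toy data: record IDs happen to equal the hidden values 1..8.
q0 = frozenset({1, 2, 3, 4})
q1 = frozenset({4, 5, 6, 7})
q2 = frozenset({3, 4, 5})
q3 = frozenset({6, 7, 8})
m, pair = core_and_cover([q0, q1, q2, q3], r=4)
print(sorted(m), [sorted(q) for q in pair])
```

Which of the two returned queries plays the role of q_L and which q_R cannot be told from access pattern alone, consistent with recovery only up to reflection.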
