A Survey of Oblivious RAMs
David Cash
IBM
Securely Outsourcing Memory
Goal: Store, access, and update data on an untrusted server.
[Diagram: the client issues Write(i,x) and Read(j) to an untrusted server storing Mem[0], Mem[1], ..., Mem[N-1].]
2
Goal: Store, access, and update data on an untrusted server.
[Diagram: the client sends Write(i,x) and Read(j) to a server storing Mem[0], Mem[1], ..., Mem[N-1].]
"Untrusted" means:
3
[Diagram: the client's ops Op1(arg1), ..., Opt(argt) pass through the emulator's cache and become read/write ops on the server's Mem[0], ..., Mem[N-1].]
An ORAM emulator is an intermediate layer that protects any client (i.e., program). The ORAM will issue operations that deviate from the actual client requests.
Correctness: If the server is honest, then the input/output behavior matches that of an unprotected client.
Security: The server cannot distinguish between two clients with the same running time.
4
Assumption #1: Server does not see data. Store an encryption key on the emulator and re-encrypt on every read/write.
Assumption #2: Server does not see the op type (read vs. write). Every op is replaced with both a read and a write.
Assumption #3: Server is honest-but-curious. Store a MAC key on the emulator and sign (address, time, data) on each op.
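Assumptions #1 and #2 can be made concrete with a small sketch (all names are mine, and the SHA-256 counter-mode keystream is a toy stand-in for real authenticated encryption): every operation becomes a read followed by a write of a freshly re-encrypted block, so the server sees neither plaintext nor the op type.

```python
import os
from hashlib import sha256

def keystream(key: bytes, nonce: bytes, n: int) -> bytes:
    # Toy counter-mode keystream from SHA-256 (illustration only).
    out, ctr = b"", 0
    while len(out) < n:
        out += sha256(key + nonce + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class EncryptedRAM:
    """Every op = one read + one re-encrypted write (Assumptions #1 and #2)."""
    def __init__(self, n, block=16):
        self.key, self.block = os.urandom(16), block
        self.server = [self._enc(bytes(block)) for _ in range(n)]

    def _enc(self, pt):
        nonce = os.urandom(16)              # fresh nonce => fresh ciphertext
        return nonce + xor(pt, keystream(self.key, nonce, len(pt)))

    def _dec(self, ct):
        nonce, body = ct[:16], ct[16:]
        return xor(body, keystream(self.key, nonce, len(body)))

    def access(self, i, new_val=None):
        val = self._dec(self.server[i])     # always read...
        if new_val is not None:
            val = new_val
        self.server[i] = self._enc(val)     # ...and always rewrite
        return val
```

Because the write happens whether or not the client changed the value, reads and writes are indistinguishable, and re-encryption makes repeated contents look fresh.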
5
What's left to protect is the "access pattern" of the program.
Definition: The access pattern generated by a sequence (i1, ..., in) with the ORAM emulator is the random variable (j1, ..., jT) sampled while running with an honest server.
[Diagram: client ops Op1(i1), Op2(i2), ... become server ops Op1(j1), ..., OpT(jT) via the emulator's cache.]
Definition: An ORAM emulator is secure if, for every pair of sequences of the same length, their access patterns are indistinguishable.
6
Assumption #3: Server is honest-but-curious. Store a MAC key on the client and sign (addr, time, data) on each op...
Simple authentication does not work: what do we check the timestamp against?
It does work if the scheme supports "time-labeled simulation": the system can calculate the "last touched" time for each index at all times. Then the client can check whether the server returned the correct (addr, time, data).
Some of the recent papers might not support this.
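A sketch of the signing step (function names are mine); it only helps if the client can recompute the expected "last touched" time via time-labeled simulation, as noted above.

```python
import hmac, hashlib

def tag(mac_key: bytes, addr: int, time: int, data: bytes) -> bytes:
    # MAC over (addr, time, data); the server stores the tag next to the data.
    msg = addr.to_bytes(8, "big") + time.to_bytes(8, "big") + data
    return hmac.new(mac_key, msg, hashlib.sha256).digest()

def verify(mac_key: bytes, addr: int, expected_time: int, data: bytes, t: bytes) -> bool:
    # The client recomputes the expected last-touched time for addr and
    # checks that the server returned the matching (addr, time, data).
    return hmac.compare_digest(t, tag(mac_key, addr, expected_time, data))
```

A stale-but-validly-signed answer fails verification precisely because the client knows the correct timestamp for each index.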
7
[Slide fragment: the RAM model; "... for the encryption and authentication".]
8
PIR: Oblivious transfer without sender security (i.e., the receiver may learn more than the requested index).
Some differences:
- In ORAM, server data changes with each operation; in PIR, server data does not change.
- In ORAM, the server only performs simple read/write ops; in PIR, the server performs "heavier" computation.
- In ORAM, the client may keep state between queries; in PIR, the client does not keep state.
9
N - number of memory slots.
Efficiency measures:
- Amortized overhead: total # of server ops divided by # of ops issued by the client.
- Worst-case overhead: # of server ops needed for the simulator to respond to any given call by the program.
- Client storage: space kept between ops.
- Client memory: space used during processing of an op.
Parameter: Can also look at scaling with the size of memory slots. (Not today)
10
Example #1: Store everything in the ORAM simulator cache and simulate with no calls to the server. Client storage = N. Essentially optimal communication, but the storage assumption does not hold in practice.
Example #2: Store memory on the server, but scan the entire memory on every op. Amortized and worst-case communication overhead = N.
Example #3: Assume the client accesses each memory slot at most once, and then permute addresses using a PRP.
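Example #2 as code, a toy sketch (names are mine): every operation touches every slot, so the access pattern is trivially independent of the requested address.

```python
class ScanORAM:
    # Keep memory on the server, but scan all of it on every operation,
    # so the server sees the same N reads and N writes regardless of addr.
    def __init__(self, n):
        self.server = [0] * n               # stand-in for untrusted storage

    def access(self, addr, new_val=None):
        result = None
        for i in range(len(self.server)):   # scan the whole memory
            v = self.server[i]              # read slot i
            if i == addr:
                result = v
                if new_val is not None:
                    v = new_val
            self.server[i] = v              # write slot i back
        return result
```

This has overhead exactly N per operation, matching Example #2.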
11
Theorem (GO'90): Any ORAM emulator must perform Ω(t log t) operations to simulate t operations. (Proved via a combinatorial argument.)
Theorem (BM'10): Any ORAM emulator must either perform Ω(t log t log log t) operations to simulate t operations or use storage Ω(N^(2-o(1))) on the server. (They actually prove more for other computation models.)
12
In order to be interesting, an ORAM must simultaneously provide
Desirable features for an “optimal ORAM”:
Allow multiple clients to obliviously access data w/o communicating amongst themselves between queries. (Requires op counters.)
13
The original goal was to simulate general Turing machines; early work gave the first interesting construction.
14
(Legend: marked schemes are covered in this talk, including SSS'12.)
| Scheme | Nickname | Client Memory | Client Storage | Server Storage | Worst-Case Overhead | Amortized Overhead |
|---|---|---|---|---|---|---|
| G'87 | "√n" | O(1) | O(1) | O(n) | O(n log² n) | O(√n log² n) |
| O'90 | "Hierarchical" | O(1) | O(1) | O(n log n) | O(n log² n) | O(log³ n) |
| OS'97 | "Unamortized √n" | O(1) | O(1) | O(n) | O(√n log² n) | O(√n log² n) |
| OS'97 | "Unamortized Hierarchical" | O(1) | O(1) | O(n log n) | O(log³ n) | O(log³ n) |
| WS'08 | "Merge sort" | O(√n) | O(√n) | O(n log n) | O(n log n) | O(log² n) |
| GM'11 | "Cuckoo 1" | O(1) | O(1) | O(n) | O(n) | O(log² n) |
| KLO'11 | "Cuckoo virtual stash" | O(1) | O(1) | O(n) | O(n) | O(log² n / log log n) |
| GM'11 | "Cuckoo 2" | O(n^δ) | O(n^δ) | O(n) | O(n) | O(log n) |
| GMOT'11 | "Republishing OS'97 Pt 1" | O(1) | O(1) | O(n) | O(√n log² n) | O(√n log² n) |
| GMOT'11 | "Extending OS'97" | O(n^δ) | O(1) | O(n) | O(log n) | O(log n) |
| SCSL'11 | "Binary Tree" | O(1) | O(1) | O(n log n) | O(log³ n) | O(log³ n) |
| GMOT'12 | "Cuckoo+" | O(n^δ) | O(1) | O(n) | O(n) | O(log n) |
| SSS'12 | "Parallel Buffers" | O(√n) | O(√n) | O(n) | O(√n) | O(log² n) |
| You? | "Optimal" | O(1) | O(1) | O(n) | O(log n) | O(log n) |
15
16
17
Claim: Given any permutation π on {1, ..., N}, we can permute the data according to π with a sequence of ops that does not depend on the data or π. This means we move the data at address i to address π(i).
Proof idea: Use an oblivious sorting algorithm. For each comparison in the sort, read both positions and rewrite them, either swapping the data or not (depending on whether π(i) > π(j)).
18
Instantiating the oblivious sort:
- Batcher sorting network: O(N log² N) comparisons, fast.
- AKS sorting network: O(N log N) comparisons, slow in practice.
- Randomized Shellsort: O(N log N) comparisons, fast, sorts w.p. 1 - 1/poly. (Concrete security loss?)
19
Claim: Given any permutation π on {1 , ... , N}, we can permute the data according to π with a sequence of ops that does not depend on the data or π. Corollary: Given a key K for a PRP F, we can permute the data according to F(K, · ) using O(1) client memory with a sequence of O(N log N) ops that does not depend on the data or K. Note: Using O(N) client memory we can do this with O(N) ops by reading everything, permuting locally, and then uploading.
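The corollary in code, a sketch with names of my own choosing and a hash-based stand-in for the PRP (a real PRP is a bijection, so distinct addresses never collide on tags). The key point is that the sequence of compare-exchange positions in Batcher's network depends only on N, so an observer of the reads and writes learns nothing about the permutation.

```python
import hashlib

def prp_tag(key: bytes, i: int) -> int:
    # Hash-based stand-in for the PRP value F(K, i).
    return int.from_bytes(hashlib.sha256(key + i.to_bytes(8, "big")).digest(), "big")

def batcher_pairs(n: int):
    # Compare-exchange pairs of Batcher's odd-even merge sort.
    # The sequence depends only on n, never on the data being sorted.
    p = 1
    while p < n:
        k = p
        while k >= 1:
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        yield (i + j, i + j + k)
            k //= 2
        p *= 2

def oblivious_permute(data, key: bytes):
    # Sort (tag, value) pairs by tag. Every compare-exchange reads and
    # rewrites a fixed pair of positions, so the server-visible access
    # pattern is independent of both the data and the key.
    tagged = [(prp_tag(key, i), v) for i, v in enumerate(data)]
    for a, b in batcher_pairs(len(tagged)):
        if tagged[a][0] > tagged[b][0]:
            tagged[a], tagged[b] = tagged[b], tagged[a]
    return [v for _, v in tagged]
```

Batcher's network gives the O(N log² N) comparison count quoted above; swapping in AKS would give O(N log N) at the cost of practicality.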
To read/write a slot:
- If it is not in the client cache: read it from the DB at its permuted address (and cache it).
- If it is already in the cache: read the next dummy slot instead.
20
[Diagram: N data slots followed by C dummy slots.]
Initialization: Pick a PRP key. Use it to shuffle the N data slots together with C "dummy" slots.
Client storage: C slots. Server storage: N + C slots.
21
[Diagram: N data slots followed by C dummy slots.]
After C ops, the cache may be full or we may run out of dummy slots. ⇒ Reshuffle and flush the cache after every C reads: pick a new PRP key and shuffle (writing back the slots that were changed in the client cache).
Client storage: C slots. Server storage: N + C slots.
22
Security: Relatively easy to prove. The server sees an oblivious sort and then C unique, random-looking read/writes before reinitializing.
Performance:
- Client storage: C slots.
- Server storage: N + C slots.
- Amortized overhead: 1 + (N+C) log(N+C) / C.
- Worst-case overhead: 1 + (N+C) log(N+C).
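The read logic of this square-root construction, as a toy client-side sketch (class and variable names are mine; a random permutation stands in for the PRP-driven oblivious shuffle, and only reads are modeled).

```python
import random

class SqrtORAM:
    """Toy model of the square-root construction: N data slots plus C
    dummies are shuffled on the server; cached addresses trigger dummy
    reads, and a full cache forces a reshuffle with a fresh key."""
    def __init__(self, data, C):
        self.N, self.C = len(data), C
        self.count = 0
        self.cache = {}                     # addr -> value (client cache)
        self._shuffle(data)

    def _shuffle(self, data):
        # Random permutation stands in for the PRP-based shuffle;
        # dummies occupy logical addresses N .. N+C-1.
        slots = list(data) + [None] * self.C
        self.pi = random.sample(range(self.N + self.C), self.N + self.C)
        self.server = [None] * (self.N + self.C)
        for i, v in enumerate(slots):
            self.server[self.pi[i]] = v
        self.next_dummy = self.N

    def read(self, addr):
        if addr in self.cache:
            val = self.cache[addr]
            _ = self.server[self.pi[self.next_dummy]]   # dummy read
            self.next_dummy += 1
        else:
            val = self.server[self.pi[addr]]
            self.cache[addr] = val
        self.count += 1
        if self.count == self.C:            # cache full / dummies spent
            data = [self.cache.get(a, self.server[self.pi[a]])
                    for a in range(self.N)] # write back cached slots
            self.cache, self.count = {}, 0
            self._shuffle(data)             # reshuffle with a fresh key
        return val
```

Each op touches exactly one fresh-looking slot, and every C ops the server sees one reshuffle, which is where the (N+C) log(N+C) terms above come from.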
23
[Diagram: N data slots, C dummy slots, and C "cache" slots, all on the server.]
Basic observation: We can just put the cache on the DB and read it back each time.
To read a slot: scan the server-side cache; if the slot is not there, read it from main memory, otherwise read the next dummy slot; then write the slot back into the cache.
Initialization: Same, plus an empty cache.
Client storage: O(1) bits. Server storage: N + 2C slots.
24
|    | Client Memory | Client Storage | Server Storage | Amortized Cost | Worst-Case Cost |
|----|---|---|---|---|---|
| #1 | O(1) | C | N + C | (C + (N+C) log(N+C)) / C | 1 + (N+C) log(N+C) |
| #2 | O(1) | O(1) | N + 2C | (C² + (N+C) log(N+C)) / C | C + (N+C) log(N+C) |

Take C = N^(1/2). (Batcher sort ⇒ extra log N factor in the costs.)

|    | Client Memory | Client Storage | Server Storage | Amortized Cost | Worst-Case Cost |
|----|---|---|---|---|---|
| #1 | O(1) | N^(1/2) | O(N) | O(N^(1/2) log N) | O(N log N) |
| #2 | O(1) | O(1) | O(N) | O(N^(1/2) log N) | O(N log N) |
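As a sanity check, substituting C = N^(1/2) into the scheme #2 costs above (a sketch, dropping constants):

```latex
\frac{C^2 + (N+C)\log(N+C)}{C}\;\Big|_{C=\sqrt{N}}
   \;=\; \frac{N + O(N \log N)}{\sqrt{N}}
   \;=\; O\!\big(\sqrt{N}\,\log N\big),
\qquad
C + (N+C)\log(N+C)\;\Big|_{C=\sqrt{N}} \;=\; O(N \log N).
```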
25
Observation: For our oblivious sorts, all comparisons are predetermined, so the work can be divided up and done in small bursts instead of one big sort.
[Ostrovsky, Shoup’97]
But we still need to do reads/writes in between the bursts, on the "current" copy.
26
[Diagram: two copies, "Current" and "Auxiliary".]
Initialization: Same, except allocate two tables and two caches.
To read a slot: read as before and update Current's cache.
The changed slots will be in the aux cache.
27
28
29
[Slide fragment: server storage and client storage; "... per bucket on average".]
30
Read/Write(addr)
31
Computation during Read/Write(red address):
32
(on average)
Shuffle into level i+1 using a new key. After T operations:
33
34
35
36
37
38
39
Key observation: This scheme never uses the value F(Ki, addr) twice.
Why? Suppose the client asks for the same address twice. By the second read, a new key has been chosen for that level since the last read, due to shuffling.
Using the key observation, all reads look like random bucket scans. The security proof is more delicate than the first one.
40
Worst-case overhead: O(N log³ N).
Average-case overhead: O(log³ N).
Average: Σ_{i=0}^{log N} O(i · log N) = Σ_{i=0}^{log N} O(log² N) = O(log³ N)
Worst case: Σ_{i=0}^{log N} O(2^i · i · log N) = Σ_{i=0}^{log N} O(N log² N) = O(N log³ N)
Storage: O(N log N) slots.
De-amortized variant: Can shuffle incrementally as before.
41
More advanced sorting: Use O(N^δ) client storage when sorting to save a log N factor in communication, giving O(log² N) amortized cost. A de-amortized variant gives O(log³ N) worst-case overhead but doubles server storage.
[Williams, Sion, Sotakova’08] [Ostrovsky, Shoup’97]
42
43
|                   | Storage    | Amortized Overhead |
|-------------------|------------|--------------------|
| Ostrovsky'90      | O(N log N) | O(N log³ N)        |
| Pinkas-Reinman'10 | O(N)       | O(N log² N)        |
44
45
[Diagram: two hash tables; item A stored at h1(A), item B at h1(B).]
46
[Diagram: inserting C with h1(C) = h1(A) evicts A, which moves to h2(A).]
47
Look-up is constant-time. To look up A: check h1(A), then h2(A).
[Diagram: A found at h2(A); B and C remain in place.]
48
[Diagram: items A, B, C, D with h1(A) = h1(C) = h1(D) and h2(A) = h2(C) = h2(D).]
Failures occur when x items hash to the same (x-1) slots in both tables.
Theorem (Pagh-Rodler'01): After (1-ϵ)n insertions, the probability of failure is O(1/n), where ϵ is a constant.
In practice, we abort the insertion after a chain of c log n evictions.
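The insertion-with-evictions loop, including the abort after a bounded chain, can be sketched as follows (class and parameter names are mine, and Python's built-in `hash` seeded with random values stands in for h1 and h2):

```python
import random

class CuckooTable:
    """Toy cuckoo hash with two arrays. insert() evicts along a chain
    and gives up after max_evictions kicks, signalling a rehash."""
    def __init__(self, size):
        self.size = size
        self.t1 = [None] * size
        self.t2 = [None] * size
        self.seeds = (random.random(), random.random())

    def _h(self, which, key):
        # Tuple hashing over (seed, key) stands in for h1 / h2.
        return hash((self.seeds[which], key)) % self.size

    def lookup(self, key):
        # Constant time: probe only positions h1(key) and h2(key).
        return self.t1[self._h(0, key)] == key or self.t2[self._h(1, key)] == key

    def insert(self, key, max_evictions=32):
        which = 0
        for _ in range(max_evictions):
            i = self._h(which, key)
            table = self.t1 if which == 0 else self.t2
            table[i], key = key, table[i]   # place key, evict old occupant
            if key is None:
                return True                 # slot was free: done
            which ^= 1                      # evicted item tries other table
        return False                        # long chain: abort, rehash needed
```

In a real deployment `max_evictions` would be c log n, matching the abort rule above, and a `False` return triggers a rebuild with fresh hash functions.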
49
Server storage: a cuckoo hash table with 2^i slots for each level.
Client storage: keys for each level.
50
Computation during Read/Write(A):
51
Reshuffle periodically to prevent overflows.
52
Worst-case overhead: O(N log² N) (shuffle/rehash).
Average-case overhead: O(log² N).
Storage: O(N) slots.
Average: Σ_{i=0}^{log N} O(2^i · i / 2^i) = Σ_{i=0}^{log N} O(log N) = O(log² N)
Worst case: Σ_{i=0}^{log N} O(2^i · i) = Σ_{i=0}^{log N} O(N log N) = O(N log² N)
Storage: Σ_{i=0}^{log N} 2(1+ϵ)·2^i = O(2^(log N + 1)) = O(N)
53
54
We define two clients that the server can distinguish.
Both clients start with: read several blue slots (the red data is on the next level).
Then they differ in one last step: Client 1 reads several red slots; Client 2 ...
55
Claim: The server can distinguish the clients with advantage ≈ 1/n⁶: roughly the probability that the relevant items hash to the same pair of buckets (i.e., cause a cuckoo failure).
56
On a cuckoo failure, the resulting accesses look different from a random access pattern.
Observation: If the probability of failure were negligible, then PR'10 would be secure.
57
58
59
Same tables as before, plus a small extra table called a "stash".
[Diagram: items A, B, C spread across the tables and stash.]
Theorem (Goodrich-Mitzenmacher'11): Let the stash size be s. After (1-ϵ)n insertions, the probability of failure is O(n^(-s)), where ϵ is a constant.
60
Server storage: for each level, a cuckoo table for 2^i slots and a log N-size stash.
Client storage: keys for each level, plus working memory during shuffling.
61
(GMOT’11b)
62
63
[Stefanov, Shi, Song’12]
Several small sub-ORAMs. One constant-size cache per sub-ORAM. A buffer that can hold one sub-ORAM.
Position map: for each data slot, the index of the sub-ORAM holding it.
The client needs O(N) storage for the position map! ... but it is storing log N bits per slot instead of the slot itself.
64
Read/Write(addr): Look up addr in the position map and send the request to the corresponding sub-ORAM on the server. Assign addr to a fresh random sub-ORAM and hold the block in the buffer.
Background process: evict buffered blocks back to their sub-ORAMs.
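The client-side flow just described, as a toy sketch (all names are mine; the sub-ORAMs are plain dictionaries rather than real ORAMs, so only the position-map and eviction logic is modeled, and stored values are assumed non-None):

```python
import random

class PartitionORAM:
    """Toy model of a partition-based client: a position map records
    which sub-ORAM holds each address, and every access moves the
    block to a freshly chosen random sub-ORAM via a buffer."""
    def __init__(self, data, num_partitions):
        self.P = num_partitions
        self.pos = {}                       # position map: addr -> sub-ORAM index
        self.subs = [dict() for _ in range(self.P)]
        self.buffer = {}                    # blocks awaiting eviction
        for addr, val in enumerate(data):
            p = random.randrange(self.P)
            self.pos[addr] = p
            self.subs[p][addr] = val

    def access(self, addr, new_val=None):
        val = self.buffer.pop(addr, None)   # block may still be buffered
        if val is None:
            val = self.subs[self.pos[addr]].pop(addr)   # read current sub-ORAM
        if new_val is not None:
            val = new_val
        self.pos[addr] = random.randrange(self.P)       # fresh random partition
        self.buffer[addr] = val             # hold until background eviction
        return val

    def evict_one(self):
        # Background process: move one buffered block into its sub-ORAM.
        if self.buffer:
            addr, val = self.buffer.popitem()
            self.subs[self.pos[addr]][addr] = val
```

Reassigning each accessed address to a random partition is what makes repeated accesses to the same slot look like accesses to independent sub-ORAMs.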
65
The security definition is relaxed to allow 1/poly advantage. Concrete numbers may be OK.
Configuration: N1/2 sub-ORAMs implemented with modified GM’10, each of capacity about N1/2.
66
67
[Diagram: the client talks to a server whose database sits behind trusted hardware; the client and the trusted hardware share a key K over an authenticated channel.]
68
69
- Can we prove lower bounds for ORAMs of a certain form?
- Recursively store client state in the next ORAM (SCSL'11 does this).
- Taking memory-slot size into account is important.
- Implementations must be careful about timing attacks.
70