Scrypt is Maximally Memory Hard
Joël Alwen (IST Austria), Binyi Chen (UCSB), Krzysztof Pietrzak (IST Austria), Leonid Reyzin (Boston University), Stefano Tessaro (UCSB)

Password hashing: store a hash of password + salt
• Stored record: the salt and F(salt || password), e.g., $5$MnfsQ4iN$ZMTppKN16y/tIsUYs/obHlhdP.Os80yXhTurpBMUbA5
• Honest users can still log in quickly: ≈ 1 evaluation of F.
• F is moderately hard, so a brute-force attack is infeasible: it needs many evaluations of F.
• Traditional hardness metric: time complexity (e.g., PKCS #5).

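A minimal sketch of the store-and-verify flow described on this slide. It assumes PBKDF2-HMAC-SHA256 from Python's standard library as the moderately hard function F; this is a time-hard function in the PKCS #5 tradition, not a memory-hard one. The names `hash_password`, `verify_password`, and the iteration count are illustrative choices, not part of the talk.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # illustrative work factor (assumption, not a recommendation)


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, F(salt, password)); a fresh random salt is drawn if none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Honest login: a single evaluation of F, then a constant-time comparison."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    salt, stored = hash_password("correct horse")
    print(verify_password("correct horse", salt, stored))
    print(verify_password("wrong guess", salt, stored))
```

An honest login costs one evaluation of F, while a brute-force attacker pays that cost once per guess.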
ASIC-resistance
• Honest users (general-purpose CPU): cost per F evaluation = C.
• Adversaries (ASICs): cost per F evaluation = C' ≪ C, thanks to better parallelization, pipelining for speedup, lower energy costs, …
• Can we enforce C' ≈ C?
• Idea: memory costs (e.g., on-chip area, access time, $-cost) are platform-independent.

Memory-hard functions (MHFs) [Percival, 2009]
• Fast evaluation of MHF F ⇒ large memory.
• Small memory ⇒ slow evaluation of MHF F.

Memory-hardness, more precisely
Goal: maximize the cumulative memory complexity (CMC) of any (possibly parallelized) strategy to evaluate F.
CMC = Σ_{t=0}^{T} Memory(t), where T is the time needed to evaluate F [Alwen and Serbinenko, STOC '15].
[Figure: memory usage plotted against time, from 0 to T.]

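To make the definition concrete, here is a small sketch that computes CMC as the sum of the memory in use at each time step. The two traces are made-up illustrations: they have the same CMC but very different peak memory, which is why CMC is a different metric from peak memory.

```python
from typing import Sequence


def cumulative_memory_complexity(memory_per_step: Sequence[int]) -> int:
    """CMC = sum over time steps t = 0..T of Memory(t)."""
    return sum(memory_per_step)


# Hypothetical traces (memory in arbitrary units, one entry per time step).
constant_trace = [4, 4, 4, 4, 4, 4]   # steady memory usage
bursty_trace = [8, 8, 8, 0, 0, 0]     # large memory early, none afterward

if __name__ == "__main__":
    for name, trace in (("constant", constant_trace), ("bursty", bursty_trace)):
        print(name, "peak =", max(trace), "CMC =", cumulative_memory_complexity(trace))
```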
Memory-hardness
• Memory-hardness was a de facto requirement for the PHC.
• Many memory-hard candidates: Argon2d, Argon2i, Scrypt, Lyra2, Balloon hashing, Catena, Yescrypt, …
• Can we build provably memory-hard functions?

Towards optimal memory hardness
Previous provably memory-hard functions [AS15, BCS16, ABP17] are iMHFs: their memory access patterns are data-independent!
[Figure: two MHF evaluations, I_1 → O_1 and I_2 → O_2, each drawn with its memory.]
Two issues raised by Alwen and Blocki:
(1) No iMHF achieves optimal memory hardness.
(2) Practical iMHFs are even less memory-hard for parallel evaluation strategies.
Can data-dependence help?

This paper: Scrypt is optimally memory hard
• Very first conjectured MHF: proposed by Colin Percival in 2009
• Used within PoWs in Litecoin
• Inspired the design of Argon2d, one of the winners of the Password Hashing Competition
• Covered by RFC 7914

Take-home message: very first example of a function with provably optimal memory hardness
+ It is practical, already in use, and relatively simple.
Finding such a proof has been a surprisingly hard problem:
• The proof in [Percival, 2009] is incorrect
• [ACKKPT16] only gave a restricted result
(Recall: no iMHF achieves optimal memory hardness.)

Roadmap
1. The Scrypt function: definition, memory-hardness intuition, and challenges
2. Optimal memory hardness of Scrypt
3. Conclusions

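Core of Scrypt: ROMix
• H: a Salsa20-based "hash function" with output length w, modeled as a random oracle!
• n: a tunable parameter, e.g., n = 2^14, w = 1 KB.
• First phase: starting from the input X_0, compute X_1, X_2, …, X_{n−1} by iterating H, ending in B_0.
• Second phase: compute B_1, B_2, B_3, … with H, where each step looks up one X-value selected by a challenge: C_0 = B_0 mod n, C_1 = B_1 mod n, C_2, C_3, …, C_{n−1}.
[Figure: the chain X_0 → X_1 → … → X_{n−1} → B_0 under H, with example challenges C_0 = 2 and C_1 = n − 1 pointing back into the chain.]
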
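The two phases can be sketched in a few lines of Python. This is a toy model, not the RFC 7914 algorithm: it assumes SHA-256 as a stand-in for the Salsa20-based function H (so w = 256 bits), derives the challenge by reading the whole block as a big-endian integer, and combines the running value with the looked-up block by XOR before hashing. The function names are illustrative.

```python
import hashlib


def H(block: bytes) -> bytes:
    """Stand-in for the random oracle H (assumption: SHA-256, w = 256 bits)."""
    return hashlib.sha256(block).digest()


def romix(seed: bytes, n: int) -> bytes:
    """Toy ROMix with n stored blocks and n data-dependent lookups."""
    # Phase 1: X_0, X_1 = H(X_0), ..., X_{n-1}; all n blocks are kept in memory.
    xs = [H(seed)]  # hash the seed once so every block is exactly 32 bytes
    for _ in range(n - 1):
        xs.append(H(xs[-1]))

    # Phase 2: the running value selects which stored block is read next.
    b = H(xs[-1])
    for _ in range(n):
        challenge = int.from_bytes(b, "big") % n            # challenge index in {0..n-1}
        mixed = bytes(u ^ v for u, v in zip(b, xs[challenge]))
        b = H(mixed)                                        # next running value
    return b


if __name__ == "__main__":
    print(romix(b"salt || password", 1024).hex())
```

Because each challenge depends on the previous running value, the lookups cannot be scheduled ahead of time, which is the data-dependence that distinguishes this design from iMHFs.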
ROMix: answering challenges
• The challenges C_0, C_1, …, C_{n−1} are unpredictable: each challenge is revealed only once the previous one has been answered, and C_{n−1} is learnt only after answering all earlier challenges C_j.
1. Need to know X_{C_j} to learn C_{j+1}.
2. Need to answer all challenges to complete the evaluation.
Useful to abstract this!
[Figure: the ROMix chain with example challenges C_0 = 2 and C_1 = n − 1; C_2 is revealed after C_1 is answered.]

Abstraction: round game
• Abstract the second phase: challenges are random instead of H-dependent!
• The adversary is given the input X_0.
• For each round j = 0, …, n − 1: the challenger samples C_j ← {0, 1, …, n − 1} and the adversary must answer with X_{C_j}.
• Adversary's goal: reduce its own CMC for answering all challenges!
CMC = cumulative memory complexity = Σ_{t=0}^{T} Memory(t)

Round game: naïve strategy
Init:
  Mem[0] ← X_0
  for i = 1, …, n do Mem[i] ← H(Mem[i − 1])
Upon challenge C_j (for rounds j = 0 to n − 1):
  return Mem[C_j], which is X_{C_j}
Memory Θ(n × w) is held over Θ(n) time, so CMC = Ω(n^2 w). Remember it!

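The naïve strategy can be simulated directly. The sketch below assumes SHA-256 as a stand-in for H, counts memory in w-bit blocks, charges one time step per challenge, and ignores the cost of the initialization phase; all names are illustrative.

```python
import hashlib
import random
from typing import List, Tuple


def H(block: bytes) -> bytes:
    """Stand-in for the random oracle H (assumption: SHA-256)."""
    return hashlib.sha256(block).digest()


def naive_round_game(x0: bytes, n: int, rng: random.Random) -> Tuple[List[bytes], int]:
    """Store the whole chain, then answer every challenge by table lookup."""
    # Init: Mem[0] = X_0, Mem[i] = H(Mem[i-1]); only X_0..X_{n-1} can be challenged.
    mem = [x0]
    for i in range(1, n):
        mem.append(H(mem[i - 1]))

    answers = []
    cmc_blocks = 0  # cumulative memory, counted in w-bit blocks
    for _ in range(n):
        challenge = rng.randrange(n)      # uniform challenge in {0, ..., n-1}
        answers.append(mem[challenge])    # answered in a single step
        cmc_blocks += len(mem)            # all n blocks are held during that step
    return answers, cmc_blocks


if __name__ == "__main__":
    _, cmc = naive_round_game(b"example input", 64, random.Random(0))
    print("CMC in blocks:", cmc)
```

With n rounds and n blocks held in each, the count is n^2 blocks, i.e., n^2 · w bits.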
Round game: memory-less strategy
Upon challenge C_j (for rounds j = 0 to n − 1):
  X ← X_0
  for i = 1, …, C_j do X ← H(X)
  return X, which is X_{C_j}
Memory is small, but the expected time per challenge is Θ(n), so CMC = Ω(n^2 w) again!

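A matching sketch for the memory-less strategy, under the same assumptions (SHA-256 as a stand-in for H, memory counted in w-bit blocks, one time step per H call, illustrative names). Only one block is held at a time, but it is held for as many steps as the challenge index.

```python
import hashlib
import random
from typing import List, Tuple


def H(block: bytes) -> bytes:
    """Stand-in for the random oracle H (assumption: SHA-256)."""
    return hashlib.sha256(block).digest()


def memoryless_round_game(
    x0: bytes, n: int, rng: random.Random
) -> Tuple[List[bytes], List[int], int]:
    """Recompute X_{C_j} from X_0 for every challenge, storing nothing in between."""
    answers, challenges = [], []
    cmc_blocks = 0  # cumulative memory, counted in w-bit blocks
    for _ in range(n):
        challenge = rng.randrange(n)   # uniform challenge in {0, ..., n-1}
        x = x0
        for _ in range(challenge):
            x = H(x)
            cmc_blocks += 1            # one block is held during each sequential step
        answers.append(x)
        challenges.append(challenge)
    return answers, challenges, cmc_blocks


if __name__ == "__main__":
    n = 64
    _, _, cmc = memoryless_round_game(b"example input", n, random.Random(0))
    print("CMC in blocks:", cmc, "compared with n*(n-1)/2 =", n * (n - 1) // 2)
```

The expected count is about n^2/2 blocks, so trading memory for time does not lower the order of the CMC.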
Previous two strategies are special cases: constant memory size
• Naïve strategy: CMC = Ω(n^2 w). Memory-less strategy: CMC = Ω(n^2 w).
• More general strategy: memory consumption can vary a lot, e.g., forget values and re-compute them afterward, or hold a large memory at the time a challenge is revealed and reduce memory afterward.
• T_i = time to answer the i-th challenge.
• Goal: prove CMC = Ω(n^2 w)!
[Figure: memory over time for a general strategy across challenges C_0, C_1, C_2, …, with T_i marked between consecutive challenges.]

Memory hardness: intuition
Intuition: answering a challenge fast requires a large state!
• Answer quickly ⇒ large initial state.
• Small initial state ⇒ answer slowly.
[Figure: memory over time across challenges C_2, C_3, C_4.]

Single-shot memory-time trade-off
Simplifying assumption: upon learning challenge C_j, the adversary only stores p of the values X_0, …, X_{n−1}.
Example: n = 10, p = 3; in the figure, the closest stored value preceding X_{C_j} is at distance 3.
Fact: the average distance from X_{C_j} to the closest stored X_i preceding X_{C_j} is n/(2p).
Regardless of parallelism, the expected time to answer the challenge is n/(2p), i.e., inversely proportional to the memory used, as the computation of the X-values is inherently sequential!

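The n/(2p) figure can be checked with a short computation. The sketch below assumes the p stored values are evenly spaced along the chain and always include X_0, which is one natural choice rather than something fixed by the slide; it averages, over all n possible challenges, the number of sequential H calls needed from the closest stored predecessor.

```python
from bisect import bisect_right
from typing import List, Sequence


def evenly_spaced(n: int, p: int) -> List[int]:
    """Indices of p stored values, evenly spaced and starting at X_0 (assumption)."""
    return [(k * n) // p for k in range(p)]


def average_recompute_distance(n: int, stored: Sequence[int]) -> float:
    """Average over challenges c in {0..n-1} of c minus the closest stored index <= c."""
    stored = sorted(stored)
    if not stored or stored[0] != 0:
        raise ValueError("X_0 must be stored so that every challenge is reachable")
    total = 0
    for c in range(n):
        nearest = stored[bisect_right(stored, c) - 1]
        total += c - nearest
    return total / n


if __name__ == "__main__":
    n = 1024
    for p in (1, 8, 64):
        avg = average_recompute_distance(n, evenly_spaced(n, p))
        print(f"p = {p}: average distance = {avg}, n/(2p) = {n / (2 * p)}")
```

Doubling the number of stored values halves the average recomputation time, which is the memory-time trade-off in its simplest form.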
How to translate this intuition into a memory-hardness proof for ROMix?
Three technical barriers:
1. The adversary stores arbitrary information, e.g., XORs of X_i values, halves of X_i, information reconstructed adaptively from the challenges, etc. [ACKKPT16] considered restricted strategies and exhibited round games where general storing strategies can help!
2. Memory varies during the computation, so the single-shot memory-time trade-off is not enough! [ACKKPT16] only shows CMC = Ω(n^2 w / log^2(n)).
3. Challenges are H-dependent, as opposed to truly random: see the paper!
Focus on 1 and 2.

Roadmap
1. The Scrypt function
2. Optimal memory hardness of Scrypt: model, theorem, and proof approach
3. Conclusions

The parallel random oracle model [Alwen and Serbinenko, STOC '15]
• The adversary evaluates ROMix(X_0): starting from the initial state X_0, it moves through states S_1, S_2, …, S_T and finally outputs Y.
• At each step, the adversary asks one batch of parallel H queries and performs unbounded (arbitrary) computation to produce a new state.
• Goal of the adversary: minimize CMC = Σ_{i=1}^{T} |S_i|.

Main Theorem. For any adversary A evaluating ROMix,
  CMC(A) ≥ (1/25) · n^2 · (w − 4 · log(n))
with overwhelming probability over the choice of H.
• The 4 log(n) loss is inherent in the proof.
• Ω(n^2 w) is clearly the best possible for any construction making n queries to H. Naïve strategy: make n calls, remember all outputs.

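As a worked instance, the bound can be evaluated for the example parameters n = 2^14 and w = 1 KB. The sketch assumes that 1 KB means 8192 bits and that the logarithm is base 2; both are assumptions of this example, as are the function names. It also reports the bound as a fraction of n^2 · w, the order of the naïve strategy's cost.

```python
import math


def romix_cmc_lower_bound(n: int, w_bits: int) -> float:
    """Evaluate (1/25) * n^2 * (w - 4*log2(n)); base-2 logarithm is an assumption."""
    return n * n * (w_bits - 4 * math.log2(n)) / 25


def fraction_of_n2w(n: int, w_bits: int) -> float:
    """Lower bound divided by n^2 * w, the order of the naive strategy's CMC."""
    return romix_cmc_lower_bound(n, w_bits) / (n * n * w_bits)


if __name__ == "__main__":
    n, w_bits = 2 ** 14, 8 * 1024  # example parameters; 1 KB taken as 8192 bits
    print("lower bound (bit-steps):", romix_cmc_lower_bound(n, w_bits))
    print("fraction of n^2 * w:", fraction_of_n2w(n, w_bits))
```

For these parameters the 4 · log(n) term is 56 bits out of 8192, so the bound is within a constant factor of roughly 25 of n^2 · w.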
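Proof strategy: step 1
• T_i = time to answer the i-th challenge.
• Memory-time trade-off ⇒ lower bound on memory: the memory usage at the step where challenge C_i is revealed (green) is inversely proportional to T_i (orange), so memory × T_i ≥ Ω(nw).
• The memory-time trade-off holds not only for an adversary storing X-values, but even if the adversary stores arbitrary information!
[Figure: memory over time between challenges C_i and C_{i+1}, with the memory at challenge time in green and T_i in orange.]
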
Single-shot memory-time trade-off
• Setting: the adversary A^H(state, C) receives a state and a challenge C, and must recover X_C within T steps.
• The state is the result of arbitrary computation on H-outputs, e.g., pre-computation of H's entries, XORs of {X_i} values, halves of X_i.
• Goal: lower bound |state| as a function of T and n.
• [ACKKPT16]: computation on H-outputs can help in some round games.
• [This result]: computation on H-outputs cannot help for Scrypt!

Single-shot memory-time trade-off
Lemma. For all A and for most H, if |state| ≈ p · w bits, then Pr_C[T > n/(2p)] > 1/2.
Proof idea:
• If the adversary A^H(state, C) answers too fast for most challenges C, then A^H(state, C) can output or query many X_i values without querying H first.
• Hence the oracle H can be compressed using the state.
• This cannot be true for too many H: a random oracle is incompressible [Dwork, Naor and Wee, Crypto '05], [Alwen and Serbinenko, STOC '15].

Proof strategy: step 2
Technical barriers:
1. Adversary stores arbitrary information: handled by the single-shot memory-time trade-off for arbitrary adversaries.
2. Memory variation during computation: generalize the single-shot memory-time trade-off to an optimal CMC lower bound for the round game.