Memory Hard Jol Alwen Binyi Chen IST Austria UCSB Krzysztof - PowerPoint PPT Presentation

Scrypt is Maximally Memory Hard
Joël Alwen (IST Austria), Binyi Chen (UCSB), Krzysztof Pietrzak (IST Austria), Leonid Reyzin (Boston University), Stefano Tessaro (UCSB)

Password hashing: store a hash of password + salt, F(salt || password), e.g. $5$MnfsQ4iN$ZMTppKN16y/tIsUYs/obHlhdP.Os80yXhTurpBMUbA5
• Honest users can still log in quickly: ≈ 1 evaluation of F.
• F is moderately hard, so a brute-force attack is infeasible: it needs many evaluations of F.
• Traditional hardness metric: time complexity (e.g., PKCS #5).
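A minimal Python sketch of salted password hashing with a time-hard F, using PBKDF2 (the PKCS #5 scheme) from the standard library. The iteration count, the salt length, and the function names `hash_password` and `verify_password` are illustrative assumptions, not values or names from the talk.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed work factor; tune for the deployment


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, F(salt, password)) with F = PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Honest login: one evaluation of F plus a constant-time comparison."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

An attacker guessing passwords pays one evaluation of F per guess, which is why the cost of F on the attacker's hardware matters.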

ASIC-resistance
• Honest users (general-purpose CPU): cost per F eval = $C$.
• Adversaries (ASICs): cost per F eval = $C' \ll C$, thanks to better parallelization, pipelining for speedup, lower energy costs, …
• Can we enforce $C' \approx C$?
• Idea: memory costs (e.g., on-chip area, access time, dollar cost) are platform independent.

Memory-hard functions (MHFs) [Percival, 2009]
• Fast evaluation of MHF F ⇒ large memory.
• Small memory ⇒ slow evaluation of MHF F.

Memory-hardness, more precisely
Goal: maximize the cumulative memory complexity (CMC) of any (possibly parallelized) strategy to evaluate F [Alwen and Serbinenko, STOC '15].
$\mathrm{CMC} = \sum_{t=0}^{T} \mathrm{Memory}(t)$, where $T$ is the time needed to evaluate F.
[Figure: memory usage plotted against time, from 0 to $T$.]
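The CMC definition can be sketched in a few lines of Python. The function name and the two memory traces are hypothetical, chosen only to show that CMC charges for memory held over time rather than for peak memory.

```python
from typing import Sequence


def cumulative_memory_complexity(memory_trace: Sequence[int]) -> int:
    """CMC = sum over time steps t of Memory(t)."""
    return sum(memory_trace)


# Hypothetical traces (bits in use at each time step), for illustration only.
constant = [8, 8, 8, 8]   # modest memory held for the whole run
bursty = [32, 0, 0, 0]    # a high peak that is released immediately

print(cumulative_memory_complexity(constant))
print(cumulative_memory_complexity(bursty))
```

Both traces have the same CMC even though their peaks differ by a factor of four.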

Memory-hardness
• Memory-hardness was a de facto requirement for the Password Hashing Competition (PHC).
• Many memory-hard candidates: Argon2d, Argon2i, Scrypt, Lyra2, Balloon hashing, Catena, Yescrypt, …
• Can we build provably memory-hard functions?

Towards optimal memory hardness
Previous provably memory-hard functions [AS15, BCS16, ABP17] are iMHFs: data-independent memory access patterns!
[Figure: memory accesses of an MHF on inputs $I_1$, $I_2$ with outputs $O_1$, $O_2$.]
Two issues raised by Alwen and Blocki:
(1) No iMHF achieves optimal memory hardness.
(2) Practical iMHFs are even less memory hard for parallel evaluation strategies.
Can data-dependence help?

This paper: Scrypt is optimally memory hard
• Very first conjectured MHF: proposed by Colin Percival in 2009
• Used within PoWs in Litecoin
• Inspired the design of Argon2d, one of the winners of the Password Hashing Competition
• Covered by RFC 7914
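For readers who want to try scrypt, Python's standard library exposes it as `hashlib.scrypt` (present when Python is built against an OpenSSL version with scrypt support). The sketch below is illustrative: the password, the salt length, and the parameter values are assumptions. With r = 8, RFC 7914 uses blocks of 128 · r = 1024 bytes.

```python
import hashlib
import os

salt = os.urandom(16)  # assumed salt length
key = hashlib.scrypt(
    b"correct horse battery staple",
    salt=salt,
    n=2**14,                  # cost parameter: number of blocks stored by ROMix
    r=8,                      # block size factor: 128 * r = 1024 bytes per block
    p=1,                      # parallelization parameter
    maxmem=64 * 1024 * 1024,  # allow up to 64 MiB of working memory
    dklen=32,                 # length of the derived key in bytes
)
print(key.hex())
```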

Take home message: very first example of a function with provably optimal memory hardness, and it is practical, already in use, and relatively simple.
Finding such a proof has been a surprisingly hard problem:
• [Percival, 2009] is incorrect
• [ACKKPT16] only gave a restricted result
• No iMHF achieves optimal memory hardness

Roadmap
1. The Scrypt function: definition, memory-hardness intuition, and challenges
2. Optimal memory hardness of Scrypt
3. Conclusions

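Core of Scrypt: ROMix
• $H$: a Salsa20-based "hash function" with output length $w$. Modeled as a random oracle!
• $n$: a tunable parameter, e.g., $n = 2^{14}$, $w = 1$ KB.
• First phase: starting from the input $X_0$, compute the chain $X_0, X_1, X_2, \dots, X_{n-1}$, applying $H$ at each step.
• Second phase: starting from $B_0$, apply $H$ repeatedly to obtain $B_1, B_2, B_3, \dots$; each step reads the chain value selected by a challenge $C_0 = B_0 \bmod n$, $C_1 = B_1 \bmod n$, …, $C_{n-1}$.
[Figure: the $X$ chain with the example challenges $C_0 = 2$ and $C_1 = n - 1$ pointing into it.]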
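A toy Python sketch of the two phases of ROMix, under loudly stated assumptions: SHA-256 stands in for the random oracle $H$ (scrypt itself uses a Salsa20/8-based function on 1 KB blocks), the challenge is read as a big-endian integer (RFC 7914 uses a little-endian `Integerify`), and the update rule $B_{j+1} = H(B_j \oplus X_{C_j})$ follows the structure of RFC 7914 rather than anything spelled out on the slide. It is not interoperable with real scrypt.

```python
import hashlib

W_BYTES = 32  # assumed output length w of the stand-in oracle (SHA-256)


def H(block: bytes) -> bytes:
    """Stand-in for the random oracle H; NOT scrypt's real mixing function."""
    return hashlib.sha256(block).digest()


def xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def romix(x0: bytes, n: int) -> bytes:
    if len(x0) != W_BYTES:
        raise ValueError("x0 must be exactly W_BYTES long")
    # Phase 1: compute and store the chain X_0, ..., X_{n-1}.
    xs = [x0]
    for _ in range(n - 1):
        xs.append(H(xs[-1]))
    # Phase 2: n data-dependent lookups into the stored chain.
    b = H(xs[-1])                         # B_0
    for _ in range(n):
        c = int.from_bytes(b, "big") % n  # challenge C_j = B_j mod n
        b = H(xor(b, xs[c]))              # B_{j+1} = H(B_j xor X_{C_j})
    return b


print(romix(H(b"salt||password"), n=16).hex())
```

Each challenge depends on the previous answer, so the lookups cannot be predicted before the first phase finishes.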

ROMix: Answering challenges
[Figure: the ROMix diagram with challenges $C_0$, $C_1 = n - 1$, $C_2$, $C_3$, …, $C_{n-1}$; each challenge is revealed only once the previous one has been answered.]
$C_0, C_1, \dots, C_{n-1}$ are unpredictable challenges:
1. Need to know $X_{C_j}$ to learn $C_{j+1}$.
2. Need to answer all challenges to complete the evaluation.
Useful to abstract this!

Abstraction: Round Game
Abstract the second phase: instead of $H$-dependent challenges, the challenges are random!
For every round $j = 0, \dots, n - 1$:
• The challenger sends a random $C_j \leftarrow \{0, 1, \dots, n - 1\}$.
• The adversary, starting from $X_0$, answers with $X_{C_j}$.
Adversary's goal: reduce its own CMC for answering all challenges!
CMC = Cumulative Memory Complexity = $\sum_{t=0}^{T} \mathrm{Memory}(t)$

Round game – Naïve strategy
Init:
    Mem[0] ← $X_0$
    for $i = 1, \dots, n - 1$ do Mem[$i$] ← $H$(Mem[$i - 1$])
For round $j = 0$ to $n - 1$, upon challenge $C_j$:
    return Mem[$C_j$]
Memory is $\Theta(n \times w)$ (remember it!) over $\Theta(n)$ time steps, so CMC = $\Omega(n^2 w)$.
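The naïve strategy's cost can be tallied with a short Python sketch. The accounting is an assumption made for illustration: one call to $H$ or one table lookup counts as one time step, and memory is measured as (number of stored values) × $w$ bits. The function name is hypothetical.

```python
def naive_strategy_cmc(n: int, w: int) -> int:
    """CMC (in bit-steps) of storing the whole chain and answering by lookup."""
    cmc = 0
    stored = 1  # Mem[0] = X_0 is given
    # Init phase: each step computes and keeps one more value.
    for _ in range(n - 1):
        stored += 1
        cmc += stored * w
    # Challenge phase: n lookups, each made while holding all n values.
    for _ in range(n):
        cmc += stored * w
    return cmc


n, w = 1024, 8
print(naive_strategy_cmc(n, w) / (n * n * w))
```

The printed ratio stays a constant as $n$ grows, matching CMC = $\Theta(n^2 w)$ for this strategy.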

Round game – Memory-less strategy
For round $j = 0$ to $n - 1$, upon challenge $C_j$:
    $X = X_0$
    for $i = 1, \dots, C_j$ do $X \leftarrow H(X)$
    return $X$ (which equals $X_{C_j}$)
Memory is a single value, but the expected time per challenge is $\Theta(n)$, so CMC = $\Omega(n^2 w)$ again!
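A matching Python sketch for the memory-less strategy, under the same illustrative accounting (one $H$ call per time step, one stored value of $w$ bits held throughout). The function name and the sampled challenges are assumptions.

```python
import random
from typing import Sequence


def memoryless_strategy_cmc(challenges: Sequence[int], w: int) -> int:
    """CMC (in bit-steps) of recomputing X_{C_j} from X_0 for every challenge."""
    cmc = 0
    for c in challenges:
        steps = max(c, 1)  # C_j calls to H; charge at least one step per round
        cmc += steps * w   # a single w-bit value is held during each step
    return cmc


n, w = 1024, 8
rng = random.Random(0)
challenges = [rng.randrange(n) for _ in range(n)]
# For uniform challenges the ratio is typically close to 1/2.
print(memoryless_strategy_cmc(challenges, w) / (n * n * w))
```

Trading memory for recomputation leaves the product of memory and time at the same order, $n^2 w$.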

Previous two strategies are special cases: consistent memory size
• Naïve strategy: CMC = $\Omega(n^2 w)$
• Memory-less strategy: CMC = $\Omega(n^2 w)$
More general strategy: memory consumption can vary a lot, e.g., forget values and re-compute them afterward.
[Figure: memory over time across challenges $C_0, C_1, C_2, \dots$: large memory at the time a challenge is revealed, reduced memory afterward; $T_i$ = time to answer the $i$-th challenge.]
Goal: prove CMC = $\Omega(n^2 w)$!

Memory hardness: intuition
Intuition: answering a challenge fast requires a large state!
• Answer quickly ⇒ large initial state.
• Small initial state ⇒ answer slowly.
[Figure: memory over time across challenges $C_2$, $C_3$, $C_4$.]

Single-shot memory-time trade-off
Simplifying assumption: upon learning challenge $C_j$, the adversary only stores $p$ of the values $X_0, \dots, X_{n-1}$.
[Figure: example with $n = 10$, $p = 3$: the chain $X_0, \dots, X_9$ with the stored values marked; the challenge $C_j$ is at dist = 3 from the closest stored value before it.]
Fact: the average distance from $X_{C_j}$ to the closest stored $X_i$ preceding $X_{C_j}$ is $n/2p$.
So the expected time to answer the challenge is $n/2p$, inversely proportional to |memory|, regardless of parallelism, as the computation of the $X$-values is inherently sequential!
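The $n/2p$ fact can be checked numerically with a small Python sketch. It assumes the $p$ stored values are evenly spaced at indices $0, n/p, 2n/p, \dots$ and that $p$ divides $n$; both are simplifications made here, and the function name is hypothetical.

```python
def average_recompute_distance(n: int, p: int) -> float:
    """Average number of H calls to reach X_c from the closest stored value
    at or before index c, for c uniform in [0, n), with p evenly spaced
    stored values (assumes p divides n)."""
    if p <= 0 or n % p != 0:
        raise ValueError("this sketch assumes p > 0 and p divides n")
    stride = n // p
    # Stored indices are the multiples of stride, so the distance is c mod stride.
    total = sum(c % stride for c in range(n))
    return total / n


n = 2**10
for p in (1, 8, 64):
    print(p, average_recompute_distance(n, p), n / (2 * p))
```

Under these assumptions the average is $n/2p - 1/2$, so memory ($p \cdot w$ bits) times expected answer time stays close to $nw/2$ whatever $p$ is chosen.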

How to translate this intuition into a memory-hardness proof for ROMix? Three technical barriers:
1. Adversary stores arbitrary information, e.g., XOR of $X_i$ values, halves of $X_i$, reconstructing information adaptively on challenges, etc. [ACKKPT16] considered restricted strategies and exhibited round games where general storing strategies can help!
2. Memory variation during computation: the single-shot memory-time trade-off is not enough! [ACKKPT16] only shows CMC = $\Omega\left(\frac{n^2 w}{\log^2(n)}\right)$.
3. $H$-dependent challenges, as opposed to truly random: see the paper!
Focus on 1 and 2.

Roadmap
1. The Scrypt function
2. Optimal memory hardness of Scrypt: model, theorem, and proof approach
3. Conclusions

The parallel random oracle model [Alwen and Serbinenko, STOC '15]
[Figure: initial state $X_0$ → arbitrary computation → new state $S_1$ → arbitrary computation → new state $S_2$ → … → $S_T$ → output $Y = \mathrm{ROMix}(X_0)$, with queries to $H$ at each step.]
At each step: the adversary asks one batch of parallel $H$ queries and performs unbounded computation.
Goal of the adversary: minimize CMC = $\sum_{i=1}^{T} |S_i|$

Main Theorem. For any adversary $A$ evaluating ROMix,
$\mathrm{CMC}(A) \ge \frac{1}{25} \cdot n^2 \cdot (w - 4 \cdot \log(n))$
with overwhelming probability over the choice of $H$.
• The $4\log(n)$ loss is inherent in the proof.
• $\Omega(n^2 w)$ is clearly best possible for any construction making $n$ queries to $H$. Naïve strategy: make $n$ calls, remember all outputs.
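To get a feel for the constants, the bound can be evaluated at the example parameters $n = 2^{14}$, $w = 1$ KB. This sketch assumes 1 KB = 1024 bytes = 8192 bits and a base-2 logarithm; the function name is hypothetical.

```python
import math


def romix_cmc_lower_bound(n: int, w_bits: int) -> float:
    """Evaluate (1/25) * n^2 * (w - 4 * log2(n)), in bit-steps."""
    return n * n * (w_bits - 4 * math.log2(n)) / 25


n = 2**14
w_bits = 8 * 1024               # assumed: w = 1 KB = 8192 bits
lower = romix_cmc_lower_bound(n, w_bits)
naive_order = n * n * w_bits    # order of the naive strategy's CMC
print(f"lower bound: {lower:.3e} bit-steps")
print(f"fraction of n^2 * w: {lower / naive_order:.4f}")
```

At these parameters the $4\log(n)$ term is 56 bits out of 8192, so the bound is dominated by the $1/25$ constant.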

Proof strategy: step 1
Memory-time trade-off ⇒ lower bound on memory.
[Figure: memory over time between challenges $C_i$ and $C_{i+1}$. Green (memory usage at this step) is inversely proportional to orange ($T_i$ = time to answer the $i$-th challenge); their product is $\ge \Omega(nw)$.]
The memory-time trade-off known for adversaries storing $X$-values holds true even if the adversary stores arbitrary information!

Single-shot memory-time trade-off
[Figure: the chain $X_0, X_1, \dots, X_{n-1}$ computed from the input; given an input state and a challenge $C$, the adversary $A^H(\text{state}, C)$ takes $T$ steps to recover $X_C$.]
• state: arbitrary computation on $H$-outputs, e.g., pre-computation of $H$'s entries, XOR of $\{X_i\}$ values, halves of $X_i$.
• Goal: lower bound |state| as a function of $T$ and $n$.
• [ACKKPT16]: computation on $H$-outputs can help in some round games.
• [This result]: computation on $H$-outputs cannot help for Scrypt!

Single-shot memory-time trade-off
[Figure: $A^H(\text{state}, C)$ takes $T$ steps to recover $X_C$ from the chain $X_0, \dots, X_{n-1}$.]
Lemma. For all $A$, for most $H$, if |state| ≈ $pw$ bits, then $\Pr_C\left[T > \frac{n}{2p}\right] > \frac{1}{2}$.

Lemma. For all $A$, for most $H$, if |state| ≈ $pw$ bits, then $\Pr_C\left[T > \frac{n}{2p}\right] > \frac{1}{2}$.
Proof idea:
1. Suppose the adversary $A^H(\text{state}, C)$ answers too fast for most challenges $C$.
2. Then $A^H(\text{state}, C)$ can output or query many $X_i$ values w/o querying $H$ first.
3. So the oracle $H$ can be compressed using the state.
4. This cannot be true for too many $H$: a random oracle is incompressible [Dwork, Naor and Wee, Crypto '05], [Alwen and Serbinenko, STOC '15].

Proof strategy: step 2
Technical barriers:
1. Adversary stores arbitrary information: handled by the single-shot memory-time trade-off for arbitrary adversaries.
2. Memory variation during computation: generalize the single-shot memory-time trade-off to an optimal CMC lower bound for the round game.
