 
Differential Power Analysis (DPA) With Key Ranking and Enumeration
Martijn Stam
COINS Winterschool in Finse, May 2019
The Rise of Side-Channels: The Ideal World

Modern Cryptology: From Katz and Lindell’s Classic Textbook
Three Principles
1 Formal Definitions: “giving a clear description of what threats are in scope and what security guarantees are desired”
2 Precise Assumptions: “that are simpler to state, since [they] are easier to study and (potentially) refute”
3 Proofs of Security: “that a construction satisfies a definition under certain specified assumptions”
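FDH-RSA Cryptosystem: Black-box perspective of chosen-message attacks
Key generation: $(N, d, e) \leftarrow_{\$} \mathsf{Kg}$
The adversary $\mathcal{A}$ receives $(N, e)$.
Signing oracle: on query $M$, return $S \leftarrow H(M)^d \bmod N$.
The adversary outputs a forgery $(\hat{M}, \hat{S})$.
$\mathsf{Adv}^{\text{euf-cma}}_{\text{FDH-RSA}}(\mathcal{A}) = \Pr\left[\hat{S}^e \bmod N = H(\hat{M}) \text{ and } \hat{M} \text{ fresh}\right]$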
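The black-box game on this slide can be sketched in a few lines of Python. This is only a toy illustration: the primes 61 and 53, the exponent 17, the use of SHA-256 reduced modulo N as a stand-in for a full-domain hash, and the names `sign_oracle` and `forgery_wins` are all assumptions made for the example, not part of the lecture.

```python
import hashlib

# Toy parameters (assumption): far too small for any real use.
P, Q = 61, 53
N = P * Q                            # 3233
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, d = e^{-1} mod phi(N)

def H(msg: bytes) -> int:
    """Stand-in for a full-domain hash: SHA-256 reduced modulo N."""
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % N

queried = set()                      # messages the adversary has asked to be signed

def sign_oracle(msg: bytes) -> int:
    """Signing oracle: S <- H(M)^d mod N."""
    queried.add(msg)
    return pow(H(msg), D, N)

def forgery_wins(msg: bytes, sig: int) -> bool:
    """Win condition: the signature verifies and the message is fresh."""
    return pow(sig, E, N) == H(msg) and msg not in queried

# Replaying an oracle answer verifies, but the message is not fresh.
s = sign_oracle(b"hello")
print(pow(s, E, N) == H(b"hello"), forgery_wins(b"hello", s))
```

The adversary only ever sees inputs and outputs of the oracle here; nothing about how the exponentiation is carried out is visible, which is exactly what the side-channel setting changes.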
The Rise of Side-Channels: The Real World

The Rise of Side-Channels: Paul Kocher’s Revolution
https://www.paulkocher.com/
1996 Timing Attacks
1999 Simple and Differential Power Analysis (DPA) (w. Joshua Jaffe & Benjamin Jun)
2016 https://www.youtube.com/watch?v=6lt7ExN6Kw4
Power Analysis
Measuring power consumption over time allows (relatively) easy recovery of secret keys
SPA: Simple Power Analysis. A Simple Attack Against Unprotected RSA
What is SPA
SPA exploits data-dependent differences in power consumption of a single operation to recover secret information.
Target operation: $S \leftarrow H(m)^d \bmod N$
Exponentiation routine:
  s ← 1, x ← H(m)
  while d > 0
    if d odd then
      s ← s · x mod N
    x ← x^2 mod N
    d ← ⌊d/2⌋
Simple Attack
Assume you can tell multiplications and squarings apart.
So you observe something like SMSSMSSM
Corresponds to exponent $(10101)_2 = 21$
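As a minimal sketch of the attack idea, the loop on this slide can be instrumented to emit one letter per operation, and the exponent read back from that string alone. The function names are hypothetical, and the "trace" here is an idealised sequence of M and S symbols rather than a measured power trace. With this right-to-left loop the multiplication precedes the squaring in each iteration, so d = 21 yields MSSMSSMS; a left-to-right variant would produce the SMSSMSSM order, and because 10101 is a palindrome both read as 21.

```python
def modexp_with_trace(x, d, n):
    """Right-to-left square-and-multiply, recording M (multiply) and S (square)."""
    s, ops = 1, []
    while d > 0:
        if d & 1:                 # d odd: multiply the accumulator
            s = (s * x) % n
            ops.append("M")
        x = (x * x) % n           # always square
        ops.append("S")
        d >>= 1
    return s, "".join(ops)

def exponent_from_trace(ops):
    """Recover d from the operation string; bits come out least significant first."""
    d, bit, i = 0, 0, 0
    while i < len(ops):
        if ops[i] == "M":         # a multiplication means this exponent bit is 1
            d |= 1 << bit
            i += 1                # step over the M; its squaring follows
        i += 1                    # step over the S that ends the iteration
        bit += 1
    return d

result, ops = modexp_with_trace(7, 21, 3233)
print(ops, exponent_from_trace(ops))
```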
DPA: Differential Power Analysis. The workhorse of side-channel attacks
What is DPA
DPA exploits data-dependent correlation in power consumption over multiple, related operations to recover secret information.
Power of DPA
Any unprotected implementation will eventually be susceptible.
Countermeasures
All implementations will need protection against side channels.
Power Analysis Attacks: Stefan Mangard, Elisabeth Oswald, and Thomas Popp’s Classic
Power Analysis Attacks: Revealing the Secrets of Smart Cards
“first comprehensive treatment of power analysis attacks and countermeasures”
Aimed at the practitioner
From 2007 ⇒ no modern ideas and theory
Outline
1 How Differential Power Attacks Work
  Our Setting
  A Typical Pipeline for Key Recovery
  Profiled Attack Example
2 Key Enumeration and Ranking
  Enumeration
  Ranking
3 Conclusion
  Want to Learn More?
How Differential Power Attacks Work: Our Setting

Modern Cryptology: Black-box Blockciphers
What is a Blockcipher
A blockcipher E is a family of keyed permutations
$E : \{0,1\}^k \times \{0,1\}^n \to \{0,1\}^n$
where k is the key length and n the block length.
Blockcipher Usage
Use a mode of operation like GCM to create an encryption scheme.
The GCM security proof assumes the blockcipher E is a “PRP”.
So E is treated as a black box.
What happens if you can see it “work”?
Modern Cryptology: AES-128
[Figure: round function f on the 4 × 4 byte state (bytes 0 to 15). The steps AK, SB, SR, MC map $w_{i-1} \to x_i \to y_i \to z_i \to w_i$; MC is drawn as $C \leftarrow M \times C$.]
Design
k = 128, n = 128, where 128 = 16 × 8 (16 bytes)
10 rounds of whitened SP network
Non-linearity comes from bytewise S-boxes
Images: TikZ for Cryptographers, Jérémy Jean, www.iacr.org/authors/tikz/
SCALE: A Resource by Dan Page
https://github.com/danpage/scale
Side-Channel Attack Lab. Exercises
Provides a suite of material related to side-channel (and fault) attacks that is low-cost, accessible, relevant, coherent, and effective.
SCALE Data Sets
1 Four platforms: an Atmel atmega328p (an AVR) plus three NXP ARM Cortex-M processors
2 Implementation uses an 8-bit datapath and look-up tables for the S-box and xtime operations (but code not known)
3 2 × 1000 traces of AES-128 each (known vs. unknown key)
4 Traces acquired using a Picoscope 2206B, using triggers for alignment
Plotting a Trace: SCALE’s AES-128 on an Atmel
[Figure: a full trace, power against sample index]
k = 2B7E151628AED2A6ABF7158809CF4F3C
Total of 132,292 points
You can see a pattern repeating roughly 10 times
Finding the Rounds: Using crosscorrelation
[Figure: crosscorrelation of a trace]
Compares how well shifts of the trace match the original:
$c_i = \sum_j a_j \, a_{i+j}$
Leads to a round duration of 12421 points
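The round-finding step can be sketched on synthetic data. The trace below is an assumption: a random 50-point pattern repeated 10 times with a little Gaussian noise, standing in for the 10 AES rounds. The helper names are hypothetical, and subtracting the mean before correlating is a choice made for this sketch rather than something stated on the slide.

```python
import random

def autocorrelation(a, max_lag):
    """c_i = sum_j a_j * a_{i+j} for i = 0..max_lag, after removing the mean."""
    mean = sum(a) / len(a)
    a = [v - mean for v in a]
    return [sum(a[j] * a[i + j] for j in range(len(a) - i))
            for i in range(max_lag + 1)]

def estimate_period(a, min_lag=1):
    """Return the non-zero lag at which the shifted trace best matches itself."""
    c = autocorrelation(a, len(a) // 2)
    return max(range(min_lag, len(c)), key=lambda i: c[i])

def synthetic_trace(period=50, rounds=10, noise=0.1, seed=1):
    """Hypothetical stand-in for a real trace: one pattern repeated `rounds` times."""
    rng = random.Random(seed)
    pattern = [rng.gauss(0.0, 1.0) for _ in range(period)]
    return [pattern[t % period] + rng.gauss(0.0, noise)
            for t in range(period * rounds)]

print(estimate_period(synthetic_trace()))
```

On a measured trace neighbouring samples are strongly correlated, so `min_lag` would likely need to be raised well above 1 to skip the broad peak around lag 0.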
Plotting the Rounds Jointly
[Figure: two panels, each superimposing two rounds over one round duration]
Left: Rounds 1 and 2 superimposed. Round 1 is building up power.
Right: Rounds 5 and 6 superimposed. Peaks and jittery areas match well.
Plotting a Trace: SCALE’s AES-128 on an Atmel
[Figure: 3rd round close up, samples 26000 to 38000]
1 Some peaks, some jitter
2 Hard to really discern much of interest...
Signal versus Noise: What determines the power consumption?
[Figure: 3rd round close up, samples 26000 to 38000]
Engineer’s Perspective (MOP, Ch. 4)
$P_{\text{total}} = P_{\text{op}} + P_{\text{data}} + P_{\text{el.noise}} + P_{\text{const}}$
$P_{\text{op}}$: depends on the operation
$P_{\text{data}}$: depends on the data
$P_{\text{el.noise}}$: electrical noise
$P_{\text{const}}$: constant base
Engineer’s Perspective (MOP, Ch. 4), regrouping the operation- and data-dependent parts:
$P_{\text{op}} + P_{\text{data}} = P_{\text{exp}} + P_{\text{sw.noise}}$
$P_{\text{op}}$: depends on the operation
$P_{\text{data}}$: depends on the data
$P_{\text{exp}}$: exploitable signal
$P_{\text{sw.noise}}$: switching noise
Engineer’s Perspective (MOP, Ch. 4), after substitution:
$P_{\text{total}} = P_{\text{exp}} + P_{\text{sw.noise}} + P_{\text{el.noise}} + P_{\text{const}}$
$P_{\text{exp}}$: exploitable signal
$P_{\text{sw.noise}}$: switching noise
$P_{\text{el.noise}}$: electrical noise
$P_{\text{const}}$: constant base
Theoretician’s Perspective
$P_{\text{total}} = f(\text{data}) + \mathcal{N}(0, \sigma)$
$f(\text{data})$ mainly models $P_{\text{exp}}$; the function $f$ incorporates $P_{\text{op}}$ and $P_{\text{const}}$
$\sigma$ depends on $P_{\text{sw.noise}}$ and $P_{\text{el.noise}}$
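A minimal simulation of the theoretician's model may help fix ideas. Everything specific here is an assumption: the leakage function is taken to be an affine function of the Hamming weight of a byte, σ is read as a standard deviation, and the constants 0.3, 0.1 and 0.5 are arbitrary. In a real attack neither f nor σ is known.

```python
import random

def hamming_weight(v):
    """Number of set bits in v."""
    return bin(v).count("1")

def f(data, base=0.3, scale=0.1):
    """Assumed leakage function: constant base plus a Hamming-weight term."""
    return base + scale * hamming_weight(data)

def measure(data, sigma, rng):
    """One noisy observation: P_total = f(data) + N(0, sigma)."""
    return f(data) + rng.gauss(0.0, sigma)

def estimate_f(data, sigma=0.5, repeats=2000, seed=0):
    """Average repeated measurements of the same data to suppress the noise."""
    rng = random.Random(seed)
    return sum(measure(data, sigma, rng) for _ in range(repeats)) / repeats

for byte in (0x00, 0x0F, 0xFF):
    print(hex(byte), round(f(byte), 3), round(estimate_f(byte), 3))
```

A single measurement is dominated by noise at this σ, while the average over many measurements of the same data settles near f(data); this is why attacks in this model combine many traces.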
Some Caveats
1 Which operations are performed on which registers can be relevant
2 Looking at multiple points might lead to multivariate dependencies
3 Sometimes noise levels ($\sigma$) are data-dependent
4 The function $f$ and noise level $\sigma$ are unknown
Atmel AES, based on 1000 traces, assuming no branches in the execution
[Figure: two panels plotted against sample index]
Left: Pointwise sample mean: $P_{\text{const}} + P_{\text{op}}$
Right: Pointwise sample variance: $P_{\text{data}} + P_{\text{el.noise}}$
Both $P_{\text{exp}}$ and $P_{\text{sw.noise}}$ depend on your target...
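The pointwise mean and variance can be sketched on simulated traces. The simulation is an assumption made for illustration: 1000 short traces share a fixed operation-dependent baseline, Gaussian noise is added everywhere, and a Hamming-weight term of a random byte is added at one hypothetical sample index (`leak_at`).

```python
import random
from statistics import fmean, variance

def simulate_traces(n_traces=1000, n_samples=20, leak_at=7, seed=0):
    """Simulated traces: fixed baseline + noise, data-dependent only at `leak_at`."""
    rng = random.Random(seed)
    base = [0.2 * ((t % 5) - 2) for t in range(n_samples)]   # plays P_const + P_op
    traces = []
    for _ in range(n_traces):
        data = rng.randrange(256)                            # fresh random byte
        trace = [b + rng.gauss(0.0, 0.05) for b in base]     # electrical noise
        trace[leak_at] += 0.1 * bin(data).count("1")         # data-dependent part
        traces.append(trace)
    return traces

def pointwise_stats(traces):
    """Sample mean and sample variance at every time index across all traces."""
    columns = list(zip(*traces))
    return [fmean(c) for c in columns], [variance(c) for c in columns]

means, variances = pointwise_stats(simulate_traces())
print(max(range(len(variances)), key=variances.__getitem__))
```

The mean follows the baseline shared by all traces, while the variance stands out at the index where the processed data differs from trace to trace, mirroring the left and right panels.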
Signal versus Noise: Intermediate values and target selection
[Figure: AES round function f on the 4 × 4 byte state, with steps AK, SB, SR, MC mapping $w_{i-1} \to x_i \to y_i \to z_i \to w_i$, next to the 3rd round close up of the trace]
The Locality of Leakage
Intermediate value: the (few) byte(s) involved in a specific operation
Locality assumption: leakage primarily depends on the intermediate value operated upon
The Locality of Leakage
Intermediate value: the (few) byte(s) involved in a specific operation
Locality assumption: leakage primarily depends on the intermediate value operated upon
Target intermediate value: captured by $P_{\text{exp}}$
The “rest”: contributes to $P_{\text{sw.noise}}$
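As one concrete instance of an intermediate value, the sketch below computes a first-round S-box output byte, that is, AK followed by SB applied to a single plaintext byte and key byte. Choosing this particular target is an assumption made for the example (the slide does not fix one), and the function names are hypothetical. The S-box is generated from its standard definition (inverse in GF(2^8) followed by the affine map) rather than typed in as a table.

```python
def gf_mul(a, b):
    """Multiply in GF(2^8) modulo the AES polynomial x^8 + x^4 + x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def gf_inv(a):
    """Inverse in GF(2^8) computed as a^254; maps 0 to 0 as AES requires."""
    r = 1
    for _ in range(254):
        r = gf_mul(r, a)
    return r

def rotl8(x, s):
    """Rotate an 8-bit value left by s positions."""
    return ((x << s) | (x >> (8 - s))) & 0xFF

def sbox_entry(a):
    """AES S-box: field inverse followed by the affine transformation."""
    x = gf_inv(a)
    return x ^ rotl8(x, 1) ^ rotl8(x, 2) ^ rotl8(x, 3) ^ rotl8(x, 4) ^ 0x63

SBOX = [sbox_entry(a) for a in range(256)]

def target_value(plaintext_byte, key_byte):
    """Assumed target: the byte after AddRoundKey and SubBytes in round 1."""
    return SBOX[plaintext_byte ^ key_byte]

# First key byte of the known SCALE key (2B 7E 15 ...), with an arbitrary plaintext byte.
print(hex(target_value(0x32, 0x2B)))
```

Under the locality assumption, the leakage at the sample where this byte is handled depends mainly on `target_value(p, k)`, while the other fifteen bytes of the state contribute to the switching noise.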