Guillaume VINET 19th May 2019 1 becoming more mature the art - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Guillaume VINET 19th May 2019

1

slide-2
SLIDE 2
  • White-Box Cryptography (WBC) security analyses are becoming more mature
  • Tracing the binary execution is now part of the state of the art
  • Tracing can be very powerful if the WBC code is not adequately protected
  • Nowadays, a state-of-the-art analysis requires one to:
      • Optimize data tracing to overcome data size, disk space and heavy processing issues
      • Focus the analyses on specific instructions or parts of the code
      • Cover a large space of different attacks

2

slide-3
SLIDE 3

3

slide-4
SLIDE 4

Native binary file (assembly code): NO SOURCE FILES! It is not possible to add a printf or comment out a line to see what happens.

4

slide-5
SLIDE 5

5

slide-6
SLIDE 6

Native binary file

Algorithm-level obfuscation

  • AES trace acquisition with visible rounds
  • Another AES trace acquisition, but with no visible pattern

6

slide-7
SLIDE 7

Native binary file

Program-level obfuscation

Algorithm-level obfuscation

  • Control flow
  • Data obfuscation
  • Preventive transformations

Illustration (control flow flattening): http://tigress.cs.arizona.edu/transformPage/docs/flatten/index.html

7

slide-8
SLIDE 8

Native binary file

Program-level obfuscation

Algorithm-level obfuscation

8

slide-9
SLIDE 9

9

slide-10
SLIDE 10

Input Output Observe Modify

White-Box

10

slide-11
SLIDE 11

Option 1: Reverse engineering

Assets:

  • White-Box algorithm recovery (Industrial Property)

Drawbacks:

  • Elapsed time: from several weeks to several months (if there are good protections)
  • Expertise: multiple experts (reverse engineering & cryptography)

11

slide-12
SLIDE 12

  • Unboxing the White-Box - Practical attacks against Obfuscated Ciphers. Eloi Sanfelix, Cristofaro Mune and Job de Haas. Black Hat 2015
  • Differential Computation Analysis: Hiding your White Box Designs is Not Enough. Joppe W. Bos, Charles Hubain, Wil Michiels and Philippe Teuwen. CHES 2016

Dynamic Binary Instrumentation Tool to generate acquisition trace

12

slide-13
SLIDE 13

Assets:

  • Elapsed time: from several hours to several weeks
  • Expertise:
  • Expert: cryptography

Drawbacks (coming from binary obfuscation):

  • White-Box algorithm not recovered
  • Big trace size
  • Time of acquisition

13

slide-14
SLIDE 14
  • We trace the White-Box directly, without reverse engineering.
  • We will obtain large traces.

14

slide-15
SLIDE 15

15

slide-16
SLIDE 16

Memory access monitoring:

  • Read/Write value
  • Program Counter
  • Kind of operation

Included in Side Channel Marvels: https://github.com/SideChannelMarvels

16

slide-17
SLIDE 17

instruction Reading Writing

Illustration: https://www.sstic.org/media/SSTIC2016/SSTIC-actes/design_de_cryptographie_white-box_et_a_la_fin_c_es/SSTIC2016-Slides-design_de_cryptographie_white-box_et_a_la_fin_c_est_kerckhoffs_qui_gagne-hubain_teuwen_1.pdf

17

slide-18
SLIDE 18

Assets:

  • Binary can be traced directly:

valgrind --tool=tracergrind --output=ls.trace ls

  • --tool=tracergrind: valgrind tracer
  • --output=ls.trace: trace filename
  • ls: binary to trace
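
As a minimal sketch of how the command above could be scripted for repeated acquisitions, the helper below only builds the argument list shown on the slide. The function name and the idea of passing extra arguments to the traced binary are assumptions made for illustration; actually running it requires Valgrind with the TracerGrind plugin installed.

```python
import shlex


def tracergrind_command(binary, trace_file, args=()):
    """Build the argv list for the TracerGrind invocation shown on the slide.

    `args` (arguments forwarded to the traced binary) is an illustrative
    addition, not something the slide specifies.
    """
    return ["valgrind", "--tool=tracergrind", f"--output={trace_file}", binary, *args]


cmd = tracergrind_command("ls", "ls.trace")
print(shlex.join(cmd))

# To really acquire the trace (Valgrind + TracerGrind must be installed):
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Keeping the command as a list rather than a single string avoids shell quoting issues when the traced binary takes arbitrary input messages as arguments.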

18

slide-19
SLIDE 19

Assets:

  • Executables can be traced directly: no reverse engineering skill required
  • Open Source

Drawbacks:

  • Only memory access tracing
  • Filtering based only on PC address/Memory address range
  • To trace a library, a launcher must be created

19

slide-20
SLIDE 20

Rainbow

  • Memory Access monitoring:
  • Read/Write value
  • Program Counter
  • Kind of operation
  • Register monitoring

Unicorn

Illustration: https://www.ledger.com/2019/02/26/introducing-rainbow-donjons-side-channel-analysis-simulation-tool/

20

slide-21
SLIDE 21

Rainbow

Assets:

  • Open source
  • Use the powerful Unicorn Engine…

21

slide-22
SLIDE 22

Rainbow

Source: https://github.com/Ledger-Donjon/rainbow/blob/master/examples/ledger_ctf2/ripped.py

Call to external libraries must be implemented

22

slide-23
SLIDE 23

Rainbow

Assets:

  • Open source
  • Use the powerful Unicorn Engine…

Drawbacks:

  • … that might need reverse engineering
  • Executable/Library must be instrumented by a script

  • The Unicorn emulation is slower than Valgrind/PIN

23

slide-24
SLIDE 24

X86, x86_64, ARM support included in

  • Memory Access monitoring:
  • Read/Write value
  • Program Counter
  • Kind of operation
  • Register monitoring

24

slide-25
SLIDE 25

25

slide-26
SLIDE 26

Assets:

  • Faster than Tracer and Rainbow
  • Executables can be traced directly
  • A lot of filtering options

Drawbacks:

  • Not open source
  • To trace a library, a launcher must be created

26

slide-27
SLIDE 27

27

slide-28
SLIDE 28
  • We trace the White-Box directly, without reverse engineering.
  • Configuration:
      • CPU i7-7560U, 2.4GHz dual core
      • 16 GB of RAM (we do not need that much)
      • SSD NVMe
  • We can only use Side Channel Marvels Tracer or esTracer
  • We will obtain large traces

28

slide-29
SLIDE 29
  • We trace the White-Box directly, without reverse engineering.
  • We can only use Side Channel Marvels Tracer or esTracer
  • We will obtain large traces

29

slide-30
SLIDE 30

Input:

  • message to sign m,
  • elliptic curve parameters p, a, b, n, G = (Gx, Gy),
  • secret key d.

Output:

  • signature (r, s)

  1. Randomly generate the secret scalar k in [1, n - 1]
  2. Compute the scalar multiplication: Q = (Qx, Qy) = [k].G
  3. Compute r = Qx mod n
  4. Compute s = (r×d + Hash(m)) × k^(-1) mod n
  5. Return (r, s)
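
The signature steps above can be sketched in a few lines of Python. This is an illustrative toy, not the white-box under analysis: it assumes a tiny textbook curve (y^2 = x^3 + 2x + 2 over F_17, base point G = (5, 1) of order 19), SHA-256 reduced mod n as the hash, and it adds the usual retry when r or s is zero, which the slide does not show. A verification routine is included only to check the result.

```python
import hashlib
import secrets

# Toy parameters (assumption for illustration): y^2 = x^3 + 2x + 2 over F_17,
# base point G of prime order n = 19. Far too small for real use.
P, A, B = 17, 2, 2
G = (5, 1)
N = 19


def add(p1, p2):
    """Add two curve points; None stands for the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * pow((2 * y1) % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return x3, (lam * (x1 - x3) - y1) % P


def mul(k, point):
    """Scalar multiplication [k].point using double-and-add."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def hash_mod_n(m):
    return int.from_bytes(hashlib.sha256(m).digest(), "big") % N


def sign(m, d):
    while True:
        k = secrets.randbelow(N - 1) + 1            # step 1: k in [1, n-1]
        qx, _ = mul(k, G)                           # step 2: Q = [k].G
        r = qx % N                                  # step 3
        s = (r * d + hash_mod_n(m)) * pow(k, -1, N) % N   # step 4: r×d is the attacked product
        if r and s:                                 # retry on degenerate values
            return r, s                             # step 5


def verify(m, sig, pub):
    r, s = sig
    w = pow(s, -1, N)
    pt = add(mul(hash_mod_n(m) * w % N, G), mul(r * w % N, pub))
    return pt is not None and pt[0] % N == r
```

For example, with the hypothetical key d = 7, `verify(b"msg", sign(b"msg", 7), mul(7, G))` returns True. The product r×d in step 4 multiplies a public value by the secret key, which is why it is a natural target for the analysis that follows.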

30

slide-31
SLIDE 31


31

slide-32
SLIDE 32

32

slide-33
SLIDE 33

r3 r2 r1 r0 × d3 d2 d1 d0 = c7 c6 c5 c4 c3 c2 c1 c0 (example with 32-bit r and d, split into bytes)

  • s = (r×d + Hash(m)) × k^(-1) mod n
  • r is known

33

slide-34
SLIDE 34

r3 r2 r1 r0 × d3 d2 d1 d0 = c7 c6 c5 c4 c3 c2 c1 c0 (example with 32-bit r and d, split into bytes)

  • s = (r×d + Hash(m)) × k^(-1) mod n
  • r is known
  • Guess d0 and correlate 8 bits of information
  • Intermediate value is:
      • c0 = (r0 × d0) mod 2^8

34

slide-35
SLIDE 35

r3 r2 r1 r0 × d3 d2 d1 d0 = c7 c6 c5 c4 c3 c2 c1 c0 (example with 32-bit r and d, split into bytes)

  • s = (r×d + Hash(m)) × k^(-1) mod n
  • r is known
  • Guess d1 and correlate 16 bits using the best candidates from d0
  • Intermediate value is:
      • c1 c0 = (r1 r0 × d1 d0) mod 2^16

35
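
The byte-by-byte recovery described above can be sketched on simulated data. Everything below is an assumption made for illustration: the traces are synthetic lists of 32-bit register values in which one column holds the low 32 bits of r×d, the secret is a made-up constant, and the distinguisher simply counts exact matches between the predicted and observed low bits (a plain value model) instead of computing a statistical correlation.

```python
import random

rng = random.Random(1)
SECRET_D = 0xC0FFEE42            # hypothetical 32-bit secret d
N_TRACES, N_COLS, LEAK_COL = 40, 6, 3


def acquire(r):
    """Simulated trace: N_COLS register values, one of which holds r*d mod 2^32."""
    trace = [rng.getrandbits(32) for _ in range(N_COLS)]
    trace[LEAK_COL] = (r * SECRET_D) & 0xFFFFFFFF
    return trace


rs = [rng.getrandbits(32) for _ in range(N_TRACES)]   # known values of r
traces = [acquire(r) for r in rs]


def score(guess, nbits):
    """Best column-wise count of traces whose low `nbits` match the prediction."""
    mask = (1 << nbits) - 1
    preds = [(r * guess) & mask for r in rs]          # (r × guess) mod 2^nbits
    return max(
        sum((t[col] & mask) == p for t, p in zip(traces, preds))
        for col in range(N_COLS)
    )


def recover(keep=5):
    """Guess d0, then d1, ... keeping the `keep` best candidates at each step."""
    candidates = [0]
    for i in range(4):
        nbits = 8 * (i + 1)
        scored = sorted(
            (
                (score((g << (8 * i)) | prefix, nbits), (g << (8 * i)) | prefix)
                for prefix in candidates
                for g in range(256)
            ),
            reverse=True,
        )
        top = keep if i < 3 else 1                    # single best guess for the last byte
        candidates = [cand for _, cand in scored[:top]]
    return candidates[0]


recovered = recover()
print(hex(recovered))
```

Each step only explores 256 guesses per surviving candidate, so a 32-bit word costs a few thousand hypotheses instead of 2^32.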
slide-36
SLIDE 36

36

slide-37
SLIDE 37

37

slide-38
SLIDE 38

38

slide-39
SLIDE 39

39

slide-40
SLIDE 40

What must be traced?

  • Only the binary itself, not external system libraries

How to know where to trace?

  • Trace memory accesses or registers
  • Display them to see distinguishable patterns
  • Tracing the Program Counter (PC), the address of the executed instruction, is a good start

40

slide-41
SLIDE 41

41

slide-42
SLIDE 42

42

  • Double & Add: visible, but not our use case
  • Our use case: r × d
  • 33 million points (64 bits each) for the PC register alone

slide-43
SLIDE 43

43

slide-44
SLIDE 44

~4,510 MB ~4,446 MB ~4,509 MB ~4,667 MB ~4,287 MB ~4,287 MB ~4,805 MB ~4,661 MB ~4,437 MB ~4,677 MB

44

slide-45
SLIDE 45

~4,510 MB ~4,446 MB ~4,509 MB ~4,667 MB ~4,287 MB ~4,287 MB ~4,805 MB ~4,661 MB ~4,437 MB ~4,677 MB

2 problems

  • Different trace sizes
  • Big trace size

45

slide-46
SLIDE 46

46

slide-47
SLIDE 47

47

slide-48
SLIDE 48

Problem 1 - Different trace size

  • Why?
      • ECDSA algorithm
  • How to defeat it?
      • Remove the variant PC values

48

slide-49
SLIDE 49

Problem 2 - Big trace size

  • Why?
      • Invariant registers
  • How to defeat it?
      • Step 1: remove identical columns
      • Step 2: remove duplicated columns
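
A minimal sketch of this two-step reduction, under stated assumptions: traces are stored as equal-length rows (one row per acquisition, so Problem 1 must already be solved), "identical columns" is read here as columns holding the same value in every trace, and "duplicated columns" as columns that repeat an earlier column exactly. The function name is illustrative.

```python
def reduce_columns(traces):
    """Drop constant columns (step 1) and repeated columns (step 2).

    traces: list of equal-length rows, one row per acquisition.
    Returns the kept column indices and the reduced traces.
    """
    columns = list(zip(*traces))
    kept, seen = [], set()
    for idx, col in enumerate(columns):
        if len(set(col)) == 1:      # step 1: same value in every trace
            continue
        if col in seen:             # step 2: exact copy of an earlier column
            continue
        seen.add(col)
        kept.append(idx)
    reduced = [[row[i] for i in kept] for row in traces]
    return kept, reduced


traces = [
    [1, 7, 7, 3, 7],
    [1, 8, 8, 4, 9],
]
kept, reduced = reduce_columns(traces)
print(kept, reduced)
```

Returning the kept indices makes it possible to map a leakage point found in the reduced trace back to its position in the original acquisition.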

49

slide-50
SLIDE 50

Drawbacks:

  • Post-processing:
      • Problem 1: disk space. We obtain big traces and transform them into small traces.
      • Problem 2: time. We lose a lot of time generating and filtering them.

50

slide-51
SLIDE 51

Drawbacks:

  • Post-processing:
      • Problem 1: disk space. We obtain big traces and transform them into small traces.
      • Problem 2: time. We lose a lot of time generating and filtering them.
  • Pattern Detector & Accurate register tracing

51

slide-52
SLIDE 52

Example of desynchronisation with 2 PC traces

52

slide-53
SLIDE 53

Example of desynchronisation with 2 PC traces

53

slide-54
SLIDE 54

Trig&Act:

  • Trigger: pattern detector
  • Action: start/stop acquisition, stop program

Trig&Act chaining:

  • Trace only first & last rounds
  • Defeat several synchronisations
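
The trigger/action idea can be illustrated with a small sketch over a stream of program counter values. This is not the tool's interface: the function, the rule format (a PC pattern paired with "start", "stop" or "exit") and the addresses are all hypothetical, and the chain is simply consumed in order, one rule at a time.

```python
from collections import deque


def trig_and_act(pc_stream, chain):
    """Apply a chain of (pattern, action) rules, in order, to a PC stream.

    pattern: tuple of consecutive PC values acting as the trigger.
    action:  "start" or "stop" the acquisition, or "exit" to stop the program.
    Returns the PC values acquired while recording was active.
    """
    recording, kept = False, []
    rules = iter(chain)
    rule = next(rules, None)
    window = deque(maxlen=len(rule[0])) if rule else None
    for pc in pc_stream:
        if recording:
            kept.append(pc)
        if rule is None:
            continue
        window.append(pc)
        if tuple(window) == tuple(rule[0]):          # trigger: pattern detected
            action = rule[1]
            if action == "start":
                recording = True
            elif action == "stop":
                recording = False
            else:                                    # "exit"
                break
            rule = next(rules, None)                 # chain to the next rule
            window = deque(maxlen=len(rule[0])) if rule else None
    return kept


chain = [((10, 11), "start"), ((30, 31), "stop")]
run_a = trig_and_act([1, 2, 3, 10, 11, 20, 21, 22, 30, 31, 4, 5], chain)
run_b = trig_and_act([1, 2, 2, 2, 3, 10, 11, 20, 21, 22, 30, 31, 9], chain)
print(run_a, run_b)
```

The two runs start with prefixes of different lengths, yet both yield the same acquired window, which is the synchronisation effect the slide describes. In this sketch the points forming the stop pattern are part of the acquired window.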

54

slide-55
SLIDE 55

cmp al, [rbp+var_2C]

  • No modification in rax, rcx, rdx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15, pc
  • Do not trace this instruction

55

slide-56
SLIDE 56

sub edx, eax

  • Only edx is written; eax and edx are read
  • Useless to acquire rcx, rbx, rsp, rbp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15, pc
  • Acquire only the read registers, the written registers, or both
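
One simple way to approximate "written registers only" is to compare register snapshots taken before and after each instruction and keep what changed. The sketch below assumes such snapshots are available as dictionaries; the function names and values are illustrative, and a register rewritten with the same value would be missed by this approach, whereas a tracer that decodes operands would catch it.

```python
def written_registers(before, after):
    """Return the registers whose value changed across one instruction."""
    return {reg: val for reg, val in after.items() if before.get(reg) != val}


def trace_written(snapshots):
    """Build a trace from consecutive register snapshots, keeping only changes.

    snapshots: list of {register: value} dicts, one per instruction boundary.
    Instructions that change nothing (e.g. a cmp) contribute no point.
    """
    trace = []
    for before, after in zip(snapshots, snapshots[1:]):
        changed = written_registers(before, after)
        trace.extend(changed[reg] for reg in sorted(changed))
    return trace


s0 = {"rax": 5, "rdx": 12, "rcx": 99}
s1 = {"rax": 5, "rdx": 7, "rcx": 99}    # after "sub edx, eax": 12 - 5 = 7
s2 = dict(s1)                           # after a cmp: no register modified
trace = trace_written([s0, s1, s2])
print(trace)
```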

56

slide-57
SLIDE 57

57

slide-58
SLIDE 58
  • Trig&act to get synchronized traces
  • Trace only written registers

58

slide-59
SLIDE 59

~28 MB ~28 MB ~28 MB ~28 MB ~28 MB ~28 MB ~28 MB ~28 MB ~28 MB ~28 MB

59

slide-60
SLIDE 60

60

slide-61
SLIDE 61

61

slide-62
SLIDE 62

62

slide-63
SLIDE 63

63

slide-64
SLIDE 64

64

slide-65
SLIDE 65

~203 MB ~204 MB ~207 MB ~214 MB ~196 MB ~196 MB ~220 MB ~213 MB ~203 MB ~214 MB

65

slide-66
SLIDE 66

66

slide-67
SLIDE 67

67

slide-68
SLIDE 68

68

With trig&act, we can skip the point multiplication (very big). Without it, we would have the same trace size as with Tracer Valgrind.

slide-69
SLIDE 69

69

slide-70
SLIDE 70

70

slide-71
SLIDE 71

71

slide-72
SLIDE 72

We attack a multiplication… so we could focus on the instructions related to it.

For each executed mult instruction:

  • Acquire the read/written registers
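
As a rough sketch of this instruction-level filter, assume each executed instruction is available as a record made of its mnemonic, the registers it read and the registers it wrote. The record layout, the mnemonic set and the function name are hypothetical choices made for illustration.

```python
# x86 multiplication mnemonics considered here (illustrative choice).
MUL_MNEMONICS = {"mul", "imul", "mulx"}


def mul_only(records):
    """Keep the read and written register values of multiplication instructions.

    records: iterable of (mnemonic, read_regs, written_regs) tuples, where the
    last two are {register: value} dicts.
    """
    points = []
    for mnemonic, read, written in records:
        if mnemonic in MUL_MNEMONICS:
            points.extend(read.values())
            points.extend(written.values())
    return points


records = [
    ("mov", {"rbx": 1}, {"rax": 1}),
    ("imul", {"rax": 3, "rcx": 4}, {"rax": 12}),
    ("add", {"rax": 12, "rdx": 1}, {"rax": 13}),
]
points = mul_only(records)
print(points)
```

Only the multiplication contributes points, which is why the resulting traces are orders of magnitude smaller than a full register trace.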

72

slide-73
SLIDE 73

~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB ~0.043 MB

73

slide-74
SLIDE 74

74

slide-75
SLIDE 75

75

slide-76
SLIDE 76
slide-77
SLIDE 77
slide-78
SLIDE 78

r3 r2 r1 r0 × d3 d2 d1 d0 = c7 c6 c5 c4 c3 c2 c1 c0 (example with 32-bit r and d, split into bytes)

  • s = (r×d + Hash(m)) × k^(-1) mod n
  • r is known
  • Guess d0 and correlate 8 bits of information
  • Intermediate value is:
      • c0 = (r0 × d0) mod 2^8

78

slide-79
SLIDE 79

r3 r2 r1 r0 × d3 d2 d1 d0 = c7 c6 c5 c4 c3 c2 c1 c0 (example with 32-bit r and d, split into bytes)

  • s = (r×d + Hash(m)) × k^(-1) mod n
  • r is known
  • Guess d1 and correlate 16 bits using the best candidates from d0
  • Intermediate value is:
      • c1 c0 = (r1 r0 × d1 d0) mod 2^16

79

slide-80
SLIDE 80

80

  • d0 : 5 best guesses
  • d1 : 5 best guesses
  • d2 : 5 best guesses
  • d3 : 1 best guess
slide-81
SLIDE 81

81

  • d3 d2 d1 d0: word0
  • d7 d6 d5 d4: word1
  • d11 d10 d9 d8: word2
  • d15 d14 d13 d12: word3
  • d19 d18 d17 d16: word4
  • d23 d22 d21 d20: word5
slide-82
SLIDE 82

word0 recovery: r7 r6 r5 r4 r3 r2 r1 r0 × d7 d6 d5 d4 d3 d2 d1 d0 → ... c7 c6 c5 c4 c3 c2 c1 c0

82

slide-83
SLIDE 83

word0 recovery: r7 r6 r5 r4 r3 r2 r1 r0 × d7 d6 d5 d4 d3 d2 d1 d0 → c3 c2 c1 c0

word1 recovery: r7 r6 r5 r4 r3 r2 r1 r0 × d7 d6 d5 d4 d3 d2 d1 d0 → c7 c6 c5 c4

If we attack word1 in the same frame, we might correlate word0 with d7 d6 d5 d4

83

slide-84
SLIDE 84

84

word0 recovery: r7 r6 r5 r4 r3 r2 r1 r0 × d7 d6 d5 d4 d3 d2 d1 d0 → c3 c2 c1 c0

word1 recovery: r7 r6 r5 r4 r3 r2 r1 r0 × d7 d6 d5 d4 d3 d2 d1 d0 → c7 c6 c5 c4

slide-85
SLIDE 85

85

  • 500 traces: 407 s & 4.1 MB
  • DCA attack:
      • 48 s
      • Value model
slide-86
SLIDE 86

86

slide-87
SLIDE 87

87

slide-88
SLIDE 88

88

slide-89
SLIDE 89

89

slide-90
SLIDE 90

Tracing a White-Box must be focused on the binary.

Why trace directly without reverse engineering?

  • Fast

But:

  • Large traces
  • Post-processing is required and time-consuming

90

slide-91
SLIDE 91

Strategies to defeat these issues:

  • Focus tracing only on memory accesses or register accesses
  • Use a pattern detector to trace only the interesting areas
  • Depending on the algorithm, focus on specific instructions

In that way, it is possible to obtain small traces that still contain leakage points.

91