Guillaume VINET 19th May 2019 1 stress WBC-based solutions, - - PowerPoint PPT Presentation



SLIDE 1

Guillaume VINET 19th May 2019

SLIDE 2
  • Binary analysis: dynamic fault injection is a powerful way to stress solutions based on white-box cryptography (WBC).
  • Publications in this area remain modest, mostly due to the challenging practical realisation.
  • Registers and memory accesses can be changed at runtime, leading to exploitable faulty computations.
  • Nowadays, a state-of-the-art WBC security analysis must include:
      • static and dynamic fault injections
      • an efficient way to induce dynamic faulty computations: being precise and able to affect a large range of instructions
      • a large range of public fault injection attacks exploiting single or multiple faults


SLIDE 4

Native binary file (assembly code)
NO SOURCE FILES!


SLIDE 7

Native binary file

Cryptographic attacks
  • Differential Fault Analysis (DFA)
  • Safe Error

Security downgrade
  • Defeat integrity mechanisms
  • Defeat algorithm protection, e.g. a stuck mask value transforms a 2nd-order attack into a 1st-order one


SLIDE 11

Assets:
  • Easy to implement

Drawbacks:
  • Speed: how to avoid combinatorial complexity with multiple fault injections?
  • Accuracy: useful for modifying table values, but cannot disturb the execution of operations
  • Anti-fault countermeasures: faults are easily detected

SLIDE 12

https://github.com/SideChannelMarvels

  • Python framework
  • Tree strategy to inject the faults

included in
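The "tree strategy" named on this slide is not detailed in the deck. One plausible reading is a divide-and-conquer search: fault a large region of the table data, classify the output, and split the region only when the fault was too coarse. The sketch below illustrates that idea in Python. `tree_search`, the verdict strings and the toy oracle `toy_classify` are hypothetical names for illustration, not the API of any SideChannelMarvels tool.

```python
def tree_search(lo, hi, classify, min_size=1):
    """Return the ranges [lo, hi) whose fault gives an exploitable output.

    classify(lo, hi) is a user-supplied oracle (hypothetical) that faults the
    whole range and returns "good", "no_effect" or "too_much".
    """
    verdict = classify(lo, hi)
    if verdict == "good":
        return [(lo, hi)]
    if verdict == "no_effect" or hi - lo <= min_size:
        return []  # nothing useful here: prune this branch
    mid = (lo + hi) // 2
    return (tree_search(lo, mid, classify, min_size)
            + tree_search(mid, hi, classify, min_size))


# Toy oracle standing in for "patch the binary, run it, inspect the output".
SENSITIVE = {5, 42, 200}  # made-up offsets of exploitable table bytes


def toy_classify(lo, hi):
    hits = sum(1 for offset in SENSITIVE if lo <= offset < hi)
    if hits == 0:
        return "no_effect"
    return "good" if hits == 1 else "too_much"


found = tree_search(0, 256, toy_classify)
print(found)
```

With this toy oracle the search needs 7 oracle calls to isolate the three exploitable regions, instead of 256 single-byte faults.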


SLIDE 14

Assets:
  • Accuracy:
      • Alter registers, memory or instructions
      • Multiple fault injection to defeat security countermeasures

Drawbacks:
  • Fault model: which fault effects must be implemented?
  • Speed: how to avoid combinatorial complexity with multiple fault injections?

SLIDE 15

Assets:
  • Open source
  • Use the powerful Unicorn Engine…

SLIDE 16

Source: https://www.riscure.com/uploads/2017/09/eu-15-sanfelix-mune-dehaas-unboxing-the-white-box-wp_v1.1.pdf

Know where to recover the ciphertext once the fault has been injected

Calls to external libraries must be implemented/patched

SLIDE 17

Assets:
  • Open source
  • Use the powerful Unicorn Engine…

Drawbacks:
  • … that needs reverse engineering
  • Executable/library must be instrumented by a script
  • The Unicorn emulation is slow


SLIDE 19

Dynamic fault injection
  • Relevant fault models
  • Speed
  • Where?
  • When?
  • Configuration

SLIDE 22

Fault models:
  • Register modification
  • Data flow modification
  • Control flow modification with the Program Counter register
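The three fault models can be illustrated on a toy machine. This is a minimal sketch, not the tool presented in the deck: the two-register instruction set, the `run` function and the fault dictionary layout are all made up for the example. A register fault overwrites a register before the instruction runs, a data flow fault replaces only the operand value the instruction reads, and a control flow fault overwrites the program counter.

```python
PROGRAM = [
    ("load", "r0", 7),     # r0 = 7
    ("load", "r1", 3),     # r1 = 3
    ("add", "r0", "r1"),   # r0 += r1
    ("add", "r0", "r0"),   # r0 += r0
]


def run(program, fault=None):
    """Execute the toy program, applying at most one fault at a given step."""
    regs = {"r0": 0, "r1": 0}
    pc, step = 0, 0
    while pc < len(program):
        op, dst, src = program[pc]
        hit = fault is not None and fault["step"] == step
        step += 1
        if hit and fault["kind"] == "control_flow":
            pc = fault["value"]          # overwrite the program counter
            continue                     # the current instruction is skipped
        if hit and fault["kind"] == "register":
            regs[fault["target"]] = fault["value"]   # persistent register change
        value = regs[src] if isinstance(src, str) else src
        if hit and fault["kind"] == "data_flow":
            value = fault["value"]       # only the value read is disturbed
        if op == "load":
            regs[dst] = value
        elif op == "add":
            regs[dst] = (regs[dst] + value) & 0xFFFFFFFF
        pc += 1
    return regs


reference = run(PROGRAM)
reg_fault = run(PROGRAM, {"step": 2, "kind": "register", "target": "r1", "value": 1})
data_fault = run(PROGRAM, {"step": 2, "kind": "data_flow", "value": 100})
flow_fault = run(PROGRAM, {"step": 2, "kind": "control_flow", "value": 3})
print(reference, reg_fault, data_fault, flow_fault)
```

The register fault leaves `r1` modified after the run, whereas the data flow fault changes the result while `r1` keeps its original value.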

SLIDE 23

Where to fault:
  • Filtering based on:
      • Program Counter
      • Kind of instruction: mov, add, …
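Such filtering can be sketched as a selection over an execution trace. The function `select_targets` and the trace layout (a list of `(pc, mnemonic)` pairs) are assumptions made for this example. The addresses are borrowed from the listing shown later in the deck purely as sample data.

```python
def select_targets(trace, pc_range=None, mnemonics=None):
    """Return the trace indices that pass the PC and instruction-kind filters.

    pc_range is a half-open (start, end) address interval; mnemonics is a set
    of instruction names. A filter set to None is not applied.
    """
    selected = []
    for index, (pc, mnemonic) in enumerate(trace):
        if pc_range is not None and not (pc_range[0] <= pc < pc_range[1]):
            continue
        if mnemonics is not None and mnemonic not in mnemonics:
            continue
        selected.append(index)
    return selected


# Sample trace (illustrative only).
TRACE = [
    (0x4033CE, "cmp"),
    (0x4033D2, "jbe"),
    (0x4033D4, "nop"),
    (0x4033D5, "leave"),
    (0x4033D6, "retn"),
]

by_pc = select_targets(TRACE, pc_range=(0x4033CE, 0x4033D4))
by_kind = select_targets(TRACE, mnemonics={"jbe"})
print(by_pc, by_kind)
```

Each filter removes candidates before any fault is injected, which shrinks the number of runs a campaign needs.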

SLIDE 24

When to fault: pattern detector

SLIDE 26
  • x86, x86_64, ARM support
  • Takes advantage of QEMU speed

included in

  • Dynamic register modification
  • Data flow disturbance
  • Control flow disturbance
  • Multi-fault injection
  • Fault&trace capabilities
  • Filtering and trig&act capabilities


SLIDE 28
  • Attack an AES White-Box implementation
  • Configuration:
      • CPU i7-7560U, 2.4 GHz dual core
      • 16 GB of RAM (much less is needed)
      • SSD NVMe
  • Double fault injection
  • Key recovery from the faulty outputs with Piret's DFA (with 4 modified bytes in a specific way)
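The phrase "4 modified bytes in a specific way" can be made concrete. A minimal sketch, assuming the standard AES-128 encryption byte order and a single-byte fault injected between the last two MixColumns operations: the faulty ciphertext then differs from the correct one in exactly four bytes, at one of four position sets. The function names are hypothetical, the position sets come from the general DFA literature rather than from this deck, and the faulty values are made up. Only the reference ciphertext is the one printed by `cipheraes` in the deck.

```python
# Byte positions touched when one state column is corrupted before the final
# round: the column passes through the last ShiftRows.
GOOD_PATTERNS = [
    {0, 7, 10, 13},
    {1, 4, 11, 14},
    {2, 5, 8, 15},
    {3, 6, 9, 12},
]


def diff_positions(correct, faulty):
    """Set of byte indices where the two 16-byte ciphertexts differ."""
    return {i for i, (c, f) in enumerate(zip(correct, faulty)) if c != f}


def is_dfa_candidate(correct, faulty):
    """True if the faulty output has one of the four exploitable patterns."""
    return diff_positions(correct, faulty) in GOOD_PATTERNS


correct = bytes.fromhex("14ed01ea7ce2a551c9791ae85c7cecf4")

useful = bytearray(correct)
for position in (0, 7, 10, 13):
    useful[position] ^= 0xFF          # made-up fault effect

useless = bytearray(correct)
useless[0] ^= 0xFF                    # a single modified byte is not exploitable here

print(is_dfa_candidate(correct, bytes(useful)),
      is_dfa_candidate(correct, bytes(useless)))
```

A filter of this kind lets a campaign discard most faulty outputs before running the key-recovery step.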

SLIDE 29
  • Attack an AES White-Box implementation
  • Double fault injection
  • Key recovery from the faulty outputs with Piret's DFA (with 4 modified bytes in a specific way)

Combinatorial complexity: nb_ins x nb_fault_model x nb_target x nb_input x nb_area
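The product above grows quickly, and far more so once faults are combined. The following sketch evaluates it with made-up factor values; none of them come from the deck, and counting every unordered pair of single faults is only one naive way to model a double-fault campaign.

```python
import math


def campaign_size(nb_ins, nb_fault_model, nb_target, nb_input, nb_area):
    """Number of single-fault runs: the product of the five factors."""
    return math.prod([nb_ins, nb_fault_model, nb_target, nb_input, nb_area])


# Illustrative values only.
single = campaign_size(nb_ins=10_000, nb_fault_model=3, nb_target=14,
                       nb_input=1, nb_area=2)

# Naive double-fault search: every unordered pair of single faults.
pairs = math.comb(single, 2)

print(f"{single:,} single faults, {pairs:,} fault pairs")
```

Under these assumptions the single-fault campaign has 840,000 runs, while the exhaustive pair search has more than 3.5 x 10^11, which is why restricting where and when to fault matters.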

SLIDE 30

$ ./cipheraes 06 1F C9 F5 88 B2 F9 D2 00 19 86 82 2C 12 11 79
message: 061fc9f588b2f9d2001986822c121179
cipher: 14ed01ea7ce2a551c9791ae85c7cecf4

AES-128, x86-64 architecture

Differential Fault Analysis with a double fault injection attack:
  • Data flow disturbance
  • Control flow disturbance
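Each faulted run of such a program has to be sorted into an outcome. The sketch below does this for the output format shown above, using three outcome names (correct, faulty, parse error) as an assumption about how runs are binned. `classify` and the regular expression are hypothetical helpers, not part of the presented tool.

```python
import re

# Matches the "cipher: <32 hex digits>" field of the program output.
CIPHER_RE = re.compile(r"cipher:\s*([0-9a-fA-F]{32})\b")


def classify(stdout, reference_hex):
    """Return "correct", "faulty" or "parse_error" for one program run."""
    match = CIPHER_RE.search(stdout)
    if match is None:
        return "parse_error"          # crash, truncated or malformed output
    if match.group(1).lower() == reference_hex.lower():
        return "correct"              # the fault had no visible effect
    return "faulty"


REFERENCE = "14ed01ea7ce2a551c9791ae85c7cecf4"
normal_run = ("message: 061fc9f588b2f9d2001986822c121179\n"
              "cipher: 14ed01ea7ce2a551c9791ae85c7cecf4\n")
faulted_run = normal_run.replace("14ed", "ffed")   # made-up faulty output
crashed_run = "Segmentation fault\n"               # made-up crash output

print(classify(normal_run, REFERENCE),
      classify(faulted_run, REFERENCE),
      classify(crashed_run, REFERENCE))
```

Only runs classified as faulty need to be passed on to the DFA step.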


SLIDE 40

[Bar chart: number of runs giving a faulty output, a correct output or a parse error for each faulted register (rax, rcx, rdx, rsp, rbp, rsi, rdi); vertical axis from 0 to 40,000. Bar labels in extraction order: 1557, 755, 35613, 34422, 7, 565, 34943, 36500, 35745, 887, 2078, 36493, 35935, 644, 310, 225, 264, 78.]

No effect for the other registers (rbx, r8, r9, r10, r11, r12, r13, r14, r15)

SLIDE 41
  • 14 campaigns, each faulting one register (rsp/rbp not included)
  • 511,000 injected faults
  • ~76 min (multi-threading not used)
  • ~112 injected faults per second
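These figures are consistent with each other, as a quick check shows. The variable names are only for this example, and the even split across campaigns is an assumption.

```python
faults = 511_000      # injected faults reported on the slide
rate = 112            # injected faults per second (approximate)
campaigns = 14        # one campaign per faulted register

minutes = faults / rate / 60
per_campaign = faults / campaigns   # assumes an even split across campaigns

print(f"{minutes:.1f} min, {per_campaign:.0f} faults per campaign")
```

The duration comes out at about 76 minutes, matching the slide.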


SLIDE 46

Correct/faulty output analysis

Execute algorithm:
  • To recover the key
  • Or to detect in which round the fault was injected
  • A lot of public algorithms are available, but if they fail, they give no information

Reverse engineering:
  • Understand the effect of the fault on the program execution
  • Gives a very accurate understanding of the fault, but requires reverse engineering skills

Fault & Trace:
  • Fault and trace at the same time (memory accesses, Program Counter register, …)
  • Gives a visual and accurate understanding of the fault effect, without reverse engineering skills


SLIDE 49

Fault and trace the PC register to see the executed instructions:
  • Traces are identical
  • Traces are different
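Comparing a reference trace with a faulted one can be automated. A minimal sketch, assuming each trace is a list of program counter values recorded one per executed instruction: `first_divergence` is a hypothetical helper and the addresses are sample data.

```python
from itertools import zip_longest


def first_divergence(reference, faulty):
    """Index of the first step where two PC traces differ, or None if equal."""
    for step, (ref_pc, bad_pc) in enumerate(zip_longest(reference, faulty)):
        if ref_pc != bad_pc:
            return step
    return None


# Sample traces (illustrative addresses).
reference_trace = [0x4033CE, 0x4033D2, 0x403393, 0x403397]
faulted_trace = [0x4033CE, 0x4033D2, 0x4033D4, 0x4033D5]

print(first_divergence(reference_trace, faulted_trace))
print(first_divergence(reference_trace, list(reference_trace)))
```

The returned step points at the first instruction whose execution was changed by the fault. A shorter trace, for example after a crash, diverges where it ends.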


SLIDE 59

    loc_4033CE:
        cmp     [rbp+var_1], 3
        jbe     short loc_403393
    loc_403393:
        movzx   eax, [rbp+var_1]
        cdqe
        mov     edx, [rbp+rax*4+var_20]
        movzx   eax, [rbp+var_1]
        cdqe
        mov     eax, [rbp+rax*4+var_30]
        cmp     edx, eax
        jz      short loc_4033C4
        movzx   eax, [rbp+var_1]
        lea     rdx, ds:0[rax*4]
        mov     rax, [rbp+var_38]
        add     rax, rdx
        mov     dword ptr [rax], 0
    loc_4033C4:
        movzx   eax, [rbp+var_1]
        add     eax, 1
        mov     [rbp+var_1], al
    loc_4033D4:
        nop
        leave
        retn

  • We start the output analysis.
  • The output was computed twice. Its consistency is checked by blocks of four bytes. In case of a failure, the four-byte block is set to zero.
  • These operations are done 4 times to analyze all the output.

SLIDE 63

    loc_4033CE:
        0x4033CE  cmp     [rbp+var_1], 3
        0x4033D2
    loc_403393:
        movzx   eax, [rbp+var_1]
        cdqe
        mov     edx, [rbp+rax*4+var_20]
        movzx   eax, [rbp+var_1]
        cdqe
        mov     eax, [rbp+rax*4+var_30]
        cmp     edx, eax
        jz      short loc_4033C4
        movzx   eax, [rbp+var_1]
        lea     rdx, ds:0[rax*4]
        mov     rax, [rbp+var_38]
        add     rax, rdx
        mov     dword ptr [rax], 0
    loc_4033C4:
        movzx   eax, [rbp+var_1]
        add     eax, 1
        mov     [rbp+var_1], al
        0x4033D4  nop
        0x4033D5  leave
        0x4033D6  retn
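This listing no longer shows the `jbe short loc_403393` instruction at 0x4033D2. Assuming it depicts a control flow fault that prevents the branch from being taken, execution falls through to the `nop` / `leave` / `retn` sequence and the consistency check never runs. The Python model below is a sketch of that reading. The function name, the `skip_branch_at` parameter and the mapping of `var_20`, `var_30` and `var_38` to `first`, `second` and `out` are assumptions.

```python
def consistency_check(first, second, out, skip_branch_at=None):
    """Model of the loop: zero each 4-byte block where the two results differ.

    first, second: the two 16-byte results; out: bytearray with the output.
    skip_branch_at: index of the loop test whose branch is NOT taken because
    of the fault (None means no fault).
    """
    i = 0        # loop counter, var_1 in the listing
    tests = 0    # number of times the loop test has executed
    while True:
        branch_faulted = (tests == skip_branch_at)
        tests += 1
        if i > 3 or branch_faulted:
            break                                    # fall through to retn
        if first[4 * i:4 * i + 4] != second[4 * i:4 * i + 4]:   # cmp edx, eax
            out[4 * i:4 * i + 4] = b"\x00" * 4       # mov dword ptr [rax], 0
        i += 1                                       # add eax, 1
    return out


good = bytes(range(16))               # made-up correct result
faulty = bytearray(good)
faulty[13] ^= 0xFF                    # first fault: corrupt one computation

protected = consistency_check(faulty, good, bytearray(faulty))
bypassed = consistency_check(faulty, good, bytearray(faulty), skip_branch_at=0)
print(protected.hex(), bypassed.hex())
```

Without the second fault, the corrupted block is zeroed and nothing exploitable leaks. With the branch skipped at the first loop test, the faulty output is returned unchanged, which is what a double fault injection aims for.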


SLIDE 70
  • Faulting a White-Box must be focused on the binary.
  • Dynamic fault injection is a prerequisite.
  • Accurate multiple faults can be injected.
  • Security mechanisms can be defeated.

But: combinatorial complexity

nb_ins x nb_fault_model x nb_target x nb_input x nb_area

SLIDE 71

Strategies to defeat these issues:
  • Pattern detector to fault only interesting areas
  • Focus faulting on specific registers, the Program Counter or instructions
  • Fault & trace to understand the effect of a fault or to downgrade security

In that way, it is possible to execute successful multi-fault attacks in a reasonable amount of time.