Outline Introduction 1 Identifier renaming 2 Complicating - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Outline

1 Introduction
2 Identifier renaming
3 Complicating control flow
    Inserting bogus control-flow
    Control-flow flattening
    Opaque values from array aliasing
    Jumps through branch functions
4 Opaque Predicates
    Opaque predicates from pointer aliasing
5 Data encodings
6 Dynamic Obfuscation
    Self-Modifying State Machine
    Code as key material
7 Discussion

Introduction 1/82

slide-4
SLIDE 4

Code obfuscation — It’s elusive!

Hard to pin down exactly what obfuscation is.
Hard to devise practically useful algorithms.
Hard to evaluate the quality of these algorithms.

Introduction 2/82

slide-8
SLIDE 8

Code obfuscation — what is it?

Informally, to obfuscate a program P means to transform it into a program P′ that is still executable but from which it is hard to extract information.

“Hard?” ⇒ Harder than before!

Static obfuscation ⇒ obfuscated programs that remain fixed at runtime.

  Tries to thwart static analysis.
  Attacked by dynamic techniques (debugging, emulation, tracing).

Dynamic obfuscation ⇒ programs are transformed continuously at runtime, keeping them in constant flux.

  Tries to thwart dynamic analysis.

Introduction 3/82

slide-16
SLIDE 16

Code obfuscation — Overview

  1 Simple obfuscating transformations.
  2 How to design an obfuscation tool.
  3 Definitions.
  4 Control-flow transformations.
  5 Data transformations.
  6 Abstraction transformations.
  7 Constructing opaque predicates.
  8 Dynamic obfuscating transformations.

Introduction 4/82

slide-20
SLIDE 20

Algorithm obfTP: Identifier renaming

Java released 1996:

  Decompilation is easy!
  Compiled code ⇔ source!

Hans Peter Van Vliet:

  1 Released Crema, a Java obfuscator.
  2 Released Mocha, a Java decompiler.
  3 RIP

It’s an obfuscator/decompiler war!

  1 HoseMocha kills Mocha (add an instruction after return);
  2 Rename identifiers using characters that are legal in the JVM, but not in Java source.

Identifier renaming 6/82

slide-21
SLIDE 21

Renaming Example

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L;
  int k = 0;
  int s = 1;
  while (k < w) {
    if (x[k] == 1)
      R = (s*y) % n;
    else
      R = s;
    s = R*R % n;
    L = R;
    k++;
  }
  return L;
}

✝ ✆ ✞ ☎

int f1(int x1, int x2[], int x3, int x4) {
  int x5, x6;
  int x7 = 0;
  int x8 = 1;
  while (x7 < x3) {
    if (x2[x7] == 1)
      x5 = (x8*x1) % x4;
    else
      x5 = x8;
    x8 = x5*x5 % x4;
    x6 = x5;
    x7++;
  }
  return x6;
}

✝ ✆

Identifier renaming 7/82

slide-24
SLIDE 24

Identifier renaming

Historical interest.
Decompiler can’t recover information which has been removed!
Identifier renaming ⇒ no performance overhead!

Identifier renaming 8/82

slide-26
SLIDE 26

Algorithm obfTP

In an object-oriented language:

Use overloading!
Give as many declarations as possible the same name!

Algorithm by Paul Tyma:

  Used in PreEmptive Solutions’ DashO Java obfuscator.
  Licensed by Microsoft for Visual Studio.

Identifier renaming 9/82

slide-28
SLIDE 28

Algorithm obfTP

Java naming rules:

  1 Class names should be globally unique.
  2 Field names should be unique within classes.
  3 Methods with different signatures can have the same name.

Algorithm

  1 Build a graph:
      nodes are declarations;
      edges connect nodes that cannot have the same name.
  2 Merge methods that must have the same name (because they override each other) into super-nodes.
  3 Color the graph with the smallest number of colors (= names)!
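The coloring step can be sketched in C as follows. This is only an illustration under stated assumptions: it uses a simple greedy coloring (which is not guaranteed to find the minimum number of names), and the node numbering, `build_felinae_graph` and `assign_names` are hypothetical names chosen for the Felinae class hierarchy used as the running example of this algorithm.

```c
#define N 9            /* number of declarations (graph nodes) */
#define NO_NAME (-1)

/* Hypothetical node numbering for the Felinae example:
   0 Felinae, 1 Felis, 2 Pantherinae, 3 Panthera (classes),
   4 color, 5 speed (fields),
   6 the move() super-node, 7 meow, 8 growl (methods). */

static void add_edge(int adj[N][N], int u, int v) {
    adj[u][v] = adj[v][u] = 1;
}

/* Edges connect declarations that cannot share a name. */
void build_felinae_graph(int adj[N][N]) {
    for (int u = 0; u < N; u++)
        for (int v = 0; v < N; v++) adj[u][v] = 0;
    for (int u = 0; u < 4; u++)          /* class names must all differ */
        for (int v = u + 1; v < 4; v++) add_edge(adj, u, v);
    add_edge(adj, 4, 5);                 /* two fields of the same class */
    add_edge(adj, 6, 7);                 /* move(int,int) vs meow(int,int) */
    add_edge(adj, 6, 8);                 /* move(int,int) vs growl(int,int) */
}

/* Greedy coloring: give each node the smallest name index that no
   already-named neighbor uses. Returns the number of names used. */
int assign_names(int adj[N][N], int name[N]) {
    int used_names = 0;
    for (int v = 0; v < N; v++) name[v] = NO_NAME;
    for (int v = 0; v < N; v++) {
        int taken[N] = {0};
        for (int u = 0; u < N; u++)
            if (adj[v][u] && name[u] != NO_NAME) taken[name[u]] = 1;
        int c = 0;
        while (taken[c]) c++;            /* a node has at most N-1 neighbors */
        name[v] = c;
        if (c + 1 > used_names) used_names = c + 1;
    }
    return used_names;
}
```

On this graph the sketch needs four names, because the four classes must all be named differently; meow and growl end up sharing a name since they live in unrelated classes.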

Identifier renaming 10/82

slide-29
SLIDE 29

Algorithm obfTP: Original program

✞ ☎

class Felinae {
  int color;
  int speed;
  public void move(int x, int y) {}
}
class Felis extends Felinae {
  public void move(int x, int y) {}
  public void meow(int tone, int length) {}
}
class Pantherinae extends Felinae {
  public void move(int x, int y) {}
  public void growl(int tone, int length) {}
}
class Panthera extends Pantherinae {
  public void move(int x, int y) {}
}

✝ ✆

Identifier renaming 11/82

slide-30
SLIDE 30

Algorithm obfTP: Interference graph

✞ ☎

class Felinae {
  int color;
  int speed;
  void move(int x, int y) {}
}
class Felis extends Felinae {
  void move(int x, int y) {}
  void meow(int tone, int len) {}
}
class Pantherinae extends Felinae {
  void move(int x, int y) {}
  void growl(int tone, int len) {}
}
class Panthera extends Pantherinae {
  void move(int x, int y) {}
}

✝ ✆

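[Figure: interference graph with nodes for the classes Felinae, Felis, Pantherinae and Panthera, the fields color and speed, the methods Felis.meow and Pantherinae.growl, and the methods Felinae.move, Felis.move, Pantherinae.move and Panthera.move.]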

Identifier renaming 12/82

slide-31
SLIDE 31

Algorithm obfTP: Renamed program

[Figure: the same interference graph, with each node colored by the name it receives.]

✞ ☎

class Pink {
  int Pink;
  int Blue;
  public void Blue(int x, int y) {}
}
class Blue extends Pink {
  public void Blue(int x, int y) {}
  public void Pink(int tone, int len) {}
}
class Green extends Pink {
  public void Blue(int x, int y) {}
  public void Pink(int tone, int len) {}
}
class Yellow extends Green {
  public void Blue(int x, int y) {}
}

✝ ✆

Identifier renaming 13/82

slide-36
SLIDE 36

Complicating control flow

Transformations that make it difficult for an adversary to analyze the flow-of-control:

  1 insert bogus control-flow,
  2 flatten the program,
  3 hide the targets of branches to make it difficult for the adversary to build control-flow graphs.

None of these transformations is immune to attack.

Complicating control flow 15/82

slide-39
SLIDE 39

Opaque Expressions

Simply put: an expression whose value is known to you as the defender (at obfuscation time) but which is difficult for an attacker to figure out.

Notation:

  P^T for an opaquely true predicate
  P^F for an opaquely false predicate
  P^? for an opaquely indeterminate predicate
  E^{=v} for an opaque expression of value v

Graphical notation:

[Figure: three branch nodes labeled P^?, P^T and P^F, each with a true and a false out-edge.]

Building blocks for many obfuscations.

Complicating control flow 16/82

slide-41
SLIDE 41

Opaque Expressions

An opaquely true predicate:

  (2 | (x² + x))^T

An opaquely indeterminate predicate:

  (x mod 2 = 0)^?

Complicating control flow 17/82

slide-44
SLIDE 44

Simple Opaque Predicates

Look in number theory textbooks, in the problems sections: “Show that ∀x, y ∈ Z : p(x, y)”

  ∀x, y ∈ Z : x² − 34y² ≠ −1
  ∀x ∈ Z : 2 | (x² + x)
  . . .
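As a minimal sketch, the parity fact can be turned into a C predicate. The function names `opaque_true` and `opaque_unknown` are hypothetical, and unsigned arithmetic is an assumption made here so that overflow wraps around (which preserves parity) instead of being undefined.

```c
#include <stdbool.h>

/* Opaquely true: x*x + x = x*(x+1) is a product of two consecutive
   integers, so it is always even. Unsigned arithmetic wraps modulo a
   power of two, which keeps the parity intact on overflow. */
bool opaque_true(unsigned x) {
    return ((x * x + x) % 2u) == 0u;
}

/* Opaquely indeterminate: the outcome depends on the run-time value. */
bool opaque_unknown(unsigned x) {
    return (x % 2u) == 0u;
}
```

A predicate this simple is easy for an attacker to recognize; it only illustrates the shape of the idea.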

Complicating control flow 18/82

slide-48
SLIDE 48

Algorithm obfCTJbogus: Inserting bogus control-flow

Insert bogus control-flow into a function:

  1 dead branches which will never be taken,
  2 superfluous branches which will always be taken,
  3 branches which will sometimes be taken and sometimes not, but where this doesn’t matter.

The resilience reduces to the resilience of the opaque predicates.
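The three kinds of bogus branch can be sketched on a one-line function. Everything here is illustrative: `add_obfuscated`, `p_true` and `p_unknown` are hypothetical names, the original function is assumed to be `return a + b;`, and the predicates are deliberately simple stand-ins for stronger opaque predicates.

```c
#include <stdbool.h>

/* Stand-in opaque predicates over some run-time value x. */
static bool p_true(unsigned x)    { return ((x * x + x) % 2u) == 0u; } /* always true */
static bool p_unknown(unsigned x) { return (x % 2u) == 0u; }           /* varies */

/* Assumed original: int add(int a, int b) { return a + b; } */
int add_obfuscated(int a, int b, unsigned x) {
    if (!p_true(x))             /* 1: dead branch, never taken */
        return a - b;           /*    bogus code */
    if (p_true(x + 1u)) {       /* 2: superfluous branch, always taken */
        if (p_unknown(x))       /* 3: sometimes taken, both arms equivalent */
            return a + b;
        else
            return b + a;
    }
    return 0;                   /* never reached */
}
```

An attacker who can prove that `p_true` always holds can strip all of this again, which is why the resilience rests on the predicates.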

Complicating control flow 19/82

slide-49
SLIDE 49

Algorithm obfCTJbogus: Inserting bogus control-flow

It seems that the blue block is only sometimes executed:

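[Figure: control-flow graph in which a branch on P^T, with out-edges labeled true and false, guards the blue block.]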

Complicating control flow 20/82

slide-50
SLIDE 50

Algorithm obfCTJbogus: Inserting bogus control-flow

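A bogus block (green) appears as if it might be executed while, in fact, it never will be:

[Figure: control-flow graph in which a branch on P^T, with out-edges labeled true and false, leads to the green block on the edge that is never taken.]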

Complicating control flow 21/82

slide-51
SLIDE 51

Algorithm obfCTJbogus: Inserting bogus control-flow

Sometimes execute the blue block, sometimes the green block. The green and blue blocks should be semantically equivalent.

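[Figure: control-flow graph in which a branch on P^?, with out-edges labeled true and false, selects between the blue and the green block.]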

Complicating control flow 22/82

slide-52
SLIDE 52

Algorithm obfCTJbogus: Inserting bogus control-flow

Extend a loop condition P by conjoining it with an opaquely true predicate P^T:

[Figure: a loop controlled by P, and the same loop controlled by P followed by P^T; each test has out-edges labeled true and false.]
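A small sketch of the extended loop condition, assuming a summation loop as the original code. `sum_obfuscated` is a hypothetical name and the parity predicate is a stand-in for a stronger opaque predicate.

```c
/* Assumed original loop: while (i < n) { sum += v[i]; i++; } */
int sum_obfuscated(const int v[], int n) {
    int sum = 0;
    int i = 0;
    /* The second conjunct is opaquely true (i*i + i is always even),
       so the loop still runs exactly while i < n. */
    while (i < n && (((unsigned)i * (unsigned)i + (unsigned)i) % 2u) == 0u) {
        sum += v[i];
        i++;
    }
    return sum;
}
```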

Complicating control flow 23/82

slide-55
SLIDE 55

Algorithm obfWHKD: Control-flow flattening

Removes the control-flow structure of functions.
Put each basic block as a case inside a switch statement, and wrap the switch inside an infinite loop.
Known as chenxify, chenxification, after Chenxi Wang:

Complicating control flow 24/82

slide-56
SLIDE 56

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L;
  int k = 0;
  int s = 1;
  while (k < w) {
    if (x[k] == 1)
      R = (s*y) % n;
    else
      R = s;
    s = R*R % n;
    L = R;
    k++;
  }
  return L;
}

✝ ✆

[Figure: control-flow graph of modexp, with basic blocks
  B0: k=0; s=1
  B1: if (k<w)
  B2: if (x[k]==1)
  B3: R=(s*y) mod n
  B4: R=s
  B5: s=R*R mod n; L=R; k++; goto B1
  B6: return L]

slide-57
SLIDE 57

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = 0;
  for (;;)
    switch (next) {
      case 0: k=0; s=1; next=1; break;
      case 1: if (k<w) next=2; else next=6; break;
      case 2: if (x[k]==1) next=3; else next=4; break;
      case 3: R=(s*y)%n; next=5; break;
      case 4: R=s; next=5; break;
      case 5: s=R*R%n; L=R; k++; next=1; break;
      case 6: return L;
    }
}

✝ ✆

slide-58
SLIDE 58

[Figure: control-flow graph of the flattened modexp. A switch(next) dispatcher, entered with next=0, branches to the blocks B0 to B6; each block sets next and returns to the dispatcher, except B6, which returns L.]

slide-65
SLIDE 65

Performance penalty

Replacing 50% of the branches in three SPEC programs slows them down by a factor of 4 and increases their size by a factor of 2.

Why?

  1 The for loop incurs one jump,
  2 the switch incurs a bounds check on the next variable,
  3 the switch incurs an indirect jump through a jump table.

Optimize?

  1 Keep tight loops as one switch entry.
  2 Use gcc’s labels-as-values ⇒ a jump table lets you jump directly to the next basic block.
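The labels-as-values optimization might look like the sketch below, applied to the flattened modexp. It assumes a compiler that supports the GNU C extension (`&&label` and `goto *ptr`, as in gcc and clang); `modexp_goto` and the label names are hypothetical, and R and L are initialized here only to keep the sketch well defined when w is 0.

```c
/* Flattened modexp dispatched with GNU C computed gotos instead of
   a for loop around a switch. */
int modexp_goto(int y, int x[], int w, int n) {
    /* Jump table: entry i holds the address of basic block i. */
    static void *block[] = { &&b0, &&b1, &&b2, &&b3, &&b4, &&b5, &&b6 };
    int R = 0, L = 0, k = 0, s = 1;
    int next = 0;
    goto *block[next];
b0: k = 0; s = 1;               next = 1;                   goto *block[next];
b1:                             next = (k < w) ? 2 : 6;     goto *block[next];
b2:                             next = (x[k] == 1) ? 3 : 4; goto *block[next];
b3: R = (s * y) % n;            next = 5;                   goto *block[next];
b4: R = s;                      next = 5;                   goto *block[next];
b5: s = R * R % n; L = R; k++;  next = 1;                   goto *block[next];
b6: return L;
}
```

Each block ends with a single indirect jump, so the loop jump and the bounds check of the switch version are gone, while the dispatch through `next` (and hence the flattening) is kept.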

Complicating control flow 28/82

slide-69
SLIDE 69

Algorithm obfWHKDalias: Control-flow flattening

Attack against Chenxification:

  1 Work out what the next block of every block is.
  2 Rebuild the original CFG!

How does an attacker do this?

  1 use-def data-flow analysis
  2 constant-propagation data-flow analysis

Complicating control flow 29/82

slide-70
SLIDE 70

Compute next as an opaque expression!

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = E^{=0};
  for (;;)
    switch (next) {
      case 0: k=0; s=1; next=E^{=1}; break;
      case 1: if (k<w) next=E^{=2}; else next=E^{=6}; break;
      case 2: if (x[k]==1) next=E^{=3}; else next=E^{=4}; break;
      case 3: R=(s*y)%n; next=E^{=5}; break;
      case 4: R=s; next=E^{=5}; break;
      case 5: s=R*R%n; L=R; k++; next=E^{=1}; break;
      case 6: return L;
    }
}

✝ ✆

Complicating control flow 30/82

slide-71
SLIDE 71

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = 0;
  int g[] = {10,9,2,5,3};
  for (;;)
    switch (next) {
      case 0: k=0; s=1; next=g[0]%g[1];     /* =1 */
              break;
      case 1: if (k<w) next=g[g[2]];        /* =2 */
              else next=g[0]-2*g[2];        /* =6 */
              break;
      case 2: if (x[k]==1) next=g[3]-g[2];  /* =3 */
              else next=2*g[2];             /* =4 */
              break;
      case 3: R=(s*y)%n; next=g[4]+g[2];    /* =5 */
              break;
      case 4: R=s; next=g[0]-g[3];          /* =5 */
              break;
      case 5: s=R*R%n; L=R; k++;
              next=g[g[4]]%g[2];            /* =1 */
              break;
      case 6: return L;
    }
}

✝ ✆

slide-72
SLIDE 72

Modify the array at runtime!

A function that rotates an array one step right:

✞ ☎

void permute(int g[], int n, int *m) {
  int i;
  int tmp = g[n-1];
  for (i = n-2; i >= 0; i--)
    g[i+1] = g[i];
  g[0] = tmp;
  *m = ((*m)+1) % n;
}

✝ ✆

Make static array aliasing analysis harder for the attacker!

Complicating control flow 32/82

slide-73
SLIDE 73

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = 0;
  int m = 0;
  int g[] = {10,9,2,5,3};
  for (;;) {
    switch (next) {
      case 0: k=0; s=1; next=g[(0+m)%5]%g[(1+m)%5]; break;
      case 1: if (k<w) next=g[(g[(2+m)%5]+m)%5];
              else next=g[(0+m)%5]-2*g[(2+m)%5];
              break;
      case 2: if (x[k]==1) next=g[(3+m)%5]-g[(2+m)%5];
              else next=2*g[(2+m)%5];
              break;
      case 3: R=(s*y)%n; next=g[(4+m)%5]+g[(2+m)%5]; break;
      case 4: R=s; next=g[(0+m)%5]-g[(3+m)%5]; break;
      case 5: s=R*R%n; L=R; k++;
              next=g[(g[(4+m)%5]+m)%5]%g[(2+m)%5];
              break;
      case 6: return L;
    }
    permute(g, 5, &m);
  }
}

✝ ✆

slide-74
SLIDE 74

Make the array global!

✞ ☎

int g[20];
int m;
int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = 0;
  for (;;)
    switch (next) {
      case 0: k=0; s=1; next=g[m+0]%g[m+1]; break;
      case 1: if (k<w) next=g[m+g[m+2]]; else next=g[m+0]-2*g[m+2]; break;
      case 2: if (x[k]==1) next=g[m+3]-g[m+2]; else next=2*g[m+2]; break;
      case 3: R=(s*y)%n; next=g[m+4]+g[m+2]; break;
      case 4: R=s; next=g[m+0]-g[m+3]; break;
      case 5: s=R*R%n; L=R; k++; next=g[m+g[m+4]]%g[m+2]; break;
      case 6: return L;
    }
}

✝ ✆

Complicating control flow 34/82

slide-75
SLIDE 75

With the array global you can initialize it differently at different call sites:

✞ ☎

g[0]=10; g[1]=9; g[2]=2; g[3]=5; g[4]=3; m=0;
modexp(y, x, w, n);
...
g[5]=10; g[6]=9; g[7]=2; g[8]=5; g[9]=3; m=5;
modexp(y, x, w, n);

✝ ✆

slide-76
SLIDE 76

Sprinkle pointer variables (pink), pointer manipulations (blue), dead code (green) over the program:

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L, k, s;
  int next = 0;
  int g[] = {10,9,2,5,3,42};
  int *g2;
  int *gr;
  for (;;)
    switch (next) {
      case 0: k=0; g2=&g[2]; s=1; next=g[0]%g[1]; gr=&g[5]; break;
      case 1: if (k<w) next=g[*g2]; else next=g[0]-2*g[2]; break;
      case 2: if (x[k]==1) next=g[3]-*g2; else next=2 * *g2; break;
      case 3: R=(s*y)%n; next=g[4]+*g2; break;
      case 4: R=s; next=g[0]-g[3]; break;
      case 5: s=R*R%n; L=R; k++; next=g[g[4]] % *g2; break;
      case 6: return L;
      case 7: *g2=666; next=*gr%2; gr=&g[*g2]; break;
    }
}

✝ ✆

slide-79
SLIDE 79

Algorithm obfWHKDalias

Hopefully, because of the obfuscated manipulations, the attacker’s static analysis will conclude that nothing can be deduced about next.
Not knowing next, he can’t rebuild the CFG.
Symbolic execution? We know next starts at 0...

Complicating control flow 37/82

slide-80
SLIDE 80
Algorithm obfWHKDopaque: Opaque values from array aliasing

[Figure: the array g drawn as a row of numbered cells holding 36 58 1 46 23 5 16 65 2 41 2 7 1 37 11 16 2 21, colored to match the invariants below.]

Invariants:

  1 every third cell (in pink), starting with cell 0, is ≡ 1 mod 5;
  2 cells 2 and 5 (green) hold the values 1 and 5, respectively;
  3 every third cell (in blue), starting with cell 1, is ≡ 2 mod 7;
  4 cells 8 and 11 (yellow) hold the values 2 and 7, respectively.

You can update a pink element as often as you want, with any value you want, as long as you ensure that the value is always ≡ 1 mod 5!
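A minimal sketch of such an update and of a predicate that relies on the invariant. The helper names `update_pink` and `pink_predicate` are hypothetical, the array contents are the 20 values used in the listing on the next slide, and the input is reduced modulo 1000 only to keep the arithmetic free of overflow.

```c
#include <assert.h>

/* The array g; pink cells are 0, 3, 6, ..., 18. */
static int g[20] = {36,58,1,46,23,5,16,65,2,41,
                    2,7,1,37,0,11,16,2,21,16};

/* Overwrite pink cell i with a new value that is still congruent to
   1 mod 5. k is an arbitrary non-negative run-time value. */
void update_pink(int i, int k) {
    assert(i >= 0 && i < 20 && i % 3 == 0 && k >= 0);
    g[i] = 5 * (k % 1000) + 1;
}

/* Opaquely true for every pink cell i while the invariants hold:
   g[5] is 5 and g[2] is 1, so this tests g[i] mod 5 == 1. */
int pink_predicate(int i) {
    return (g[i] % g[5]) == g[2];
}
```

The constants 5 and 1 never appear in the predicate itself; an attacker has to show that cells 2 and 5 are never overwritten to evaluate it statically.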

Complicating control flow 38/82

slide-81
SLIDE 81

✞ ☎

int g[] = {36,58,1,46,23,5,16,65,2,41,
           2,7,1,37,0,11,16,2,21,16};
if ((g[3] % g[5]) == g[2]) printf("true!\n");
g[5] = (g[1]*g[4]) % g[11] + g[6] % g[5];
g[14] = rand();
g[4] = rand()*g[11] + g[8];
int six = (g[4] + g[7] + g[10]) % g[11];
int seven = six + g[3] % g[5];
int fortytwo = six * seven;

✝ ✆

pink: opaquely true predicate.
blue: g is constantly changing at runtime.
green: an opaque value 42.

Initialize g at runtime!

slide-82
SLIDE 82
Algorithm obfLDK: Jumps through branch functions

Replace unconditional jumps with a call to a branch function.
Calls normally return to where they came from...
But a branch function returns to the target of the jump!

[Figure: the jump "jmp b" is replaced by "call bf" with return address a. The table holds T[h(a)] = b − a, and bf() returns to T[h(a)] + a, which is b.]
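The table arithmetic can be modeled portably with plain integers standing in for code addresses. This is only a simulation of the lookup, not a working branch function: `register_jump`, `branch_target`, the table size and the hash `h` are all assumptions made for illustration.

```c
#include <stddef.h>

#define TABLE_SIZE 8

/* T[h(a)] holds the offset b - a for the call whose return address is a. */
static long T[TABLE_SIZE];

/* Hypothetical hash over return addresses; collisions are not handled. */
static size_t h(unsigned long a) {
    return (size_t)((a >> 2) % TABLE_SIZE);
}

/* Obfuscation time: the call returning to a should continue at b. */
void register_jump(unsigned long a, unsigned long b) {
    T[h(a)] = (long)b - (long)a;
}

/* Run time: the address bf() writes over its own return address. */
unsigned long branch_target(unsigned long a) {
    return (unsigned long)((long)a + T[h(a)]);
}
```

Because only offsets are stored, the table contains no absolute jump targets, and both forward and backward jumps work the same way.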

Complicating control flow 40/82

slide-83
SLIDE 83
Algorithm obfLDK: Make branches explicit

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L;
  int k = 0;
  int s = 1;
  while (k < w) {
    if (x[k] == 1)
      R = (s*y) % n;
    else
      R = s;
    s = R*R % n;
    L = R;
    k++;
  }
  return L;
}

✝ ✆

Complicating control flow 41/82

slide-84
SLIDE 84
Algorithm obfLDK: Jumps through branch functions

A table T stores T[h(a_i)] = b_i − a_i.
Code in pink updates the return address!
The branch function:

✞ ☎

char* T[2];
void bf() {
  char* old;
  asm volatile("movl 4(%%ebp),%0\n\t" : "=r" (old));
  char* new = (char*)((int)T[h(old)] + (int)old);
  asm volatile("movl %0,4(%%ebp)\n\t" : : "r" (new));
}

✝ ✆

Complicating control flow 42/82

slide-85
SLIDE 85

✞ ☎

int modexp(int y, int x[], int w, int n) {
  int R, L;
  int k = 0;
  int s = 1;
  T[h(&&retaddr1)] = (char*)(&&endif - &&retaddr1);
  T[h(&&retaddr2)] = (char*)(&&beginloop - &&retaddr2);
beginloop:
  if (k >= w) goto endloop;
  if (x[k] != 1) goto elsepart;
  R = (s*y) % n;
  bf();  // goto endif;
retaddr1:
  asm volatile(".ascii \"bogus\"\n\t");
elsepart:
  R = s;
endif:
  s = R*R % n;
  L = R;
  k++;
  bf();  // goto beginloop;
retaddr2:
endloop:
  return L;
}

✝ ✆

slide-86
SLIDE 86
Algorithm obfLDK: Jumps through branch functions

Designed to confuse disassembly.
39% of instructions are incorrectly disassembled by a linear sweep disassembler.
25% for a recursive disassembler.
Execution penalty: 13%.
Increase in text segment size: 15%.

Complicating control flow 44/82

slide-90
SLIDE 90

Constructing opaque predicates

Construct them based on

  number theoretic results:
      ∀x, y ∈ Z : x² − 34y² ≠ −1
      ∀x ∈ Z : 2 | (x² + x)
  the hardness of alias analysis
  the hardness of concurrency analysis

Protect them by

  making them hard to find
  making them hard to break

If your obfuscator keeps a table of predicates, your adversary will too!

Opaque Predicates 46/82

slide-93
SLIDE 93

Algorithm obfCTJalias: Opaque predicates from pointer aliasing

Create an obfuscating transformation from a known computationally hard static analysis problem.

We assume that

  1 the attacker will analyze the program statically, and
  2 we can force him to solve a particular static analysis problem to discover the secret he’s after, and
  3 we can generate an actual hard instance of this problem for him to solve.

Of course, these assumptions may be false!

Opaque Predicates 47/82

slide-98
SLIDE 98

Algorithm obfCTJalias

Construct one or more heap-based graphs, keep pointers into those graphs, create opaque predicates by checking properties you know to be true. q1 and q2 point into two graphs G1 (pink) and G2 (blue):

[Figure: the operations split, insert, delete and move applied to the graphs G1 and G2, with the pointers q1 and q2 shown after each operation.]

Opaque Predicates 48/82

slide-101
SLIDE 101

Algorithm obfCTJalias

Two invariants:

“G1 and G2 are circular linked lists.”
“q1 points to a node in G1 and q2 points to a node in G2.”

Perform enough operations to confuse even the most precise alias analysis algorithm, then insert opaque queries such as (q1 ≠ q2)^T into the code.
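The two invariants can be sketched in C as follows. This is a minimal illustration rather than the algorithm's actual implementation: the helper names (`make_cycle`, `insert_after`, `move`, `opaque_true`) and the update schedule are assumptions made for the example. Because G1 and G2 never share a node, a pointer into G1 can never equal a pointer into G2, so the final comparison is always true, yet a static analyzer has to reason about heap aliasing to see that.

```c
#include <stdlib.h>

/* Node of a circular singly linked list (hypothetical layout). */
typedef struct Node { struct Node *next; } Node;

/* Build a circular list of n >= 1 nodes; returns a pointer into it.
   Allocation failures are not handled in this sketch. */
static Node *make_cycle(int n) {
    Node *first = malloc(sizeof *first);
    Node *last = first;
    first->next = first;
    for (int i = 1; i < n; i++) {
        Node *node = malloc(sizeof *node);
        node->next = first;
        last->next = node;
        last = node;
    }
    return first;
}

/* "insert": splice a fresh node in after p; the list stays circular. */
static void insert_after(Node *p) {
    Node *node = malloc(sizeof *node);
    node->next = p->next;
    p->next = node;
}

/* "move": advance a pointer k steps around its own cycle. */
static Node *move(Node *p, unsigned k) {
    while (k-- > 0) p = p->next;
    return p;
}

/* Runs a few updates, then evaluates the opaque predicate (q1 != q2).
   Nodes are deliberately leaked to keep the sketch short. */
int opaque_true(unsigned seed) {
    Node *q1 = make_cycle(3);   /* q1 points into G1 */
    Node *q2 = make_cycle(4);   /* q2 points into G2 */
    for (unsigned i = 0; i < 8; i++) {
        insert_after((seed + i) % 2 ? q1 : q2);
        q1 = move(q1, seed % 5 + i);
        q2 = move(q2, seed % 3 + 1);
    }
    return q1 != q2;            /* always 1: the graphs never share nodes */
}
```

An obfuscator would guard bogus code with a test such as `if (!opaque_true(x)) { ... }`, spreading the list updates through the program instead of keeping them in one function.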

Opaque Predicates 49/82

slide-104
SLIDE 104

Algorithm obfCTJpointer: Opaque predicates from concurrency

Concurrent programs are difficult to analyze statically: n statements in a parallel region can execute in n! different orders.

Construct opaque predicates based on the difficulty of analyzing the threading behavior of programs! Keep a global data structure G with a set of invariants I, update G concurrently while maintaining I, and use I to construct opaque predicates over G.

Opaque Predicates 50/82

slide-105
SLIDE 105

Opaque predicates from concurrency

[Figure: pointers a, b, c, and d into cyclic lists, shown before and after move(a, b) and move(c, d).]

Opaque Predicates 51/82

slide-108
SLIDE 108

Opaque predicates from concurrency

Thread T1 updates a and b so that each time a advances to its next node in the cycle, b also advances to its next node. Thread T2 updates c and d. The opaquely true predicate (a = b)^T is statically indistinguishable from the opaquely false predicate (c = d)^F!

Opaque Predicates 52/82

slide-114
SLIDE 114

Encoding literal data

Literal data often carries much semantic information:

"Please enter your password:" 0xA17BC97A7E5F...FF67 (maybe a cryptographic key???)

Split up in pieces. Xor with a constant. Avoid ever reconstituting the literal in cleartext! (What about printf?) Print each character one at a time?
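The three ideas on this slide (split, XOR with a constant, print one character at a time) can be combined in a short C sketch. The key `0x5A`, the two-piece split, and the literal `"Hi!\n"` are assumptions chosen for illustration; an obfuscator would generate the encoded bytes at build time and choose its own layout.

```c
#include <stdio.h>
#include <stddef.h>

#define XOR_KEY 0x5A   /* hypothetical constant */

/* The literal "Hi!\n", XORed byte by byte with XOR_KEY and split in two
   pieces: 'H'^0x5A = 0x12, 'i'^0x5A = 0x33, '!'^0x5A = 0x7B, '\n'^0x5A = 0x50. */
static const unsigned char piece_a[] = { 0x12, 0x33 };
static const unsigned char piece_b[] = { 0x7B, 0x50 };

/* Decode the i-th character (0 <= i < 4) of the split, encoded literal. */
int decoded_char(size_t i) {
    unsigned char c = (i < 2) ? piece_a[i] : piece_b[i - 2];
    return c ^ XOR_KEY;
}

/* Print the literal one character at a time; the cleartext string is
   never assembled in a buffer. */
void print_secret(void) {
    for (size_t i = 0; i < 4; i++)
        putchar(decoded_char(i));
}
```

Calling `print_secret()` writes the message without the string ever appearing in the binary's data section or in memory as a whole, although each character is still briefly in the clear.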

Data encodings 54/82

slide-117
SLIDE 117

Convert literals to code — Mealy machine

Encode the strings "MIMI" and "MILA" in a finite state transducer (a Mealy machine). The machine takes a bitstring and a state transition table as input and generates a string as output. Mealy(10₂) produces "MIMI". Mealy(110₂) produces "MILA".

Data encodings 55/82

slide-118
SLIDE 118

Convert literals to code — Mealy machine

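[Figure: state diagram of the Mealy machine with states 0 to 3. Transitions, labeled input/output: 0 → 1 on 0/'m'; 0 → 2 on 1/'l'; 1 → 3 on 0/'i'; 1 → 0 on 1/'i'; 2 → 3 on 0/'a'; 2 → 2 on 1/'b'.]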

✞ ☎

int  next[][2] = {{1,2}, {3,0}, {3,2}};
char out[][2]  = {{'m','l'}, {'i','i'}, {'a','b'}};

✝ ✆

s0 --i/o--> s1 means: in state s0, on input i, transfer to state s1 and produce an o.

next[state][input] = next state
out[state][input] = output

Data encodings 56/82

slide-119
SLIDE 119

Mealy machine — table driven

✞ ☎

char* mealy(int v) {
   char* str = (char*) malloc(10);
   int state = 0, len = 0;
   while (state != 3) {
      int input = 1 & v;      /* consume the lowest bit first */
      v >>= 1;
      str[len++] = out[state][input];
      state = next[state][input];
   }
   str[len] = '\0';
   return str;
}

✝ ✆
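The table-driven loop above can be checked against the two example inputs with a self-contained variant. This is a sketch under assumptions: the names `next_state`, `out_char`, and `mealy_decode` and the `cap` parameter are introduced here, not taken from the slides. The length bound is added because the slide's fixed `malloc(10)` would overflow on inputs that keep the machine in state 2 (a long run of 1 bits).

```c
#include <stdlib.h>

/* Transition and output tables, copied from the slide. */
static const int  next_state[3][2] = {{1,2}, {3,0}, {3,2}};
static const char out_char[3][2]   = {{'m','l'}, {'i','i'}, {'a','b'}};

/* Same loop as the slide, but stops once cap - 1 characters have been
   produced. Returns a malloc'ed string, or NULL on failure. */
char *mealy_decode(unsigned v, size_t cap) {
    if (cap == 0) return NULL;
    char *str = malloc(cap);
    if (str == NULL) return NULL;
    size_t len = 0;
    int state = 0;
    while (state != 3 && len + 1 < cap) {
        int input = v & 1u;          /* lowest bit first */
        v >>= 1;
        str[len++] = out_char[state][input];
        state = next_state[state][input];
    }
    str[len] = '\0';
    return str;
}
```

With these tables, `mealy_decode(2, 10)` (binary 10) yields `"mimi"` and `mealy_decode(6, 10)` (binary 110) yields `"mila"`; the output is lowercase because the `out` table stores lowercase characters.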

Data encodings 57/82

slide-120
SLIDE 120

Mealy machine — hardcoded

✞ ☎

char* mealy(int v) {
   char* str = (char*) malloc(10);
   int state = 0, len = 0;
   while (1) {
      int input = 1 & v;
      v >>= 1;
      switch (state) {
         case 0: state = (input == 0) ? 1 : 2;
                 str[len++] = (input == 0) ? 'm' : 'l';
                 break;
         case 1: state = (input == 0) ? 3 : 0;
                 str[len++] = 'i';
                 break;
         case 2: state = (input == 0) ? 3 : 2;
                 str[len++] = (input == 0) ? 'a' : 'b';
                 break;
         case 3: str[len] = '\0';
                 return str;
      }
   }
}

✝ ✆

Data encodings 58/82

slide-125
SLIDE 125

Static vs. Dynamic obfuscation

Static obfuscations transform the code prior to execution. Dynamic algorithms transform the program at runtime. Static obfuscation counters attacks by static analysis. Dynamic obfuscation counters attacks by dynamic analysis.

Dynamic Obfuscation 60/82

slide-129
SLIDE 129

Static vs. Dynamic obfuscation

Statically obfuscated code: the attacker sees the same mess every time. Dynamically obfuscated code: the execution path changes as the program runs. Some algorithms are “semi-dynamic”: they perform a small, constant number of transformations (often one) at runtime. Some algorithms are continuous: the code is in constant flux.

Dynamic Obfuscation 61/82

slide-131
SLIDE 131

Dynamic Obfuscation: Definitions

A dynamic obfuscator runs in two phases:

1 At compile-time, transform the program to an initial configuration and add a runtime code-transformer.
2 At runtime, intersperse the execution of the program with calls to the transformer.

A dynamic obfuscator turns a “normal” program into a self-modifying one.

Dynamic Obfuscation 62/82

slide-134
SLIDE 134

Modeling dynamic obfuscation — compile-time

[Figure: P → (I: create initial configuration) → P′ → (embed runtime transformer T) → P′ containing T.]

Transformer I creates P’s initial configuration. T is the runtime obfuscator, embedded in P′.

Dynamic Obfuscation 63/82

slide-140
SLIDE 140

Modeling dynamic obfuscation — runtime

[Figure: a chain of configurations of P′, each containing T; T transforms each configuration into the next.]

Transformer T continuously modifies P′ at runtime. We’d like an infinite, non-repeating series of configurations. In practice, the configurations repeat.

Dynamic Obfuscation 64/82

slide-142
SLIDE 142

Dynamic obfuscation: Aucsmith’s algorithm

[Figure: a function laid out as cells C0 to C5.]

A function is split into cells. The cells are divided into two regions in memory, upper and lower.

Dynamic Obfuscation 65/82

slide-143
SLIDE 143

One step

[Figure: the cells C0 to C5 shown twice, in the layouts labeled orig and M0.]

Dynamic Obfuscation 66/82

slide-144
SLIDE 144

XOR!

[Figure: three examples of cells combined with XOR (⊕).]

Dynamic Obfuscation 67/82

slide-145
SLIDE 145

The Dynamic Primitive — Aucsmith

Dynamic Obfuscation 68/82

slide-156
SLIDE 156

Why does this work?

A B
⇓ B ← B ⊕ A
⇓ A ← A ⊕ B
⇓ B ← B ⊕ A

After these three XOR assignments, A and B have swapped contents.
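The three assignments are the classic XOR swap, applied to whole memory regions. A minimal C sketch follows; the function name `xor_swap` and the choice of 32-bit words are assumptions for illustration, and the two regions must not overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Swap two non-overlapping regions of n words in place, using three
   XOR passes and no temporary buffer. */
void xor_swap(uint32_t *a, uint32_t *b, size_t n) {
    for (size_t i = 0; i < n; i++) b[i] ^= a[i];  /* B <- B xor A               */
    for (size_t i = 0; i < n; i++) a[i] ^= b[i];  /* A <- A xor B: A = old B    */
    for (size_t i = 0; i < n; i++) b[i] ^= a[i];  /* B <- B xor A: B = old A    */
}
```

After the first pass B holds A ⊕ B, so the second pass leaves the original B in A, and the third pass leaves the original A in B. Between passes, one region holds the XOR of both, which is not directly readable as code.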

Dynamic Obfuscation 69/82

slide-159
SLIDE 159
obfCKSP: Code as key material

Encrypt the code to keep as little code as possible in the clear at any point in time during execution. Extremes:

1 Decrypt the next instruction, execute it, re-encrypt it, ... ⇒ only one instruction is ever in the clear!
2 Decrypt the entire program once, prior to execution, and leave it in cleartext. ⇒ easy for the adversary to capture the code.

Dynamic Obfuscation 70/82

slide-165
SLIDE 165
obfCKSP: Code as key material

The entire program is encrypted, except for main. Before you jump to a function you decrypt it. When the function returns you re-encrypt it. On entry, a function first encrypts its caller. Before returning, a function decrypts its caller. ⇒ At most two functions are ever in the clear!

Dynamic Obfuscation 71/82

slide-167
SLIDE 167
obfCKSP: Code as key material

What do we use as key? The code itself! What cipher do we use? Something simple!

Dynamic Obfuscation 72/82

slide-170
SLIDE 170
obfCKSP: Code as key material

In the simplest case the call-graph is tree-shaped:

[Call graph: main → play; play → decode, decrypt; decrypt → getkey]

Before and after every procedure call you insert calls to a guard function that decrypts/re-encrypts the callee, using a hash of the cleartext of the caller as key. On entrance and exit of the callee you encrypt/decrypt the caller using a hash of the cleartext of the callee as key.

Dynamic Obfuscation 73/82

slide-171
SLIDE 171

✞ ☎

int player_main(int argc, char *argv[]) {
   int user_key = 0xca7ca115;
   int digital_media[] = {10, 102};
   guard(play, playSIZE, player_main, player_mainSIZE);
   play(user_key, digital_media, 2);
   guard(play, playSIZE, player_main, player_mainSIZE);
}

int getkey(int user_key) {
   guard(decrypt, decryptSIZE, getkey, getkeySIZE);
   int player_key = 0xbabeca75;
   int v = user_key ^ player_key;
   guard(decrypt, decryptSIZE, getkey, getkeySIZE);
   return v;
}

int decrypt(int user_key, int media) {
   guard(play, playSIZE, decrypt, decryptSIZE);
   guard(getkey, getkeySIZE, decrypt, decryptSIZE);
   int key = getkey(user_key);
   guard(getkey, getkeySIZE, decrypt, decryptSIZE);
   int v = media ^ key;
   guard(play, playSIZE, decrypt, decryptSIZE);
   return v;
}

✝ ✆

slide-172
SLIDE 172

✞ ☎

float decode(int digital) {
   guard(play, playSIZE, decode, decodeSIZE);
   float v = (float)digital;
   guard(play, playSIZE, decode, decodeSIZE);
   return v;
}

void play(int user_key, int digital_media[], int len) {
   int i;
   guard(player_main, player_mainSIZE, play, playSIZE);
   for (i = 0; i < len; i++) {
      guard(decrypt, decryptSIZE, play, playSIZE);
      int digital = decrypt(user_key, digital_media[i]);
      guard(decrypt, decryptSIZE, play, playSIZE);
      guard(decode, decodeSIZE, play, playSIZE);
      printf("%f\n", decode(digital));
      guard(decode, decodeSIZE, play, playSIZE);
   }
   guard(player_main, player_mainSIZE, play, playSIZE);
}

✝ ✆

slide-173
SLIDE 173

✞ ☎

void crypto(waddr_t proc, uint32 key, int words) {
   int i;
   for (i = 1; i < words; i++) {
      *proc ^= key;
      proc++;
   }
}

void guard(waddr_t proc, int proc_words, waddr_t key_proc, int key_words) {
   uint32 key = hash1(key_proc, key_words);
   crypto(proc, key, proc_words);
}

✝ ✆
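The guard above patches live machine code, which is hard to run portably. The same toggle can be tried on ordinary data buffers standing in for function bodies. In this sketch every name (`hash_words`, `xor_region`, `guard_sim`) is hypothetical; in particular `hash_words` is only a stand-in for `hash1`, which the slides do not show, and the loop covers every word from index 0, unlike the slide's loop that starts at `i = 1`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for hash1: FNV-1a-style mixing over 32-bit words. */
static uint32_t hash_words(const uint32_t *p, size_t words) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < words; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* XOR every word of proc with key; applying it twice restores proc. */
static void xor_region(uint32_t *proc, uint32_t key, size_t words) {
    for (size_t i = 0; i < words; i++)
        proc[i] ^= key;
}

/* Toggle proc between cleartext and ciphertext, keyed by a hash of the
   current (cleartext) contents of key_proc. */
void guard_sim(uint32_t *proc, size_t proc_words,
               const uint32_t *key_proc, size_t key_words) {
    xor_region(proc, hash_words(key_proc, key_words), proc_words);
}
```

Because XOR is its own inverse, one `guard_sim` call encrypts the region and a second call with the same key region decrypts it. This works only while the key region is in cleartext at both calls, which is why the order of the guard calls in the player code matters.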

slide-176
SLIDE 176
obfCKSP: Code as key material

So, what if the call-graph is shaped like a DAG, like this:

[Call graph over main, c1, c2, b1, b2, and a, in which a is called by both b1 and b2.]

What key to use to decrypt a? We can’t use the cleartext of the caller as key, because now there are two callers! Let the callers’ callers (c1 and c2) do the decryption using a combination of the ciphertexts of b1 and b2.

Dynamic Obfuscation 77/82

slide-178
SLIDE 178
obfCKSP: Code as key material

What if the program is recursive?

[Figure: call graph rooted at main that contains a cycle.]

Keep the entire cycle in cleartext...

Dynamic Obfuscation 78/82

slide-184
SLIDE 184

Code Obfuscation — What’s it Good For?

Diversification: make every program unique to prevent malware attacks
Prevent collusion: make every program unique to prevent diffing attacks
Code Privacy: make programs hard to understand to protect algorithms
Data Privacy: make programs hard to understand to protect secret data (keys)
Integrity: make programs hard to understand to make them hard to change

Discussion 80/82

slide-186
SLIDE 186

Common Obfuscating Transformations

Many obfuscating transformations are built on some simple general operations:

Splitting/Merging
Duplication
Reordering
Mapping
Indirection

Apply these basic operations to

Control structures
Data structures
Abstractions

Discussion 81/82

slide-187
SLIDE 187

Static vs. Dynamic Obfuscation

Static obfuscations confuse static analysis. Dynamic obfuscations confuse static and dynamic analysis.

Dynamic algorithms generate self-modifying code: the code segment is treated as both code and data. This is bad for performance:

1 flush instruction pipeline
2 write data caches to memory
3 invalidate instruction caches

Discussion 82/82