
Outline Introduction 1 Identifier renaming 2 Complicating - PowerPoint PPT Presentation

Outline Introduction 1 Identifier renaming 2 Complicating control flow 3 Inserting bogus control-flow Control-flow flattening Opaque values from array aliasing Jumps through branch functions Opaque Predicates 4 Opaque predicates from


  2. Complicating control flow
     Transformations that make it difficult for an adversary to analyze the flow of control:
       1. insert bogus control flow,
       2. flatten the program,
       3. hide the targets of branches so that the adversary has difficulty building control-flow graphs.
     None of these transformations is immune to attack.
     Complicating control flow 15/82

  5. Opaque Expressions
     Simply put: an expression whose value is known to you as the defender (at obfuscation time) but which is difficult for an attacker to figure out.
     Notation:
       - $P^T$ for an opaquely true predicate
       - $P^F$ for an opaquely false predicate
       - $P^?$ for an opaquely indeterminate predicate
       - $E^{=v}$ for an opaque expression of value $v$
     Graphical notation: [figure: three branch nodes labeled $P^T$, $P^F$, and $P^?$, each with a true and a false out-edge]
     Building blocks for many obfuscations.

  7. Opaque Expressions
     An opaquely true predicate: $(2 \mid (x^2 + x))^T$
     An opaquely indeterminate predicate: $(x \bmod 2 = 0)^?$

  10. Simple Opaque Predicates
      Look in number theory textbooks, in the problems sections: "Show that $\forall x, y \in \mathbb{Z} : p(x, y)$"
        - $\forall x, y \in \mathbb{Z} : x^2 - 34y^2 \neq -1$
        - $\forall x \in \mathbb{Z} : 2 \mid x^2 + x$
        - ...
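A minimal C sketch of the predicate 2 | x² + x from the slide above. The function names are hypothetical, and the use of `unsigned int` is an assumption made here so that overflow wraps around instead of being undefined; wrapping modulo a power of two leaves parity unchanged, so the predicate stays true for every input.

```c
#include <assert.h>

/* Opaquely true: x*x + x == x*(x+1) is a product of two consecutive
 * integers, so it is always even. Unsigned arithmetic wraps modulo a
 * power of two, which preserves parity. */
int opaque_true(unsigned int x) {
    return (x * x + x) % 2u == 0u;
}

/* Sanity check over a small range (illustrative only, not a proof). */
void check_opaque_true(void) {
    for (unsigned int x = 0; x < 100000u; x++)
        assert(opaque_true(x));
}
```

An obfuscator would feed `opaque_true` a value the attacker cannot easily predict, such as a live program variable, rather than a constant.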

  14. Algorithm obfCTJ bogus: Inserting bogus control-flow
      Insert bogus control flow into a function:
        1. dead branches, which will never be taken,
        2. superfluous branches, which will always be taken,
        3. branches which will sometimes be taken and sometimes not, but where this doesn't matter.
      The resilience reduces to the resilience of the opaque predicates.
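The three kinds of bogus branch listed above could look roughly like this in C. This is a sketch under stated assumptions: `sum_to`, `junk`, and the three predicate helpers are hypothetical names, and the parity predicate stands in for whatever opaque predicates a real obfuscator would generate.

```c
#include <assert.h>

static int p_true(unsigned int x)   { return (x * x + x) % 2u == 0u; } /* P^T */
static int p_false(unsigned int x)  { return (x * x + x) % 2u == 1u; } /* P^F */
static int p_either(unsigned int x) { return x % 2u == 0u; }           /* P^? */

/* Computes 1 + 2 + ... + n; junk only feeds the opaque predicates. */
int sum_to(int n, unsigned int junk) {
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; i++) {
        unsigned int v = junk + (unsigned int)i;
        if (p_false(v))              /* 1: dead branch, never taken */
            total -= i * 7;
        if (p_true(v)) {             /* 2: superfluous branch, always taken */
            if (p_either(v))         /* 3: either arm may run ... */
                total += i;
            else
                total = total - (-i); /* ... both arms add i */
        }
    }
    return total;
}
```

Whatever value `junk` holds, `sum_to(10, junk)` returns 55, because only the always-taken path changes `total` and both arms of the indeterminate branch are equivalent.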

  15. Algorithm obfCTJ bogus: Inserting bogus control-flow
      It seems that the blue block is only sometimes executed: [figure: the blue block sits on the true edge of a branch on $P^T$]

  16. Algorithm obfCTJ bogus: Inserting bogus control-flow
      A bogus block (green) appears as though it might be executed while, in fact, it never will be: [figure: the green block sits on the false edge of a branch on $P^T$]

  17. Algorithm obfCTJ bogus: Inserting bogus control-flow
      Sometimes execute the blue block, sometimes the green block. The green and blue blocks should be semantically equivalent. [figure: a branch on $P^?$ with the blue block on one edge and the green block on the other]

  18. Algorithm obfCTJ bogus: Inserting bogus control-flow
      Extend a loop condition $P$ by conjoining it with an opaquely true predicate $P^T$: [figure: a loop guarded by $P$, and the transformed loop guarded by both $P^T$ and $P$]
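A small C sketch of the loop transformation just described, assuming the parity predicate as $P^T$. The function `count_below` and the `junk` operand are hypothetical; the multiplier used to stir `junk` is an arbitrary odd constant chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the elements of a[0..len-1] that are below limit. The real loop
 * condition (k < len) is conjoined with an opaquely true predicate. */
int count_below(const int *a, int len, int limit, unsigned int junk) {
    assert(a != NULL || len == 0);
    int k = 0, count = 0;
    while (k < len && (junk * junk + junk) % 2u == 0u) {
        if (a[k] < limit)
            count++;
        k++;
        /* Keep changing the operand so the predicate is not loop-invariant. */
        junk = junk * 2654435761u + (unsigned int)k;
    }
    return count;
}
```

The loop exits exactly when `k < len` fails, since the second conjunct never does.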

  21. Algorithm obfWHKD: Control-flow flattening
      Removes the control-flow structure of functions.
      Put each basic block as a case inside a switch statement, and wrap the switch inside an infinite loop.
      Known as chenxify, chenxification, after Chenxi Wang:

  22.
          int modexp(int y, int x[], int w, int n) {
            int R, L;
            int k = 0;
            int s = 1;
            while (k < w) {
              if (x[k] == 1)
                R = (s*y) % n;
              else
                R = s;
              s = R*R % n;
              L = R;
              k++;
            }
            return L;
          }

      Basic blocks of the control-flow graph:
        B0: k=0; s=1
        B1: if (k<w) (true: B2, false: B6)
        B2: if (x[k]==1) (true: B3, false: B4)
        B3: R=(s*y) mod n
        B4: R=s
        B5: s=R*R mod n; L=R; k++; goto B1
        B6: return L

  23.
          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = 0;
            for (;;)
              switch (next) {
                case 0: k = 0; s = 1; next = 1; break;
                case 1: if (k < w) next = 2; else next = 6; break;
                case 2: if (x[k] == 1) next = 3; else next = 4; break;
                case 3: R = (s*y) % n; next = 5; break;
                case 4: R = s; next = 5; break;
                case 5: s = R*R % n; L = R; k++; next = 1; break;
                case 6: return L;
              }
          }

  24. [Figure: control-flow graph of the flattened modexp. After next=0, a single switch(next) dispatch node fans out to the blocks B0 through B6; each block sets next and returns to the dispatcher, and B6 returns L.]

  31. Performance penalty
      Replacing 50% of the branches in three SPEC programs slows them down by a factor of 4 and increases their size by a factor of 2.
      Why?
        1. The for loop incurs one jump,
        2. the switch incurs a bounds check on the next variable,
        3. the switch incurs an indirect jump through a jump table.
      Optimize?
        1. Keep tight loops as one switch entry.
        2. Use gcc's labels-as-values ⇒ a jump table lets you jump directly to the next basic block.
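The labels-as-values optimization can be sketched as follows for the flattened modexp. This relies on the GNU C extensions `&&label` and `goto *ptr`, so it assumes GCC or Clang. The name `modexp_threaded`, the `const` on `x`, and the zero initialization of `R` and `L` are choices made for this sketch, not taken from the slides.

```c
#include <assert.h>

/* Flattened modexp without the for/switch dispatcher: every block jumps
 * straight to its successor through a table of label addresses. */
int modexp_threaded(int y, const int x[], int w, int n) {
    static void *const table[] = { &&b0, &&b1, &&b2, &&b3, &&b4, &&b5, &&b6 };
    int R = 0, L = 0, k = 0, s = 1;
    assert(n > 0);
    goto *table[0];
b0: k = 0; s = 1;               goto *table[1];
b1:                             goto *table[(k < w) ? 2 : 6];
b2:                             goto *table[(x[k] == 1) ? 3 : 4];
b3: R = (s * y) % n;            goto *table[5];
b4: R = s;                      goto *table[5];
b5: s = R * R % n; L = R; k++;  goto *table[1];
b6: return L;
}
```

With the exponent bits given most significant first, `x = {1, 0, 1}`, `y = 3`, and `n = 1000`, the function returns 243, which is 3^5 mod 1000. Each transfer costs one indirect jump, with no loop back-edge and no bounds check.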

  35. Algorithm obfWHKD alias: Control-flow flattening
      Attack against chenxification:
        1. Work out what the next block of every block is.
        2. Rebuild the original CFG!
      How does an attacker do this?
        1. use-def data-flow analysis
        2. constant-propagation data-flow analysis
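Once constant propagation has determined which constants each case assigns to next, rebuilding the CFG is mechanical. The sketch below covers only that last step, for the flattened modexp, and assumes the analysis results are already available in a table; `next_consts` and `rebuild_cfg` are hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define NBLOCKS 7
#define NONE    (-1)

/* Constants assigned to next in each case of the flattened modexp, as a
 * constant-propagation pass would report them (NONE marks an unused slot). */
static const int next_consts[NBLOCKS][2] = {
    {1, NONE}, {2, 6}, {3, 4}, {5, NONE}, {5, NONE}, {1, NONE}, {NONE, NONE}
};

/* Sets edges[from][to] = 1 for every recovered edge; returns the edge count. */
int rebuild_cfg(int edges[NBLOCKS][NBLOCKS]) {
    int count = 0;
    memset(edges, 0, sizeof(int) * NBLOCKS * NBLOCKS);
    for (int from = 0; from < NBLOCKS; from++) {
        for (int j = 0; j < 2; j++) {
            int to = next_consts[from][j];
            if (to == NONE)
                continue;
            assert(to >= 0 && to < NBLOCKS);
            if (!edges[from][to]) {
                edges[from][to] = 1;
                count++;
            }
        }
    }
    return count;
}
```

The eight recovered edges are exactly those of the original while/if structure, which is why the later slides try to keep the analysis from ever learning these constants.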

  36. Compute next as an opaque expression!

          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = E^{=0};
            for (;;)
              switch (next) {
                case 0: k = 0; s = 1; next = E^{=1}; break;
                case 1: if (k < w) next = E^{=2}; else next = E^{=6}; break;
                case 2: if (x[k] == 1) next = E^{=3}; else next = E^{=4}; break;
                case 3: R = (s*y) % n; next = E^{=5}; break;
                case 4: R = s; next = E^{=5}; break;
                case 5: s = R*R % n; L = R; k++; next = E^{=1}; break;
                case 6: return L;
              }
          }

  37.
          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = 0;
            int g[] = {10, 9, 2, 5, 3};
            for (;;)
              switch (next) {
                case 0: k = 0; s = 1; next = g[0] % g[1];  /* =1 */ break;
                case 1: if (k < w) next = g[g[2]];         /* =2 */
                        else next = g[0] - 2*g[2];         /* =6 */ break;
                case 2: if (x[k] == 1) next = g[3] - g[2]; /* =3 */
                        else next = 2*g[2];                /* =4 */ break;
                case 3: R = (s*y) % n; next = g[4] + g[2]; /* =5 */ break;
                case 4: R = s; next = g[0] - g[3];         /* =5 */ break;
                case 5: s = R*R % n; L = R; k++;
                        next = g[g[4]] % g[2];             /* =1 */ break;
                case 6: return L;
              }
          }

  38. Modify the array at runtime!
      A function that rotates an array one step right:

          void permute(int g[], int n, int *m) {
            int i;
            int tmp = g[n-1];
            for (i = n-2; i >= 0; i--) g[i+1] = g[i];
            g[0] = tmp;
            *m = ((*m)+1) % n;
          }

      Make static array aliasing analysis harder for the attacker!

  39.
          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = 0;
            int m = 0;
            int g[] = {10, 9, 2, 5, 3};
            for (;;) {
              switch (next) {
                case 0: k = 0; s = 1; next = g[(0+m)%5] % g[(1+m)%5]; break;
                case 1: if (k < w) next = g[(g[(2+m)%5]+m)%5];
                        else next = g[(0+m)%5] - 2*g[(2+m)%5]; break;
                case 2: if (x[k] == 1) next = g[(3+m)%5] - g[(2+m)%5];
                        else next = 2*g[(2+m)%5]; break;
                case 3: R = (s*y) % n; next = g[(4+m)%5] + g[(2+m)%5]; break;
                case 4: R = s; next = g[(0+m)%5] - g[(3+m)%5]; break;
                case 5: s = R*R % n; L = R; k++;
                        next = g[(g[(4+m)%5]+m)%5] % g[(2+m)%5]; break;
                case 6: return L;
              }
              permute(g, 5, &m);
            }
          }
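Why the rotated lookups still yield the right block numbers: after m calls to permute, the element that started at index j sits at index (j+m) % n. A short C sketch to check this; `permute` follows the slide, while `logical` is a hypothetical helper added here and the `assert` is an extra guard.

```c
#include <assert.h>

/* Rotate g one step right and advance the offset m, as on the slide. */
void permute(int g[], int n, int *m) {
    assert(n > 0);
    int tmp = g[n - 1];
    for (int i = n - 2; i >= 0; i--)
        g[i + 1] = g[i];
    g[0] = tmp;
    *m = (*m + 1) % n;
}

/* Hypothetical helper: read the element that started out at index j. */
int logical(const int g[], int n, int m, int j) {
    return g[(j + m) % n];
}
```

With `g = {10, 9, 2, 5, 3}`, `logical(g, 5, m, 0) % logical(g, 5, m, 1)` stays 1 after any number of rotations, so case 0 always continues at block 1 even though the array contents keep moving.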

  40. Make the array global!

          int g[20];
          int m;

          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = 0;
            for (;;)
              switch (next) {
                case 0: k = 0; s = 1; next = g[m+0] % g[m+1]; break;
                case 1: if (k < w) next = g[m + g[m+2]];
                        else next = g[m+0] - 2*g[m+2]; break;
                case 2: if (x[k] == 1) next = g[m+3] - g[m+2];
                        else next = 2*g[m+2]; break;
                case 3: R = (s*y) % n; next = g[m+4] + g[m+2]; break;
                case 4: R = s; next = g[m+0] - g[m+3]; break;
                case 5: s = R*R % n; L = R; k++;
                        next = g[m + g[m+4]] % g[m+2]; break;
                case 6: return L;
              }
          }

  41. With the array global you can initialize it differently at different call sites:

          g[0] = 10; g[1] = 9; g[2] = 2; g[3] = 5; g[4] = 3; m = 0;
          modexp(y, x, w, n);
          ...
          g[5] = 10; g[6] = 9; g[7] = 2; g[8] = 5; g[9] = 3; m = 5;
          modexp(y, x, w, n);

  42. Sprinkle pointer variables (pink), pointer manipulations (blue), dead code (green) over the program:

          int modexp(int y, int x[], int w, int n) {
            int R, L, k, s;
            int next = 0;
            int g[] = {10, 9, 2, 5, 3, 42};
            int *g2;   /* pointer variables */
            int *gr;
            for (;;)
              switch (next) {
                case 0: k = 0; g2 = &g[2]; s = 1; next = g[0] % g[1];
                        gr = &g[5]; break;
                case 1: if (k < w) next = g[*g2];
                        else next = g[0] - 2*g[2]; break;
                case 2: if (x[k] == 1) next = g[3] - *g2;
                        else next = 2 * *g2; break;
                case 3: R = (s*y) % n; next = g[4] + *g2; break;
                case 4: R = s; next = g[0] - g[3]; break;
                case 5: s = R*R % n; L = R; k++; next = g[g[4]] % *g2; break;
                case 6: return L;
                /* dead code: next never takes the value 7 */
                case 7: *g2 = 666; next = *gr % 2; gr = &g[*g2]; break;
              }
          }

  45. Algorithm obfWHKD alias
      Hopefully, because of the obfuscated manipulations, the attacker's static analysis will conclude that nothing can be deduced about next.
      Not knowing next, he can't rebuild the CFG.
      Symbolic execution? We know next starts at 0...

  46. obfWHKD opaque: Opaque values from array aliasing

          index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18
          value: 36 58  1 46 23  5 16 65  2 41  2  7  1 37  0 11 16  2 21

      Invariants:
        1. every third cell (in pink), starting with cell 0, is ≡ 1 mod 5;
        2. cells 2 and 5 (green) hold the values 1 and 5, respectively;
        3. every third cell (in blue), starting with cell 1, is ≡ 2 mod 7;
        4. cells 8 and 11 (yellow) hold the values 2 and 7, respectively.
      You can update a pink element as often as you want, with any value you want, as long as you ensure that the value is always ≡ 1 mod 5!

  47.
          int g[] = {36, 58, 1, 46, 23, 5, 16, 65, 2, 41,
                     2, 7, 1, 37, 0, 11, 16, 2, 21, 16};

          /* pink: opaquely true predicate */
          if ((g[3] % g[5]) == g[2]) printf("true!\n");

          /* blue: g is constantly changing at runtime */
          g[5] = (g[1]*g[4]) % g[11] + g[6] % g[5];
          g[14] = rand();
          g[4] = rand()*g[11] + g[8];

          /* green: an opaque value 42 */
          int six = (g[4] + g[7] + g[10]) % g[11];
          int seven = six + g[3] % g[5];
          int fortytwo = six * seven;

      Initialize g at runtime!
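The array invariants can be exercised with a small self-contained C sketch. The function names are hypothetical, and bounding the random multiplier with `% 1000` is an assumption added here so the updates cannot overflow `int`; the slide's own code multiplies by `rand()` directly.

```c
#include <assert.h>
#include <stdlib.h>

static int g[20] = {36, 58, 1, 46, 23, 5, 16, 65, 2, 41,
                    2, 7, 1, 37, 0, 11, 16, 2, 21, 16};

/* Overwrite cells while keeping every invariant intact. */
void churn(void) {
    g[3]  = (rand() % 1000) * 5 + 1;        /* stays congruent to 1 mod 5 */
    g[4]  = (rand() % 1000) * g[11] + g[8]; /* stays congruent to 2 mod 7 */
    g[14] = rand();                         /* unconstrained cell */
}

/* Opaquely true: (value congruent to 1 mod 5) % 5 == 1. */
int opaque_true(void) {
    return (g[3] % g[5]) == g[2];
}

/* Opaque value 42, computed as 6 * 7 from the invariants. */
int opaque_42(void) {
    assert(g[11] == 7 && g[5] == 5);
    int six = (g[4] + g[7] + g[10]) % g[11];
    int seven = six + g[3] % g[5];
    return six * seven;
}
```

However often `churn()` runs, `opaque_true()` returns 1 and `opaque_42()` returns 42, while a static analysis sees array cells that are overwritten with random data.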

  48. obfLDK: Jumps through branch functions
      Replace unconditional jumps with a call to a branch function.
      Calls normally return to where they came from... But a branch function returns to the target of the jump!
      [Figure: the instruction `jmp b` is replaced by `call bf`, whose return address is a. The branch function bf() returns to T[h(a)] + a, where the table entry is T[h(a)] = b − a; further entries T[h(...)] = ... cover the other replaced jumps.]
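The address arithmetic behind the branch function can be checked without touching a real return address. The sketch below models addresses as plain integers; `register_jump`, `branch_target`, the table size, and the placeholder hash `h` are all assumptions made for illustration (a real implementation would need a collision-free hash over the actual return addresses).

```c
#include <assert.h>
#include <stdint.h>

#define TSIZE 8
static uintptr_t T[TSIZE];  /* T[h(a)] = b - a, stored with wraparound */
static int used[TSIZE];

/* Placeholder hash for this sketch only. */
static unsigned h(uintptr_t a) {
    return (unsigned)((a >> 2) % TSIZE);
}

/* Record that the call returning to address a should continue at b. */
void register_jump(uintptr_t a, uintptr_t b) {
    assert(!used[h(a)]);  /* the sketch does not handle collisions */
    used[h(a)] = 1;
    T[h(a)] = b - a;
}

/* What the branch function computes from its return address a. */
uintptr_t branch_target(uintptr_t a) {
    return a + T[h(a)];
}
```

Because the table stores offsets rather than absolute targets, a disassembler that does not model bf() sees only a call followed by whatever bytes sit at the return address.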

  49. obfLDK: Make branches explicit

          int modexp(int y, int x[], int w, int n) {
            int R, L;
            int k = 0;
            int s = 1;
            while (k < w) {
              if (x[k] == 1)
                R = (s*y) % n;
              else
                R = s;
              s = R*R % n;
              L = R;
              k++;
            }
            return L;
          }

  50. obfLDK: Jumps through branch functions
      A table T stores T[h(a_i)] = b_i − a_i.
      The inline assembly (pink on the slide) reads and updates the return address!
      The branch function:

          /* 32-bit x86, GCC inline assembly */
          char* T[2];

          void bf() {
            char* old;
            /* read the return address from the stack frame */
            asm volatile("movl 4(%%ebp),%0\n\t" : "=r" (old));
            char* new = (char*)((int)T[h(old)] + (int)old);
            /* overwrite the return address with the jump target */
            asm volatile("movl %0,4(%%ebp)\n\t" : : "r" (new));
          }

  51.
          int modexp(int y, int x[], int w, int n) {
            int R, L;
            int k = 0;
            int s = 1;
            T[h(&&retaddr1)] = (char*)(&&endif - &&retaddr1);
            T[h(&&retaddr2)] = (char*)(&&beginloop - &&retaddr2);
          beginloop:
            if (k >= w) goto endloop;
            if (x[k] != 1) goto elsepart;
            R = (s*y) % n;
            bf();   // goto endif;
          retaddr1:
            asm volatile(".ascii \"bogus\"\n\t");
          elsepart:
            R = s;
          endif:
            s = R*R % n;
            L = R;
            k++;
            bf();   // goto beginloop;
          retaddr2:
          endloop:
            return L;
          }

  52. obfLDK: Jumps through branch functions
      Designed to confuse disassembly.
      39% of instructions are incorrectly disassembled by a linear-sweep disassembler; 25% by a recursive-traversal disassembler.
      Execution penalty: 13%.
      Increase in text segment size: 15%.

  53. Outline
        1. Introduction
        2. Identifier renaming
        3. Complicating control flow
           - Inserting bogus control-flow
           - Control-flow flattening
           - Opaque values from array aliasing
           - Jumps through branch functions
        4. Opaque Predicates
           - Opaque predicates from pointer aliasing
        5. Data encodings
        6. Dynamic Obfuscation
           - Self-Modifying State Machine
           - Code as key material
        7. Discussion
      Opaque Predicates 45/82

  56. Constructing opaque predicates
      Construct them based on
        - number theoretic results:
            $\forall x, y \in \mathbb{Z} : x^2 - 34y^2 \neq -1$
            $\forall x \in \mathbb{Z} : 2 \mid x^2 + x$
        - the hardness of alias analysis
        - the hardness of concurrency analysis
      Protect them by
        - making them hard to find
        - making them hard to break
      If your obfuscator keeps a table of predicates, your adversary will too!

  59. Algorithm obfCTJ alias: Opaque predicates from pointer aliasing
      Create an obfuscating transformation from a known computationally hard static analysis problem.
      We assume that
        1. the attacker will analyze the program statically, and
        2. we can force him to solve a particular static analysis problem to discover the secret he's after, and
        3. we can generate an actual hard instance of this problem for him to solve.
      Of course, these assumptions may be false!

  64. Algorithm obfCTJ alias
      Construct one or more heap-based graphs, keep pointers into those graphs, and create opaque predicates by checking properties you know to be true.
      q1 and q2 point into two graphs G1 (pink) and G2 (blue).
      [Figure: four operations on the graphs, labeled split, insert, delete, and move, each showing the positions of q1 and q2 before and after.]

  66. Algorithm obfCTJ alias
      Two invariants:
        - "G1 and G2 are circular linked lists"
        - "q1 points to a node in G1 and q2 points to a node in G2."
      Perform enough operations to confuse even the most precise alias analysis algorithm,
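A compact C sketch of the two invariants, assuming singly linked circular lists. All names are hypothetical, and only the insert and move operations are modeled, not split or delete. Since q1 only ever walks G1 and q2 only ever walks G2, the comparison `q1 != q2` is opaquely true.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Node { struct Node *next; } Node;

/* A one-node circular list. */
static Node *make_ring(void) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->next = n;
    return n;
}

/* Insert a fresh node after p; the list stays circular. */
static void insert_after(Node *p) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->next = p->next;
    p->next = n;
}

/* Move a pointer forward; it can never leave its own ring. */
static Node *move(Node *q, unsigned steps) {
    while (steps--)
        q = q->next;
    return q;
}

static void free_ring(Node *q) {
    Node *p = q->next;
    while (p != q) {
        Node *t = p->next;
        free(p);
        p = t;
    }
    free(q);
}

/* Churn both rings for a while, then evaluate the predicate q1 != q2. */
int opaque_distinct(unsigned rounds) {
    Node *q1 = make_ring();  /* points into G1 */
    Node *q2 = make_ring();  /* points into G2 */
    for (unsigned i = 0; i < rounds; i++) {
        insert_after((i % 2) ? q1 : q2);
        q1 = move(q1, i % 3);
        q2 = move(q2, (i + 1) % 4);
    }
    int result = (q1 != q2);
    free_ring(q1);
    free_ring(q2);
    return result;
}
```

To refute the predicate statically, an analysis has to prove that no sequence of these heap updates can make the two pointers alias.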
