SLIDE 1

CODE OBFUSCATION

DEFENSE STRATEGIES

Roberto Giacobazzi
Dipartimento di Informatica, Università degli Studi di Verona, Italy

ASP 2009

Ingegneria e Scienze Informatiche – Verona – p.1/74

SLIDE 2

THE SOURCE

Most of the slides are taken from:

- Ch. 4, 5 and 6 of Christian Collberg and Jasvir Nagra, Surreptitious Software: Obfuscation, Watermarking, and Tamperproofing for Software Protection. Addison-Wesley Professional, 2010, 792 pp. ISBN-10: 0321549252, ISBN-13: 9780321549259.
- Roberto Giacobazzi. Hiding Information in Completeness Holes. The 6th IEEE International Conference on Software Engineering and Formal Methods, SEFM’08, pages 7-20, IEEE Press.

SLIDE 3

DEFENSE STRATEGIES

Actions in Frames: ... let us see the Bufo marinus

- One or more slots: name = value
- Slots may contain other slots
- Conditional actions ⇒ can take place

SLIDE 4

DEFENSE STRATEGIES: THE PRIMITIVES

We define 10 primitives which can be composed to design a generic defense strategy:

- Cover
- Duplicate
- Split and Merge
- Reorder
- Map
- Indirect
- Mimic
- Advertise
- Detect/Respond
- Dynamic

SLIDE 5

DEFENSE STRATEGIES: Cover

Given two frames x and y, make x an element of y.

- Cover can be applied multiple times to hide information in an inner level
- Hiding = covering = obscuring
- Typical examples:
  - hiding keys or SW in hardened boxes (military)
  - hiding watermarks in standard data structures
  - hiding watermarks in images of media

SLIDE 6

DEFENSE STRATEGIES: Duplicate

Given a frame x, create a deep copy of x (keeping names unique).

- Idea 1: Copy as decoy: make the universe larger and harder to scan
- Idea 2: Copy as reduplication: make the universe larger, full of your copies (signatures)
- Typical examples:
  - reduplication for protecting its own DNA
  - dummy targets for confusing adversaries: int THE_WATERMARK_IS_HERE = 666
- Obfuscate f by f′ := Duplicate(f), f′′ := Obf(f′), and call f and f′′

SLIDE 7

DEFENSE STRATEGIES: Split & Merge

Given a predicate π and a frame z, Split creates a new frame z′ such that z′ has all the properties of z and ∀x ∈ z′. π(x). Merge is set union of frames (and related properties).

- Typically used in combination: take two functions f and g with Split(f) = (f1, f2) and Split(g) = (g1, g2), then:

    fg = Merge(f1, g2) = λx, y. if x = 1 then f1(y) else g2(y)
    gf = Merge(f2, g1) = λx, y. if x = 2 then f2(y) else g1(y)

    call f(y)  →  x = 1; fg(x, y); x++; gf(x, y)
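A minimal Python sketch of the Split/Merge combination. The halves `f1`, `f2`, `g1`, `g2`, the `merge` helper and the `trace` list are illustrative assumptions, not taken from the slides; each half only records that it ran. The encoding of the call to g is also an assumption, chosen so that the selector never matches f's halves.

```python
# Hypothetical halves produced by Split(f) = (f1, f2) and Split(g) = (g1, g2).
def f1(trace): trace.append("f1")
def f2(trace): trace.append("f2")
def g1(trace): trace.append("g1")
def g2(trace): trace.append("g2")

def merge(first, second, selector):
    """Merge(first, second) = λx, y. if x = selector then first(y) else second(y)."""
    def merged(x, y):
        if x == selector:
            first(y)
        else:
            second(y)
    return merged

fg = merge(f1, g2, 1)   # fg = Merge(f1, g2)
gf = merge(f2, g1, 2)   # gf = Merge(f2, g1)

def call_f(trace):
    """Replacement for `call f(y)`: x = 1; fg(x, y); x++; gf(x, y)."""
    x = 1
    fg(x, trace)        # x == 1 selects f1
    x += 1
    gf(x, trace)        # x == 2 selects f2
    return trace

def call_g(trace):
    """Assumed encoding of `call g(y)`: a selector that matches neither 1 nor 2."""
    x = 0
    gf(x, trace)        # x != 2 selects g1
    fg(x, trace)        # x != 1 selects g2
    return trace

print(call_f([]), call_g([]))
```

Neither `fg` nor `gf` corresponds to an original function any more, so an analyst has to track the value of `x` to recover which behaviour a given call site triggers.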

SLIDE 8

DEFENSE STRATEGIES: Reorder

Given a frame z and a permutation function f, reorder the elements of z according to f.

- Used in early SW watermarking, e.g., by reordering basic blocks in the CFG
- Used in code obfuscation and metamorphism by reordering basic blocks in the CFG

SLIDE 9

DEFENSE STRATEGIES: Map

Given a frame x and a function f, replace every element e in x with f(e).

- Implements security-through-obscurity: translation, crypto, etc.
- Protect confidentiality by name obfuscation (translation): variables, functions, data
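As a small illustration of Map used for name obfuscation, the following Python sketch applies a renaming function to every token of a statement. The tokenizer, the `make_renamer` helper and the sample statement are assumptions made for this example only.

```python
import re

def make_renamer(identifiers):
    """Build f: a fixed translation table from meaningful names to opaque ones."""
    table = {name: f"v{i}" for i, name in enumerate(sorted(identifiers))}
    return lambda token: table.get(token, token)

def map_frame(elements, f):
    """Map primitive: replace every element e with f(e)."""
    return [f(e) for e in elements]

source = "total = price * quantity + tax"
tokens = re.findall(r"\w+|\S", source)            # crude tokenizer, enough for the demo
rename = make_renamer({"total", "price", "quantity", "tax"})
obfuscated = " ".join(map_frame(tokens, rename))
print(obfuscated)
```

The structure of the statement is untouched; only the information carried by the names is lost, which is why the slides later call identifier renaming "too easy".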

SLIDE 10

DEFENSE STRATEGIES: Indirect

Given a frame x, add an indirect reference r to x.

- The cost of following pointers makes the analysis harder!!
- Combined with Map you can hide references.

SLIDE 11

DEFENSE STRATEGIES: Mimic

Given two frames x and y, where x holds a property prop, copy prop into y.

- Common in wildlife (animals)
- Fundamental for stealth: the new (watermarked) code must resemble standard code. Here is a static watermarking:

[Figure: static watermarking example; not recoverable from the extracted text]

SLIDE 12

DEFENSE STRATEGIES: Advertise

Given a frame x with a property prop, add a property advertise to x with value prop.

- Opposite to security-through-obscurity
- Openly display a situation in order to discourage attacks
- Example of false advertising: say that P is watermarked when it is not!

SLIDE 13

DEFENSE STRATEGIES: Detect/Respond

Given two frames x and y, add a demon to x that executes action A if event E happens to y.

- Typical in tamper-proofing: Duplicate → Map → Detect/Respond

SLIDE 14

DEFENSE STRATEGIES: Dynamic

Iterate a primitive f over a frame x, generating a (finite or infinite) sequence of frames:

    f(x) → f(f(x)) → f(f(f(x))) → ...

where f ∈ {Cover, Duplicate, Split, Merge, Reorder, Map, Indirect, Mimic, Advertise, Detect/Respond}

- Dynamic change for confusing the attacker: polymorphism and metamorphism
- Useful in polymorphic malware: dynamically decrypt (map) chunks of code, execute, re-encrypt (map)

SLIDE 15

THE PROBLEM: OBFUSCATION VS DIVERSITY

F. Cohen, Operating systems protection through program evolution, 1993

- Generate syntactically different, semantically equivalent programs
- P and Q are syntactically different if $P \neq Q$
- P and Q are semantically equivalent if $P \equiv Q$, i.e.,
  - if $P(x)\downarrow$ and $Q(x)\downarrow$ then $P(x) = Q(x)$
  - if $P(x)\uparrow$ or $Q(x)\uparrow$ then $P(x)\uparrow$ and $Q(x)\uparrow$
- Given a program P: $D(P) \stackrel{\mathrm{def}}{=} \{\, Q \mid P \equiv Q \wedge P \neq Q \,\}$ is a non-recursive set!

SLIDE 16

LAYOUT OBFUSCATION

Standard and easy methods for making your code diverse:

- Change your code by substituting equivalent expressions: y = x * 42 ⇒ y = x << 5; y += x << 3; y += x << 1;
- Reordering code: break locality, which is a typical principle used in reverse engineering!
- Identifier renaming: ... that is really too easy!!!
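The expression substitution works because 42 = 32 + 8 + 2, so x * 42 = (x << 5) + (x << 3) + (x << 1). A quick Python check of that equivalence follows; the function names are illustrative, and Python's unbounded integers are assumed, so overflow behaviour in a fixed-width language is not modelled.

```python
def times_42(x):
    return x * 42

def times_42_shifted(x):
    # 42 = 2**5 + 2**3 + 2**1
    y = x << 5
    y += x << 3
    y += x << 1
    return y

# Compare the two versions on a range of inputs, including negatives.
same = all(times_42(x) == times_42_shifted(x) for x in range(-1000, 1001))
print(same)
```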

SLIDE 17

OBFUSCATION BY INTERPRETATION

see Y. Futamura, Partial Evaluation of Computation Process, 1971

Consider two programming languages (abstract machines) S and T with common data.

- Interpreter, for each $\ell \in \{S, T\}$:
  - $int^{S} : S.prog \times data \rightarrow data \cup \{\uparrow\}$, with $int^{S} \in S.prog$
  - $int^{T} : T.prog \times data \rightarrow data \cup \{\uparrow\}$, with $int^{T} \in S.prog$
  - $\forall P \in \ell.prog,\ \forall d \in data : \llbracket P \rrbracket^{\ell}(d) = \llbracket int^{\ell} \rrbracket^{S}(P, d)$
- Code specializer, for each $\ell \in \{S, T\}$:
  - $spec^{S} : S.prog \times data \rightarrow S.prog$, with $spec^{S} \in S.prog$
  - $spec^{T} : S.prog \times data \rightarrow T.prog$, with $spec^{T} \in S.prog$
  - $\forall P \in S.prog,\ \forall d, s \in data : \llbracket P \rrbracket^{S}(s, d) = \llbracket \llbracket spec^{\ell} \rrbracket^{S}(P, s) \rrbracket^{\ell}(d)$
- Idea: S is the source language (open), T is the secret architecture (hidden)

SLIDE 18

OBFUSCATION BY INTERPRETATION

see Y. Futamura, Partial Evaluation of Computation Process, 1971

Obfuscated code for P is code compiled from S to T and back to S:

$\llbracket P \rrbracket^{S}(d) = \llbracket int^{S} \rrbracket^{S}(P, d)$
$= \llbracket \llbracket spec^{T} \rrbracket^{S}(int^{S}, P) \rrbracket^{T}(d)$
$= \llbracket int^{T} \rrbracket^{S}(\llbracket spec^{T} \rrbracket^{S}(int^{S}, P), d)$
$= \llbracket \llbracket spec^{S} \rrbracket^{S}(int^{T}, \llbracket spec^{T} \rrbracket^{S}(int^{S}, P)) \rrbracket^{S}(d)$

$Obf(P) = \llbracket spec^{S} \rrbracket^{S}(int^{T}, \llbracket spec^{T} \rrbracket^{S}(int^{S}, P)) \in S.prog$

- In order to attack Obf(P) you need to understand the relation between S and T
- Obf(P) may run 10-100 times slower!!

SLIDE 19

OBFUSCATOR BY INTERPRETATION

see Y. Futamura, Partial Evaluation of Computation Process, 1971

An obfuscator can be generated by further specializing the specializer by an interpreter:

$Obf(P) = \llbracket spec^{S} \rrbracket^{S}(int^{T}, \llbracket spec^{T} \rrbracket^{S}(int^{S}, P))$
$= \llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \rrbracket^{S}(\llbracket spec^{T} \rrbracket^{S}(int^{S}, P))$
$= \llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \rrbracket^{S}(\llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{T}, int^{S}) \rrbracket^{S}(P))$

$Obf = \llbracket spec^{S} \rrbracket^{S}(spec^{T}, int^{S}) \,;\, \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \in S.prog$

- By modifying T you can generate different obfuscators by interpretation!
- Highly modular but extremely expensive (x100 slowdown by emulation)

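SLIDE 20

OBFUSCATOR GENERATOR BY INTERPRETATION

see Y. Futamura, Partial Evaluation of Computation Process, 1971

A generator of obfuscators can be obtained by further specializing the specializer:

$Obf(P) = \llbracket spec^{S} \rrbracket^{S}(int^{T}, \llbracket spec^{T} \rrbracket^{S}(int^{S}, P))$
$= \llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \rrbracket^{S}(\llbracket spec^{T} \rrbracket^{S}(int^{S}, P))$
$= \llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \rrbracket^{S}(\llbracket \llbracket spec^{S} \rrbracket^{S}(spec^{T}, int^{S}) \rrbracket^{S}(P))$

$Obf = \llbracket spec^{S} \rrbracket^{S}(spec^{T}, int^{S}) \,;\, \llbracket spec^{S} \rrbracket^{S}(spec^{S}, int^{T}) \in S.prog$

- By modifying T you can generate different obfuscators by interpretation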

SLIDE 21

COMBINING OBFUSCATORS: DIVERSIFICATION

Assume you have a set of obfuscation strategies: {T1, ..., Tn}

- Let prob[i] be the probability of applying Ti
- Let threshold_space be the bound on the space of the obfuscated code

    input P;
    while space(P) ≤ threshold_space {
        x ← y ← 0
        a ← random in [0, 1]
        while x ≤ a { y ← y + 1; x ← x + prob[y] };
        apply(Ty, P)
    }
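The loop above is a roulette-wheel selection over the transformations. A hedged Python sketch follows, under these assumptions: a program is modelled as a list of statements, space(P) is its length, and the two toy transformations (`junk`, `dup`) are invented for the example and each grow the program by one statement so that the loop terminates. Indices are 0-based, and an extra bound on `y` guards against rounding when the probabilities do not sum exactly to 1.

```python
import random

def diversify(program, transforms, prob, threshold_space, rng):
    """Apply randomly chosen transformations until the program exceeds the space bound."""
    applied = []
    while len(program) <= threshold_space:          # space(P) <= threshold_space
        a = rng.random()                            # a <- random in [0, 1)
        x, y = 0.0, 0
        while x <= a and y < len(transforms):       # accumulate probabilities up to a
            x += prob[y]
            y += 1
        name, transform = transforms[y - 1]         # T_y in the slide's 1-based indexing
        program = transform(program)
        applied.append(name)
    return program, applied

# Toy transformations, assumptions for illustration only.
transforms = [
    ("junk", lambda p: p + ["nop"]),                # append a no-op statement
    ("dup", lambda p: p + [p[0]]),                  # append a copy of the first statement
]

rng = random.Random(2009)                           # fixed seed for repeatability
result, applied = diversify(["a = 1", "b = 2", "c = a + b"], transforms, [0.7, 0.3], 10, rng)
print(len(result), applied)
```

Different seeds yield different sequences of transformations, hence different variants of the same program.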

SLIDE 22

COMBINING OBFUSCATORS: DIVERSIFICATION II

Assume you have a set of obfuscation strategies: T = {T1, ..., Tn}

- P consists of objects s1, ... (routines, modules, etc.)
- Let prio[si] be the importance of protecting object si
- acceptCost is the maximum execution penalty allowed
- reqObf is the amount of obfuscation required

    input P = s1, ...;
    while !done(acceptCost, reqObf) {
        s ← max in prio;
        t ← selectTrans(s, T);
        apply(t, s);
        update(prio[s])
    };
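One possible reading of this priority-driven loop, sketched in Python. The slides do not define `done`, `selectTrans` or `update`, so the choices below are assumptions: the loop stops once the required obfuscation is reached or no transformation fits the remaining cost budget, the most potent affordable transformation is selected, and the priority of the treated object is halved. Costs and potencies are scaled to integers to avoid floating-point drift.

```python
def diversify_by_priority(prio, transforms, accept_cost, req_obf):
    """Greedy planner: repeatedly protect the highest-priority object."""
    prio = dict(prio)                 # work on a copy
    cost, obf, plan = 0, 0, []
    while obf < req_obf:              # assumed meaning of !done(acceptCost, reqObf)
        s = max(prio, key=prio.get)   # s <- max in prio
        affordable = [t for t in transforms if cost + t["cost"] <= accept_cost]
        if not affordable:
            break                     # cost budget exhausted
        t = max(affordable, key=lambda t: t["potency"])   # assumed selectTrans
        plan.append((t["name"], s))   # stands in for apply(t, s)
        cost += t["cost"]
        obf += t["potency"]
        prio[s] /= 2                  # assumed update(prio[s])
    return plan, cost, obf

# Hypothetical objects and transformations (cost in percent, potency in tenths).
prio = {"license_check": 1.0, "ui": 0.3}
transforms = [
    {"name": "T1", "potency": 10, "cost": 90},
    {"name": "T2", "potency": 5, "cost": 30},
]
plan, cost, obf = diversify_by_priority(prio, transforms, accept_cost=150, req_obf=20)
print(plan, cost, obf)
```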

SLIDE 23

COMBINING OBFUSCATORS: DEPENDENCIES

Assume you have a set of obfuscation strategies: T = {T1, ..., Tn}

- The order in which obfuscations are applied may be relevant!
- Idea: build an FSA whose language describes the acceptable orders of obfuscations.

Example: P = m() with

    Obf.level      0.6
    Perf.critical  0.2

and T = {T1, T2}, where after T1 do T2:

    Transf   Potency   Degradation
    T1       1.0       0.9
    T2       0.5       0.3

SLIDE 24

COMBINING OBFUSCATORS: DEPENDENCIES

[Figure: the FSA G; not recoverable from the extracted text]

- The accepted language of the FSA G is: (B(m) + A(m)+B(m))∗
- The best sequences can be chosen by using weights on edges: ∀t ∈ T, ∀s ∈ P : weigh(t, s) = F(Obf.level(s), Perf.critical(s), Potency(t), Degradation(t)), where $F : [0, 1]^4 \rightarrow [0, 1]$ combines the single weights!
- Goal: maximize the weight along some path!

SLIDE 25

DEFINITIONS

- Let us have some secret σ in P that we want to hide: O(P). What is σ?
  - The source code of P
  - The module organization of P
  - Function names
  - Location of critical functions (e.g., license check)
  - The value of particular data (crypto keys, license, etc.)
- Let $O : \mathbb{P} \rightarrow \mathbb{P}$ be a program transformation. O(P) is an obfuscation of P if:
  - $P \equiv_{\alpha} O(P)$;
  - σ is hidden in O(P).

To understand how well hidden σ is, we need to know how to attack σ.

SLIDE 26

DEFINITIONS

- Let $O : \mathbb{P} \rightarrow \mathbb{P}$ be an obfuscation. O(P) is the obfuscation of $P \in \mathbb{P}$
- Let A be an analysis for the language $\mathbb{P}$:
  - If σ = A(P) then A reveals σ;
  - $P \equiv_{\alpha} O(P)$;

What is the relation between A and O?

- O is effective if $A(O(P)) \neq \sigma$ or A(O(P)) is harder than A(P);
- O is ineffective if $A(O(P)) = \sigma$;
- O is defective if $A(O(P)) = \sigma$ and A(O(P)) is easier than A(P).

A good obfuscator either hides σ with respect to A or makes A harder!

SLIDE 27

DEFINITIONS: POTENCY

- Let $O : \mathbb{P} \rightarrow \mathbb{P}$ be an obfuscation. O(P) is the obfuscation of $P \in \mathbb{P}$
- Let $\{A_1, \ldots, A_n\}$ be a set of program analyses for the language $\mathbb{P}$. O is potent for P if
  - $\exists A_i \in \{A_1, \ldots, A_n\}$ such that O is effective for P and $A_i$;
  - $\forall A_j \in \{A_1, \ldots, A_n\}$ with $i \neq j$, O is not defective for P.

Effectiveness deals with: precision and complexity!!

SLIDE 28

MEASURES OF POTENCY

Typical metric measures:

- Program length: O(P) increases the operators and operands in P
- Cyclomatic complexity: O(P) increases the number of predicates in P
- Nesting complexity: O(P) increases the nesting level of conditionals in P
- Data-flow complexity: O(P) increases the inter-block references
- Fan-in/fan-out complexity: O(P) increases the number of formal parameters and data structures referenced by P
- Data-structure complexity: O(P) increases the complexity of static data structures (arrays and records) declared in P
- OO metric: O(P) increases the number of methods, the depth in the inheritance tree, the number of subclasses, and the methods that can be generated in response to a message sent to an object

Is there a more general notion of potency?

SLIDE 29

ABSTRACT INTERPRETATION

[Cousot & Cousot ’79]

- A program P and a domain of computation for P: C
- Semantic specification (interpreter): $\llbracket P \rrbracket : C \rightarrow C$
- (Approximate) observable properties: $\rho \in uco(C)$
- DERIVE A SOUND APPROXIMATE SPECIFICATION $\llbracket P \rrbracket^{\sharp}$: $\rho(\llbracket P \rrbracket(x)) \leq \llbracket P \rrbracket^{\sharp}(x)$
- THE LIMIT CASE: COMPLETENESS $\rho(\llbracket P \rrbracket(x)) = \llbracket P \rrbracket^{\sharp}(x)$ iff $\rho(\llbracket P \rrbracket(x)) = \rho(\llbracket P \rrbracket(\rho(x)))$

SLIDE 30

COMPLETENESS IN ABSTRACT INTERPRETATION

BACKWARD SOUNDNESS: NO INFORMATION IS LOST BY APPROXIMATING THE INPUT/OUTPUT

$\rho \circ f \leq \rho \circ f \circ \rho$

[Diagram with nodes f(x), ρ(f(x)), ρ(f(ρ(x))) and f♯(ρ(x)), and arrows labelled f and ρ]

SLIDE 31

COMPLETENESS IN ABSTRACT INTERPRETATION

BACKWARD COMPLETENESS: NO LOSS OF PRECISION IS ACCUMULATED BY APPROXIMATING THE INPUT

$\rho \circ f = \rho \circ f \circ \rho$

[Diagram with nodes f(x), ρ(f(x)), ρ(f(ρ(x))) and f♯(ρ(x)), and arrows labelled f and ρ]

SLIDE 32

COMPLETENESS IN ABSTRACT INTERPRETATION

FORWARD COMPLETENESS: NO INFORMATION IS LOST BY APPROXIMATING THE OUTPUT

$f \circ \rho \leq \rho \circ f \circ \rho$

[Diagram with nodes f(x), f(ρ(x)), ρ(f(ρ(x))) and f♯(ρ(x)), and arrows labelled f and ρ]

SLIDE 33

COMPLETENESS IN ABSTRACT INTERPRETATION

FORWARD COMPLETENESS: NO INFORMATION IS LOST BY APPROXIMATING THE OUTPUT

$f \circ \rho = \rho \circ f \circ \rho$

[Diagram with nodes f(x), f(ρ(x)), ρ(f(ρ(x))) and f♯(ρ(x)), and arrows labelled f and ρ]

SLIDE 34

A CLASSICAL EXAMPLE

A SIMPLE EXAMPLE IN INTERVAL ANALYSIS

- A simple domain of intervals: Z, [0, +∞], [0, 10], [0, 2], [0, 0], [−∞, 0]

SLIDE 35

A CLASSICAL EXAMPLE

A SIMPLE EXAMPLE IN INTERVAL ANALYSIS

- A simple domain of intervals: Z, [0, +∞], [0, 10], [0, 2], [0, 0], [−∞, 0]
- $sq(X) = \{\, x^2 \mid x \in X \,\}$
- {Z, [0, +∞], [0, 10]} is Forward but not Backward complete

SLIDE 36

A CLASSICAL EXAMPLE

A SIMPLE EXAMPLE IN INTERVAL ANALYSIS

- A simple domain of intervals: Z, [0, +∞], [0, 10], [0, 2], [0, 0], [−∞, 0]
- $sq(X) = \{\, x^2 \mid x \in X \,\}$
- {Z, [0, +∞], [0, 10]} is Forward but not Backward complete
- {Z, [0, 2], [0, 0]} is Backward but not Forward complete

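SLIDE 37

OBSCURITY BY INCOMPLETENESS

Failing precision means failing completeness! Obfuscating programs is making abstract interpreters incomplete.

- Let $\rho \in uco(\Sigma)$ with Σ semantic objects (data, traces, etc.)
- A program transformation $\tau : \mathbb{P} \rightarrow \mathbb{P}$: $\llbracket P \rrbracket = \llbracket \tau(P) \rrbracket$.
- ρ is B-complete for $\llbracket \cdot \rrbracket$ if $\rho(\llbracket P \rrbracket) = \llbracket P \rrbracket^{\rho}$

τ obfuscates P if $\llbracket P \rrbracket^{\rho} \sqsubset \llbracket \tau(P) \rrbracket^{\rho}$

$\llbracket P \rrbracket^{\rho} \sqsubset \llbracket \tau(P) \rrbracket^{\rho} \iff \rho(\llbracket \tau(P) \rrbracket) \sqsubset \llbracket \tau(P) \rrbracket^{\rho}$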

SLIDE 38

OBSCURITY BY INCOMPLETENESS

Failing precision means failing completeness! Obfuscating programs is making abstract interpreters incomplete.

P : x = a ∗ b

Sign is an obvious abstraction of ℘(Z):

[Diagram: the Sign domain with elements ∅, 0−, 0+ and ℘(Z), abstracting sets such as {−1, −3, −4} and {2, 3, 5}]

SLIDE 39

OBSCURITY BY INCOMPLETENESS

Failing precision means failing completeness! Obfuscating programs is making abstract interpreters incomplete.

P : x = a ∗ b

Sign is an abstraction of ℘(Z):

[Diagram: the Sign domain with elements ∅, 0−, 0+ and ℘(Z), abstracting sets such as {−1, −3, −4} and {2, 3, 5}]

SLIDE 40

OBSCURITY BY INCOMPLETENESS

Failing precision means failing completeness! Obfuscating programs is making abstract interpreters incomplete.

    P:     x = a ∗ b

    τ(P):  x = 0;
           if b ≤ 0 then {a = −a; b = −b};
           while b ≠ 0 {x = a + x; b = b − 1}

- Sign is complete for P:
  - $\llbracket P \rrbracket^{Sign} = \lambda a, b.\ Sign(a * b)$
- Sign is incomplete for τ(P):
  - $\llbracket \tau(P) \rrbracket^{Sign} = \lambda a, b.\ \begin{cases} 0 & \text{if } a = 0 \vee b = 0 \\ \wp(\mathbb{Z}) & \text{otherwise} \end{cases}$
- Is there any way to get τ(P) systematically out of P?
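A small Python experiment illustrating the point, under simplifying assumptions: `sign` is the plain three-valued sign (negative, zero, positive), not the exact Sign lattice of the slides, and the functions `P` and `tau_P` are direct transcriptions of the two programs. It checks that the programs agree on concrete inputs, that the sign of a product is determined by the signs of its factors, and that the sign of a sum is not, which is what an analysis of the loop in τ(P) runs into.

```python
def P(a, b):
    return a * b

def tau_P(a, b):
    """Multiplication by repeated addition, as in the transformed program."""
    x = 0
    if b <= 0:
        a, b = -a, -b
    while b != 0:
        x = a + x
        b = b - 1
    return x

def sign(n):
    """Simplified sign abstraction: -1, 0 or +1."""
    return (n > 0) - (n < 0)

values = range(-6, 7)

# The two programs compute the same function on the tested inputs.
equivalent = all(P(a, b) == tau_P(a, b) for a in values for b in values)

# Sign is precise for multiplication: sign(a * b) depends only on sign(a), sign(b).
mul_determined = all(sign(a * b) == sign(a) * sign(b) for a in values for b in values)

# Sign is imprecise for addition: positive + negative can have any sign.
add_results = {sign(a + x) for a in values for x in values if a > 0 and x < 0}

print(equivalent, mul_determined, sorted(add_results))
```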

SLIDE 41

GENERALIZING DATA-REFINEMENT I

We consider variable splitting:

v ∈ Var(P) is split into v1, v2 such that v1 = f1(v), v2 = f2(v) and v = g(v1, v2), with

    f1(v) = v ÷ 10
    f2(v) = v mod 10
    g(v1, v2) = 10 · v1 + v2

And the interval analysis: ι(x) = [min(x), max(x)]

    P:  v = 0;
        while v < N {v++}

$\llbracket P \rrbracket^{\iota} = \lambda v.\ [0, N]$

SLIDE 42

GENERALIZING DATA-REFINEMENT I

We consider variable splitting:

v ∈ Var(P) is split into v1, v2 such that v1 = f1(v), v2 = f2(v) and v = g(v1, v2), with

    f1(v) = v ÷ 10
    f2(v) = v mod 10
    g(v1, v2) = 10 · v1 + v2

And the interval analysis: ι(x) = [min(x), max(x)]

    τ(P):  v1 = 0; v2 = 0;
           while 10 · v1 + v2 < N {
               v1 = v1 + (v2 + 1) ÷ 10
               v2 = (v2 + 1) mod 10
           };
    c:     v = 10 · v1 + v2

$\llbracket \tau(P); c \rrbracket^{\iota} = \lambda v.\ 10 \odot \left[0, \tfrac{N \ominus [0, 9]}{10}\right] \oplus [0, 9] = \lambda v.\ [0, N] \oplus [0, 9] = \lambda v.\ [0, N + 9]$

Obfuscation induces errors
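The variable-splitting example can be replayed concretely. The Python sketch below transcribes both programs and, as a rough stand-in for a non-relational interval analysis, combines the observed ranges of v1 and v2 independently. This simplified bound is an assumption of the example and is not the exact abstract computation of the slide (it yields a tighter bound than [0, N + 9]), but it shows the same effect: once v is split, the recombined interval overshoots N.

```python
def P(N):
    v = 0
    while v < N:
        v += 1
    return v

def tau_P(N):
    """Split v into v1 = v // 10 and v2 = v % 10; recombine at the end."""
    v1, v2 = 0, 0
    seen_v1, seen_v2 = {0}, {0}
    while 10 * v1 + v2 < N:
        v1 = v1 + (v2 + 1) // 10      # carry into the tens digit (uses the old v2)
        v2 = (v2 + 1) % 10            # increment the units digit
        seen_v1.add(v1)
        seen_v2.add(v2)
    v = 10 * v1 + v2
    # Treat the ranges of v1 and v2 as independent intervals, then apply g.
    bound = (10 * min(seen_v1) + min(seen_v2), 10 * max(seen_v1) + max(seen_v2))
    return v, bound

print(P(25), tau_P(25))
```

For N = 25 both programs end with v = 25, yet the independently combined ranges only give v ∈ [0, 29].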

SLIDE 43

GENERALISING DATA-REFINEMENT II

We consider array splitting for weakening the invariant of Fibonacci's:

$Inv = 2 \leq i \leq N \wedge \forall j \in [2, i].\ a[j] = a[j-1] + a[j-2]$

The invariant Inv can be generated by relational interval-Fib analysis:

- $\eta = \alpha^{+} \circ \alpha$ where
- $\alpha(X) = \begin{cases} Fib & \text{if } \forall \langle S, x \rangle \in X.\ S \subseteq D_x \wedge \big( (S = \{0\} \wedge x[0] = 0) \vee (S = \{0, 1\} \wedge x[0] = 0 \wedge x[1] = 1) \vee (\forall j \in S.\ x[j] = x[j-1] + x[j-2]) \big) \\ Any & \text{otherwise} \end{cases}$
- $I \rightarrow Fib$ represents Fibonacci's sequences until max(I)
- $I \rightarrow Any$ represents any array with domain including I (no overflow)
- $[n, m] \rightarrow Fib = [n, m-1] \rightarrow Fib \oplus [n, m-2] \rightarrow Fib$

SLIDE 44

GENERALISING DATA-REFINEMENT II

We consider array splitting for weakening the invariant of Fibonacci's:

$Inv = 2 \leq i \leq N \wedge \forall j \in [2, i].\ a[j] = a[j-1] + a[j-2]$

    P:  a[0] = 0; a[1] = 1; i = 2;
        while i ≤ N {
            a[i] = a[i − 1] + a[i − 2];
            i++
        }

$\llbracket P \rrbracket^{\iota\, \iota \rightarrow \eta} = a \in [0, N] \rightarrow Fib \wedge i \in [2, N + 1]$

SLIDE 45

GENERALISING DATA-REFINEMENT II

We consider array splitting for weakening the invariant of Fibonacci's:

$Inv = 2 \leq i \leq N \wedge \forall j \in [2, i].\ a[j] = a[j-1] + a[j-2]$

    τ(P):  b[0] = 0; c[0] = 1; i = 2;
           while i ≤ N {
               if i mod 2 == 0
                   {b[i ÷ 2] = c[(i − 1) ÷ 2] + b[(i − 2) ÷ 2]}
                   {c[i ÷ 2] = b[i ÷ 2] + c[(i − 2) ÷ 2]};
               i++
           }

$\llbracket \tau(P) \rrbracket^{\iota\, \iota \rightarrow \eta} = b, c \in [0, N \div 2] \rightarrow Any \wedge i \in [2, N + 1]$
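To see that the split program still computes Fibonacci numbers, the following Python sketch runs both versions and checks that b holds the even-indexed and c the odd-indexed entries of a. The function names are illustrative, and N ≥ 1 is assumed.

```python
def fib_P(N):
    """Original program: a[i] = a[i-1] + a[i-2]."""
    a = [0] * (N + 1)
    a[0], a[1] = 0, 1
    i = 2
    while i <= N:
        a[i] = a[i - 1] + a[i - 2]
        i += 1
    return a

def fib_tau_P(N):
    """Array a split into b (even indices) and c (odd indices)."""
    b = [0] * (N // 2 + 1)
    c = [0] * (N // 2 + 1)
    b[0], c[0] = 0, 1
    i = 2
    while i <= N:
        if i % 2 == 0:
            b[i // 2] = c[(i - 1) // 2] + b[(i - 2) // 2]
        else:
            c[i // 2] = b[i // 2] + c[(i - 2) // 2]
        i += 1
    return b, c

a = fib_P(10)
b, c = fib_tau_P(10)
print(a)
print(b, c)
```

The recurrence a[j] = a[j−1] + a[j−2] no longer appears over a single array: each update reads from both b and c, which is what defeats an analysis looking for the Fib pattern in one array.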

SLIDE 46

GENERALISING DATA-REFINEMENT II

We consider array splitting for weakening the invariant of Fibonacci's:

$Inv = 2 \leq i \leq N \wedge \forall j \in [2, i].\ a[j] = a[j-1] + a[j-2]$

The analysis of τ(P) is unable to get Inv back!!

SLIDE 47

STANDARD OBFUSCATION

1. Abstraction transformations: Destroy module structure, classes, functions, etc.!
2. Data transformations: Replace data structures with new representations!
3. Control transformations: Destroy if-, while-, repeat-, etc.!
4. Dynamic transformations: Make the program change at runtime!

SLIDE 48

OPAQUE PREDICATES

- Simply put: an expression whose value is known to you as the defender (at obfuscation time) but which is difficult for an attacker to figure out
- Notation:
  - $P^T$ for an opaquely true predicate
  - $P^F$ for an opaquely false predicate
  - $P^?$ for an unknown predicate

[Figure: branch diagrams for $P^T$, $P^F$ and $P^?$, each with a true and a false exit]

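SLIDE 49

OPAQUE PREDICATES

Examples of opaque predicates from number theory:

- $\forall x, y \in \mathbb{Z} : 7y^2 - 1 \neq x^2$
- $\forall x \in \mathbb{Z} : 2 \mid (x + x^2)$
- $\forall x \in \mathbb{Z} : 3 \mid (x^3 - x)$
- $\forall n \in \mathbb{Z}^+, x, y \in \mathbb{Z} : (x - y) \mid (x^n - y^n)$
- $\forall n \in \mathbb{Z}^+, x, y \in \mathbb{Z} : 2 \mid n \vee (x + y) \mid (x^n + y^n)$
- $\forall n \in \mathbb{Z}^+, x, y \in \mathbb{Z} : 2 \nmid n \vee (x + y) \mid (x^n - y^n)$
- $\forall x \in \mathbb{Z}^+ : 9 \mid (10^x + 3 \cdot 4^{x+2} + 5)$
- $\forall x \in \mathbb{Z} : 3 \mid (7x - 5) \Rightarrow 9 \mid (28x^2 - 13x - 5)$
- $\forall x \in \mathbb{Z} : 5 \mid (2x - 1) \Rightarrow 25 \mid (14x^2 - 19x - 19)$
- $\forall x, y, z \in \mathbb{Z} : (2 \nmid x \wedge 2 \nmid y) \Rightarrow x^2 + y^2 \neq z^2$
- $\forall x \in \mathbb{Z}^+ : 14 \mid (3 \cdot 7^{4x+2} + 5 \cdot 4^{2x-1} - 5)$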
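Several of these predicates can be sanity-checked by brute force over a small range. The Python sketch below does so for a selection of them, encoded in the form in which they hold (for instance with ≠ in $7y^2 - 1 \neq x^2$, and with odd x and y in the sum-of-squares predicate). The helper names and the tested ranges are arbitrary choices for this example, and passing the check is of course not a proof.

```python
from math import isqrt

def divides(d, m):
    """d | m, with the convention that 0 divides only 0."""
    return m == 0 if d == 0 else m % d == 0

def is_square(n):
    return n >= 0 and isqrt(n) ** 2 == n

R = range(-12, 13)      # tested integers
POS = range(1, 8)       # tested positive exponents

checks = {
    "7y^2 - 1 != x^2": all(not is_square(7 * y * y - 1) for y in R),
    "2 | x + x^2": all(divides(2, x + x * x) for x in R),
    "3 | x^3 - x": all(divides(3, x ** 3 - x) for x in R),
    "(x - y) | x^n - y^n": all(
        divides(x - y, x ** n - y ** n) for n in POS for x in R for y in R),
    "2 | n or (x + y) | x^n + y^n": all(
        n % 2 == 0 or divides(x + y, x ** n + y ** n) for n in POS for x in R for y in R),
    "9 | 10^x + 3*4^(x+2) + 5": all(
        divides(9, 10 ** x + 3 * 4 ** (x + 2) + 5) for x in POS),
    "3 | 7x - 5 => 9 | 28x^2 - 13x - 5": all(
        not divides(3, 7 * x - 5) or divides(9, 28 * x * x - 13 * x - 5) for x in R),
    "5 | 2x - 1 => 25 | 14x^2 - 19x - 19": all(
        not divides(5, 2 * x - 1) or divides(25, 14 * x * x - 19 * x - 19) for x in R),
    "x, y odd => x^2 + y^2 not a square": all(
        not is_square(x * x + y * y) for x in R for y in R if x % 2 and y % 2),
}

for name, holds in checks.items():
    print(f"{name}: {holds}")
```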

SLIDE 50

OPAQUE PREDICATE INSERTION

The simplest block splitting!

[Figure: a basic block split by the opaque predicate $P^T$]

Attack: Crack the predicate $P^T$

SLIDE 51

OPAQUE PREDICATE INSERTION

The green block can be bogus!

[Figure: a basic block split by the opaque predicate $P^T$, with a bogus block on the false branch]

Attack: Crack the predicate $P^T$

SLIDE 52

OPAQUE PREDICATE INSERTION

The blue and green blocks should be semantically equivalent!

[Figure: two blocks selected by the unknown predicate $P^?$]

Attack: Analyse semantic equivalence!
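A hedged Python sketch of the two insertion patterns. The function `original`, the extra input `k` and the bogus block are invented for illustration; the opaquely true predicate reuses the fact that k² + k is always even, and the unknown predicate is an arbitrary test on k guarding two equivalent blocks.

```python
def original(x):
    return 2 * x + 1

def with_opaque_true(x, k):
    """P^T insertion: the else branch is bogus and never executed."""
    if (k * k + k) % 2 == 0:      # opaquely true: k^2 + k is even for every integer k
        return 2 * x + 1
    else:
        return x - 42             # bogus block

def with_opaque_unknown(x, k):
    """P^? insertion: both branches must be semantically equivalent."""
    if k % 3 == 0:                # may be true or false at run time
        return 2 * x + 1
    else:
        return x + x + 1          # different syntax, same result

inputs = [(x, k) for x in range(-20, 21) for k in range(-20, 21)]
same_true = all(with_opaque_true(x, k) == original(x) for x, k in inputs)
same_unknown = all(with_opaque_unknown(x, k) == original(x) for x, k in inputs)
print(same_true, same_unknown)
```

In the first pattern the attacker has to prove the predicate constant; in the second, the predicate genuinely varies, so the attacker has to prove the two blocks equivalent instead.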

SLIDE 53

OPAQUE PREDICATE INSERTION

By modifying loop conditions, P becomes $P \wedge P^T$

[Figure: a loop guarded by P, and the same loop guarded by P followed by $P^T$]

Attack: Crack the predicate $P^T$

SLIDE 54

OPAQUE PREDICATE INSERTION

Making loops irreducible by jumping into a loop!

[Figure: a loop guarded by P, and the same loop with an extra jump into its body guarded by $P^F$]

Attack: Crack the predicate $P^F$

SLIDE 55

OPAQUE PREDICATE ATTACK

Making obfuscated loops reducible by code replication!

[Figure: the irreducible loop guarded by P and $P^F$, and its reducible version obtained by replicating code]

Attack: Very hard to get due to exponential blowup!

L. Carter, J. Ferrante, C. Thomborson. Folklore Confirmed: Reducible Flow Graphs are Exponentially Larger, POPL 2003

SLIDE 56

OPAQUE PREDICATE ATTACK

G is reducible if $G \Rightarrow^{*}_{T_1/T_2} \odot$, i.e., if repeated application of the transformations T1 and T2 collapses G to a single node.

[Figure: the transformations T1 and T2]

Attack: Very hard to get due to exponential blowup!

SLIDE 57

OPAQUE PREDICATE ATTACK

    c: if (cond) goto b
    a: x = ...
    b: y = ...
       goto a

[Figure: control-flow graph of the code above]

Attack: Typical decompiler attack!

SLIDE 58

OPAQUE PREDICATE ATTACK

    c: if (cond) then y = ...
       while (T) {
           x = ...
           y = ...
       }

[Figure: control-flow graph of the code above, with the block y = ... replicated]

Attack: Typical decompiler attack!
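The T1/T2 reducibility test can be sketched in a few lines of Python, using the standard reading of the two transformations: T1 removes a self-loop, and T2 merges a non-entry node that has a unique predecessor into that predecessor. The edge lists below are this example's own encoding of the goto version and of the replicated while version of the code (with `b0` standing for the replicated block); all nodes are assumed reachable from the entry.

```python
def is_reducible(edges, entry):
    """Return True if T1/T2 transformations collapse the flow graph to one node."""
    nodes = {entry} | {n for edge in edges for n in edge}
    succ = {n: set() for n in nodes}
    for u, v in edges:
        succ[u].add(v)

    changed = True
    while changed and len(succ) > 1:
        changed = False
        # T1: remove self-loops.
        for n in succ:
            if n in succ[n]:
                succ[n].discard(n)
                changed = True
        # T2: merge one node with a unique predecessor into that predecessor.
        for n in list(succ):
            if n == entry:
                continue
            preds = [p for p in succ if n in succ[p]]
            if len(preds) == 1:
                p = preds[0]
                succ[p].discard(n)
                succ[p] |= succ.pop(n)
                changed = True
                break
    return len(succ) == 1

# goto version: c -> b (jump), c -> a (fall through), a -> b, b -> a
irreducible = [("c", "b"), ("c", "a"), ("a", "b"), ("b", "a")]
# while version: the block b is replicated as b0 before the loop
replicated = [("c", "b0"), ("c", "a"), ("b0", "a"), ("a", "b"), ("b", "a")]

print(is_reducible(irreducible, "c"), is_reducible(replicated, "c"))
```

In the goto version both a and b have two predecessors, so neither transformation applies; replicating b gives every merge step a unique predecessor to work with.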

SLIDE 59

OPAQUE PREDICATE ATTACK

A leftist tree T is simply a rooted binary tree.

$\pi = p_1 p_2 \ldots p_k$ is a leftist leaf sequence of T if $\exists T_i$ subtree of T such that $p_1 \in T_i$ and $p_{i+1} = left_l(T_i) \vee p_{i+1} = left_r(T_i)$

[Figure: example leftist tree]

- n0 n2 n2 n1 n0 n1 is ok
- n0 n3 is wrong!

Idea: Associate a leftist tree with reducible graphs!

Fact: let $\Gamma \subseteq \Sigma$; T is Γ-complete if $\forall \sigma \in \Gamma^{*}$ : σ is a suffix of L(T). If $|\Gamma| = n$, the minimal number of leaves of a Γ-complete tree is $2^{n-1}$.

SLIDE 60

EXPONENTIAL BLOWUP

Theorem: If R is a reducible graph equivalent to a graph with n nodes (with labels in Γ) with initial label c and $L(R) = c\,\Gamma^{*}$, then R must have at least $2^{|\Gamma| - 1}$ nodes.

Proof: Associate a leftist tree with reducible graphs! The tree corresponds to the basic transformations required for reducibility.

[Figure: sequence of T1 and T2 reduction steps and the associated tree; not recoverable from the extracted text]

SLIDE 61

OPAQUE PREDICATES AND EXPONENTIAL BLOWUP

- Note: complete flow graphs have no equivalent reducible flow graph that is not exponential in size
- Consequence:
  - No node splitting technique can avoid this exponential blowup.
  - Automatic compiler analyses would find handling such large programs difficult
- Distribute code whose flow graph is irreducible!! ⇒ exponential number of opaque predicates!!

L. Carter, J. Ferrante, C. Thomborson. Folklore Confirmed: Reducible Flow Graphs are Exponentially Larger, POPL 2003

SLIDE 62

OPAQUE PREDICATES FROM POINTER ALIASING

- Generate opaque predicates by spurious aliases
- Potency measured by the complexity of static analysis:
  - 1-level aliasing is easy: P [Banning ’79]
  - ≥ 2-level aliasing is hard: NP [Horowitz ’97]
  - with dynamic memory allocation it is undecidable!!
- Understanding control flow = solving a ≥ 2-level aliasing problem

C. Collberg, C. Thomborson, D. Low. Manufacturing cheap, resilient, and stealthy opaque predicates, POPL 1998

SLIDE 63

OPAQUE PREDICATES FROM POINTER ALIASING

Input: a program P

1. P := P ∪ Rb, where Rb builds a dynamically allocated pointer structure G = {G1, G2, ...}
2. P := P ∪ Rp, where Rp adds pointers Q = {q1, q2, ...} which point to structures in G
3. Construct a series of invariants I = {I1, I2, ...} over G and Q such that
   - qj always points to Gj
   - Gi is always strongly connected
4. P := P ∪ Rm, where Rm occasionally modifies the structures in G and the pointers in Q while keeping the invariants I
5. Using I, construct opaque predicates over Q such that:
   - $(q_i \neq q_j)^T$ if qi and qj are known to point into different graphs
   - $(q_i \neq null)^T$ if qi is inside Gi and Gi has no leaf nodes
   - $(q_i = q_j)^?$ if qi and qj are known to both point into Gk

SLIDE 64

OPAQUE PREDICATES FROM POINTER ALIASING

- Two invariants:
  1. G1 and G2 are circular linked lists
  2. q1 points to a node in G1 and q2 points to a node in G2.
- Perform enough operations to confuse even the most precise alias analysis algorithm
- Insert opaque queries such as $(q_1 \neq q_2)^T$ into the code.
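A toy Python model of the two-ring construction. The `Node` class, the helper names and the random mix of operations are assumptions for illustration; the only operations performed are ones that keep both invariants (each structure stays a circular list, and each pointer stays inside its own ring), so the query q1 ≠ q2 is true by construction.

```python
import random

class Node:
    def __init__(self):
        self.next = self              # a single node is a circular list of length one

def insert_after(node):
    """Grow a ring by one node, keeping it circular."""
    new = Node()
    new.next = node.next
    node.next = new
    return new

def build_ring(size):
    head = Node()
    for _ in range(size - 1):
        insert_after(head)
    return head

def scramble(q1, q2, steps, rng):
    """Invariant-preserving updates: move a pointer along its ring or grow a ring."""
    for _ in range(steps):
        op = rng.choice(("move1", "move2", "grow1", "grow2"))
        if op == "move1":
            q1 = q1.next
        elif op == "move2":
            q2 = q2.next
        elif op == "grow1":
            insert_after(q1)
        else:
            insert_after(q2)
    return q1, q2

def protected(x, q1, q2):
    if q1 is not q2:                  # opaquely true: q1 and q2 live in different rings
        return x + 1
    return x - 1000                   # bogus branch, never taken

rng = random.Random(0)
q1, q2 = scramble(build_ring(3), build_ring(4), 200, rng)
print(q1 is not q2, protected(41, q1, q2))
```

A static analysis that cannot prove the two rings disjoint has to assume that q1 and q2 may alias, and therefore cannot discard the bogus branch.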

SLIDE 65

OPAQUE PREDICATES FROM POINTER ALIASING

[Figure: pointer structures with the pointers q1 and q2 across successive updates; not recoverable from the extracted text]

SLIDE 66

A MEASURE OF POTENCY

Breaking $\forall x \in \mathbb{Z},\ 2 \mid (x^2 + x)$ in the SPECint2000 benchmark:

- The brute force attack took 8.83 seconds to detect only 1 opaque predicate. Example: $(x^3 - x) \bmod 3 = 0$
- The abstract interpretation-based de-obfuscation attack took 8.13 seconds to de-obfuscate 66176 opaque predicates in a 32-bit environment.

SLIDE 67

WHAT WE HAVE LEARNED SO FAR?

Dalla Preda & Giacobazzi. Semantic-based Code Obfuscation by Abstract Interpretation. J. of Computer Security 2009

- Attacking is refining the abstract interpreter towards completeness
  - A good measure of potency
  - We know how to do it!!!
    - By abstract domain refinement
    - By spurious counterexample elimination
    - By dynamic analysis
- Obfuscating is making (abstract) interpreters incomplete
  - Is there any systematic way to do it?

SLIDE 68

OBFUSCATION BY PROGRAM TRANSFORMATION

Cousot & Cousot, Systematic Design of Program Transformation Frameworks by Abstract Interpretation. POPL ’02

[Diagram: the syntactic transformation τ takes the subject program P to the transformed program τ⟦P⟧; S maps programs to semantics and p maps semantics back to programs; the semantic transformation t takes the subject program semantics S⟦P⟧ to the transformed program semantics, with t[S⟦P⟧] ⊑ S⟦τ⟦P⟧⟧]

Syntactic transformation: τ = p∘t∘S

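SLIDE 69

THE GEOMETRY OF SEMANTICS TRANSFORMERS

MAKING SEMANTICS COMPLETE (FROM ABOVE AND BELOW):

$F^{\uparrow}_{\eta,\rho}(f) = \bigsqcap \{\, h : C \rightarrow C \mid f \sqsubseteq h,\ \rho \circ h \circ \eta = h \circ \eta \,\}$
$F^{\downarrow}_{\eta,\rho}(f) = \bigsqcup \{\, h : C \rightarrow C \mid f \sqsupseteq h,\ \rho \circ h \circ \eta = h \circ \eta \,\}$

$F^{\uparrow}_{\eta,\rho}(f)$ and $F^{\downarrow}_{\eta,\rho}(f)$ are (Forward) complete

MAKING SEMANTICS MAXIMALLY IN-COMPLETE (FROM ABOVE AND BELOW):

$O^{\uparrow}_{\eta,\rho}(f) = \bigsqcup \{\, g : C \rightarrow C \mid F^{\downarrow}_{\eta,\rho}(g) = F^{\downarrow}_{\eta,\rho}(f) \,\}$
$O^{\downarrow}_{\eta,\rho}(f) = \bigsqcap \{\, g : C \rightarrow C \mid F^{\uparrow}_{\eta,\rho}(g) = F^{\uparrow}_{\eta,\rho}(f) \,\}$

$O^{\uparrow}_{\eta,\rho}(f)$ and $O^{\downarrow}_{\eta,\rho}(f)$ are generally in-complete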

SLIDE 70

THE GEOMETRY OF SEMANTICS TRANSFORMERS

[Diagram relating the four transformers]

- $F^{\uparrow}$: Minimal complete transformation from above
- $F^{\downarrow}$: Minimal complete transformation from below
- $O^{\downarrow}$: Maximal incomplete transformation from below
- $O^{\uparrow}$: Maximal incomplete transformation from above

$(F^{\uparrow})^{+} = F^{\downarrow}$ and $(F^{\uparrow})^{-} = O^{\downarrow}$

Giacobazzi & Mastroeni, Transforming abstract interpretations by abstract interpretation, SAS’08

SLIDE 71

THE GEOMETRY OF SEMANTICS TRANSFORMERS

[Diagram: two lattices with ⊤ and ⊥ and the closures η and ρ]

Making FORWARD COMPLETENESS: Transforming the semantics upwards

$F^{\uparrow}_{\eta,\rho} = \lambda f.\lambda x.\ \begin{cases} \rho \circ f(x) & \text{if } x \in \eta(C) \\ f(x) & \text{otherwise} \end{cases}$

SLIDE 72

THE GEOMETRY OF SEMANTICS TRANSFORMERS

[Diagram: two lattices with ⊤ and ⊥ and the closures η and ρ, showing $\rho^{+} f(x) = \bigsqcup \{\, \rho(y) \mid \rho(y) \leq f(x) \,\}$]

Making FORWARD COMPLETENESS: Transforming the semantics downwards

$F^{\downarrow}_{\eta,\rho} = \lambda f.\lambda x.\ \begin{cases} \rho^{+} \circ f(x) & \text{if } x \in \eta(C) \\ f(x) & \text{otherwise} \end{cases}$

SLIDE 73

THE GEOMETRY OF SEMANTICS TRANSFORMERS

[Diagram: two lattices with ⊤ and ⊥ and the closures η and ρ, showing $\rho^{++} f(x) = \bigsqcup \{\, y \mid \rho^{+}(y) = \rho f(x) \,\}$]

Making FORWARD IN-COMPLETENESS: Transforming the semantics upwards

$O^{\uparrow}_{\eta,\rho}(f)(x) = \begin{cases} (\rho^{+})^{+}(f(x)) = \bigvee \{\, y \mid \rho^{+}(y) = \rho^{+}(f(x)) \,\} & \text{if } x \in \eta \\ f(x) & \text{otherwise} \end{cases}$

SLIDE 74

THE GEOMETRY OF SEMANTICS TRANSFORMERS

[Diagram: two lattices with ⊤ and ⊥ and the closures η and ρ, showing $\rho^{-} f(x)$]

Making FORWARD IN-COMPLETENESS: Transforming the semantics downwards

$O^{\downarrow}_{\eta,\rho}(f)(x) = \begin{cases} \rho^{-}(f(x)) = \bigwedge \{\, y \mid \rho(y) = \rho(f(x)) \,\} & \text{if } x \in \eta \\ f(x) & \text{otherwise} \end{cases}$

SLIDE 75

WHAT DO WE HAVE SO FAR?

- We know how to transform semantics to make them:
  - Minimally complete
  - Maximally incomplete

Incompleteness is essential for obfuscation

- These are not transformations but deformations!
  - The semantics may change
  - We have to cope with this change

slide-76
SLIDE 76

OBFUSCATION AS INCOMPLETENESS

We transform semantics in order to induce maximal incompleteness

P:
    x = x ∗ x;
    c: if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

  • $[\![P]\!]^{\iota}(x \in [5, 8]) = x \in [25, 64] \wedge y \in [5]$
  • $\mathit{wlp}_{c}^{\iota}(y \le 100) = x \in [10, 100]$ and $\mathit{wlp}_{x = x * x}^{\iota}(x \in [10, 100]) = x \in [4, 10]$.
  • Find $c'$ such that
    $$\mathit{wlp}_{c'}^{\iota}(x \in [10, 100]) = O^{\downarrow}_{\iota,\iota}(\lambda X.\ \mathit{wlp}_{x = x * x}^{\iota}(X))(x \in [10, 100]) = \iota^{-}(\mathit{wlp}_{x = x * x}^{\iota}(x \in [10, 100])) = \{4, 10\}$$
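
The last step, from the interval [4, 10] to the set {4, 10}, can be checked by brute force. The sketch below assumes that ι is the interval closure on finite sets of integers and that ι⁻(S) is the intersection of all sets with the same interval hull as S; the helper names and the restriction to inputs in 0..1000 are assumptions made for the illustration.

```python
from itertools import combinations

def iota(s):
    """Interval closure on sets of integers: fill in everything between min and max."""
    return frozenset(range(min(s), max(s) + 1)) if s else frozenset()

def iota_minus(s):
    """iota^-(s): meet (intersection) of all sets y with iota(y) == iota(s)."""
    target = iota(s)
    elems = sorted(target)
    result = set(target)
    # Any y with the same hull is a subset of the hull, so enumerating
    # the non-empty subsets of the hull is enough.
    for r in range(1, len(elems) + 1):
        for y in combinations(elems, r):
            if iota(y) == target:
                result &= set(y)
    return result

# wlp of `x = x * x` for the postcondition x in [10, 100], restricted
# (assumption) to non-negative integer inputs up to 1000.
wlp = {x for x in range(0, 1001) if 10 <= x * x <= 100}
print(sorted(wlp))
print(sorted(iota_minus(wlp)))
```

Only the two endpoints survive the intersection, which is the least precise-looking set that an interval analysis still abstracts to [4, 10].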

slide-77
SLIDE 77

OBFUSCATION AS INCOMPLETENESS

We transform semantics in order to induce maximal incompleteness

P:
    x = x ∗ x;
    c: if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

  • c′: if x == 4 ∨ x == 10 {x = 16} {x = x ∗ 200}
  • In order to ensure behaviour equivalence we derive
        if 4 ≤ x ≤ 10 {x = x − (x − 4) x = x − (x − 10)} {nil}

slide-78
SLIDE 78

OBFUSCATION AS INCOMPLETENESS

We transform semantics in order to induce maximal incompleteness

P:
    x = x ∗ x;
    c: if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

  • The resulting obfuscated code is:

τ(P):
    if 4 ≤ x ≤ 10 {x = x − (x − 4) x = x − (x − 10)} {nil};
    if x == 4 ∨ x == 10 {x = 16} {x = x ∗ 200};
    if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

For x = 7 we have

$[\![\tau(P)]\!]^{\iota}(x \in [5, 8]) = x \in [16, 1400] \wedge y \in [5, 5000]$
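
A quick way to gain some confidence that the obfuscated program keeps the concrete input/output behaviour is to run both versions side by side. The sketch below transcribes P and τ(P) into Python; it assumes that the two assignments inside the first conditional run in sequence (the slide shows no separator) and it only checks non-negative integer inputs below 1000.

```python
def p(x):
    """Original program P."""
    x = x * x
    return 5 if 10 <= x <= 100 else 5000

def tau_p(x):
    """Obfuscated program tau(P), transcribed from the slide."""
    if 4 <= x <= 10:
        # Assumption: the two assignments run one after the other;
        # each of them lands on an endpoint of [4, 10].
        x = x - (x - 4)
        x = x - (x - 10)
    if x == 4 or x == 10:
        x = 16
    else:
        x = x * 200
    return 5 if 10 <= x <= 100 else 5000

mismatches = [x for x in range(0, 1000) if p(x) != tau_p(x)]
print(mismatches)
```

The two functions agree on the whole tested range. They do not agree on negative inputs such as x = -5, where P returns 5 and this transcription of τ(P) returns 5000, which matches the fact that the slides compute the precondition as x ∈ [4, 10] only.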

slide-79
SLIDE 79

OBFUSCATION AS INCOMPLETENESS

We transform semantics in order to induce maximal incompleteness

P:
    x = x ∗ x;
    c: if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

  • The resulting obfuscated code is:

τ(P):
    if 4 ≤ [5, 8] ≤ 10 {x = [5, 8] − ([5, 8] − 4) x = x − (x − 10)} {nil};    {x ∈ [1, 7]}
    if x == 4 ∨ x == 10 {x = 16} {x = x ∗ 200};
    if 10 ≤ x ≤ 100 {y = 5} {y = 5000};
    return(y)

For x = 7 we have

$[\![\tau(P)]\!]^{\iota}(x \in [5, 8]) = x \in [16, 1400] \wedge y \in [5, 5000]$
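
The annotation {x ∈ [1, 7]} comes from evaluating x − (x − 4) with interval arithmetic, which forgets that both occurrences of x denote the same value. A minimal sketch, with a hypothetical `Interval` class that supports subtraction only:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    lo: int
    hi: int

    def __sub__(self, other):
        # [a, b] - [c, d] = [a - d, b - c]
        return Interval(self.lo - other.hi, self.hi - other.lo)

def const(n):
    """The singleton interval [n, n]."""
    return Interval(n, n)

x = Interval(5, 8)
abstract = x - (x - const(4))                      # interval evaluation
concrete = {v - (v - 4) for v in range(5, 9)}      # exact evaluation
print(abstract)
print(concrete)
```

Concretely the expression always yields 4, while the interval analysis can only conclude x ∈ [1, 7]. The test `x == 4 ∨ x == 10` therefore cannot be decided by the analysis, and both branches have to be considered.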

slide-80
SLIDE 80

OBFUSCATION AS INCOMPLETENESS

We can derive a methodology for systematically making code obscure:

  • $P = M_1; \ldots; M_j; \{\Phi_j\}\ M_{j+1}; \ldots; M_n$
  • Assume the invariant $\Phi_j$ can be generated with abstract interpretation $\alpha$
  • Find C such that: $\mathit{wlp}_{C}^{\alpha}(\Phi_j) = O^{\downarrow,\uparrow}_{\alpha,\alpha}(\lambda X.\ \mathit{wlp}_{M_j}^{\iota}(X))(\Phi_j)$
  • Adjust $C \leadsto C'$ in order to keep concrete observational (I/O) behaviour: $C' \models \Phi_j$
  • $\tau(P) = M_1; \ldots; C'; \Phi_j\ M_{j+1}; \ldots; M_n$

slide-81
SLIDE 81

THWARTING DISASSEMBLY

  • Confuse as much as possible the position of the instruction boundaries in a program
  • Compare the set of instruction start addresses identified by a static disassembler with the “actual” instruction addresses encountered when the program is executed (maximize this difference)
  • Self-repairing disassembly: even when a disassembly error occurs, the disassembly ends up re-synchronizing with the actual instruction stream
  • Delay the self-repairing process as much as possible

slide-82
SLIDE 82

THWARTING DISASSEMBLY: JUNK INSERTION

  • Introduce disassembly errors by inserting junk bytes where the disassembler expects code
  • Junk bytes are partial instructions
  • Junk bytes are never executed at run time
  • A candidate block is a basic block (BB) that can have junk bytes inserted before it (the preceding BB ends with a direct jump or a function call)
  • Given a candidate block, we have to determine the junk bytes to insert before it in order to delay the self-repairing process as much as possible (we insert the first k bytes of a given instruction I, where k maximizes the delay)

slide-83
SLIDE 83

THWARTING LINEAR SWEEP

  • Attack limitation: unable to distinguish data embedded in the instruction stream
  • Junk byte insertion: 26% - 30% of instructions are incorrectly disassembled (candidate blocks are too far apart, about 30 instructions)
  • Increase the number of candidate blocks through branch flipping, where ¬cc is the complement of the condition cc:

        bcc Addr    ⇒        b¬cc L
                             jmp Addr
                         L:  . . .

  • L is now a candidate block!
  • 70% of the instructions are incorrectly disassembled
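
The effect of a single junk byte on a linear sweep can be reproduced on a toy machine. Everything in the sketch below is hypothetical: the four-opcode instruction set, the instruction lengths and the helper names are made up for the illustration and do not correspond to any real architecture.

```python
# Hypothetical toy instruction set: opcode -> instruction length in bytes.
NOP, LOAD, JMP, ADD = 0x01, 0x02, 0x03, 0x04
LENGTH = {NOP: 1, LOAD: 2, JMP: 2, ADD: 3}

def linear_sweep(code):
    """Decode one instruction after another, starting at address 0."""
    starts, pc = [], 0
    while pc < len(code):
        starts.append(pc)
        pc += LENGTH[code[pc]]
    return starts

def executed(code):
    """Addresses reached at run time (the JMP operand is an absolute target)."""
    starts, pc = [], 0
    while pc < len(code):
        starts.append(pc)
        pc = code[pc + 1] if code[pc] == JMP else pc + LENGTH[code[pc]]
    return starts

code = bytes([
    JMP, 3,    # 0: direct jump over the junk byte
    ADD,       # 2: junk, the first byte of a 3-byte ADD, never executed
    LOAD, 7,   # 3: real instruction, swallowed by the bogus ADD
    NOP,       # 5: the sweep is back in sync from here on
    NOP,       # 6
])
swept, real = linear_sweep(code), executed(code)
print(swept, real)
print(sorted(set(real) - set(swept)))
```

The sweep decodes a bogus instruction at address 2 and misses the real one at address 3, then re-synchronizes at address 5. Choosing the junk bytes so that this re-synchronization happens as late as possible is the optimization problem described on the junk insertion slide.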

slide-84
SLIDE 84

THWARTING RECURSIVE TRAVERSAL

Assumption: a function returns to the instruction following the call instruction.

Branch functions:

    a1: jmp b1             a1: call f
    a2: jmp b2      ⇒      a2: call f
    . . .                  . . .
    an: jmp bn             an: call f

Branch functions:

  • Obscure the control flow of the program
  • Confuse disassembly by inserting junk bytes at the points immediately after call f
  • The resulting code is more expensive (depending on the implementation of the branch functions), and therefore the transformation is not applied to “hot code” (code that is executed frequently)
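
The idea behind a branch function can be sketched as a lookup from return address to real target. The addresses, the call length and the dictionary-based lookup below are all assumptions for illustration; an actual branch function has to compute this mapping in machine code and overwrite its own return address.

```python
# Hypothetical addresses: a_i are the call sites, b_i the original jump targets.
JUMPS = {0x10: 0x40, 0x20: 0x80, 0x30: 0x10C}   # a_i -> b_i
CALL_LEN = 5  # assumed size in bytes of one `call f` instruction

# f is entered with the return address a_i + CALL_LEN on the stack.
TABLE = {a + CALL_LEN: b for a, b in JUMPS.items()}

def branch_function(return_address):
    """Address where control really continues after `call f`."""
    return TABLE[return_address]

def naive_successor(call_site):
    """What a disassembler assumes: the call returns right behind itself."""
    return call_site + CALL_LEN

for a, b in JUMPS.items():
    print(hex(a), hex(naive_successor(a)), hex(branch_function(a + CALL_LEN)))
```

Because control never continues at `naive_successor(a)`, the bytes at that address are free to hold junk, which is what misleads a recursive-traversal disassembler that trusts the return-to-next-instruction assumption.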

slide-85
SLIDE 85

THWARTING DISASSEMBLY: EXPERIMENTAL EVALUATION

  • Identify the hot blocks
  • Linear sweep: 75% of instructions are incorrectly disassembled
  • Recursive traversal: 40% of instructions are incorrectly disassembled
  • Average 52% execution speed penalty

slide-86
SLIDE 86

DYNAMIC OBFUSCATION: DEFINITIONS

  • Static obfuscations transform the code prior to execution.
  • Dynamic algorithms transform the program at runtime.
  • Static obfuscation counters attacks by static analysis.
  • Dynamic obfuscation counters attacks by dynamic analysis.
  • A dynamic obfuscator runs in two phases:
      1. At compile time, transform the program to an initial configuration and add a runtime code-transformer.
      2. At runtime, intersperse the execution of the program with calls to the transformer.
  • A dynamic obfuscator turns a normal program into a self-modifying one.

slide-87
SLIDE 87

DYNAMIC OBFUSCATION: REPLACING INSTR.

  • Obfuscate(P):
      1. Select three points A, B, and C in P, such that:
           • A strictly dominates B,
           • C strictly post-dominates B, and
           • any path from B to A passes through C.
      2. Let orig be the instruction at B.
      3. Select an instruction bogus of the same length as orig.
      4. Replace orig at B with bogus.
      5. At point A insert the instruction move orig, v=B where v is an opaque expression that evaluates to the address of point B.
      6. Similarly, at point C insert the instruction move bogus, v=B.
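
The steps of Obfuscate(P) can be simulated with a tiny interpreter whose code is a writable list. This is a sketch under simplifying assumptions: the instruction names (`patch`, `add`, `nop`, `halt`) are hypothetical, `patch` plays the role of the inserted `move` instructions, and the address of B is a plain constant rather than an opaque expression.

```python
def run(image):
    """Interpret a toy program; `patch` overwrites another instruction in place."""
    code = list(image)              # writable copy of the static image
    acc, pc = 0, 0
    while True:
        op, *args = code[pc]
        if op == "patch":           # stands in for: move <instr>, v=B
            target, instr = args
            code[target] = instr
        elif op == "add":
            acc += args[0]
        elif op == "halt":
            return acc, code
        pc += 1                     # `nop` and all other cases fall through

B = 2
ORIG, BOGUS = ("add", 5), ("add", 999)
image = [
    ("patch", B, ORIG),    # A: restore the real instruction just before use
    ("nop",),
    BOGUS,                 # B: the static image only ever shows bogus
    ("patch", B, BOGUS),   # C: hide the real instruction again
    ("halt",),
]
acc, final_code = run(image)
print(acc, final_code == image)
```

The program computes with `orig` (the accumulator ends at 5) even though both the image on disk and the code left in memory after point C contain only `bogus` at B.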

slide-88
SLIDE 88

DYNAMIC OBFUSCATION: AUCSMITH’S ALGORITHM

  • D. Aucsmith, Tamper resistant SW: An implementation, Patent: 5892899, 1999

  • Idea:
      • Split the program into pieces
      • xor them with each other
      • Add encryption to instructions

[Figure: program blocks and xor-ed cells, including A ⊕ B and E ⊕ F.]

slide-89
SLIDE 89

DYNAMIC OBFUSCATION: AUCSMITH’S ALGORITHM

  • D. Aucsmith, Tamper resistant SW: An implementation, Patent: 5892899, 1999

  • In general:
      • (A ⊕ B) ⊕ A = B
      • (A ⊕ B) ⊕ B = A
  • Idea:
      • Reorder blocks such that upper and lower blocks are alternated
      • On even iterations, execute a block and xor upper with lower
      • On odd iterations, execute a block and xor lower with upper
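
The two xor identities are easy to confirm on byte strings. The byte values below are arbitrary examples and `xor` is a hypothetical helper name.

```python
def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise xor of two equally long blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

A = bytes([0x55, 0x89, 0xE5, 0x90])   # arbitrary example block
B = bytes([0xB8, 0x2A, 0x00, 0xC3])   # arbitrary example block
mixed = xor(A, B)
print(xor(mixed, A) == B, xor(mixed, B) == A)
```

A cell holding A ⊕ B therefore reveals B as soon as it is xor-ed with a cell holding A, and the other way round.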

slide-90
SLIDE 90

DYNAMIC OBFUSCATION: AUCSMITH’S ALGORITHM

  • D. Aucsmith, Tamper resistant SW: An implementation, Patent: 5892899, 1999

  • In general:
      • (A ⊕ B) ⊕ A = B
      • (A ⊕ B) ⊕ B = A

slide-91
SLIDE 91

DYNAMIC OBFUSCATION: AUCSMITH’S ALGORITHM

  • Obfuscate(P):
      1. Split P into n pieces: $P_0, \ldots, P_{n-1}$.
      2. Let $C_0, \ldots, C_{n-1}$ be the memory cells in which the pieces will reside at runtime. The $C_i$'s are of equal size, large enough to fit the largest piece $P_j$.
      3. Cells are, conceptually, divided into two spaces, upper ($C_0, \ldots, C_{n/2-1}$) and lower ($C_{n/2}, \ldots, C_{n-1}$). Each cell in the upper space partners with a cell in the lower space. Select a partner function PF(c) that maps a cell number to the cell number of its partner, such as PF(c) = c + K, for some constant K.
      4. Let $IV_0, \ldots, IV_{n-1}$ be the initial values of cells $C_0, \ldots, C_{n-1}$, respectively.
      5. Initialize a set of equations $E_1 = \{C_0 = IV_0, \ldots, C_{n-1} = IV_{n-1}\}$ which expresses the current state of the memory cells as a function of their initial values.
      6. Initialize a set of equations $E_2 = \{\}$ which expresses how a piece $P_i$ can be recovered in cleartext from the initial values $IV_0, \ldots, IV_{n-1}$.
      7. Initialize a table $next = \langle P_0 = ?, \ldots, P_{n-1} = ? \rangle$ which maps each subprogram $P_i$ to the cell it should jump to in order to execute $P_{i+1}$.
      8. make obscure()

slide-92
SLIDE 92

DYNAMIC OBFUSCATION: AUCSMITH’S ALGORITHM

  • make obscure():
      • For $p \in [0 \ldots n-1]$ do
          • Select a cell $C_c$ to hold piece $P_p$ in cleartext.
          • Consult $E_1$ to find the current contents V of $C_c$. Update $E_2 := E_2[P_p = V]$. Using Gaussian elimination, try to invert $E_2$ (i.e. find values for all the $IV_i$'s). If there is no solution, select another cell for $P_p$.
          • $next := next[P_{p-1} = C_c]$
          • For even (odd) p, simulate a mutation where every cell $C_i$ in upper (lower) space is xor-ed with its partner cell $C_{PF(i)}$ in lower (upper) space.
          • $E_1 := E_1[C_{PF(i)} = C_{PF(i)} \oplus C_i]$.
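
The mutation schedule in the last two steps can be simulated directly on byte strings. The sketch below covers only the xor rounds that the E1 update describes; it does not build E2 or run Gaussian elimination. It assumes n = 4 cells and the partner function PF(c) = (c + n/2) mod n, and the names `mutate`, `initial` and `history` are hypothetical.

```python
def xor(a, b):
    """Byte-wise xor of two equally long cells."""
    return bytes(x ^ y for x, y in zip(a, b))

def mutate(cells, p):
    """Round p: for even p fold each upper cell into its lower partner,
    for odd p fold each lower cell into its upper partner."""
    n = len(cells)
    k = n // 2                       # assumed partner function: PF(c) = (c + k) mod n
    sources = range(0, k) if p % 2 == 0 else range(k, n)
    out = list(cells)
    for i in sources:
        partner = (i + k) % n
        out[partner] = xor(cells[partner], cells[i])   # C_PF(i) := C_PF(i) xor C_i
    return out

initial = [b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xf0\x0f"]  # IV_0 .. IV_3
cells = initial
history = []
for p in range(6):
    cells = mutate(cells, p)
    history.append(cells)

print(history[1][0] == initial[2])   # after round 1, cell 0 holds IV_2
print(history[2][2] == initial[0])   # after round 2, cell 2 holds IV_0
print(history[5] == initial)         # the pattern repeats every six rounds
```

Under these assumptions each cell cycles through its own initial value, its partner's initial value and the xor of the two, so a given piece is in cleartext only during some rounds. That is why the algorithm has to solve for initial values that put piece $P_p$ in cleartext in the round in which it executes.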

slide-93
SLIDE 93

DISCUSSION: THE FUCSIA IDEA

Obfuscation and Steganography by Abstract Interpretation

  • Define a uniform framework for information concealment in programming languages
      • General enough to include most known methods
      • Formal enough to provide a (possibly) provably secure environment for obfuscation and steganography
      • Rich enough to provide advanced design and evaluation tools
      • Practical enough to become a standard in obfuscation and steganographic design and evaluation
  • The goal: develop a theory and practice for code obfuscation and steganography in order to make these technologies as practical as analogous ones in other media (e.g., in DRM of audio and video)
      • Code is a new medium
      • Known concepts in digital media (compression, noise, etc.) have to be studied on software

slide-94
SLIDE 94

FUTURE DIRECTIONS

  • Move from syntactic to semantic-based metrics for potency
      • measuring incompleteness
      • measuring complexity of complete refinements
  • We need a program refinement-like calculus for code obfuscation
      • A kind of Program Obfuscation by Stepwise Refinement [Wirth ’71]?
      • This is an open issue...