Obfuscation: know your enemy


slide-1
SLIDE 1

Obfuscation: know your enemy

Ninon EYROLLES neyrolles@quarkslab.com Serge GUELTON sguelton@quarkslab.com

slide-2
SLIDE 2

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Prelude

slide-4
SLIDE 4

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Plan

1. Introduction: What is obfuscation?
2. Control flow obfuscation
3. Data flow obfuscation
4. Python obfuscation

slide-5
SLIDE 5

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation What is obfuscation ?

Code obfuscation

Definition: Obfuscation is used to make code analysis as complex and expensive as possible, while preserving the original behaviour of the program (input/output equivalence).

  • Malware (trying to avoid signature detection)
  • Protection of sensitive algorithms (DRM, intellectual property...)
  • Theoretically: transformation of symmetric-key encryption into asymmetric-key encryption, homomorphic encryption algorithms...

slide-6
SLIDE 6

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation What is obfuscation ?

Don’t shoot the messenger

Why this talk?
→ Obfuscation exists and is widely used.
→ You might be interested in breaking it (to rewrite some code as free software, for example).
⇒ If you want to break it, you need to know how it works!

slide-7
SLIDE 7

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation What is obfuscation ?

Several obfuscation types

  • Control flow obfuscation
  • Data flow obfuscation
  • Symbol rewriting: variable names, function names...
  • Code encryption, packing...

slide-9
SLIDE 9

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Plan

1. Introduction
2. Control flow obfuscation: Definitions, Control-flow obfuscation, Control flow flattening
3. Data flow obfuscation
4. Python obfuscation

slide-10
SLIDE 10

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Definitions

Control flow

  • Describes the execution flow of a program: the different paths that are possible during execution
  • Loops (for, while...), conditions (if), calls to other functions...
  • Represented by a Control Flow Graph (CFG), made of basic blocks and the links between them

slide-11
SLIDE 11

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Definitions

Control flow

    x = 10
    y = 0
    while (x ≥ 0)
        y = y + 2
        x = x − 1
    return y

(The two edges leaving the loop condition are labelled true and false.)

Figure : CFG of pseudo-code
Figure : CFG of assembly code

slide-21
SLIDE 21

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Control-flow obfuscation

Various techniques

The goal is to transform the structure of the CFG. Techniques, each with the matching counter-measure:

  • loop unrolling → search for patterns
  • inlining of functions → comparison of code
  • junk code insertion → liveness analysis
  • opaque predicates → SMT solver
  • control flow flattening

slide-22
SLIDE 22

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Control flow flattening

Definition

Control flow flattening
  • Transforms the structure of the program to make CFG reconstruction difficult
  • Encodes the control flow information and hides the result in the data flow

slide-23
SLIDE 23

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Control flow flattening

Principle

Implementation
  • Basic blocks are numbered
  • A dispatcher handles the execution
  • A variable contains the number of the next block to be executed
  • At the end of every block, this variable is updated and the execution flow goes back to the dispatcher, which then jumps to the next block

    INIT:        val = 1
    DISPATCHER:  switch(val)
    block 1:     some code; val = 2
    block 2:     some code; val = 3
    block 3:     some code; return

Figure : Principle of control flow flattening
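The principle can be sketched in Python on the small counting loop used earlier for the CFG figure. This is only an illustration: the function names and the block numbering are assumptions, and the dispatcher is an `if`/`elif` chain standing in for the `switch(val)` of the figure.

```python
def count_original():
    # Structured version: the loop is directly visible in the CFG.
    x = 10
    y = 0
    while x >= 0:
        y = y + 2
        x = x - 1
    return y


def count_flattened():
    # Flattened version: every basic block sits at the same level and
    # the real control flow is carried by the state variable `val`.
    x = y = 0
    val = 1                            # INIT
    while True:                        # DISPATCHER
        if val == 1:                   # block 1: initialisation
            x, y = 10, 0
            val = 2
        elif val == 2:                 # block 2: loop condition
            val = 3 if x >= 0 else 4
        elif val == 3:                 # block 3: loop body
            y = y + 2
            x = x - 1
            val = 2
        elif val == 4:                 # block 4: exit
            return y
```

Both functions return the same value; only the shape of the CFG differs, since every block of the flattened version goes back to the dispatcher.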

slide-24
SLIDE 24

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Control flow flattening

Example

Figure : original CFG Figure : CFG after the control flow flattening

slide-28
SLIDE 28

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Control flow flattening

Weakness

    INIT:        val = 1
    DISPATCHER:  switch(val)
    block 1:     some code; val = 2
    block 2:     some code; val = 3
    block 3:     some code; return

What is the weakness of control flow flattening?
⇒ The variable containing the execution flow

Obfuscation techniques:
  • multiple (context) variables
  • opaque predicates
  • hash

⇒ Dynamic analysis (tracing) can also be used

slide-29
SLIDE 29

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Plan

1. Introduction
2. Control flow obfuscation
3. Data flow obfuscation: Definition, A few techniques
4. Python obfuscation

slide-30
SLIDE 30

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Definition

Data Flow analysis

Several ways to do it:
  • Information provided by the program’s data: strings, numbers, structures...
  • Relations between the data, or between the input and output (of a program, a function, a basic block)
  • Interactions between the program and the data: reading, writing, location in memory...
  • Formal notions: live variable, data flow equations, backward and forward analysis...

slide-37
SLIDE 37

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few techniques

Examples

To make data analysis more complex:
  • encode constants (strings for example) → look for the decoding routine
  • insert useless data (close to junk code) → use symbolic execution, data tainting / slicing
  • complicate arithmetic operations on data, e.g. x + y ⇔ (x ⊕ y) + 2 ∗ (x ∧ y), where ⊕ is XOR and ∧ is bitwise AND → use bruteforce and build heuristics
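The rewriting rule x + y ⇔ (x ⊕ y) + 2 ∗ (x ∧ y) can be checked empirically. The sketch below is an illustration only; the helper names `obfuscated_add` and `identity_holds` are made up for the example.

```python
import random


def obfuscated_add(x, y):
    # XOR adds the bits without carries; AND selects the carry bits,
    # which are worth twice their position.
    return (x ^ y) + 2 * (x & y)


def identity_holds(trials=1000, seed=0):
    # Compare the rewritten expression with plain addition on random
    # 32-bit inputs (bruteforce-style validation of the rule).
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randrange(2 ** 32)
        y = rng.randrange(2 ** 32)
        if obfuscated_add(x, y) != x + y:
            return False
    return True
```

An analyst can use the same kind of random testing in the other direction, to confirm that a candidate simplification matches a complicated expression before replacing it.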

slide-41
SLIDE 41

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few techniques

Examples

  • modify the way data are stored / manipulated: split tables, change the calling convention of functions, etc.
    → spot similar elements (probably processed by the same instructions)
    → dynamic analysis to get the arguments of a function
  • encode data while reading and writing: a value x is stored in MEMORY as f(x) and decoded with f⁻¹ when it is read back
    → find the relevant variables, and look for the corresponding encoding
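Encoding on write and decoding on read can be sketched as follows. The XOR encoding, the `KEY` constant and the `EncodedMemory` class are assumptions chosen for the illustration; the slides do not say which function f is used.

```python
KEY = 0x5A  # hypothetical encoding constant


def f(x):
    # Encoding applied before a value is written to memory.
    return x ^ KEY


def f_inv(v):
    # Decoding applied when a value is read back (XOR is its own inverse).
    return v ^ KEY


class EncodedMemory:
    """Memory whose cells only ever hold encoded values."""

    def __init__(self):
        self._cells = {}

    def write(self, addr, x):
        self._cells[addr] = f(x)

    def read(self, addr):
        return f_inv(self._cells[addr])

    def raw(self, addr):
        # What an analyst dumping the memory would see.
        return self._cells[addr]
```

A memory dump only shows f(x), so the analyst has to locate the code around each read and write to recover the encoding.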

slide-42
SLIDE 42

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Plan

1. Introduction
2. Control flow obfuscation
3. Data flow obfuscation
4. Python obfuscation: Modified Interpreter, Source-to-source obfuscation, A few examples

slide-43
SLIDE 43

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Context

Applications are developed in Python (Dropbox, for example): a modified interpreter is delivered with the binary
Creation of "packers" to make access to the code difficult
Few traditional obfuscations here!
Three ways to obfuscate:

  • modified interpreter so that access to compiled files is difficult;
  • measures to make decompilation harder;
  • source to source obfuscation in case of decompilation success.
slide-46
SLIDE 46

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Modified Interpreter

State of the art

Based on the work of Kholia and Wegrzyn [1]:
  • Change the magic number: this number is specific to each version of CPython; changing it prevents decompilation → bruteforce (∼50 possibilities)
  • Suppress some features: remove some functions like PyRun_File(), or attributes like co_code
  • Opcode encryption: encrypt compiled files → find the decryption routine

[1] Looking Inside the (Drop) Box, by D. Kholia and P. Wegrzyn
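The bruteforce on the magic number can be sketched as below, assuming the magic number occupies the first four bytes of a compiled file, as in standard CPython. The helper name `with_candidate_magics` is hypothetical, and the list of candidate values is left to the caller; `importlib.util.MAGIC_NUMBER` is the magic number of the running interpreter.

```python
import importlib.util


def with_candidate_magics(pyc_bytes, candidates):
    # Yield one patched copy of the file per candidate magic number.
    # Each copy can then be handed to a decompiler until one is accepted.
    for magic in candidates:
        if len(magic) != 4:
            raise ValueError("a magic number is 4 bytes long")
        yield magic + pyc_bytes[4:]


# The running interpreter's own magic number is one obvious candidate.
DEFAULT_CANDIDATES = [importlib.util.MAGIC_NUMBER]
```

With roughly fifty known values to try, the search is immediate; the hard part is only collecting the list of magic numbers.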

slide-49
SLIDE 49

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Modified Interpreter

State of the art

Opcode remapping
Applies a permutation to the opcodes of the instruction set.

    Original bytecode:  34 35 36   LOAD_GLOBAL CALL_FUNCTION POP_TOP
    Permutation:        34 → 75, 35 → 23, 36 → 12
    Remapped bytecode:  75 23 12   read with the standard table: LOAD_FAST LOAD_CONST ROT_TWO

→ Compare permuted bytecode with standard bytecode for a standard Python module
→ Get into the application runtime and execute arbitrary code
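Comparing permuted bytecode with the standard bytecode of the same module can be sketched as follows. This is a simplified illustration: it assumes the two streams are already aligned and contain opcodes only (no arguments), and the function name `recover_opcode_map` is made up for the example. The byte values are the ones from the slide.

```python
def recover_opcode_map(standard, permuted):
    # Walk both opcode streams in parallel and record which permuted
    # value corresponds to each standard opcode.
    if len(standard) != len(permuted):
        raise ValueError("the two opcode streams must have the same length")
    mapping = {}
    for std_op, perm_op in zip(standard, permuted):
        known = mapping.setdefault(std_op, perm_op)
        if known != perm_op:
            raise ValueError("inconsistent mapping for opcode %d" % std_op)
    return mapping


def unpermute(permuted, mapping):
    # Invert the recovered mapping to get standard bytecode back.
    inverse = {perm: std for std, perm in mapping.items()}
    return bytes(inverse[op] for op in permuted)
```

Each standard module compiled by both interpreters reveals more entries of the permutation, until the whole instruction set is covered.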

slide-50
SLIDE 50

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Modified Interpreter

New techniques

Addition of new opcodes
  • Substitution of a series of opcodes with a new opcode:
    LOAD_GLOBAL CALL_FUNCTION POP_TOP ⇒ LOAD_GLOBAL CALL_AND_POP
    → Analyse the interpreter!
Insertion of junk opcodes

slide-51
SLIDE 51

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Modified Interpreter

New techniques

Addition of new opcodes
Insertion of junk opcodes
  • Use opcodes for stack manipulation: ROT_TWO, ROT_THREE or POP_TOP
  • Combine them to modify the bytecode without changing the computed values

Stack after each instruction (top of the stack first):

    LOAD_FAST 0     VAR 0
    LOAD_FAST 1     VAR 1, VAR 0
    BUILD_MAP       DICT, VAR 1, VAR 0
    ROT_THREE       VAR 1, VAR 0, DICT
    BINARY_ADD      VAR 0 + VAR 1, DICT
    ROT_TWO         DICT, VAR 0 + VAR 1
    POP_TOP         VAR 0 + VAR 1

→ Prevents decompilation with uncompyle, but pycdc still works.
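The effect of the junk opcodes can be replayed with a toy stack machine. This is a minimal sketch, not the CPython evaluation loop: only the six opcodes of the example are modelled, and the `run` function and program encoding are assumptions made for the illustration.

```python
def run(program, local_vars):
    # The end of the list is the top of the stack.
    stack = []
    for op, arg in program:
        if op == "LOAD_FAST":
            stack.append(local_vars[arg])
        elif op == "BUILD_MAP":
            stack.append({})
        elif op == "ROT_TWO":
            # Swap the two topmost items.
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif op == "ROT_THREE":
            # Move the top item down to third position.
            top = stack.pop()
            stack.insert(-2, top)
        elif op == "BINARY_ADD":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "POP_TOP":
            stack.pop()
        else:
            raise ValueError("unsupported opcode: " + op)
    return stack


PLAIN = [("LOAD_FAST", 0), ("LOAD_FAST", 1), ("BINARY_ADD", None)]

WITH_JUNK = [
    ("LOAD_FAST", 0), ("LOAD_FAST", 1),
    ("BUILD_MAP", None), ("ROT_THREE", None),
    ("BINARY_ADD", None),
    ("ROT_TWO", None), ("POP_TOP", None),
]
```

Both programs leave the same single value on the stack: the dictionary built by BUILD_MAP is moved out of the way, then discarded.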

slide-60
SLIDE 60

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Modified Interpreter

Self-modifying code

Original code:

    def foo(b):
        b += 1
        ...

Obfuscated code:

    def foo(b):
        modify_bytecode()
        b -= 1
        ...

During execution:

    LOAD_GLOBAL
    CALL_FUNCTION
    POP_TOP
    LOAD_FAST
    LOAD_CONST
    INPLACE_SUB     rewritten to INPLACE_ADD once modify_bytecode() has run


slide-67
SLIDE 67

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Source-to-source obfuscation

Abstract Syntax Tree

Abstract Syntax Tree (AST): tree representation of the abstract structure of source code.

  • Nodes are operators
  • Leaves are operands

          ×
         / \
        +   z
       / \
      x   y

Figure : AST representation of (x + y) × z
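The same tree can be obtained with the standard `ast` module. The sketch below converts the parsed expression into nested tuples; the helper name `to_tuple` and the operator table are assumptions for the illustration, and only `+` and `*` on plain names are handled.

```python
import ast

# Symbols used to display the two operators of the example.
OPERATORS = {ast.Add: "+", ast.Mult: "×"}


def to_tuple(node):
    # Inner nodes are operators, leaves are operands.
    if isinstance(node, ast.BinOp):
        return (OPERATORS[type(node.op)],
                to_tuple(node.left),
                to_tuple(node.right))
    if isinstance(node, ast.Name):
        return node.id
    raise ValueError("unsupported node: " + type(node).__name__)


def parse_expression(source):
    # mode="eval" parses a single expression; its root is in `.body`.
    return to_tuple(ast.parse(source, mode="eval").body)
```

Calling `parse_expression("(x + y) * z")` gives a nested tuple whose root is the multiplication, with the addition as its left child.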

slide-68
SLIDE 68

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Source-to-source obfuscation

Python source-to-source

Principle

Figure : Compilation flow for Python source-to-source

Obfuscation Examples

slide-69
SLIDE 69

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation Source-to-source obfuscation

Python source-to-source

Principle
Obfuscation Examples
  • Control flow transformations: loop unrolling, mixing if and while with opaque predicates, transformation into functional style
  • Data flow transformations: string encoding, use of mixed boolean-arithmetic expressions
  • Symbol obfuscation: replacing names of functions and variables with random strings
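A source-to-source pass of this kind can be sketched with `ast.NodeTransformer`. The example below is a deliberately naive symbol renamer: it renames every parameter and every name it meets, which is only safe because the sample function uses no globals or builtins, and it uses sequential names (`v0`, `v1`, ...) instead of random strings. `SOURCE`, `RenameSymbols` and `obfuscate` are hypothetical names.

```python
import ast

SOURCE = """
def add_twice(value, step):
    total = value + step
    return total + step
"""


class RenameSymbols(ast.NodeTransformer):
    """Replace parameter and variable names with meaningless ones."""

    def __init__(self):
        self.aliases = {}

    def _alias(self, name):
        if name not in self.aliases:
            self.aliases[name] = "v%d" % len(self.aliases)
        return self.aliases[name]

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


def obfuscate(source):
    # Parse, transform the tree, then compile and run the result.
    tree = RenameSymbols().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    namespace = {}
    exec(compile(tree, "<obfuscated>", "exec"), namespace)
    return namespace
```

The renamed function computes the same result, but its local names no longer carry any meaning for the reader of the decompiled code.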

slide-70
SLIDE 70

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few examples

Loop unrolling

Original loop:

    for i in range(3):
        if b & 1 == 1:
            p ^= a
        hiBitSet = a & 0x80
        a <<= 1

Unrolled loop:

    i = 0
    if ((b & 1) == 1):
        p ^= a
    hiBitSet = (a & 128)
    a <<= 1
    i = 1
    if ((b & 1) == 1):
        p ^= a
    hiBitSet = (a & 128)
    a <<= 1
    i = 2
    if ((b & 1) == 1):
        p ^= a
    hiBitSet = (a & 128)
    a <<= 1

→ Look for patterns (instructions, variables)

slide-71
SLIDE 71

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few examples

Mixing if and while

    # original code
    if cond1:
        work()

    # obfuscated if
    opaque_pred = 1
    while opaque_pred & cond1:
        work()
        opaque_pred = 0

→ Holds only if the predicates are difficult to evaluate statically and not obvious for a human
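The equivalence between the `if` and the disguised `while` can be checked by counting how many times the body runs. In this sketch the wrapper functions and the `Counter` class are assumptions added for the illustration; `cond1` is expected to be a boolean, since `&` is a bitwise operator.

```python
class Counter:
    """Stand-in for work(): records how many times it was called."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1


def original(cond1, work):
    if cond1:
        work()


def obfuscated(cond1, work):
    # The loop body runs at most once: the predicate is cleared
    # during the first iteration.
    opaque_pred = 1
    while opaque_pred & cond1:
        work()
        opaque_pred = 0


def calls_made(variant, cond1):
    counter = Counter()
    variant(cond1, counter)
    return counter.calls
```

Here the predicate is a plain constant, so it hides nothing; a real obfuscator would replace `opaque_pred = 1` with an expression that is hard to evaluate statically.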
slide-72
SLIDE 72

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few examples

Opaque predicates

    x = ((((2 * ((-816744550) | 816744552)) -
           ((-816744550) ^ 816744552)) *
          (((3783141896 ^ 3921565134) -
            (2 * ((~3783141896) & 3921565134))) |
           ((4009184523 & (~3870761249)) -
            ((~4009184523) & 3870761249)))) -
         (((2105675179 & (~2244098417)) -
           ((~2105675179) & 2244098417)) ^
          ((3657555079 + (~3519131805)) + 1)))

→ Use constant folding: x = 36

    x = (80*b**2 + 160*b*(~b) + 36821*b +
         80*(~b)**2 + 36821*(~b) + 4236969) % 256

→ Bruteforce, heuristics...
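The bruteforce approach can be sketched on the second expression, which depends on `b`. The assumption here is that `b` is a byte, so that trying the 256 possible values is exhaustive; the function names are made up for the example.

```python
def opaque_expression(b):
    # Second expression from the slide, with ~ as bitwise NOT.
    return (80*b**2 + 160*b*(~b) + 36821*b +
            80*(~b)**2 + 36821*(~b) + 4236969) % 256


def distinct_values():
    # Evaluate the expression on every byte value and collect the results.
    return {opaque_expression(b) for b in range(256)}
```

The set contains a single value, 36, so the whole expression can be replaced by that constant. This follows from b + ~b = -1: the quadratic terms collapse to 80 ∗ (b + ~b)² = 80 and the linear terms to -36821.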

slide-73
SLIDE 73

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation A few examples

Transformation in functional style

Original code:

    def fibo(n):
        return n if n < 2 else (fibo(n - 1) + fibo(n - 2))

Functional-style version:

    fibo = (lambda n: (lambda _: (_.__setitem__('$', ((_['n'] if ('n' in _) else n) if ((_['n'] if ('n' in _) else n) < 2) else ((_['fibo'] if ('fibo' in _) else fibo)(((_['n'] if ('n' in _) else n) - 1)) + (_['fibo'] if ('fibo' in _) else fibo)(((_['n'] if ('n' in _) else n) - 2))))), _)[(-1)])({'n': n, '$': None})['$'])

→ Either you’re comfortable with functional style, or you use input/output analysis or symbols information.

slide-74
SLIDE 74

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Conclusion

  • There are many obfuscation techniques
  • Understanding obfuscation can be useful (interoperability)
  • Stay focused on the context and on what you want to know
  • Every obfuscation can be broken with time and resources

slide-75
SLIDE 75

contact@quarkslab.com I @quarkslab.com

Questions?

slide-76
SLIDE 76

Introduction Control flow obfuscation Data flow obfuscation Python obfuscation

Table of contents

1. Introduction: What is obfuscation?
2. Control flow obfuscation: Definitions, Control-flow obfuscation, Control flow flattening
3. Data flow obfuscation: Definition, A few techniques
4. Python obfuscation: Modified Interpreter, Source-to-source obfuscation, A few examples