Static Analysis of Executables to Detect Malicious Patterns




slide-1
SLIDE 1

Static Analysis of Executables to Detect Malicious Patterns

Mihai Christodorescu Somesh Jha

CS @ University of Wisconsin, Madison [12th USENIX Security Symposium, 2003]

Presented by K. Vikram, Cornell University

slide-2
SLIDE 2

Problem & Motivation…

Malicious code is … malicious
Categorize by: Propagation Method & Goal

Viruses, worms, trojan horses, spyware, etc.

Detect Malicious Code

In executables

slide-3
SLIDE 3

The Classical Stuff

Focus mostly on Viruses

Code to replicate itself + Malicious payload
Inserted into executables

Look for signatures
Not always enough
Obfuscation-Deobfuscation Game

slide-4
SLIDE 4

Common Obfuscation Techniques

Encryption
Dead Code insertion*
Code transposition*
Instruction Substitution*
Register reassignment*
Code Integration
Entry Point Obscuring

slide-5
SLIDE 5

Common Deobfuscation Techniques

Regular Expressions
Heuristic Analyses
Emulation
Mostly Syntactic…

slide-6
SLIDE 6

The Game

Detection techniques: Signatures, Regex Signatures, Emulation/Heuristics, ?, ?
Obfuscations: Vanilla Virus, Register Renaming, Packing/Encryption, Code Reordering, Code Integration

slide-7
SLIDE 7

Current Technology

Antivirus Software

Norton, McAfee, Command

Brittle

Cannot detect simple obfuscations: nop-insertion, code transposition

Chernobyl, z0mbie-6.b, f0sf0r0, Hare
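
The brittleness of exact signatures can be illustrated with a minimal sketch. The byte string below is a hypothetical signature chosen for illustration (the x86 encoding of `mov eax, 1; int 0x80`), not one used by any real scanner, and `signature_scan` is an assumed helper name.

```python
# Hypothetical signature: x86 bytes for `mov eax, 1` followed by `int 0x80`.
SIGNATURE = bytes.fromhex("b801000000cd80")
NOP = b"\x90"  # single-byte x86 nop


def signature_scan(code: bytes, signature: bytes = SIGNATURE) -> bool:
    """Return True if the exact byte signature occurs anywhere in the code."""
    return signature in code


# Unobfuscated code: push ebp, the signature, ret.
original = b"\x55" + SIGNATURE + b"\xc3"

# Same behaviour, but one nop is inserted between the two instructions.
obfuscated = b"\x55" + SIGNATURE[:5] + NOP + SIGNATURE[5:] + b"\xc3"

print(signature_scan(original))    # the exact match succeeds
print(signature_scan(obfuscated))  # a single nop defeats the match
```

A purely syntactic match has no notion that the nop leaves the program state unchanged, which is the gap a semantic approach tries to close.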

slide-8
SLIDE 8

Theoretical Limits

Virus Detection is undecidable
Some Static Analyses are undecidable
But, Obfuscation is also hard

slide-9
SLIDE 9

The SAFE* Methodology

slide-10
SLIDE 10

Procedure

Key Ideas:

Analyze program’s semantic structure
Use existing static analyses (extensible)
Use uninterpreted symbols

Abstract Representation of Malicious Code
Abstract Representation of Executable

Deobfuscation

Detect presence of malicious code

slide-11
SLIDE 11

The Annotator

Inputs:

CFG of the executable
Library of Abstraction Patterns

Outputs:

Annotated CFG

slide-12
SLIDE 12

Some groundwork

Instruction I : τ1 × … × τk → τ
Program P : I1, …, IN
Program counter/point

pc : { I1, …, IN } → [1,…,N]
pc(Ij) = j, ∀ 1 ≤ j ≤ N

Basic Block, Control Flow Graph*
Static Analysis Predicates
Types for data and instructions
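
As a rough sketch of this groundwork, a program can be modelled as a list of instructions whose 1-based list position plays the role of pc, from which basic-block leaders and CFG successors follow. The `Instr` type, the tiny opcode set (`jmp`, `jz`, `ret`) and the helper names are assumptions made for this illustration, not part of SAFE.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    op: str
    args: tuple = ()
    target: Optional[int] = None  # 1-based program point of a jump target


def leaders(program):
    """Program points that start a basic block."""
    result = {1}
    for pc, ins in enumerate(program, start=1):
        if ins.op in ("jmp", "jz"):
            result.add(ins.target)          # a jump target starts a block
            if pc + 1 <= len(program):
                result.add(pc + 1)          # so does the instruction after a jump
    return sorted(result)


def successors(program, pc):
    """CFG successors of the instruction at program point pc."""
    ins = program[pc - 1]
    if ins.op == "ret":
        return []
    if ins.op == "jmp":
        return [ins.target]
    succ = [pc + 1] if pc + 1 <= len(program) else []
    if ins.op == "jz":
        succ.append(ins.target)             # conditional: fall-through plus target
    return succ


program = [
    Instr("mov", ("eax", 0)),   # pc = 1
    Instr("jz", target=4),      # pc = 2
    Instr("add", ("eax", 1)),   # pc = 3
    Instr("ret"),               # pc = 4
]
print(leaders(program))
print(successors(program, 2))
```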

slide-13
SLIDE 13

Example Predicates

slide-14
SLIDE 14

Abstraction Patterns

Abstraction pattern Γ : (V,O,C)

V = { x1 : τ1, …, xk : τk }
O = I(v1, …, vm) | I : τ1 × … × τm → τ
C = boolean expression involving static analysis predicates and logical operators

Represents a deobfuscation
Predicate controls pattern application
Unify patterns with sequence of instructions
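
One way to picture a pattern Γ = (V, O, C) is as a small record plus a unification routine that binds the typed variables to concrete operands. This is a simplified sketch for a single instruction: the `Pattern` class, the `"reg"`/`"const"` type tags and the `zero_reg` example are hypothetical, and the condition here is a plain Python callable standing in for real static analysis predicates.

```python
from dataclasses import dataclass
from typing import Callable

REGISTERS = {"eax", "ebx", "ecx", "edx"}


@dataclass
class Pattern:
    variables: dict                    # V: variable name -> type tag
    op: str                            # O: instruction mnemonic
    operands: tuple                    # O: variables or literal operands
    condition: Callable[[dict], bool]  # C: predicate over the bindings


def unify(pattern, instr):
    """Return the variable bindings if instr matches the pattern, else None."""
    op, args = instr
    if op != pattern.op or len(args) != len(pattern.operands):
        return None
    binding = {}
    for formal, actual in zip(pattern.operands, args):
        if formal in pattern.variables:
            kind = pattern.variables[formal]
            if kind == "reg" and actual not in REGISTERS:
                return None
            if kind == "const" and not isinstance(actual, int):
                return None
            # A variable must bind to the same operand everywhere it occurs.
            if binding.setdefault(formal, actual) != actual:
                return None
        elif formal != actual:
            return None
    return binding if pattern.condition(binding) else None


# Hypothetical pattern: "mov X, C" where the constant C is zero.
zero_reg = Pattern({"X": "reg", "C": "const"}, "mov", ("X", "C"),
                   lambda b: b["C"] == 0)

print(unify(zero_reg, ("mov", ("ebx", 0))))
print(unify(zero_reg, ("mov", ("eax", 0))))  # a different register still matches
print(unify(zero_reg, ("mov", ("ebx", 5))))  # the condition rejects this one
```

Because X is an uninterpreted variable rather than a fixed register, the same pattern matches regardless of which register the obfuscator picked.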

slide-15
SLIDE 15

Example of a pattern

slide-16
SLIDE 16

Defeating Garbage Insertion

Original:
<instruction A>
<instruction B>

Obfuscated:
<instruction A>
add ebx, 1
sub ebx, 1
nop
<instruction B>

Pattern:

instr 1 … instr N
where Delta(state before instr 1, state after instr N) = 0
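
The zero-delta condition can be sketched with a toy interpreter that runs a candidate sequence and compares the register state before and after. This is only an illustration under strong assumptions: the three opcodes, the 32-bit register model and the names `run` and `is_irrelevant` are invented here, flags are ignored, and checking a few sample states is a test, not a proof.

```python
MASK = 0xFFFFFFFF  # model 32-bit registers


def run(seq, state):
    """Execute a sequence of (op, reg, value) tuples on a copy of the state."""
    state = dict(state)
    for op, reg, val in seq:
        if op == "add":
            state[reg] = (state[reg] + val) & MASK
        elif op == "sub":
            state[reg] = (state[reg] - val) & MASK
        elif op == "nop":
            pass
        else:
            raise ValueError(f"unsupported opcode: {op}")
    return state


def is_irrelevant(seq, sample_states):
    """True if the sequence leaves every sampled state unchanged."""
    return all(run(seq, s) == s for s in sample_states)


samples = [{"eax": 0, "ebx": 0}, {"eax": 7, "ebx": MASK}]
garbage = [("add", "ebx", 1), ("sub", "ebx", 1), ("nop", None, 0)]

print(is_irrelevant(garbage, samples))               # state is restored
print(is_irrelevant([("add", "ebx", 1)], samples))   # state changes
```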

slide-17
SLIDE 17

Defeating Code-reordering

Pattern:

jmp TARGET
where Count(CFGPredecessors(TARGET)) = 1
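
A minimal sketch of this check: compute CFG predecessors and flag every unconditional jump whose target has exactly one predecessor, since such a jump only reorders code. The dictionary-based CFG encoding and the function names are assumptions made for the example.

```python
def predecessors(cfg):
    """Invert a successor map {node: [successors]} into a predecessor map."""
    preds = {n: [] for n in cfg}
    for node, succs in cfg.items():
        for s in succs:
            preds[s].append(node)
    return preds


def removable_jumps(instrs, cfg):
    """Nodes holding `jmp TARGET` where TARGET has a single CFG predecessor."""
    preds = predecessors(cfg)
    return sorted(
        node for node, (op, target) in instrs.items()
        if op == "jmp" and len(preds[target]) == 1
    )


# Reordered layout: 1 jumps to 3 (instruction A), 4 jumps back to 2 (instruction B).
instrs = {1: ("jmp", 3), 2: ("B", None), 3: ("A", None), 4: ("jmp", 2), 5: ("ret", None)}
cfg = {1: [3], 2: [5], 3: [4], 4: [2], 5: []}

print(removable_jumps(instrs, cfg))
```

When a target is reachable from two places the jump carries real control flow, so the predicate stops the pattern from applying.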

slide-18
SLIDE 18

The Annotator

Given set of patterns Σ = { Γ1, …, Γm }
Given a node n for program point p
Matches each pattern in Σ with …, Previous²(Ip), Previous(Ip), Ip
Associates all patterns that match with n
Also stores the bindings from unification

slide-19
SLIDE 19

The Detector

Inputs:

Annotated CFG for a procedure
Malicious code representation

Output:

Sequence of instructions exhibiting the malicious pattern

slide-20
SLIDE 20

Malicious Code Automaton

Abstraction of the vanilla virus
6-tuple (V, Σ, S, δ, S0, F)

V = { v1:τ1, …, vk:τk }
Σ = { Γ1, …, Γn }
S = finite set of states
δ : S × Σ → 2^S is a transition function
S0 ⊆ S is a non-empty set of initial states
F ⊆ S is a non-empty set of final states
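
The automaton can be sketched as a nondeterministic finite automaton whose alphabet is a set of pattern labels. This simplified version leaves out the typed variables V and their bindings, and the labels `load`, `irrelevant` and `xor` are hypothetical pattern names invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Automaton:
    states: set
    delta: dict    # (state, pattern label) -> set of next states
    initial: set   # S0
    final: set     # F

    def step(self, current, symbol):
        """Apply delta to every state in the current set."""
        nxt = set()
        for s in current:
            nxt |= self.delta.get((s, symbol), set())
        return nxt

    def accepts(self, symbols):
        """True if some run over the label sequence ends in a final state."""
        current = set(self.initial)
        for sym in symbols:
            current = self.step(current, sym)
        return bool(current & self.final)


# Hypothetical malicious pattern: a load, any amount of irrelevant code, an xor.
virus = Automaton(
    states={0, 1, 2},
    delta={(0, "load"): {1}, (1, "irrelevant"): {1}, (1, "xor"): {2}},
    initial={0},
    final={2},
)

print(virus.accepts(["load", "irrelevant", "irrelevant", "xor"]))
print(virus.accepts(["xor", "load"]))
```

The self-loop on `irrelevant` is what lets one automaton describe every variant that differs only by inserted garbage.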

slide-21
SLIDE 21

Malicious Code

slide-22
SLIDE 22
slide-23
SLIDE 23

Detector Operation

Inputs:

Annotated CFG PΣ
Malicious code automaton A = (V, Σ, S, δ, S0, F)

Determines whether the same (malicious) pattern occurs both in A and PΣ
More formally, tests the emptiness of L(PΣ) ∩ ( ∪_{B ∈ B_All} L(B(A)) )

slide-24
SLIDE 24

Detector Algorithm

Dataflow-like algorithm
Maintain a pre and post list at each node of the CFG PΣ

Each list holds pairs [s, Bs], where s is a state in A
Join operation is union
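
A worklist version of such a dataflow-like pass can be sketched as follows. It is a deliberately reduced illustration: the lists hold bare automaton states rather than [s, Bs] pairs (bindings are omitted), initial states are seeded only at the entry node, and the function name `detect`, the CFG encoding and the pattern labels are all assumptions.

```python
from collections import deque


def detect(cfg, annotations, delta, initial, final, entry):
    """Propagate automaton states over an annotated CFG; join is set union."""
    pre = {n: set() for n in cfg}
    post = {n: set() for n in cfg}
    pre[entry] = set(initial)
    work = deque(cfg)
    while work:
        n = work.popleft()
        # Transfer: advance every incoming state on every pattern matched at n.
        new_post = set()
        for s in pre[n]:
            for label in annotations.get(n, ()):
                new_post |= delta.get((s, label), set())
        if new_post != post[n]:
            post[n] = new_post
            for succ in cfg[n]:
                if not post[n] <= pre[succ]:
                    pre[succ] |= post[n]   # join with states from other paths
                    work.append(succ)
    return any(post[n] & final for n in cfg)


# Diamond-shaped CFG: node 1 branches to 2 and 3, which both reach 4.
cfg = {1: [2, 3], 2: [4], 3: [4], 4: []}
delta = {(0, "load"): {1}, (1, "irrelevant"): {1}, (1, "xor"): {2}}
malicious = {1: {"load"}, 2: {"irrelevant"}, 3: {"irrelevant"}, 4: {"xor"}}
benign = {1: {"load"}, 2: {"irrelevant"}, 3: {"irrelevant"}, 4: {"nop"}}

print(detect(cfg, malicious, delta, {0}, {2}, entry=1))
print(detect(cfg, benign, delta, {0}, {2}, entry=1))
```

State sets only grow and are bounded by the finite state set, so the loop terminates.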

slide-25
SLIDE 25

Detector Algorithm

Transfer Function:
Return:

slide-26
SLIDE 26

Defenses Against…

Code Re-ordering
Register Renaming
Insertion of irrelevant code

nops*, code that modifies dead registers
Needs live-range and pointer analyses

slide-27
SLIDE 27

Experimental Results

False Positive Rate: 0
False Negative Rate: 0

not all obfuscations are detected

slide-28
SLIDE 28

Performance

slide-29
SLIDE 29

Future Directions

New languages

Scripts: VB, JavaScript, ASP
Multi-language malicious code

Attack Diversity

worms, trojans too

Irrelevant sequence detection

Theorem provers

Use TAL/external type annotations

slide-30
SLIDE 30

Pitfalls/Criticisms?

Focus on viruses instead of worms
Still fairly ad hoc
Treatment of obfuscation is not formal enough
Intractable techniques

Use of theorem provers to find irrelevant code

Slow
No downloadable code
Not enough experimental evaluation