Detecting Self-Mutating Malware Using Control-Flow Graph Matching
Danilo Bruschi Lorenzo Martignoni Mattia Monga
Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano {bruschi,martign,monga}@dico.unimi.it
Detecting Self-Mutating Malware Using Control-Flow Graph Matching DIMVA2006 2
◮ Code obfuscation is a semantic-preserving program transformation
◮ Self-mutation is a particular form of code obfuscation, which is
◮ Self-mutation is adopted by malicious code to defeat detectors
◮ Self-mutation is applied during malicious code replication to generate
◮ Substitution of instructions
◮ Permutation of instructions
◮ Garbage insertion
◮ Substitution of variables
◮ Control flow alteration
◮ Cavity insertion
◮ Jump tables manipulation
◮ Data segment expansion
◮ Pattern matching fails since fragmentation and mutation make it hard to
◮ Emulation would require a complete tracing of analyzed programs as
◮ Heuristics based on ad-hoc predictable and observable alterations of
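A toy illustration of the first point, assuming a detector that looks for a fixed, contiguous instruction sequence. The signature, the instruction strings and the helper name are all hypothetical; the sketch only shows why a few inserted garbage instructions are enough to break such a match.

```python
# Hypothetical signature: a contiguous run of instructions the detector expects.
SIGNATURE = ["mov eax, 1", "add eax, ebx", "call payload"]

def contains_signature(code, signature):
    """Return True if `signature` occurs as a contiguous run inside `code`."""
    n = len(signature)
    return any(code[i:i + n] == signature for i in range(len(code) - n + 1))

original = ["push ebp", "mov eax, 1", "add eax, ebx", "call payload", "ret"]

# Same behaviour, but with semantically useless instructions interleaved.
mutated = ["push ebp", "mov eax, 1", "nop", "xchg ecx, ecx",
           "add eax, ebx", "call payload", "ret"]

print(contains_signature(original, SIGNATURE))  # True
print(contains_signature(mutated, SIGNATURE))   # False
```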
◮ Given a piece of code P which represents (or
◮ P is consequently reduced into a form, PN,
◮ Detection is performed by looking for known
◮ Analysis of the transformations adopted to implement self-mutation
◮ Transformations led to the generation of useless computations
◮ Most transformations are invertible
◮ Different instances of the same malware can be viewed as
◮ Code normalization can be performed adopting some of the well known
◮ Executable code is disassembled and translated into an intermediate
◮ Control-flow analysis and data-flow analysis are performed on the code
◮ Code transformations aim at:
◮ Identify all the instructions that do not contribute to the computation
◮ Rewrite and simplify algebraic expressions in order to statically evaluate
◮ Propagate values computed by intermediate instructions to the
◮ Analyze and try to evaluate control-flow transition conditions to identify
◮ Analyze indirect control flow transitions to discover the smallest set of
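Two of the transformations listed above (propagating and statically evaluating values, then dropping instructions that do not contribute to the result) can be sketched on a toy straight-line intermediate form. This is only an illustration under strong assumptions: the tuple format `(dest, op, arg1, arg2)`, the operator table and the function names are invented here, instructions are assumed to have no side effects, and it is not the normalizer described in these slides.

```python
import operator

# Hypothetical operator table for the toy intermediate form.
OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "xor": operator.xor}

def propagate_constants(code):
    """Forward pass: substitute known constants and fold constant expressions."""
    known, out = {}, []
    for dest, op, a, b in code:
        a, b = known.get(a, a), known.get(b, b)
        if op == "const":
            known[dest] = a
            out.append((dest, "const", a, None))
        elif isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
            out.append((dest, "const", known[dest], None))
        else:
            known.pop(dest, None)  # dest no longer holds a known constant
            out.append((dest, op, a, b))
    return out

def eliminate_dead_code(code, live_out):
    """Backward pass: keep only instructions whose result is still needed."""
    live, kept = set(live_out), []
    for dest, op, a, b in reversed(code):
        if dest in live:
            live.discard(dest)
            live.update(x for x in (a, b) if isinstance(x, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept

def normalize(code, live_out):
    return eliminate_dead_code(propagate_constants(code), live_out)

mutated = [
    ("t1", "const", 5, None),
    ("t2", "const", 3, None),
    ("g", "mul", "t1", "input"),   # garbage: g is never used
    ("t3", "add", "t1", "t2"),     # always 8, can be evaluated statically
    ("r", "add", "input", "t3"),
]
print(normalize(mutated, {"r"}))   # [('r', 'add', 'input', 8)]
```

Running the forward pass before the backward pass matters here: once `t3` is folded to a constant, the instructions defining `t1`, `t2` and `t3` no longer feed anything live and are removed.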
◮ We cannot expect to find a perfect matching of M in PN even if most
◮ The code comparator must be able to cope with some impurities left
◮ The normalized control-flow of the malware is constant
◮ PN is represented through its
◮ The malicious code detection can be
◮ The graphs are augmented with labels to
◮ Instructions and flow transitions are
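One way to picture the labelling, assuming each node is labelled with the coarse classes of the instructions it contains. The class table below is hypothetical (the slides do not list the actual classes), and `node_label` is an illustrative helper, not part of the prototype.

```python
# Hypothetical mapping from mnemonics to coarse instruction classes.
INSTRUCTION_CLASSES = {
    "add": "arith", "sub": "arith", "mul": "arith",
    "cmp": "test", "test": "test",
    "jz": "branch", "jnz": "branch", "jmp": "jump",
    "call": "call", "ret": "return",
    "mov": "assign", "push": "assign", "pop": "assign",
}

def node_label(instructions):
    """Label a node with the set of classes of the instructions it contains."""
    return frozenset(
        INSTRUCTION_CLASSES.get(ins.split()[0], "other") for ins in instructions
    )

# Two nodes that differ in registers and in the concrete opcodes used.
block_a = ["mov eax, 1", "add eax, ebx"]
block_b = ["push 1", "pop ecx", "add ecx, edx"]

print(node_label(block_a) == node_label(block_b))  # True
```

Because the label ignores operands and collapses related opcodes, residual differences in register choice or opcode selection do not change it.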
◮ The code normalizer is built on top of Boomerang, an open-source
◮ Translates machine code into the intermediate form through a recursive
◮ Performs data-flow analysis on the intermediate form
◮ Performs the normalization steps previously described (some of the
◮ Able to solve known patterns of indirection
◮ The prototype receives an executable file and emits its normalized
◮ The ICFGPN of the normalized program and the CFGM of the searched
◮ In case of match the comparison routine returns the set of ICFGPN
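The comparison step can be sketched as a search for a label-preserving subgraph isomorphism that returns the matched program nodes. The code below is a plain backtracking search written for illustration; it is not the matching algorithm used by the prototype, and the graph encoding (a label dictionary plus a set of directed edges), the node names and `find_subgraph_match` are assumptions.

```python
def find_subgraph_match(pattern, target):
    """Map every pattern node to a distinct target node with the same label,
    preserving edges and non-edges among the mapped nodes. Returns a dict
    or None when no match exists."""
    p_labels, p_edges = pattern
    t_labels, t_edges = target
    p_nodes = list(p_labels)

    def consistent(p, t, mapping):
        if p_labels[p] != t_labels[t]:
            return False
        if ((p, p) in p_edges) != ((t, t) in t_edges):
            return False
        for q, s in mapping.items():
            if ((p, q) in p_edges) != ((t, s) in t_edges):
                return False
            if ((q, p) in p_edges) != ((s, t) in t_edges):
                return False
        return True

    def extend(mapping):
        if len(mapping) == len(p_nodes):
            return dict(mapping)
        p = p_nodes[len(mapping)]
        used = set(mapping.values())
        for t in t_labels:
            if t not in used and consistent(p, t, mapping):
                mapping[p] = t
                result = extend(mapping)
                if result is not None:
                    return result
                del mapping[p]  # backtrack
        return None

    return extend({})

# Hypothetical malware CFG: an assignment followed by a two-node loop.
malware = ({"m0": "assign", "m1": "test", "m2": "arith"},
           {("m0", "m1"), ("m1", "m2"), ("m2", "m1")})

# Hypothetical program ICFG that embeds the same shape at nodes 2, 3, 4.
program = ({0: "assign", 1: "call", 2: "assign", 3: "test", 4: "arith", 5: "return"},
           {(0, 1), (1, 2), (2, 3), (3, 4), (4, 3), (3, 5)})

match = find_subgraph_match(malware, program)
print(match)  # {'m0': 2, 'm1': 3, 'm2': 4}
```

The values of the returned dictionary are the program nodes that correspond to the malware graph. The search is exponential in the worst case, so this form is only suitable for small graphs.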
◮ Several instances of the same self-mutating malicious code (the virus
◮ The normalized control-flow graphs were all isomorphic, they were not
◮ Different executables were collected and their ICFGs were built
◮ Each procedure CFG was used to simulate malicious code and searched
◮ The results of the subgraph isomorphism detection procedure were
◮ A random set of alleged false-positives and false-negatives were selected
◮ We proposed a general strategy, based on static analysis, that can be
◮ We developed a prototype tool and used it to show that a malware
◮ We showed that augmented control-flow graphs are well suited to
◮ Although subgraph isomorphism is an NP-complete problem, our
◮ Extend our prototype to perform normalization on real-world
◮ Evaluate algorithms for partial subgraph isomorphism matching and
◮ Perform more exhaustive experiments using new malicious code
◮ Investigate attacks and countermeasures to defeat static analysis