
Detecting Self-Mutating Malware Using Control-Flow Graph Matching
Danilo Bruschi, Lorenzo Martignoni, Mattia Monga
Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano
{bruschi,martign,monga}@dico.unimi.it
Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA), 2006

Outline
◮ Code obfuscation and self-mutation
◮ Strategies adopted to achieve self-mutation and code insertion
◮ Challenges for the detection
◮ Unveiling malicious code
◮ Code normalization
◮ Code comparison
◮ Prototype implementation
◮ Experimental results
◮ Summary and future work
D. Bruschi, L. Martignoni, M. Monga. Detecting Self-Mutating Malware Using Control-Flow Graph Matching. DIMVA 2006

Code obfuscation and self-mutation
◮ Code obfuscation is a semantics-preserving program transformation that can be used to make a program harder to understand
◮ Self-mutation is a particular form of code obfuscation, performed automatically by the code on itself
◮ Self-mutation is adopted by malicious code to defeat detectors
◮ Self-mutation is applied during malicious code replication to generate completely new, different instances

Self-mutation
Common transformations adopted to achieve self-mutation:
◮ Substitution of instructions
◮ Permutation of instructions
◮ Garbage insertion
◮ Substitution of variables
◮ Control-flow alteration
Signature matching becomes useless
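To illustrate why the transformations above defeat signature matching, here is a minimal sketch, assuming a toy model in which code is a list of instruction strings and a signature is a contiguous run of instructions. The helper names (`contains_signature`, `insert_garbage`) and the sample instructions are hypothetical and not taken from the slides; only garbage insertion is modeled.

```python
def contains_signature(code, signature):
    """Return True if `signature` occurs as a contiguous run of instructions in `code`."""
    n = len(signature)
    return any(code[i:i + n] == signature for i in range(len(code) - n + 1))


def insert_garbage(code, junk="nop"):
    """Toy garbage insertion: place a do-nothing instruction after every instruction."""
    mutated = []
    for instruction in code:
        mutated.extend([instruction, junk])
    return mutated


# Hypothetical archetype and signature, for illustration only.
archetype = ["mov eax, 5", "add eax, 3", "call payload"]
signature = ["mov eax, 5", "add eax, 3"]
mutant = insert_garbage(archetype)

print(contains_signature(archetype, signature))  # signature found in the archetype
print(contains_signature(mutant, signature))     # signature broken by the inserted junk
```

The mutant computes the same result as the archetype, yet the contiguous pattern no longer appears in it.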

Code insertion
Common techniques adopted for malicious code insertion:
◮ Cavity insertion
◮ Jump table manipulation
◮ Data segment expansion
The malicious code is seamlessly integrated into the host code

Challenges for the detection
Conventional detection techniques are likely to fail:
◮ Pattern matching fails because fragmentation and mutation make it hard to find signature patterns
◮ Emulation would require complete tracing of the analyzed programs, since the entry point of the guest is not known; moreover, every execution would have to be traced until the malicious payload is executed
◮ Heuristics based on ad hoc, predictable, and observable alterations of executables become useless when insertion produces almost no alteration of the static properties of the original binary
Theoretical studies (Chess & White) demonstrated that perfect detection of self-mutating malware is an undecidable problem

Devised strategy
Code interpretation and normalization
◮ Given a piece of code P that represents (or contains) an instance of a self-mutating malware, we automatically revert all the mutations performed on it
◮ P is consequently reduced to a form, P_N, which is close to its archetype M and can be recognized more easily
Code comparison
◮ Detection is performed by looking for known abstract patterns in the transformed program P_N

Code normalization
Code normalization: a program is transformed into a canonical form that is simpler in terms of structure or syntax, preserves the original semantics, and is more suitable for comparison
◮ Analysis of the transformations adopted to implement self-mutation, together with experimental observations, highlighted some weaknesses:
  ◮ Transformations lead to the generation of useless computations
  ◮ Most transformations are invertible
◮ Different instances of the same malware can be viewed as under-optimized versions of the archetype; the archetype is consequently the normal form of the malicious code
◮ Code normalization can be performed by adopting some of the well-known techniques used by compilers to produce compact and efficient code

Code normalization
Some details
◮ Executable code is disassembled and translated into an intermediate form that makes the semantics of each machine instruction explicit
◮ Control-flow analysis and data-flow analysis are performed on the code to collect information that will be used by the next step
◮ Code transformations aim to:
  ◮ Identify all the instructions that do not contribute to the computation (dead and unreachable code elimination)
  ◮ Rewrite and simplify algebraic expressions in order to statically evaluate most of their sub-expressions (algebraic simplification)
  ◮ Propagate values computed by intermediate instructions to the appropriate use sites (expression propagation)
  ◮ Analyze and try to evaluate control-flow transition conditions to identify tautologies, and rearrange the control flow to reduce the number of flow transitions (control-flow normalization)
  ◮ Analyze indirect control-flow transitions to discover the smallest set of valid targets and the paths originating from them (indirection resolution)
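The transformations listed above can be sketched on a toy example. This is a minimal illustration, not the authors' prototype. It assumes a hypothetical intermediate form of `(dest, op, a, b)` tuples, straight-line code only, and instructions with no side effects beyond writing `dest`. It covers constant propagation with folding, one algebraic simplification (`x xor x` and `x sub x` become 0), and dead-code elimination; control-flow normalization and indirection resolution are not modeled.

```python
def fold(op, a, b):
    """Statically evaluate a binary operation on two integer constants."""
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "xor":
        return a ^ b
    raise ValueError(f"unsupported op: {op}")


def normalize(block, live_out):
    """Normalize straight-line code given the registers that are live at its end."""
    consts = {}       # register -> constant value known at this point
    propagated = []
    # Forward pass: propagate constants, fold expressions, simplify x^x and x-x.
    for dest, op, a, b in block:
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "mov" and isinstance(a, int):
            consts[dest] = a
        elif op != "mov" and isinstance(a, int) and isinstance(b, int):
            consts[dest] = fold(op, a, b)
            op, a, b = "mov", consts[dest], None
        elif op in ("xor", "sub") and isinstance(a, str) and a == b:
            consts[dest] = 0
            op, a, b = "mov", 0, None
        else:
            consts.pop(dest, None)  # dest now holds a value unknown at analysis time
        propagated.append((dest, op, a, b))
    # Backward pass: drop instructions whose result is never used (dead code).
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(propagated):
        if dest not in live:
            continue
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept


# Hypothetical mutated instance: garbage on ebx, obfuscated zero in ecx.
mutated = [
    ("eax", "mov", 5, None),
    ("ebx", "mov", 7, None),
    ("ebx", "add", "ebx", 1),
    ("ecx", "xor", "ecx", "ecx"),
    ("eax", "add", "eax", "ecx"),
    ("eax", "add", "eax", 3),
]
archetype = [("eax", "mov", 8, None)]

print(normalize(mutated, {"eax"}))
print(normalize(mutated, {"eax"}) == normalize(archetype, {"eax"}))
```

In this toy case the mutated instance and the archetype reduce to the same normal form, which mirrors the slide's view of a mutated instance as an under-optimized version of the archetype.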

Code comparison
Given the normalized program, we need to answer the question: “is the program P_N hosting the malware M?”
◮ We cannot expect to find a perfect match of M in P_N, even if most of the transformations have been reverted
◮ The code comparator must be able to cope with some impurities left by normalization (we observed that these impurities are always local to basic blocks)
◮ The normalized control flow of the malware is constant

Code comparison
Some details
◮ P_N is represented through its interprocedural control-flow graph (ICFG) and M through its control-flow graph
◮ Malicious code detection can be formulated as a subgraph isomorphism decision problem: “given two graphs G_1 and G_2, is G_1 isomorphic to a subgraph of G_2?” (G_1 is M and G_2 is P_N)
◮ The graphs are augmented with labels to achieve the necessary trade-off between precision and abstraction (to handle possible impurities)
◮ Instructions and flow transitions are partitioned into classes; labels describe the set of classes into which the instructions of a basic block can be grouped
Instruction classes: Integer arithmetic, Float arithmetic, Logic, Comparison, Function call, ...
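The labeled subgraph matching described above can be sketched with a simple backtracking search. This is a hedged illustration, not the comparator used in the prototype. It assumes graphs stored as dictionaries mapping each node to its set of successors, node labels stored as frozensets of instruction classes, label equality as the compatibility test, and the non-induced variant of the problem (every edge of G_1 must map to an edge of G_2). The function name `find_embedding` and the sample graphs are hypothetical.

```python
def find_embedding(g1, labels1, g2, labels2):
    """Return an injective, edge- and label-preserving map from g1 into g2, or None."""
    order = list(g1)

    def consistent(u, v, mapping):
        if labels1[u] != labels2[v]:
            return False
        if u in g1[u] and v not in g2[v]:       # self-loop on u must exist on v
            return False
        for p, q in mapping.items():
            if u in g1[p] and v not in g2[q]:   # edge p -> u must map to q -> v
                return False
            if p in g1[u] and q not in g2[v]:   # edge u -> p must map to v -> q
                return False
        return True

    def extend(i, mapping, used):
        if i == len(order):
            return dict(mapping)
        u = order[i]
        for v in g2:
            if v not in used and consistent(u, v, mapping):
                mapping[u] = v
                used.add(v)
                result = extend(i + 1, mapping, used)
                if result is not None:
                    return result
                del mapping[u]
                used.discard(v)
        return None

    return extend(0, {}, set())


# Hypothetical malware CFG M: a diamond of four basic blocks.
m = {"a": {"b", "c"}, "b": {"d"}, "c": {"d"}, "d": set()}
m_labels = {
    "a": frozenset({"comparison"}),
    "b": frozenset({"integer arithmetic"}),
    "c": frozenset({"function call"}),
    "d": frozenset({"logic"}),
}

# Hypothetical host graph P_N containing the diamond among other blocks.
pn = {
    "n0": {"n1"}, "n1": {"n2"}, "n2": {"n3", "n4"},
    "n3": {"n5"}, "n4": {"n5"}, "n5": {"n6"}, "n6": set(),
}
pn_labels = {
    "n0": frozenset({"logic"}),
    "n1": frozenset({"float arithmetic"}),
    "n2": frozenset({"comparison"}),
    "n3": frozenset({"integer arithmetic"}),
    "n4": frozenset({"function call"}),
    "n5": frozenset({"logic"}),
    "n6": frozenset({"function call"}),
}

print(find_embedding(m, m_labels, pn, pn_labels))
```

The search takes exponential time in the worst case, since subgraph isomorphism is NP-complete. The labels prune most candidate pairs early, which is one practical reason for attaching them to the graphs.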

