SLIDE 1

Automatic Analysis of Malware Behavior using Machine Learning

Konrad Rieck, Philipp Trinius, Carsten Willems, Thorsten Holz

Peng Su

CISC850 Cyber Analytics

SLIDE 2

Automatic Analysis of Malware Behavior

  • Malware threatens the Internet
  • Dynamic vs. static analysis
  • Malware uses binary packers, encryption, or self-modifying code to obstruct static analysis
  • Dynamic analysis monitors the behavior of malicious software during run-time

SLIDE 3

Automatic Analysis of Malware Behavior

SLIDE 4

Monitoring of Malware Behavior

  • Malware sandboxes: CWSandbox
  • Malware Instruction Set

SLIDE 5

Malware Instruction Set

  • MIST instructions keep stable and discriminative patterns, such as directory and mutex names, at the beginning.

SLIDE 6

Embedding of Malware Behavior

  • Embedding using Instruction Q-grams
  • Comparing embedded reports

SLIDE 7

Embedding using Instruction Q-grams

  • For example, take the report x = '1|A 2|A 1|A 2|A' over the instruction alphabet A = {1|A, 2|A}, with q-gram length q = 2.
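The example above can be worked through in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `qgram_embedding` is hypothetical, and it assumes each dimension of the embedding is a binary indicator for whether a q-gram occurs in the report, with q-grams enumerated in sorted order.

```python
from itertools import product


def qgram_embedding(report, alphabet, q):
    """Map a report (space-separated instructions) to a binary q-gram vector.

    Assumption: one dimension per possible q-gram over the alphabet,
    set to 1 if the q-gram occurs in the report and 0 otherwise.
    """
    instructions = report.split()
    # Every window of q consecutive instructions in the report.
    present = {
        tuple(instructions[i:i + q])
        for i in range(len(instructions) - q + 1)
    }
    # Fixed ordering of all |A|^q possible q-grams.
    all_qgrams = list(product(sorted(alphabet), repeat=q))
    return [1 if gram in present else 0 for gram in all_qgrams]


vector = qgram_embedding("1|A 2|A 1|A 2|A", {"1|A", "2|A"}, 2)
print(vector)  # dimensions: (1|A 1|A), (1|A 2|A), (2|A 1|A), (2|A 2|A)
```

Only the q-grams "1|A 2|A" and "2|A 1|A" occur in x, so only those two dimensions are set.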

SLIDE 8

Embedding using Instruction Q-grams

  • Normalization
  • Counters bias from redundancy of behavior, the considered alphabet, and the length of reports
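A minimal sketch of the normalization step, assuming it means scaling each embedded vector to unit Euclidean norm (the slide does not state the norm; the helper name `normalize` is hypothetical):

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm (assumed normalization)."""
    norm = math.sqrt(sum(v * v for v in vector))
    if norm == 0:
        # An all-zero vector has no direction; return it unchanged.
        return list(vector)
    return [v / norm for v in vector]


unit = normalize([0, 1, 1, 0])
```

After this step, a long report and a short report that contain the same set of q-grams map to the same point.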

SLIDE 9

Comparing Embedded Reports

  • Euclidean distance
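As a rough illustration, the Euclidean distance between two embedded reports can be computed as below. The function name is hypothetical. If both vectors have unit norm and non-negative entries, the distance lies between 0 (identical behavior) and the square root of 2 (no shared q-grams).

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


same = euclidean_distance([1.0, 0.0], [1.0, 0.0])
disjoint = euclidean_distance([1.0, 0.0], [0.0, 1.0])
```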

SLIDE 10

Clustering and Classification

  • Prototypes -> Clustering -> Classification

SLIDE 11

Prototype Extraction
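The slide gives no detail here, so the following is only a sketch of one common way to extract prototypes: a greedy farthest-first selection that keeps adding the report farthest from all current prototypes until every report is within a maximum distance of some prototype. The names `extract_prototypes` and `max_dist` are hypothetical, and starting from the first report is an assumption.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def extract_prototypes(vectors, max_dist):
    """Greedy farthest-first prototype selection (illustrative sketch).

    Returns indices of prototypes such that every vector lies within
    max_dist of at least one prototype.
    """
    if max_dist < 0:
        raise ValueError("max_dist must be non-negative")
    if not vectors:
        return []
    prototypes = [0]  # assumption: start from the first report
    # Distance from each vector to its nearest prototype so far.
    nearest = [euclidean(v, vectors[0]) for v in vectors]
    while True:
        far = max(range(len(vectors)), key=nearest.__getitem__)
        if nearest[far] <= max_dist:
            break  # every vector is covered by some prototype
        prototypes.append(far)
        for i, v in enumerate(vectors):
            nearest[i] = min(nearest[i], euclidean(v, vectors[far]))
    return prototypes


points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.0, 0.1)]
chosen = extract_prototypes(points, 0.5)
```

Later stages can then operate on the prototypes alone instead of on every report, which is the point of extracting them first.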

SLIDE 12

Clustering using Prototypes

SLIDE 13

Classification using Prototypes

SLIDE 14

Incremental Analysis

SLIDE 15

Experiments & Application

  • Evaluation Data
  • Three parameters to decide
  • Evaluation of Components
  • How to select the best parameters dp, dc, dr

SLIDE 16

Evaluation Data

  • A reference data set: used to evaluate and calibrate the framework
  • An application data set: used to assess performance on unknown malware

SLIDE 17

Reference Data Set

SLIDE 18

Application Data Set

SLIDE 19

Evaluation of Components

  • Precision and recall
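The slide does not spell out the definitions. One plausible reading for clustering, assumed here, is purity-style: precision counts, for each cluster, the reports that share its most frequent label; recall counts, for each label, the reports that fall into its largest single cluster. The function name is hypothetical.

```python
from collections import Counter


def clustering_precision_recall(cluster_ids, labels):
    """Purity-style precision and recall of a clustering (assumed definitions)."""
    n = len(labels)
    if n == 0 or n != len(cluster_ids):
        raise ValueError("need equally long, non-empty inputs")
    per_cluster = {}  # cluster id -> Counter of labels inside it
    per_label = {}    # label -> Counter of clusters it is spread over
    for c, y in zip(cluster_ids, labels):
        per_cluster.setdefault(c, Counter())[y] += 1
        per_label.setdefault(y, Counter())[c] += 1
    precision = sum(max(cnt.values()) for cnt in per_cluster.values()) / n
    recall = sum(max(cnt.values()) for cnt in per_label.values()) / n
    return precision, recall


p, r = clustering_precision_recall(
    [0, 0, 0, 1, 1, 2],
    ["a", "a", "b", "b", "b", "a"],
)
```

Under these definitions, many small pure clusters give high precision but low recall, while one large cluster gives the reverse.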

SLIDE 20

Evaluation of Components

  • F-measure
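A short sketch of the F-measure, assuming the standard balanced form (the harmonic mean of precision and recall); the function name is hypothetical:

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


score = f_measure(0.5, 1.0)
```

A single score like this is convenient for comparing parameter settings, since it penalizes a setting that does well on only one of the two measures.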
SLIDE 21

Evaluation of Components--dp

SLIDE 22

Evaluation of Components--dc

SLIDE 23

Evaluation of Components--dr

SLIDE 24

Comparative Evaluation with State-of-the-Art

SLIDE 25

An Application Scenario
