Compact Explanation of Why Malware is Bad Speaker: Wei Chen 1 Joint - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Compact Explanation of Why Malware is Bad

Speaker: Wei Chen (1)
Joint work with Charles Sutton (1), David Aspinall (1), Andrew D. Gordon (1, 2), Igor Muttik (3), and Qi Shen (4)
App Guarden Project

(1) University of Edinburgh, (2) Microsoft Research Cambridge, (3) Intel Security, (4) Peking University

slide-2
SLIDE 2

Android Applications

slide-3
SLIDE 3

Threats Around Android Applications

Steal Personal Information, Locate, Bank Trojans, Premium SMS, Eavesdrop, Botnet Control, Root Access

slide-4
SLIDE 4

An Example: the Flashlight Application

slide-5
SLIDE 5

Flashlight: Some Comments from General Users

“Why in the world would I want a flashlight app that collects so much info about me?” (from a user)

“This app is extremely bright and does its job well. I don’t know what others mean when they say that they have so many problems with it.” (from another user)

slide-6
SLIDE 6

Flashlight: the Generated Behaviour Automaton

slide-7
SLIDE 7

Motivation & Goal

◮ Motivation: it is not enough to simply identify an application as malware; we need to convince people the identification is correct and explain what it means.

◮ Goal: automatically generate compact explanations for a broad range of people by producing short paragraphs, compact behaviour automata, and statistical charts.

◮ Potential Benefits: help people better understand potential threats hidden in mobile applications; provide hints for malware analysts before more expensive investigation; support automatic generation of security and anti-security policies, etc.

slide-8
SLIDE 8

Examples: Some Generated Explanations in Paragraphs

◮ This is a trojan which steals personal information from the infected device. It can be controlled over the web through HTTP. (an instance of Droidkungfu)

◮ It sends SMS messages to premium rated numbers. (an instance of Opfake)

◮ This application might be a Chatting application, but, after a USB massive storage is connected, it will: retrieve a class in a runnable package; read information about networks; connect to Internet. (an instance of Basebridge)

◮ This application is declared as an Anti-Virus application, but, it will: read your phone state after a phone call is made; read your phone state then connect to Internet; send SMS messages after a phone call is made. (an instance of Zitmo)

slide-9
SLIDE 9

Examples: a Generated Explanation in Behaviour Automata

[Figure: behaviour automaton with states q0, q1, q2; transitions labelled SMS_RECEIVED and SEND_SMS, INTERNET]

(from instances of Ggtracker and Zitmo)

slide-10
SLIDE 10

Technical Challenges

◮ Formalisation: characterise and formalise an application’s behaviours precisely and efficiently.

◮ Learning: develop an effective and efficient method to learn unexpected behaviours from thousands of sample malware and benign applications.

◮ Explanation: decide whether a target application is malware and, if so, automatically generate explanations from its unexpected behaviours.

◮ Evaluation: evaluate identified unexpected behaviours and generated explanations.

slide-11
SLIDE 11

The Approach: Formalisation

◮ We characterise an application’s behaviour by an extended Büchi automaton, i.e., finite and infinite control-dependence sequences of events, actions, and annotated API calls, which we call a behaviour automaton.

◮ We have designed and implemented a static analysis tool to construct such a behaviour automaton directly from each Android application to approximate its behaviours.

◮ We have to consider a broad range of features of Java and the Android framework, e.g., multi-threading, multiple entry points, inter-procedural calls, callbacks, component life-cycles, inter-component communication, and runtime-registered listeners.
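
One way to picture a behaviour automaton is as a labelled transition system. The sketch below is only an illustration under stated assumptions, not the static analysis tool described above: the class name `BehaviourAutomaton`, its methods, and the sample transitions are all hypothetical, and it checks finite label sequences only (the infinite, Büchi-style acceptance is left out).

```python
class BehaviourAutomaton:
    """Hypothetical, simplified behaviour automaton: states plus labelled edges."""

    def __init__(self, initial):
        self.initial = initial
        # state -> label -> set of successor states (non-deterministic)
        self.edges = {}

    def add_transition(self, src, label, dst):
        self.edges.setdefault(src, {}).setdefault(label, set()).add(dst)

    def admits(self, labels):
        """True if the finite label sequence can be followed from the initial state."""
        current = {self.initial}
        for label in labels:
            current = {
                dst
                for state in current
                for dst in self.edges.get(state, {}).get(label, ())
            }
            if not current:
                return False
        return True


# Illustrative transitions only; a real automaton would come from static analysis.
automaton = BehaviourAutomaton("q0")
automaton.add_transition("q0", "SMS_RECEIVED", "q1")
automaton.add_transition("q1", "Receiver: getDeviceId", "q2")
automaton.add_transition("q2", "Receiver: openConnection", "q2")

print(automaton.admits(["SMS_RECEIVED", "Receiver: getDeviceId"]))
print(automaton.admits(["click"]))
```

Storing successors as sets keeps the sketch non-deterministic, which seems a natural fit for an over-approximating static analysis.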

slide-12
SLIDE 12

The Approach: Formalisation

[Figure: behaviour automaton with states q0 to q6; transitions labelled MAIN, SMS_RECEIVED, click, AsyncTask: sendTextMessage, Receiver: getDeviceId, Receiver: getLine1Number, and Receiver: openConnection]

(the behaviour automaton of an Android application)

slide-13
SLIDE 13

The Approach: Formalisation

[Figure: abstract behaviour automaton with states q0 to q6; transitions labelled MAIN, SMS_RECEIVED, click, SEND_SMS, READ_PHONE_STATE, and INTERNET]

(the abstract behaviour automaton of an Android application)

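
Comparing the two automata suggests that the abstraction replaces each annotated API call with the Android permission that guards it, while events and actions stay as they are. The following is a minimal sketch of that idea. The lookup table is read off from the labels on the two slides, the function names are hypothetical, and the sample transitions are illustrative rather than the real figure.

```python
# Hypothetical lookup table inferred by comparing the two automata.
# Android permission names are written with underscores here.
API_TO_PERMISSION = {
    "sendTextMessage": "SEND_SMS",
    "getDeviceId": "READ_PHONE_STATE",
    "getLine1Number": "READ_PHONE_STATE",
    "openConnection": "INTERNET",
}


def abstract_label(label):
    """Map an annotated API call such as 'Receiver: getDeviceId' to a permission.

    Events and actions (MAIN, SMS_RECEIVED, click, ...) are returned unchanged.
    """
    # Drop the component annotation ("AsyncTask:", "Receiver:") if present.
    api_name = label.split(":", 1)[-1].strip()
    return API_TO_PERMISSION.get(api_name, label)


def abstract_transitions(transitions):
    """Apply the abstraction to every (source, label, target) transition."""
    return [(src, abstract_label(label), dst) for src, label, dst in transitions]


# Illustrative transitions; the state connections are assumptions.
concrete = [
    ("q0", "SMS_RECEIVED", "q4"),
    ("q4", "Receiver: getDeviceId", "q5"),
    ("q5", "Receiver: openConnection", "q6"),
]
print(abstract_transitions(concrete))
```

Because two API calls can map to the same permission, the abstract automaton usually has fewer distinct labels than the concrete one.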
slide-14
SLIDE 14

The Approach: Learning

◮ An unexpected behaviour is a salient sub-automaton.

◮ A common pattern exhibited by malware instances in a human-decided malware family, which is rarely seen in benign applications.

◮ We have developed a learning-centred method to capture unexpected behaviours from thousands of malware instances across hundreds of families.

◮ Differentiate salient features by adding benign applications for training.

$$F_k = \{ \oplus_{A \in G_k} A \mid \oplus \in \{-, \cap\} \}$$

$$\min_w \Big( \sum_j |w_j| + \sum_i \log\big(1 + e^{-y_i w^T x_i}\big) \Big) \wedge x_i \in F_k^{|F_k|}$$

$$F_k \leftarrow \{ j \in F_k \mid w_j = 0 \}$$

$$F_k \otimes F_l = \{ A - B,\ A \cap B,\ B - A \mid A \in F_k \wedge B \in F_l \}$$

$$U = \{ j \in F \mid w_j < 0 \}$$
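
The minimisation above has the shape of L1-regularised logistic regression. The sketch below shows one standard way to solve such a problem, proximal gradient descent, on a toy data set. It is not claimed to be the authors' solver. The function names, the toy data, and the label convention (y = -1 for malware, y = +1 for benign, chosen so that malware-indicative features receive negative weights, as in the set U) are assumptions.

```python
import math


def objective(w, xs, ys):
    """sum_j |w_j| + sum_i log(1 + exp(-y_i * w.x_i))."""
    l1 = sum(abs(wj) for wj in w)
    loss = 0.0
    for x, y in zip(xs, ys):
        margin = y * sum(wj * xj for wj, xj in zip(w, x))
        loss += math.log1p(math.exp(-margin))
    return l1 + loss


def soft_threshold(value, amount):
    """Shrink value towards zero by amount; this step produces exact zeros."""
    if value > amount:
        return value - amount
    if value < -amount:
        return value + amount
    return 0.0


def fit_l1_logistic(xs, ys, steps=500):
    """Proximal gradient descent for the objective above (toy-sized inputs only)."""
    dim = len(xs[0])
    # Safe step size: the logistic-loss gradient is Lipschitz with constant
    # at most 0.25 * sum_i ||x_i||^2.
    step = 1.0 / (0.25 * sum(xj * xj for x in xs for xj in x))
    w = [0.0] * dim
    for _ in range(steps):
        grad = [0.0] * dim
        for x, y in zip(xs, ys):
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            coeff = -y / (1.0 + math.exp(margin))
            for j in range(dim):
                grad[j] += coeff * x[j]
        w = [soft_threshold(w[j] - step * grad[j], step) for j in range(dim)]
    return w


# Toy data: x[0] = 1 if the app contains a candidate sub-automaton seen only in
# malware, x[1] = 1 for a sub-automaton seen equally in malware and benign apps.
xs = [[1, 1], [1, 0], [1, 1], [1, 0], [0, 1], [0, 0], [0, 1], [0, 0]]
ys = [-1, -1, -1, -1, 1, 1, 1, 1]

w = fit_l1_logistic(xs, ys)
unexpected = [j for j, wj in enumerate(w) if wj < 0]
print(w, unexpected)
```

On this toy input the feature shared by both classes is driven to exactly zero by the L1 term, while the malware-only feature keeps a negative weight and therefore lands in the negative-weight set.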

slide-15
SLIDE 15

The Approach: Learning

[Figure: behaviour automaton with states q0, q1, q2; transitions labelled SMS_RECEIVED and SEND_SMS, INTERNET]

(from instances of Ggtracker and Zitmo)

slide-16
SLIDE 16

The Approach: Learning

◮ A singular behaviour identified in a group of applications which have similar behaviours.

slide-17
SLIDE 17

The Approach: Learning

[Figure: Accessing Your Location; legend: normal, singular]

slide-18
SLIDE 18

The Approach: Learning

[Figure: Sending SMS Messages; legend: normal, singular]

slide-19
SLIDE 19

The Approach: Learning

[Diagram: learning pipeline. Labels: Pre-labelled Apps (Group A, Group B, Group C, ...); Clustering; Fine-Grained Groups (Auto_1, Auto_2, Auto_2, Auto_3, ...); Training Data; Training; Context-Sensitive Unexpected Behaviours]

slide-20
SLIDE 20

The Approach: Explanation

[Diagram: explanation pipeline. Labels: A Target App; Group B; [Similarity]; Auto_1, Auto_2; Check; Auto_x, Auto_y; Explanation Templates; Sentences; Auto_a; Generation]

◮ Present behaviour automata directly.

◮ Extract singletons, i.e., API calls, actions, and events, from automata, and search by keywords for sentences in malware analysis reports.

◮ Extract pairs, i.e., one thing happening before another, from automata, and feed them through pre-defined templates.
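
The extraction of singletons and pairs can be sketched as follows. This is a minimal illustration that assumes an automaton is given as a list of (source, label, target) transitions and that "before" means directly preceding, which may be narrower than the notion used in the actual tool. The function names and the sample transitions are hypothetical.

```python
def singletons(transitions):
    """All distinct labels (API calls, actions, events) occurring in the automaton."""
    return sorted({label for _, label, _ in transitions})


def pairs(transitions):
    """Ordered label pairs (a, b) where a transition labelled b leaves the state
    that a transition labelled a enters, i.e. b can directly follow a."""
    result = set()
    for _, first, middle in transitions:
        for src, second, _ in transitions:
            if src == middle:
                result.add((first, second))
    return sorted(result)


# Illustrative transitions only.
transitions = [
    ("q0", "SMS_RECEIVED", "q1"),
    ("q1", "READ_PHONE_STATE", "q2"),
    ("q2", "INTERNET", "q2"),
]
print(singletons(transitions))
print(pairs(transitions))
```

The singletons would then serve as search keywords, and each pair would be handed to a template.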

slide-21
SLIDE 21

The Approach: Explanation — Sentence Searching

intercepts incoming sms messages and forwards them to a remote server including informations like imsi and imei these applications send premium sms messages the application will run in the background gathering sms activity and periodically send it to a proxy email address it sends sms messages to premium rated numbers and tries to hide this action from the malware investigators by using some kind of steganography this trojan steals personal information and receives commands via sms steals information sms messages imei imsi etc sending sms messages it sends sms messages to premium rated numbers it sends sms texts to all contacts on the device and sends an sms text to report its installation without the affected users knowledge and possibly resulting in data charges for the owner of the affected device allows an application to send sms messages allows an application to write sms messages sends sms messages to premium rated numbers this malware also sends sms messages sends sms spam messages afterwards the trojan sends sms messages to phone numbers listed in this configuration file this malware attempts to send premium rated sms messages the trojan gathers the following information from the compromised device sms messages phone number and the imei of the infected device it also sends sms messages

slide-22
SLIDE 22

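The Approach: Explanation — Sentence Searching

Choose the central sentence.

$$V[s][a] = \begin{cases} \mathrm{tfidf}(a, s, C), & \text{if } a \text{ is a word of } s; \\ 0, & \text{otherwise.} \end{cases}$$

$$\sigma_V(s) = \sum_{t \in C} \cos(\theta_{V[s], V[t]})$$

$$\arg\max_{s' \in C} \sigma_V(s')$$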
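
The selection rule above can be sketched in a few lines. The slide does not say which tf-idf variant is used, so the sketch assumes raw term frequency multiplied by log(N / df). The function names are hypothetical, and the three-sentence corpus is a toy input rather than the real collection of report sentences.

```python
import math
from collections import Counter


def tfidf_vectors(corpus):
    """Build the sparse vector V[s] for every sentence s in the corpus C."""
    tokenised = [sentence.split() for sentence in corpus]
    n = len(corpus)
    # Document frequency: number of sentences that contain each word.
    df = Counter(word for words in tokenised for word in set(words))
    vectors = []
    for words in tokenised:
        tf = Counter(words)
        # Assumed variant: raw term frequency times log(N / df).
        vectors.append({word: tf[word] * math.log(n / df[word]) for word in tf})
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors (0.0 if either is zero)."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def central_sentence(corpus):
    """Return the sentence s' that maximises sigma_V(s') = sum_t cos(V[s'], V[t])."""
    vectors = tfidf_vectors(corpus)

    def sigma(index):
        return sum(cosine(vectors[index], other) for other in vectors)

    best = max(range(len(corpus)), key=sigma)
    return corpus[best]


toy_corpus = ["sends sms", "sends sms", "steals data"]
print(central_sentence(toy_corpus))
```

A sentence that resembles many others accumulates a high score, so the rule tends to pick the most representative phrasing rather than the longest one.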

slide-23
SLIDE 23

The Approach: Explanation — Sentence Searching

intercepts incoming sms messages and forwards them to a remote server including informations like imsi and imei these applications send premium sms messages the application will run in the background gathering sms activity and periodically send it to a proxy email address it sends sms messages to premium rated numbers and tries to hide this action from the malware investigators by using some kind of steganography this trojan steals personal information and receives commands via sms steals information sms messages imei imsi etc sending sms messages

it sends sms messages to premium rated numbers

it sends sms texts to all contacts on the device and sends an sms text to report its installation without the affected users knowledge and possibly resulting in data charges for the owner of the affected device allows an application to send sms messages allows an application to write sms messages sends sms messages to premium rated numbers this malware also sends sms messages sends sms spam messages afterwards the trojan sends sms messages to phone numbers listed in this configuration file this malware attempts to send premium rated sms messages the trojan gathers the following information from the compromised device sms messages phone number and the imei of the infected device it also sends sms messages

slide-24
SLIDE 24

The Approach: Explanation — Pre-Defined Templates

Feature | Template                             | Example
API     | do sth.                              | read your phone state
ACTION  | sth. happens                         | the app has finished booting
EVENT   | the user does sth.                   | the user clicks a view and holds
(A, A)  | do sth. then do sth.                 | read your phone state then connect to Internet
(A, C)  | do sth. then sth. happens            | read SMS then the app makes a phone call
(C, A)  | after sth. happens do sth.           | after the system has finished booting read your phone state
(E, A)  | when the user does sth. do sth.      | when the user touches the screen get your precise location
(E, C)  | when the user does sth. sth. happens | when the user performs a gesture the app sends some data out
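
The pair templates above can be applied mechanically once each label has a kind (A for API, C for ACTION, E for EVENT) and a phrase. The sketch below assumes a tiny phrase table built from the examples on this slide. The dictionary names, the `touch` label, and the function `explain_pair` are hypothetical.

```python
# Hypothetical phrase table: label -> (kind, phrase), where the kind is
# A = API, C = ACTION, E = EVENT, and phrases are taken from the examples.
PHRASES = {
    "READ_PHONE_STATE": ("A", "read your phone state"),
    "INTERNET": ("A", "connect to Internet"),
    "BOOT_COMPLETED": ("C", "the system has finished booting"),
    "touch": ("E", "the user touches the screen"),
}

# One format string per pair template.
PAIR_TEMPLATES = {
    ("A", "A"): "{0} then {1}",
    ("A", "C"): "{0} then {1}",
    ("C", "A"): "after {0} {1}",
    ("E", "A"): "when {0} {1}",
    ("E", "C"): "when {0} {1}",
}


def explain_pair(first, second):
    """Render 'first happens before second' as text, or None if no template fits."""
    kind_first, phrase_first = PHRASES[first]
    kind_second, phrase_second = PHRASES[second]
    template = PAIR_TEMPLATES.get((kind_first, kind_second))
    if template is None:
        return None
    return template.format(phrase_first, phrase_second)


print(explain_pair("READ_PHONE_STATE", "INTERNET"))
print(explain_pair("BOOT_COMPLETED", "READ_PHONE_STATE"))
```

Pairs whose kinds have no template, such as two actions in a row, are skipped here. Whether the real system handles them differently is not stated on the slide.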

slide-25
SLIDE 25

The Approach: Evaluation

◮ Subjective comparison with descriptions of malware families, which were collected from online technical reports produced by malware analysts or third-party researchers.

◮ Use unexpected behaviours as input features to train classifiers and check that they achieve good classification performance.

◮ Conduct a survey asking users which explanation is better.

slide-26
SLIDE 26

The Approach: Evaluation

◮ Malware Family: BaseBridge

◮ Manually Produced Description: “A Trojan horse that attempts to send premium-rate SMS messages to predetermined numbers.” from Symantec. “Forwards confidential details (SMS, IMSI, IMEI) to a remote server.” from Forensic.

◮ Automatic Explanation: It sends sms messages to premium rated numbers. This is a Trojan which steals personal information from the infected device.

slide-27
SLIDE 27

The Approach: Evaluation

◮ Malware Family: DroidKungfu

◮ Manually Produced Description: “A Trojan that sends sensitive information to an attacker and includes backdoor functionality. It also exploits vulnerabilities to gain root access.” from McAfee. “Collects a variety of information on the infected phone (IMEI, device, OS version, etc.). The collected information is dumped to a local file which is sent to a remote server afterwards” from Forensic.

◮ Automatic Explanation: This is a Trojan which steals personal information from the infected device. It can be controlled over the web through http.

slide-28
SLIDE 28

The Approach: Evaluation

Family | Manual Description | Learnt Unexpected Behaviour in Regular Expressions and ω-Languages

Family: Arspam
Manual Description: Sends spam SMS messages to contacts on the compromised device [4].
Learnt Unexpected Behaviour:
  1. BOOT_COMPLETED . SEND_SMS

Family: Anserverbot
Manual Description: Downloads, installs, and executes payloads [3].
Learnt Unexpected Behaviour:
  1. UMS_CONNECTED . LOAD_CLASS^ω . (ACCESS_NETWORK_STATE | READ_PHONE_STATE | INTERNET) . (ACCESS_NETWORK_STATE | READ_PHONE_STATE | INTERNET | LOAD_CLASS)^ω

Family: Basebridge
Manual Description: Forwards confidential details (SMS, IMSI, IMEI) to a remote server [1]. Downloads and installs payloads [3, 4].
Learnt Unexpected Behaviour:
  1. UMS_CONNECTED . (INTERNET | LOAD_CLASS | READ_PHONE_STATE | ACCESS_NETWORK_STATE)^ω+

Family: Cosha
Manual Description: Monitors and sends certain information to a remote location [4].
Learnt Unexpected Behaviour:
  1. MAIN . click . (click | ACCESS_FINE_LOCATION | DIAL)^ω . DIAL . (click | ACCESS_FINE_LOCATION | DIAL)^ω . (INTERNET | ε)
  2. SMS_RECEIVED . (INTERNET | ACCESS_FINE_LOCATION)^ω+

Family: Droiddream
Manual Description: Gains root access, gathers information (device ID, IMEI, IMSI) from an infected mobile phone and connects to several URLs in order to upload this data [1, 3].
Learnt Unexpected Behaviour:
  1. PHONE_STATE . (ACCESS_NETWORK_STATE | READ_PHONE_STATE^ω+ . INTERNET) . (ACCESS_NETWORK_STATE | INTERNET)^ω

Family: Fakelogo
Manual Description: Sends SMS messages to premium rate numbers [2].
Learnt Unexpected Behaviour:
  1. BOOT_COMPLETED . RUN^ω+
  2. BOOT_COMPLETED . READ_PHONE_STATE^ω+
  3. MAIN . click . SEND_SMS . (SEND_SMS | ε)
  4. MAIN . SEND_SMS
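
For the finite patterns in this table, checking a trace of labels against a learnt behaviour can be approximated with an ordinary regular expression. The sketch below is a simplification under stated assumptions. It handles only concatenation, alternation, grouping and the empty word, it writes labels with underscores and the empty word as ε, and it deliberately leaves out the ω and ω+ repetitions, which describe infinite behaviour and do not map onto Python's `re` module. The function names are hypothetical.

```python
import re


def to_regex(pattern):
    """Translate a finite pattern such as 'MAIN . click . SEND_SMS . (SEND_SMS | ε)'
    into a compiled regular expression over space-terminated labels."""
    parts = []
    for token in pattern.split():
        if token == ".":
            continue  # concatenation is implicit in a regular expression
        opening = len(token) - len(token.lstrip("("))
        closing = len(token) - len(token.rstrip(")"))
        core = token.strip("()")
        parts.append("(?:" * opening)
        if core == "|":
            parts.append("|")
        elif core != "ε":
            # Every label is followed by one space so labels cannot run together.
            parts.append(re.escape(core) + " ")
        parts.append(")" * closing)
    return re.compile("".join(parts))


def matches(pattern, trace):
    """True if the whole finite trace (a list of labels) fits the pattern."""
    text = "".join(label + " " for label in trace)
    return to_regex(pattern).fullmatch(text) is not None


FAKELOGO_3 = "MAIN . click . SEND_SMS . (SEND_SMS | ε)"
print(matches(FAKELOGO_3, ["MAIN", "click", "SEND_SMS"]))
print(matches(FAKELOGO_3, ["MAIN", "SEND_SMS"]))
```

Handling the ω-repetitions properly would need an automaton-based check, for example looking for a reachable accepting cycle, instead of a string regex.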
slide-29
SLIDE 29

The Approach: Evaluation

Family | Manual Description | Learnt Unexpected Behaviour in Regular Expressions and ω-Languages

Family: Geinimi
Manual Description: Monitors and sends certain information to a remote location [4].
Learnt Unexpected Behaviour:
  1. ε | MAIN . click^ω+ . VIBRATE . (click | VIBRATE)^ω . RESTART_PACKAGES . (MAIN . (click | VIBRATE)^ω . RESTART_PACKAGES)^ω
  2. BOOT_COMPLETED . (ACCESS_NETWORK_STATE | click | INTERNET | RESTART_PACKAGES | ACCESS_FINE_LOCATION)^ω+

Family: Ggtracker
Manual Description: Monitors received SMS messages and intercepts SMS messages [1].
Learnt Unexpected Behaviour:
  1. MAIN . READ_PHONE_STATE
  2. SMS_RECEIVED . SEND_SMS

Family: Ginmaster
Manual Description: Sends received SMS messages to a remote server [5]. Downloads and installs applications without user concern [5].
Learnt Unexpected Behaviour:
  1. BOOT_COMPLETED . LOAD_CLASS
  2. MAIN . SEND_SMS

Family: Spitmo
Manual Description: Filters SMS messages to steal banking confirmation codes [4].
Learnt Unexpected Behaviour:
  1. NEW_OUTGOING_CALL . READ_PHONE_STATE . INTERNET . (INTERNET | ε)

Family: Zitmo
Manual Description: Opens a backdoor that allows a remote attacker to steal information from SMS messages received on the compromised device [4].
Learnt Unexpected Behaviour:
  1. SMS_RECEIVED . SEND_SMS
  2. MAIN . READ_PHONE_STATE
  3. MAIN . SEND_SMS
slide-30
SLIDE 30

Thanks! Questions?

slide-31
SLIDE 31

[1] Forensic Blog. http://forensics.spreitzenbarth.de/android-malware/ [Accessed on 10 April 2015].
[2] Juniper Networks. https://www.juniper.net/security/auto/includes/mobile_signature_descriptions.html [Accessed on 17 August 2015].
[3] Malware Genome Project. http://www.csc.ncsu.edu/faculty/jiang/alerts.html [Accessed on 15 August 2015].
[4] Symantec Security Response. http://www.symantec.com/security_response/ [Accessed on 11 August 2015].
[5] McAfee Threat Center. http://www.mcafee.com/uk/threat-center.aspx [Accessed on 15 August 2015].