SLIDE 1 Compact Explanation of Why Malware is Bad
Speaker: Wei Chen (1). Joint work with Charles Sutton (1), David Aspinall (1), Andrew D. Gordon (1,2), Igor Muttik (3), and Qi Shen (4). App Guarden Project
(1) University of Edinburgh; (2) Microsoft Research Cambridge; (3) Intel Security; (4) Peking University
SLIDE 2
Android Applications
SLIDE 3 Threats Around Android Applications
Steal Personal Information Locate Bank Trojans Premium SMS Eavesdrop Botnet Control Root Access
SLIDE 4
An Example: the Flashlight Application
SLIDE 5
Flashlight: Some Comments from General Users
“Why in the world would I want a flashlight app that collects so much info about me?” from a user. “This app is extremely bright and does its job well. I don’t know what others mean when they say that they have so many problems with it.” from another user.
SLIDE 6
Flashlight: the Generated Behaviour Automaton
SLIDE 7
Motivation & Goal
◮ Motivation: it is not really enough to simply identify an
application as malware; we need to convince people the identification is correct and explain what it means.
◮ Goal: automatically generate compact explanations for a
broad range of people by producing short paragraphs, compact behaviour automata, and statistical charts.
◮ Potential Benefits: help people better understand the
potential threats hidden in mobile applications; give malware analysts hints before more expensive investigation; support automatic generation of security and anti-security policies, etc.
SLIDE 8 Examples: Some Generated Explanations in Paragraphs
◮ This is a trojan which steals personal information from the
infected device. It can be controlled over the web through
HTTP. (an instance of Droidkungfu)
◮ It sends SMS messages to premium rated numbers. (an
instance of Opfake)
◮ This application might be a Chatting application, but, after a
USB massive storage is connected, it will: retrieve a class in a runnable package; read information about networks; connect to Internet. (an instance of Basebridge)
◮ This application is declared as an Anti-Virus application, but,
it will: read your phone state after a phone call is made; read your phone state then connect to Internet; send SMS messages after a phone call is made. (an instance of Zitmo)
SLIDE 9 Examples: a Generated Explanation in Behaviour Automata
[Figure: behaviour automaton with states q0, q1, q2; transition labels: SMS RECEIVED; SEND SMS, INTERNET; SEND SMS, INTERNET]
(from instances of Ggtracker and Zitmo)
SLIDE 10
Technical Challenges
◮ Formalisation: characterise and formalise an application’s
behaviours precisely and efficiently.
◮ Learning: develop an effective and efficient method to learn
unexpected behaviours from thousands of sample malware and benign applications.
◮ Explanation: decide whether a target application is malware,
if so, automatically generate explanations from its unexpected behaviours.
◮ Evaluation: evaluate identified unexpected behaviours and
generated explanations.
SLIDE 11
The Approach: Formalisation
◮ We characterise an application’s behaviour by an extended
Büchi automaton, i.e., finite and infinite control-dependence sequences of events, actions, and annotated API calls: a so-called behaviour automaton.
◮ We have designed and implemented a static analysis tool to
construct such a behaviour automaton directly from each Android application to approximate its behaviours.
◮ We have to consider a broad range of features of Java and
the Android framework, e.g., multi-threading, multiple entry points, inter-procedural calls, callbacks, component life-cycles, inter-component communication, and runtime-registered listeners.
SLIDE 12 The Approach: Formalisation
[Figure: the behaviour automaton of an Android application; visible states: q0, q6; transition labels: MAIN; click; AsyncTask: sendTextMessage; Receiver: getDeviceId; Receiver: getLine1Number; Receiver: openConnection]
SLIDE 13 The Approach: Formalisation
[Figure: the abstract behaviour automaton of an Android application; visible state: q0; transition labels: MAIN; click; SEND SMS; READ PHONE STATE; INTERNET]
SLIDE 14
The Approach: Learning
◮ An unexpected behaviour is a salient sub-automaton.
◮ A common pattern exhibited by malware instances in a
human-decided malware family, which is rarely seen in benign applications.
◮ We have developed a learning-centred method to capture
unexpected behaviours from thousands of malware instances across hundreds of families.
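◮ Differentiate salient features by adding benign applications for
training.
$F_k = \{\, \bigoplus_{A \in G_k} A \mid \oplus \in \{-, \cap\} \,\}$
$\min_{w} \Big( \sum_{j} |w_j| + \sum_{i} \log\big(1 + e^{-y_i w^{T} x_i}\big) \Big) \;\wedge\; x_i \in F_k^{|F_k|}$
$F_k \leftarrow \{\, j \in F_k \mid w_j \neq 0 \,\}$
$F_k \otimes F_l = \{\, A - B,\; A \cap B,\; B - A \mid A \in F_k \wedge B \in F_l \,\}$
$U = \{\, j \in F \mid w_j < 0 \,\}$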
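The minimisation on this slide is an L1-regularised logistic regression. As a rough illustration, the sketch below solves it with proximal gradient descent (ISTA), which is a swapped-in solver and not necessarily the one used in the project. The regularisation weight `lam`, the step size, the toy data, and the convention that malware is labelled -1 (so that negative weights mark unexpected behaviours) are all assumptions.

```python
import math


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v towards zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def l1_logistic(X, y, lam=1.0, step=0.1, iters=2000):
    """Minimise lam * sum_j |w_j| + sum_i log(1 + exp(-y_i * w.x_i)) by ISTA."""
    n_features = len(X[0])
    w = [0.0] * n_features
    for _ in range(iters):
        grad = [0.0] * n_features
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            coeff = -yi * sigmoid(-margin)  # derivative of the logistic loss
            for j, xj in enumerate(xi):
                grad[j] += coeff * xj
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad)]
    return w


# Assumed toy data: feature 0 occurs only in malware (label -1),
# feature 1 occurs equally often in both classes, feature 2 never occurs.
X = [[1, 1, 0], [1, 0, 0], [1, 1, 0], [1, 0, 0],
     [0, 1, 0], [0, 0, 0], [0, 1, 0], [0, 0, 0]]
y = [-1, -1, -1, -1, 1, 1, 1, 1]

w = l1_logistic(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0]   # features kept for the next round
unexpected = [j for j, wj in enumerate(w) if wj < 0]  # candidates for the set U
```

On this toy input only feature 0 keeps a nonzero (negative) weight, so it is both selected and reported as unexpected; the feature shared by both classes is driven to exactly zero by the L1 penalty.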
SLIDE 15 The Approach: Learning
[Figure: behaviour automaton with states q0, q1, q2; transition labels: SMS RECEIVED; SEND SMS, INTERNET; SEND SMS, INTERNET]
(from instances of Ggtracker and Zitmo)
SLIDE 16
The Approach: Learning
◮ A singular behaviour identified in a group of applications
which have similar behaviours.
SLIDE 17
The Approach: Learning
[Chart: Accessing Your Location; labels: normal, singular]
SLIDE 18
The Approach: Learning
[Chart: Sending SMS Messages; labels: normal, singular]
SLIDE 19
The Approach: Learning
[Diagram labels: Pre-labelled Apps; Clustering; Fine-Grained Groups; Group A, Group B, Group C, ...; Training Data; Auto_1, Auto_2, Auto_3, ...; Training; Context-Sensitive Unexpected Behaviours]
SLIDE 20 The Approach: Explanation
[Diagram labels: A Target App; Similarity; Group B; Auto_1, Auto_2; Check; Auto_x, Auto_y; Auto_a; Explanation Generation; Templates; Sentences]
◮ Present behaviour automata directly.
◮ Extract singletons, i.e., API calls, actions, and events, from
automata, and search by keywords for sentences in malware analysis reports.
◮ Extract pairs, i.e., something happens before another thing,
from automata, and feed them through pre-defined templates.
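The extraction of singletons and pairs can be sketched as follows. This is a minimal Python illustration under stated assumptions: an automaton is encoded as a list of `(source, label, target)` transitions, and "happens before" is read as reachability, i.e. a transition labelled `b` can be taken at some point after one labelled `a`. Both the encoding and that reading are hypothetical; the toy automaton only borrows labels from the slides.

```python
from collections import deque


def singletons(transitions):
    """All labels (API calls, actions, events) that occur in the automaton."""
    return {label for _, label, _ in transitions}


def happens_before_pairs(transitions):
    """Pairs (a, b) where a transition labelled b is reachable after one labelled a."""
    successors = {}
    for src, label, dst in transitions:
        successors.setdefault(src, []).append((label, dst))
    pairs = set()
    for _, first, start in transitions:
        seen, queue = {start}, deque([start])
        while queue:  # breadth-first search from the target of the first transition
            state = queue.popleft()
            for label, nxt in successors.get(state, []):
                pairs.add((first, label))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return pairs


# Assumed toy automaton.
automaton = [
    ("q0", "SMS_RECEIVED", "q1"),
    ("q1", "SEND_SMS", "q2"),
    ("q1", "INTERNET", "q2"),
]
labels = singletons(automaton)
pairs = happens_before_pairs(automaton)
```

The singletons would drive the keyword search over report sentences, while each pair would be passed to a template.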
SLIDE 21 The Approach: Explanation — Sentence Searching
intercepts incoming sms messages and forwards them to a remote server including informations like imsi and imei these applications send premium sms messages the application will run in the background gathering sms activity and periodically send it to a proxy email address it sends sms messages to premium rated numbers and tries to hide this action from the malware investigators by using some kind of steganography this trojan steals personal information and receives commands via sms steals information sms messages imei imsi etc sending sms messages it sends sms messages to premium rated numbers it sends sms texts to all contacts on the device and sends an sms text to report its installation without the affected users knowledge and possibly resulting in data charges for the owner of the affected device allows an application to send sms messages allows an application to write sms messages sends sms messages to premium rated numbers this malware also sends sms messages sends sms spam messages afterwards the trojan sends sms messages to phone numbers listed in this configuration file this malware attempts to send premium rated sms messages the trojan gathers the following information from the compromised device sms messages phone number and the imei of the infected device it also sends sms messages
SLIDE 22 The Approach: Explanation — Sentence Searching
Choose the central sentence.
$V[s][a] = \mathrm{tfidf}(a, s, C)$ if $a$ is a word of $s$; $0$ otherwise
$\sigma_V(s) = \sum_{t \in C} \cos(\theta_{V[s],V[t]})$
$\arg\max_{s' \in C} \sigma_V(s')$
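The selection of a central sentence can be illustrated with a small, dependency-free Python sketch. It assumes one common tf-idf variant (raw term count times `log(1 + N / df)`) and whitespace tokenisation; the slide does not say which variant or tokeniser is used, so both are assumptions, and the two corpora are toy examples.

```python
import math
from collections import Counter


def tfidf_vectors(corpus):
    """One sparse vector (dict word -> weight) per sentence; absent words weigh 0."""
    docs = [Counter(sentence.split()) for sentence in corpus]
    n = len(docs)
    df = Counter(word for doc in docs for word in doc)  # document frequency
    return [{w: tf * math.log(1 + n / df[w]) for w, tf in doc.items()} for doc in docs]


def cosine(u, v):
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm = math.sqrt(sum(x * x for x in u.values())) * math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def central_sentence(corpus):
    """Return the sentence whose summed cosine similarity to all sentences is largest."""
    vectors = tfidf_vectors(corpus)

    def sigma(i):
        return sum(cosine(vectors[i], t) for t in vectors)

    return corpus[max(range(len(corpus)), key=sigma)]


toy = ["sends sms messages", "sends", "sms", "messages"]
central = central_sentence(toy)

reports = [
    "it sends sms messages to premium rated numbers",
    "this malware also sends sms messages",
    "sends sms spam messages",
    "steals personal information",
]
chosen = central_sentence(reports)
```

In the toy corpus the three-word sentence overlaps with every other sentence and is therefore the most central one; in the second corpus the sentence that shares no words with the rest cannot be chosen.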
SLIDE 23 The Approach: Explanation — Sentence Searching
intercepts incoming sms messages and forwards them to a remote server including informations like imsi and imei these applications send premium sms messages the application will run in the background gathering sms activity and periodically send it to a proxy email address it sends sms messages to premium rated numbers and tries to hide this action from the malware investigators by using some kind of steganography this trojan steals personal information and receives commands via sms steals information sms messages imei imsi etc sending sms messages
it sends sms messages to premium rated numbers it sends sms texts to all
contacts on the device and sends an sms text to report its installation without the affected users knowledge and possibly resulting in data charges for the owner of the affected device allows an application to send sms messages allows an application to write sms messages sends sms messages to premium rated numbers this malware also sends sms messages sends sms spam messages afterwards the trojan sends sms messages to phone numbers listed in this configuration file this malware attempts to send premium rated sms messages the trojan gathers the following information from the compromised device sms messages phone number and the imei of the infected device it also sends sms messages
SLIDE 24
The Approach: Explanation — Pre-Defined Templates
Feature | Template | Example
API | do sth. | read your phone state
ACTION | sth. happens | the app has finished booting
EVENT | the user does sth. | the user clicks a view and holds
(A, A) | do sth. then do sth. | read your phone state then connect to Internet
(A, C) | do sth. then sth. happens | read SMS then the app makes a phone call
(C, A) | after sth. happens do sth. | after the system has finished booting read your phone state
(E, A) | when the user does sth. do sth. | when the user touches the screen get your precise location
(E, C) | when the user does sth. sth. happens | when the user performs a gesture the app sends some data out
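The pair templates above lend themselves to a simple lookup-and-format scheme. The sketch below is one possible Python rendering: the phrase lexicon `PHRASES`, the feature names, and the function `explain_pair` are hypothetical, while the template wording and the three example outputs are taken from the table.

```python
# Hypothetical phrase lexicon keyed by (kind, feature name);
# kind is "A" (API), "C" (ACTION) or "E" (EVENT).
PHRASES = {
    ("A", "READ_PHONE_STATE"): "read your phone state",
    ("A", "INTERNET"): "connect to Internet",
    ("A", "ACCESS_FINE_LOCATION"): "get your precise location",
    ("C", "BOOT_COMPLETED"): "the system has finished booting",
    ("E", "touch"): "the user touches the screen",
}

# One template per pair of kinds, following the table.
PAIR_TEMPLATES = {
    ("A", "A"): "{0} then {1}",
    ("A", "C"): "{0} then {1}",
    ("C", "A"): "after {0} {1}",
    ("E", "A"): "when {0} {1}",
    ("E", "C"): "when {0} {1}",
}


def explain_pair(first, second):
    """Render an ordered pair of features as an English clause."""
    template = PAIR_TEMPLATES[(first[0], second[0])]
    return template.format(PHRASES[first], PHRASES[second])


sentence = explain_pair(("A", "READ_PHONE_STATE"), ("A", "INTERNET"))
```

Keeping the lexicon separate from the templates means a new feature needs only one phrase, not one sentence per combination.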
SLIDE 25
The Approach: Evaluation
◮ Subjective comparison with descriptions of malware families,
which were collected from online technical reports produced by malware analysts or third-party researchers.
◮ Use unexpected behaviours as input features to train
classifiers, and check that they achieve good classification performance.
◮ Conduct a survey asking users which explanation is better.
SLIDE 26 The Approach: Evaluation
◮ Malware Family: BaseBridge
◮ Manually Produced Description: “A Trojan horse that
attempts to send premium-rate SMS messages to predetermined numbers.” from Symantec. “Forwards confidential details (SMS, IMSI, IMEI) to a remote server.” from Forensic.
◮ Automatic Explanation: It sends sms messages to premium
rated numbers. This is a Trojan which steals personal information from the infected device.
SLIDE 27 The Approach: Evaluation
◮ Malware Family: DroidKungfu
◮ Manually Produced Description: “A Trojan that sends
sensitive information to an attacker and includes backdoor
functionality. It also exploits vulnerabilities to gain root
access.” from McAfee. “Collects a variety of information on the infected phone (IMEI, device, OS version, etc.). The collected information is dumped to a local file which is sent to a remote server afterwards” from Forensic.
◮ Automatic Explanation: This is a Trojan which steals
personal information from the infected device. It can be controlled over the web through http.
SLIDE 28 The Approach: Evaluation
Columns: Family; Manual Description; Learnt Unexpected Behaviour in Regular Expressions and ω-Languages

Arspam
  Manual description: Sends spam SMS messages to contacts on the compromised device [4].
  Learnt unexpected behaviour:
    1. BOOT COMPLETED . SEND SMS

Anserverbot
  Manual description: Downloads, installs, and executes payloads [3].
  Learnt unexpected behaviour:
    1. UMS CONNECTED . LOAD CLASS^ω . (ACCESS NETWORK STATE | READ PHONE STATE | INTERNET) . (ACCESS NETWORK STATE | READ PHONE STATE | INTERNET | LOAD CLASS)^ω

Basebridge
  Manual description: Forwards confidential details (SMS, IMSI, IMEI) to a remote server [1]. Downloads and installs payloads [3, 4].
  Learnt unexpected behaviour:
    1. UMS CONNECTED . (INTERNET | LOAD CLASS | READ PHONE STATE | ACCESS NETWORK STATE)^ω+

Cosha
  Manual description: Monitors and sends certain information to a remote location [4].
  Learnt unexpected behaviour:
    1. MAIN . click . (click | ACCESS FINE LOCATION | DIAL)^ω . DIAL . (click | ACCESS FINE LOCATION | DIAL)^ω . (INTERNET | ε)
    2. SMS RECEIVED . (INTERNET | ACCESS FINE LOCATION)^ω+

Droiddream
  Manual description: Gains root access, gathers information (device ID, IMEI, IMSI) from an infected mobile phone and connects to several URLs in order to upload this data [1, 3].
  Learnt unexpected behaviour:
    1. PHONE STATE . (ACCESS NETWORK STATE | READ PHONE STATE^ω+ . INTERNET) . (ACCESS NETWORK STATE | INTERNET)^ω

Fakelogo
  Manual description: Sends SMS messages to premium rate numbers [2].
  Learnt unexpected behaviour:
    1. BOOT COMPLETED . RUN^ω+
    2. BOOT COMPLETED . READ PHONE STATE^ω+
    3. MAIN . click . SEND SMS . (SEND SMS | ε)
    4. MAIN . SEND SMS
SLIDE 29 The Approach: Evaluation
Columns: Family; Manual Description; Learnt Unexpected Behaviour in Regular Expressions and ω-Languages

Geinimi
  Manual description: Monitors and sends certain information to a remote location [4].
  Learnt unexpected behaviour:
    1. ε | MAIN . click^ω+ . VIBRATE . (click | VIBRATE)^ω . RESTART PACKAGES . (MAIN . (click | VIBRATE)^ω . RESTART PACKAGES)^ω
    2. BOOT COMPLETED . (ACCESS NETWORK STATE | click | INTERNET | RESTART PACKAGES | ACCESS FINE LOCATION)^ω+

Ggtracker
  Manual description: Monitors received SMS messages and intercepts SMS messages [1].
  Learnt unexpected behaviour:
    1. MAIN . READ PHONE STATE
    2. SMS RECEIVED . SEND SMS

Ginmaster
  Manual description: Sends received SMS messages to a remote server [5]. Downloads and installs applications without user concern [5].
  Learnt unexpected behaviour:
    1. BOOT COMPLETED . LOAD CLASS
    2. MAIN . SEND SMS

Spitmo
  Manual description: Filters SMS messages to steal banking confirmation codes [4].
  Learnt unexpected behaviour:
    1. NEW OUTGOING CALL . READ PHONE STATE . INTERNET . (INTERNET | ε)

Zitmo
  Manual description: Opens a backdoor that allows a remote attacker to steal information from SMS messages received on the compromised device [4].
  Learnt unexpected behaviour:
    1. SMS RECEIVED . SEND SMS
    2. MAIN . READ PHONE STATE
    3. MAIN . SEND SMS
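The finite patterns in these tables, such as `MAIN . click . SEND SMS . (SEND SMS | ε)`, can be checked against an observed label sequence with an ordinary regular expression. The Python sketch below is an illustration under several assumptions: labels are written with underscores so that each is a single token, a behaviour counts as exhibited when it occurs as a contiguous run of labels in the trace, and the ω operators (infinite repetition) are not supported at all.

```python
import re

# Tokens of a finite behaviour expression: labels, parentheses, '.', '|' and 'ε'.
TOKEN = re.compile(r"[A-Za-z_]+|[().|ε]")


def compile_behaviour(expr):
    """Compile a finite behaviour expression into a regex over space-terminated labels."""
    parts = []
    for tok in TOKEN.findall(expr):
        if tok == "(":
            parts.append("(?:")
        elif tok in (")", "|"):
            parts.append(tok)
        elif tok in (".", "ε"):
            continue  # '.' is concatenation, 'ε' is the empty word
        else:
            parts.append("(?:" + re.escape(tok) + " )")
    # The leading group forces a match to start at a label boundary.
    return re.compile("(?:^| )(?:" + "".join(parts) + ")")


def exhibits(trace, expr):
    """True if the trace (a list of labels) contains the behaviour as a contiguous run."""
    return compile_behaviour(expr).search(" ".join(trace) + " ") is not None


fakelogo_3 = "MAIN . click . SEND_SMS . (SEND_SMS | ε)"
hit = exhibits(["MAIN", "click", "SEND_SMS", "SEND_SMS"], fakelogo_3)
miss = exhibits(["MAIN", "click", "INTERNET"], fakelogo_3)
```

Handling the ω-language patterns would need an automaton-based check rather than a finite regex, which is outside the scope of this sketch.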
SLIDE 30
Thanks! Questions?
SLIDE 31
[1] Forensic Blog. http://forensics.spreitzenbarth.de/android-malware/ [Accessed on 10 April 2015].
[2] Juniper Networks. https://www.juniper.net/security/auto/includes/mobile_signature_descriptions.html [Accessed on 17 August 2015].
[3] Malware Genome Project. http://www.csc.ncsu.edu/faculty/jiang/alerts.html [Accessed on 15 August 2015].
[4] Symantec Security Response. http://www.symantec.com/security_response/ [Accessed on 11 August 2015].
[5] McAfee Threat Center. http://www.mcafee.com/uk/threat-center.aspx [Accessed on 15 August 2015].