SLIDE 1

Modeling the Android Platform

Étienne Payet

LIM-ERIMIA, université de la Réunion

BYTECODE’13, Saturday 23 March 2013

Étienne Payet (LIM-ERIMIA) Modeling the Android Platform BYTECODE’13 1 / 50

SLIDE 2

Reunion, a part of France and Europe (OMR of EU)

SLIDE 3

Outline

1. Analyzing Android applications
2. Operational semantics for Dalvik
3. Designing an operational semantics for Android
4. Conclusion

SLIDE 4

What is Android?

An operating system (OS) for: mobile devices (smartphones, tablets), embedded devices (televisions, car radios, ...), x86 platforms (http://www.android-x86.org).

SLIDE 5

Worldwide mobile device sales in 3Q12 (thousands of units)

SLIDE 6

What is Android?

A language for developing applications for the Android OS: Java with an extended library for mobile and interactive applications, based on an event-driven architecture.

SLIDE 7

Building an Android application

(http://developer.android.com/tools/building/index.html)

SLIDE 8

.dex files

Their format is optimized for minimal memory usage: the design is driven by sharing of data, they contain Dalvik bytecode, dex stands for Dalvik executable.

SLIDE 9

Dalvik bytecode

It is run by an instance of the Dalvik Virtual Machine (DVM). DVM ≠ JVM: the DVM is register-based, whereas the JVM is stack-based. Register-based VMs are well-suited for devices with constrained processing power: on average, they are faster than stack-based VMs.

SLIDE 10

Android applications

They can be downloaded from anywhere: Google Play (official store), Amazon, AppsApk.com, pandaapp, ...

They are not necessarily signed by a trusted authority (self-signed certificates are accepted). ⇒ Reliability is a major concern for users and developers.

SLIDE 12

Analyzing Android applications

For finding malicious code (e.g., security and privacy vulnerabilities) and bugs.

SLIDE 13

Google’s analyses

“Google has started analyzing apps before putting them in their catalog in order to detect anomalous behavior. According to their own sources, they have managed to reduce malicious app downloads by 40 percent.” (PandaLabs Annual Report 2012)

SLIDE 14

Static analyses for finding security/privacy vulnerabilities

Barrera, Kayacik, van Oorschot, Somayaji. A methodology for empirical analysis of permission-based security models and its application to Android. Proc. of CCS’10.

Chin, Felt, Greenwood, Wagner. Analyzing inter-application communication in Android. Proc. of MobiSys’11.

Enck, Octeau, McDaniel, Chaudhuri. A study of Android application security. Proc. of SEC’11.

Felt, Chin, Hanna, Song, Wagner. Android permissions demystified. Proc. of CCS’11.

Fuchs, Chaudhuri, Foster. SCanDroid: Automated security certification of Android applications. Draft, 2009.

Kim, Yoon, Yi, Shin. ScanDal: Static analyzer for detecting privacy leaks in Android applications. MoST’12.

Wognsen, Karlsen. Static analysis of Dalvik bytecode and reflection in Android. Master’s thesis, Aalborg University, 2012.

SLIDE 15

Static analyses for finding bugs

Klocwork. http://www.klocwork.com.

Payet, Spoto. Static analysis of Android programs. Information & Software Technology, 2012.

SLIDE 16

Dynamic analyses for finding security vulnerabilities

Bugiel, Davi, Dmitrienko, Fischer, Sadeghi, Shastry. Towards taming privilege-escalation attacks on Android. Proc. of NDSS’12.

Dietz, Shekhar, Pisetsky, Shu, Wallach. QUIRE: Lightweight provenance for smart phone operating systems. Proc. of USENIX Security Symposium, 2011.

Enck, Gilbert, Chun, Cox, Jung, McDaniel, Sheth. TaintDroid: An information-flow tracking system for realtime privacy monitoring on smartphones. Proc. of OSDI’10.

Felt, Wang, Moshchuk, Hanna, Chin. Permission redelegation: Attacks and defenses. Proc. of USENIX Security Symposium, 2011.

SLIDE 17

Symbolic execution for analyzing programs

Jeon, Micinski, Foster. SymDroid: Symbolic execution for Dalvik bytecode. Submitted, July 2012.

SLIDE 18

Modeling the Android platform

Dalvik ≠ Android. Some of these analyses rely on a formal operational semantics for Dalvik, but none of them provides a formal semantics for the key specific features of the Android platform.

SLIDE 19

Outline

1. Analyzing Android applications
2. Operational semantics for Dalvik
3. Designing an operational semantics for Android
4. Conclusion

SLIDE 20

Dalvik registers

Each method has a fresh set of registers. Invoked methods do not affect the registers of invoking methods.

SLIDE 21

Dalvik instructions

Move between registers (move, move-object, move-wide, ...),
constants to registers (const, const/4, const/16, ...),
operations on int, long, float, double (add-int, sub-int, ...),
instance creation (new-instance),
read/write member fields (iget, iput, ...),
read/write static fields (sget, sput, ...),
array manipulation (new-array, array-length, ...),
read/write array elements (aget, aput, ...),
execution control (goto, if-eq, if-lt, ...),
method invocation (invoke-virtual, invoke-super, ...),
setting the result value (return-void, return, ...),
getting the result value (move-result, move-result-object, ...),
...

SLIDE 22

Example (smali syntax)

.class public LFactorial;
.super Ljava/lang/Object;

.method public static factorial(I)I
    .registers 2
    const/4 v0, 1
    if-lez v1, :end
    sub-int v0, v1, v0
    invoke-static {v0}, LFactorial;->factorial(I)I
    move-result v0
    mul-int v0, v1, v0
    :end
    return v0
.end method
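For readers less used to smali, the method above can be read roughly as the following Java sketch. The class name FactorialSketch is hypothetical; the locals v0 and v1 mirror the two Dalvik registers, assuming (as .registers 2 with a single int parameter implies) that v1 holds the argument.

```java
// Hypothetical Java rendering of the smali method; names are illustrative only.
final class FactorialSketch {
    static int factorial(int v1) {   // v1 holds the argument
        int v0 = 1;                  // const/4 v0, 1
        if (v1 <= 0) {               // if-lez v1, :end
            return v0;               // :end return v0
        }
        v0 = v1 - v0;                // sub-int v0, v1, v0
        v0 = factorial(v0);          // invoke-static {v0}, ... ; move-result v0
        v0 = v1 * v0;                // mul-int v0, v1, v0
        return v0;                   // :end return v0
    }
}
```

The recursive call receives its own v0 and v1, which matches the earlier remark that invoked methods do not affect the registers of invoking methods.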

SLIDE 23

Operational semantics for the whole Dalvik

[WK12] Wognsen, Karlsen. Static analysis of Dalvik bytecode and reflection in Android. Master’s thesis, Aalborg University, 2012.

\[
\frac{m.\mathit{instructionAt}(pc) = \mathtt{move}\ r_1\ r_2}
     {S, H, \langle m, pc, R \rangle :: SF \;\Rightarrow\; S, H, \langle m, pc + 1, R[r_1 \mapsto R(r_2)] \rangle :: SF}
\]

S is a static heap, H is a heap, SF is a call stack, m is a method, and R ∈ Register → Value is a set of local registers.
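The move rule can be sketched in Java as a pure step on one activation record. This is only an illustration, not the formalization of [WK12]: the class MoveFrame is hypothetical, values are restricted to int, and the heaps S and H are left out because the rule does not change them.

```java
// Sketch of one activation record <m, pc, R>; S and H are omitted because
// the move rule leaves them unchanged. All names here are illustrative.
final class MoveFrame {
    final String method;   // m, represented by its name only
    final int pc;          // program counter
    final int[] regs;      // R : Register -> Value, with int values only

    MoveFrame(String method, int pc, int[] regs) {
        this.method = method;
        this.pc = pc;
        this.regs = regs.clone();   // defensive copy: frames are immutable
    }

    // move r1 r2: returns the successor frame <m, pc + 1, R[r1 -> R(r2)]>.
    MoveFrame move(int r1, int r2) {
        int[] updated = regs.clone();
        updated[r1] = regs[r2];
        return new MoveFrame(method, pc + 1, updated);
    }
}
```

Returning a new frame instead of mutating the old one keeps the code close to the rule, where the left-hand and right-hand configurations are distinct objects.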

SLIDE 24

Intermediate languages

They consist of a small set of instructions into which Dalvik can be easily translated.

Dalvik Core: Kim, Yoon, Yi, Shin. ScanDal: Static analyzer for detecting privacy leaks in Android applications. MoST’12.

µ-Dalvik: Jeon, Micinski, Foster. SymDroid: Symbolic execution for Dalvik bytecode. Submitted, July 2012.

SLIDE 25

µ-Dalvik vs the others

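The µ-Dalvik operational semantics constructs a path condition φ, which records which conditional branches have been taken thus far (here $\bowtie$ denotes the comparison operator of the branch):

\[
\frac{\pi = (\Sigma[\![r_1]\!] \bowtie \Sigma[\![r_2]\!]) \qquad \phi_t = \pi \wedge \Sigma.\phi \qquad \mathrm{SAT}(\phi_t)}
     {\langle \Sigma, \mathtt{if}\ r_1 \bowtie r_2\ \mathtt{then}\ pc_t \rangle \;\Rightarrow\; \Sigma[\phi \mapsto \phi_t,\ pc \mapsto pc_t]}
\]

µ-Dalvik provides an instruction for checking a property of interest:

\[
\frac{\neg \mathrm{SAT}(\neg \Sigma[\![r]\!])}
     {\langle \Sigma, \mathtt{assert}\ r \rangle \;\Rightarrow\; \Sigma[pc \mapsto pc + 1]}
\]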

SLIDE 26

Outline

1. Analyzing Android applications
2. Operational semantics for Dalvik
3. Designing an operational semantics for Android
4. Conclusion

SLIDE 27

Goal

Provide a formal basis for the development of analyses that consider the complex flow of information inside Android applications, which usually consist of interacting components.

SLIDE 28

Android application components

Activities: single screens with a visual user interface.
Services: background operations with no interaction with the user.
Content providers: data containers such as databases.
Broadcast receivers: objects reacting to broadcast messages.

SLIDE 29

Android application components

Each type of component has a distinct lifecycle that defines how the component changes state. A component can invoke another component, but component invocation ≠ method invocation. A component is a possible entry point into the program.

SLIDE 30

Callback methods

Callback methods are automatically invoked by the system: when components switch from state to state, in reaction to events. Android programs do not usually call such methods explicitly.

SLIDE 31

The lifecycle of an activity

SLIDE 32

XML files

They are used to build parts of Android applications (e.g., GUI). They are dynamically inflated by the system to create the objects that they describe. Inflation makes heavy use of reflection.

SLIDE 33

An example (1/5)

res/layout/caller.xml

<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:orientation="vertical"
    android:layout_width="match_parent"
    android:layout_height="match_parent" >
    <TextView
        android:id="@+id/message"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:text="@string/empty" />
    <Button
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:text="@string/launch"
        android:onClick="launchActivity" />
</LinearLayout>

SLIDE 34

An example (2/5)

Caller.java

public class Caller extends android.app.Activity {
    private TextView mMessageView;

    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.caller);
        mMessageView = (TextView) findViewById(R.id.message);
    }

    public void launchActivity(View v) { ... }
    ...
}

SLIDE 35

An example (3/5)

Caller.java

private final static int CALLEE = 0;

public void launchActivity(View v) {
    startActivityForResult(new Intent(this, Callee.class), CALLEE);
    System.out.println("Hello!");
}

protected void onActivityResult(int requestCode, int resultCode, ...) {
    switch (requestCode) {
        case CALLEE:
            switch (resultCode) {
                case RESULT_OK:
                    mMessageView.setText("OK button clicked");
                    break;
                case RESULT_CANCELED:
                    mMessageView.setText("Cancel button clicked");
                    break;
            }
    }
}

SLIDE 36

An example (4/5)

Callee.java

public class Callee extends android.app.Activity {
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.callee);
    }

    public void returnOk(View v) {
        // OK button clicked.
        setResult(RESULT_OK);
        finish();
    }

    public void returnCancel(View v) {
        // Cancel button clicked.
        setResult(RESULT_CANCELED);
        finish();
    }
}

SLIDE 37

An example (5/5)

SLIDE 38

DVM state

A DVM state is a triple $\langle r \,\|\, \pi \,\|\, \mu \rangle \in \Sigma$, where:

r is an array of registers ($r_i$ denotes the i-th register),
π is a stack of pending activities,
µ : Location → Object is a heap,
an object maps its fields into values.

SLIDE 39

Semantics of Dalvik instructions

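\[
\mathtt{const}\ d, c = \lambda \langle r \,\|\, \pi \,\|\, \mu \rangle .\ \langle r[d \mapsto c] \,\|\, \pi \,\|\, \mu \rangle
\]

\[
\mathtt{iget}\ d, i, f = \lambda \langle r \,\|\, \pi \,\|\, \mu \rangle .
\begin{cases}
\langle r[d \mapsto \mu(r_i)(f)] \,\|\, \pi \,\|\, \mu \rangle & \text{if } r_i \neq 0 \\
\text{undefined} & \text{otherwise}
\end{cases}
\]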
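The two definitions can be read as pure functions on a state. The Java sketch below illustrates this under simplifying assumptions that are not part of the slides: the class DvmState is hypothetical, all values are ints, 0 plays the role of the null reference, and any non-zero register value is used directly as a heap location.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of a DVM state <r || pi || mu>. Assumptions: values are ints, 0 is
// null, and a non-zero register value is used directly as a heap location.
final class DvmState {
    final int[] r;                                  // registers
    final List<String> pi;                          // pending activities (class names)
    final Map<Integer, Map<String, Integer>> mu;    // heap: location -> (field -> value)

    DvmState(int[] r, List<String> pi, Map<Integer, Map<String, Integer>> mu) {
        this.r = r.clone();
        this.pi = new ArrayList<>(pi);
        this.mu = new HashMap<>(mu);                // shallow copy of the heap
    }

    // const d, c: r[d -> c]; pi and mu are unchanged.
    DvmState constant(int d, int c) {
        int[] updated = r.clone();
        updated[d] = c;
        return new DvmState(updated, pi, mu);
    }

    // iget d, i, f: r[d -> mu(r_i)(f)] when r_i != 0, undefined otherwise.
    // The undefined case is modeled by an exception; the heap is assumed to
    // contain the location r_i and the field f.
    DvmState iget(int d, int i, String f) {
        if (r[i] == 0) {
            throw new IllegalStateException("iget is undefined on a null receiver");
        }
        int[] updated = r.clone();
        updated[d] = mu.get(r[i]).get(f);
        return new DvmState(updated, pi, mu);
    }
}
```

Modeling "undefined" as an exception is one possible choice; a partial function could equally be encoded with an optional result.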

SLIDE 40

Semantics of library methods (macro instructions)

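finished (boolean) and res (integer) are fields of the current activity.

\[
\mathtt{startActivityForResult}\ A = \lambda \langle r \,\|\, \pi \,\|\, \mu \rangle .\ \langle r \,\|\, A :: \pi \,\|\, \mu \rangle
\]

where A is a subclass of android.app.Activity.

\[
\mathtt{setResult}\ i = \lambda \langle r \,\|\, \pi \,\|\, \mu \rangle .
\begin{cases}
\langle r \,\|\, \pi \,\|\, \mu \rangle & \text{if } \mathit{finished} \\
\langle r \,\|\, \pi \,\|\, \mu[\mathit{res} \mapsto i] \rangle & \text{otherwise}
\end{cases}
\]

\[
\mathtt{finish} = \lambda \langle r \,\|\, \pi \,\|\, \mu \rangle .\ \langle r \,\|\, \pi \,\|\, \mu[\mathit{finished} \mapsto \mathit{true}] \rangle
\]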
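A small Java sketch may help in reading these three macro instructions. It models only the parts of the state they touch; the class ActivityMacros, the use of class names as strings, and the initial values of finished and res are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the three macro instructions. Only the pending stack pi and the
// two fields of the current activity are modeled; initial values are assumed.
final class ActivityMacros {
    final Deque<String> pending = new ArrayDeque<>();   // pi, most recent first
    boolean finished = false;                            // field of the current activity
    int res = 0;                                         // field of the current activity

    // startActivityForResult A: pushes A on pi (A :: pi); nothing else changes.
    void startActivityForResult(String activityClass) {
        pending.push(activityClass);
    }

    // setResult i: no effect once finished holds, otherwise res -> i.
    void setResult(int i) {
        if (!finished) {
            res = i;
        }
    }

    // finish: finished -> true.
    void finish() {
        finished = true;
    }
}
```

In this reading, startActivityForResult only records A as pending and does not transfer control, which is consistent with the println call that follows it in launchActivity of the example.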

SLIDE 41

Android programs as graphs of blocks

A program is a graph of blocks of code. A graph contains many disjoint subgraphs, each corresponding to a different method. A block with w instructions and p successor blocks is written as

\[
\boxed{ins_1;\ ins_2;\ \cdots;\ ins_w} \rightrightarrows b_1 \cdots b_p
\]

If m is a method, then $b_m$ denotes the block where m starts.

SLIDE 42

Operational semantics of method execution

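(Instruction execution)

\[
\frac{ins \notin \{\mathtt{call}, \mathtt{move\text{-}result}, \mathtt{return}\} \qquad \langle r' \,\|\, \pi' \,\|\, \mu' \rangle = ins(\langle r \,\|\, \pi \,\|\, \mu \rangle)}
     {\langle \boxed{ins;\ rest} \rightrightarrows b_1 \cdots b_p \,\|\, r \rangle :: \alpha \diamond \pi \diamond \mu
      \;\Rightarrow\;
      \langle \boxed{rest} \rightrightarrows b_1 \cdots b_p \,\|\, r' \rangle :: \alpha \diamond \pi' \diamond \mu'}
\]

(Continuation)

\[
\frac{1 \leq i \leq p}
     {\langle \boxed{\ } \rightrightarrows b_1 \cdots b_p \,\|\, r \rangle :: \alpha \diamond \pi \diamond \mu
      \;\Rightarrow\;
      \langle b_i \,\|\, r \rangle :: \alpha \diamond \pi \diamond \mu}
\]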

SLIDE 43

Operational semantics of method execution

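(Explicit method call)

\[
\frac{\begin{array}{c}
b = \boxed{\mathtt{call}\ \{s_0, \ldots, s_w\}, m;\ rest} \rightrightarrows b_1 \cdots b_p \qquad
b' = \boxed{rest} \rightrightarrows b_1 \cdots b_p \\
r' = [0 \mapsto r_{s_0}, \ldots, w \mapsto r_{s_w}] \qquad
\text{the lookup procedure of } m \text{ selects } m'
\end{array}}
{\langle b \,\|\, r \rangle :: \alpha \diamond \pi \diamond \mu
 \;\Rightarrow\;
 \langle b_{m'} \,\|\, r' \rangle :: \langle b' \,\|\, r \rangle :: \alpha \diamond \pi \diamond \mu}
\]

(Method return)

\[
\frac{b = \boxed{\mathtt{move\text{-}result}\ d;\ rest} \rightrightarrows b_1 \cdots b_p \qquad
      b' = \boxed{rest} \rightrightarrows b_1 \cdots b_p}
{\langle \boxed{\mathtt{return}\ s} \,\|\, r \rangle :: \langle b \,\|\, r' \rangle :: \alpha \diamond \pi \diamond \mu
 \;\Rightarrow\;
 \langle b' \,\|\, r'[d \mapsto r_s] \rangle :: \alpha \diamond \pi \diamond \mu}
\]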

SLIDE 44

Operational semantics of activity execution

Android manages activities using an activity stack (Ω). We formalize an activity as a tuple $\langle \ell \,\|\, s \,\|\, \pi \,\|\, \alpha \rangle$:

ℓ is the location of the activity in memory,
s is the lifecycle state of the activity.

Moves between lifecycle states.

SLIDE 45

Operational semantics of activity execution

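(Implicit call to a callback method)

\[
\frac{\begin{array}{c}
s \neq \mathit{running} \qquad (s, s') \in \mathit{Lifecycle} \\
\text{the lookup procedure of a method corresponding to } s' \text{ selects } m
\end{array}}
{\langle \ell \,\|\, s \,\|\, \pi \,\|\, \langle \boxed{\mathtt{return}} \,\|\, \_ \rangle \rangle :: \Omega \diamond \mu
 \;\Rightarrow\;
 \langle \ell \,\|\, s' \,\|\, \pi \,\|\, \langle b_m \,\|\, [\ell] \rangle \rangle :: \Omega \diamond \mu}
\]

\[
\frac{\begin{array}{c}
s = \mathit{running} \qquad (s, s') \in \mathit{Lifecycle} \qquad
(\pi \neq \varepsilon \vee \mu(\ell)(\mathit{finished}) = \mathit{true}) \Rightarrow s' = \mathit{pause} \\
\text{the lookup procedure of a method corresponding to } s' \text{ selects } m
\end{array}}
{\langle \ell \,\|\, s \,\|\, \pi \,\|\, \langle \boxed{\mathtt{return}} \,\|\, \_ \rangle \rangle :: \Omega \diamond \mu
 \;\Rightarrow\;
 \langle \ell \,\|\, s' \,\|\, \pi \,\|\, \langle b_m \,\|\, [\ell] \rangle \rangle :: \Omega \diamond \mu}
\]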

SLIDE 46

Operational semantics of activity execution

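(Starting a new activity)

\[
\frac{\begin{array}{c}
s = \mathit{pause} \qquad \alpha = \langle \boxed{\mathtt{return}} \,\|\, \_ \rangle \qquad
\ell' \text{ is a fresh location and } a \text{ is a new object of class } A \\
\text{the lookup procedure of a method corresponding to } s' \text{ selects } m \\
s' = \mathit{create} \qquad \alpha' = \langle b_m \,\|\, [\ell'] \rangle \qquad \mu' = \mu[\ell' \mapsto a]
\end{array}}
{\langle \ell \,\|\, s \,\|\, A :: \pi \,\|\, \alpha \rangle :: \Omega \diamond \mu
 \;\Rightarrow\;
 \langle \ell' \,\|\, s' \,\|\, \varepsilon \,\|\, \alpha' \rangle :: \langle \ell \,\|\, s \,\|\, \pi \,\|\, \alpha \rangle :: \Omega \diamond \mu'}
\]

(Returning from an activity)

\[
\frac{\begin{array}{c}
\varphi' = \langle \ell' \,\|\, \mathit{pause} \,\|\, \varepsilon \,\|\, \langle \boxed{\mathtt{return}} \,\|\, \_ \rangle \rangle \qquad
\mu(\ell')(\mathit{finished}) = \mathit{true} \\
\varphi = \langle \ell \,\|\, s \,\|\, \varepsilon \,\|\, \langle \boxed{\mathtt{return}} \,\|\, \_ \rangle \rangle \qquad
s \in \{\mathit{pause}, \mathit{stop}\} \\
\text{the lookup procedure of onActivityResult selects } m
\end{array}}
{\varphi' :: \varphi :: \Omega \diamond \mu
 \;\Rightarrow\;
 \langle \ell \,\|\, s \,\|\, \varepsilon \,\|\, \langle b_m \,\|\, [\ell] \rangle \rangle :: \varphi' :: \Omega \diamond \mu}
\]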
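The two rules above can be sketched as operations on a stack of activity records. This is a simplified illustration rather than the formal model: the class ActivityStackSketch is hypothetical, method bodies are abstracted to a label ("return" for a block that only contains return, otherwise the name of the callback being run), and the callback selected for the state create is assumed to be onCreate.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the activity stack Omega with the two transitions shown above.
// Code is abstracted to a label; "onCreate" for state create is an assumption.
final class ActivityStackSketch {
    static final class Act {
        final int loc;                                      // location l
        final String cls;                                   // class of the activity object
        String state;                                       // lifecycle state s
        final Deque<String> pending = new ArrayDeque<>();   // pi
        String code;                                        // abstraction of alpha
        boolean finished;                                   // mu(l)(finished)

        Act(int loc, String cls, String state, String code) {
            this.loc = loc;
            this.cls = cls;
            this.state = state;
            this.code = code;
        }
    }

    final Deque<Act> omega = new ArrayDeque<>();            // top of the stack first
    private int nextLoc = 0;                                // source of fresh locations

    Act allocate(String cls, String state, String code) {
        return new Act(nextLoc++, cls, state, code);
    }

    // (Starting a new activity): the top activity is paused, has nothing left
    // to run and has a pending activity A; a fresh A is pushed in state create.
    boolean startPending() {
        Act top = omega.peek();
        if (top == null || !"pause".equals(top.state)
                || !"return".equals(top.code) || top.pending.isEmpty()) {
            return false;
        }
        String cls = top.pending.pop();
        omega.push(allocate(cls, "create", "onCreate"));
        return true;
    }

    // (Returning from an activity): a finished, paused callee sits on top of a
    // paused or stopped caller; the caller moves to the top and runs
    // onActivityResult, while the callee record stays just below it.
    boolean returnToCaller() {
        if (omega.size() < 2) {
            return false;
        }
        Act callee = omega.pop();
        Act caller = omega.peek();
        boolean applies = "pause".equals(callee.state) && callee.pending.isEmpty()
                && "return".equals(callee.code) && callee.finished
                && caller.pending.isEmpty() && "return".equals(caller.code)
                && ("pause".equals(caller.state) || "stop".equals(caller.state));
        if (!applies) {
            omega.push(callee);      // restore the stack: the rule does not apply
            return false;
        }
        omega.pop();                 // remove the caller from below the callee
        caller.code = "onActivityResult";
        omega.push(callee);
        omega.push(caller);          // the caller is now on top
        return true;
    }
}
```

Applied to the Caller/Callee example, startPending corresponds to Callee being created after Caller has paused, and returnToCaller corresponds to onActivityResult being run on Caller once Callee has called finish and paused.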

SLIDE 47

Outline

1. Analyzing Android applications
2. Operational semantics for Dalvik
3. Designing an operational semantics for Android
4. Conclusion

SLIDE 48

Analyzing Android applications

For finding bugs and malicious code. Formal semantics can provide a formal basis. Some operational semantics have been proposed for Dalvik. This work is the first attempt at defining an operational semantics for Android.

SLIDE 49

Modeling the whole Android platform

We consider a simplified situation:

programs only consist of activities,
activity interactions only occur in state running.

The whole platform is very complex to model:

applications may consist of several kinds of components,
activity interactions may occur in states other than running,
there is a large number of implicitly invoked callback methods,
a component of another program may be invoked,
...

SLIDE 50

Thank you! Questions?
