SLIDE 1

Scalable and Precise Taint Analysis for Android

Wei Huang (1, 2), Yao Dong (1), Ana Milanova (1), Julian Dolby (3)

(1) Rensselaer Polytechnic Institute, (2) Google, (3) IBM Research

SLIDE 2

Taint Analysis for Android

• Tracks flow of private data

[Diagram: private data flows, unencrypted, to untrusted parties; additional label: "Controlled at installation"]

SOURCES: Phone number, Location, IMEI, etc.

SINKS: Network, Logs, etc.

SLIDE 3

Motivating Example [From DroidBench]

    public class Data {
        String f;
        String get() { return f; }
        void set(String p) { f = p; }
    }

    public class FieldSensitivity3 {
        protected void onCreate(Bundle b) {
            Data dt = new Data();
            …
            String sim = tm.getSimSerialNumber();
            dt.set(sim);
            String sg = dt.get();
            sms.sendTextMessage(…, sg, …); // sink
        }
    }

Leak!

SLIDE 4

Solution – DFlow/DroidInfer

    public class Data {
        String f;
        String get() { return f; }
        void set(String p) { f = p; }
    }

    public class FieldSensitivity3 {
        protected void onCreate(Bundle b) {
            tainted Data dt = new Data();
            tainted String sim = tm.getSimSerialNumber();
            dt.set(sim);
            tainted String sg = dt.get();
            sms.sendTextMessage(…, sg, …); // sink
        }
    }

• Source: the return value is tainted
• Sink: the parameter is safe
• Subtyping: safe <: tainted

Type error!

SLIDE 5

Contributions

• DFlow: A context-sensitive information flow type system
• DroidInfer: An inference algorithm for DFlow
• CFL-Explain: A CFL-reachability algorithm to explain type errors
• Effective handling of Android-specific features
• Implementation and evaluation
  • DroidBench, Contagio, Google Play Store

SLIDE 6

Inference and Checking Framework

• Build DFlow/DroidInfer on top of our type inference and checking framework
  • Programmers provide parameters to instantiate their own type system
• Context sensitivity is encoded with viewpoint adaptation
  • Framework infers the “best” typing
• If inference succeeds, this verifies the absence of errors
• Otherwise, this reveals errors in the program

SLIDE 7

Framework Structure

[Diagram of the framework pipeline; only the labels are recoverable]

Inputs: Program Source, Annotated Libraries, Parameters
Stages: Unified Typing Rules, Set-Based Solver, Extract Best Typing, Type Checking
Intermediate results: Instantiated Rules, Set-Based Solution, Concrete Typing

Type systems instantiated in the framework:

• Immutability (ReIm)
• Universe Types (UT)
• Ownership Types (OT)
• SFlow
• DFlow
• AJ
• EnerJ
• More?

SLIDE 8

DFlow

• Type qualifiers:
  • tainted: A variable x is tainted if there is flow from a sensitive source to x
  • safe: A variable x is safe if there is flow from x to an untrusted sink
  • poly: The polymorphic qualifier is interpreted as tainted in some contexts and as safe in other contexts
• Subtyping hierarchy:
  • safe <: poly <: tainted
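The subtyping hierarchy above can be sketched as a small ordered enum. This is only an illustration, not DroidInfer's implementation: the class name, the enum, and the `isSubtype` helper are all hypothetical names introduced here.

```java
// Minimal sketch of the DFlow qualifier hierarchy; all names are illustrative.
class DFlowLatticeSketch {
    // Declared in lattice order so that ordinal() mirrors safe <: poly <: tainted.
    enum Qualifier { SAFE, POLY, TAINTED }

    // q1 <: q2 holds when q1 sits at or below q2 in the hierarchy.
    static boolean isSubtype(Qualifier q1, Qualifier q2) {
        return q1.ordinal() <= q2.ordinal();
    }

    public static void main(String[] args) {
        // A safe value may flow into a tainted variable.
        System.out.println(isSubtype(Qualifier.SAFE, Qualifier.TAINTED));
        // A tainted value must not flow into a safe sink parameter.
        System.out.println(isSubtype(Qualifier.TAINTED, Qualifier.SAFE));
    }
}
```

Because an assignment is accepted only when the right-hand side is a subtype of the left-hand side, a tainted value reaching a safe sink parameter is rejected, which is how a leak surfaces as a type error.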

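SLIDE 9

DFlow Typing Rules (Simplified)

(TWRITE)
\[
\frac{\Gamma(x) = q_x \quad \Gamma(y) = q_y \quad \mathit{typeof}(f) = q_f \quad q_x <: q_y \triangleright q_f}
     {\Gamma \vdash y.f = x}
\]

(TREAD)
\[
\frac{\Gamma(x) = q_x \quad \Gamma(y) = q_y \quad \mathit{typeof}(f) = q_f \quad q_y \triangleright q_f <: q_x}
     {\Gamma \vdash x = y.f}
\]

(TCALL)
\[
\frac{\begin{array}{c}
\Gamma(x) = q_x \quad \Gamma(y) = q_y \quad \Gamma(z) = q_z \quad \mathit{typeof}(m) = q_{this},\, q_p \to q_{ret} \\
q_y <: q^i \triangleright q_{this} \quad q_z <: q^i \triangleright q_p \quad q^i \triangleright q_{ret} <: q_x
\end{array}}
     {\Gamma \vdash x = y.m(z)}
\]

Here q^i is the context qualifier of call site i.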
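To make the call rule on this slide concrete, the sketch below checks the three TCALL premises for a call x = y.m(z). The slide does not spell out the adaptation operator, so the code assumes the usual definition (q ⊳ tainted = tainted, q ⊳ safe = safe, q ⊳ poly = q). The class, the method names, and the poly-typed callee are hypothetical.

```java
// Sketch of viewpoint adaptation and the TCALL premises; names are illustrative.
class ViewpointAdaptationSketch {
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal();
    }

    // Assumed definition of q ⊳ q': poly takes the context, anything else stays as declared.
    static Qualifier adapt(Qualifier context, Qualifier declared) {
        return declared == Qualifier.POLY ? context : declared;
    }

    // The three TCALL premises for x = y.m(z) under one call-site context qi.
    static boolean callChecks(Qualifier qi, Qualifier qx, Qualifier qy, Qualifier qz,
                              Qualifier qThis, Qualifier qP, Qualifier qRet) {
        return isSubtype(qy, adapt(qi, qThis))
            && isSubtype(qz, adapt(qi, qP))
            && isSubtype(adapt(qi, qRet), qx);
    }

    // A call is typable if some context qualifier satisfies all premises.
    // The callee is assumed to be typed (poly this, poly p) -> poly.
    static boolean typableCall(Qualifier qx, Qualifier qy, Qualifier qz) {
        for (Qualifier qi : Qualifier.values()) {
            if (callChecks(qi, qx, qy, qz, Qualifier.POLY, Qualifier.POLY, Qualifier.POLY)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // The same callee is usable with tainted data at one call site and safe data at another.
        System.out.println(typableCall(Qualifier.TAINTED, Qualifier.TAINTED, Qualifier.TAINTED));
        System.out.println(typableCall(Qualifier.SAFE, Qualifier.SAFE, Qualifier.SAFE));
        // A tainted argument whose result lands in a safe variable has no valid context.
        System.out.println(typableCall(Qualifier.SAFE, Qualifier.SAFE, Qualifier.TAINTED));
    }
}
```

Choosing the context per call site is what gives context sensitivity: the poly callee is checked once but behaves as tainted or as safe depending on the caller.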

SLIDE 10

Inference Example

    public class Data {
        {poly, tainted} String f;
        {safe, poly, tainted} String get({safe, poly, tainted} Data this) { return this.f; }
        void set({safe, poly, tainted} Data this, {safe, poly, tainted} String p) { this.f = p; }
    }

    public class FieldSensitivity3 {
        protected void onCreate(Bundle b) {
            {safe, poly, tainted} Data dt = new Data();
            {safe, poly, tainted} String sim = tm.getSimSerialNumber(); // source
            dt.set(sim);
            {safe, poly, tainted} String sg = dt.get();
            sms.sendTextMessage(…, sg, …); // sink
        }
    }

SLIDE 11

Inference Example

    public class Data {
        {poly, tainted} String f;
        {safe, poly, tainted} String get({safe, poly, tainted} Data this) { return this.f; }
        void set({safe, poly, tainted} Data this, {safe, poly, tainted} String p) { this.f = p; }
    }

    public class FieldSensitivity3 {
        protected void onCreate(Bundle b) {
            {safe, poly, tainted} Data dt = new Data();
            {safe, poly, tainted} String sim = tm.getSimSerialNumber(); // source
            dt.set(sim);
            {safe, poly, tainted} String sg = dt.get();
            sms.sendTextMessage(…, sg, …); // sink
        }
    }

sg <: q ⊳ safe

SLIDE 12

Inference Example

    public class Data {
        {poly, tainted} String f;
        {safe, poly, tainted} String get({safe, poly, tainted} Data this) { return this.f; }
        void set({safe, poly, tainted} Data this, {safe, poly, tainted} String p) { this.f = p; }
    }

    public class FieldSensitivity3 {
        protected void onCreate(Bundle b) {
            {safe, poly, tainted} Data dt = new Data();
            {safe, poly, tainted} String sim = tm.getSimSerialNumber(); // source
            dt.set(sim);
            {safe, poly, tainted} String sg = dt.get();
            sms.sendTextMessage(…, sg, …); // sink
        }
    }

Type Error!

dt <: sg
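The set-based inference in this example can be approximated by a small fixpoint loop: every variable starts with a set of candidate qualifiers, each subtyping constraint removes candidates that cannot be satisfied, and an empty set signals a type error. The sketch below is a simplification under stated assumptions, not the DroidInfer algorithm. It flattens the example into a chain of plain constraints, ignores viewpoint adaptation and method boundaries, and starts every unannotated variable (including the field) with the full set. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified set-based solver over constraints of the form lhs <: rhs.
class SetBasedInferenceSketch {
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal();
    }

    // Returns the first variable whose candidate set becomes empty, or null if none does.
    static String solve(Map<String, EnumSet<Qualifier>> sets, List<String[]> constraints) {
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] c : constraints) {
                EnumSet<Qualifier> lhs = sets.get(c[0]);
                EnumSet<Qualifier> rhs = sets.get(c[1]);
                // Drop lhs candidates that have no supertype left in rhs.
                changed |= lhs.removeIf(a -> rhs.stream().noneMatch(b -> isSubtype(a, b)));
                if (lhs.isEmpty()) return c[0];
                // Drop rhs candidates that have no subtype left in lhs.
                changed |= rhs.removeIf(b -> lhs.stream().noneMatch(a -> isSubtype(a, b)));
                if (rhs.isEmpty()) return c[1];
            }
        }
        return null;
    }

    // Flattened flow start <: sim <: p <: f <: ret <: sg <: sinkParam (an assumption of this sketch).
    static String analyzeChain(boolean startsAtSource) {
        String[] vars = {"start", "sim", "p", "f", "ret", "sg", "sinkParam"};
        Map<String, EnumSet<Qualifier>> sets = new LinkedHashMap<>();
        for (String v : vars) sets.put(v, EnumSet.allOf(Qualifier.class));
        if (startsAtSource) sets.put("start", EnumSet.of(Qualifier.TAINTED)); // source: {tainted}
        sets.put("sinkParam", EnumSet.of(Qualifier.SAFE));                    // sink: {safe}
        List<String[]> constraints = new ArrayList<>();
        for (int i = 0; i + 1 < vars.length; i++) {
            constraints.add(new String[] {vars[i], vars[i + 1]});
        }
        return solve(sets, constraints);
    }

    public static void main(String[] args) {
        System.out.println("With source: error at " + analyzeChain(true));
        System.out.println("Without source: error at " + analyzeChain(false));
    }
}
```

With the source annotation in place, taintedness is pushed forward along the chain until it collides with the safe sink parameter, leaving one variable with no candidates. Without it, every variable can be typed safe and inference succeeds.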

SLIDE 13

CFL-Explain

• Type error: q ⊳ ret_getSimSerialNumber {tainted} <: sim {safe}
• Construct a dependency graph based on CFL-reachability
• Map a type error into a source-sink path in the graph

SLIDE 14

CFL-Explain – Construct Graph

• Field read: return this.f;
    constraint: this ⊳ f <: ret
    edge: this --]f--> ret
• Field write: this.f = p;
    constraint: p <: this ⊳ f
    edge: p --[f--> this

SLIDE 15

CFL-Explain – Construct Graph (Cont’d)

• Method call: String sg = dt.get();
    constraint: dt <: q^2 ⊳ this_get
    edge: dt --(2--> this_get
    constraint: q^2 ⊳ ret_get <: sg
    edge: ret_get --)2--> sg

SLIDE 16

CFL-Explain Output

[Diagram labels: Type Error, Call Graph, Dependency Graph, CFL-Explain, Source-Sink Path, No Path]

Source-sink path:

    source → sim --(4--> p --[f--> this_set --)4--> dt --(2--> this_get --]f--> ret_get --)2--> sg → sink
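A path like the one on this slide counts as a valid explanation only if its call parentheses and its field brackets are each properly matched. The sketch below checks a single, already extracted path with two independent stacks. It is a simplification: it demands fully balanced labels and does not search the graph. The label strings and call-site numbers are hypothetical stand-ins with the same shape as the slide's path.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Checks whether the edge labels along one path are balanced.
class CflPathCheckSketch {
    // Labels look like "(1", ")1" for call sites and "[f", "]f" for fields.
    static boolean isRealizable(List<String> labels) {
        Deque<String> calls = new ArrayDeque<>();
        Deque<String> fields = new ArrayDeque<>();
        for (String label : labels) {
            char kind = label.charAt(0);
            String id = label.substring(1);
            switch (kind) {
                case '(':
                    calls.push(id);
                    break;
                case '[':
                    fields.push(id);
                    break;
                case ')':
                    // A return must match the most recent unmatched call site.
                    if (calls.isEmpty() || !calls.pop().equals(id)) return false;
                    break;
                case ']':
                    // A field read must match the most recent unmatched field write.
                    if (fields.isEmpty() || !fields.pop().equals(id)) return false;
                    break;
                default:
                    throw new IllegalArgumentException("unknown label: " + label);
            }
        }
        return calls.isEmpty() && fields.isEmpty();
    }

    public static void main(String[] args) {
        // Same shape as the slide's path: write f inside one call, read f inside another.
        System.out.println(isRealizable(Arrays.asList("(1", "[f", ")1", "(2", "]f", ")2")));
        // Entering through call site 1 but returning through call site 2 is not realizable.
        System.out.println(isRealizable(Arrays.asList("(1", "[f", ")2")));
    }
}
```

Calls and fields are tracked separately because the two kinds of brackets interleave on the path rather than nest inside each other.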

SLIDE 17

CFL-Explain Output

[Diagram labels: Type Error, Call Graph, Dependency Graph, CFL-Explain, Source-Sink Path, No Path]

Reasons:

  • Unreachable methods on the call graph
  • False positive due to partial field insensitivity

SLIDE 18

Outline

• DFlow type system
• Inference algorithm for DFlow
• CFL-Explain
• Handling Android-specific features
• Implementation and evaluation

SLIDE 19

Android-Specific Features

• Libraries
  • Flow through library method
• Multiple Entry Points and Callbacks
  • Connections among callback methods
• Inter-Component Communication (ICC)
  • Explicit/implicit Intents

SLIDE 20

Libraries

• Insert annotations into Android library
  • source → {tainted}
  • sink → {safe}
• Type all parameters/returns of library methods as
  • poly, poly → poly
• Method n overrides m:

    (this_n, p_n → ret_n) <: (this_m, p_m → ret_m)
    this_m <: this_n    p_m <: p_n    ret_n <: ret_m
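The overriding rule is ordinary function subtyping: receivers and parameters are contravariant, returns are covariant. A minimal sketch of that check follows. The `Signature` class and the two example signatures are hypothetical, though the tainted callback parameter mirrors how a library source is annotated.

```java
// Sketch of the method-overriding check (this_n, p_n -> ret_n) <: (this_m, p_m -> ret_m).
class OverrideCheckSketch {
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal();
    }

    // Qualifiers of the receiver, the single parameter, and the return value.
    static final class Signature {
        final Qualifier self;
        final Qualifier param;
        final Qualifier ret;

        Signature(Qualifier self, Qualifier param, Qualifier ret) {
            this.self = self;
            this.param = param;
            this.ret = ret;
        }
    }

    // n overrides m: this_m <: this_n, p_m <: p_n, ret_n <: ret_m.
    static boolean validOverride(Signature n, Signature m) {
        return isSubtype(m.self, n.self)
            && isSubtype(m.param, n.param)
            && isSubtype(n.ret, m.ret);
    }

    public static void main(String[] args) {
        // Hypothetical library callback whose parameter is annotated as a source.
        Signature library = new Signature(Qualifier.POLY, Qualifier.TAINTED, Qualifier.POLY);
        // An app override that accepts tainted data is fine.
        Signature acceptsTainted = new Signature(Qualifier.POLY, Qualifier.TAINTED, Qualifier.POLY);
        // An app override that treats the parameter as safe is rejected.
        Signature claimsSafe = new Signature(Qualifier.POLY, Qualifier.SAFE, Qualifier.POLY);
        System.out.println(validOverride(acceptsTainted, library));
        System.out.println(validOverride(claimsSafe, library));
    }
}
```

The parameter constraint p_m <: p_n is what carries taint from an annotated library callback into the application method that overrides it.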

SLIDE 21

Example

• Library source:

    LocationListener.onLocationChanged(tainted Location l)

• Type library method as:

    poly double getLatitude(poly Location this)

    public class MyListener {
        @Override
        public void onLocationChanged(Location loc) {
            double lat = loc.getLatitude();
            Log.d(…, "Latitude: " + lat); // sink
        }
    }

Constraints:

    l <: loc
    loc <: q ⊳ poly
    q ⊳ poly <: lat
    loc <: lat

Type error: leak!

SLIDE 22

Callbacks

• Component objects (e.g., Activity) are instantiated by the Android framework
• No explicit instance to “link” the this parameters of callback methods
• DroidInfer creates equality constraints for this parameters to “link” callback methods

    this_callbackMethod1 = this_callbackMethod2

SLIDE 23

Callbacks

    public class LocationLeak2 extends Activity {
        poly double latitude;

        void onResume(safe LocationLeak2 this) {
            safe double d = this.latitude;
            Log.d(…, "Latitude: " + d); // sink
        }

        void onLocationChanged(tainted LocationLeak2 this, tainted Location loc) {
            tainted double lat = loc.getLatitude();
            this.latitude = lat;
        }
    }

Constraints:

    this_onResume ⊳ latitude <: safe
    tainted <: this_onLocationChanged ⊳ latitude
    this_onResume = this_onLocationChanged

Missed leak without the equality constraint!
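The effect of linking the this parameters can be checked by brute force over the two receiver qualifiers. The sketch below encodes only the two field constraints and the optional equality constraint from this slide. It assumes the usual adaptation rule (q ⊳ poly = q), which the slides do not state, and every name in it is hypothetical.

```java
// Brute-force check of the callback example with and without this-linking.
class CallbackLinkingSketch {
    enum Qualifier { SAFE, POLY, TAINTED }

    static boolean isSubtype(Qualifier a, Qualifier b) {
        return a.ordinal() <= b.ordinal();
    }

    // Assumed adaptation rule: a poly field takes the qualifier of its receiver.
    static Qualifier adapt(Qualifier context, Qualifier declared) {
        return declared == Qualifier.POLY ? context : declared;
    }

    // True if some typing of the two receivers satisfies all constraints,
    // which means no type error is reported.
    static boolean typable(boolean linkThis) {
        Qualifier latitude = Qualifier.POLY; // field declared poly, as on the slide
        for (Qualifier thisResume : Qualifier.values()) {
            for (Qualifier thisChanged : Qualifier.values()) {
                if (linkThis && thisResume != thisChanged) continue; // equality constraint
                boolean readOk = isSubtype(adapt(thisResume, latitude), Qualifier.SAFE);
                boolean writeOk = isSubtype(Qualifier.TAINTED, adapt(thisChanged, latitude));
                if (readOk && writeOk) return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("Unlinked receivers typable: " + typable(false));
        System.out.println("Linked receivers typable: " + typable(true));
    }
}
```

Without the link, the receivers can be typed safe in one callback and tainted in the other, so the program type-checks and the leak goes unreported. With the link, no typing exists and the leak shows up as a type error.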

SLIDE 24

Inter-Component Communication (ICC)

• Android components interact through Intents
• Explicit Intents
  • Have an explicit target component
  • DroidInfer connects them using placeholders
• Implicit Intents
  • Do not have a target component
  • DroidInfer conservatively considers them as sinks

SLIDE 25

ICC Example

    public class SmsReceiver extends BroadcastReceiver {
        public void onReceive(Context c, Intent i) {
            tainted String s = …; // source
            Intent it = new Intent(c, TaskService.class);
            it.putExtra("data", s);
            c.startService(it);
        }
    }

    public class TaskService extends Service {
        public void onStart(Intent it, int d) {
            String body = (String) it.getSerializableExtra("data");
            list.add(body);
            Entity e = new UrlEncodedFormEntity(list, "UTF8");
            post.setEntity(e); // sink
        }
    }

SLIDE 26

ICC Example

    public class SmsReceiver extends BroadcastReceiver {
        public void onReceive(Context c, Intent i) {
            tainted String s = …; // source
            TaskService_Intent it = new TaskService_Intent();
            TaskService_Intent.data = s; // it.putExtra("data", s);
            c.startService(it);
        }
    }

    public class TaskService extends Service {
        public void onStart(Intent it, int d) {
            String body = TaskService_Intent.data; // it.getSerializableExtra("data");
            list.add(body);
            Entity e = new UrlEncodedFormEntity(list, "UTF8");
            post.setEntity(e); // sink
        }
    }

SLIDE 27

Outline

• DFlow type system
• Inference algorithm for DFlow
• CFL-Explain
• Handling Android-specific features
• Implementation and evaluation

SLIDE 28

Implementation

• Built on top of Soot [Vallée-Rai et al. CASCON’99] and Dexpler [Bartel et al. SOAP’12]
• Publicly available at
  • https://github.com/proganalysis/type-inference

SLIDE 29

Evaluation

• DroidBench 1.0
  • Recall: 96%, precision: 79%
• Contagio
  • Detect leaks from 19 out of total 22 apps
• Google Play Store
  • 144 free Android apps (top 30 free apps)
  • Maximal heap size: 2 GB
  • Time: 139 sec / app on average
  • False positive rate: 15.7%

SLIDE 30

Results for Google Play Store Apps

[Bar chart: Number of Apps]

  • Total: 144
  • Containing Sources/Sinks: 111
  • With Type Errors: 84 (58% of all apps)
  • With Leaks to Network: 40 (48% of apps with type errors)

SLIDE 31

Runtime Results

• Run 10 random apps on Android phone/tablet
• Collect and analyze logs using Android Device Monitor
• Cover 14 out of 76 true flows in 8 apps (18.4%)

SLIDE 32

Runtime Example

A source-sink path in Zillow App

[Figure: dependency-graph path from the source getDeviceId to the sink URL.openConnection. The path passes through string operations (toString, append), a List and its Iterator (add, iterator, next), and the URL constructor (<init>). It spans the methods FiksuDeviceManager.getDeviceId, EventUploader.buildURL, EventUploader.uploadToTracking, and EventUploader.doUpload.]

SLIDE 33

DroidInfer Running Time

[Chart: running time per app in seconds, axis from 200 to 1400; average 139 sec]

• Maximal heap size is set to 2GB!

SLIDE 34

Related Work

• FlowDroid [Arzt et al. PLDI’14]
  • Flow-sensitive
  • Memory-intensive, reports no network flows
• IFT [Ernst et al. CCS’14]
  • Enable collaborative verification of information flow
  • Need source code of apps
  • Annotation burden: 6 annotations per 100 LOC
• IccTA [Li et al. ICSE’15]
  • Focus on inter-component detection (ICC)
• Others
  • LeakMiner, Cassandra, SCANDAL, AndroidLeaks, CHEX, SCanDroid, Epicc, and so on

SLIDE 35

Conclusions

• DFlow and DroidInfer: context-sensitive information flow type system and inference
• CFL-reachability algorithm to explain type errors
• Effective handling of Android-specific features
• Implementation and evaluation
• Publicly available at
  • https://github.com/proganalysis/type-inference