ANALYZING JAVA WITH THE SAWJA FRAMEWORK: FROM RESEARCH SPECIFICATIONS TO REALISTIC TOOLS



slide-1
SLIDE 1

Laurent Hubert and David Pichardie INRIA Rennes, France

FMCO 2010

COST Action IC0701 Session

30 November 2010

ANALYZING JAVA WITH THE SAWJA FRAMEWORK

FROM RESEARCH SPECIFICATIONS TO REALISTIC TOOLS

1 Tuesday, November 30, 2010

slide-2
SLIDE 2

Static Analysis & Type Systems

  • Powerful and automatic verification techniques
  • Should never miss a true alarm
      • a soundness proof ensures it
  • Necessarily incomplete (i.e., raise false alarms)
      • static analyses return «I don’t know»
      • some correct programs do not type check
  • an experimental evaluation should ensure that false alarms do not appear too often in practice
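The trade-off between soundness and incompleteness can be sketched with a toy nullness domain. This is only an illustration under assumed names (`nullness`, `join`, `check_deref` are hypothetical and not part of Sawja): the analysis never accepts a dereference that may fail, at the price of sometimes answering "I don't know".

```ocaml
(* Hypothetical three-valued abstraction of a reference.
   MaybeNull plays the role of the "I don't know" answer. *)
type nullness = Null | NonNull | MaybeNull

(* Least upper bound, used when two control-flow branches merge. *)
let join a b = if a = b then a else MaybeNull

type verdict = Safe | Alarm

(* Sound check: a dereference is accepted only when the value is
   definitely non-null; anything else raises an alarm. *)
let check_deref = function
  | NonNull -> Safe
  | Null | MaybeNull -> Alarm

let () =
  (* Non-null on both branches: no alarm. *)
  assert (check_deref (join NonNull NonNull) = Safe);
  (* Null on one branch: a true alarm, which must not be missed. *)
  assert (check_deref (join NonNull Null) = Alarm);
  (* No information: the alarm is raised anyway. It is a false alarm
     whenever the value is in fact non-null in every execution. *)
  assert (check_deref MaybeNull = Alarm)
```

The last case is where incompleteness shows up: the checker stays sound by rejecting programs it cannot prove safe.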

slide-9
SLIDE 9

Static Verification of Java

  • Java already provides a strong type system (the bytecode verifier, BCV)
  • Many extensions are possible
  • Java annotation system
      • allows new types to be specified
      • e.g., overrides, nullness type system
      • but does not provide support for type checking them

class Point {
    @NonNull Float _x, _y;
    Point(int x, int y) { _x = x; _y = y; }
    @overrides boolean equal(Point p) {...}
}

[Two "Error!" callouts are attached to the code on the slide.]

slide-13
SLIDE 13

Formal specification and proof: soundness
  • Proved on a toy language

Prototype: precision
  • JVM spec compliant
  • Exceptions, lazy class loading, interfaces, ...

slide-15
SLIDE 15

Formal specification and proof
  • Proved on a toy language

Prototype
  • JVM spec compliant
  • Exceptions, lazy class loading, interfaces, ...

A backend: the Sawja framework

slide-16
SLIDE 16

Outline

  • The Sawja framework [FoVeOOS'10]
  • A type system for Secure Object Initialization [ESORICS'10]
  • An implementation based on Sawja

slide-17
SLIDE 17

Sawja

  • OCaml library for developing Java bytecode static analyses (GNU LGPL)
  • High level intermediate representation (language)
      • Transformation proven sound
  • High level API for efficient browsing of class hierarchy
  • Implements a large part of the JVM Specification (structural constraints, resolution, lookups, control flow, etc.)
  • Efficient

Did you ever look at the method resolution specification?

slide-18
SLIDE 18

Intermediate Representation

  • In a few words
      • Stackless representation, no sub-routines, etc.
      • Stingy with local variables
      • Time efficient
      • Formally proved on paper
  • More information
      • D. Demange, T. Jensen, and D. Pichardie. A provably correct stackless intermediate representation for Java bytecode. APLAS’10.

slide-19
SLIDE 19

Several code representations

Instruction sets

Low level representation

type opcode =
  | OpNop | OpAConstNull | OpIConst of int32 | OpLConst of int64
  | OpFConst of float | OpDConst of float | OpBIPush of int | OpSIPush of int
  | OpLdc1 of int | OpLdc1w of int | OpLdc2w of int
  | OpLoad of jvm_basic_type * int | OpALoad of int
  | OpArrayLoad of [ `Double | `Float | `Int | `Long ]
  | OpAALoad | OpBALoad | OpCALoad | OpSALoad
  | OpStore of jvm_basic_type * int | OpAStore of int
  | OpArrayStore of [ `Double | `Float | `Int | `Long ]
  | OpAAStore | OpBAStore | OpCAStore | OpSAStore
  | OpPop | OpPop2 | OpDup | OpDupX1 | OpDupX2 | OpDup2 | OpDup2X1 | OpDup2X2 | OpSwap
  | OpAdd of jvm_basic_type | OpSub of jvm_basic_type | OpMult of jvm_basic_type
  | OpDiv of jvm_basic_type | OpRem of jvm_basic_type | OpNeg of jvm_basic_type
  | OpIShl | OpLShl | OpIShr | OpLShr | OpIUShr | OpLUShr
  | OpIAnd | OpLAnd | OpIOr | OpLOr | OpIXor | OpLXor
  | OpIInc of int * int
  | OpI2L | OpI2F | OpI2D | OpL2I | OpL2F | OpL2D | OpF2I | OpF2L | OpF2D
  | OpD2I | OpD2L | OpD2F | OpI2B | OpI2C | OpI2S
  | OpLCmp | OpFCmpL | OpFCmpG | OpDCmpL | OpDCmpG
  | OpIfEq of int | OpIfNe of int | OpIfLt of int
  | OpIfGe of int | OpIfGt of int | OpIfLe of int
  | OpICmpEq of int | OpICmpNe of int | OpICmpLt of int
  | OpICmpGe of int | OpICmpGt of int | OpICmpLe of int
  | OpACmpEq of int | OpACmpNe of int
  | OpGoto of int | OpJsr of int | OpRet of int
  | OpTableSwitch of int * int32 * int32 * int array
  | OpLookupSwitch of int * (int32 * int) list
  | OpReturn of jvm_basic_type | OpAReturn | OpReturnVoid
  | OpGetStatic of int | OpPutStatic of int | OpGetField of int | OpPutField of int
  | OpInvokeVirtual of int | OpInvokeNonVirtual of int
  | OpInvokeStatic of int | OpInvokeInterface of int * int
  | OpNew of int | OpNewArray of java_basic_type | OpANewArray of int
  | OpArrayLength | OpThrow | OpCheckCast of int | OpInstanceOf of int
  | OpMonitorEnter | OpMonitorExit | OpAMultiNewArray of int * int
  | OpIfNull of int | OpIfNonNull of int | OpGotoW of int | OpJsrW of int
  | OpBreakpoint | OpInvalid

Default representation

type jopcode =
  | OpLoad of jvm_type * int | OpStore of jvm_type * int | OpIInc of int * int
  | OpPop | OpPop2 | OpDup | OpDupX1 | OpDupX2 | OpDup2 | OpDup2X1 | OpDup2X2 | OpSwap
  | OpConst of [ `ANull | `Byte of int | `Class of object_type | `Double of float
               | `Float of float | `Int of int32 | `Long of int64 | `Short of int
               | `String of string ]
  | OpAdd of jvm_basic_type | OpSub of jvm_basic_type | OpMult of jvm_basic_type
  | OpDiv of jvm_basic_type | OpRem of jvm_basic_type | OpNeg of jvm_basic_type
  | OpIShl | OpLShl | OpIShr | OpLShr | OpIUShr | OpLUShr
  | OpIAnd | OpLAnd | OpIOr | OpLOr | OpIXor | OpLXor
  | OpI2L | OpI2F | OpI2D | OpL2I | OpL2F | OpL2D | OpF2I | OpF2L | OpF2D
  | OpD2I | OpD2L | OpD2F | OpI2B | OpI2C | OpI2S
  | OpCmp of [ `DG | `DL | `FG | `FL | `L ]
  | OpIf of [ `Eq | `Ge | `Gt | `Le | `Lt | `Ne | `NonNull | `Null ] * int
  | OpIfCmp of [ `AEq | `ANe | `IEq | `IGe | `IGt | `ILe | `ILt | `INe ] * int
  | OpGoto of int | OpJsr of int | OpRet of int
  | OpTableSwitch of int * int32 * int32 * int array
  | OpLookupSwitch of int * (int32 * int) list
  | OpNew of class_name | OpNewArray of value_type
  | OpAMultiNewArray of object_type * int
  | OpCheckCast of object_type | OpInstanceOf of object_type
  | OpGetStatic of class_name * field_signature
  | OpPutStatic of class_name * field_signature
  | OpGetField of class_name * field_signature
  | OpPutField of class_name * field_signature
  | OpArrayLength | OpArrayLoad of jvm_array_type | OpArrayStore of jvm_array_type
  | OpInvoke of [ `Interface of class_name | `Special of class_name
                | `Static of class_name | `Virtual of object_type ] * method_signature
  | OpReturn of jvm_return_type | OpThrow | OpMonitorEnter | OpMonitorExit
  | OpNop | OpBreakpoint | OpInvalid

Intermediate representation

type instr =
  | Nop
  | AffectVar of var * expr
  | AffectArray of expr * expr * expr
  | AffectField of expr * class_name * field_signature * expr
  | AffectStaticField of class_name * field_signature * expr
  | Goto of int
  | Ifd of ([ `Eq | `Ge | `Gt | `Le | `Lt | `Ne ] * expr * expr) * int
  | Throw of expr
  | Return of expr option
  | New of var * class_name * value_type list * expr list
  | NewArray of var * value_type * expr list
  | InvokeStatic of var option * class_name * method_signature * expr list
  | InvokeVirtual of var option * expr * virtual_call_kind * method_signature * expr list
  | InvokeNonVirtual of var option * expr * class_name * method_signature * expr list
  | MonitorEnter of expr
  | MonitorExit of expr
  | MayInit of class_name
  | Check of check
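To give an intuition for what "stackless" means, here is a minimal sketch that turns straight-line stack code into assignments over expression trees by symbolically executing the operand stack. The instruction set (`Push`, `Load`, `Add`, `Mul`, `Store`) and the function `to_stackless` are hypothetical toy definitions, not Sawja's types or its actual transformation; the real algorithm, described in the APLAS'10 paper cited above, also has to handle branches, exceptions and stores that invalidate expressions still on the stack.

```ocaml
(* Toy stack-based bytecode (assumed for illustration only). *)
type bytecode = Push of int | Load of string | Add | Mul | Store of string

(* Toy stackless target: assignments of expression trees to variables. *)
type expr = Const of int | Var of string | Binop of string * expr * expr
type stmt = Assign of string * expr

(* Symbolic execution of the operand stack: the stack holds expressions
   instead of values, and only Store emits a statement.
   Simplification: a Store to a variable that is still referenced by an
   expression on the symbolic stack is not handled here. *)
let to_stackless (code : bytecode list) : stmt list =
  let step (stack, out) ins =
    match ins, stack with
    | Push n, _ -> (Const n :: stack, out)
    | Load x, _ -> (Var x :: stack, out)
    | Add, b :: a :: rest -> (Binop ("+", a, b) :: rest, out)
    | Mul, b :: a :: rest -> (Binop ("*", a, b) :: rest, out)
    | Store x, e :: rest -> (rest, Assign (x, e) :: out)
    | _ -> failwith "stack underflow"
  in
  let _, out = List.fold_left step ([], []) code in
  List.rev out

(* z = x * 2 + y, written as stack code, becomes a single assignment. *)
let example =
  to_stackless [ Load "x"; Push 2; Mul; Load "y"; Add; Store "z" ]
```

Here `example` is one `Assign` whose right-hand side is the tree for `x * 2 + y`, with no stack manipulation left.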

slide-22
SLIDE 22

Overview

  • Javalib: .class file oriented (parsing, etc.)
  • Sawja: static analysis oriented (IR, CFG computation, etc.)
  • Tools:
      • NIT: Nullness Inference Tool, [FMOODS'08], [PASTE'08]
      • SecInit: Secure object initialization, [ESORICS'10]
      • Eclipse Plugin
      • ...


slide-24
SLIDE 24

Object Initialization

  • Initialization is often a critical phase
      • security checks may be performed
      • defense mechanisms may be installed
      • operations may be logged
      • fields are initialized
      • etc.

slide-25
SLIDE 25

Object Initialization

  • Initialization is often a critical phase
  • The object invariant is established
  • Known issue, source of several security bugs
  • Recommendation of the CERT (OBJ04-J), Oracle's guidelines for secure coding & Joshua Bloch in Effective Java

Only fully initialized objects should be manipulated by the program

slide-35
SLIDE 35

Object Initialization In Java

Class hierarchy: Object, Foo, Bar

Timeline of an instance of class Bar

  • allocation of an instance of Bar
  • call to the constructor of Bar
  • call to the super constructor (Foo)
  • call to the super constructor (Object)
  • initialization of Object
  • initialization of Foo
  • initialization of Bar
  • fully initialized

Legend: Enforced by the BCV / Not enforced by the BCV

slide-36
SLIDE 36

Solutions

  • Oracle's solution: program your own monitor
      • Coding pattern
      • Failure to adhere silently leads to security vulnerabilities
  • Our solution: a type system
      • No additional code
      • Only a few type annotations are needed
      • Formally defined
      • Provably sound
      • Failure to adhere leads to a type error
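The "program your own monitor" coding pattern can be sketched as an explicit initialization flag that every operation is expected to check. The record and function names below (`account`, `construct`, `guarded_balance`, `unguarded_balance`) are hypothetical, and the sketch is written in OCaml rather than Java only to stay in the deck's implementation language; it is an assumed rendering of the pattern, not code from the guidelines.

```ocaml
(* Hypothetical object guarded by a hand-written initialization flag. *)
type account = { mutable initialized : bool; mutable balance : int }

exception Not_initialized

(* Allocation: the object exists but no constructor has completed. *)
let allocate () = { initialized = false; balance = 0 }

(* Constructor body: the flag is set last, once every check has passed. *)
let construct acc ~opening_balance =
  if opening_balance < 0 then invalid_arg "negative opening balance";
  acc.balance <- opening_balance;
  acc.initialized <- true

(* A method that follows the pattern: it checks the flag first. *)
let guarded_balance acc =
  if not acc.initialized then raise Not_initialized;
  acc.balance

(* A method whose author forgot the check: nothing complains, and a
   partially initialized object is silently used. *)
let unguarded_balance acc = acc.balance
```

Nothing forces `unguarded_balance` to perform the check, which is the weakness a type system addresses: the missing check becomes a type error instead of a silent hole.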

slide-37
SLIDE 37

The Type Annotations

Example of a class hierarchy: A, B, Object
Corresponding type lattice structure: Init, Raw, Raw(A), Raw(B), Raw(Object)
Object states: uninitialized, partially initialized, fully initialized

  • Raw(C) [FähndrichLeino2003]
      • The constructor of C has terminated successfully
  • Sub-typing relation

Object B Raw(Object)
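A subtyping check over such a lattice might look as follows. This is a sketch under two assumptions that the extracted slide does not spell out: the hierarchy is taken to be B extends A extends Object, and the ordering is taken to be Init ⊑ Raw(C) ⊑ Raw(D) ⊑ Raw whenever C is a subclass of D, in the style of raw types. All names (`cls`, `ty`, `subclass`, `subtype`) are hypothetical.

```ocaml
(* Assumed class hierarchy: B extends A extends Object. *)
type cls = Object | A | B

let super = function
  | Object -> None
  | A -> Some Object
  | B -> Some A

(* Reflexive, transitive subclass relation. *)
let rec subclass c d =
  c = d || (match super c with Some p -> subclass p d | None -> false)

(* Init: fully initialized.
   Raw_of c: the constructor of c has terminated successfully.
   Raw: nothing is known about initialization. *)
type ty = Init | Raw_of of cls | Raw

(* Assumed ordering: Init is the bottom, Raw is the top, and Raw_of
   follows the class hierarchy. *)
let subtype t1 t2 =
  match t1, t2 with
  | Init, _ -> true
  | _, Raw -> true
  | Raw_of c, Raw_of d -> subclass c d
  | (Raw_of _ | Raw), _ -> false
```

With these definitions a fully initialized object can be used wherever a partially initialized one is expected, but not the other way round.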


slide-39
SLIDE 39

Experimental results

  • Implementation includes additional features (setInit, casts)
  • We studied 3 packages of Oracle's JRE: java.lang, java.security and javax.security
  • We verified 377 classes (131 kloc)
  • + Few annotations needed (53)
  • + Few runtime checks (4 cast operators)
  • - synthetic methods (3)
  • - limitation on arrays (1)
      • we forbid storing partially initialized objects in arrays

99% of classes verified, 1 annotation every 2.3 kloc on average

Works in practice


slide-41
SLIDE 41

The Type System

  • Typing rules for expressions are standard: L ⊢ e : τ
      • OCaml function: val eval_expr : LDom.t -> JBir.expr -> EDom.t
  • Instructions are formalized using a type and effect system: P m ⊢ ins : L → L′
      • Type of local variables may evolve (flow sensitive)
      • OCaml function: val cst_from_instr: JBir.instr -> LDom.t -> LDom.t


slide-46
SLIDE 46

The Type System

Expression Typing

L ⊢ e.f : (p.fields f)
L ⊢ x : L(x)
L ⊢ null : Init

let rec eval_expr (l:LDom.t) : JBir.expr -> EDom.t = function
  | Field (_,cn,fs)
  | StaticField (cn,fs) ->
      let cl = PProgram.resolve_field (PProgram.Name (prog,cn)) fs in
      Annotations.get_field_annotation koc cl fs
  | Var (_,x) -> LDom.get_var (JBir.index x) l
  | Const `ANull -> EDom.Init
  | Unop (Cast _, e) -> eval_expr l e
  | Binop (ArrayLoad _ , e1, e2) -> array_annot
  | _ -> EDom.bot
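The correspondence between the three rules and the code can be reproduced in a self-contained toy version. Everything below is an assumption made for illustration: the two-point type `ty`, the expression type `expr`, the string-keyed environment standing in for `LDom.t`, and the `field_annotation` table standing in for the declared field types; none of it is the Sawja or prototype API.

```ocaml
(* Environment L: maps local variable names to their current type. *)
module Env = Map.Make (String)

(* Simplified type domain: fully initialized or not. *)
type ty = Init | Raw

(* Toy expression language. *)
type expr =
  | Null
  | Var of string
  | Field of expr * string
  | Cast of expr

(* Hypothetical table of declared field annotations: a field named
   "partial" is declared Raw, every other field is declared Init. *)
let field_annotation : string -> ty = function
  | "partial" -> Raw
  | _ -> Init

(* One case per typing rule. Raises Not_found on an unbound variable. *)
let rec eval_expr (l : ty Env.t) : expr -> ty = function
  | Field (_, f) -> field_annotation f   (* L |- e.f : declared type of f *)
  | Var x -> Env.find x l                (* L |- x : L(x) *)
  | Null -> Init                         (* L |- null : Init *)
  | Cast e -> eval_expr l e              (* a cast keeps the type of e *)
```

As in the prototype's code, the type of a field access depends only on the field's annotation and not on the type of the receiver expression.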


slide-48
SLIDE 48

The Type System

Local Variable Assignment

Instruction typing

    L ⊢ e : τ    x ≠ this
    ---------------------------
    m ⊢ x ← e : L → L[x ↦ τ]

  | AffectVar (var,expr) ->
      let this_var = snd (List.hd implem.params) in
      if (not m.cm_static) && (JBir.var_equal var this_var)
      then failwith "The receiver should not be overwritten"
      else (fun l ->
              let val_expr = eval_expr l expr in
              LDom.set_var (JBir.index var) val_expr l)
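The flow-sensitive reading of this rule, where an instruction maps the environment before it to the environment after it, can be tried out on a toy model. The types `ty`, `expr`, `instr`, the string-keyed environment and the function `transfer` are hypothetical stand-ins, assumed here for illustration, for `EDom.t`, `JBir.expr`, `JBir.instr`, `LDom.t` and `cst_from_instr`.

```ocaml
module Env = Map.Make (String)

type ty = Init | Raw
type expr = Null | Var of string
type instr = AffectVar of string * expr

(* Expression typing, restricted to the two cases needed here. *)
let eval_expr (l : ty Env.t) : expr -> ty = function
  | Null -> Init
  | Var x -> Env.find x l

(* Transfer function for an assignment: rejects an overwrite of the
   receiver in instance methods, otherwise returns L -> L[x |-> tau]. *)
let transfer ~is_static (ins : instr) : ty Env.t -> ty Env.t =
  match ins with
  | AffectVar (x, e) ->
    if (not is_static) && x = "this" then
      failwith "The receiver should not be overwritten"
    else fun l -> Env.add x (eval_expr l e) l

(* The type of y evolves along the instruction sequence. *)
let l0 = Env.add "this" Raw Env.empty
let l1 = transfer ~is_static:false (AffectVar ("y", Var "this")) l0
let l2 = transfer ~is_static:false (AffectVar ("y", Null)) l1
```

In `l1` the variable `y` has the type of `this`; after the second assignment, in `l2`, it has type `Init`, which is the "type of local variables may evolve" behaviour.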

slide-49
SLIDE 49

The Implementation in Digits

  • Sawja+Javalib: 22.3 kloc + 5.9 kloi (kilo lines of interfaces)
  • Prototype: 1253 (non-empty) loc (excluding interfaces)
      • Core analysis: 636 loc
          • Constraints: 288 loc (114 casts, 79 method calls, 20 expr., 97 other)
          • Rest: variance checking (56), error handling, printing functions
      • Domain: 163 loc
      • Annotations: 354 loc
      • Main: 100 loc

slide-50
SLIDE 50

The Implementation in Eclipse


slide-51
SLIDE 51

Related Libraries

  • Soot
      + mature SA framework, more analyses available
      - iterative IR construction, soundness not formalized, in Java
  • Wala
      + mature SA framework, more analyses available
      - memory and CPU hungry, in Java
  • Bcel & ASM: in Java, adapted to run-time transformation, no IR
  • Barista: in OCaml, focused on parsing and saving .class files

slide-52
SLIDE 52

Conclusion

  • Sawja allowed us to develop
      • an efficient prototype (~200 methods / second)
      • handling the full Java 6 bytecode
      • within 1,300 loc
  • Paper typing rules and corresponding code are very close
  • Eclipse plugin (almost) for free

slide-53
SLIDE 53

Perspectives

  • On reducing the gap between soundness proof and implementation
      • Make annotation handling simpler
      • Replace some Sawja components with Coq-extracted code
  • More case studies are currently being developed
      • Secure Cloning: checks if a copy method provides sufficiently deep copies
      • Assertion Checker: checks user assertions without loop invariants (in the spirit of the static checker of Microsoft Code Contracts)

Other: some parts could still be simplified, around the call to the solver and the creation of the initial state; annotations as type annotations; annotations in the body of methods

slide-54
SLIDE 54

Thank you

The prototype is available as a web demo

http://irisa.fr/celtique/ext/rawtypes
