slide-1
SLIDE 1

Null Dereference Verification Via Over-approximated Weakest Precondition analysis

Ravichandhran Madhavan

Microsoft Research, India

Joint work with Raghavan Komondoor, Indian Institute of Science

slide-2
SLIDE 2

Problem Definition

  • Verify absence of Null dereferences
  • Demand-driven
  • Analyze a dereference in almost real-time
  • Sound (i.e., no false negatives)
  • No programmer annotations
  • Reasonably precise
  • Should work on real-world Java programs
slide-3
SLIDE 3

Weakest (at-least-once) Precondition

  • WP(p,C)
  • Constraint on the initial state that ensures that C holds whenever control reaches p
  • p may never be reached when WP(p,C) holds
  • WP1(p,C)
  • Constraint on the initial state that ensures that p is reached at least once in a state satisfying C
  • WP(p,C) = ¬WP1(p,¬C)
slide-4
SLIDE 4

Null deref verification using WP1

1: foo(a) {
2:   b = null;
3:   if (a != null)
4:     b.g = 10;
5: }

➢ We need to show that WP(S4, b != null) = true
➢ Equivalently, WP1(S4, b = null) = false
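As a sanity check of the two definitions, the sketch below simulates foo for both kinds of initial state (a is null, or a is non-null) and evaluates WP1(S4, b = null) and WP(S4, b != null) directly. The class name WpDemo and its methods are illustrative assumptions; they are not part of the presented analysis, which works symbolically rather than by enumeration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: simulates foo(a) and records whether b is null
// each time statement 4 (the dereference b.g) is reached.
class WpDemo {
    static List<Boolean> bIsNullAtS4(boolean aIsNull) {
        List<Boolean> visits = new ArrayList<>();
        boolean bIsNull = true;       // 2: b = null
        if (!aIsNull) {               // 3: if (a != null)
            visits.add(bIsNull);      // 4: b.g = 10 is reached here
        }
        return visits;
    }

    // WP1(S4, b = null): S4 is reached at least once in a state where b is null.
    static boolean wp1BNull(boolean aIsNull) {
        return bIsNullAtS4(aIsNull).contains(true);
    }

    // WP(S4, b != null): b is non-null on every visit of S4
    // (vacuously true when S4 is never reached).
    static boolean wpBNonNull(boolean aIsNull) {
        return bIsNullAtS4(aIsNull).stream().allMatch(isNull -> !isNull);
    }
}
```

For a non-null argument WP1(S4, b = null) holds, so the dereference at statement 4 cannot be verified; for a null argument statement 4 is never reached and WP(S4, b != null) holds vacuously.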

slide-5
SLIDE 5

Null deref verification using WP1

1: foo(a) {
2:   b = null;
3:   if (a != null)
4:     b.g = 10;
5: }

Predicates propagated backward from the dereference:

    〈a≠null〉
  b = null
    〈b=null ∧ a≠null〉
  if (a != null)
    〈b=null〉
  b.g = 10

Dereference of b is not safe

slide-6
SLIDE 6

Our approach

  • Computing WP and WP1 exactly is undecidable in general
  • Our approach: compute a condition ϕ that is weaker than WP1, i.e. WP1(r=null) ⇒ ϕ
  • Consequently, ϕ = false ⇒ WP1(r=null) = false

slide-7
SLIDE 7

Design Goals

  • Perform strong updates even in the presence of aliasing
  • Incorporate path-sensitivity: track relevant branches
  • Perform context-sensitive inter-procedural analysis
slide-8
SLIDE 8

Abstract Domain

AccessPath (AP) → Variable.Fields ∣ Variable
Fields → field.Fields ∣ ϵ
Atom → AP ∣ null
Predicate → Atom op Atom ∣ true ∣ false
op → = ∣ ≠
Disjunct → 2^Predicate
Domain → 2^Disjunct

〈a.f.g.h=null, b=null〉, 〈b.h=null〉 ∈ Domain

  • A disjunct is read as the conjunction of its predicates; a domain element is read as the disjunction of its disjuncts.
  • The domain excludes access-paths in which a field repeats.
  • E.g.: 〈a.f.g.h.f=null〉 ∉ Domain
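The restriction on repeated fields can be checked mechanically. The following is a minimal sketch, assuming access paths are written as dot-separated strings such as "a.f.g.h"; the class AccessPaths and its methods are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical string-based view of an access path such as "a.f.g.h".
class AccessPaths {
    // Splits "a.f.g.h" into its fields [f, g, h]; the leading variable is dropped.
    static List<String> fields(String accessPath) {
        String[] parts = accessPath.split("\\.");
        return Arrays.asList(parts).subList(1, parts.length);
    }

    // An access path belongs to the domain only if no field occurs twice in it.
    static boolean inDomain(String accessPath) {
        Set<String> seen = new HashSet<>();
        for (String field : fields(accessPath)) {
            if (!seen.add(field)) {
                return false; // repeated field, as in a.f.g.h.f
            }
        }
        return true;
    }
}
```

With finitely many variables and fields, excluding repeated fields leaves only finitely many access paths, which bounds the number of distinct predicates.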

slide-9
SLIDE 9

Illustration

Program (the dereference under analysis is t.data in "... = t.data.msg"):

  if (y != null)
    T branch:  x = t;  x.data = y
    F branch:  z = new ...;  t.next.data = z;  t = t.next
  ... = t.data.msg

Root predicate at the dereference: 〈t.data=null〉

Backward propagation along the F branch:

  • before t = t.next: 〈t.next.data=null〉
  • before t.next.data = z: 〈z=null〉
  • before z = new ...: 〈o1=null〉, which simplifies to 〈false〉

Backward propagation along the T branch:

  • after x.data = y: 〈t.data=null〉
  • before x.data = y: 〈x=t, y=null〉, 〈x≠t, t.data=null〉
  • before x = t: 〈t=t, y=null〉 simplifies to 〈y=null〉, and 〈t≠t, t.data=null〉 simplifies to 〈false〉
  • at if (y != null), T edge: 〈y=null, y≠null〉, which simplifies to 〈false〉

Both branches reduce to 〈false〉, so the dereference of t.data is verified as safe.
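The put-field step in this illustration, where 〈t.data=null〉 after x.data = y becomes 〈x=t, y=null〉, 〈x≠t, t.data=null〉 before it, can be sketched as a small backward transfer function. This is a minimal sketch under stated assumptions: predicates are plain strings, "!=" stands for ≠, and the class PutFieldTransfer and its method pre are hypothetical names, not the tool's API.

```java
import java.util.Arrays;
import java.util.List;

class PutFieldTransfer {
    // Backward transfer of the predicate "<base>.<field>=null" across the
    // statement "<lhsBase>.<field> = <rhs>". The result is a list of
    // disjuncts; each disjunct is a list of predicates read as a conjunction.
    static List<List<String>> pre(String lhsBase, String field, String rhs, String base) {
        if (lhsBase.equals(base)) {
            // Syntactically the same path: strong update, only the written value matters.
            return Arrays.asList(Arrays.asList(rhs + "=null"));
        }
        return Arrays.asList(
            // Case 1: the two bases alias, so the written value must be null.
            Arrays.asList(lhsBase + "=" + base, rhs + "=null"),
            // Case 2: they do not alias, so the old field value must be null.
            Arrays.asList(lhsBase + "!=" + base, base + "." + field + "=null"));
    }
}
```

Calling pre("x", "data", "y", "t") yields the two disjuncts of the illustration, and pre("t.next", "data", "z", "t.next") yields the single disjunct with z=null. Splitting on the aliasing condition is what allows a strong update in each case.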

slide-23
SLIDE 23

Simplification rules

  • ap = ap → true
  • {ap1 = ap2, ap1 ≠ ap2} → false
  • o_i = o_j → false (for i ≠ j)
  • o_i = null → false
  • o_i = o_i → true (over-approximation)

Example: propagating 〈ap1 = ap2〉 backward across "ap1 = new ..." gives 〈o_i = ap2〉, which simplifies to 〈false〉.
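The rules above can be applied to a single disjunct as in the following sketch. It assumes predicates are strings of the form "lhs=rhs" or "lhs!=rhs" without spaces, and that fresh allocation objects are named o1, o2, and so on; both conventions, and the class Simplifier, are assumptions made for illustration. The rule "ap != ap gives false" is not listed on the slide but is used in the illustration (〈t≠t, t.data=null〉 becomes 〈false〉).

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class Simplifier {
    // Sentinel for a disjunct that simplified to false.
    static final Set<String> FALSE = Collections.singleton("false");

    // Assumed naming convention for allocation objects: o1, o2, ...
    static boolean isAllocation(String atom) {
        return atom.matches("o\\d+");
    }

    // Simplifies one disjunct, i.e. a conjunction of predicates.
    // An empty result means the disjunct simplified to true.
    static Set<String> simplify(Set<String> disjunct) {
        Set<String> kept = new HashSet<>();
        for (String predicate : disjunct) {
            boolean negated = predicate.contains("!=");
            String[] sides = predicate.split(negated ? "!=" : "=");
            String lhs = sides[0];
            String rhs = sides[1];
            if (lhs.equals(rhs)) {
                if (negated) return FALSE; // ap != ap -> false
                continue;                  // ap = ap (and o_i = o_i) -> true
            }
            String opposite = lhs + (negated ? "=" : "!=") + rhs;
            if (disjunct.contains(opposite)) return FALSE; // {ap1=ap2, ap1!=ap2} -> false
            if (!negated && isAllocation(lhs)
                    && (rhs.equals("null") || isAllocation(rhs))) {
                return FALSE;              // o_i = null, or o_i = o_j with i != j
            }
            kept.add(predicate);
        }
        return kept;
    }
}
```

Dropping a predicate that is true shrinks the conjunction, and returning FALSE lets the caller discard the whole disjunct.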

slide-25
SLIDE 25

Abstract Interpretation Formulation

  • Concrete semantics
  • Domain: sets of concrete stores ordered by set inclusion
  • Backward collecting semantics

– set union is the join operator

  • γ(ϕ)={s∣s satisfies ϕ}
slide-26
SLIDE 26

Abstract Interpretation Formulation

  • Abstract semantics: ϕ1 ≤ ϕ2 iff ϕ1 ⊆ ϕ2
  • ϕ1 ⊆ ϕ2 ⇒ (ϕ1 ⇒ ϕ2), but the converse does not hold
  • Transfer functions and simplification rules are monotonic
  • Abstract transfer functions over-approximate the concrete transfer functions

slide-27
SLIDE 27

Inter-procedural analysis

Main { if(*) A() else B() }
A() { ... D() ... C() }
B() { ... C() ... }
C() { ... F() ... D() ... r.f = ... }
D() { ... E() ... }
E() { ... ... ... }
F() { ... ... ... }

Root predicate at the dereference r.f in C: 〈r=null〉

slide-28
SLIDE 28

Inter-procedural analysis

[Animation over the call graph with nodes main, A, B, C, D, E, F; formulas are numbered in the order in which they are computed.]

  • The analysis starts in C at (S, r = null); the precondition at the entry of C is initially ⊥
  • Propagating backward in C, the call to D is reached with φ1
  • Inside D, the call to E is reached with φ2; analyzing E gives φ3, and continuing in D gives φ4 at its entry
  • Back in C, the call to F is reached with φ5; analyzing F gives φ6
  • The precondition at the entry of C is φ7
  • Caller B: (call C, φ7) gives φ8 at the entry of B; in main, (call B, φ8) gives φ9
  • Caller A: (call C, φ7) reaches the call to D with φ1 again and gives φ10 at the entry of A; in main, (call A, φ10) gives φ11

slide-45
SLIDE 45

Inter-procedural analysis

Main { if(*) A() else B() }
A() { ... D() ... C() }
B() { ... C() ... }
C() { ... F() ... D() ... r.f = ... }
D() { ... E() ... }
E() { ... ... ... }
F() { ... ... ... }

[Figure: the program annotated with 〈r=null〉 at r.f and with the formulas φ1 to φ11 at the program points where they were computed.]

slide-46
SLIDE 46

Inter-procedural analysis

  • Uses depth-first strategy instead of conventional chaotic iteration
  • Analyzes callees before callers
  • Pros:
  • Uses less memory
  • Can abort search on discovering a satisfiable path
slide-47
SLIDE 47

Handling Recursion

[Animation over the call graph with nodes main, A, B, C, D, F.]

  • The analysis starts at (S, r = null) with ⊥, and a call is reached with φ1
  • While the callee is being analyzed, the same call is reached with φ1 again: recursion detected
  • The callee is re-analyzed (φ2, φ3, ...): iterate until the precondition for φ1 stabilises (φ10 in this example)
  • The analysis then continues into the callers as before: (call C, φ7) φ8, (call B, φ8) φ9, (call C, φ7) ⊥
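"Iterate until the precondition stabilises" is a fixpoint computation. The sketch below shows the generic shape, assuming a formula is modelled as a finite set and that set union is the join, as in the abstract interpretation formulation; the class Fixpoint and the step function are hypothetical stand-ins for re-analyzing the recursive procedure.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Function;

class Fixpoint {
    // Repeatedly joins step(current) into current until nothing new is added.
    // Termination relies on the set of possible elements being finite.
    static <T> Set<T> iterate(Set<T> start, Function<Set<T>, Set<T>> step) {
        Set<T> current = new HashSet<>(start);
        while (true) {
            Set<T> next = new HashSet<>(current);
            next.addAll(step.apply(current)); // set union is the join
            if (next.equals(current)) {
                return current;               // stabilised
            }
            current = next;
        }
    }
}
```

The loop stops at the first iteration that adds nothing, which corresponds to the point where the precondition no longer changes.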

slide-57
SLIDE 57

Challenges: Calls to/from standard library

[Figure: main and B make library calls into the standard library, which calls back into application code containing (S, r = null).]

  • Do not enter callers in the library
  • Reduce to true the predicates modified by the library method

slide-58
SLIDE 58

Challenges: Explosion of formulas

  • Formula sizes can increase due to
  • Put-field statements (aliasing predicates)
  • Branch statements
  • We use an inexpensive alias analysis to invalidate alias predicates
  • (ap1=ap2) → false if the access-paths are not may-aliases

slide-59
SLIDE 59

Limiting Path-sensitivity

  • Tracking all branch conditions without bounds will not scale
  • Can we do better than arbitrarily bounding the disjunct sizes?
  • For null-dereference verification, null-checks are a good target!

slide-60
SLIDE 60

Targeting null-checks

Example 1:

  if (b != null)
    b.g = 10

  Predicates: 〈b=null〉, 〈b=null ∧ b≠null〉

Example 2:

  b = a
  if (a != null)
    b.g = 10

  Predicates: 〈b=null〉, 〈b=null ∧ a≠null〉, 〈b=null ∧ b≠null〉

[Label in figure: null-check which needs to be tracked]

slide-61
SLIDE 61

Limited path-sensitivity

  • Track “AP op null” branches where AP aliases with AP in the root predicate
  • Ageing:
  • Drop predicates propagated beyond a threshold
  • Track only k recent branch conditions
  • Parameterizable: Predicate age, disjunct sizes and branches to be tracked can be configured.
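Tracking only the k most recent branch conditions amounts to a bounded queue. The sketch below is one possible realisation, assuming branch conditions are plain strings; the class RecentBranches and the parameter k are illustrative names and do not come from the presented tool.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class RecentBranches {
    private final int k;
    private final Deque<String> conditions = new ArrayDeque<>();

    RecentBranches(int k) {
        this.k = k;
    }

    // Records a branch condition and forgets the oldest one
    // once more than k conditions are held.
    void track(String condition) {
        conditions.addLast(condition);
        if (conditions.size() > k) {
            conditions.removeFirst();
        }
    }

    // Returns the tracked conditions, oldest first.
    List<String> tracked() {
        return new ArrayList<>(conditions);
    }
}
```

Forgetting a condition only weakens the formula, so the result stays an over-approximation at the cost of some precision.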

slide-62
SLIDE 62

Results

Benchmark     Bytecode   Derefs   % deref verified
jlex          25K        2.5K     96.3%
javacup       29K        2.8K     78.7%
bcel          86K        10.1K    88.3%
jbidwatcher   105K       9.6K     84.7%
sablecc       157K       14K      84.9%
ourtunes      127K       16.4K    91.5%
proguard      185K       17.7K    84.4%
antlr         251K       17.4K    76.8%
freecol       260K       24K      70.9%
l2j           373K       36.8K    85.3%

In all programs (except antlr), 93% of the derefs took < 250ms

slide-63
SLIDE 63

Complexity of verified dereferences

  • Context Depth

[Figure: a path through the calls and returns c1 c2 c3 r3 r2 c4 r4, with the path context depth on an axis from 1 to 4 and the max context depth marked.]

slide-64
SLIDE 64

Max context depth Vs safe derefs

slide-65
SLIDE 65

Example with high context depth

class L2Object {
  ObjectPosition _position;

  public L2Object() { initPosition(); }

  public void initPosition() {
    setObjectPosition(new CharPosition(this));
  }

  public void setObjectPosition(...) { _position = value; }

  public ObjectPosition getPosition() { return _position; }
}

class L2Character extends L2Object {
  public L2Character() { super(); }
}

class L2AirShipInstance extends L2Character {
  public L2AirShipInstance() { super(); }
}

public void L2AirShipInstanceParseLine() {
  airship = new L2AirShipInstance(..);
  t = airship.getPosition();
  t.setHeading();
}

slide-66
SLIDE 66

Max propagation count Vs safe derefs

slide-67
SLIDE 67

Evaluation of limited path-sensitivity

Benchmark   Ltd path sensitivity   No path sensitivity   K-recent branches
bcel        1184   63s             2501   97s            1119   8123s
javacup      607   19s             1036   29s             463    881s
jlex          93   13s              296   33s             105    148s

Did not scale to the rest

slide-68
SLIDE 68

Summary miss percentage

slide-69
SLIDE 69

Related Work

  • SALSA
  • Non demand-driven analysis
  • Performs limited scope analysis for scalability.
  • Findbugs
  • Bug finding tool based on a set of heuristics
  • Xylem
  • Uses a similar WP-based approach
  • Bug finding tool (has false negatives/positives)
slide-70
SLIDE 70

Related Work

  • Snugglebug
  • Under-approximates WP1 (computes a condition stronger than WP1)
  • Can prove presence of bugs (not their absence)
  • ESC/Java
  • Uses pre/post, loop-invariant annotations for computing WP

slide-71
SLIDE 71

Conclusion

  • Our evaluations on real Java programs reveal that
  • Deep inter-procedural analysis is important
  • Complex interactions with libraries (esp. with GUI, DB libs) present a huge challenge
  • Complications arise due to virtual method dispatch and exceptions
  • Our design trade-offs result in a highly responsive analysis with reasonable precision

  • Ideal for use in desktop development environments