A True Positives Theorem for a Static Race Detector



slide-1
SLIDE 1

A True Positives Theorem for a Static Race Detector

Presentation by Julia Belyakova and Artem Pelenitsyn For CS 7580 (instructor: Jan Vitek), 10/30/2019

A subset of slides is taken from Ilya Sergey’s web page

slide-2
SLIDE 2

Static Analyses for Program Validation

slide-3
SLIDE 3

The Essence of Static Analysis

[Diagram: a program C, an execution e, and a property of interest p; the “abstraction” α relates executions to properties.]

slide-4
SLIDE 4

[Diagram: two executions e1 and e2, each mapped by α to the property p.]

slide-5
SLIDE 5

Static Analysis

concreteSem(c) = [Diagram: executions e1 … e6 under properties p1 … p4.]

slide-6
SLIDE 6

Static Analysis

concreteSem(c) = [Diagram: executions e1 … e6 under properties p1 … p4; the properties are split into “correct” and “has bugs”.]

slide-7
SLIDE 7

Verifier or a Bug Detector?

slide-8
SLIDE 8

Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled true negative, true positive, false positive.]

slide-9
SLIDE 9

Sound Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled true negative, true positive, false positive.]

slide-10
SLIDE 10

Sound Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4, with a region marked “abstract over-approximation”; regions labelled true negative, true positive, false positive.]

slide-11
SLIDE 11

Sound Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4, with a region marked “abstract over-approximation”; regions labelled true negative, true positive, false positive.]

slide-12
SLIDE 12

Sound Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4, with e6 highlighted; regions labelled true positive, true negative, false positive.]

if (n == VERY_UNLIKELY_VALUE) {
  bug.explode();
} else {
  // do nothing
}

Developer: Go away, that never happens!

slide-13
SLIDE 13

Unsound Program “Verifier”

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled false negative, true negative, true positive, false positive.]

if (n == VERY_UNLIKELY_VALUE) {
  bug.explode();
} else {
  // do nothing
}

slide-14
SLIDE 14

“Sound” Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p4, with e6 marked as a false negative; other regions labelled false positive, true negative, true positive.]

slide-15
SLIDE 15

“Sound” Program Verifier

[Diagram: executions e1 … e6 under properties p1 … p3, with regions marked “concrete under-approximation” and “abstract over-approximation”; regions labelled true negative, true positive, false positive.]

slide-16
SLIDE 16
Sound Static Verifiers

  • False negatives (bugs missed) are bad
  • False positives (non-bugs reported) are okay
  • Constructed as over-approximation (of under-approximation)
  • Soundness Theorem: Under certain assumptions about the programs, the analyser has no false negatives.

slide-17
SLIDE 17

[Diagram: executions e1 … e6 under properties p1 … p4; the properties are split into “correct” and “has bugs”.]

slide-18
SLIDE 18

Static Bug Finder

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled true negative, false negative, true positive, false positive.]

slide-19
SLIDE 19

Unsound Static Bug Finder

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled false positive, true negative, false negative, true positive.]

slide-20
SLIDE 20

Sound (but imprecise) Static Bug Finder

[Diagram: executions e1 … e6 under properties p1 … p4, with a region marked “abstract under-approximation”; regions labelled false negative, true negative, true positive.]

slide-21
SLIDE 21

Loss of Precision in Static Bug Finders

if (n != VERY_UNLIKELY_VALUE) {
  // bug happens here
} else {
  // normal execution
}

[Diagram: executions e2 and e3.]

Idea: over-approximate in concrete semantics!

slide-22
SLIDE 22

Sound (but Imprecise) Static Bug Finder

[Diagram: executions e1 … e6 under properties p1 … p4; regions labelled true negative, false negative, true positive.]

Let’s consider these two equivalent! Let’s merge these executions into one that subsumes both!

slide-23
SLIDE 23

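overApproxConcreteSem(c) =

[Diagram: executions e2 and e3 (one true positive, one false negative) are merged into a single execution e23 under property p2, labelled true positive; the other executions e1, e4, e5, e6 remain under properties p1 … p4; regions labelled true negative, false negative, true positive.]

if (*) {
  // bug happens here
} else {
  // normal execution
}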

slide-24
SLIDE 24

Sound Static Bug Finder

overApproxConcreteSem(c) =

[Diagram: executions e1, e23, e4, e5, e6 under properties p1 … p4, with regions marked “abstract under-approximation” and “concrete over-approximation”; regions labelled true positive, true negative, false negative.]

if (*) {
  // bug happens here
} else {
  // normal execution
}

slide-25
SLIDE 25
Towards Sound Static Bug Finders (this work)

  • False negatives (bugs missed) are okay
  • False positives (non-bugs reported) are bad
  • Constructed as under-approximation of over-approximation
  • Soundness (True Positives) Theorem: Under certain assumptions about the programs, the analyser has no false positives.
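The two theorem shapes contrasted on the verifier and bug-finder slides can be compared on a toy model. This is a minimal sketch, assuming each program is summarised by two hypothetical booleans (`has_bug`, `reported`); neither name comes from the paper or from Infer.

```python
from typing import Iterable, NamedTuple


class Outcome(NamedTuple):
    has_bug: bool   # ground truth: some execution of the program is buggy
    reported: bool  # the analyser raised an alarm for the program


def classify(o: Outcome) -> str:
    """Name the quadrant an analysis outcome falls into."""
    if o.reported:
        return "true positive" if o.has_bug else "false positive"
    return "false negative" if o.has_bug else "true negative"


def no_false_negatives(outcomes: Iterable[Outcome]) -> bool:
    """Shape of a classical soundness theorem (verifiers)."""
    return all(classify(o) != "false negative" for o in outcomes)


def no_false_positives(outcomes: Iterable[Outcome]) -> bool:
    """Shape of a True Positives theorem (bug finders)."""
    return all(classify(o) != "false positive" for o in outcomes)


# A bug finder that misses one bug but never raises a false alarm.
bug_finder = [Outcome(True, True), Outcome(True, False), Outcome(False, False)]
print(no_false_positives(bug_finder), no_false_negatives(bug_finder))  # True False
```

An analyser that reports nothing at all also satisfies `no_false_positives`, so the property says something useful only together with an evaluation of how many reports survive.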

slide-26
SLIDE 26

A True Positives Theorem for a Static Race Detector

Ilya Sergey, Nikos Gorogiannis, Peter O’Hearn

slide-27
SLIDE 27

Key Messages

Unsound (and incomplete) static analyses can be principled, satisfying meaningful theorems that help to understand their behaviour and guide their design.

One can have an unsound but effective static analysis, which has significant industrial impact, and which is supported by a meaningful theorem.

slide-28
SLIDE 28

Context

1. We had a demonstrably-effective industrial analysis: RacerD (OOPSLA'18); >3k fixes in Facebook Java
2. No soundness theorem
3. Architecture: compositional abstract interpreter
4. No heuristic alarm filtering

Just ad hoc? Our reaction: Semantics/theory should understand/explain, not lecture.

slide-29
SLIDE 29

Case Study: RacerDX

  • A provably TP-Sound version of Facebook’s RacerD concurrency analyser (Blackshear et al., OOPSLA’18)
  • Buggy executions: data races in lock-based concurrent programs
  • Syntactic assumptions: Java programs with well-scoped locking (synchronized); no recursion, reflection, or dynamic class loading; global variables are ignored.
  • Concrete over-approximation: loops and conditionals are non-deterministic.
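The concrete over-approximation (conditionals become non-deterministic) can be illustrated on a toy command language. This is a sketch under stated assumptions: the `Act` and `If` classes and both functions are hypothetical and model only straight-line actions and conditionals, not loops or the paper's actual semantics.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Act:
    name: str  # an atomic action, identified by its text


@dataclass
class If:
    cond: Callable[[int], bool]  # guard over a single input n
    then: List["Cmd"]
    orelse: List["Cmd"]


Cmd = Union[Act, If]


def concrete_path(prog: List[Cmd], n: int) -> List[str]:
    """The single path taken when guards are evaluated on input n."""
    path: List[str] = []
    for c in prog:
        if isinstance(c, If):
            path += concrete_path(c.then if c.cond(n) else c.orelse, n)
        else:
            path.append(c.name)
    return path


def nondet_paths(prog: List[Cmd]) -> List[List[str]]:
    """All paths when every guard is replaced by `if (*)`."""
    paths: List[List[str]] = [[]]
    for c in prog:
        if isinstance(c, If):
            branches = nondet_paths(c.then) + nondet_paths(c.orelse)
            paths = [p + b for p in paths for b in branches]
        else:
            paths = [p + [c.name] for p in paths]
    return paths


VERY_UNLIKELY_VALUE = 123456  # placeholder constant for the slide's example
prog = [If(lambda n: n == VERY_UNLIKELY_VALUE, [Act("bug.explode()")], [Act("skip")])]
print(concrete_path(prog, 0))  # ['skip']
print(nondet_paths(prog))      # [['bug.explode()'], ['skip']]
```

Every concrete path appears among the non-deterministic ones, so a bug found in the over-approximated semantics is a true positive relative to that semantics, even if the guard makes it unlikely in practice.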

slide-30
SLIDE 30

Formal Result

RacerDX enjoys the True Positives Theorem wrt. Data Race Detection

(Details in the paper)

slide-31
SLIDE 31

Static Analysis with True Positives Theorem*

Goal: to build a static analysis such that if the analysis reports a bug, it is a true bug

* For an Idealized Language

slide-32
SLIDE 32

True bug can be exhibited

The race reported by the analysis for program P is a true race: there exists an execution of P that exhibits the race.

slide-33
SLIDE 33

Ingredients of the formalism

  • program
  • execution
  • race
  • analysis
  • proof

For an Idealized Language

slide-34
SLIDE 34

Ingredients of a data race

Racy Program:

b = new Bloop()
u = new Burble()
u.meps(b) || u.reps(b)

Method bodies:

lock()
println(b.f)
unlock()

b.f := 42

[Annotations: b.f is a path; || is parallel composition.]

slide-35
SLIDE 35

Concurrent program syntax

5

slide-36
SLIDE 36

Single-threaded program C: concrete semantics

  • State: (command, stack, heap, locks)
  • Trace: list of states
  • Concrete semantics: set of traces

slide-37
SLIDE 37

Concrete semantics of commands

7

slide-38
SLIDE 38

Concrete semantics of commands

[Annotations on the rules: empty trace; top stack frame; location pointed to by π given stack s & heap h; value of var x; state components: command, stack, heap, # of locks.]

8

slide-39
SLIDE 39

Concrete semantics of compound statements

(1) run C, get all its traces
(2) take the last state of each trace
(3) run c from the last state, get its traces
(4) glue traces of C and c together
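The four steps above can be sketched as a function on sets of traces. This is an illustrative sketch, assuming traces are non-empty tuples of states and that the semantics of the command c is given as a hypothetical function `run_c` from a start state to a set of traces; partial traces are left out for brevity.

```python
from typing import Callable, Hashable, Set, Tuple

State = Hashable
Trace = Tuple[State, ...]


def seq(traces_of_C: Set[Trace], run_c: Callable[[State], Set[Trace]]) -> Set[Trace]:
    """Traces of `C; c`, built from the traces of C and the semantics of c."""
    result: Set[Trace] = set()
    for trace in traces_of_C:            # (1) all traces of C
        last = trace[-1]                 # (2) last state of each trace
        for suffix in run_c(last):       # (3) traces of c started from that state
            result.add(trace + suffix)   # (4) glue the two together
    return result


# Toy instance: states are integers and c deterministically increments the state.
traces_of_C = {(0, 1), (0, 2)}
increment = lambda s: {(s + 1,)}
print(sorted(seq(traces_of_C, increment)))  # [(0, 1, 2), (0, 2, 3)]
```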

9

slide-40
SLIDE 40

Concrete trace example

Program:

lock()
a.f := 5
unlock()

Stack s0: a ↦ p1
Heap h0: (p1, f) ↦ p2, (p2, _) ↦ 666
Heap h1: (p1, f) ↦ p2, (p2, _) ↦ 5

Initial state: 〈skip, s0, h0, 0〉 (no lock)

Program execution trace:

[ 〈lock(), s0, h0, 1〉,
  〈a.f:=5, s0, h1, 1〉,
  〈unlock(), s0, h1, 0〉 ]

[Annotation: p1 and p2 are memory locations.]
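The trace on this slide can be replayed with a few lines of code. This is a sketch, assuming commands are encoded as hypothetical tuples (`("lock",)`, `("store", var, field, value)`, `("unlock",)`) and that a field's cell lives at `(location, "_")`, mirroring the heap layout shown above.

```python
from typing import Dict, List, Tuple

Stack = Dict[str, str]
Heap = Dict[Tuple[str, str], object]


def run(commands: List[tuple], stack: Stack, heap: Heap) -> List[tuple]:
    """Return the trace: one state (command, stack, heap, locks) per command."""
    locks = 0
    trace = []
    for cmd in commands:
        kind = cmd[0]
        if kind == "lock":
            locks += 1
        elif kind == "unlock":
            locks -= 1
        elif kind == "store":                    # var.field := value
            _, var, field, value = cmd
            loc = heap[(stack[var], field)]      # location that var.field points to
            heap = {**heap, (loc, "_"): value}   # fresh heap, earlier states stay intact
        trace.append((cmd, stack, heap, locks))
    return trace


s0 = {"a": "p1"}
h0 = {("p1", "f"): "p2", ("p2", "_"): 666}
trace = run([("lock",), ("store", "a", "f", 5), ("unlock",)], s0, h0)
for cmd, _, heap, locks in trace:
    print(cmd, heap[("p2", "_")], locks)
```

The three printed states carry the cell values 666, 5, 5 and the lock counts 1, 1, 0, matching h0, h1, h1 and the lock column of the trace above.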

slide-41
SLIDE 41

Two-threaded program C1 || C2: concrete semantics

  • State
  • Trace
  • Concrete semantics

[Annotation: c || ε or ε || c]

slide-42
SLIDE 42

2-threaded program interleaves single traces

(1) run components individually
(2) interleave all individual traces (full and partial)

12

slide-43
SLIDE 43

Concurrent traces example

Thread 1: lock() x := 5 unlock() print(1)
Thread 2: lock() x := 777 unlock() print(2)

Individual traces:

lock() || ε, x := 5 || ε, unlock() || ε, print(1) || ε
ε || lock(), ε || y := 777, ε || unlock(), ε || print(2)

Interleave (taking care of locks):

lock() x := 5 unlock() lock() y := 777 unlock() print(1)
lock() x := 5 lock() lock() y := 777 unlock() print(2) lock() … …
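Interleaving "taking care of locks" can be sketched as a small enumeration. This is a sketch under assumptions not stated on the slide: there is a single global, non-reentrant lock, commands are plain strings, and only complete interleavings are produced.

```python
from typing import Iterator, List, Optional, Tuple

Step = Tuple[int, str]  # (thread id, command)


def interleavings(t1: List[str], t2: List[str]) -> Iterator[Tuple[Step, ...]]:
    """Yield every complete interleaving in which lock() is never taken while held."""

    def go(i: int, j: int, holder: Optional[int], acc: Tuple[Step, ...]):
        if i == len(t1) and j == len(t2):
            yield acc
            return
        for tid, pos, cmds in ((1, i, t1), (2, j, t2)):
            if pos == len(cmds):
                continue                      # this thread has finished
            cmd = cmds[pos]
            if cmd == "lock()" and holder is not None:
                continue                      # blocked: the lock is taken
            if cmd == "lock()":
                new_holder = tid
            elif cmd == "unlock()":
                new_holder = None
            else:
                new_holder = holder
            ni, nj = (i + 1, j) if tid == 1 else (i, j + 1)
            yield from go(ni, nj, new_holder, acc + ((tid, cmd),))

    yield from go(0, 0, None, ())


thread1 = ["lock()", "x := 5", "unlock()", "print(1)"]
thread2 = ["lock()", "y := 777", "unlock()", "print(2)"]
all_traces = list(interleavings(thread1, thread2))
print(len(all_traces))  # 10 (there would be 70 if locks were ignored)
```

The two critical sections can never overlap, so only the trailing print of whichever thread goes first is free to move among the other thread's steps.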

13

slide-44
SLIDE 44

Data race means concurrent access to location

14

slide-45
SLIDE 45

Data race means concurrent access to location

Thread 1: print(1) lock() a.f := 5 unlock()
Thread 2: print(2) y := a.f

Traces:

print(1) || ε, ε || print(2), lock() || ε, a.f := 5 || ε
print(1) || ε, ε || print(2), lock() || ε, ε || y := a.f, …

After the common prefix print(1) || ε, ε || print(2), lock() || ε, the next steps are a.f := 5 || ε and ε || y := a.f: concurrent Write || Read to the same location, no lock!
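The brute-force view of this definition is a search over reachable interleaving states. This is a sketch, assuming a single global non-reentrant lock, a hypothetical tuple encoding of commands, and that equal path strings denote the same location (that is, no wobbliness); a state counts as racy when both threads are about to access the same path and at least one access is a write.

```python
from typing import List, Optional, Set, Tuple

ACCESS = {"read", "write"}


def has_race(t1: List[tuple], t2: List[tuple]) -> bool:
    """Explore states (i, j, lock holder) and look for a racy pair of next commands."""
    seen: Set[Tuple[int, int, Optional[int]]] = set()
    todo = [(0, 0, None)]
    while todo:
        state = todo.pop()
        if state in seen:
            continue
        seen.add(state)
        i, j, holder = state
        if i < len(t1) and j < len(t2):
            a, b = t1[i], t2[j]
            if (a[0] in ACCESS and b[0] in ACCESS
                    and a[1] == b[1] and "write" in (a[0], b[0])):
                return True
        for tid, pos, cmds in ((1, i, t1), (2, j, t2)):
            if pos == len(cmds):
                continue
            kind = cmds[pos][0]
            if kind == "lock" and holder is not None:
                continue                  # blocked on the lock
            if kind == "lock":
                new_holder = tid
            elif kind == "unlock":
                new_holder = None
            else:
                new_holder = holder
            todo.append((i + 1, j, new_holder) if tid == 1 else (i, j + 1, new_holder))
    return False


writer = [("print", 1), ("lock",), ("write", "a.f"), ("unlock",)]
unlocked_reader = [("print", 2), ("read", "a.f")]
locked_reader = [("print", 2), ("lock",), ("read", "a.f"), ("unlock",)]
print(has_race(writer, unlocked_reader))  # True
print(has_race(writer, locked_reader))    # False
```

The search visits every reachable combination of thread positions, which is what the abstract analysis on the following slides is designed to avoid.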

15

slide-46
SLIDE 46

Can we identify a data race without building the traces?

16

slide-47
SLIDE 47

Abstract Semantics

  • Abstract State: (wobblies, locks, path accesses)
  • Abstraction of a set of concrete single-threaded traces

[Annotations: tracks accesses to memory locations; helps identify true races.]

slide-48
SLIDE 48

Wobblies can evade data races (produce false positives)

same path b.f refers to different locations

18

slide-49
SLIDE 49

Abstraction keeps track of accesses and wobblies

(1) abstract the beginning of the trace 𝜐 without heap, stack, locks
(2) remember access to π
(3) mark x & π as wobbly

Abstract set of traces using exec

[Annotations: substitution (needed for method calls); discard substitution.]

slide-50
SLIDE 50

Why is reading wobbly?

lock()
a.f.n := 5
unlock()

x := a // a.f.n
x.f := new …
y := a.f.n

Stack: a ↦ p1
Heap before x.f := new …: (p1, f) ↦ p2, (p2, n) ↦ p3
Heap after x.f := new …: (p1, f) ↦ p4, (p2, n) ↦ p3, (p4, n) ↦ p5

same path a.f.n refers to different locations, so there is no race
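The effect on this slide can be checked by resolving the path a.f.n against the two heaps. This is a minimal sketch, assuming the heap maps (location, field) pairs to locations as drawn on the slide; the function name `resolve` is hypothetical.

```python
from typing import Dict, Tuple

Stack = Dict[str, str]
Heap = Dict[Tuple[str, str], str]


def resolve(path: str, stack: Stack, heap: Heap) -> str:
    """Follow a path such as 'a.f.n' from the stack through the heap."""
    var, *fields = path.split(".")
    loc = stack[var]
    for field in fields:
        loc = heap[(loc, field)]
    return loc


stack = {"a": "p1"}
heap_before = {("p1", "f"): "p2", ("p2", "n"): "p3"}
heap_after = {("p1", "f"): "p4", ("p2", "n"): "p3", ("p4", "n"): "p5"}

print(resolve("a.f.n", stack, heap_before))  # p3
print(resolve("a.f.n", stack, heap_after))   # p5
```

The write under the lock and the later read use the same syntactic path but reach different locations, which is why a purely syntactic match on paths would be a false positive here.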

slide-51
SLIDE 51

Abstract access captures concrete access

If a path access is recorded in the abstract state, there is a concrete trace exhibiting the access

21

slide-52
SLIDE 52

Stable path preserves memory location

If a good path is not wobbly, it preserves memory location along a trace

〈skip, s, h, L〉 … 〈c, s’, h’, L’〉

[Diagram: π points to the same location p in Heap h and in Heap h’.]

slide-53
SLIDE 53

Static Analysis

Galois connection

  • Does not need traces
  • Compositional
  • Complete wrt. abstraction
  • Operates in abstract domain, enjoys benefits of the abstraction

slide-54
SLIDE 54

True Positives Theorem

[Annotations on the theorem statement:]

  • C1 and C2 access the same path π in W || W or W || R or R || W, and π is not wobbly
  • π refers to the same location in C1 & C2 in the initial state, and it still refers to the same location when concurrently accessed
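The reporting condition described by these annotations can be sketched over per-thread summaries. This is not RacerDX's implementation: `Access`, `Summary` and `reported_races` are hypothetical names, and the `locked` flag (an access pair is skipped when both sides hold the lock) is an added assumption standing in for the lock component of the abstract state.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple


class Access(NamedTuple):
    kind: str     # "R" or "W"
    path: str     # syntactic access path, e.g. "a.f"
    locked: bool  # assumption: whether the access happens under the lock


class Summary(NamedTuple):
    wobbly: FrozenSet[str]        # paths that may change the location they denote
    accesses: FrozenSet[Access]


def reported_races(s1: Summary, s2: Summary) -> Set[Tuple[str, str, str]]:
    """Pairs of accesses to the same non-wobbly path, at least one of them a write."""
    races = set()
    for a in s1.accesses:
        for b in s2.accesses:
            if a.path != b.path:
                continue
            if a.kind == "R" and b.kind == "R":
                continue                  # R || R is never a race
            if a.locked and b.locked:
                continue                  # both accesses are protected
            if a.path in s1.wobbly or a.path in s2.wobbly:
                continue                  # path may denote different locations
            races.add((a.kind, b.kind, a.path))
    return races


writer = Summary(frozenset(), frozenset({Access("W", "a.f", True)}))
reader = Summary(frozenset(), frozenset({Access("R", "a.f", False)}))
wobbly_reader = Summary(frozenset({"a.f"}), frozenset({Access("R", "a.f", False)}))

print(reported_races(writer, reader))         # {('W', 'R', 'a.f')}
print(reported_races(writer, wobbly_reader))  # set()
```

Dropping reports on wobbly paths is what trades completeness for the absence of false positives: the second call stays silent even though a race might exist.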

24

slide-55
SLIDE 55

Evaluation

What is the price to pay for having the TP Theorem?

(Reporting no bugs whatsoever is TP-Sound)

39

slide-56
SLIDE 56

RacerD vs RacerDX

Target         LOC    D CPU  DX CPU  CPU ±%  D Reps  DX Reps  Reps ±%
avrora         76k    103    102     0.4%    143     92       36%
Chronicle-Map  45k    196    196     0.1%    2       2        0%
jvm-tools      33k    106    109     -3.6%   30      26       13%
RxJava         273k   76     69      9.2%    166     134      19%
sunflow        25k    44     44      -1.4%   97      42       57%
xalan-j        175k   144    137     5.0%    326     295      10%

(b) Evaluation results. CPU columns are in seconds; Reps are distinct reports;
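The Reps ±% column can be recomputed from the two report counts. This is a small check, assuming the column is the relative reduction in distinct reports from RacerD to RacerDX, rounded to a whole percent; the slide does not state the formula, but this reading reproduces all six values.

```python
# (D Reps, DX Reps) per target, copied from the evaluation table.
reports = {
    "avrora": (143, 92),
    "Chronicle-Map": (2, 2),
    "jvm-tools": (30, 26),
    "RxJava": (166, 134),
    "sunflow": (97, 42),
    "xalan-j": (326, 295),
}


def reduction_pct(d_reps: int, dx_reps: int) -> int:
    """Share of RacerD reports that RacerDX no longer emits, in percent."""
    return round(100 * (d_reps - dx_reps) / d_reps)


for target, (d, dx) in reports.items():
    print(f"{target}: {reduction_pct(d, dx)}%")
```

The CPU ±% column is not recomputed here because the CPU times in the table are already rounded to whole seconds.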

40

slide-59
SLIDE 59

The Artifact

At a Glance

slide-60
SLIDE 60

Contents of Artifact

  • Data: source code of 6 packages
  • Facebook’s Infer package (OCaml code, Git repo): holds RacerD
  • RacerDX Patch
  • Set of Bash scripts to:

○ clean up
○ run vanilla Infer
○ patch Infer and run the patched version
○ collect stats

  • README: dependencies, entry points to run scripts, etc.
slide-61
SLIDE 61

6 packages (incl. 2 invalid) by Build Technology

  • Ant: 3 pkgs (avrora, sunflow, xalan-j)
  • Gradle: 1 pkg (RxJava)
  • Maven: 2 pkgs (Chronicle-Map, jvm-tools) — the invalid ones
slide-62
SLIDE 62

Reproduction & reanalysis

War Stories

slide-63
SLIDE 63

Repetition

  • First try: failed because the installed Java was too new (noted in README):

error: as of release 9, '_' is a keyword, and may not be used as an identifier

  • Second try: failed with an unrecognized parameter to cloc (not noted in the README).
  • Third try: partial success; numbers for RacerD are slightly off (???)

○ Reason: missing native dependency

  • Finally, numbers for 4 packages did reproduce.
slide-64
SLIDE 64

What’s Wrong with Maven?

slide-65
SLIDE 65

Authors’ words (README)

“Since submission of the paper for review, the sources of two of the projects (Chronicle-Map and jvm-tools) we used for evaluation became uncompilable (due to how maven works -- it always downloads dependencies from the internet, and it seems the newer versions are breaking the build of the version we originally tested).”

Dependencies? This should have to do with the build scripts!

slide-66
SLIDE 66

pom.xml of Chronicle-map

Remember: “always downloads dependencies from the internet”… Release Artifact

slide-67
SLIDE 67

pom.xml of jvm-tools (I)

Release Artifact

slide-68
SLIDE 68

pom.xml of jvm-tools (II)

slide-69
SLIDE 69

Infer’s bug in management of pom.xml

  • Error message like the one in the artifact (when enabling stderr):

Error while running epilogue restoring Maven's pom.xml to its original state: (Unix.Unix_error "No such file or directory" rename "((src /data/videoRecorder/videoRecorder-rpm/pom.xml.infer-orig) (dst /data/videoRecorder/videoRecorder-rpm/pom.xml))").

  • There’s a fix also!
slide-70
SLIDE 70

The Fix

slide-71
SLIDE 71

Happy End with Repetition

After

  • Applying the Infer fix (kudos to the authors for preserving the Git repo)
  • Checking out released versions of Maven-based packages

We were able to get the numbers from the paper.

slide-72
SLIDE 72

Our Experiments (mostly support the claims)

  • Full aws-sdk-java died with disk overflow (hundreds of gigabytes of reports)

○ Just one module (aws-java-sdk-s3):
○ test-aws-sdk-java, 3’847’035, 666, 639, 64, 48

  • spring-kafka: success, equal results:

○ test-spring-kafka, 30’461, 31, 31, 16, 16

  • azkaban: success, equal, zeroes:

○ test-azkaban, 76’156, 0, 0, 0, 0

slide-73
SLIDE 73

Race Report Example

slide-74
SLIDE 74

Report Subsets RacerD RacerDX

slide-75
SLIDE 75

Takeaways: How NOT To Make an Artifact

  • No Environment Management (e.g. a VM, Docker, Nix, etc.):

a. a bunch of source code (sometimes non-released versions; not tracked by a VCS, e.g. Git)
b. (loose) description of dependencies (some dependencies didn’t have corresponding versions, e.g. cloc)
c. no way to account for transitive deps of tools, esp. native deps (e.g. sqlite3-dev)

  • Clearing the $PATH: poor man’s env management
  • Piping stdout and stderr (!!!) to /dev/null