specification free monitoring (CS 119): can we avoid writing specs?
specification and programming

(diagram: a program maps input to output; runtime verification checks the program against a specification)
properties can be hard to write
To quote a quite excellent NASA software engineer, when asked what properties his system would have to satisfy: "I have absolutely no idea what properties this system should satisfy!"

The problem is real: can we avoid writing specs?
generic checking-algorithms

(diagram: a program maps input to output; runtime verification checks the program using a generic algorithm instead of a specification)
learning specifications

(diagram: a program maps input to output; specification learning produces a specification, which can also be visualized)
how to avoid writing specs
- generate specs from runs (reverse dynamic engineering)
  – instrument the program to detect certain events
  – collect information during one or more runs and establish a database of results
  – this database accumulates a specification of nominal behavior
- error checking algorithms for specific narrow problems
  – data races
  – high-level data races
  – atomicity violations
  – deadlocks
learning specs from runs
- Daikon, learning JML invariants:
  – http://groups.csail.mit.edu/pag/daikon/
- Perracotta, learning temporal formulas:
  – http://www.cs.virginia.edu/perracotta
- Redux, learning call graphs etc.:
  – http://valgrind.org/docs/redux2003.ps
so many traces
(cartoon: an observer inspects many good runs and a bad run: "Hmm ... looks pretty good to me.")
algorithmic “potential” analysis
(cartoon: the observer spots the bad run: "Shoot ... some footprints of a bug!")

The idea: turn a hard-to-test property into a stronger, easy-to-test property.
pros and cons
+ scales well (one trace)
+ often finds the bugs it is supposed to find
- gives false positives
- gives false negatives
- only special bugs
data races
what is a data race?
The traditional definition: a data race occurs when two concurrent threads access a shared variable, at least one access is a write, and the threads use no explicit mechanism to prevent the accesses from being simultaneous.
a classic Java example
Let's consider the method increase(), which is part of a class that acts as a counter:

    public void increase() {
        counter++;
    }

Although written as a single "increment" operation, the "++" operator is actually mapped onto three JVM instructions: load operand, increment, write-back.
example – continued
(diagram: thread A executes "load operand" on counter = 3; a context switch lets thread B run "load operand, increment, write-back"; A then resumes with "increment, write-back" and overwrites B's update, so counter ends at 4 instead of 5)

We shall refer to this traditional notion of data race as a low-level data race, since it focuses on a single variable.
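The lost update on the slide can be made concrete by simulating the three JVM steps as explicit local variables. This is a sketch: the class and method names are ours, and a real race is of course non-deterministic; here the bad interleaving is forced by hand.

```java
// Sketch (names ours): deterministically replay the interleaving in
// which both threads load the counter before either writes back.
public class LostUpdate {
    static int interleavedIncrements(int initial) {
        int counter = initial;
        int loadA = counter;    // thread A: load operand
        int loadB = counter;    // thread B: load operand (context switch)
        int incA = loadA + 1;   // thread A: increment (private copy)
        int incB = loadB + 1;   // thread B: increment (private copy)
        counter = incA;         // thread A: write-back
        counter = incB;         // thread B: write-back overwrites A's update
        return counter;         // one increment is lost
    }

    public static void main(String[] args) {
        System.out.println(interleavedIncrements(3)); // prints 4, not 5
    }
}
```

Two increments of 3 should give 5; the forced interleaving yields 4.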
low-level data races
- the standard way to avoid low-level data races on a variable is to protect the variable with a lock: all accessing threads must acquire this lock before accessing the variable, and release it again after.
- there exist several algorithms for analyzing multi-threaded programs for low-level data races:
  – we will mention the Eraser algorithm here (Savage et al. 97), developed for C
  – and Racer, developed for Java
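The core of Eraser's lockset refinement can be sketched in a few lines (all names below are ours; real Eraser works on binaries and additionally tracks initialization and read-shared states, which this sketch omits): each variable's candidate lock set is intersected with the locks the accessing thread holds, and an empty result means no single lock consistently protects the variable.

```java
import java.util.*;

// Sketch (names ours) of the Eraser-style lockset refinement.
public class LocksetSketch {
    private final Map<String, Set<String>> candidates = new HashMap<>();

    /** Record an access to var by a thread holding the locks in held.
     *  Returns true if a race potential is reported (lockset empty). */
    public boolean access(String var, Set<String> held) {
        Set<String> cs = candidates.get(var);
        if (cs == null) {                       // first access: start with held locks
            candidates.put(var, new HashSet<>(held));
            return held.isEmpty();
        }
        cs.retainAll(held);                     // refine: intersect with held locks
        return cs.isEmpty();
    }
}
```

For example, if one thread accesses x holding {L1} and another holding {L2}, the intersection is empty and a warning is issued.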
Racer
an algorithm for detecting data races in Java programs (shifting to the Racer slides; send email to havelund@gmail.com to obtain these)
high-level data races
work by Cyrille Artho, Klaus Havelund, and Armin Biere; slides by Cyrille Artho included
data race
    void swap() {
        lx = c.x; ly = c.y;
        c.x = ly; c.y = lx;
    }

    void reset() {
        synchronized(this){ c.x = 0; }
        synchronized(this){ c.y = 0; }
    }

Pair of coordinates x and y, accessed by two threads. Problem: thread order is non-deterministic. Data corruption possible! Lock protection needed.
repairing the situation: protecting x and y in swap
    void swap() {
        synchronized(this){ lx = c.x; ly = c.y; }
        synchronized(this){ c.x = ly; c.y = lx; }
    }

    void reset() {
        synchronized(this){ c.x = 0; }
        synchronized(this){ c.y = 0; }
    }

All field accesses are synchronized: Racer reports no errors. There is no classical data race for these threads, but clearly undesired behavior! Problem: swap may run while reset is in progress. For example, starting from (5,8), interleaving the partial resets with swap can leave the pair at (8,0): the result is neither a swap nor a reset. This is a high-level data race.
the problem
- the reset method releases its lock in between setting x and then setting y.
- this gives the swap method the chance to interleave the two partial resets.
- the swap method "has it right": it holds its lock during the operations on x and y.
the solution
- this difference in views can be detected dynamically.
- essentially, this approach tries to infer what the developer intended when writing the multi-threaded code, by discovering view inconsistencies.
- it depends on at least one thread getting it right.
some examples of views:
Thread a:
    synchronized(L) { access(x); access(y); }
  views: { {x,y} }

Thread b:
    synchronized(L) { access(x); }
    synchronized(L) { access(y); }
  views: { {x},{y} }

Thread c:
    synchronized(L) { access(x); }
    synchronized(L) { access(x); access(y); }
  views: { {x},{x,y} }

Thread d:
    synchronized(L) { access(x); }
  views: { {x} }

Inconsistent: a and b. Consistent: a and c, a and d.

Views express, per thread, which fields are guarded by a lock.
the algorithm
1) For each thread, for each lock, identify all fields covered by that lock (views).
2) For each thread, find the views that have no other view containing them (maximal views).
3) For each pair of threads t1 and t2: find the intersections between t1's maximal views and the views of t2.
4) Verify that those intersections form a chain, that is: s1 ⊆ s2 ⊆ s3 ⊆ ⋯
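These steps can be sketched in Java (all names are ours; a view is modelled as a set of field names, and each thread contributes a list of views):

```java
import java.util.*;

// Sketch (names ours) of the view-consistency check.
public class ViewConsistency {
    /** Maximal views: views not strictly contained in another view. */
    static List<Set<String>> maximal(List<Set<String>> views) {
        List<Set<String>> max = new ArrayList<>();
        for (Set<String> v : views) {
            boolean dominated = false;
            for (Set<String> w : views)
                if (!w.equals(v) && w.containsAll(v)) { dominated = true; break; }
            if (!dominated) max.add(v);
        }
        return max;
    }

    /** Check mutual compatibility between all threads' views. */
    static boolean consistent(List<List<Set<String>>> threads) {
        for (int i = 0; i < threads.size(); i++)
            for (int j = 0; j < threads.size(); j++) {
                if (i == j) continue;
                for (Set<String> vm : maximal(threads.get(i))) {
                    // overlap(t_j, vm): all non-empty intersections
                    List<Set<String>> overlaps = new ArrayList<>();
                    for (Set<String> v : threads.get(j)) {
                        Set<String> inter = new HashSet<>(v);
                        inter.retainAll(vm);
                        if (!inter.isEmpty()) overlaps.add(inter);
                    }
                    // chain check: every pair must be comparable by inclusion
                    for (Set<String> a : overlaps)
                        for (Set<String> b : overlaps)
                            if (!a.containsAll(b) && !b.containsAll(a))
                                return false;
                }
            }
        return true;
    }
}
```

On the swap/reset example, swap's views are {{x,y}} and reset's are {{x},{y}}; the overlaps {x} and {y} are incomparable, so the check reports an inconsistency.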
low-level versus high-level data races
(diagram) Low-level: for each variable, a lock set. High-level: for each lock, a variable set (possibly several per lock).
applying algorithm to example
    void swap() {
        synchronized(this){ lx = c.x; ly = c.y; }
        synchronized(this){ c.x = ly; c.y = lx; }
    }

    void reset() {
        synchronized(this){ c.x = 0; }
        synchronized(this){ c.y = 0; }
    }

The maximal view of swap is {x,y}. The overlaps of reset's views with it are {x} and {y}. Since {x} ⊄ {y} and {y} ⊄ {x}, the overlaps do not form a chain: a high-level data race is reported.
a formal view
Let I be the set of object instances generated by a particular run of a Java program, and let F be the set of all fields of all instances in I.

Let l be a lock, t a thread, and B(t,l) the set of all synchronized blocks using lock l executed by thread t.

For b ∈ B(t,l), the view generated by t with respect to l in b is the set of fields accessed in b by t. A view v ∈ P(F) is thus a subset of F.

The set of generated views V(t) ⊆ P(F) of a thread t is the set of all views v generated by t.
maximal view
A view vm generated by a thread t is a maximal view if it is maximal with respect to set inclusion in V(t):

    ∀v ∈ V(t) [(vm ⊆ v) → (vm = v)]

Note that this definition allows more than a single maximal view. Let M(t) denote the set of all maximal views of thread t.
overlapping
Only two views which have fields in common can be responsible for a conflict. This observation is the motivation for the next definition:
Given a set of views V(t) generated by t, and a view v’ generated by another thread, the overlapping views of t with v’ are all non-empty intersections of views in V(t) with v’ :
    overlap(t, v’) = { v ∩ v’ | (v ∈ V(t)) ∧ (v ∩ v’ ≠ Ø) }
view compatibility

A set of views V(t) is compatible with the maximal view vm of another thread if all overlapping views of t with vm form a chain:

    compatible(t, vm) iff ∀v1, v2 ∈ overlap(t, vm) [(v1 ⊆ v2) ∨ (v2 ⊆ v1)]
view consistency
View consistency is mutual compatibility between all threads: a thread is only allowed to use views that are compatible with the maximal views of all other threads:

    ∀t1 ≠ t2, vm ∈ M(t1) [compatible(t2, vm)]
HL data race in Remote Agent

(diagram: a Task performs update() on the Database and set() on a Flag; a Monitor runs "if( ... & not ok( ... )) issueWarning()")
neither sound nor complete
False positives when one thread uses coarser locking than required, for efficiency.

False negatives when:
- all threads use the same locking
- the random execution trace does not expose the problem
so, what is it good for?
- much higher chance of detecting an error than if one relies on actually executing the particular interleaving that leads to the error, without requiring much computational resources.
- developers seem to follow the guideline of view consistency to a surprisingly large extent.
atomicity checking: stale (outdated) values

work by Cyrille Artho, Klaus Havelund, and Armin Biere; slides by Cyrille Artho included
recall the high level data race
    void swap() {
        synchronized(this){ lx = c.x; ly = c.y; }
        synchronized(this){ c.x = ly; c.y = lx; }
    }

    void reset() {
        synchronized(this){ c.x = 0; }
        synchronized(this){ c.y = 0; }
    }

Problem: swap may run while reset is in progress!
repairing the situation: making reset atomic
    void swap() {
        synchronized(this){ lx = c.x; ly = c.y; }
        synchronized(this){ c.x = ly; c.y = lx; }
    }

    void reset() {
        synchronized(this){ c.x = 0; c.y = 0; }
    }

Problem:
- reset may run while swap is in progress!
- swap then continues operating on outdated values: starting from (5,8), swap reads (5,8), reset yields (0,0), and swap's write-back produces (8,5) — the reset has no effect.
the problem: data flow across synchronized blocks
    void swap() {
        synchronized(this){ lx = c.x; ly = c.y; }   // lx and ly defined: values stored locally
        synchronized(this){ c.x = ly; c.y = lx; }   // lx and ly used: may be outdated
    }

Shared data "escape" beyond the first synchronized block! The algorithm checks whether shared data escape synchronized blocks.
algorithm
- enumerate synchronized blocks.
- mark values as shared or unshared.
- mark local variables with:
  – the identity of the synchronized block where they were defined.
  – whether they contain a shared value.
- for each use of a local variable containing a shared value, check: block where used = block where defined.
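The bookkeeping behind these steps can be sketched as follows (names ours; a real implementation works on bytecode and tracks sharedness through the operand stack, which this sketch abstracts away):

```java
import java.util.*;

// Sketch (names ours) of the stale-value check: each local assigned
// from shared data is tagged with the synchronized block in which it
// was defined; a use in a different block is flagged as possibly stale.
public class StaleValueCheck {
    private final Map<String, Integer> defBlock = new HashMap<>();

    /** Record that local var was defined from shared data in block. */
    public void define(String var, int block) {
        defBlock.put(var, block);
    }

    /** True if using var in block may operate on an outdated value. */
    public boolean staleUse(String var, int block) {
        Integer def = defBlock.get(var);
        return def != null && def != block;   // used outside defining block
    }
}
```

On the swap example: lx is defined in block 1 and used in block 2, so the use is flagged.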
determining sharedness
If an instruction creates stack elements (getfield, method call):
– if inside a synchronized block: the stack elements generated are marked as shared
– else: the stack elements generated are marked as local
determining sharedness
of return values of methods

    // method call inside synchronization: return value is shared
    synchronized(this){ lx = c.getX(); }     int getX() { return x; }

    // method call outside synchronization, callee uses synchronization:
    // return value is shared
    lx = c.getX();     synchronized int getX() { return x; }

    // method call outside synchronization, no synchronization in callee:
    // return value is local
    lx = c.getX();     int getX() { return x; }
deadlocks
work by Saddek Bensalem and Klaus Havelund
resource deadlocks
A resource deadlock can occur when two or more threads block each other in a cycle while trying to acquire synchronization locks (held by the other threads) needed to continue their activities.
a Java example: a basic scenario with synchronized statements

    T1: synchronized(L1){
          ...
          synchronized(L2){
            ...
          }
          ...
        }

    T2: synchronized(L2){
          ...
          synchronized(L1){
            ...
          }
          ...
        }

Deadlock: T1 takes L1 and T2 takes L2; each then blocks waiting for the lock the other holds.
a second Java example
    class Value {
        int x = 1;
        synchronized void add(Value v) { x = x + v.get(); }
        synchronized int get() { return x; }
    }

synchronized methods, dynamic locks
a second Java example
    class Value {
        int x = 1;
        synchronized void add(Value v) { x = x + v.get(); }
        synchronized int get() { return x; }
    }

    v1 = new Value(); v2 = new Value();

    Thread T1: v1.add(v2)        Thread T2: v2.add(v1)

Deadlock potential: T1 locks v1 and then needs v2 (for get), while T2 locks v2 and then needs v1. Synchronized methods, dynamic locks.
a third Java example: dining philosophers (N = number of philosophers)

    class Main {
        Fork[] forks = new Fork[N];
        ...
        for (int i = 0; i < N; i++) {
            new Philosopher(forks[i], forks[(i+1)%N]);
        }
    }

    Philosopher:
        while (count < 10) {
            synchronized(left) {
                synchronized(right) { count++; }
            }
        }

Dynamic locking and index arithmetic: a challenge for static analysis. (A classical repair is a global lock: synchronized(salt_shaker).)
analyzing the dining phil. with static analysis and model checking
- Jlint: a static Java analyzer
  – no deadlocks found in the deadlocking version
- JPF: a Java model checker
  – deadlocking version:
    - N=15 (32 seconds)
    - N=20 (2.51 minutes)
    - N=21 (out of memory)
  – deadlock-free version:
    - N=3 (3 minutes)
    - N=4 (26 minutes and out of memory)
detecting cycles in lock graphs

extract trace → compute graph → detect cycles → error report ("Error X in line Y ...")
basic algorithm (Harrow 2000)
    T1: synchronized(L1){
          ...
          synchronized(L2){ ... }
          ...
        }

    T2: synchronized(L2){
          ...
          synchronized(L1){ ... }
          ...
        }

Lock graph: T1 contributes the edge L1 → L2, T2 contributes L2 → L1 = a cycle.
basic algorithm
Input: an execution trace δ

    GL : (Lock × Lock)-set        -- lock graph
    CL : [Thread → Lock-set]      -- lock context

    for (i = 1 .. |δ|) do
      case δ[i] of
        l(t,o) → GL := GL ∪ { (o’,o) | o’ ∈ CL(t) };
                 CL := CL ⊕ [t ↦ CL(t) ∪ {o}];
        u(t,o) → CL := CL ⊕ [t ↦ CL(t) \ {o}];
      endcase
    enddo

    foreach c in cycles(GL)
      print(“deadlock potential: ” + c);
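The basic algorithm can be sketched in Java (all names are ours; events are fed in programmatically rather than parsed from a trace, and cycle detection is a simple reachability search rather than a cycle enumerator):

```java
import java.util.*;

// Sketch (names ours) of the basic lock-graph algorithm (Harrow 2000):
// on l(t,o), add an edge o' -> o for every lock o' that t already holds.
public class LockGraph {
    private final Map<String, Set<String>> edges = new HashMap<>();  // GL
    private final Map<String, Deque<String>> held = new HashMap<>(); // CL

    public void lock(String thread, String o) {
        for (String oPrime : held.computeIfAbsent(thread, t -> new ArrayDeque<>()))
            edges.computeIfAbsent(oPrime, k -> new HashSet<>()).add(o);
        held.get(thread).push(o);
    }

    public void unlock(String thread, String o) {
        held.get(thread).remove(o);
    }

    /** True if the lock graph contains a cycle (a deadlock potential). */
    public boolean hasCycle() {
        for (String start : edges.keySet())
            if (reaches(start, start, new HashSet<>())) return true;
        return false;
    }

    private boolean reaches(String from, String target, Set<String> seen) {
        for (String next : edges.getOrDefault(from, Set.of())) {
            if (next.equals(target)) return true;
            if (seen.add(next) && reaches(next, target, seen)) return true;
        }
        return false;
    }
}
```

Replaying T1: L1, L2 against T2: L2, L1 produces the cycle L1 → L2 → L1.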
false positives
    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

4 deadlock potentials are reported, but only one is real: a guarded cycle, a thread-segmented cycle, a singular cycle, and the actual deadlock cycle!
extract execution trace
    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

Trace (event format: l(<thread>,<lock>) - lock, u(<thread>,<lock>) - unlock, s(<thread>,<thread>) - start, j(<thread>,<thread>) - join):

    l(T1,G) l(T1,L1) l(T1,L2) u(T1,L2) u(T1,L1) u(T1,G) s(T1,T3)
    l(T2,G) l(T2,L2) l(T2,L1) u(T2,L1) u(T2,L2) u(T2,G)
    l(T3,L1) l(T3,L2) u(T3,L2) u(T3,L1) j(T1,T3)
    l(T1,L2) l(T1,L1) u(T1,L1) u(T1,L2)
does basic algorithm work?
    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

Algorithm: build the lock graph and detect cycles in it; an edge goes from X to Y if a thread holds X while locking Y. Here the graph over L1 and L2 contains 4 cycles, so 4 deadlock potentials are reported (Visual Threads), e.g. T3: L1 → L2 against T2: L2 → L1. Only 1 is a real deadlock; the guarded cycle, the thread-segmented cycle, and the singular cycle are the 3 false positives.
guarded algorithm

    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

Algorithm: extend the lock graph with labeled edges recording which thread took the locks and which guard locks it held: T1,{G}; T1,{}; T2,{G}; T3,{}.

Valid cycles must satisfy:
1. threads: must differ
2. guard sets: must not overlap

2 cycles = 2 deadlock potentials reported: the thread-segmented cycle and the deadlock cycle.
guarded algorithm

Input: an execution trace δ

    type Label = Thread × Lock-set
    GL : (Lock × Label × Lock)-set    -- lock graph
    CL : [Thread → Lock-set]          -- lock context

    for (i = 1 .. |δ|) do
      case δ[i] of
        l(t,o) → GL := GL ∪ { (o’,(t,g),o) | o’ ∈ CL(t) ∧ g = { o’’ | o’’ ∈ CL(t) } };
                 CL := CL ⊕ [t ↦ CL(t) ∪ {o}];
        u(t,o) → CL := CL ⊕ [t ↦ CL(t) \ {o}];
      endcase
    enddo

    foreach c in valid-cycles(GL)
      print(“deadlock potential: ” + c);

    valid-cycle(c) = ∀ e1,e2 ∈ c • thread(e1) ≠ thread(e2) ∧ guards(e1) ∩ guards(e2) = { }

    thread : Label → Thread      thread(t,g) = t
    guards : Label → Lock-set    guards(t,g) = g
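The valid-cycle filter can be sketched as follows (names ours; this checks only the two conditions on a given cycle, not the cycle enumeration itself):

```java
import java.util.*;

// Sketch (names ours) of the guarded algorithm's cycle filter: each edge
// carries the locking thread and the guard locks held at the time; a cycle
// is a real deadlock candidate only if every pair of edges involves
// different threads and non-overlapping guard sets.
public class GuardedCycle {
    static final class Edge {
        final String thread;
        final Set<String> guards;
        Edge(String thread, Set<String> guards) {
            this.thread = thread;
            this.guards = guards;
        }
    }

    static boolean valid(List<Edge> cycle) {
        for (int i = 0; i < cycle.size(); i++)
            for (int j = i + 1; j < cycle.size(); j++) {
                Edge a = cycle.get(i), b = cycle.get(j);
                if (a.thread.equals(b.thread)) return false;   // same thread
                Set<String> inter = new HashSet<>(a.guards);
                inter.retainAll(b.guards);
                if (!inter.isEmpty()) return false;            // a gate lock guards both
            }
        return true;
    }
}
```

On the example, the T1/T2 cycle with guard sets {G}/{G} is filtered out, while the T2/T3 cycle with guard sets {G}/{} survives.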
dividing code into segments
    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

(diagram: the execution is divided into numbered segments; an edge x → y in the segmentation graph means segment x executes before segment y)

Segmentation rules out the thread-segmented cycle.
effect of start() and join()
(diagram: t2.start() in segment x of T1 creates segment n for T1's continuation and segment n+1 for T2; t2.join() merges both into a fresh segment n; n is the next free segment)

Let S* be the transitive closure of the segmentation graph S.

The happens-before relation is defined as:

    x → y  ≡  (x,y) ∈ S*

The in-parallel relation is defined as:

    x ∥ y  ≡  ¬(x → y) ∧ ¬(y → x)
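These two relations can be sketched with a transitive closure over the segmentation graph (names ours; the closure is a Floyd-Warshall-style pass over a boolean adjacency matrix):

```java
// Sketch (names ours): happens-before as membership in the transitive
// closure S* of the segmentation graph; in-parallel as neither order.
public class Segments {
    static boolean[][] closure(boolean[][] s) {
        int n = s.length;
        boolean[][] c = new boolean[n][n];
        for (int i = 0; i < n; i++) c[i] = s[i].clone();
        for (int k = 0; k < n; k++)          // Floyd-Warshall-style closure
            for (int i = 0; i < n; i++)
                for (int j = 0; j < n; j++)
                    c[i][j] |= c[i][k] && c[k][j];
        return c;
    }

    /** x || y: neither x happens-before y nor y happens-before x. */
    static boolean parallel(boolean[][] star, int x, int y) {
        return !star[x][y] && !star[y][x];
    }
}
```

For a start() in segment 0 creating segments 1 and 2, the closure gives 0 → 1 and 0 → 2, while 1 ∥ 2.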
final algorithm
    M:  new T1().start(); new T2().start();

    T1: sync(G){ sync(L1){ sync(L2){} } };
        t3 = new T3(); t3.start(); t3.join();
        sync(L2){ sync(L1){} }

    T2: sync(G){ sync(L2){ sync(L1){} } }

    T3: sync(L1){ sync(L2){} }

Algorithm: extend the lock graph with segment labels recording in which segments of a thread the locks are taken. The edge labels here are: T1,{G},(2,2); T1,{},(7,7); T2,{G},(4,4); T3,{},(6,6).

Valid cycles must satisfy:
1. threads: must differ
2. guard sets: must not overlap
3. segments: must be parallel

One potential is left: the real deadlock cycle!
final algorithm
Input: an execution trace δ

    type Label = nat × Thread × Lock-set × nat
    GL : (Lock × Label × Lock)-set      -- lock graph
    GS : (nat × nat)-set                -- segmentation graph
    CL : [Thread → (Lock × nat)-set]    -- lock context
    CS : [Thread → nat]                 -- segment context
    n  : nat = 1;                       -- next available segment

    for (i = 1 .. |δ|) do
      case δ[i] of
        l(t,o)   → GL := GL ∪ { (o’,(s1,t,g,s2),o) |
                                (o’,s1) ∈ CL(t) ∧
                                g = { o’’ | (o’’,*) ∈ CL(t) } ∧
                                s2 = CS(t) };
                   CL := CL ⊕ [t ↦ CL(t) ∪ {(o,CS(t))}];
        u(t,o)   → CL := CL ⊕ [t ↦ CL(t) \ {(o,*)}];
        s(t1,t2) → GS := GS ∪ { (CS(t1),n), (CS(t1),n+1) };
                   CS := CS ⊕ [t1 ↦ n, t2 ↦ n+1];
                   n := n + 2;
        j(t1,t2) → GS := GS ∪ { (CS(t1),n), (CS(t2),n) };
                   CS := CS ⊕ [t1 ↦ n];
                   n := n + 1;
      endcase
    enddo

    foreach c in valid-cycles(GL)
      print(“deadlock potential: ” + c);

    valid-cycle(c) = ∀ e1,e2 ∈ c • thread(e1) ≠ thread(e2) ∧
                                   guards(e1) ∩ guards(e2) = { } ∧
                                   target(e1) ∥ target(e2)

    thread : Label → Thread      thread(s1,t,g,s2) = t
    guards : Label → Lock-set    guards(s1,t,g,s2) = g
    source : Label → nat         source(s1,t,g,s2) = s1
    target : Label → nat         target(s1,t,g,s2) = s2
the testing problem
    synchronized(L1){ if(random(1,n)==1) synchronized(L2){} }
    synchronized(L2){ if(random(1,n)==1) synchronized(L3){} }
    ...
    synchronized(Lk){ if(random(1,n)==1) synchronized(L1){} }

Probability for a deadlock to occur in a run of Prog(k,n):

    PD(k,n) = P(Prog(k,n) deadlocks in a run)
            = 1/n * P(deadlock | random == 1)
            = 1/n * (1 * (k-1)/k * (k-2)/k * ... * 1/k)
            = 1/n * k!/k^k

Example: PD(k=4, n=3) ≈ 0.03
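The slide's arithmetic can be verified directly (method name ours):

```java
// Sketch (names ours): PD(k,n) = (1/n) * k!/k^k, from the slide.
public class DeadlockProb {
    static double pd(int k, int n) {
        double fact = 1, pow = 1;
        for (int i = 1; i <= k; i++) {
            fact *= i;   // k!
            pow *= k;    // k^k
        }
        return fact / pow / n;
    }
}
```

For k=4, n=3 this gives 24/256/3 = 0.03125, matching the slide's 0.03; the cycle probability PC = 1/n = 0.33 is roughly ten times larger.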
potential-analysis is 10 times more effective
    synchronized(L1){ if(random(1,n)==1) synchronized(L2){} }
    synchronized(L2){ if(random(1,n)==1) synchronized(L3){} }
    ...
    synchronized(Lk){ if(random(1,n)==1) synchronized(L1){} }

Probability for a cycle to occur in a run of Prog(k,n):

    PC(k,n) = P(Prog(k,n) generates a cycle in a run)
            = 1/n * P(cycle | random == 1)
            = 1/n * 1
            = 1/n

Example: PC(k=4, n=3) = 0.33 ≈ 10 * PD(k=4, n=3)
the dining philosophers
model checking:
  deadlocking version: N=15 (32 seconds), N=20 (2.51 minutes), N=21 (out of memory)
  deadlock-free version: N=3 (3 minutes), N=4 (out of memory)

trace model checking:
  deadlocking version: N=47 (5 minutes)
  deadlock-free version: N=3 (38 seconds), N=4 (out of memory)

runtime analysis:
  deadlocking version: N=100 (8 seconds), N=300 (22 seconds), ...
  deadlock-free version: N=4 (7 seconds), N=100 (30 seconds), N=300 (2 minutes), ...
analysis of NASA Planner
(diagram: the planner main thread, the token network (plan), the agent relay (communication system), and external events, synchronizing via traceLock and TokenNetworkMutex)

Error message:

    TokenNetworkMutex, PlanRunner::trace_lock, by thread DeliberativePlanner at positions:
      - trySetVariableDomain()
      - PlanRunner::traceLock()
    TokenNetworkMutex, by thread Mission_Agent_Main at positions:
      - PlanRunner::traceLock()
      - getPredicateType()