Distributed Watchpoints: Debugging Very Large Ensembles of Robots
De Rosa, Goldstein, Lee, Campbell, Pillai
Aug 19, 2006
8/19/2006 Distributed Watchpoints 2
Motivation
- Distributed errors are hard to find with traditional debugging tools
- Centralized snapshot algorithms
  – Expensive
  – Geared towards detecting one error at a time
- Special-purpose debugging code is difficult to write and may itself contain errors
Expressing and Detecting Distributed Conditions
“How can we represent, detect, and trigger on distributed conditions in very large multi-robot systems?”
- Generic detection framework, well suited to debugging
- Detect conditions that are not observable via the local state of one robot
- Support algorithm-level debugging (not code/HW debugging)
- Trigger arbitrary actions when condition is met
- Asynchronous, bandwidth/CPU-limited systems
Distributed/Parallel Debugging: State of the Art
Modes:
- Parallel: powerful nodes, regular (static) topology, shared memory
- Distributed: weak, mobile nodes
Tools:
- GDB
- printf()
- Race detectors
- Declarative network systems with debugging support (à la P2)
Example Errors: Leader Election
Scenario: One Leader Per Two-Hop Radius
Example Errors: Token Passing
Scenario: If a node has the token, exactly one of its neighbors must have had it last timestep
Example Errors: Gradient Field
Scenario: Gradient Values Must Be Smooth
Expressing Distributed Error Conditions
Requirements:
- Ability to specify shape of trigger groups
- Temporal operators
- Simple syntax (reduce programmer effort/learning curve)
A Solution:
- Inspired by Linear Temporal Logic (LTL)
  – A simple extension to first-order logic
  – Proven technique for single-robot debugging [Lamine01]
- Assumption: Trigger groups must be connected
  – For practical/efficiency reasons
Watchpoint Primitives
- Modules (implicitly quantified over all connected sub-ensembles)
- Topological restrictions (pairwise neighbor relations)
- Boolean connectives
- State variable comparisons (distributed)
- Temporal operators
nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2)
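One way to read the expression above is as a predicate tested against every ordered, connected group of three modules. The following is a minimal brute-force sketch of that reading, not the authors' implementation: the four-module chain, the `state`/`prev` dictionaries, and the helper names `n`, `connected`, `watchpoint`, and `find_matches` are all hypothetical.

```python
from itertools import permutations

# Hypothetical ensemble: a chain 1-2-3-4 with current and previous-timestep state.
neighbors = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
state = {1: {"var": 5}, 2: {"var": 3}, 3: {"var": 0}, 4: {"var": 7}}
prev = {1: {"var": 5}, 2: {"var": 3}, 3: {"var": 1}, 4: {"var": 2}}

def n(x, y):
    """Topological restriction: True if x and y are neighbors."""
    return y in neighbors[x]

def connected(group):
    """True if the modules in `group` form a connected sub-ensemble."""
    group = set(group)
    start = next(iter(group))
    seen, frontier = {start}, [start]
    while frontier:
        cur = frontier.pop()
        for nb in neighbors[cur] & group:
            if nb not in seen:
                seen.add(nb)
                frontier.append(nb)
    return seen == group

def watchpoint(a, b, c):
    # nodes(a,b,c); n(b,c) & (a.var > b.var) & (c.prev.var != 2)
    return n(b, c) and state[a]["var"] > state[b]["var"] and prev[c]["var"] != 2

def find_matches():
    """Test the watchpoint against every ordered, connected triple."""
    return [g for g in permutations(sorted(neighbors), 3)
            if connected(g) and watchpoint(*g)]

matches = find_matches()
```

Enumerating every triple is only practical for tiny examples; it is meant to pin down the semantics of the primitives, not to suggest an efficient evaluation strategy.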
Distributed Errors: Example Watchpoints
nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)
nodes(a,b,c); n(a,b) & n(a,c) & (a.token == 1) & (b.prev.token == 1) & (c.prev.token == 1)
nodes(a,b); (a.state - b.state > 1)
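To make the three watchpoints concrete, here is a small sketch that writes each one as a Python predicate and evaluates it on a deliberately faulty ensemble. The four-module chain, the state values, and the function names are hypothetical and chosen only so that each watchpoint fires once.

```python
# Hypothetical 4-module chain 0-1-2-3 with deliberately faulty state.
adj = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
now = {
    0: {"isLeader": 1, "token": 0, "state": 0},
    1: {"isLeader": 0, "token": 1, "state": 1},
    2: {"isLeader": 1, "token": 0, "state": 4},
    3: {"isLeader": 0, "token": 0, "state": 5},
}
prev = {0: {"token": 1}, 1: {"token": 0}, 2: {"token": 1}, 3: {"token": 0}}

def n(x, y):
    """True if modules x and y are neighbors."""
    return y in adj[x]

def two_leaders(a, b, c):
    # nodes(a,b,c); n(a,b) & n(b,c) & (a.isLeader == 1) & (c.isLeader == 1)
    return (n(a, b) and n(b, c)
            and now[a]["isLeader"] == 1 and now[c]["isLeader"] == 1)

def duplicated_token(a, b, c):
    # nodes(a,b,c); n(a,b) & n(a,c) & (a.token == 1)
    #   & (b.prev.token == 1) & (c.prev.token == 1)
    return (n(a, b) and n(a, c) and now[a]["token"] == 1
            and prev[b]["token"] == 1 and prev[c]["token"] == 1)

def rough_gradient(a, b):
    # nodes(a,b); (a.state - b.state > 1)
    # The pair is assumed adjacent because trigger groups must be connected.
    return now[a]["state"] - now[b]["state"] > 1

fired = {
    "two_leaders": two_leaders(0, 1, 2),            # leaders 0 and 2 are two hops apart
    "duplicated_token": duplicated_token(1, 0, 2),  # both neighbors of 1 held the token
    "rough_gradient": rough_gradient(2, 1),         # 4 - 1 exceeds the allowed step of 1
}
```

Each predicate describes the error condition rather than the correct behavior, so a watchpoint firing means the corresponding scenario has been violated.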
Watchpoint Execution
nodes(a,b,c)…
[Figure: ensemble of modules numbered 1 to 32 illustrating watchpoint execution; layout not recoverable from the extracted text]
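One plausible reading of watchpoint execution is that each module starts a partial matcher with itself in the first slot, and the matcher then spreads so that each further slot is filled by a module adjacent to one already in the matcher. The sketch below illustrates that reading only. The 4x8 grid, the row-by-row numbering, and the functions `neighbors`, `spread`, and `matchers_from` are assumptions, not details taken from the slides.

```python
WIDTH, HEIGHT = 8, 4  # hypothetical 4x8 grid, modules numbered 1..32 row by row

def neighbors(m):
    """Grid neighbors (left, right, up, down) of module m."""
    row, col = divmod(m - 1, WIDTH)
    result = set()
    if col > 0:
        result.add(m - 1)
    if col < WIDTH - 1:
        result.add(m + 1)
    if row > 0:
        result.add(m - WIDTH)
    if row < HEIGHT - 1:
        result.add(m + WIDTH)
    return result

def spread(matcher):
    """Extend a partial matcher by one slot.

    The next slot may be filled by any module adjacent to a module already
    in the matcher, which keeps every matcher a connected sub-ensemble.
    """
    candidates = set()
    for member in matcher:
        candidates |= neighbors(member)
    candidates -= set(matcher)
    return [matcher + (c,) for c in sorted(candidates)]

def matchers_from(origin, size):
    """All matchers with `size` slots whose first slot is `origin`."""
    frontier = [(origin,)]
    for _ in range(size - 1):
        frontier = [ext for m in frontier for ext in spread(m)]
    return frontier

two_slot = matchers_from(1, 2)    # matchers for nodes(a,b) starting at module 1
three_slot = matchers_from(1, 3)  # matchers for nodes(a,b,c) starting at module 1
```

Once every slot is filled, the watchpoint expression would be evaluated on the modules bound to the slots, and the trigger action would run if it holds.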
Performance: Watchpoint Size
- 1000 modules, running for 100 timesteps
- Simulator overhead excluded
- Application: data aggregation with landmark routing
- Watchpoint: are the first and last robots in the watchpoint in the same state?
Performance: Number of Matchers
- This particular watchpoint never terminates early
- Number of matchers increases exponentially
- Time per matcher remains within factor of 2
- Details of the watchpoint expression more important than size
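The growth in matcher count, and why the expression can matter more than its size, can be illustrated with a toy count. This is a sketch under stated assumptions: the ring topology, the spreading rule, and the `partial_ok` hook are hypothetical, and "early termination" is read here as dropping a partial matcher as soon as a clause over its already-filled slots is false.

```python
def ring_neighbors(m, n):
    """Neighbors of module m on a ring of n modules numbered 0..n-1."""
    return {(m - 1) % n, (m + 1) % n}

def count_matchers(n, size, partial_ok=lambda matcher: True):
    """Count the matchers alive at each slot count, from 1 up to `size`.

    `partial_ok` is a hypothetical hook: returning False for a partial
    matcher drops it before it spreads further (early termination).
    """
    frontier = [(m,) for m in range(n) if partial_ok((m,))]
    counts = [len(frontier)]
    for _ in range(size - 1):
        nxt = []
        for matcher in frontier:
            candidates = set()
            for member in matcher:
                candidates |= ring_neighbors(member, n)
            for c in sorted(candidates - set(matcher)):
                ext = matcher + (c,)
                if partial_ok(ext):
                    nxt.append(ext)
        frontier = nxt
        counts.append(len(frontier))
    return counts

# A watchpoint that can never be rejected early: every matcher survives.
no_pruning = count_matchers(8, 5)

# A watchpoint with a clause on slot a only, e.g. (a.isLeader == 1),
# where module 0 is assumed to be the sole leader.
leaders = {0}
pruned = count_matchers(8, 5, lambda m: m[0] in leaders)
```

On this ring the unpruned count doubles with every added slot (8, 16, 32, 64, 128), while the clause on the first slot cuts it to 1, 2, 4, 8, 16 for the same watchpoint size.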
Performance: Periodically Running Watchpoints
Future Work
- Distributed implementation
- More optimization
- User validation
- Additional predicates
Conclusions
- Simple, yet highly descriptive syntax
- Able to detect errors missed by more conventional techniques
- Low simulation overhead
Thank You
Backup Slides
Optimizations
- Temporal span
- Early termination
- Neighbor culling
- (one slide per)