Automating Configuration Troubleshooting with Dynamic Information Flow Analysis
Mona Attariyan, Jason Flinn
University of Michigan
Mona Attariyan - University of Michigan 2
Configuration Troubleshooting Is Difficult
- Software systems are difficult to configure
- Users make mistakes
- Misconfigurations happen
What To Do With Misconfiguration?
[Figure: a user frustrated (&$%#!) by a config file]
- Ask colleagues
- Search manual, FAQ, online forums
- Look at the code if available
A tool that automatically finds the root cause of the misconfiguration in applications?
ConfAid
Insight: Application code has enough information to lead us to the root cause
How? Dynamic information flow analysis on application binaries
How to Use ConfAid?
[Figure: the application reads the config file and fails with an error; ConfAid runs with the application and outputs the likely root causes: 1)… 2)… 3)…]
Outline
- Motivation
- How ConfAid runs
- Information flow analysis algorithms
- Embracing imprecise analysis
- Evaluation
- Conclusion
How Developers Find Root Cause
Config file: ExecCGI
Application:
    file = open(config file)
    token = read_token(file)
    if (token equals "ExecCGI")
        execute_cgi = 1
    …
    if (execute_cgi == 1)
        ERROR()
How ConfAid Finds Root Cause
- ConfAid uses taint tracking
Config file: ExecCGI
Application:
    file = open(config file)
    token = read_token(file)
    if (token equals "ExecCGI")
        execute_cgi = 1
    …
    if (execute_cgi == 1)
        ERROR()
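The taint-tracking idea on this slide can be sketched in a few lines of Python. This is a toy model rather than ConfAid itself, which works on application binaries: the helper names `read_tokens` and `run` and the sample config text are hypothetical, and each variable simply carries the set of config tokens that influenced it.

```python
def read_tokens(config_text):
    """Split a config file into tokens; each token is its own taint source."""
    return config_text.split()


def run(config_text):
    """Toy application: returns the taint set that reached ERROR(), or None."""
    taint = {"execute_cgi": set()}   # variable name -> config tokens it depends on
    execute_cgi = 0

    for token in read_tokens(config_text):
        # The branch condition depends on `token`, so anything assigned
        # under it inherits the token's taint.
        if token == "ExecCGI":
            execute_cgi = 1
            taint["execute_cgi"] = taint["execute_cgi"] | {token}

    if execute_cgi == 1:
        # ERROR(): report the tokens that tainted the failing condition.
        return taint["execute_cgi"]
    return None


# Hypothetical config contents, used only for illustration.
suspects = run("Options ExecCGI")
```

Here `suspects` holds the tokens that tainted the condition guarding `ERROR()`, which is the information a developer would otherwise recover by reading the code.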
How to Avoid Error?
[Figure: tree of branches if (a), if (b), if (c) with explored alternate paths; one branch is marked as the likely root cause]
- This path ends before the error happens
- This path leads to some other error
- This path successfully avoids the error
Data Flow Analysis
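x = y + z
Ty and Tz are the taint sets of y and z: the sets of configuration tokens (drawn as markers on the slide) that each value depends on.
Tx = Ty ∪ Tz
The value of x might change if any of the tokens in Tx changes.
Taint propagates via data flow and control flow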
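The data flow rule can be illustrated with a small Python value wrapper. This is only a sketch: the class name `Tainted` and the token names `tokA`, `tokB`, `tokC` are made up for illustration and do not come from the slides.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """An integer value plus the set of config tokens it depends on."""
    value: int
    taint: frozenset = field(default_factory=frozenset)

    def __add__(self, other):
        # Data flow rule: the result depends on every token that
        # either operand depends on, so the taint sets are unioned.
        return Tainted(self.value + other.value, self.taint | other.taint)


# Hypothetical tokens; y and z share one of them.
y = Tainted(2, frozenset({"tokA", "tokB"}))
z = Tainted(3, frozenset({"tokB", "tokC"}))
x = y + z   # x.taint is the union of y.taint and z.taint
```

Because the two operands share a token, the union has three elements rather than four.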
Control Flow Analysis
/* c = 0 */
/* x is read from file */
if (c == 0) {
    x = a
}
Before the branch: Ta = {…}, Tc = {…}, Tx = {…}
What could cause x to be different?
After the branch: Tx = {…, …, (… ∧ …)}, with terms labeled "Data flow" and "Control flow"
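One plausible reading of this slide can be written down as a set computation. The token markers on the slide are not recoverable, so which tokens appear in each term is an assumption here, as are the function names and the token names `tokA`, `tokC`, `tokX`. Each taint is modeled as a set of conjunctions, where a conjunction is a group of tokens that would have to change together.

```python
def single(*tokens):
    """Taint set in which each token alone may change the value."""
    return {frozenset({t}) for t in tokens}


def assign_under_branch(t_source, t_cond, t_old):
    """Taint of x after `if (cond) { x = source }` on the taken path.

    Each element is a conjunction (frozenset) of tokens that would have
    to change together for x to end up with a different value.
    """
    data_flow = set(t_source)        # the assigned source value changes
    control_flow = set(t_cond)       # the branch is no longer taken
    # Assumed extra case: branch not taken AND the old value of x changes.
    combined = {c | o for c in t_cond for o in t_old}
    return data_flow | control_flow | combined


t_a, t_c, t_x = single("tokA"), single("tokC"), single("tokX")
t_x_after = assign_under_branch(t_a, t_c, t_x)
```

Under these assumptions the result has two single-token terms and one two-token conjunction, the same shape as the three-term set shown on the slide.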
Alternate Path Exploration
/* c = 1 */
/* y is read from file */
if (c) {
    /* taken path */
    …
} else {
    y = a
}
[Figure: a checkpoint (ckpt) is taken at the branch; the taken path is if(c), and the alternate path if(!c) executes y = a]
- y depends on c
Effect of Alternate Path Exploration
/* c = 1 */
/* y is from file */
if (c) {
    …
} else {
    y = a
}
Before the branch: Ta = {…}, Tc = {…}, Ty = {…}
What could cause y to be different?
After the branch: Ty = {…, …, (… ∧ …)}, with terms labeled "Alternate path exploration" and "Alternate path + Data flow"
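A minimal sketch of alternate path exploration, assuming program state can be modeled as a Python dict: the not-taken branch is run on a copy (the checkpoint), and every variable it changes picks up the taint of the branch condition. The function `explore_alternate`, the variable values, and the token names are hypothetical, and the conjunction terms shown on the slide are left out for brevity.

```python
import copy


def explore_alternate(state, taint, cond_var, alt_branch):
    """Run the not-taken branch on a checkpoint and taint what it changes.

    state: variable -> value; taint: variable -> set of tokens.
    alt_branch: function that mutates a state dict, standing in for
    re-executing the program down the other side of the branch.
    """
    checkpoint = copy.deepcopy(state)   # ckpt: the real state stays untouched
    alt_branch(checkpoint)              # execute the alternate path
    for var, value in checkpoint.items():
        # Simplification: a write is detected by a changed value.
        if state.get(var) != value:
            # var would differ had the branch gone the other way,
            # so it depends on the tokens behind the condition.
            taint[var] = taint.get(var, set()) | taint[cond_var]
    return taint


state = {"c": 1, "y": 10, "a": 7}
taint = {"c": {"tokC"}, "y": {"tokY"}, "a": {"tokA"}}


def else_branch(s):
    s["y"] = s["a"]                     # else { y = a }


taint = explore_alternate(state, taint, "c", else_branch)
```

After the call, `y` is tainted by the token behind `c` even though the taken path never wrote to `y`, and the real `state` is unchanged.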
Embracing Imprecise Analysis
- Complete and sound analysis leads to:
– poor performance
– high false positive rate
- To improve performance
- To reduce false positives
- Heuristics: bounded horizon heuristic, single mistake heuristic, weighting heuristic
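Bounded Horizon Heuristic
- Bounded horizon prevents path explosion
- Each alternate path runs for at most a fixed # of instructions
[Figure: alternate paths explored from if (b) and if (c); when the max is reached, exploration is aborted; the result is the list of likely root causes]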
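The bounded horizon can be sketched as an instruction budget on each alternate path. In this toy version a path is a list of Python functions standing in for instructions; `run_bounded`, `bump`, and the budget of 80 are hypothetical choices for illustration, not values from the slides.

```python
def run_bounded(instructions, state, max_instructions):
    """Execute an alternate path, giving up after max_instructions steps.

    Returns (finished, steps). `instructions` is an iterable of
    functions that each mutate `state`.
    """
    steps = 0
    for instr in instructions:
        if steps >= max_instructions:
            return False, steps         # max reached, abort exploration
        instr(state)
        steps += 1
    return True, steps


def bump(s):
    s["n"] += 1


short_path = [bump] * 3
long_path = [bump] * 1000

ok_short = run_bounded(short_path, {"n": 0}, max_instructions=80)
ok_long = run_bounded(long_path, {"n": 0}, max_instructions=80)
```

The short path finishes within the budget, while the long one is cut off once the budget is spent, which caps the cost of exploring any single branch.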
Single Mistake Heuristic
- Assumes the configuration file contains a single mistake
- Reduces the amount of taint and the # of explored paths
/* x = 1, c = 0 */
if (c == 0) {
    x = a
}
Before the branch: Ta = {…}, Tc = {…}, Tx = {…}
After the branch: Tx = {…, …, (… ∧ …)}
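If only one token can be wrong, terms that require two tokens to change together cannot explain the failure. A minimal sketch of that pruning, assuming taints are modeled as sets of conjunctions (frozensets of tokens); the function name and the token names are hypothetical:

```python
def apply_single_mistake(taint):
    """Keep only terms that blame exactly one token."""
    return {term for term in taint if len(term) == 1}


# Hypothetical taint of x: two single tokens and one conjunction.
t_x = {
    frozenset({"tokA"}),
    frozenset({"tokC"}),
    frozenset({"tokC", "tokX"}),
}
pruned = apply_single_mistake(t_x)
```

Dropping the conjunction shrinks the taint set, which is one way to read the claim that the heuristic reduces the amount of taint.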
Weighting Heuristic
- Insufficient to treat all taint propagations equally
– Data flow introduces a stronger dependency than control flow
– Branches closer to the error are stronger than farther branches
- Assign weights to taints to represent strength level
– Data flow taint gets a higher weight than control flow taint
– Branches closer to the error get a higher weight than farther branches
Example of Weighting Heuristic
if (x) {
    …
    if (y) {
        …
        if (z) {
            ERROR()
        }
    }
}
[Figure: list of likely root causes]
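The nested-branch example can be turned into a ranking with a simple distance-based weight. The slides do not give the actual weights, so halving the weight per branch (`decay=0.5`) is an assumption made only for illustration, as are the function and token names.

```python
def rank_root_causes(branch_tokens, decay=0.5):
    """Rank tokens guarding nested branches on the way to ERROR().

    branch_tokens: tokens ordered from the outermost branch to the one
    closest to the error. `decay` is a hypothetical weight factor.
    """
    weights = {}
    for distance, token in enumerate(reversed(branch_tokens)):
        # distance 0 is the branch closest to the error.
        w = decay ** distance
        weights[token] = max(weights.get(token, 0.0), w)
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


# if (x) { if (y) { if (z) { ERROR() } } }
ranking = rank_root_causes(["tok_x", "tok_y", "tok_z"])
```

With these assumed weights the token behind `z` ranks first and the token behind `x` last, matching the rule that branches closer to the error get a higher weight.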
Heuristics: Pros and Cons
[Table: columns are Bounded horizon, Single mistake, Weighting; rows are Simplify control flow analysis, Improve performance, Reduce FP, Increase FP, Increase FN; the cell marks are not recoverable]
FP = False Positive, FN = False Negative
ConfAid and Multi-process Apps
- ConfAid propagates taints between processes
– Intercepts IPC system calls
– Sends taint along with the data
- ConfAid currently supports communication via:
– Unix sockets, pipes, TCP and UDP sockets
– Regular files
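The idea of sending taint along with the data can be sketched at the application level. ConfAid does this by intercepting IPC system calls; the sketch below instead wraps send and receive explicitly, and the JSON wire format, the function names, and the token name are all assumptions for illustration.

```python
import json
import socket


def send_with_taint(sock, data, taint):
    """Send a payload together with the tokens it depends on."""
    message = json.dumps({"data": data, "taint": sorted(taint)})
    sock.sendall(message.encode("utf-8") + b"\n")


def recv_with_taint(sock):
    """Receive one newline-terminated message; return (data, taint set)."""
    buf = b""
    while not buf.endswith(b"\n"):
        chunk = sock.recv(4096)
        if not chunk:
            break
        buf += chunk
    message = json.loads(buf.decode("utf-8"))
    return message["data"], set(message["taint"])


# A connected socket pair stands in for two cooperating processes.
sender, receiver = socket.socketpair()
send_with_taint(sender, "payload", {"tokA"})
data, taint = recv_with_taint(receiver)
sender.close()
receiver.close()
```

The receiver ends up with both the payload and its taint set, so taint tracking can continue in the second process.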
Evaluation
- ConfAid debugs misconfiguration in:
– OpenSSH 5.1 (2 processes)
– Apache HTTP server 2.2.14 (1 process)
– Postfix mail transfer agent 2.7 (up to 6 processes)
- Manually inject errors into configuration files
- Evaluation metrics:
– The ranking of the correct root cause
– The time to execute the application with ConfAid
Data Sets
- Real-world misconfigurations:
– total of 18 bugs from manuals, forums and FAQs
- Randomly generated bugs:
– 60 bugs using ConfErr [Keller et al. DSN 08]
How Effective is ConfAid ?
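Real-world bugs: number of bugs per rank of the correct root cause, in slide order (empty cells omitted)
Columns: Total tokens, First, First tied w/1, Second, Second tied w/1, Worse than second
OpenSSH: 47-49, 2, 2, 2, 1
Apache: 88-93, 3, 1, 2
Postfix: 27-29, 5, 5
Correct root cause ranked first or second for all 18 real-world bugs: 72% first, 28% second, 0% worse than second (ties included)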
How Effective is ConfAid ?
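Randomly-generated bugs: number of bugs per rank of the correct root cause, in slide order (empty cells omitted)
Columns: Total tokens, First, First tied w/1, Second, Second tied w/1, Worse than second
OpenSSH: 47, 17, 1, 1, 1
Apache: 88, 17, 1, 1, 1
Postfix: 27, 15, 2, 3
Correct root cause ranked first or second for 55 out of 60 randomly-generated bugs: 85% first, 7% second, 8% worse than second (ties included)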
How Fast is ConfAid?
Average execution time
- Real-world bugs: OpenSSH 52 seconds, Apache 2 minutes 48 seconds, Postfix 57 seconds (overall average 1m 32s)
- Randomly-generated bugs: OpenSSH 7 seconds, Apache 24 seconds, Postfix 38 seconds (overall average 23s)
Conclusion
- ConfAid automatically finds root cause of problems
- ConfAid uses dynamic information flow analysis
- ConfAid ranks the correct root cause as first or second in:
– 18 out of 18 real-world bugs
– 55 out of 60 random bugs
- ConfAid takes only a few minutes to run
Questions?
What if there are multiple mistakes?
- ConfAid may or may not report all of them
- For independent mistakes, ConfAid first finds the one that led to the first failure
- For dependent mistakes, ConfAid may report all of them, based on their effect on the program
Effect of Bounded Horizon Heuristic
[Figure: execution time (seconds) versus maximum # of explored instructions; legend: OpenSSH Server, Postfix]
Effect of Weighting Heuristic
[Figure: false positives for OpenSSH, Apache, and Postfix; annotations: Max # tokens: 49, Max # tokens: 93, Max # tokens: 5]