

slide-1
SLIDE 1

Automating Configuration Troubleshooting with Dynamic Information Flow Analysis

Mona Attariyan Jason Flinn University of Michigan

slide-2
SLIDE 2

Mona Attariyan - University of Michigan 2

Configuration Troubleshooting Is Difficult

Software systems difficult to configure
Users make mistakes

slide-3
SLIDE 3

Configuration Troubleshooting Is Difficult

Software systems difficult to configure
Users make mistakes

Misconfigurations happen

slide-4
SLIDE 4

Configuration Troubleshooting Is Difficult

slide-5
SLIDE 5

What To Do With Misconfiguration?

[Figure: config file]

  • Ask colleagues
  • Search manual, FAQ, online forums
  • Look at the code if available

slide-6
SLIDE 6

What To Do With Misconfiguration?

A tool that automatically finds the root cause of the misconfiguration in applications?

slide-7
SLIDE 7

ConfAid

Insight: Application code has enough information to lead us to the root cause

How? Dynamic information flow analysis on application binaries

slide-8
SLIDE 8

How to Use ConfAid?

[Figure: config file → Application → error]

slide-9
SLIDE 9

How to Use ConfAid?

[Figure: config file → Application with ConfAid → error]

slide-11
SLIDE 11

How to Use ConfAid?

[Figure: config file → Application with ConfAid → error → likely root causes: 1) … 2) … 3) … ……]

slide-12
SLIDE 12

Outline

  • Motivation
  • How ConfAid runs
  • Information flow analysis algorithms
  • Embracing imprecise analysis
  • Evaluation
  • Conclusion

slide-13
SLIDE 13

How Developers Find Root Cause

Config file:
    ExecCGI

Application:
    file = open(config file)
    token = read_token(file)
    if (token equals “ExecCGI”)
        execute_cgi = 1
    …
    if (execute_cgi == 1)
        ERROR()

slide-15
SLIDE 15

How ConfAid Finds Root Cause

  • ConfAid uses taint tracking

Config file:
    ExecCGI

Application:
    file = open(config file)
    token = read_token(file)
    if (token equals “ExecCGI”)
        execute_cgi = 1
    …
    if (execute_cgi == 1)
        ERROR()
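
The taint-tracking idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ConfAid's implementation (the slides say ConfAid analyzes application binaries). The `Tainted` class and the `read_tokens` and `run_application` helpers are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Tainted:
    """A value plus the set of config tokens that influenced it."""
    value: object
    taint: frozenset = field(default_factory=frozenset)


def read_tokens(config_text):
    # Each token read from the config file is tainted with itself.
    return [Tainted(tok, frozenset({tok})) for tok in config_text.split()]


def run_application(config_text):
    """Mimic the slide's pseudocode; return the taint blamed for ERROR()."""
    execute_cgi = Tainted(0)
    for token in read_tokens(config_text):
        if token.value == "ExecCGI":
            # The assignment happens only because of the tainted token.
            execute_cgi = Tainted(1, token.taint)
    if execute_cgi.value == 1:
        # ERROR(): report the tokens that the failing branch depends on.
        return execute_cgi.taint
    return frozenset()


blamed = run_application("Indexes ExecCGI")
```

Here `blamed` contains only `ExecCGI`, because the `Indexes` token never influences `execute_cgi`.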

slide-17
SLIDE 17

How to Avoid Error?

[Figure: execution path through branches if (a), if (b), if (c)]

slide-19
SLIDE 19

How to Avoid Error?

[Figure: execution path through branches if (a), if (b), if (c), with alternate paths labeled:]
  • This path ends before the error happens

slide-20
SLIDE 20

How to Avoid Error?

[Figure: execution path through branches if (a), if (b), if (c), with alternate paths labeled:]
  • This path ends before the error happens
  • This path leads to some other error

slide-21
SLIDE 21

How to Avoid Error?

[Figure: execution path through branches if (a), if (b), if (c), with alternate paths labeled:]
  • This path ends before the error happens
  • This path leads to some other error
  • This path successfully avoids the error

slide-22
SLIDE 22

How to Avoid Error?

[Figure: execution path through branches if (a), if (b), if (c), with alternate paths labeled:]
  • This path ends before the error happens
  • This path leads to some other error
  • This path successfully avoids the error
  • likely root cause
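
One way to read this sequence of slides: each branch's alternate path is classified by its outcome, and a branch whose alternate path avoids the error is a likely root cause. The sketch below assumes that reading. The `Outcome` enum, the `likely_root_causes` helper, and the assignment of outcomes to branches a, b, and c are all hypothetical, since the slide text does not say which branch produced which outcome.

```python
from enum import Enum


class Outcome(Enum):
    ENDS_EARLY = "ends before the error happens"
    OTHER_ERROR = "leads to some other error"
    AVOIDS_ERROR = "successfully avoids the error"


def likely_root_causes(alternate_outcomes):
    """Return branches whose alternate path avoids the original error."""
    return [branch for branch, outcome in alternate_outcomes.items()
            if outcome is Outcome.AVOIDS_ERROR]


# Hypothetical outcomes; the mapping to branches is an assumption.
outcomes = {
    "a": Outcome.ENDS_EARLY,
    "b": Outcome.OTHER_ERROR,
    "c": Outcome.AVOIDS_ERROR,
}
candidates = likely_root_causes(outcomes)
```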

slide-24
SLIDE 24

Outline

  • Motivation
  • How ConfAid runs
  • Information flow analysis algorithms
  • Embracing imprecise analysis
  • Evaluation
  • Conclusion

slide-25
SLIDE 25

Data Flow Analysis

x = y + z

Ty = {…}, Tz = {…}
Tx = Ty ∪ Tz

Value of x might change if the tokens in Ty or Tz change

Taint propagates via data flow and control flow
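
A minimal sketch of the data flow rule, assuming taint sets are plain sets of token names. The `propagate_add` helper and the token names are hypothetical, because the token symbols on the slide are not recoverable.

```python
def propagate_add(taints, dest, src1, src2):
    """Model `dest = src1 + src2`: dest's taint is the union of both sources."""
    taints[dest] = taints.get(src1, frozenset()) | taints.get(src2, frozenset())
    return taints[dest]


# Hypothetical tokens: y and z share one token, so the union has three.
taints = {
    "y": frozenset({"tokenA", "tokenB"}),
    "z": frozenset({"tokenB", "tokenC"}),
}
tx = propagate_add(taints, "x", "y", "z")
```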

slide-26
SLIDE 26

Control Flow Analysis

/* c = 0 */
/* x is read from file */
if (c == 0) {
    x = a
}

What could cause x to be different?

[Figure: taint sets Ta, Tc, and Tx; the updated Tx combines data flow and control flow terms, including a conjunction ( Ʌ )]
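
A simplified sketch of how an assignment inside a taken branch could pick up both kinds of taint. It deliberately omits the conjunction term shown on the slide, and the `assign_under_branch` helper and token names are hypothetical.

```python
def assign_under_branch(taints, dest, src, cond_vars):
    """Model `if (cond) { dest = src }` when the branch is taken."""
    # Data flow: dest inherits the taint of the value assigned to it.
    data_flow = taints.get(src, frozenset())
    # Control flow: dest also depends on every variable in the condition.
    control_flow = frozenset().union(
        *(taints.get(v, frozenset()) for v in cond_vars))
    taints[dest] = data_flow | control_flow
    return data_flow, control_flow


taints = {
    "a": frozenset({"tokA"}),
    "c": frozenset({"tokC"}),
    "x": frozenset({"tokX"}),  # x was read from the file earlier
}
data_flow, control_flow = assign_under_branch(taints, "x", "a", ["c"])
```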

slide-27
SLIDE 27

Alternate Path Exploration

/* c = 1 */
/* y is read from file */
if (c) {
    /* taken path */
    …
} else {
    y = a
}

[Figure: executed path if(c); y depends on c]

slide-29
SLIDE 29

Alternate Path Exploration

/* c = 1 */
/* y is read from file */
if (c) {
    /* taken path */
    …
} else {
    y = a
}

[Figure: ckpt at the branch; paths if(c) and if(!c); y depends on c]

slide-30
SLIDE 30

Alternate Path Exploration

/* c = 1 */
/* y is read from file */
if (c) {
    /* taken path */
    …
} else {
    y = a
}

[Figure: ckpt at the branch; paths if(c) and if(!c), with y = a on the if(!c) path; y depends on c]

slide-34
SLIDE 34

Effect of Alternate Path Exploration

/* c = 1 */
/* y is from file */
if (c) {
    …
} else {
    y = a
}

What could cause y to be different?

[Figure: taint sets Ta, Tc, and Ty; the updated Ty combines terms from alternate path exploration and from alternate path + data flow, including a conjunction ( Ʌ )]
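
A simplified sketch of why the not-taken branch matters: y is never written on the taken path, yet it would have been written had c differed, so y should depend on c. The conjunction term from the slide is omitted, and the `explore_branch` helper and token names are hypothetical.

```python
def explore_branch(taints, cond_var, taken_assigns, alternate_assigns):
    """Each *_assigns dict maps dest -> src for one side of the branch."""
    cond_taint = taints.get(cond_var, frozenset())
    updated = dict(taints)
    # Taken path: ordinary data flow plus control flow taint.
    for dest, src in taken_assigns.items():
        updated[dest] = taints.get(src, frozenset()) | cond_taint
    # Alternate path: dest was not written, but would have been if the
    # condition had differed, so dest keeps its taint and gains cond's.
    for dest in alternate_assigns:
        if dest not in taken_assigns:
            updated[dest] = taints.get(dest, frozenset()) | cond_taint
    return updated


taints = {
    "c": frozenset({"tokC"}),
    "y": frozenset({"tokY"}),
    "a": frozenset({"tokA"}),
}
# c = 1, so the taken path assigns nothing; the else side has `y = a`.
after = explore_branch(taints, "c", {}, {"y": "a"})
```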

slide-35
SLIDE 35

Outline

  • Motivation
  • How ConfAid runs
  • Information flow analysis algorithms
  • Embracing imprecise analysis
  • Evaluation
  • Conclusion

slide-36
SLIDE 36

Embracing Imprecise Analysis

  • Complete and sound analysis leads to:
      – poor performance
      – high false positive rate
  • To improve performance
  • To reduce false positives

Heuristics: bounded horizon, single mistake, weighting

slide-37
SLIDE 37

Bounded Horizon Heuristic

  • Bounded horizon prevents path explosion
  • Alternate path runs a fixed # of instructions

[Figure: branches if (b), if (c); one alternate path labeled "max reached, abort exploration"; likely root causes]
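
The bounded horizon idea can be sketched as an instruction budget on each alternate path. The `explore_alternate_path` helper, the string instruction stream, and the return labels are hypothetical, and the slides do not give the budget ConfAid uses by default.

```python
def explore_alternate_path(instructions, max_instructions):
    """Run an alternate path, giving up once the budget is spent."""
    executed = 0
    for instr in instructions:
        if executed >= max_instructions:
            # Max reached, abort exploration.
            return "aborted", executed
        executed += 1
        if instr == "ERROR":
            return "error", executed
    return "completed", executed


long_path = explore_alternate_path(["nop"] * 10, max_instructions=3)
short_path = explore_alternate_path(["nop", "ERROR"], max_instructions=5)
```

A small budget keeps exploration cheap, at the cost of missing outcomes that lie beyond the horizon.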

slide-38
SLIDE 38

Single Mistake Heuristic

  • Configuration file contains a single mistake
  • Reduces amount of taint and # of explored paths

/* x=1, c=0 */
if (c == 0) {
    x = a
}

[Figure: taint sets Ta, Tc, and Tx; the updated Tx has two single-token terms and one conjunction term ( Ʌ )]
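
One plausible reading of this heuristic is that taint terms requiring two tokens to change at once can be dropped, since only one token is assumed wrong. The sketch below assumes that reading. The `single_mistake_filter` helper and token names are hypothetical.

```python
def single_mistake_filter(taint_set):
    """Keep only taint terms that blame exactly one config token."""
    return {term for term in taint_set if len(term) == 1}


# Each term is a frozenset of tokens that must ALL change (a conjunction).
# Token names are hypothetical; the slide's symbols are not recoverable.
tx = {
    frozenset({"tokA"}),
    frozenset({"tokC"}),
    frozenset({"tokC", "tokX"}),  # conjunction: would need two mistakes
}
filtered = single_mistake_filter(tx)
```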

slide-40
SLIDE 40

Weighting Heuristic

  • Insufficient to treat all taint propagations equally
      – Data flow introduces stronger dependency than ctrl flow
      – Branches closer to error stronger than farther branches
  • Assign weights to taints to represent strength level
      – Data flow taint gets a higher weight than ctrl flow taint
      – Branches closer to error get higher weight than farther

slide-41
SLIDE 41

Example of Weighting Heuristic

if (x) {
    …
    if (y) {
        …
        if (z) {
            ERROR()
        }
    }
}

likely root causes
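
A sketch of distance-based weighting for the nested branches above, assuming each step away from the error halves the weight. The `rank_root_causes` helper and the `decay` value are hypothetical. The slides do not state the actual weights.

```python
def rank_root_causes(branches, decay=0.5):
    """branches: tokens guarding the nested branches, outermost first."""
    weights = {}
    weight = 1.0
    # Walk outward from the branch closest to the error, shrinking the weight.
    for token in reversed(branches):
        weights[token] = weights.get(token, 0.0) + weight
        weight *= decay
    return sorted(weights, key=weights.get, reverse=True)


ranking = rank_root_causes(["x", "y", "z"])
```

Under these assumptions the branch on z, which is closest to `ERROR()`, ranks ahead of y and then x.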

slide-42
SLIDE 42

Heuristics: Pros and Cons

[Table: rows and columns labeled Bounded horizon, Single mistake, Weighting, Simplify control flow analysis, Improve performance, Reduce FP, Increase FP, Increase FN; check marks not recoverable]

FP = False Positive, FN = False Negative

slide-43
SLIDE 43

ConfAid and Multi-process Apps

  • ConfAid propagates taints between processes
      – Intercepts IPC system calls
      – Sends taint along with the data
  • ConfAid currently supports communication via:
      – Unix sockets, pipes, TCP and UDP sockets
      – Regular files
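
To illustrate "sends taint along with the data", here is an application-level sketch that frames each message with its taint set. ConfAid does this transparently by intercepting IPC system calls; the framing format and the helper names below are invented for illustration only.

```python
import json
import socket
import struct


def send_with_taint(sock, data, taint):
    """Prefix the payload with a length-delimited JSON header holding the taint."""
    header = json.dumps(sorted(taint)).encode()
    sock.sendall(struct.pack("!II", len(header), len(data)) + header + data)


def _recv_exact(sock, n):
    # Read exactly n bytes, since recv may return fewer.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf += chunk
    return buf


def recv_with_taint(sock):
    header_len, data_len = struct.unpack("!II", _recv_exact(sock, 8))
    taint = frozenset(json.loads(_recv_exact(sock, header_len)))
    return _recv_exact(sock, data_len), taint


# A connected socket pair stands in for two cooperating processes.
sender, receiver = socket.socketpair()
send_with_taint(sender, b"hello", {"ExecCGI"})
data, taint = recv_with_taint(receiver)
sender.close()
receiver.close()
```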

slide-44
SLIDE 44

Outline

  • Motivation
  • How ConfAid runs
  • Information flow analysis algorithms
  • Embracing imprecise analysis
  • Evaluation
  • Conclusion

slide-45
SLIDE 45

Evaluation

  • ConfAid debugs misconfiguration in:
      – OpenSSH 5.1 (2 processes)
      – Apache HTTP server 2.2.14 (1 process)
      – Postfix mail transfer agent 2.7 (up to 6 processes)
  • Manually inject errors to configuration files
  • Evaluation metrics:
      – The ranking of the correct root cause
      – The time to execute the application with ConfAid

slide-46
SLIDE 46

Data Sets

  • Real-world misconfigurations:
      – total of 18 bugs from manuals, forums and FAQs
  • Randomly generated bugs:
      – 60 bugs using ConfErr [Keller et al. DSN 08]

slide-50
SLIDE 50

How Effective is ConfAid?

Columns: Total tokens, First, First tied w/1, Second, Second tied w/1, Worse than second

OpenSSH  47-49  2  2  2  1
Apache   88-93  3  1  2
Postfix  27-29  5  5

Correct root cause ranked first or second for all 18 real-world bugs

72% ranked first, 28% ranked second, 0% worse than second

slide-54
SLIDE 54

How Effective is ConfAid?

Columns: Total tokens, First, First tied w/1, Second, Second tied w/1, Worse than second

OpenSSH  47  17  1  1  1
Apache   88  17  1  1  1
Postfix  27  15  2  3

Correct root cause ranked first or second for 55 out of 60 randomly-generated bugs

85% ranked first, 7% ranked second, 8% worse than second
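
The reported percentages can be checked against whole-number bug counts. The counts below (13 and 5 of 18; 51, 4, and 5 of 60) are inferred from the percentages and totals, not read directly from the slides, so treat them as assumptions.

```python
def pct(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


# Inferred counts (assumption): real-world bugs, 18 in total.
real_first, real_second, real_total = 13, 5, 18
# Inferred counts (assumption): randomly generated bugs, 60 in total.
rand_first, rand_second, rand_worse, rand_total = 51, 4, 5, 60

real = (pct(real_first, real_total), pct(real_second, real_total))
rand = (pct(rand_first, rand_total),
        pct(rand_second, rand_total),
        pct(rand_worse, rand_total))
first_or_second = rand_first + rand_second
```

With these counts, the first-or-second total for the random bugs comes to 55, matching the slide.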

slide-55
SLIDE 55

How Fast is ConfAid?

Average execution time, real-world bugs:
    OpenSSH  52 seconds
    Apache   2 minutes 48 seconds
    Postfix  57 seconds
    Average: 1m 32s

Average execution time, randomly-generated bugs:
    OpenSSH  7 seconds
    Apache   24 seconds
    Postfix  38 seconds
    Average: 23s
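
The overall averages are consistent with per-application averages weighted by the number of bugs. The bug counts per application (7, 6, and 5 real-world; 20 each random) are assumptions taken from the row sums of the result tables, and `weighted_mean` and `fmt` are helper names introduced here.

```python
def weighted_mean(times, counts):
    """Average of per-application times, weighted by bug count."""
    return sum(t * n for t, n in zip(times, counts)) / sum(counts)


def fmt(seconds):
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


# Per-application averages in seconds: OpenSSH, Apache, Postfix.
real_times = [52, 2 * 60 + 48, 57]
real_counts = [7, 6, 5]        # assumption: row sums, 18 bugs in total
rand_times = [7, 24, 38]
rand_counts = [20, 20, 20]     # assumption: row sums, 60 bugs in total

real_avg = fmt(round(weighted_mean(real_times, real_counts)))
rand_avg = fmt(round(weighted_mean(rand_times, rand_counts)))
```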

slide-56
SLIDE 56

Conclusion

  • ConfAid automatically finds root cause of problems
  • ConfAid uses dynamic information flow analysis
  • ConfAid ranks the correct root cause as first or second in:
      – 18 out of 18 real-world bugs
      – 55 out of 60 random bugs
  • ConfAid takes only a few minutes to run

slide-57
SLIDE 57

Questions?

slide-58
SLIDE 58

What if there are multiple mistakes?

  • ConfAid may or may not report all of them
  • For independent mistakes, ConfAid first finds the one that led to the first failure
  • For dependent mistakes, ConfAid may report all based on their effect on the program

slide-59
SLIDE 59

Effect of Bounded Horizon Heuristic

[Figure: execution time (seconds) versus maximum # of explored instructions; legend: OpenSSH Server Postfix]

slide-60
SLIDE 60

Effect of Weighting Heuristic

[Figure: false positives for OpenSSH (max # tokens: 49), Apache (max # tokens: 93), and Postfix (max # tokens: 5)]