CS510 Software Engineering: Dynamic Program Analysis




SLIDE 1

CS510 Software Engineering

Dynamic Program Analysis

Asst. Prof. Mathias Payer
Department of Computer Science, Purdue University
TA: Scott A. Carr
Slides inspired by Xiangyu Zhang
http://nebelwelt.net/teaching/15-CS510-SE

Spring 2015

SLIDE 2

Overview

Table of Contents

1. Overview
2. DPA Primitives
3. Tracing definition
4. Use-cases for Tracing
5. How to Trace
  • Source to Source Instrumentation
  • Binary Instrumentation
  • FastBT, Generating Fast Binary Translators
6. Reducing Trace Size
  • Basic block-level Tracing
  • Alternatives to Reduce Trace Size
  • Compression Using Value Predictors

Mathias Payer (Purdue University) CS510 Software Engineering 2015 2 / 35

SLIDE 3

Overview

Overview

  • Dynamic program analysis tackles software dependability and productivity problems by inspecting software execution.
  • A program execution captures the runtime behavior of a program (think of the program as a class and an execution as an object).
  • Dynamic analysis follows one path through the program: each statement is executed between 0 and N times.
  • The analysis is restricted to that single path.
  • All variables are instantiated, which solves the aliasing problem of static analysis.

SLIDE 4

Overview

Advantages

Relatively low learning curve. Precision. Applicability. Scalability.

SLIDE 5

Overview

Disadvantages?

Neither generalizable nor complete. Limited to available test-cases. Possible runtime constraints (Heisenbugs)

SLIDE 7

DPA Primitives

Dynamic Program Analysis Primitives

  • Tracing
  • Profiling
  • Checkpoint and replay
  • Dynamic slicing
  • Execution indexing
  • Delta debugging

SLIDE 8

DPA Primitives

Applications

  • Taint tracking
  • Dynamic information flow tracking
  • Automated debugging

SLIDE 10

Tracing definition

Tracing definition

Tracing: a lossless process that faithfully records detailed information about a program's execution. Tracing is a basic and simple primitive.

SLIDE 11

Tracing definition

Types of Tracing

  • Control-flow tracing (sequence of executed statements)
  • Dependence tracing (sequence of exercised dependences)
  • Value tracing (sequence of values produced by each instruction)
  • Memory access tracing (sequence of memory accesses during execution)

SLIDE 13

Use-cases for Tracing

Use-cases for Tracing

  • Debugging: time-travel to understand interactions
  • Code optimizations: hot program paths, data compression, value speculation, data locality for cache optimization
  • Security: malware analysis
  • Testing: code coverage

SLIDE 15

How to Trace

Tracing by printf

    int max = 0;
    for (p = head; p; p = p->next) {
        printf("in loop\n");
        if (p->value > max) {
            printf("True branch\n");
            max = p->value;
        }
    }

SLIDE 16

How to Trace Source to Source Instrumentation

Tracing by Source-Level Instrumentation

1. Parse a source file into an AST.
2. Annotate the AST with instrumentation.
3. Translate the annotated trees into a new source file.
4. Compile the new sources.
5. Execute the program and produce a trace as a side effect.

SLIDE 17

How to Trace Source to Source Instrumentation

Source-Level Instrumentation Example

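    for (i = 1; i < 10; i++) {
        a[i] = b[i] * 5;
    }

[Figure: AST of the loop. The root "for" node has the children i, 1, 10 and the loop body "=", whose operands are "[]" (a, i) and "*" with the operands "[]" (b, i) and 5.]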

SLIDE 18

How to Trace Source to Source Instrumentation

Source-Level Instrumentation Example (2)

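    for (i = 1; i < 10; i++) {
        printf("In loop\n");
        a[i] = b[i] * 5;
    }

[Figure: AST of the instrumented loop. The root "for" node has the children i, 1, 10 and a sequence node ";" whose children are the new "printf" node and the original "=" subtree with the operands "[]" (a, i) and "*" ("[]" (b, i), 5).]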
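The annotate and translate steps can be sketched on a toy tree. This is a minimal sketch, not a real instrumentation tool: the node type, instrument() and emit() are hypothetical names, the parse step is skipped (the tree is built by hand), and statements are kept as plain text instead of full expression trees.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { NODE_STMT, NODE_FOR } node_kind;

typedef struct node {
    node_kind kind;
    const char *text;      /* statement text, or the for-loop header */
    struct node *body[8];  /* loop body (NODE_FOR only) */
    int n_body;
} node;

/* Annotate: insert a tracing statement at the top of every loop body. */
void instrument(node *n) {
    if (n->kind != NODE_FOR) return;
    for (int i = 0; i < n->n_body; i++) instrument(n->body[i]);
    if (n->n_body >= 8) return;            /* toy limit: body is full */
    node *probe = calloc(1, sizeof *probe); /* never freed in this sketch */
    if (!probe) return;
    probe->kind = NODE_STMT;
    probe->text = "printf(\"In loop\\n\");";
    memmove(&n->body[1], &n->body[0], (size_t)n->n_body * sizeof n->body[0]);
    n->body[0] = probe;
    n->n_body++;
}

/* Translate: append the tree to out (a NUL-terminated buffer) as source text. */
void emit(const node *n, int depth, char *out, size_t cap) {
    size_t len = strlen(out);
    if (n->kind == NODE_STMT) {
        snprintf(out + len, cap - len, "%*s%s\n", depth * 4, "", n->text);
        return;
    }
    snprintf(out + len, cap - len, "%*sfor (%s) {\n", depth * 4, "", n->text);
    for (int i = 0; i < n->n_body; i++) emit(n->body[i], depth + 1, out, cap);
    len = strlen(out);
    snprintf(out + len, cap - len, "%*s}\n", depth * 4, "");
}
```

Building the loop from the example by hand, calling instrument() on it and then emit() into an empty buffer yields the instrumented source shown on the slide. A real tool would obtain the tree from a compiler front end instead.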

SLIDE 19

How to Trace Source to Source Instrumentation

Characteristics of Source-Level Instrumentation

Detailed type and variable information available. Detailed control-flow structures available. No support for pre-compiled libraries or binaries. Limited support for multi-lingual programs. Requires full source-code.

SLIDE 20

How to Trace Binary Instrumentation

Tracing by Binary Instrumentation

1. Parse the binary into an intermediate representation (IR) and generate graph data structures such as the CFG.
2. Instrument the IR with tracing nodes.
3. Compile/assemble back to an executable for static binary instrumentation, or use a JIT to execute on-the-fly.

SLIDE 21

How to Trace Binary Instrumentation

Characteristics of Binary-Level Instrumentation

No source-code needed. Supports libraries and any executable. Possibly high overhead due to instrumentation and translation. Only limited information about scopes and high-level data structures is available.

SLIDE 22

How to Trace FastBT, Generating Fast Binary Translators

FastBT

Enable fast, efficient instrumentation at low overhead. Instead of converting machine code to an IR, translate using pre-generated tables. Define a set of translation actions that add instrumentation when dispatched. Use a code-cache to lower overhead. Challenge: define translation actions for instructions that change control-flow.

SLIDE 23

How to Trace FastBT, Generating Fast Binary Translators

FastBT Overview

  • Translates individual basic blocks
  • Verifies code source / destination
  • Checks branch targets and origins

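[Figure: the translator copies basic blocks 1, 2, 3, ... from the original code into the code cache as 1', 2', 3', ...; a mapping table links each original block to its translated copy. The figure labels the two code regions R and RX.]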

Indirect control flow transfers use a dynamic check to verify target and origin

Reading material: Generating low-overhead dynamic binary translators, Mathias Payer and Thomas R. Gross, SYSTOR'10 (see course homepage).

SLIDE 25

Reducing Trace Size

Fine-grained Tracing is Expensive!

    1  int sum = 0;
    2  int i = 1;
    3  while (i < N) {
    4      i++;
    5      sum = sum + i;
    6  }
    7  printf("Sum: %d\n", sum);

Trace (N = 6): 1, 2, 3, 4, 5, 3, 4, 5, 3, 4, 5, 3, 4, 5, 3, 4, 5, 3, 7.
Space complexity: execution length * sizeof(void*)
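A statement-level trace of this loop can be produced with one recording call per executed statement. The sketch below is hedged: trace_stmt(), stmt_trace and traced_sum() are hypothetical names, entries are stored as int line numbers rather than pointer-sized values, and the closing brace on line 6 is assumed not to be recorded.

```c
#include <stddef.h>

#define STMT_TRACE_CAP 1024

int stmt_trace[STMT_TRACE_CAP];  /* recorded line numbers */
size_t stmt_trace_len;

/* Append one executed line number; silently drop entries once the buffer is full. */
static void trace_stmt(int line) {
    if (stmt_trace_len < STMT_TRACE_CAP) stmt_trace[stmt_trace_len++] = line;
}

/* The loop from the slide with a probe in front of every statement.
   The printf on line 7 is replaced by returning the sum. */
int traced_sum(int n) {
    stmt_trace_len = 0;
    trace_stmt(1); int sum = 0;
    trace_stmt(2); int i = 1;
    while (trace_stmt(3), i < n) {   /* the condition is traced on every evaluation */
        trace_stmt(4); i++;
        trace_stmt(5); sum = sum + i;
    }
    trace_stmt(7);
    return sum;
}
```

For n = 6 the loop body runs five times, so the buffer holds 2 + 5 * 3 + 1 + 1 = 19 entries, one per executed statement.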

SLIDE 26

Reducing Trace Size Basic block-level Tracing

Basic block-level Tracing

    1  int sum = 0;
    2  int i = 1;
    3  while (i < N) {
    4      i++;
    5      sum = sum + i;
    6  }
    7  printf("Sum: %d\n", sum);

BB trace: 1-2, 3, 4-5, 3, 4-5, 3, 4-5, 3, 4-5, 3, 4-5, 3, 7
In this example the trace needs only 13 entries instead of 19.
Drawback: seeking inside a basic block is more complicated.
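The same loop can be traced with one probe per basic block instead of one per statement. This is a sketch under assumptions: the block numbering (BB_INIT, BB_COND, BB_BODY, BB_EXIT) and the names trace_bb() and traced_sum_bb() are made up for illustration.

```c
#include <stddef.h>

#define BB_TRACE_CAP 1024

/* Hypothetical block ids: lines 1-2, line 3, lines 4-5, line 7. */
enum { BB_INIT, BB_COND, BB_BODY, BB_EXIT };

unsigned char bb_trace[BB_TRACE_CAP];
size_t bb_trace_len;

/* Append one executed block id; drop entries once the buffer is full. */
static void trace_bb(int bb) {
    if (bb_trace_len < BB_TRACE_CAP) bb_trace[bb_trace_len++] = (unsigned char)bb;
}

/* One probe per basic block: straight-line statements share a single entry. */
int traced_sum_bb(int n) {
    bb_trace_len = 0;
    trace_bb(BB_INIT);
    int sum = 0;
    int i = 1;
    while (trace_bb(BB_COND), i < n) {
        trace_bb(BB_BODY);
        i++;
        sum = sum + i;
    }
    trace_bb(BB_EXIT);
    return sum;
}
```

For n = 6 this records 1 + 6 + 5 + 1 = 13 entries. Recovering the individual statements requires a side table that maps each block id to its line range, which is why seeking inside a block costs more.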

SLIDE 27

Reducing Trace Size Alternatives to Reduce Trace Size

Other options to reduce trace size?

  • Function-level tracing: record functions and their parameters. (What about side effects?)
  • Predicate tracing: record all branch predicates from the beginning of execution; needs only one bit per branch. (Seeking is hard.)
  • Path-based tracing: record the path through the CFG. (Needs heavy-weight data structures.)
  • Compression using, e.g., deflate. (Relies on decompression; no seeking.)
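Predicate tracing can be sketched by wrapping every branch condition in a recorder that stores one bit per outcome. The names pred(), pred_at() and max_of() are hypothetical, and the bit layout (least significant bit first within each byte) is an assumption of this sketch.

```c
#include <stddef.h>
#include <stdint.h>

#define PRED_CAP_BYTES 128

uint8_t pred_bits[PRED_CAP_BYTES];  /* packed branch outcomes */
size_t pred_count;                  /* number of recorded outcomes */

/* Record one branch outcome as a single bit and return it unchanged. */
int pred(int taken) {
    if (pred_count < PRED_CAP_BYTES * 8) {
        uint8_t mask = (uint8_t)(1u << (pred_count % 8));
        if (taken) pred_bits[pred_count / 8] |= mask;
        else       pred_bits[pred_count / 8] &= (uint8_t)~mask;
        pred_count++;
    }
    return taken != 0;
}

/* Read back the k-th recorded outcome (k must be below pred_count). */
int pred_at(size_t k) {
    return (pred_bits[k / 8] >> (k % 8)) & 1;
}

/* Example client: every loop test and every if test goes through pred(). */
int max_of(const int *v, int n) {
    int max = 0;
    pred_count = 0;
    for (int i = 0; pred(i < n); i++) {
        if (pred(v[i] > max)) max = v[i];
    }
    return max;
}
```

The recorded bits only have meaning when replayed against the same program from the start, which is the reason seeking is hard: the k-th bit does not say which branch it belongs to.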

SLIDE 28

Reducing Trace Size Compression Using Value Predictors

Last n Values Predictor: Compression

  • The buffer stores the last n unique values encountered.
  • If the next value is one of the n values, the index into the buffer is emitted (prefixed with symbol 0).
  • Otherwise (mis-prediction), the encountered value is stored in the encoded trace (prefixed with symbol m) and the buffer is updated with a least-recently-used strategy.

Example: 123 456 456 456 456 123 123 789 456
Using a last-2 predictor: m 123 m 456 00 00 00 01 01 m 789 m 456
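The compression step can be sketched as follows. Several details are assumptions chosen so that the sketch reproduces the example above: new values are inserted at index 0, hits do not reorder the buffer, the evicted entry is the least recently used one, and the output is a readable text stream rather than packed bits. All identifiers are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

#define LAST_N 2

typedef struct {
    long value[LAST_N];
    unsigned long last_use[LAST_N];  /* logical time of the last hit or insert */
    int used;                        /* number of valid entries */
    unsigned long clock;
} lastn_buf;

/* Return the index of v in the buffer, or -1 if it is absent. */
static int lastn_find(const lastn_buf *b, long v) {
    for (int s = 0; s < b->used; s++)
        if (b->value[s] == v) return s;
    return -1;
}

/* Miss: insert v at index 0, dropping the least recently used entry if full. */
static void lastn_insert(lastn_buf *b, long v) {
    int victim;
    if (b->used < LAST_N) {
        victim = b->used++;
    } else {
        victim = 0;
        for (int s = 1; s < LAST_N; s++)
            if (b->last_use[s] < b->last_use[victim]) victim = s;
    }
    for (int s = victim; s > 0; s--) {  /* shift entries above the victim down */
        b->value[s] = b->value[s - 1];
        b->last_use[s] = b->last_use[s - 1];
    }
    b->value[0] = v;
    b->last_use[0] = ++b->clock;
}

/* Encode vals as text: "0<index>" on a hit, "m <value>" on a mis-prediction. */
void lastn_encode(const long *vals, size_t count, char *out, size_t cap) {
    lastn_buf b = {0};
    size_t len = 0;
    if (cap == 0) return;
    out[0] = '\0';
    for (size_t k = 0; k < count; k++) {
        const char *sep = k ? " " : "";
        int hit = lastn_find(&b, vals[k]);
        int w;
        if (hit >= 0) {
            b.last_use[hit] = ++b.clock;
            w = snprintf(out + len, cap - len, "%s0%d", sep, hit);
        } else {
            lastn_insert(&b, vals[k]);
            w = snprintf(out + len, cap - len, "%sm %ld", sep, vals[k]);
        }
        if (w < 0 || (size_t)w >= cap - len) break;  /* output buffer exhausted */
        len += (size_t)w;
    }
}
```

A packed encoding would spend one bit on the 0/m symbol and log2(n) bits on the index; the text form is used here only to make the output comparable with the slide.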

SLIDE 29

Reducing Trace Size Compression Using Value Predictors

Last n Values Predictor: Decompression

Take one bit from the encoded trace. If it is the m symbol, read the next value and update the buffer. If it is the 0 symbol, read the index and output the value from the buffer. n-value predictors are related to Run-Length Encoding (RLE).
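Decompression mirrors compression: the decoder maintains the same buffer and applies the same update rule, so indices resolve to the same values. The sketch below assumes a text encoding with tokens "m <value>" and "0<index>", insertion at index 0 and least-recently-used eviction; these choices and all identifiers are assumptions, not taken from the slides.

```c
#include <stddef.h>
#include <stdlib.h>

#define LAST_N 2

typedef struct {
    long value[LAST_N];
    unsigned long last_use[LAST_N];  /* logical time of the last hit or insert */
    int used;
    unsigned long clock;
} lastn_buf;

/* Miss: insert v at index 0, dropping the least recently used entry if full. */
static void lastn_insert(lastn_buf *b, long v) {
    int victim;
    if (b->used < LAST_N) {
        victim = b->used++;
    } else {
        victim = 0;
        for (int s = 1; s < LAST_N; s++)
            if (b->last_use[s] < b->last_use[victim]) victim = s;
    }
    for (int s = victim; s > 0; s--) {
        b->value[s] = b->value[s - 1];
        b->last_use[s] = b->last_use[s - 1];
    }
    b->value[0] = v;
    b->last_use[0] = ++b->clock;
}

/* Decode the text stream into out; returns the number of values written. */
size_t lastn_decode(const char *enc, long *out, size_t cap) {
    lastn_buf b = {0};
    size_t n = 0;
    const char *s = enc;
    while (n < cap) {
        while (*s == ' ') s++;
        if (*s != 'm' && *s != '0') break;  /* end of input or malformed token */
        char *end;
        long x = strtol(s + 1, &end, 10);   /* the value after m, or the index after 0 */
        if (end == s + 1) break;
        if (*s == 'm') {
            lastn_insert(&b, x);
            out[n++] = x;
        } else {
            if (x < 0 || x >= b.used) break;  /* index outside the buffer */
            b.last_use[x] = ++b.clock;
            out[n++] = b.value[x];
        }
        s = end;
    }
    return n;
}
```

Decoding the stream "m 123 m 456 00 00 00 01 01 m 789 m 456" with this sketch gives back the nine original values. Because the buffer state depends on everything decoded so far, the stream can only be read from the beginning.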

SLIDE 30

Reducing Trace Size Compression Using Value Predictors

Finite Context Method (FCM)

  • Construct a lookup-table that predicts a value based on the last n values (2-FCM, 3-FCM).
  • If the next value is correctly predicted using the left context, a 0-bit is emitted to the encoded trace.
  • Otherwise (mis-prediction), an m-symbol and the original value are emitted to the trace. The lookup-table is updated accordingly.

Example (3-FCM):
Input:   1 2 3 4 5 3 4 5 ... 3 4 5 6
Encoded: m 1 m 2 m 3 m 4 m 5 m 3 m 4 m 5 0 ... 0 0 0 m 6
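A 3-FCM encoder can be sketched with a small hashed lookup table. The table size, the hash function, the all-zero initial context and the text output format are assumptions of this sketch, and the identifiers are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

#define FCM_ORDER 3
#define FCM_SLOTS 1024

typedef struct {
    long ctx[FCM_ORDER];  /* context that owns this slot */
    long next;            /* value that followed this context last time */
    int valid;
} fcm_entry;

typedef struct {
    fcm_entry table[FCM_SLOTS];
    long ctx[FCM_ORDER];  /* last FCM_ORDER values, oldest first (zeros at start) */
} fcm_model;

static size_t fcm_hash(const long *ctx) {
    unsigned long h = 0;
    for (int i = 0; i < FCM_ORDER; i++) h = h * 31u + (unsigned long)ctx[i];
    return (size_t)(h % FCM_SLOTS);
}

/* Return 1 if v was predicted from the left context; update table and context. */
static int fcm_step(fcm_model *f, long v) {
    fcm_entry *e = &f->table[fcm_hash(f->ctx)];
    int same = e->valid;
    for (int i = 0; i < FCM_ORDER && same; i++) same = (e->ctx[i] == f->ctx[i]);
    int hit = same && e->next == v;
    if (!hit) {  /* learn: this context is now followed by v */
        for (int i = 0; i < FCM_ORDER; i++) e->ctx[i] = f->ctx[i];
        e->next = v;
        e->valid = 1;
    }
    for (int i = 0; i + 1 < FCM_ORDER; i++) f->ctx[i] = f->ctx[i + 1];
    f->ctx[FCM_ORDER - 1] = v;
    return hit;
}

/* Encode vals as text: "0" on a correct prediction, "m <value>" otherwise. */
void fcm_encode(const long *vals, size_t count, char *out, size_t cap) {
    fcm_model f = {0};
    size_t len = 0;
    if (cap == 0) return;
    out[0] = '\0';
    for (size_t k = 0; k < count; k++) {
        const char *sep = k ? " " : "";
        int w = fcm_step(&f, vals[k])
            ? snprintf(out + len, cap - len, "%s0", sep)
            : snprintf(out + len, cap - len, "%sm %ld", sep, vals[k]);
        if (w < 0 || (size_t)w >= cap - len) break;  /* output buffer exhausted */
        len += (size_t)w;
    }
}
```

On the sequence 1 2 3 4 5 3 4 5 3 4 5 3 4 5 6 the first eight values are mis-predictions because their contexts are new, the repeated 3 4 5 pattern is then predicted, and the final 6 breaks the pattern. A decoder would run the same fcm_step() logic with the decoded values to keep its table in sync.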

SLIDE 31

Reducing Trace Size Compression Using Value Predictors

FCM Characteristics

Length (compressed): n / sizeof(void*) + n * (1 - predict rate). Predictors are better than deflate due to repetitive loop patterns. Drawback: the trace is only forward traversable.

SLIDE 32

Reducing Trace Size Compression Using Value Predictors

Bidirectional Compression

  • Use a small sliding window of clear text on the compressed string (just like with FCM). [1]
  • Keep both a left-context and a right-context lookup table (instead of just a left-context lookup table).
  • Moving forward: decompress the next value using the left-context lookup table (the sliding window is now n+1), then compress the first value using the right-context lookup table (the sliding window is now n again).
  • Moving backward: decompress using the right-context table, compress using the left-context table.

[1] The left and right side of the window stay compressed.

SLIDE 33

Reducing Trace Size Compression Using Value Predictors

Bidirectional Compression: Example

... A X Y 0 ...
Left-context: A X Y: Z
Compress right-context: X Y Z: A [2]

[2] If the prediction is correct, emit 0 to the right-context stream; otherwise update the table and emit an m symbol.

SLIDE 34

Reducing Trace Size Compression Using Value Predictors

Bidirectional Predictor Characteristics

Almost the same compression rate as unidirectional predictors (possibly slightly worse because the prediction rates for forward and backward traversal differ). Fast compression/decompression (two times slower than unidirectional predictors).

SLIDE 35

Reducing Trace Size Compression Using Value Predictors

Questions?
