Dynamic and Adaptive Calling Context Encoding

slide-1
SLIDE 1

Dynamic and Adaptive Calling Context Encoding

Jianjun Li, Zhenjiang Wang, Chenggang Wu
State Key Laboratory of Computer Architecture, Institute of Computing Technology, CAS

Wei-Chung Hsu
Department of Computer Sciences, National Taiwan University

Di Xu
IBM Research - China

CGO 2014, Orlando, Florida

1

slide-2
SLIDE 2

Introduction

  • A calling context is the sequence of active functions on the call stack.
  • Calling contexts play an important role in a wide range of software development processes:
      • Testing
      • Debugging and error reporting
      • Program analysis
      • Security enforcement

2

slide-3
SLIDE 3

Existing Approaches

  • Accurate calling context
      • Stack walking, calling context trees, or calling context uptrees
          • High overhead
      • Precise calling context encoding (ICSE’2010)
          • Static encoding method; works only on a complete call graph
          • Unable to handle dynamic loading and virtual dispatch
  • Inaccurate calling context
      • Inferred Call Path Profiling (OOPSLA ’09)
          • Low overhead but not precise enough
      • Hash-based path encoding: Probabilistic Calling Context (OOPSLA ’07), Breadcrumbs (PLDI’2010)
          • Trades accuracy for performance

3

slide-4
SLIDE 4

Background: Calling Context Encoding

  • Calling context encoding
      • Based on the Ball-Larus path encoding algorithm (BL algorithm)
      • Encodes a call path as an integer
      • Accurate calling context
      • Low overhead

[Figure: example call graph with nodes A, B, C, D, E, F; node annotations 1, 1, 1, 2, 2, 2; one edge annotated +1.]
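To make the idea concrete, here is a minimal sketch of a BL-style encoding pass over an acyclic call graph. It is an illustration under stated assumptions, not the authors' implementation: the function name `encode_call_graph` and the edge list `EDGES` are hypothetical, and the edges were chosen only so that the result agrees with the annotations in the figure (context counts 1, 1, 1, 2, 2, 2 and a single +1 edge).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def encode_call_graph(edges):
    """Return (num_cc, increment) for an acyclic call graph.

    edges: list of (caller, callee) pairs.
    num_cc[f]: number of distinct calling contexts that can reach f.
    increment[(caller, callee)]: value added to the context id on that call.
    """
    callers = {}
    nodes = set()
    for caller, callee in edges:
        callers.setdefault(callee, []).append(caller)
        nodes.update((caller, callee))

    # Visit every function only after all of its callers have been visited.
    order = TopologicalSorter({f: callers.get(f, []) for f in nodes}).static_order()

    num_cc, increment = {}, {}
    for f in order:
        if f not in callers:          # entry function: exactly one context
            num_cc[f] = 1
            continue
        total = 0
        for caller in callers[f]:     # each incoming edge gets a disjoint id range
            increment[(caller, f)] = total
            total += num_cc[caller]
        num_cc[f] = total
    return num_cc, increment


# Hypothetical edge list (an assumption; the slide only shows the annotations).
EDGES = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E"), ("D", "F")]
num_cc, increment = encode_call_graph(EDGES)
max_id = max(num_cc.values()) - 1     # largest id any context can take
```

With this edge list only the call C→D receives a non-zero increment, so a single addition at that call site is enough to tell the two contexts of D, E, and F apart.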

4

slide-5
SLIDE 5

Background: Calling Context Encoding

  • Problems:
      • Static encoding method; works only on a complete call graph
      • Unable to handle dynamic loading and virtual dispatch
      • Needs profiling runs or pointer analysis to identify the targets of indirect calls
      • Not efficient in encoding space

5

slide-6
SLIDE 6

Outline

  • Our Goals and Key Challenges
  • Dynamic Encoding Method
  • Adaptive Encoding Method
  • Experimental Results
  • Summary

6

slide-7
SLIDE 7

Our goals

A dynamic and adaptive context encoding algorithm that:

  • Does not need extra profiling runs or static program analysis
  • Handles dynamic loading
  • Adapts to program behavior changes
  • Is efficient in encoding space and time
  • Provides accurate context information

7

slide-8
SLIDE 8

Key Challenges

  • How to handle newly identified call edges?
      • Indirect call paths
      • Dynamic loading
  • How to ensure that the collected path ids are decoded correctly?
      • The encodings of call edges may change after adaptive encoding.

8

slide-9
SLIDE 9

Dynamic Encoding Method Overview

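Encoding space:

  • [0, maxID]: call paths that already exist when the call graph is encoded
  • [maxID+1, 2*maxID+1]: call paths that contain newly identified call edges

[Figure: example call graph with maxID=4; nodes A, B, C, D, E, F, I; node annotations 1, 1, 1, 2, 2, 2, 5; edge annotations +1, +4, +2.]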

9

slide-12
SLIDE 12

Dynamic Encoding

  • Initially, the call graph contains only the entry function “main”.
  • Replace all function call instructions with “call rtHandler”.
  • In rtHandler, update the call graph and instrument that edge.

[Figure: call graph with main and the newly discovered callees A, C, D.]

Instrumented call site:

    save the encoding context
    id = maxID + 1
    call A
    restore the encoding context

10

slide-13
SLIDE 13

Adaptive Encoding

  • Why adaptive encoding?
      • Reduce the runtime overhead
      • Adapt to the program’s runtime behavior
  • Trigger conditions of adaptive encoding:
      • The number of identified call edges reaches a threshold.
      • The frequently invoked call paths have changed.
      • The helper stack is frequently accessed.

11

slide-14
SLIDE 14

Adaptive Encoding

  • Adaptive encoding process:
      • Decode and analyze the collected contexts, and mark the frequently invoked call edges.
      • Encode the call graph, and adjust the encodings according to the invocation frequency.
      • Instrument the program with the new encodings.
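The slides do not spell out how encodings are adjusted by frequency. One plausible reading, sketched below as an assumption, is that the hottest incoming edge of each function is given increment 0, since a call edge with increment 0 needs no id update at all. The function `assign_increments` and the frequency numbers are hypothetical.

```python
def assign_increments(callers, num_cc, freq):
    """Assign id increments to the incoming edges of one callee.

    callers: names of the functions that call this callee.
    num_cc:  number of calling contexts of each caller.
    freq:    observed invocation count of each incoming edge, keyed by caller.
    Returns (increment per caller, number of contexts of the callee).
    """
    # Hottest edge first, so that it receives increment 0.
    ordered = sorted(callers, key=lambda c: freq.get(c, 0), reverse=True)
    increment, total = {}, 0
    for caller in ordered:
        increment[caller] = total      # start of this edge's id range
        total += num_cc[caller]        # reserve one id per caller context
    return increment, total


# Hypothetical data: three callers of one function, with E as the hot caller.
inc, contexts = assign_increments(
    ["F", "E", "C"],
    num_cc={"F": 2, "E": 2, "C": 1},
    freq={"F": 10, "E": 900, "C": 1},
)
```

The id ranges stay disjoint whatever the order, so reordering by frequency changes only which call sites pay for an addition, not the precision of the encoding.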

12

slide-16
SLIDE 16

Adaptive Encoding

[Figure: successive snapshots of the call graph as it grows from main and A to include C, B, D, and E, with one edge annotated +1 in the later snapshots; the snapshots are grouped under timestamp=0, timestamp=1, and timestamp=2.]

13

slide-17
SLIDE 17

Recursive Calls

  • The BL path encoding algorithm works only on acyclic graphs.
  • Recursive call paths are encoded into the range [maxID+1, 2*maxID+1].
  • For highly repetitive recursive calls, the saved encoding contexts are compressed.
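The slides do not say how the saved encoding contexts are compressed. One simple possibility, shown here purely as an assumption, is run-length counting: when a recursive call pushes the same context that is already on top of the helper stack, a counter is incremented instead of storing another copy. The names `push_compressed` and `pop_compressed` are hypothetical.

```python
def push_compressed(stack, entry):
    """Push entry; identical consecutive entries share one slot with a counter."""
    if stack and stack[-1][0] == entry:
        stack[-1][1] += 1
    else:
        stack.append([entry, 1])


def pop_compressed(stack):
    """Pop one logical entry and return it."""
    entry, count = stack[-1]
    if count == 1:
        stack.pop()
    else:
        stack[-1][1] = count - 1
    return entry


# A function F that calls itself 1000 times saves the same context each time.
helper_stack = []
for _ in range(1000):
    push_compressed(helper_stack, (5, "F", "F"))   # (saved id, caller, callee)
slots_used = len(helper_stack)
first_popped = pop_compressed(helper_stack)
```

Here 1000 logical entries occupy a single slot, which is the kind of saving that matters for deep, repetitive recursion.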

14

slide-18
SLIDE 18

Indirect Calls

  • An indirect call may have multiple targets.
  • After re-encoding, the identified targets are instrumented separately.

slide-19
SLIDE 19

Decoding Mechanism

  • The call graph grows dynamically as the program runs.
  • To correctly decode a recorded context, we need the exact call graph and encoding information from the time the context was recorded.

15

slide-20
SLIDE 20

Decoding Algorithm

  • Use a flag “onstack” to indicate whether there is an unencoded call edge in the current sub-path.
  • If the encoding id of a sub-path is greater than maxID, adjust id = id-(maxID+1) and set onstack=true.
  • In each decoding iteration:
      1) If id=0 and onstack=true (i.e. id=maxID+1), try to match the decoded context with the saved encoding context on the top of the helper stack.
      2) Decode the acyclic sub-path.

16

slide-21
SLIDE 21

Encoding Example

[Figure: example call graph with maxID=4; nodes A, B, C, D, E, F, I; node annotations 1, 1, 1, 2, 2, 2, 5; edge annotations +1, +4, +2.]

Slides 21 to 29 build up the following trace one call at a time:

Last called   id   Helper stack (top first)
A
B
D
E
I             2
C             5    (2, I, C)
E             5    (5, C, E), (2, I, C)
I             7    (5, C, E), (2, I, C)
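The trace can be reproduced with a small simulation of the encoding side. This is a sketch under assumptions, not the authors' runtime: only maxID=4 and the +2 on edge E→I are confirmed by the slides, the remaining edges and increments in `INCREMENT` are hypothetical choices consistent with the figure, and a helper stack entry is read as (saved id, caller, callee). Returns are not modeled; on return the real instrumentation restores the saved context.

```python
MAX_ID = 4

# Encoded call edges and their increments (mostly assumed, see above).
INCREMENT = {
    ("A", "B"): 0, ("A", "C"): 0,
    ("B", "D"): 0, ("C", "D"): 1,
    ("D", "E"): 0, ("D", "F"): 0,
    ("F", "I"): 0, ("E", "I"): 2, ("C", "I"): 4,
}


def encode_path(path):
    """Follow a chain of calls and return (id, helper_stack, trace)."""
    ctx_id = 0
    helper_stack = []                       # top of the stack is the last element
    trace = [(path[0], ctx_id)]
    for caller, callee in zip(path, path[1:]):
        inc = INCREMENT.get((caller, callee))
        if inc is None:
            # Newly identified edge: save the context, restart at maxID + 1.
            helper_stack.append((ctx_id, caller, callee))
            ctx_id = MAX_ID + 1
        else:
            # Encoded edge: plain BL-style addition.
            ctx_id += inc
        trace.append((callee, ctx_id))
    return ctx_id, helper_stack, trace


final_id, stack, trace = encode_path("ABDEICEI")
```

The calls I→C and C→E are not in the encoded graph, so each pushes one helper stack entry and resets the id to 5; the final E→I call then adds 2, giving id 7 in function I.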

25

slide-30
SLIDE 30

Decoding Example

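[Figure: example call graph with maxID=4; nodes A, B, C, D, E, F, I; node annotations 1, 1, 1, 2, 2, 2, 5; edge annotations +1, +4, +2.]

Encoding result: pc in function I, id=7

Helper stack (top first): (5, C, E), (2, I, C)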

slide-31
SLIDE 31

Decoding Example

[Figure: example call graph, maxID=4.]

Encoding result: pc in function I, id=7

Helper stack (top first): (5, C, E), (2, I, C)

Decoding initialization:
a) Print “I”.
b) (id=7) > (maxID=4), so adjust id = id-(maxID+1) = 2 and set onstack=true.

slide-32
SLIDE 32

Current condition: pc in function I, id=2, onstack=true

[Figure: example call graph, maxID=4.]

Helper stack (top first): (5, C, E), (2, I, C)

Decoding step 1:
a) Since id!=0, continue decoding the current sub-path.
b) Edge EI is decoded, and id = 2-2 = 0.
c) Print “E”.

slide-33
SLIDE 33

Current condition: pc in function E, id=0, onstack=true

[Figure: example call graph, maxID=4.]

Helper stack (top first): (5, C, E), (2, I, C)

Decoding step 2:
a) Since id=0, onstack=true, and the encoding context in the helper stack’s top entry matches the current context, pop the top entry.
b) Restore the current encoding context from the popped entry.
c) Print “C”.

slide-34
SLIDE 34

Current condition: pc in function C, id=5, onstack=false

[Figure: example call graph, maxID=4.]

Helper stack: (2, I, C)

Decoding step 3:
a) (id=5) > (maxID=4), so adjust id = id-(maxID+1) = 0 and set onstack=true.
b) Since id=0, onstack=true, and the encoding context in the helper stack’s top entry matches the current context, pop the top entry.
c) Restore the current encoding context from the popped entry.
d) Print “I”.

slide-35
SLIDE 35

Current condition: pc in function I, id=2, onstack=false

[Figure: example call graph, maxID=4.]

Helper stack: (empty)

Decoding step 4:
a) Since onstack=false, the acyclic sub-path “ABDEI” is decoded.
b) Print “E”, “D”, “B”, “A”.

slide-37
SLIDE 37

Current condition: pc in function A, id=0, onstack=false

[Figure: example call graph, maxID=4.]

Helper stack: (empty)

Decoding step 5:
a) id=0 and the helper stack is empty, so the decoding process terminates.
b) Finally, we get the full path “ABDEICEI”.
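The decoding walk can be checked with a short sketch. It rests on assumptions: only maxID=4 and the +2 on edge E→I come from the slides, while the other edges, the increments in `IN_EDGES`, and the context counts in `NUM_CC` are hypothetical values consistent with the figure. Matching against the helper stack is simplified to comparing the callee of the top entry with the current function.

```python
MAX_ID = 4

# callee -> [(caller, increment)] for the encoded (acyclic) call graph.
IN_EDGES = {
    "B": [("A", 0)],
    "C": [("A", 0)],
    "D": [("B", 0), ("C", 1)],
    "E": [("D", 0)],
    "F": [("D", 0)],
    "I": [("F", 0), ("E", 2), ("C", 4)],
}
NUM_CC = {"A": 1, "B": 1, "C": 1, "D": 2, "E": 2, "F": 2, "I": 5}


def decode(func, ctx_id, helper_stack):
    """Rebuild the call path that ends in func with the given id."""
    stack = list(helper_stack)      # entries: (saved id, caller, callee); top is last
    path = [func]
    onstack = False
    while True:
        if ctx_id > MAX_ID:         # sub-path started at an unencoded edge
            ctx_id -= MAX_ID + 1
            onstack = True
        if ctx_id == 0 and onstack and stack and stack[-1][2] == func:
            # Cross the unencoded edge: restore the saved context of the caller.
            ctx_id, func, _ = stack.pop()
            onstack = False
            path.append(func)
            continue
        if func not in IN_EDGES:    # reached the entry function
            break
        for caller, inc in IN_EDGES[func]:
            # Pick the incoming edge whose id range contains ctx_id.
            if inc <= ctx_id < inc + NUM_CC[caller]:
                ctx_id -= inc
                func = caller
                path.append(func)
                break
        else:
            raise ValueError("id does not match the call graph")
    if ctx_id != 0 or stack:
        raise ValueError("leftover id or helper stack entries")
    return "".join(reversed(path))


full_path = decode("I", 7, [(2, "I", "C"), (5, "C", "E")])
```

Running it on the encoding result (function I, id=7, and the two helper stack entries) visits I, E, C, I, E, D, B, A in that order, which is the sequence printed across decoding steps 1 to 5.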

slide-38
SLIDE 38

Evaluation

  • Experimental framework
      • Implemented as a shared library
      • To verify the correctness of DACCE, we periodically collect context ids at runtime. We also capture the calling contexts with a stack-walking method. The contexts obtained by the two methods are cross-validated.
  • Benchmarks
      • SPEC CPU2006 (ref input set)
      • Parsec 2.1 (native input set)

33

slide-39
SLIDE 39

Benchmarks

Program         Nodes  Edges    maxID  depth  re-encode   calls/s
400.perlbench     684   3911  1.4E+11   0.20         23  29205101
401.bzip2          50    109       61   0.05          5   7687097
403.gcc          1931  11518  7.0E+13   0.00        110  14710894
429.mcf            11     12        3   0.01          2    295581
445.gobmk        1378   4808  2.4E+11   2.47         76   1335556
…
483.xalancbmk    2170   7321  1422838   6.01         27  25341805
410.bwaves         82    164       73   0.01          6    263845
416.gamess        362   2017   112645   0.03         19   3390329
…
447.dealII        792   3369     1132   0.06         47  19533456
450.soplex        225    453      367   0.07          7    312430
453.povray        548   2201   548645   0.76          6  34335309
…
blackscholes        3      5        5   0.00         11  14646244
bodytrack         218    894      667   0.01          5   6928160
…
x264              221   1052     2017   0.00          4  23984355

34

slide-42
SLIDE 42

Runtime Overhead

[Bar chart: DACCE runtime overhead per benchmark, y-axis from 0% to 10%; the x-axis lists the SPEC CPU2006 benchmarks (400.perlbench through 482.sphinx3), the Parsec benchmarks (blackscholes through streamcluster), and the geomean.]

35

slide-43
SLIDE 43

Adaptive Encoding

36

slide-44
SLIDE 44

Conclusions

A dynamic and adaptive context encoding algorithm that:

  • Does not need extra profiling runs or static program analysis
  • Handles dynamic loading
  • Adapts to program behavior changes
  • Is efficient in encoding space and time
  • Provides accurate context information

37

slide-45
SLIDE 45

Thank you & Questions?

38

slide-46
SLIDE 46

39

Original call site:

    call *target
    L1: … …

Patched call site:

    call ContextSwitch
    L1: … …

Context switch code:

    store GPRs
    spin_lock
    push %rsp
    call HandleCallRT
    release spinlock
    restore GPRs