Analysis of Overhead in Dynamic Java Performance Monitoring (PowerPoint PPT Presentation)



SLIDE 1

Analysis of Overhead in Dynamic Java Performance Monitoring

Vojtěch Horký, Jaroslav Kotrč, Peter Libič and Petr Tůma

Charles University in Prague

SLIDE 2

Context: Dynamic Monitoring of Production Systems
SLIDE 3

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

SLIDE 4

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

[Diagram: an Application containing the function of interest, connected to a measurements database and a dynamic monitoring control component]

SLIDE 5

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

[Diagram as before; the function of interest inside the Application is highlighted]

We are interested in the performance of this function.

SLIDE 6

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

[Diagram: the dynamic monitoring control wraps the function of interest with a probe that gets the time on entry, gets the time again on exit, and stores the difference in the measurements database]
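The probe on this slide does nothing more than "get time, get time, store difference". A minimal sketch of such a probe in plain Java (class and method names are hypothetical, not taken from the talk):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "get time, get time, store difference" probe.
class TimingProbe {
    // Stands in for the measurements database on the slide.
    static final List<Long> measurements = new ArrayList<>();

    // Wraps the function of interest with entry/exit timestamps.
    static void measured(Runnable functionOfInterest) {
        long entry = System.nanoTime();   // get time (entry)
        functionOfInterest.run();
        long exit = System.nanoTime();    // get time (exit)
        measurements.add(exit - entry);   // store difference
    }
}
```

A dynamic monitoring framework injects equivalent bytecode around the monitored method at run time instead of requiring such a wrapper in the source.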

SLIDE 7

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

[Diagram: the function of interest wrapped with the get time / get time / store difference probe, feeding the measurements database]

Code is dynamically instrumented when measuring.

SLIDE 8

Dynamic Monitoring of Production Systems

Measurement probes are active only when needed; measuring everything all the time might not be practical.

[Diagram: the Application with the probe removed again, alongside the measurements database and dynamic monitoring control]

Once enough data is collected, probes are removed.
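The talk does not say which instrumentation framework it uses, but run-time probe insertion and removal on the JVM is typically built on `java.lang.instrument`: a retransformation-capable agent rewrites the target class whenever the probe is toggled. A hedged sketch of that mechanism, with the actual bytecode rewriting omitted and the target class name purely hypothetical:

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Sketch of probe insertion/removal via class retransformation. This is the
// standard java.lang.instrument mechanism that dynamic monitoring frameworks
// build on, not necessarily the authors' implementation.
class ProbeAgent {
    static class ProbeTransformer implements ClassFileTransformer {
        final String targetClass;        // internal name, e.g. "com/example/Foo"
        volatile boolean probeActive;

        ProbeTransformer(String targetClass) { this.targetClass = targetClass; }

        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain domain, byte[] classfileBuffer) {
            if (!probeActive || !targetClass.equals(className)) {
                return null; // null means "leave the class unchanged"
            }
            // A real probe would rewrite the bytecode here (e.g. with ASM)
            // to add the get-time / get-time / store-difference calls.
            return null;
        }
    }

    // Attached with -javaagent. Each retransformClasses call forces the JVM
    // to recompile the method, which is exactly the effect the talk studies.
    public static void premain(String args, Instrumentation inst) throws Exception {
        ProbeTransformer t = new ProbeTransformer("com/example/Foo");
        inst.addTransformer(t, /* canRetransform = */ true);
        t.probeActive = true;
        inst.retransformClasses(Class.forName("com.example.Foo")); // insert probe
        t.probeActive = false;
        inst.retransformClasses(Class.forName("com.example.Foo")); // remove probe
    }
}
```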

SLIDE 9

Issues of Dynamic Monitoring

In managed environments, code is compiled at run time; probe insertion (or removal) causes recompilation. The monitored application can thus behave differently.

SLIDE 10

Issues of Dynamic Monitoring

In managed environments, code is compiled at run time; probe insertion (or removal) causes recompilation. The monitored application can thus behave differently.

Interesting Questions

– How do the code manipulations affect the application?
– What is the overhead of such a probe?
– Is the observed performance representative?
– Is there zero overhead once the probe is removed?

SLIDE 11

Experiment Setup

SLIDE 12

Experiment Coordination

[Diagram: application code running inside a dynamic monitoring framework, feeding a measurements database]

SLIDE 13

Experiment Coordination

[Diagram: the Experiment consists of the application code, augmented with static probes, inside the dynamic monitoring framework; the static probes and the dynamic monitoring framework each feed their own measurements database, and experiment coordination additionally records CPU & JVM monitoring data]

SLIDE 14

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring

SLIDE 15

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”

SLIDE 16

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”
Location          Method entry and exit points (both)

SLIDE 17

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”
Location          Method entry and exit points (both)
Instrumentation   Static                        Dynamic (run-time)

SLIDE 18

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”
Location          Method entry and exit points (both)
Instrumentation   Static                        Dynamic (run-time)
Data collection   Continuous                    On demand

SLIDE 19

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”
Location          Method entry and exit points (both)
Instrumentation   Static                        Dynamic (run-time)
Data collection   Continuous                    On demand
Implementation    Native method (in C)          Pure Java

SLIDE 20

Two Measurement Infrastructures

                  Self-measurement              Dynamic monitoring
Performance       Baseline                      “Observed”
Location          Method entry and exit points (both)
Instrumentation   Static                        Dynamic (run-time)
Data collection   Continuous                    On demand
Implementation    Native method (in C)          Pure Java

[Diagram: the function of interest carries both a static probe, feeding the static (self) measurement, and a dynamic probe, feeding the dynamic measurement]

SLIDE 21

Experiment Process

SLIDE 22

Experiment Process

[Flowchart: run for some time]

SLIDE 23

Experiment Process

[Flowchart: run for some time → pick random method]

SLIDE 24

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe; dynamic monitoring begins]

SLIDE 25

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring)]

SLIDE 26

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe]

SLIDE 27

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe]

The dump from the dynamic probe answers: what is the observed performance?

SLIDE 28

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe]

The dump from the static probes answers: how fast does it run with dynamic monitoring? The dump from the dynamic probe answers: what is the observed performance?

SLIDE 29

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe → remove dynamic probe]

The dump from the static probes answers: how fast does it run with dynamic monitoring? The dump from the dynamic probe answers: what is the observed performance?

SLIDE 30

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe → remove dynamic probe → run for some time]

The dump from the static probes answers: how fast does it run with dynamic monitoring? The dump from the dynamic probe answers: what is the observed performance?

SLIDE 31

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe → remove dynamic probe → run for some time → dump from static probes]

The first static-probe dump answers: how fast does it run with dynamic monitoring? The dynamic-probe dump answers: what is the observed performance?

SLIDE 32

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe → remove dynamic probe → run for some time → dump from static probes]

The first static-probe dump answers: how fast does it run with dynamic monitoring? The dynamic-probe dump answers: what is the observed performance? The final static-probe dump answers: how fast does it run without dynamic monitoring?

SLIDE 33

Experiment Process

[Flowchart: run for some time → pick random method → insert dynamic probe → run for some time (dynamic monitoring) → dump from static probes, dump from dynamic probe → remove dynamic probe → run for some time → dump from static probes]

The first static-probe dump answers: how fast does it run with dynamic monitoring? The dynamic-probe dump answers: what is the observed performance? The final static-probe dump answers: how fast does it run without dynamic monitoring?
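The experiment process above can be sketched as a coordination loop. All names are illustrative; the real framework and probe calls are stubbed out as logged phase names:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Illustrative sketch of one round of the experiment coordination loop.
// Phase names match the slides; real framework calls are stubbed.
class ExperimentCoordinator {
    final List<String> log = new ArrayList<>();

    void phase(String name) { log.add(name); } // stub for the real action

    void runOneRound(List<String> methods, Random rnd) {
        phase("run for some time");
        String method = methods.get(rnd.nextInt(methods.size()));
        phase("pick random method: " + method);
        phase("insert dynamic probe");
        phase("run for some time");          // dynamic monitoring active
        phase("dump from static probes");    // -> how fast with monitoring
        phase("dump from dynamic probe");    // -> observed performance
        phase("remove dynamic probe");
        phase("run for some time");
        phase("dump from static probes");    // -> how fast without monitoring
    }
}
```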

SLIDE 34

Platform and Application Details

SLIDE 35

Platform and Application Details

– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.

SLIDE 36

Platform and Application Details

– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.
– SPECjbb2015 augmented with static probes.
  – Fixed request rate of 4 000 reqs/s (close to the maximum with static probes on our hardware).
– Over 1 200 monitored methods.
  – Business code of the benchmark.
  – Practically all methods called frequently enough.
  – About one minute of dynamic monitoring per method.

SLIDE 37

Platform and Application Details

– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.
– SPECjbb2015 augmented with static probes.
  – Fixed request rate of 4 000 reqs/s (close to the maximum with static probes on our hardware).
– Over 1 200 monitored methods.
  – Business code of the benchmark.
  – Practically all methods called frequently enough.
  – About one minute of dynamic monitoring per method.
– Several TBs of raw data per week of run-time.

SLIDE 38

Results

SLIDE 39

Overall Overhead of Dynamic Monitoring

SLIDE 40

Overall Overhead of Dynamic Monitoring

[Flowchart as in the experiment process; CPU utilization is recorded during the run with dynamic monitoring and during the run without it]

SLIDE 41

Overall Overhead of Dynamic Monitoring

[Back-to-back histogram: frequency (up to ~800) vs. CPU utilization [%] (70–82), without dynamic monitoring vs. with dynamic monitoring]

SLIDE 42

Overall Overhead of Dynamic Monitoring

[Back-to-back histogram: frequency (up to ~800) vs. CPU utilization [%] (70–82), without dynamic monitoring vs. with dynamic monitoring]

Measuring one method (even a hot one) at a time brings no significant overhead.
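The CPU utilization in this experiment came from external CPU & JVM monitoring (slide 13). For a rough in-process view, HotSpot-based JVMs expose a process CPU load via `com.sun.management.OperatingSystemMXBean`; this is an assumption about the reader's JVM, not the method used in the talk:

```java
import java.lang.management.ManagementFactory;

// Sketch: reading process CPU load from inside the JVM. Not how the talk
// measured utilization (it used external monitoring); shown only as one
// readily available signal on HotSpot-based JVMs.
class CpuLoadProbe {
    static double processCpuLoad() {
        com.sun.management.OperatingSystemMXBean os =
            (com.sun.management.OperatingSystemMXBean)
                ManagementFactory.getOperatingSystemMXBean();
        return os.getProcessCpuLoad(); // fraction 0.0-1.0, negative if unavailable
    }
}
```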

SLIDE 43

Time Needed for Just-in-time Recompilation

SLIDE 44

Time Needed for Just-in-time Recompilation

[Flowchart as in the experiment process; just-in-time compiler events are recorded after probe insertion and after probe removal]

SLIDE 45

Time Needed for Just-in-time Recompilation

[Back-to-back histogram: frequency (up to ~600) vs. recompilation duration [s] (10–40), for instrumentation (probe inserted) vs. deinstrumentation (probe removed); recompilation considered finished after a minute without JIT activity]

SLIDE 46

Time Needed for Just-in-time Recompilation

[Back-to-back histogram: frequency (up to ~600) vs. recompilation duration [s] (10–40), for instrumentation (probe inserted) vs. deinstrumentation (probe removed); recompilation considered finished after a minute without JIT activity]

The JIT compiler typically needs at least 30 s to finish recompilations after probe insertion (or removal).
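The "waited for a minute without JIT activity" criterion can be approximated by polling the standard `CompilationMXBean` until the accumulated compilation time stops growing. This is an illustrative sketch, not necessarily how the authors detected JIT quiescence:

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Sketch: detect JIT quiescence by polling accumulated compilation time.
class JitQuiescence {
    // Returns elapsed millis until the JIT was quiet for quietMillis,
    // or -1 on timeout or when compilation time monitoring is unavailable.
    static long waitForQuiescence(long quietMillis, long timeoutMillis)
            throws InterruptedException {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported()) {
            return -1; // no JIT, or no timing support on this JVM
        }
        long start = System.currentTimeMillis();
        long lastTotal = jit.getTotalCompilationTime();
        long lastChange = start;
        while (true) {
            Thread.sleep(10);
            long now = System.currentTimeMillis();
            long total = jit.getTotalCompilationTime();
            if (total != lastTotal) { lastTotal = total; lastChange = now; }
            if (now - lastChange >= quietMillis) return now - start;
            if (now - start >= timeoutMillis) return -1;
        }
    }
}
```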

SLIDE 47

Accuracy of Collected Data

SLIDE 48

Accuracy of Collected Data

[Flowchart as in the experiment process; accuracy is the ratio between the observed (dynamic-probe) and baseline (static-probe) performance]

SLIDE 49

Accuracy of Collected Data

[Scatter plot: ratio between times reported by static and dynamic probes (0.0–1.2) vs. method execution time from the static probe (50–200 µs)]

SLIDE 50

Accuracy of Collected Data

[Scatter plot: ratio between times reported by static and dynamic probes (1.0–4.0) vs. method execution time from the static probe (10–40 µs)]

SLIDE 51

Accuracy of Collected Data

[Scatter plot: ratio between times reported by static and dynamic probes (1.0–4.0) vs. method execution time from the static probe (10–40 µs)]

Interpretation of numbers from dynamic monitoring:

Observed    Actual
50 µs       45 µs – 50 µs
20 µs       10 µs – 20 µs
2 µs        ½ µs – 2 µs
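The interpretation above can be captured as a small lookup. The multipliers are read off the table's three example rows and are purely illustrative; in reality the ratio varies continuously with the duration:

```java
// Illustrative helper: maps a duration observed by the dynamic probe to the
// lower bound of the plausible actual duration, using multipliers read off
// the slide's examples (50 us -> 45 us, 20 us -> 10 us, 2 us -> 0.5 us).
class AccuracyRule {
    static double actualLowerBoundMicros(double observedMicros) {
        if (observedMicros >= 50) return observedMicros * 0.9;  // ratio up to ~1.1
        if (observedMicros >= 20) return observedMicros * 0.5;  // ratio up to ~2
        return observedMicros * 0.25;                           // ratio up to ~4
    }
}
```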

SLIDE 52

Impact of Dynamic Monitoring

SLIDE 53

Impact of Dynamic Monitoring

[Flowchart as in the experiment process; impact is the ratio of baseline performance with and without dynamic monitoring]

SLIDE 54

Impact of Dynamic Monitoring

[Histogram: frequency (up to ~60) vs. ratio of mean execution times measured by the static probes during and after dynamic instrumentation (0.0–3.0); ratios below 1.0 mean the method ran faster while being monitored, ratios above 1.0 mean slower]

SLIDE 55

Impact of Dynamic Monitoring

[Histogram: frequency (up to ~60) vs. ratio of mean execution times measured by the static probes during and after dynamic instrumentation (0.0–3.0); ratios below 1.0 mean the method ran faster while being monitored, ratios above 1.0 mean slower]

SLIDE 56

Impact of Dynamic Monitoring

[Histogram: frequency (up to ~60) vs. ratio of mean execution times measured by the static probes during and after dynamic instrumentation (0.0–3.0); ratios below 1.0 mean the method ran faster while being monitored, ratios above 1.0 mean slower]

Dynamic monitoring can observe shorter times (as if the probes sped up the application).

SLIDE 57

Conclusion

SLIDE 58

Analysis of Overhead in Dynamic Java Performance Monitoring

We evaluated how dynamic monitoring affects a running application and the accuracy of the obtained data.

SLIDE 59

Analysis of Overhead in Dynamic Java Performance Monitoring

We evaluated how dynamic monitoring affects a running application and the accuracy of the obtained data.

Rules of thumb coming from our experiment:

– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).

SLIDE 60

Analysis of Overhead in Dynamic Java Performance Monitoring

We evaluated how dynamic monitoring affects a running application and the accuracy of the obtained data.

Rules of thumb coming from our experiment:

– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).

http://d3s.mff.cuni.cz/resources/icpe2016

SLIDE 61

Analysis of Overhead in Dynamic Java Performance Monitoring

We evaluated how dynamic monitoring affects a running application and the accuracy of the obtained data.

Rules of thumb coming from our experiment:

– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).

http://d3s.mff.cuni.cz/resources/icpe2016

Thank You!