SLIDE 1 Analysis of Overhead in Dynamic Java Performance Monitoring
Vojtěch Horký, Jaroslav Kotrč, Peter Libič and Petr Tůma
Charles University in Prague
SLIDE 2 Context: Dynamic Monitoring
SLIDE 3
Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
SLIDE 4 Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
[Diagram: an Application containing a function of interest, a measurements database, and a dynamic monitoring control component.]
SLIDE 5 Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
[Diagram: as before, with a callout on the function of interest: we are interested in the performance of this function.]
SLIDE 6 Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
[Diagram: probes around the function of interest get the time at entry and at exit and store the difference in the measurements database.]
SLIDE 7 Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
[Diagram: as before, with the timing probes in place.]
Code is dynamically instrumented when measuring.
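The effect of the probe can be pictured with a minimal Java sketch (an illustration only, not the actual instrumentation code from the talk; the storeDifference sink and the method names are hypothetical):

// Illustrative sketch of what the inserted probe amounts to: the monitored
// method is bracketed by two timestamp reads and the difference is stored.
public final class ProbeSketch {

    // Hypothetical sink standing in for the measurements database.
    static void storeDifference(String method, long durationNs) {
        System.out.printf("%s took %d ns%n", method, durationNs);
    }

    // Placeholder for the monitored business code.
    static int functionOfInterest(int x) {
        return x * x;
    }

    // What the dynamically instrumented version effectively executes.
    static int instrumentedFunctionOfInterest(int x) {
        long start = System.nanoTime();          // "get time" at entry
        try {
            return functionOfInterest(x);
        } finally {
            long end = System.nanoTime();        // "get time" at exit
            storeDifference("functionOfInterest", end - start); // "store difference"
        }
    }
}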
SLIDE 8 Dynamic Monitoring of Production Systems
Measurement probes are active only when needed; measuring everything all the time might not be practical.
[Diagram: the Application without probes, the measurements database, and dynamic monitoring control.]
Once enough data is collected, probes are removed.
SLIDE 9
Issues of Dynamic Monitoring
In managed environments, code is compiled at run-time; probe insertion (or removal) causes recompilation. The monitored application can thus behave differently.
SLIDE 10
Issues of Dynamic Monitoring
In managed environments, code is compiled at run-time; probe insertion (or removal) causes recompilation. The monitored application can thus behave differently.
Interesting Questions
How do the code manipulations affect the application? What is the overhead of such a probe? Is the observed performance representative? Is there zero overhead once the probe is removed?
SLIDE 11
Experiment Setup
SLIDE 12 Experiment Coordination
[Diagram: the application code, the dynamic monitoring framework, and a measurements database.]
SLIDE 13 Experiment Coordination
[Diagram: the experiment wraps the application code (with static probes) and the dynamic monitoring framework, each feeding its own measurements database, plus CPU & JVM monitoring and experiment coordination.]
SLIDE 14
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
SLIDE 15
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
SLIDE 16
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
– Location: Method entry and exit points (both)
SLIDE 17
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
– Location: Method entry and exit points (both)
– Instrumentation: Static vs. Dynamic (run-time)
SLIDE 18
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
– Location: Method entry and exit points (both)
– Instrumentation: Static vs. Dynamic (run-time)
– Data collection: Continuous vs. On demand
SLIDE 19
Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
– Location: Method entry and exit points (both)
– Instrumentation: Static vs. Dynamic (run-time)
– Data collection: Continuous vs. On demand
– Implementation: Native method (in C) vs. Pure Java
SLIDE 20 Two Measurement Infrastructures
Self-measurement vs. Dynamic monitoring
– Performance: Baseline vs. “Observed”
– Location: Method entry and exit points (both)
– Instrumentation: Static vs. Dynamic (run-time)
– Data collection: Continuous vs. On demand
– Implementation: Native method (in C) vs. Pure Java
[Diagram: the function of interest carries both a static probe (static self-measurement) and a dynamic probe (dynamic measurement).]
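A generic sketch of how run-time (dynamic) instrumentation can be done in Java is shown below; this is an illustrative java.lang.instrument agent, not necessarily the framework used in the experiment, and the class name, probeEnabled flag, and addTimingProbe helper are assumptions:

import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Generic sketch of on-demand probe insertion via a dynamically attached agent.
public final class DynamicProbeAgent {
    private static volatile boolean probeEnabled;
    private static Instrumentation instrumentation;

    // Entry point when the agent is attached to an already running JVM.
    public static void agentmain(String args, Instrumentation inst) {
        instrumentation = inst;
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain pd, byte[] classfileBuffer) {
                if (!"com/example/FunctionOfInterestOwner".equals(className)) {
                    return null; // leave all other classes untouched
                }
                // A real agent would rewrite the bytecode here (e.g. with ASM)
                // to add entry/exit timestamps whenever probeEnabled is true.
                return probeEnabled ? addTimingProbe(classfileBuffer) : null;
            }
        }, true);
    }

    // Turn the probe on and force the class to be redefined, which also
    // discards the previously JIT-compiled code and triggers recompilation.
    public static void enableProbe(Class<?> target) throws Exception {
        probeEnabled = true;
        instrumentation.retransformClasses(target);
    }

    private static byte[] addTimingProbe(byte[] original) {
        return original; // bytecode rewriting is omitted in this sketch
    }
}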
SLIDE 21
Experiment Process
SLIDE 22 Experiment Process
[Flow diagram, first step: run for some time.]
SLIDE 23 Experiment Process
[Flow diagram: run for some time → pick random method.]
SLIDE 24 Experiment Process
[Flow diagram: run for some time → pick random method → insert dynamic probe; the dynamic monitoring interval begins.]
SLIDE 25 Experiment Process
[Flow diagram: … → insert dynamic probe → run for some time.]
SLIDE 26 Experiment Process
[Flow diagram: … → run for some time → dump from static probes and dump from dynamic probe.]
SLIDE 27 Experiment Process
[Flow diagram: as above, with the annotation "What is the performance?"]
SLIDE 28 Experiment Process
[Flow diagram: as above, with the added annotation "How fast does it run with dynamic monitoring?"]
SLIDE 29 Experiment Process
[Flow diagram: … → remove dynamic probe; the dynamic monitoring interval ends.]
SLIDE 30 Experiment Process
[Flow diagram: … → remove dynamic probe → run for some time.]
SLIDE 31 Experiment Process
[Flow diagram: … → run for some time → dump from static probes.]
SLIDE 32 Experiment Process
[Flow diagram: as above, with the added annotation "How fast does it run without dynamic monitoring?"]
SLIDE 33 Experiment Process
[Complete flow diagram: run for some time → pick random method → insert dynamic probe (dynamic monitoring begins) → run for some time → dump from static probes and from the dynamic probe → remove dynamic probe → run for some time → dump from static probes; annotated with "What is the performance?", "How fast does it run with dynamic monitoring?", and "How fast does it run without dynamic monitoring?".]
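The loop below is a compact sketch of this process; every helper (runForSomeTime, insertDynamicProbe, and so on) is a hypothetical placeholder for what the monitoring framework and coordination scripts actually did:

import java.util.List;
import java.util.Random;

// Sketch of the experiment coordination loop; all helpers are placeholders.
public final class ExperimentCoordinator {

    public static void main(String[] args) throws InterruptedException {
        Random random = new Random();
        List<String> monitoredMethods = List.of("MethodA", "MethodB"); // over 1 200 in the real run
        while (true) {
            runForSomeTime();
            String method = monitoredMethods.get(random.nextInt(monitoredMethods.size()));
            insertDynamicProbe(method);   // dynamic monitoring starts
            runForSomeTime();
            dumpStaticProbes();           // baseline performance while the probe is in place
            dumpDynamicProbe(method);     // what the dynamic probe itself observed
            removeDynamicProbe(method);   // dynamic monitoring ends
            runForSomeTime();
            dumpStaticProbes();           // baseline performance without dynamic monitoring
        }
    }

    private static void runForSomeTime() throws InterruptedException { Thread.sleep(60_000); }
    private static void insertDynamicProbe(String method) { /* attach probe via the framework */ }
    private static void removeDynamicProbe(String method) { /* detach probe */ }
    private static void dumpStaticProbes() { /* flush self-measurement data to the database */ }
    private static void dumpDynamicProbe(String method) { /* flush dynamic probe data */ }
}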
SLIDE 34
Platform and Application Details
SLIDE 35
Platform and Application Details
– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.
SLIDE 36 Platform and Application Details
– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.
– SPECjbb2015 augmented with static probes.
  – Fixed request rate of 4 000 reqs/s (close to maximum with static probes on our hardware).
– Over 1 200 monitored methods.
  – Business code of the benchmark.
  – Practically all methods called frequently enough.
  – About one minute of dynamic monitoring per method.
SLIDE 37 Platform and Application Details
– Hardware: 32 CPUs, 2 NUMA nodes, 48 GB RAM.
– SPECjbb2015 augmented with static probes.
  – Fixed request rate of 4 000 reqs/s (close to maximum with static probes on our hardware).
– Over 1 200 monitored methods.
  – Business code of the benchmark.
  – Practically all methods called frequently enough.
  – About one minute of dynamic monitoring per method.
– Several TBs of raw data per week of run-time.
SLIDE 38
Results
SLIDE 39
Overall Overhead of Dynamic Monitoring
SLIDE 40 Overall Overhead of Dynamic Monitoring
[Flow diagram revisited: record CPU utilization with dynamic monitoring and without it.]
SLIDE 41 Overall Overhead of Dynamic Monitoring
[Histogram: frequency of CPU utilization [%], without dynamic monitoring vs. with dynamic monitoring.]
SLIDE 42 Overall Overhead of Dynamic Monitoring
[Histogram: frequency of CPU utilization [%], without dynamic monitoring vs. with dynamic monitoring.]
Measuring one method (even a hot one) at a time brings no significant overhead.
SLIDE 43
Time Needed for Just-in-time Recompilation
SLIDE 44 Time Needed for Just-in-time Recompilation
[Flow diagram revisited: record just-in-time compiler events after the dynamic probe is inserted and again after it is removed.]
SLIDE 45 Time Needed for Just-in-time Recompilation
[Histogram: frequency of recompilation duration [s], measured until one minute passes without JIT activity; separate distributions for instrumentation (probe inserted) and deinstrumentation (probe removed).]
SLIDE 46 Time Needed for Just-in-time Recompilation
[Histogram: frequency of recompilation duration [s], measured until one minute passes without JIT activity; separate distributions for instrumentation (probe inserted) and deinstrumentation (probe removed).]
The JIT compiler typically needs at least 30 s to finish recompilations after probe insertion (or removal).
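The note that measurement waited for a minute without JIT activity suggests a quiescence check; one plausible way to implement it (an assumption, not necessarily what the experiment used) is to poll the standard CompilationMXBean:

import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Sketch: wait until the JIT compiler has been idle for a given quiet period.
public final class JitQuiescence {

    public static void waitForJitQuiescence(long quietMillis) throws InterruptedException {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        long lastTotal = jit.getTotalCompilationTime(); // cumulative JIT time in ms
        long quietSince = System.currentTimeMillis();
        while (System.currentTimeMillis() - quietSince < quietMillis) {
            Thread.sleep(1_000);
            long total = jit.getTotalCompilationTime();
            if (total != lastTotal) {              // the compiler was active recently
                lastTotal = total;
                quietSince = System.currentTimeMillis();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        waitForJitQuiescence(60_000); // the experiment waited a minute without JIT activity
        System.out.println("No JIT activity for 60 s.");
    }
}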
SLIDE 47
Accuracy of Collected Data
SLIDE 48 Accuracy of Collected Data
[Flow diagram revisited: during dynamic monitoring, take the ratio between the times reported by the static probes and by the dynamic probe.]
SLIDE 49 Accuracy of Collected Data
[Plot: ratio between times reported by static and dynamic probes (roughly 0.0–1.2) vs. method execution time measured by the static probe, roughly 50–200 µs.]
SLIDE 50 Accuracy of Collected Data
[Plot: ratio between times reported by static and dynamic probes (roughly 1.0–4.0) vs. method execution time measured by the static probe, roughly 10–40 µs.]
SLIDE 51 Accuracy of Collected Data
[Plot: as on the previous slide, ratio between times reported by static and dynamic probes vs. method execution time (static probe) [µs].]
Interpretation of numbers from dynamic monitoring:
Observed | Actual
50 µs | 45 µs – 50 µs
20 µs | 10 µs – 20 µs
2 µs | 1/2 µs – 2 µs
SLIDE 52
Impact of Dynamic Monitoring
SLIDE 53 Impact of Dynamic Monitoring
[Flow diagram revisited: ratio of baseline performance (from static probes) with and without dynamic monitoring.]
SLIDE 54 Impact of Dynamic Monitoring
[Histogram: frequency of the ratio of mean execution times reported by the static probes during and after dynamic instrumentation; ratios below 1 mean faster when being monitored (⇐), ratios above 1 mean slower when being monitored (⇒).]
SLIDE 55 Impact of Dynamic Monitoring
[Histogram: frequency of the ratio of mean execution times reported by the static probes during and after dynamic instrumentation; ratios below 1 mean faster when being monitored (⇐), ratios above 1 mean slower when being monitored (⇒).]
SLIDE 56 Impact of Dynamic Monitoring
[Histogram: frequency of the ratio of mean execution times reported by the static probes during and after dynamic instrumentation; ratios below 1 mean faster when being monitored (⇐), ratios above 1 mean slower when being monitored (⇒).]
Dynamic monitoring can observe shorter times (as if the probes sped up the application).
SLIDE 57
Conclusion
SLIDE 58
Analysis of Overhead in Dynamic Java Performance Monitoring
We evaluated how dynamic monitoring affects a running application and how accurate the obtained data is.
SLIDE 59
Analysis of Overhead in Dynamic Java Performance Monitoring
We evaluated how dynamic monitoring affects a running application and how accurate the obtained data is.
Rules of thumb from our experiment . . .
– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).
SLIDE 60
Analysis of Overhead in Dynamic Java Performance Monitoring
We evaluated how dynamic monitoring affects a running application and how accurate the obtained data is.
Rules of thumb from our experiment . . .
– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).
http://d3s.mff.cuni.cz/resources/icpe2016
SLIDE 61
Analysis of Overhead in Dynamic Java Performance Monitoring
We evaluated how dynamic monitoring affects a running application and how accurate the obtained data is.
Rules of thumb from our experiment . . .
– Measuring one method at a time does not change CPU utilization.
– At least 30 s are needed for (JIT) recompilation.
– If the reported time is 30 µs, the actual duration is between 20 µs and 40 µs (durations of at least 100 µs are more “trustworthy”, though).
http://d3s.mff.cuni.cz/resources/icpe2016
Thank You!