SLIDE 1

Continuous Performance Testing

Mark Price / @epickrram
Performance Engineer, Improbable.io

SLIDE 2

The ideal

System performance testing as a first-class citizen of the continuous delivery pipeline

SLIDE 3

Process

SLIDE 4

Process maturity

A scientific and rigorous survey


SLIDE 6

Process maturity

“As part of QA, the whole team logs on to the system to make sure it scales”

SLIDE 7

Process maturity

“We have some hand-rolled benchmarks that prove our code is fast”

SLIDE 8

Process maturity

“We use a well-known testing framework for our benchmarks”

SLIDE 9

Process maturity

“Our benchmarks are run as part of CI”

SLIDE 10

Process maturity

“Trend visualisations of system performance are available”

SLIDE 11

Process maturity

“There is a release gate on performance regression”

SLIDE 12

Increasing process maturity

Implies:
  • Higher maintenance cost
  • Greater confidence

SLIDE 13

Scopes

SLIDE 14

Performance test scopes

  • Nanobenchmarks
  • Microbenchmarks
  • Component Benchmarks
  • System performance tests
SLIDE 15

Nanobenchmarks

  • Determine the cost of something in the underlying platform or runtime
  • How long does it take to retrieve System.nanoTime()?
  • What is the overhead of reading an AtomicLong vs a plain long?
  • Invocation times are on the order of 10s of nanoseconds (a sketch follows below)
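
A minimal JMH sketch of such nanobenchmarks, assuming average-time mode in nanoseconds; the class and field names are illustrative, not from the talk:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class NanoCostBenchmark {
    private final AtomicLong atomicCounter = new AtomicLong(42L);
    private long plainCounter = 42L;

    // Returning the value hands it to JMH's implicit blackhole,
    // preventing dead-code elimination.
    @Benchmark
    public long nanoTimeCost() {
        return System.nanoTime();
    }

    @Benchmark
    public long atomicLongRead() {
        return atomicCounter.get();
    }

    @Benchmark
    public long plainLongRead() {
        return plainCounter;
    }
}
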
SLIDE 16

Nanobenchmarks

  • Susceptible to jitter in the runtime/OS
  • Unlikely to need to regression test these...
  • Unless called very frequently from your code
SLIDE 17

Message callback

A reconstruction of the slide's JMH benchmark; the @State wrapper, imports, and field definitions are assumed here to make the example self-contained, since the slide showed only the two benchmark methods:

import java.util.Collections;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
public class MessageCallbackBenchmark {
    // Assumed setup, not shown on the slide: a single callback, and a
    // list containing only that callback.
    private final Consumer<Blackhole> callback = bh -> bh.consume(1L);
    private final List<Consumer<Blackhole>> callbackList =
            Collections.singletonList(callback);

    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void singleCallback(final Blackhole blackhole) {
        callback.accept(blackhole);
    }

    // The same callback invoked via iteration over a single-element
    // list, to measure the cost the iteration adds.
    @Benchmark
    @BenchmarkMode(Mode.Throughput)
    @OutputTimeUnit(TimeUnit.SECONDS)
    public void singleElementIterationCallback(final Blackhole blackhole) {
        for (Consumer<Blackhole> objectConsumer : callbackList) {
            objectConsumer.accept(blackhole);
        }
    }
}

SLIDE 18

Message callback

SLIDE 19

Microbenchmarks

  • Test small, critical pieces of infrastructure or logic
  • E.g. message parsing, calculation logic (see the sketch below)
  • These should be regression tests
  • We own the code, so assume that we’re going to break it
  • Same principle as unit & acceptance tests
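
A hedged sketch of what such a microbenchmark might look like for message parsing; the message format and parsing logic are invented purely for illustration:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class PriceMessageParseBenchmark {
    // A fixed input keeps the run deterministic; real tests should also
    // sample a realistic distribution of messages.
    private final String message = "EURUSD,1.08342,250000";

    @Benchmark
    public double parsePriceField() {
        // Hypothetical hot-path logic: extract the middle (price) field.
        final int start = message.indexOf(',') + 1;
        final int end = message.indexOf(',', start);
        return Double.parseDouble(message.substring(start, end));
    }
}
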
SLIDE 20

Microbenchmarks

  • Invaluable for use in optimising your code (if it is a bottleneck)
  • Still susceptible to jitter in the runtime
  • Execution times in the order of 100s of nanos/single-digit micros
  • Beware bloat
SLIDE 21

Risk analysis - long vs double

[Chart: benchmark results comparing BigDecimal, long, and double]
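
A hedged sketch of the comparison this slide charts: the relative cost of BigDecimal, long, and double arithmetic for a price calculation. The calculation and the fixed-point scaling are invented for illustration:

import java.math.BigDecimal;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class NumericRepresentationBenchmark {
    private final BigDecimal bigPrice = new BigDecimal("1.08342");
    private final long fixedPointPrice = 108_342L; // price scaled by 10^5
    private final double doublePrice = 1.08342d;

    @Benchmark
    public BigDecimal bigDecimalMultiply() {
        return bigPrice.multiply(BigDecimal.valueOf(250_000L));
    }

    @Benchmark
    public long longMultiply() {
        return fixedPointPrice * 250_000L;
    }

    @Benchmark
    public double doubleMultiply() {
        return doublePrice * 250_000L;
    }
}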

SLIDE 22

Component benchmarks

  • ‘Service’ or ‘component’ level benchmarks
  • Whatever unit of value makes sense in the codebase
  • Wire together a number of components on the critical path (sketched below)
  • We can start to observe the behaviour of the JIT compiler (i.e. inlining)
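
A sketch of a component-level benchmark in the spirit described above: two collaborating components exercised together on the critical path. The components here are hypothetical:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class OrderPathBenchmark {
    private final OrderParser parser = new OrderParser();
    private final RiskCheck riskCheck = new RiskCheck();

    @Benchmark
    public void parseAndRiskCheck(final Blackhole blackhole) {
        // Exercising the whole path lets the JIT inline across
        // component boundaries, as it would in production.
        final long quantity = parser.parseQuantity("EURUSD,1.08342,250000");
        blackhole.consume(riskCheck.withinLimit(quantity));
    }

    static final class OrderParser {
        long parseQuantity(final String message) {
            return Long.parseLong(message.substring(message.lastIndexOf(',') + 1));
        }
    }

    static final class RiskCheck {
        boolean withinLimit(final long quantity) {
            return quantity < 1_000_000L;
        }
    }
}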

SLIDE 23

Component benchmarks

  • Execution times in the 10s - 100s of microseconds
  • Useful for reasoning about maximum system performance
  • Runtime jitter less of an issue, as things like GC/de-opts might start to enter the picture
  • Candidate for regression testing
SLIDE 24

Matching Engine - no-ops are fast!

SLIDE 25

System performance tests

  • Last line of defence against regressions
  • Will catch host OS configuration changes
  • Costly, requires hardware that mirrors production
  • Useful for experimentation
  • System recovery after failure
  • Tools developed for monitoring here should make it to production

SLIDE 26

System performance tests

  • Potentially the longest cycle-time
  • Can provide an overview of infrastructure costs (e.g. network latency)
  • Red-line tests (at what point will the system fail catastrophically?)
  • Understanding of the interaction with the host OS is more important
  • Regressions should be visible
SLIDE 27

Page fault stalls

SLIDE 28

Performance testing trade-offs

Nanobenchmarks → Microbenchmarks → Component benchmarks → System tests

Moving toward system tests:
  • Slower feedback
  • Hardware cost
  • Maintenance cost
  • KPI/SLA indicator
  • Realism

Moving toward nanobenchmarks:
  • Faster feedback
  • System jitter magnified
  • Fewer moving parts
  • Stability
SLIDE 29

Measurement

SLIDE 30

System jitter is a thing

SLIDE 31

Reducing runtime jitter

  • Histogram of invocation times (via JMH)
  • Run-to-run variation
  • Large error values around average
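
One way to expose that variation, sketched here with JMH's programmatic runner: several forks, generous warmup, and SampleTime mode to get a full histogram rather than a single average. The included benchmark name is the class assumed in the SLIDE 17 reconstruction:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.runner.options.TimeValue;

public class JitterAwareRunner {
    public static void main(String[] args) throws Exception {
        Options options = new OptionsBuilder()
                .include("MessageCallbackBenchmark") // assumed class name
                .mode(Mode.SampleTime)               // per-invocation histogram
                .timeUnit(TimeUnit.NANOSECONDS)
                .forks(5)                            // exposes run-to-run variation
                .warmupIterations(10)
                .warmupTime(TimeValue.seconds(1))
                .measurementIterations(10)
                .measurementTime(TimeValue.seconds(1))
                .build();
        new Runner(options).run();
    }
}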

SLIDE 32

Reducing runtime jitter

SLIDE 33

Measurement apparatus

Use a proven test-harness
If you can’t:
  • Understand coordinated omission
  • Measure out-of-band
  • Look for load-generator back-pressure
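
A sketch of one coordinated-omission-aware approach, assuming a hypothetical sendRequest() and the HdrHistogram library: latency is measured against the intended send time from a fixed schedule, so a stalled load generator shows up as latency instead of silently vanishing from the data:

import java.util.concurrent.TimeUnit;

import org.HdrHistogram.Histogram;

public class PacedLoadGenerator {
    public static void main(String[] args) {
        final long intervalNanos = TimeUnit.MILLISECONDS.toNanos(1);
        final Histogram histogram = new Histogram(TimeUnit.SECONDS.toNanos(10), 3);

        long intendedStart = System.nanoTime();
        for (int i = 0; i < 100_000; i++) {
            intendedStart += intervalNanos;
            // Busy-spin until the scheduled send time so the schedule
            // does not drift when the system under test stalls.
            while (System.nanoTime() < intendedStart) { }
            sendRequest(); // hypothetical call into the system under test
            // Measuring from the *intended* start charges any stall
            // (load-generator back-pressure) to the recorded latency.
            histogram.recordValue(System.nanoTime() - intendedStart);
        }
        histogram.outputPercentileDistribution(System.out, 1000.0); // in µs
    }

    private static void sendRequest() {
        // placeholder: a real implementation would issue a request here
    }
}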

SLIDE 34

Production-grade tooling

Monitoring and tooling used in your performance environment should be productionised

SLIDE 35

Containers and the cloud

Measure the baseline of system jitter
Network throughput & latency: understand what is an artifact of our system and what is the infrastructure
End-to-end testing is more important here, since there are many more factors at play adding to the latency long-tail
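
A sketch of one way to measure that baseline (the idea behind tools like jHiccup): a thread that does nothing but wake on a schedule and record how late it was. The interval and sample count below are assumed values:

import java.util.concurrent.TimeUnit;

import org.HdrHistogram.Histogram;

public class JitterBaseline {
    public static void main(String[] args) throws InterruptedException {
        final Histogram pauses = new Histogram(TimeUnit.SECONDS.toNanos(10), 3);
        final long intervalNanos = TimeUnit.MILLISECONDS.toNanos(1);
        long next = System.nanoTime() + intervalNanos;
        for (int i = 0; i < 60_000; i++) { // roughly one minute of samples
            TimeUnit.NANOSECONDS.sleep(Math.max(0, next - System.nanoTime()));
            // Record how late we woke up: on an idle, well-tuned host this
            // is small; spikes reveal OS/runtime-induced jitter.
            pauses.recordValue(Math.max(0, System.nanoTime() - next));
            next += intervalNanos;
        }
        pauses.outputPercentileDistribution(System.out, 1000.0); // in µs
    }
}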

SLIDE 36

Reporting

SLIDE 37

Charting

“Let’s chart our benchmark results so we’ll see if there are regressions”

SLIDE 38

Charting

SLIDE 39

Charting

SLIDE 40

Charting

SLIDE 41

Charting

Make a computer do the analysis
We automated manual testing; we should automate regression analysis
Then we can selectively display charts
Explain the screen in one sentence, or break it down
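
A minimal sketch of what that automated analysis could look like, assuming scores are in ops/sec and a tolerance chosen by the team; all names and numbers here are invented:

public class RegressionGate {
    /**
     * @param baselineOpsPerSec score from the last accepted run
     * @param currentOpsPerSec  score from this run
     * @param tolerance         allowed fractional slowdown, e.g. 0.10
     * @return true if the current run has regressed beyond tolerance
     */
    static boolean hasRegressed(double baselineOpsPerSec,
                                double currentOpsPerSec,
                                double tolerance) {
        return currentOpsPerSec < baselineOpsPerSec * (1.0 - tolerance);
    }

    public static void main(String[] args) {
        if (hasRegressed(1_200_000, 950_000, 0.10)) {
            System.err.println("Performance regression detected; failing build");
            System.exit(1); // acts as the release gate from SLIDE 11
        }
    }
}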

SLIDE 42

Improvement

SLIDE 43

Virtuous cycle

Measure → Model → Execute → Measure → Compare

SLIDE 44

Virtuous cycle

Measure → Model → Execute → Measure → Compare

PRODUCTION PERF ENV

SLIDE 45

Virtuous cycle

Measure → Model → Execute → Measure → Compare

Use the same tooling
Track divergence

SLIDE 46

Regression tests

If we find a performance issue, try to add a test that demonstrates the problem
This helps in the investigation phase, and ensures regressions do not occur
Be careful with assertions
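
A hedged sketch of such a test with a deliberately generous assertion; the parse() call and the 1 µs budget are hypothetical:

import java.util.concurrent.TimeUnit;

public class ParseLatencyRegressionTest {
    public static void main(String[] args) {
        final int iterations = 1_000_000;
        long checksum = 0;
        final long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            checksum += parse("EURUSD,1.08342,250000"); // hypothetical hot path
        }
        final long perOpNanos = (System.nanoTime() - start) / iterations;
        System.out.println("per-op: " + perOpNanos + "ns (checksum " + checksum + ")");
        // Assert against a budget several times the expected cost, so the
        // test catches real regressions without failing on benign noise.
        if (perOpNanos > TimeUnit.MICROSECONDS.toNanos(1)) {
            throw new AssertionError("parse() exceeded its latency budget");
        }
    }

    private static long parse(final String message) {
        return message.indexOf(',');
    }
}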

SLIDE 47

In a nutshell...

SLIDE 48

Key points

Use a known-good framework if possible
If you have to roll your own: peer review, measure it, understand it
Data volume can be oppressive; use or develop tooling to understand results
Test with realistic data/load distribution

SLIDE 49

Key points

Are we confident that our performance testing will catch regressions before they make it to production?

SLIDE 50

Thank you!

  • @epickrram
  • https://epickrram.blogspot.com
  • recruitment@improbable.io