ARE YOU IN IT FOR THE LONG HAUL? (Brian Rogers, PDF document)


SLIDE 1

Brian Rogers brian.rogers@microsoft.com October 2013

ARE YOU IN IT FOR THE LONG HAUL?

Are you in it for the long haul?

  • Introduction
  • Defining “long-haul”
  • Why do we need long-haul testing?
  • Can vs. should: goals and non-goals
  • One category, many flavors
  • Automation design considerations
  • What success looks like
SLIDE 2

Introduction

  • Modern application workloads involve hours or days of continuous operation
  • Unit/functional/integration testing is necessary but not sufficient
  • Consider long-haul testing to measure availability/reliability over longer-term use

Defining “long-haul”

  • Like stress, long-haul is a specialization of load testing
  • But they are not the same thing!

  Long-haul | Stress
  • Exercises typical application workloads and controlled faults concurrently and continuously | Exercises extreme situations and heavy faults concurrently and continuously
  • Runs for hours to days | Runs for hours to days
  • Stays within nominal system limits | Exceeds limits, pushes past the breaking point
  • Expects system to remain operational | Expects system to fail in some way
  • Demonstrates adherence to predefined SLAs | Demonstrates graceful degradation and recovery
  • Fairly broad and probabilistic | Relatively constrained and repeatable

SLIDE 3

Why do we need long-haul testing?

  • Traditional testing is heavy on functional and integration tests
  • Straightforward and predictable, pre-planned
  • Provides quick feedback, short-term quality indicators


Why do we need long-haul testing?

  • What are we missing?
  • Complex, concurrent multi-feature/multi-user interactions
  • Operational behavior over extended periods of time
  • Controlled chaos – impact of occasional faults on typical workloads

SLIDE 4

Why do we need long-haul testing?

  • Enter long-haul!
  • Relies less on pre-planned workflows – gives new data without new (explicit) test cases
  • Compresses time and scale – minimizes execution cost
  • Stays within operational limits – a good measure of your system’s SLAs

Why do we need long-haul testing?

  • A sampling of long-haul bugs
  • Slow leak
  • Repeated operations over a period of hours resulted in slow but steady memory growth.
  • State poisoning
  • Race condition resulted in a momentary undefined state change; the state was saved, and the system crashed any time that state was restored.
  • Too much information
  • Tracing was too verbose; diagnostic data files were too big to be useful after the system had run for long enough.

SLIDE 5

Why do we need long-haul testing?

  • What about single-user, low-concurrency, limited-duration systems?
  • Long-haul principles are still useful
  • Can get broad coverage of positive and negative code paths with reasonable test cost

Can vs. should: goals and non-goals

  • Goal: reduce the cost of testing
  • Less orchestration, less planning
  • Random actions and faults
  • Overall, fewer tests with wider product coverage
  • Non-goal: supplant functional testing
  • Functional testing is still best for guaranteed regression coverage

SLIDE 6

Can vs. should: goals and non-goals

  • Goal: uncover race conditions and invalid states
  • Long-haul = randomness + parallelism + time
  • Breadth of long-haul means lots of state coverage
  • Non-goal: exhaustively validate all behavior
  • Long-haul relies more on heuristics than strict expected results

Can vs. should: goals and non-goals

  • Goal: provide valuable feedback for ship-readiness
  • Demonstrates longer-term reliability
  • Validates (or challenges) assumptions about system capabilities
  • Non-goal: provide quick feedback
  • Long-haul needs time
  • Many other tests exist to provide fast results

SLIDE 7

Can vs. should: goals and non-goals

  • Goal: leverage controlled chaos
  • Think “many things go,” not “anything goes” – not too predictable but not too random
  • Non-goal: “monkey” testing
  • Too much undirected randomness is a liability
  • Results are very difficult to analyze, and the resulting “bugs” are difficult to prioritize
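
One way to picture “many things go, not anything goes” is a weighted draw from a fixed menu of known actions. This is a minimal sketch; the action names and weights are illustrative, not from the talk:

```python
import random

# Directed randomness: only actions we understand, at tuned frequencies,
# rather than fully undirected "monkey" input.
ACTIONS = {"read": 0.6, "write": 0.3, "delete": 0.1}

def next_action(rng):
    """Pick the next test action from the allowed menu, by weight."""
    names = list(ACTIONS)
    weights = [ACTIONS[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

Because every draw comes from a known, prioritized menu, any failure can be traced back to a meaningful operation rather than arbitrary noise.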

One category, many flavors

  • Long-haul tests come in all shapes and sizes
  • Exact taxonomy depends on the team, product, and context
  • The examples I give are adapted from my past experience
  • Not a definitive or exhaustive list
SLIDE 8

One category, many flavors

  • Low-level
  • “One-box” test, written close to the code
  • Observes internal state for more detailed validation
  • Feature/subsystem
  • Focuses on a specific area of the product
  • Purposely constrained to limit complexity
  • Virtuous feedback loop between functional tests and feature long-hauls

One category, many flavors

  • Customer workload
  • Long-running acceptance test for a particular use case
  • Best employed for uncommon or specialized configurations
  • Full-system
  • Exercises cross-component workloads and system-level faults
  • Can be quite powerful but also difficult to develop and analyze

SLIDE 9

Automation design considerations

  • First, decide on the basic test architecture and topology
  • Typical example: “A long-haul test drives continuous concurrent actions and faults with periodic validations.”
  • Many decision points here, depending on the needs of the project

Automation design considerations

  • Test driver: simple loop, distributed work scheduler?
  • Action: function, class/interface, separate executable?
  • Concurrency: fixed, parameterized, adaptive?
  • Faults: external vs. internal, scope/targets, how to recover?
  • Validations: complete vs. partial, sync vs. async?
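
To make the simplest of these choices concrete, here is a minimal sketch of a “simple loop” driver: continuous concurrent actions with periodic validations. The Counter stands in for a real system under test, and all names are illustrative assumptions:

```python
import threading
import time

# Hypothetical system under test: a thread-safe counter standing in for a
# real service. All names here are illustrative, not from the talk.
class Counter:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        with self._lock:
            self.value += 1

def run_long_haul(duration_s, workers, validate_every_s):
    """Simple-loop driver: continuous concurrent actions, periodic validation."""
    counter = Counter()
    stop = threading.Event()
    attempts = []  # workers record actions here (list.append is atomic in CPython)

    def worker():
        while not stop.is_set():
            counter.increment()   # the "action"
            attempts.append(1)    # data for later validation
            time.sleep(0.001)     # pace actions to stay within nominal load

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()

    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        time.sleep(validate_every_s)
        # Periodic validation: the counter can lead the recorded attempts by
        # at most one in-flight increment per worker.
        assert counter.value <= len(attempts) + workers

    stop.set()
    for t in threads:
        t.join()
    return counter.value, len(attempts)
```

A distributed work scheduler would replace the thread pool and in-process validation loop, but the action/validation split stays the same.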

SLIDE 10

Automation design considerations

  • Separate actions from validations
  • Same action can be performed at different times with different expected results
  • Think of an action as a data producer and a validation as a data consumer
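
A minimal sketch of that producer/consumer split, assuming a hypothetical key-value system; the record format and function names are illustrative:

```python
import queue

# Actions emit observation records; a separate validation pass consumes
# them later and compares against the system's current state.
def do_write(system, observations, key, value):
    """Action (producer): mutate the system and record what was done."""
    system[key] = value             # the real operation
    observations.put((key, value))  # data produced for later checking

def validate(system, observations):
    """Validation (consumer): replay recorded observations against state."""
    failures = []
    while not observations.empty():
        key, expected = observations.get()
        if system.get(key) != expected:
            failures.append((key, expected, system.get(key)))
    return failures
```

Because validation reads recorded data instead of inline expectations, the same do_write action can be replayed at different times and checked against different expected results.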

Automation design considerations

  • Parameterize test inputs
  • Long-haul tests often require tuning
  • Ensure configurability without requiring code changes
  • Consider parameterization of data values and even actions/validations
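
One simple way to get tunability without code changes is a config overlay; this sketch uses JSON and placeholder parameter names:

```python
import json

# Tunable long-haul parameters with sensible defaults; a config file
# overrides them, so re-tuning a run needs no code changes.
DEFAULTS = {
    "workers": 8,             # concurrent test actors
    "duration_hours": 12,     # target run length
    "fault_interval_s": 300,  # minimum spacing between injected faults
}

def load_config(json_text):
    """Overlay user-supplied settings on top of the defaults."""
    config = dict(DEFAULTS)
    config.update(json.loads(json_text))
    return config
```

The same pattern extends to parameterizing which actions and validations run, by listing their names in the config and dispatching on them.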

SLIDE 11

Automation design considerations

  • Use profiles to guide decisions
  • Mine production data or do market research to define typical workloads
  • Example: a “casual user” makes 10 requests per hour, a “power user” makes 1000 requests per hour; 20:1 ratio of casual to power users
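
Working through that example as a sketch (profile structure and function name are assumptions; the rates and ratio are from the slide):

```python
# "Casual user" = 10 requests/hour, "power user" = 1000 requests/hour,
# mixed at a 20:1 ratio of casual to power users.
PROFILES = {
    "casual": {"rate_per_hour": 10, "weight": 20},
    "power": {"rate_per_hour": 1000, "weight": 1},
}

def expected_requests_per_hour(profiles, total_users):
    """Aggregate hourly load implied by the profile mix."""
    total_weight = sum(p["weight"] for p in profiles.values())
    return sum(
        p["rate_per_hour"] * total_users * p["weight"] / total_weight
        for p in profiles.values()
    )

# With 21 users at a 20:1 mix: 20 casual users contribute 200 req/h and
# 1 power user contributes 1000 req/h, for 1200 req/h total.
```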

Automation design considerations

  • Partition disparate test actors
  • Consider a test of an online folder shared by many users
  • Indiscriminately adding and removing files makes validation difficult
  • Instead, use the shared folder itself as a logical partition to group different user workloads
  • Folder 1: one writer, many readers
  • Folder 2: multiple writers (different files)
  • Folder 3: multiple writers (same files)
  • …and so on, until your test matrix is satisfied
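
A sketch of that partitioning, mirroring the folder layouts above; the spec format and names are illustrative assumptions:

```python
# Each folder partition declares its access pattern; expanding the spec
# gives every simulated user a fixed folder and role, so each folder's
# validation only has to reason about its own pattern.
PARTITIONS = {
    "folder1": {"writers": 1, "readers": 5},  # one writer, many readers
    "folder2": {"writers": 3, "readers": 0},  # multiple writers, different files
    "folder3": {"writers": 3, "readers": 0},  # multiple writers, same files
}

def assign_actors(partitions):
    """Expand the partition spec into concrete (user, folder, role) actors."""
    actors = []
    uid = 0
    for folder, spec in sorted(partitions.items()):
        for role, count in (("writer", spec["writers"]),
                            ("reader", spec["readers"])):
            for _ in range(count):
                actors.append({"user": f"user{uid}", "folder": folder,
                               "role": role})
                uid += 1
    return actors
```
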
SLIDE 12

Automation design considerations

  • Coordinate invasive faults
  • Too many faults at the same time or in quick succession can turn long-haul into stress
  • Some care is required in scheduling faults to avoid undue pressure and stay within limits
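
One minimal form of that coordination is a gate that enforces a minimum gap between faults; the class and the gap value are illustrative assumptions:

```python
# Gate fault injection behind a minimum spacing so overlapping or
# back-to-back faults do not push a long-haul run into stress territory.
class FaultScheduler:
    def __init__(self, min_gap_s):
        self.min_gap_s = min_gap_s
        self.last_fault_at = None

    def may_inject(self, now_s):
        """Allow a fault only if the previous one is far enough in the past."""
        if (self.last_fault_at is None
                or now_s - self.last_fault_at >= self.min_gap_s):
            self.last_fault_at = now_s
            return True
        return False
```

A fuller version might also cap concurrent faults or exclude certain fault pairs, but the principle is the same: schedule, don't just fire.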

Automation design considerations

  • Optimize for diagnosability
  • Long-haul can never guarantee reproducibility
  • Instead, strive for diagnosability
  • Test and product should have sufficiently detailed logs to aid in root cause analysis
  • Be mindful of log sizes; consider circular/segmented logs to keep the data manageable
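
In Python, segmented logging with a bounded total size is available out of the box via RotatingFileHandler; the sizes and logger name below are placeholders:

```python
import logging
import logging.handlers

# Bounded, segmented ("circular") logging: once the active file reaches
# max_bytes it rolls over, and only `backups` old segments are kept, so a
# multi-day run cannot grow diagnostic files without limit.
def make_longhaul_logger(path, max_bytes=10_000_000, backups=5):
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger = logging.getLogger("longhaul")
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    return logger
```
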

SLIDE 13

What success looks like

  • A successful long-haul testing effort involves communication and collaboration across the team
  • Use a common vocabulary
  • Agree on the scope and target
  • Build a realistic schedule
  • Iterations: crawl, walk, run
  • Hold the bar

What success looks like

  • Use a common vocabulary
  • Make sure your team understands and uses the same terms to describe long-haul tests
  • Be ready to compare/contrast similar load testing activities

SLIDE 14

What success looks like

  • Agree on the scope and target
  • Which flavors of long-haul tests?
  • Which behaviors will you focus on?
  • What validations will you apply?
  • What are the expected results?
  • How long will the tests run?

What success looks like

  • Build a realistic schedule
  • Long-haul tests need sufficient time to design and execute
  • Last-minute issues found by long-haul tests can add days to the end of a cycle
  • A proper schedule must account for these risks and uncertainties

SLIDE 15

What success looks like

  • Iterations: crawl, walk, run
  • Slowly build up the breadth and rigor of long-haul tests and focus on small, realistic objectives
  • Example: in iteration 1, one long-haul test of one major feature area, target duration of four hours, validate the product does not crash
  • In iteration 2, expand to two feature areas, target duration of eight hours, etc.

What success looks like

  • Hold the bar
  • Assess risks and make informed decisions as a team
  • Do not unilaterally relax exit criteria or ignore issues
  • Be clear about changes to scope, validation, etc. of long-haul tests
  • Do not subject the team to a moving target

SLIDE 16

Are you in it for the long haul?

  • Questions?