Chaos Engineering Day Stockholm edition, 2017



SLIDE 1

Chaos Engineering Day Stockholm edition, 2017

Organization: Martin Monperrus, KTH http://chaos.conf.kth.se/

SLIDE 2

Chaos Engineering Day?

  • Goals:
  • Meet: Know each other
  • Learn: High-quality technical presentations
  • Plan: Next collaborations in research, industry and open-source
  • Worldwide
  • San Francisco (Nov 4 2015 & 2017), Seattle (Aug 26 2016)
  • Paris (Nov 24 2017), London (Dec 12 2017)

SLIDE 3

Presentation of each participant

SLIDE 4

Key Statistics

  • 1 keynote, 5 presentations
  • 43 participants (and counting...)
  • from industry, from academia
  • From many countries: Sweden, Norway, Spain, France, Germany, Denmark

SLIDE 5

Program (morning)

  • Presentations are allocated 20 minutes (incl. questions).
  • Slack in the agenda and time for interactions are built in.

9:15-9:45 Workshop introduction (Martin Monperrus, KTH)
9:45-10:45 Keynote: "Let it crash" (Joe Armstrong, father of Erlang)
10:45-11:10 Coffee break
11:10-11:30 "Lineages as a first-class construct for fault-tolerant distributed programming" (Philipp Haller, KTH)
11:30-11:50 "Configuration testing for better DevOps" (Anatoly Vasilevskiy, SINTEF)
12:00-13:15 Lunch (free for registered participants)
13:15-14:00 Parallel sessions (email, walk)

SLIDE 6

Lunch

  • 12:00 – 13:15 (10-minute walk)
  • 13:15 – 14:00 Email session and walk session

SLIDE 7

Program (afternoon)

14:00-14:20 "Continuous Diversification in a DevOps pipeline" (Nicolas Harrand, Univ Rennes)
14:20-14:40 "High Frequency Chaos Engineering" (Mats Jonsson, SAAB)
14:40-15:00 "Correctness Attraction: Runtime Perturbation for Full Correctness" (B. Danglot, Inria)
15:00-15:30 Coffee break
15:30-16:15 Breakout group discussion
16:15-16:30 Presentation of group results
16:30-16:45 Closing

SLIDE 8

Wifi

  • Network eduroam: your institutional login
  • Guest logins: see sheet

SLIDE 9

Acknowledgements

  • Presenters
  • Participants
  • KTH CASTOR Center for Software Research for funding
  • Sandhya Hagelin (KTH Service) for the organization

SLIDE 10

Introduction to Chaos Engineering

SLIDE 11

Chaos Engineering Examples

  • Chaos monkey:
  • Automatically and randomly shuts down servers
  • Verifies that the system withstands crashes
  • Abstracts over a wide range of problems (HW, OS, SW)
  • Gameday exercise:
  • Simulates a network partition isolating a whole data center
  • Planned and monitored
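
The chaos monkey idea above can be illustrated with a toy simulation. This is only a sketch under stated assumptions: the `Cluster` class, the replica names, and the routing rule are hypothetical and exist purely for illustration; they are not Netflix's implementation or API.

```python
import random


class Cluster:
    """Hypothetical in-memory cluster: a request succeeds if any replica is up."""

    def __init__(self, names):
        self.up = {name: True for name in names}

    def shutdown(self, name):
        self.up[name] = False

    def restart(self, name):
        self.up[name] = True

    def handle_request(self):
        # Route to the first healthy replica; fail if none is left.
        for name, healthy in self.up.items():
            if healthy:
                return f"ok from {name}"
        raise RuntimeError("no healthy replica")


def chaos_monkey_round(cluster, rng):
    """Shut down one randomly chosen running server and return its name."""
    running = [n for n, healthy in cluster.up.items() if healthy]
    victim = rng.choice(running)
    cluster.shutdown(victim)
    return victim


def run_experiment(rounds=5, seed=0):
    """Count the rounds in which the cluster still answers after a shutdown."""
    rng = random.Random(seed)  # seeded so the experiment is reproducible
    cluster = Cluster(["a", "b", "c"])
    survived = 0
    for _ in range(rounds):
        victim = chaos_monkey_round(cluster, rng)
        try:
            cluster.handle_request()  # raises if the crash was not tolerated
            survived += 1
        finally:
            cluster.restart(victim)  # restore the cluster before the next round
    return survived
```

With three replicas and one shutdown per round, every round should survive; with a single replica, the same perturbation makes `handle_request` raise, which is the kind of weakness the experiment is meant to reveal.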

SLIDE 12

Chaos Engineering Definitions

  • “Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in production” (principlesofchaos.org)
  • “Chaos Engineering is the discipline of experimenting on a software system in production in order to verify a property”

SLIDE 13

Chaos Engineering Definition

“Chaos Engineering is the discipline of perturbing a software system in production for fun and profit” (working definition for today)

SLIDE 14

Chaos Engineering Related Work

  • The scientific method
  • Popper’s falsifiability
  • Ghost planes (1975!)

SLIDE 15

Chaos Engineering Related Work

  • Randomization & software diversity
  • Testing:
  • In-the-field testing
  • Stress testing
  • DevOps:
  • Canary testing / Rolling deployment
  • A/B testing
  • Disaster recovery

SLIDE 16

Chaos Engineering Methodology

  • Invariant: measurable output that indicates normal behavior.
  • Failure model: reflects real-world events such as crashes.
  • Hypothesis: the steady state holds in both the control group and the experimental group.
  • Try to falsify the hypothesis by looking for a difference in steady state between the control group and the experimental group.
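
The methodology above can be sketched as a small simulation. All names here (`serve`, `success_rate`, `hypothesis_holds`), the 50% crash probability, and the tolerance are assumptions chosen for illustration, not part of the talk. The invariant is the request success rate, the failure model is a crash on the first attempt, and the hypothesis is that both groups show the same steady state.

```python
import random


def serve(rng, inject_crash, retries):
    """Simulated request: only the first attempt can crash; retries hit a backup."""
    for attempt in range(retries + 1):
        crashed = inject_crash and attempt == 0 and rng.random() < 0.5
        if not crashed:
            return True
    return False


def success_rate(n, inject_crash, retries, seed):
    """Invariant: fraction of successful requests out of n."""
    rng = random.Random(seed)
    ok = sum(serve(rng, inject_crash, retries) for _ in range(n))
    return ok / n


def hypothesis_holds(retries, n=1000, tolerance=0.01):
    """Compare the steady state of the control and experimental groups."""
    control = success_rate(n, inject_crash=False, retries=retries, seed=1)
    experimental = success_rate(n, inject_crash=True, retries=retries, seed=2)
    # The hypothesis is falsified if the two steady states differ noticeably.
    return abs(control - experimental) <= tolerance
```

In this toy model, a system with one retry keeps the invariant under injected crashes (`hypothesis_holds(retries=1)` is true), while a system without retries does not (`hypothesis_holds(retries=0)` is false).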

SLIDE 17

Chaos Engineering Research

  • Perturbation models
  • Coarse-grained: crash
  • Fine-grained: nullify a single variable
  • Human-based
  • Perturbation gains & costs
  • Chaos monkey: zero cost
  • Perturbation controller
  • Targeted perturbations
  • Use of undo? Use of isolation?
  • Maximize the gained knowledge
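
As an illustration of the fine-grained perturbation model listed above (nullifying a single variable), here is a minimal sketch. The `nullify` decorator and the `greeting` function are hypothetical names introduced for this example, and the sketch handles keyword arguments only.

```python
import functools
import random


def nullify(arg_name, probability, rng):
    """Decorator: with the given probability, replace one keyword argument by None."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            # Keyword-only on purpose, so the perturbed variable is named explicitly.
            if arg_name in kwargs and rng.random() < probability:
                kwargs[arg_name] = None  # the injected perturbation
            return func(**kwargs)
        return wrapper
    return decorator


def greeting(name, title):
    # Defensive code under test: tolerates a missing title.
    if title is None:
        return f"Hello {name}"
    return f"Hello {title} {name}"


rng = random.Random(0)
always_perturbed = nullify("title", probability=1.0, rng=rng)(greeting)
never_perturbed = nullify("title", probability=0.0, rng=rng)(greeting)
```

Calling `always_perturbed(name="Ada", title="Dr")` exercises the `None` branch, whereas `never_perturbed` behaves like the original function. A perturbation controller, in the sense of the slide, would decide where and how often such a wrapper is applied.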

SLIDE 18

Chaos engineering and open-source

  • Netflix: SimianArmy (Java) and chaosmonkey (Go)
  • jepsen-io/jepsen
  • Kube-monkey (chaos for Kubernetes)
  • Pumba (chaos for Docker)
  • os-faults & destroystack (OpenStack)
  • faulterl: Erlang library-level fault injection
  • See also the list at https://www.oreilly.com/ideas/chaos-engineering