Injecting Diversity Into Running Software Systems Vivek Nallur - - PowerPoint PPT Presentation

injecting diversity into running software systems
SMART_READER_LITE
LIVE PREVIEW

Injecting Diversity Into Running Software Systems Vivek Nallur - - PowerPoint PPT Presentation

W HAT S THIS ABOUT ? The Big Idea Injecting Diversity Into Running Software Systems Vivek Nallur Trinity College Dublin 16-May-2014 W HAT S THIS ABOUT ? The Big Idea E FFECTS OF M ONOCULTURE Figure: Phytophthora infestans W HAT S


slide-1
SLIDE 1

WHAT’S THIS ABOUT? The Big Idea

Injecting Diversity Into Running Software Systems

Vivek Nallur Trinity College Dublin 16-May-2014

slide-2
SLIDE 2

WHAT’S THIS ABOUT? The Big Idea

EFFECTS OF MONOCULTURE

Figure: Phytophthora infestans

slide-3
SLIDE 3

WHAT’S THIS ABOUT? The Big Idea

EVEN IN THE SOFTWARE WORLD

Slammer attacked only one combination: Win2k + MSSQL

slide-4
SLIDE 4

WHAT’S THIS ABOUT? The Big Idea

EVEN IN THE SOFTWARE WORLD

◮ ˜75k hosts in 30 mins!

slide-5
SLIDE 5

WHAT’S THIS ABOUT? The Big Idea

FUNDAMENTAL PREMISE

  • 1. Diversity is not just a good-to-have, but essential
  • 2. Robustness is a quality attribute that we would like our

systems to have

  • 3. Robustness can be increased by injecting Diversity
slide-6
SLIDE 6

WHAT’S THIS ABOUT? The Big Idea

DIVERSIFY - FET FP7 PROJECT

Partners Investigating Diversification at Various Levels

  • 1. Inria (France)
  • 2. Sintef (Norway)
  • 3. Trinity College Dublin (Ireland)
  • 4. Universit´

e de Rennes 1 (France)

slide-7
SLIDE 7

WHAT’S THIS ABOUT? The Big Idea

GENETIC DIVERSITY

  • 1. Not necessarily vastly different, but just different enough
  • 2. An algorithm is the genetic heart of a software system
  • 3. Algorithm diversification is a good candidate for genetic

diversification

slide-8
SLIDE 8

WHAT’S THIS ABOUT? The Big Idea

ALGORITHM DIVERSIFICATION

  • 1. There exists natural diversity amongst algorithms
  • 2. In any domain, there are multiple algorithms that do the

same thing, better, faster, etc.

  • 3. We use load-balancing as our domain, for now
slide-9
SLIDE 9

WHAT’S THIS ABOUT? The Big Idea

LOAD BALANCING

  • 1. Fundamental Idea: Distribute incoming traffic amongst

pool of machines, such that two goals are satisfied:

1.1 Response time is minimized 1.2 Failure rate is minimized

  • 2. Many algorithms exist: round-robin, dynamic round-robin,

leastconn, header-Hashing, parameter-Hashing, uri-Hashing, rdp-cookie, etc.

  • 3. Each makes assumptions about the nature of traffic being

encountered

slide-10
SLIDE 10

WHAT’S THIS ABOUT? The Big Idea

NATURE OF TRAFFIC

  • 1. Traffic depends on type of content:

1.1 Static web-pages, like wikipedia, blogs, articles, etc. 1.2 Dynamic web-pages, like weather, traffic, news, youtube, etc. 1.3 Sticky (personalized) like facebook, twitter, etc.

  • 2. The algorithms mentioned previously, improve response

times for these workloads

  • 3. Specialist algorithms for specialist patterns
slide-11
SLIDE 11

WHAT’S THIS ABOUT? The Big Idea

PATTERNS, NOISE, ETC.

  • 1. In a DDoS attack, traffic pattern is random
  • 2. Failure-rate rather than response time becomes more

important

  • 3. Generalist algorithm for all patterns of workload, doesn’t

exist

slide-12
SLIDE 12

WHAT’S THIS ABOUT? The Big Idea

CHANGE ALGORITHMS

  • 1. Currently, sysadmins have to consider their workloads

and choose one algorithm

  • 2. When pattern of traffic changes, or website gets hit by a

DDoS attack, the prevailing algorithm’s assumptions are invalid

  • 3. What if we modify the algorithm when the traffic pattern

changes?

  • 4. Can we do better than random?
slide-13
SLIDE 13

WHAT’S THIS ABOUT? The Big Idea

ADAPTATION VIA ALGORITHM SWAPPING

  • 1. Modify load-balancer to work on a pool of algorithms,

instead of one

  • 2. Cycle through the pool, every n seconds
  • 3. In the worst case:

3.1 Algorithm completely unsuited for traffic pattern = ⇒ high failure 3.2 But it lasts only for n seconds!

slide-14
SLIDE 14

WHAT’S THIS ABOUT? The Big Idea

CREATING A POOL OF ALGORITHMS

  • 1. Choose haproxy as an industrial-strength load-balancer
  • 2. Use all the algorithms implemented by haproxy
  • 3. Number of combinations: 7C2 —- 7C7!!
  • 4. Potential behavioural diversity is very high!
slide-15
SLIDE 15

WHAT’S THIS ABOUT? The Big Idea

DOES THIS WORK?

  • 1. We want to decrease failure-rate
  • 2. So measure dropped requests
  • 3. In the presence of a cloud of VMs hitting the load-balancer
  • 4. Pools defined as:

4.1 7C1 — class A — baseline 4.2 7C3 — class B 4.3 7C4 — class C 4.4 7C7 — class D

slide-16
SLIDE 16

WHAT’S THIS ABOUT? The Big Idea

EXPERIMENTAL CONDITIONS

  • 1. Workload: 3 Virtual Machines
  • 2. Load-Balancer: 1 haproxy
  • 3. Load-Generators: 13 Virtual Machines

Note:

We want to overwhelm haproxy, not the workload machines

slide-17
SLIDE 17

WHAT’S THIS ABOUT? The Big Idea

NORMAL PERFORMANCE OF HAPROXY

hdrHost leastconn roundrobin static−rr uri 15 20 25 30 35 40 45

% Requests dropped

Figure: Each pool containing one algorithm – all of class A

slide-18
SLIDE 18

WHAT’S THIS ABOUT? The Big Idea

DIVERSIFIED PERFORMANCE OF HAPROXY

roundrobin−uri−hdrHost static−rr−leastconn−hdrHost 4 6 8 10

% Requests dropped

Figure: class B

slide-19
SLIDE 19

WHAT’S THIS ABOUT? The Big Idea

DIVERSIFIED PERFORMANCE OF HAPROXY

leastconn−source−uri−rdpcookie roundrobin−leastconn−uri−hdrHost 10 20 30 40

% Requests dropped

Figure: class C

slide-20
SLIDE 20

WHAT’S THIS ABOUT? The Big Idea

DIVERSIFIED PERFORMANCE OF HAPROXY

5.0 5.5 6.0 6.5 7.0 % Requests dropped

Figure: class D

slide-21
SLIDE 21

WHAT’S THIS ABOUT? The Big Idea

ALL TOGETHER NOW

A B C D 10 20 30 40 Algorithm combination % Requests dropped

Figure: Robustness across pools

slide-22
SLIDE 22

WHAT’S THIS ABOUT? The Big Idea

STATISTICAL EVIDENCE

diff lwr upr p adj B- A −20.622 −30.632 −10.612 0.00001 C- A −9.329 −19.340 0.681 0.076 D- A −22.160 −36.317 −8.004 0.001

Table: Significance of long-run differences in failure rate

diff lwr upr p adj B- A

  • 1, 073.833
  • 2, 638.443

490.777 0.276 C- A 50.333

  • 1, 514.277

1, 614.943 1.000 D- A

  • 1, 523
  • 3, 735.693

689.693 0.273

Table: No significance of long-run differences in median response time

slide-23
SLIDE 23

WHAT’S THIS ABOUT? The Big Idea

EXPERIMENT VALIDITY

  • 1. Sample size: 6 samples per pool
  • 2. Anova & Tukey test pass for statistical significance
  • 3. Failure-rate improved; Response time same!!
  • 4. Only static workload
  • 5. Dynamic & Sticky workloads missing
slide-24
SLIDE 24

WHAT’S THIS ABOUT? The Big Idea

DIVERSITY ISN’T ALL GREAT :(

hdrHost leastconn roundrobin static−rr uri 15 20 25 30 35 40 45

% Requests dropped

leastconn−source−uri−rdpcookie roundrobin−leastconn−uri−hdrHost 10 20 30 40

% Requests dropped

slide-25
SLIDE 25

WHAT’S THIS ABOUT? The Big Idea

SO, IT’S STILL RANDOM CHOICE

  • 1. Not exactly. We can measure inter-algorithm distance
  • 2. Sort of.
  • 3. We can use Normalized Compression Distance
  • 4. Used in many free-text domains

NCDZ(x, y) = maxK(x|y), K(y|x) maxK(x), K(y)

slide-26
SLIDE 26

WHAT’S THIS ABOUT? The Big Idea

Figure: Clustering on code of algorithm implementation

slide-27
SLIDE 27

WHAT’S THIS ABOUT? The Big Idea

USING NCD

  • 1. Not all pools are created equal
  • 2. Selecting from pool, might be better than random choice
  • 3. Pre-compute pool diversity?
slide-28
SLIDE 28

WHAT’S THIS ABOUT? The Big Idea

WHAT’S THE NET RESULT?

  • 1. No definitive answers
  • 2. But promising experiments
  • 3. Obviously more required
slide-29
SLIDE 29

WHAT’S THIS ABOUT? The Big Idea

THAT’S ALL, FOLKS!

Questions, Suggestions...