Design and Evaluation of a Virtual Experimental Environment for Distributed Systems

L. Sarzyniec, T. Buchert, E. Jeanvoine, L. Nussbaum. 27/02/2013, PDP 2013, Belfast


SLIDE 1

Design and Evaluation of a Virtual Experimental Environment for Distributed Systems

  • L. Sarzyniec, T. Buchert, E. Jeanvoine, L. Nussbaum

Grid’5000

27/02/2013 PDP 2013, Belfast

  • L. Sarzyniec, T. Buchert, E. Jeanvoine, L. Nussbaum

Distem - Design and Evaluation 1 / 27

SLIDE 2

Study of distributed systems

Many objects of study: response time, throughput, performance, scalability, robustness, complexity, fault-tolerance, fairness, etc.

Many experimental methodologies:

  • In-situ: real applications on real platforms (Grid’5000, FutureGrid, PlanetLab, etc.)
  • Simulation: modeled applications on modeled systems (SimGrid, ns-2, OMNeT++, etc.)

SLIDE 3

Different methodologies

In-situ methodology:

  • more realistic
  • uses real implementation
  • limited to available environmental conditions
  • usually unreproducible

Simulation:

  • enables unprecedented experiments
  • perfectly reproducible
  • simplified assumptions
  • lower realism

Is there a middle-ground methodology?

SLIDE 4

Emulation

Technique used to efficiently simulate the behavior of a computer on another one, usually more powerful

Idea:

  1. Use an existing platform
  2. Model your desired platform
  3. Efficiently emulate the desired platform using the real platform

SLIDE 5

Emulation (cont.)

Advantages:

  • Combines advantages of simulation and in-situ approaches: allows the use of real applications and infrastructure, and enables complicated experiments
  • Paves the way to reproducibility

Answers the following types of questions:

  • How can I reproduce an experiment published in 2001 even if 1.5 GHz processors do not exist anymore?
  • How can I evaluate my new P2P software designed for DSL networks?
  • How does this runtime with advanced load-balancing capabilities perform on highly hierarchical networks?

SLIDE 6

Plan of the talk

During the rest of the talk I will:

  1. present our emulation-based solution
  2. describe its architecture
  3. show and discuss evaluation results
  4. conclude and outline future work

SLIDE 7

Distem - DISTributed systems EMulator

Distem is freely available software for building virtual distributed experimental environments.

[Figure] Emulated environments: heterogeneous nodes, long-distance networks, Grid, Cloud, P2P, …

SLIDE 8

What can Distem do for you?

Features of Distem include:

  • Introducing heterogeneity in an otherwise homogeneous cluster:
      • CPU heterogeneity: How does your solution perform when some nodes are slower?
      • Network heterogeneity: Does your solution work in an Internet-like infrastructure?
  • Emulating complex network topologies: How does your solution perform on a Grid?
  • Enlarging the scale of the experiment: How does your solution perform on several thousands of nodes?
  • User-friendliness

SLIDE 9

CPU heterogeneity

Distem can host multiple virtual nodes on one physical node:

  • with a different number of cores
  • with different CPU performance

There are 2 strategies for degrading performance:

  • CPU-Gov – based on hardware CPU throttling
  • CPU-Hogs – advanced CPU burning

[Figure: four virtual nodes (VN 1 to VN 4) placed on CPU cores 1 to 7; axes: CPU cores, CPU performance]
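The CPU-burning idea can be illustrated with a little arithmetic. The sketch below is not Distem's implementation: it assumes a simplified model in which a burner process occupies a fixed share of each core so that the remaining share matches the target speed. The function name and the frequencies are hypothetical.

```python
def burner_share(target_mhz: float, real_mhz: float) -> float:
    """Share of CPU time a burner must occupy so that the remaining share
    corresponds to target_mhz on a core running at real_mhz.

    Assumed model: usable performance scales linearly with available CPU time.
    """
    if not 0 < target_mhz <= real_mhz:
        raise ValueError("target frequency must be in (0, real frequency]")
    return 1.0 - target_mhz / real_mhz


# Hypothetical example: emulate slower cores on a 2400 MHz machine.
for target in (2400, 1800, 1200):
    share = burner_share(target, 2400)
    print(f"target {target} MHz -> burner occupies {share:.0%} of the core")
```

Under this model a core emulated at half its real frequency would need a burner that occupies half of the CPU time.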

SLIDE 10

Network heterogeneity

Distem can emulate properties of network links between nodes. Each link can have a different:

  • maximum bandwidth
  • latency

They can be set for incoming and outgoing traffic independently.

SLIDE 11

Complex network configuration

  • Define properties of network links using network heterogeneity
  • Use them to emulate several local networks linked together

[Figure: example topology with five virtual nodes (n1 to n5); each interface is annotated with one bandwidth/latency pair per traffic direction]

Interface annotations, in the order they appear in the figure:

  • if0: 5 Mbps, 10 ms and 10 Mbps, 5 ms
  • if0: 1 Mbps, 30 ms and 1 Mbps, 30 ms
  • if0: 100 Mbps, 3 ms and 100 Mbps, 1 ms
  • if1: 4 Mbps, 12 ms and 6 Mbps, 16 ms
  • if0: 100 kbps, 25 ms and 256 kbps, 30 ms
  • if0: 200 kbps, 30 ms and 512 kbps, 40 ms

SLIDE 12

Scale of the experiment

Distem uses lightweight virtualization to:

  • share resources between nodes: CPU, network, filesystem
  • host many instances of virtual nodes on a single node

This powerful feature:

  • enables challenging experiments of unprecedented scale
  • saves resources and energy

SLIDE 13

User-friendliness

Distem strives to be user-friendly:

  • complex and tedious tasks are automated: configuring network interfaces, populating routing tables, distributing system images, etc.
  • 3 interfaces with increasing complexity and feature set are offered: command-line, Ruby library, REST interface (with JSON to represent data)

SLIDE 14

User interfaces

Command-line interface – distem command

  • Easy to use
  • Hard to automate
  • No access to more advanced features

Ruby library

  • Easy to automate
  • Access to all features
  • Easy to use (if you know Ruby)
  • Requires Ruby

REST API

  • Language agnostic
  • Requires REST knowledge

[Figure: User, REST API, Ruby, CLI]

CLI example:

    # create a virtual network named "net"
    distem --create-vnetwork vnetwork=net,address=10.144.0.0/22
    # create a virtual node from a root filesystem image
    distem --create-vnode vnode=node-1,rootfs=file:///image.tgz
    # attach interface if0 of node-1 to the virtual network
    distem --create-viface vnode=node-1,iface=if0,vnetwork=net
    # start the virtual node
    distem --start-vnode node-1
    # run a command on the virtual node
    distem --execute vnode=node-1,command="hostname"

SLIDE 15

Distem internals

Distem uses modern Linux features:

  • Control Groups and Linux containers (LXC)
  • CPU frequency scaling
  • advanced networking (network bridging)
  • network traffic control (packet schedulers and shapers)

Note that this limits the scope of experiments to Linux/Unix.

[Figure: diagram of Node 1, Node 2, Node 3 and a switch]

SLIDE 16

Communication architecture

[Figure: User’s machine, Coordinator & Pnode 1, Pnode 2, Pnode 3 and Vnodes, connected by links labeled REST]

  • User’s machine: uses the command line interface, the Ruby client library or a REST client
  • Coordinator & Pnode 1 (distemd): starts and controls other Pnodes, keeps the global state of the platform and acts as the gateway to Pnodes and Vnodes
  • Pnode 2 and Pnode 3: each runs distemd

SLIDE 17

Evaluation

To evaluate Distem we designed a few experiments concerned with:

  • network emulation:
      • precision of latency emulation
      • latency emulation over time
      • precision of bandwidth emulation
      • emulation of a simple topology
      • basic performance analysis of scp and rsync tools
  • CPU emulation (Linpack, DGEMM and FFT benchmarks)
  • scalability (by performing a large deployment with Distem)

Each measurement was repeated many times and the results were averaged. We used the Grid’5000 testbed.

SLIDE 18

Precision of latency emulation

Purpose: test if latency is properly emulated

Setup:

  • 2 physical nodes, 1 virtual node on each
  • Emulated latencies from 1 ms to 100 ms

Data point: RTT (custom ping tool) between two virtual nodes

[Figure: measured latency (ms) versus emulated latency (ms), both axes logarithmic from 10^0 to 10^2; series: measured latency (in), measured latency (out), expected latency]

Conclusion: Emulation is accurate, especially for values above real network latency (0.3 ms).
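One way to see why accuracy improves above the real network latency is to look at relative error. The sketch below assumes a simple additive model that the slides do not state: the 0.3 ms of real latency is added on top of the emulated value. The function name is hypothetical.

```python
REAL_LATENCY_MS = 0.3  # real network latency quoted in the conclusion above


def relative_error(emulated_ms: float, real_ms: float = REAL_LATENCY_MS) -> float:
    """Relative error of the observed latency under the assumed additive model
    observed = emulated + real."""
    if emulated_ms <= 0:
        raise ValueError("emulated latency must be positive")
    return real_ms / emulated_ms


# Emulated latencies spanning the 1 ms to 100 ms range used in the setup.
for emulated in (1, 10, 100):
    print(f"{emulated:>3} ms emulated -> {relative_error(emulated):.1%} relative error")
```

Under that assumption, the contribution of the physical network shrinks from tens of percent at 1 ms to well below one percent at 100 ms.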

SLIDE 19

Latency emulation over time

Purpose: test if latency is constant over time

Setup:

  • 2 physical nodes, 1 virtual node on each

Data point: result of high-frequency RTT probing

[Figure: latency (ms, axis from 9.8 to 10.4) versus time (ms, axis from 1 to 4); series: measured latency, emulated latency]

Conclusion: Emulation is stable and correct during the measurement.

SLIDE 20

Precision of bandwidth emulation

Purpose: test if bandwidth emulation is correct

Setup:

  • 2 physical nodes, 1 virtual node on each
  • Two bandwidth ranges: [56 kbps, 1 Mbps] and [50 Mbps, 1 Gbps]

Data point: bandwidth between two nodes (using iperf)

[Figure, two panels: measured bandwidth versus emulated bandwidth, one in kbps (emulated axis from 128 to 1,024) and one in Mbps (emulated axis from 100 to 900); series: measured bandwidth (in), measured bandwidth (out), expected bandwidth]

Conclusion: Bandwidth emulation is accurate in both situations.

SLIDE 21

Emulation of a simple topology

Purpose: test if Distem properly emulates a non-trivial network

Setup:

  • simple topology created with Distem
  • heterogeneous network links

Data point: bandwidth and RTT between each pair of nodes

[Figure: topology with five virtual nodes (n1 to n5); each interface is annotated with one bandwidth/latency pair per traffic direction]

Interface annotations, in the order they appear in the figure:

  • if0: 5 Mbps, 10 ms and 10 Mbps, 5 ms
  • if0: 1 Mbps, 30 ms and 1 Mbps, 30 ms
  • if0: 100 Mbps, 3 ms and 100 Mbps, 1 ms
  • if1: 4 Mbps, 12 ms and 6 Mbps, 16 ms
  • if0: 100 kbps, 25 ms and 256 kbps, 30 ms
  • if0: 200 kbps, 30 ms and 512 kbps, 40 ms

SLIDE 22

Emulation of a simple topology (cont.)

[Figure: the five-node topology (n1 to n5) from the previous slide, with the same interface annotations]

Measured RTT / bandwidth between each pair of nodes:

From \ To   n1                   n2                   n3                   n4                   n5
n1          -                    0.06 s / 1.07 Mbps   0.08 s / 1.07 Mbps   0.16 s / 0.29 Mbps   0.17 s / 0.55 Mbps
n2          0.06 s / 1.04 Mbps   -                    0.02 s / 9.57 Mbps   0.10 s / 0.29 Mbps   0.12 s / 0.55 Mbps
n3          0.08 s / 1.05 Mbps   0.02 s / 4.92 Mbps   -                    0.08 s / 0.30 Mbps   0.10 s / 0.57 Mbps
n4          0.16 s / 0.17 Mbps   0.10 s / 0.16 Mbps   0.08 s / 0.15 Mbps   -                    0.13 s / 0.15 Mbps
n5          0.17 s / 0.26 Mbps   0.12 s / 0.27 Mbps   0.10 s / 0.27 Mbps   0.13 s / 0.26 Mbps   -

RTT(n3→n5) = RTT(n5→n3) = (16 ms + 40 ms) + (30 ms + 12 ms) = 98 ms ≈ 0.1 s
BW(n3→n5) = min {6 Mbps, 512 kbps} ≈ 0.57 Mbps
BW(n5→n3) = min {200 kbps, 4 Mbps} ≈ 0.27 Mbps

Conclusion: Emulation is correct: bandwidth between each pair of nodes is a minimum of link bandwidths, and RTT is a sum of all latencies on the path.
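The min/sum rule from the conclusion can be checked against the worked n3/n5 example. This is a minimal sketch rather than anything taken from Distem: the `Direction` type and the `one_way` helper are hypothetical, and 1 Mbps is assumed to equal 1000 kbps. The per-direction values are those used in the slide's own formulas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Direction:
    """Emulated settings for one traffic direction of an interface."""
    bandwidth_kbps: int
    latency_ms: int


# Values from the worked example: n3 (interface if1) and n5 (interface if0).
N3_OUT = Direction(bandwidth_kbps=6000, latency_ms=16)
N3_IN = Direction(bandwidth_kbps=4000, latency_ms=12)
N5_OUT = Direction(bandwidth_kbps=200, latency_ms=30)
N5_IN = Direction(bandwidth_kbps=512, latency_ms=40)


def one_way(hops):
    """Expected (bandwidth, latency) of a one-way path:
    bandwidth is the minimum over hops, latency is the sum over hops."""
    bandwidth = min(hop.bandwidth_kbps for hop in hops)
    latency = sum(hop.latency_ms for hop in hops)
    return bandwidth, latency


bw_n3_to_n5, lat_n3_to_n5 = one_way([N3_OUT, N5_IN])
bw_n5_to_n3, lat_n5_to_n3 = one_way([N5_OUT, N3_IN])
rtt_ms = lat_n3_to_n5 + lat_n5_to_n3

print(f"expected BW n3->n5: {bw_n3_to_n5} kbps")
print(f"expected BW n5->n3: {bw_n5_to_n3} kbps")
print(f"expected RTT: {rtt_ms} ms")
```

The computed limits are 512 kbps, 200 kbps and 98 ms. The slide reports measured values of about 0.57 Mbps, 0.27 Mbps and 0.10 s for the same pair.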

SLIDE 23

Performance analysis of scp and rsync tools

Purpose: compare scp and rsync when transferring Distem sources

Setup:

  • 2 physical nodes, 1 virtual node on each
  • Emulated latency (from 0 to 100 ms) and bandwidth (from 1 to 5 Mbps)
  • scp and rsync are unchanged

Data point: time needed to transfer 700 KB (small files)

[Figure: transfer time (s, logarithmic axis from 10^0 to 10^2) as a function of latency (ms, 20 to 100) and bandwidth (Mbps, 1 to 5); series: scp, rsync]

Conclusion: rsync outperforms scp thanks to a more efficient transfer of many files.

SLIDE 24

CPU emulation

Purpose: compare CPU emulation strategies available in Distem

Setup:

  • 1 physical node with 1 virtual node, 1 or 4 cores
  • Emulated frequencies in the [1.2 GHz, 2.54 GHz] range

Data point: each benchmark result

[Figure, two panels (1 core and 4 cores): GFlops versus frequency (axis ticks from 1,200 to 2,534, i.e. MHz, although the slide labels the axis GHz); series: HPL/CPU-Hogs, HPL/CPU-Gov, DGEMM/CPU-Hogs, DGEMM/CPU-Gov, FFT/CPU-Hogs, FFT/CPU-Gov]

Conclusion: Even with real benchmarks (CPU & memory usage), the CPU-Hogs strategy (usable without any hardware support) behaves as expected.

SLIDE 25

Scalability

Purpose: test scalability of Distem

Setup: 25 or 100 physical nodes

Data point: time required to deploy from 500 to 5000 virtual nodes and to launch a TakTuk command (command execution tool using tree topology)

            Distem installation (s)    Virtual nodes deployment (s)    TakTuk command (s)
# Vnodes    25 Pnodes   100 Pnodes     25 Pnodes   100 Pnodes          25 Pnodes   100 Pnodes
500         56.9        95.4           53.7        51.7                2.4         2.1
1000        56.4        94.7           94.3        91.1                3.4         3.8
2500        59.7        95.8           219.2       207.7               10.3        9
5000        64          97.1           445.5       410.7               21.3        21.3

[Figure: TakTuk communication topology with 2560 virtual nodes (10 physical nodes)]

Conclusion:

  • Deploying 5000 virtual nodes takes less than 10 minutes
  • One physical node can host many virtual nodes efficiently
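The figures behind this conclusion can be recomputed from the table. The sketch below copies the timings from the slide and derives the number of virtual nodes per physical node, the deployment time per virtual node, and a total time. Summing the three phases into one end-to-end figure assumes they run one after another, which the slide does not state. The `summarize` helper is hypothetical.

```python
# Timings in seconds from the table, keyed by number of virtual nodes.
# Each tuple is (25 physical nodes, 100 physical nodes).
INSTALL = {500: (56.9, 95.4), 1000: (56.4, 94.7), 2500: (59.7, 95.8), 5000: (64.0, 97.1)}
DEPLOY = {500: (53.7, 51.7), 1000: (94.3, 91.1), 2500: (219.2, 207.7), 5000: (445.5, 410.7)}
TAKTUK = {500: (2.4, 2.1), 1000: (3.4, 3.8), 2500: (10.3, 9.0), 5000: (21.3, 21.3)}
PNODES = (25, 100)


def summarize(vnodes):
    """Map each physical-node count to
    (vnodes per pnode, deployment seconds per vnode, total seconds)."""
    summary = {}
    for index, pnodes in enumerate(PNODES):
        # Assumption: installation, deployment and TakTuk run sequentially.
        total = INSTALL[vnodes][index] + DEPLOY[vnodes][index] + TAKTUK[vnodes][index]
        summary[pnodes] = (vnodes / pnodes, DEPLOY[vnodes][index] / vnodes, total)
    return summary


for vnodes in sorted(DEPLOY):
    for pnodes, (density, per_vnode, total) in summarize(vnodes).items():
        print(f"{vnodes:>5} vnodes on {pnodes:>3} pnodes: {density:.0f} per pnode, "
              f"{per_vnode * 1000:.0f} ms deployment per vnode, {total:.1f} s in total")
```

For 5000 virtual nodes the summed time stays below 600 s on both 25 and 100 physical nodes, consistent with the "less than 10 minutes" claim.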

SLIDE 26

Related work

             Network emulation   CPU emulation   Complexity   Maintained / Available
Emulab       ☺                   ☹               ☹            ☺
Wrekavoc     ☺                   😑               ☺            ☹
PlanetLab    😑                   ☹               ☹            ☺
ModelNet     ☺                   ☹               😑            ☹
DieCast      ☺                   ☺               ☹            ☹
Distem       ☺                   ☺               ☺            ☺

Distem is the only tool offering advanced emulation and low complexity. Moreover it is freely available and actively maintained.

SLIDE 27

Conclusion and future work

In this talk I presented Distem. It features:

  • Emulation of resources: network parameters and topology, CPU performance
  • Scalability and efficiency: sharing of resources using virtualization; thousands of nodes can be deployed in a few minutes
  • Easy way to design and run advanced experiments: supports reproducibility (by saving topology information); ready to use on Grid’5000

In the near future we plan to:

  • Push scalability even further
  • Emulate more properties: memory, advanced CPU features

More information on http://distem.gforge.inria.fr/
