High Speed Transport Protocols Evaluation in Grid5000


SLIDE 1

High Speed Transport Protocols Evaluation in Grid5000

Pascale Vicat-Blanc Primet

Senior Researcher at INRIA Leader of the RESO team

LIP Laboratory UMR CNRS-INRIA-ENS-UCBL Ecole Normale Supérieure de Lyon France

Pascale.primet@inria.fr

NRENs and Grids TERENA workshop, 7/12/06

SLIDE 2

Outline

  • Grid Internetworking Research
  • Grid5000 testbed
  • HS transport protocols evaluation
  • Conclusion & perspective

SLIDE 3

Grid Internetworking research

[Figure: image processing clusters and an image acquisition and storage center connected by the Grid network; data movements & bandwidth sharing]

SLIDE 4

EC-GIN: Grid Internetworking

  • Original Internet technology
  • Enriched with customised network mechanisms
  • Traditional Internet applications (web browser, ftp, ..)
  • Real-time multimedia applications (VoIP, video conference, ..)
  • Today's Grid applications: driving a racing car on a public road
  • Applications with special network properties and requirements
  • EC-GIN enabled Grid applications: bringing the Grid to its full potential!

⇒ Faster Grid: network mechanisms based on Grid peculiarities
⇒ Economic Grid traffic management and security

SLIDE 5

EC-GIN : Research Challenges

  • How to model Grid traffic?

– Much is known about web traffic (e.g. self-similarity) - but the Grid is different!

  • How to simulate a Grid-network?

– Necessary for checking various environment conditions
– May require traffic model (above)
– Currently, Grid-Sim / Net-Sim are two separate worlds (different goals, assumptions, tools, people)

  • How to specify network requirements?

– Explicit or implicit, guaranteed or "elastic", various possible levels of granularity (=> new or extended APIs?)

  • How to align network and Grid economics?

– Grid service model, charging model for Grid services, and network model for such Grid services
– Network management mechanisms in support of those three areas in an integrated fashion

SLIDE 6

Grid Internetwork

The shared resources are interconnected by a complex internetwork.
Applications use Internet protocols: TCP/IP.

⇒ Main networking issues:
⇒ 1: Security
⇒ 2: E2E performance prediction and control

[Figure labels: local area networks (Gigabit/10 Gb/s Ethernet, IB, Myri); access link (1, 10 Gb/s Ethernet); core network (Internet, MPLS VPN, GMPLS, OBS…)]

SLIDE 7

E2E performance

Combination of many factors :

  -> cross all layers and all elements of the E2E chain

Problems related to the network

  • if not overprovisioned or if no QoS support …

Problems related to the TCP protocol

  • TCP is designed first and foremost to be robust: when congestion is detected, TCP accommodates at the expense of reduced performance.

Problems related to the TCP configuration

  • small buffer space or SACK improperly negotiated

Problems related to the end system: hardware & OS

  • to the processor, bus speed, I/O devices
  • to the NIC with its associated driver;

Problems due to the applications

  • small messages or pauses in the data flow
  -> quantify the contribution of the different layers and different elements

SLIDE 8

E2E performance

Speedup depends on C/T

  • C: computing time / image
  • T: transfer time / image
  • Congested network: speedup is very low
  • Controlled network: speedup is good

Objective function: MCT

  • Minimum completion time

SLIDE 9

Two TCP Reno sources on independent machines, each limited to 490 Mb/s; bottleneck of 1 Gb/s; RTT 100 ms

[Plots: window-based rate control; packet pacing rate control]

E2E performance Flows interaction problem

The reality of TCP like congestion control algorithm at high speed

SLIDE 10

Outline

  • Grid Internetworking Research
  • Grid5000 testbed
  • HS transport protocols evaluation
  • Conclusion & perspective

SLIDE 11

GRID5000 initiative

A nationwide experimental platform for Grid research

  • 9 geographically distributed sites
  • every site hosts a cluster (from 256 CPUs to 1K CPUs)
  • All sites are connected by RENATER (10Gb/s DWDM VPN)
  • A system/middleware environment for safe and repeatable experiments

Run Grid experiments in real life conditions

  • Address critical issues of Grid system/middleware:
    – Programming, Scalability, Fault Tolerance, Scheduling
  • Address critical issues of Grid Networking:
    – High performance transport, QoS, measurement, distributed security
  • Port and test applications
  • Investigate innovative approaches:
    – P2P resources discovery, Desktop Grids, active grids

SLIDE 12

Grid5000 network

[Figure: Grid5000 network map. Labels: RENATER-4, 2.5 Gbit/s dark fibre ("fibre noire"), CERN, Sophia, 10 Gb/s dedicated lambdas]

9 clusters with 256 to 1K CPUs => about 2600 CPUs

Grid5000 software:

  • Resource reservation
  • Automatic reconfiguration

SLIDE 13

Grid’5000

Special features

4 main features:

  • High security for Grid’5000 and the Internet, despite the deep reconfiguration feature
    -> Grid’5000 is confined: communications between sites are isolated from the Internet and vice versa (level 2 MPLS, dedicated lambda).
  • A software infrastructure allowing users to access Grid’5000 from any Grid’5000 site and have a simple view of the system
    -> A user has a single account on Grid’5000; Grid’5000 is seen as a cluster of clusters, with 9 (1 per site) unsynchronized home directories
  • Reservation/scheduling tools allowing users to select nodes and schedule experiments
    -> a reservation engine + batch scheduler (1 per site) + OAR Grid (a co-reservation scheduling system)
  • A user toolkit to reconfigure the nodes
    -> software image deployment and node reconfiguration tool

SLIDE 14

Reservation & Batch Scheduler

SLIDE 15

  • Experiment: Geophysics: Seismic Ray Tracing in 3D mesh of the Earth

Building a seismic tomography model of the Earth's geology using seismic wave propagation characteristics in the Earth. Seismic waves are modeled from events detected by sensors. Ray tracing algorithm: waves are reconstructed from rays traced between the epicenter and one sensor.

An MPI parallel program composed of 3 steps:

  1) Master-worker: ray tracing and mesh update by each process, with blocks of rays successively fetched from the master process
  2) All-to-all communications to exchange submesh information between the processes
  3) Merging of cell information of the submesh associated with each process

Reference: 32 CPUs

IPGS: “Institut de Physique du Globe de Strasbourg”

Grid’5000

SLIDE 16

Outline

  • Grid Internetworking Research
  • Grid5000 testbed
  • HS transport protocols evaluation
  • Conclusion & perspective

SLIDE 17

E2E performance problem

=> Wizard gap problem

Slide from Matt Mathis (PSC)

SLIDE 18

Grid5000 network

[Figure: Grid5000 network map. Labels: Rennes, Orsay, Lille, Nancy, Bordeaux, Paris, Lyon, Grenoble, Sophia, Toulouse, RENATER routers; 10 Gbps and 1 Gbps links; legend: dark fiber, dedicated lambda; fully isolated traffic!]

Dark fibers are rented by the network provider RENATER and lit by RENATER.

Source: Cees de Laat (UvA)

Next step? Lambdas on demand? Sharing Grid5000 & DAS3

SLIDE 19

Is there a wizard gap problem in Grid5K?

Novice: 1Gb/s measurement, with default kernel images => goodput in Mb/s

Sites: To So Re Or Na Ly Li Gr Bo

Row site, followed by its 8 goodput values (Mb/s):

To   36.1  26.3  44.3  29.7  65.7  29.8  47.6  166
So   34.0  25.1  22.3  28.9  67.4  29.5  46.1  47
Re   26.3  27.4  56.5  45.5  41.4  46.6  33.6  64.2
Or   50.8  36.2  68.7  936   58.8  150   54.1  67.8
Na   32    43.3  54.7  777   52.4  78.5  162   48.0
Ly   72.0  100   49.8  106   97.6  71.2  230   61.5
Li   33.9  44.3  55    199   112   53.6  70.0  53.3
Gr   48.4  52.6  34.3  33.7  39.8  151   34.0  32.3
Bo   181   68.9  76.3  111   81.2  55.9  61.8  58.1

YES! R < 10% C
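
The "R < 10% C" verdict can be checked against the novice measurements. The sketch below is an illustrative helper, not part of the original study: it copies the 72 goodput values from the slide, assumes that C is the 1 Gb/s measurement rate, and counts how many values fall strictly below a given percentage of it. The names `NOVICE_GOODPUT_MBPS` and `count_below` are hypothetical.

```python
# Goodput values (Mb/s) copied from the novice table, keyed by row site.
NOVICE_GOODPUT_MBPS = {
    "To": [36.1, 26.3, 44.3, 29.7, 65.7, 29.8, 47.6, 166],
    "So": [34.0, 25.1, 22.3, 28.9, 67.4, 29.5, 46.1, 47],
    "Re": [26.3, 27.4, 56.5, 45.5, 41.4, 46.6, 33.6, 64.2],
    "Or": [50.8, 36.2, 68.7, 936, 58.8, 150, 54.1, 67.8],
    "Na": [32, 43.3, 54.7, 777, 52.4, 78.5, 162, 48.0],
    "Ly": [72.0, 100, 49.8, 106, 97.6, 71.2, 230, 61.5],
    "Li": [33.9, 44.3, 55, 199, 112, 53.6, 70.0, 53.3],
    "Gr": [48.4, 52.6, 34.3, 33.7, 39.8, 151, 34.0, 32.3],
    "Bo": [181, 68.9, 76.3, 111, 81.2, 55.9, 61.8, 58.1],
}

CAPACITY_MBPS = 1000  # assumption: C is the 1 Gb/s measurement rate


def count_below(table, percent, capacity_mbps=CAPACITY_MBPS):
    """Return (count, total): measurements strictly below percent% of capacity."""
    threshold = capacity_mbps * percent / 100
    values = [v for row in table.values() for v in row]
    below = sum(1 for v in values if v < threshold)
    return below, len(values)


if __name__ == "__main__":
    below, total = count_below(NOVICE_GOODPUT_MBPS, 10)
    print(f"{below} of {total} measurements are below 10% of capacity")
```

On the slide's data this reports 59 of 72 measurements below 100 Mb/s, so the verdict holds for most, though not all, site pairs.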

SLIDE 20

Insufficient buffer size signature

[Plots: NewReno, 100 ms, skb = BDP; NewReno, 100 ms, skb < BDP]

  • BDP: bandwidth-delay product; the buffer size has to be equal to the BDP
  • Mean BDP in GRID5000 = 10^9 b/s x 0.01 s = 10^7 bits = 1.25 MB
  • Default buffer size = 170 KB

=> max throughput = window / RTT = 128 KB x 8 x 100 /s = 102 400 Kb/s = 100 Mb/s
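
The arithmetic on this slide can be reproduced with two small formulas. The following is a minimal sketch, assuming a 1 Gb/s path, the 10 ms mean RTT used above and a 128 KB window (with 1 KB taken as 1024 bytes); the function names are illustrative only.

```python
def bdp_bytes(rate_bps, rtt_s):
    """Bandwidth-delay product in bytes: data in flight needed to fill the path."""
    return rate_bps * rtt_s / 8


def window_limited_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


if __name__ == "__main__":
    rate_bps = 1e9       # 1 Gb/s, the measurement rate
    rtt_s = 0.01         # 10 ms mean RTT
    window = 128 * 1024  # 128 KB window, as in the slide's estimate

    print(f"BDP: {bdp_bytes(rate_bps, rtt_s) / 1e6:.2f} MB")
    limit = window_limited_throughput_bps(window, rtt_s)
    print(f"Window-limited throughput: {limit / 1e6:.1f} Mb/s")
```

The second function makes the signature of an undersized buffer explicit: whatever the link rate, throughput cannot exceed one window per round trip.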

SLIDE 21

Is there a wizard gap problem in G5K?

Expert: 1Gb/s measurement, with tuned kernel: goodput in Mb/s

Sites: To So Re Or Na Ly Li Gr Bo

Row site, followed by its 8 goodput values (Mb/s):

To   909  939  923  933  882  784  859  928
So   694  321  900  611  543  653  839  901
Re   651  839  912  914  859  787  831  912
Or   523  878  849  936  869  777  866  799
Na   622  931  938  854  865  742  851  725
Ly   730  926  864  740  904  786  912  425
Li   579  598  916  848  922  120  838  738
Gr   647  911  787  893  812  925  701  900
Bo   685  875  852  884  911  862  725  771

YES! G > 80% of CT
G < 9% of 10 Gb/s

SLIDE 22

High Speed-TCP approaches: Modify the Congestion control algorithm

[Figure: congestion window (pkts) versus time (s). Labels: T1, T2, w0, w, convergence time, bottleneck capacity, theoretical fair sharing, gap1, gap2]

High-performance TCP congestion control algorithms aim at minimizing this surface (HS-TCP, S-TCP, H-TCP, BIC, CUBIC…)

High Perf transport protocols issues:

  • Fairness, convergence, efficiency
  • RTT fairness, Friendliness
  • Reaction to available bandwidth dynamic
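
To give a feel for why convergence time matters at high speed, the sketch below estimates how long an idealised Reno-style flow needs to climb back to full rate after a single loss. It assumes the textbook AIMD behaviour (window halved on loss, then one packet added per RTT) and 1500-byte packets; these assumptions and the function name are illustrative and are not taken from the measurements in this talk.

```python
def reno_recovery_time_s(rate_bps, rtt_s, packet_bytes=1500):
    """Seconds for an idealised AIMD flow to regain full rate after one loss.

    Assumes the window is halved on loss and then grows by one packet per RTT.
    """
    full_window_pkts = rate_bps * rtt_s / (packet_bytes * 8)
    rtts_needed = full_window_pkts / 2  # packets to regain, one per RTT
    return rtts_needed * rtt_s


if __name__ == "__main__":
    # 1 Gb/s bottleneck with the two RTTs that appear in this talk.
    for rtt_ms in (11.5, 100):
        seconds = reno_recovery_time_s(1e9, rtt_ms / 1000)
        print(f"RTT {rtt_ms} ms: about {seconds:.1f} s to recover")
```

Because the recovery time grows with the square of the RTT at a fixed rate, long-distance paths are where the alternative algorithms listed above are expected to help most.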

SLIDE 23

Example of testbed setup

[Figure labels: 13 x PCs on each side running iperf / iperfd; 12 x 1 GbE access links on each side; 10 GbE; Grid5000 or 10 Gb/s WAN emulator; (future) 1 x 10 GbE + (future) 1 x 10 GbE]

SLIDE 24

CUBIC in Grid5000 (11.5ms Rennes-Nancy)

SLIDE 25

HSTCP in Grid5000 (11.5ms Rennes-Nancy)
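
Comparing variants such as CUBIC and HSTCP requires selecting the congestion control algorithm on the sending host. One possible way on Linux is the per-socket `TCP_CONGESTION` option, sketched below with Python's standard `socket` module. This is an assumption about how such a test could be set up, not a description of the tooling used in these experiments; `open_socket_with_cc` is a hypothetical helper, and "cubic" and "highspeed" are the Linux algorithm names assumed here, which are only accepted when the kernel provides them.

```python
import socket


def open_socket_with_cc(algorithm):
    """Create a TCP socket and try to select a congestion control algorithm.

    Returns (sock, name): name is the algorithm reported by the kernel, or
    None when the platform or kernel does not allow the selection.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    option = getattr(socket, "TCP_CONGESTION", None)  # Linux-only constant
    if option is None:
        return sock, None
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, algorithm.encode())
        raw = sock.getsockopt(socket.IPPROTO_TCP, option, 16)
    except OSError:
        return sock, None  # algorithm not available or not permitted
    return sock, raw.split(b"\x00", 1)[0].decode()


if __name__ == "__main__":
    for requested in ("cubic", "highspeed"):
        sock, active = open_socket_with_cc(requested)
        print(f"requested {requested!r}, kernel reports {active!r}")
        sock.close()
```

Reading the option back after setting it confirms which algorithm the socket will actually use before any traffic is sent.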

SLIDE 26

Parallel streams study

BIC TCP : 11 flows with 1, 2, 5 or 10 streams
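
A study of this shape can be scripted by generating one iperf client command per flow for each stream count. The sketch below only builds the command lines and does not run them; it assumes the classic iperf client options `-c` (server), `-P` (parallel streams) and `-t` (duration), and the server host names and the 60 s duration are hypothetical placeholders.

```python
STREAM_COUNTS = (1, 2, 5, 10)  # streams per flow, as on the slide
FLOW_COUNT = 11                # number of concurrent flows, as on the slide


def iperf_client_command(server, streams, duration_s=60):
    """Build an iperf client command using `streams` parallel TCP streams."""
    return ["iperf", "-c", server, "-P", str(streams), "-t", str(duration_s)]


def build_plan(servers, stream_counts=STREAM_COUNTS):
    """Map each stream count to the list of commands, one per flow."""
    return {
        streams: [iperf_client_command(server, streams) for server in servers]
        for streams in stream_counts
    }


if __name__ == "__main__":
    # Hypothetical receiver names; replace with the reserved nodes.
    servers = [f"receiver-{i}.example" for i in range(1, FLOW_COUNT + 1)]
    for streams, commands in build_plan(servers).items():
        print(f"# {len(commands)} flows with {streams} stream(s) each")
        for command in commands:
            print(" ".join(command))
```

Keeping the plan as data makes it easy to launch every configuration with the same duration and to record which command produced which result.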

SLIDE 27

Long distance MPI optimisation

SLIDE 28

Outline

  • Grid Internetworking Research
  • Grid5000 testbed
  • HS transport protocols evaluation
  • Conclusion & perspective

SLIDE 29

Conclusion & perspectives

  • Grid5000 provides a unique testbed for high speed transport protocol benchmarking.
  • network controllable, end nodes redeployable
  • network instrumentation is necessary (ongoing work with Renater)
  • flow level monitoring at 10Gb/s is needed but very challenging
  • We are working on connecting Grid5000 with other international testbeds
  • Many more studies are planned to better understand how end users can fully and systematically benefit from the huge available capacity, taking into account:
    – Hardware evolution
    – Networking technology evolution
    – New network & protocol architectures

SLIDE 30

GRID5000 networking collaborations

Interconnection of GRID5000 and DAS3 testbeds

  • via RENATER - GEANT - SURFnet
  • France - Netherlands
  • 10Gb/s dedicated lambda through Europe

Interconnection of GRID5000 and NAREGI testbeds

  • via RENATER - GEANT - SuperSINET
  • France - Japan
  • 1Gb/s dedicated channel through the Atlantic, USA, Pacific

Interconnection between Lyon and Chicago (IN2P3/FNAL): ANR IGTMD

  • via RENATER - GEANT - ESnet
  • France - USA
  • 2Gb/s dedicated channel through the Atlantic

SLIDE 31

Grid5000 <-> DAS3

[Figure: Grid5000 network map. Labels: Rennes, Orsay, Lille, Nancy, Bordeaux, Paris, Lyon, Grenoble, Sophia, Toulouse, RENATER routers; 10 Gbps and 1 Gbps links; legend: dark fiber, dedicated lambda; fully isolated traffic!]

Dark fibers are rented by the network provider RENATER and lit by RENATER.

Source: Cees de Laat (UvA)

Next step? Lambdas on demand? Sharing Grid5000 & DAS3

SLIDE 32

Contacts

Pascale.Primet@ens-lyon.fr

First international IEEE GRIDNETS 2007 conference in Lyon (France), 17-19 October 2007
http://gridnets.eu
Looking for sponsors and contributors

SLIDE 33

Reserve