SLIDE 1

Robust Communication for Jungle Computing

Jason Maassen

Computer Systems Group Department of Computer Science VU University, Amsterdam, The Netherlands

SLIDE 2

Requirements (revisited)

  • Resource independence
  • Transparent / easy deployment
  • Middleware independence & interoperability
  • Jungle-aware middleware
  • Jungle-aware communication
  • Robust connectivity
  • System-support for malleability and fault-tolerance
  • Globally unique naming
  • Transparent parallelism & application-level fault-tolerance
  • Easy integration with external software
    • MPI, OpenCL, CUDA, C, C++, scripts, …

ComplexHPC Spring School 2011 2

SLIDE 4

Low-level problems

  • Many sites have connectivity issues
    • Firewalls
    • Network Address Translation (NAT)
    • Non-routed networks
    • Multi-homing
    • Misconfigured machines
    • ...
  • This makes it very hard to use a combination of machines!

SLIDE 5

High-level problems

  • We need more advanced features:
    • Malleability: machines come and go during the application lifetime
    • Fault tolerance: machines may crash at any time
    • Robust and globally unique naming
    • Flexible communication primitives
      • Multicast or many-to-one communication
    • Efficient serialization of complex data structures
  • Need to be robust!

SLIDE 6

Existing libraries

  • Sockets are too low-level for daily use
    • Only point-to-point
    • No resource management
  • MPI is too inflexible
    • Focus on SPMD model
    • Little/no support for malleability or fault tolerance
  • Neither can handle firewalls/NAT/etc.

SLIDE 7

Ibis

  • Ibis offers “Jungle proof” communication:
    • SmartSockets
      • Sockets library (on top of regular TCP/IP)
      • Solves low-level connectivity problems
    • Ibis Portability Layer (IPL)
      • “MPI for Jungle computing”
      • Offers high-level communication primitives

SLIDE 8

Where are we?

SLIDE 9

SmartSockets

What problems does it solve?

  • Unreachable machines:
    • Behind firewall / NAT or on private network
  • Machine identification:
    • Machines have multiple IPs
    • Multiple machines have the same (private) IP

SLIDE 10

Problem 1: Firewalls

  • Block 'inappropriate' connections
  • Usually only block incoming connections
  • Some also block outgoing connections

SLIDE 11

Problem 2: Network Address Translation

  • Allows multiple machines to share an IP address

SLIDE 16

Problem 3: Multi Homing

  • Some sites have multiple networks
  • The target address depends on the source of the connection

SLIDE 17

Problem 4: Non-routed Networks

  • No route between local network and internet
  • Only the frontend is reachable

SLIDE 18

Problem 5: Machine Identification

  • Private IPs (NAT/non-routed) lead to machine identification problems

SLIDE 19

SmartSockets Solutions

  • The SmartSockets library
    • Detects connectivity problems
    • Tries to solve them automatically using:
      • Smart addressing
      • Side channel
    • ... and various tricks:
      • SSH tunneling (pass through firewalls)
      • STUN (detect external IP of NAT)
      • UPnP (automatic port forwarding)
      • ...

SLIDE 20

SmartSockets Library

  • Integrates existing and new solutions into one library
    • With as little help from the user as possible
    • Mostly transparent to user!
  • Offers a socket-like interface
    • Addressing is different

SLIDE 21

Smart Addressing

  • Instead of using a single IP:port combination for each machine we use:
    • All machine addresses
    • Add extra information
      • External address + port for NAT (STUN, UPnP)
      • SSH contact information
      • UUID (if entire address is private)
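
The address layout listed above can be sketched as a small record type. This is only an illustration, not the SmartSockets address format: the class `SmartAddress`, its field names, and the helper `make_address` are hypothetical, and the rule "attach a UUID when every local address is private and no external address is known" is one reading of the last bullet.

```python
import uuid
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional, Tuple


@dataclass(frozen=True)
class SmartAddress:
    """Hypothetical record; field names are illustrative only."""
    addresses: Tuple[str, ...]                   # every local IP of the machine
    port: int
    external: Optional[Tuple[str, int]] = None   # NAT mapping (e.g. found via STUN/UPnP)
    ssh_user: Optional[str] = None               # SSH contact information
    machine_id: Optional[str] = None             # UUID, only set for fully private machines


def make_address(addresses, port, external=None, ssh_user=None):
    """Build an address, adding a UUID when no public address is available."""
    only_private = all(ip_address(a).is_private for a in addresses)
    machine_id = str(uuid.uuid4()) if only_private and external is None else None
    return SmartAddress(tuple(addresses), port, external, ssh_user, machine_id)
```

Two machines that share the same private IPs still get different addresses here, because their UUIDs differ.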

SLIDE 22

Addressing Examples

SLIDE 23

Creating a Connection

SLIDE 24

Using Smart Addresses

  • This solves machine identification problems
    • All addresses are known with multi-homing
    • Each identity is unique with private IPs
    • The identity is always checked
  • Assumes anyone can create a connection
    • This will not help when target is behind NAT/Firewall
    • To solve this we need a side channel

SLIDE 25

Side channel

  • Overlay network implemented using a set of hubs
    • Support processes for the application
    • Started in advance
  • Hubs are run on machines with 'more connectivity'
    • Such as cluster frontends, 'open' machines, etc.
  • How / where you start them is a separate problem
    • Solved by IbisDeploy

SLIDE 26

Hubs

  • Similar to a peer-to-peer overlay network
  • Hubs connect to each other
    • Gossip information about other hubs
    • Automatically discover new hubs and routes
    • Need to set up spanning tree (or better)
    • Use direct connections and SSH tunnels
  • Clients connect to a 'local' hub
    • Use as side channel for connection setup
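
The gossiping step can be illustrated with a toy model in which connected hubs repeatedly merge their lists of known hubs. This is a sketch under simplifying assumptions, not the SmartSockets protocol: the `Hub` class and `run_gossip` function are hypothetical, exchanges are synchronous, and hubs never fail.

```python
class Hub:
    """Toy hub that only tracks which other hubs it has heard of."""

    def __init__(self, name):
        self.name = name
        self.known = {name}

    def gossip_with(self, other):
        """Merge both hub lists; return True if either side learned something."""
        merged = self.known | other.known
        changed = merged != self.known or merged != other.known
        self.known = set(merged)
        other.known = set(merged)
        return changed


def run_gossip(hubs, links):
    """Gossip over the given links until no hub learns anything new."""
    while True:
        changed = False
        for a, b in links:
            changed = hubs[a].gossip_with(hubs[b]) or changed
        if not changed:
            return
```

With links that form a spanning tree, every hub ends up knowing every other hub. A hub without any link is never discovered, which is why the initial set of links matters.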

SLIDE 27

Hub Overlay Network

SLIDE 29

Advanced Connection Setup

  • Reverse direction of connection setup
    • Send message to target using hub and wait for incoming connection
    • Results in direct connection
  • Route via overlay
    • Create virtual connection using hubs
    • Forward all data over side channel
    • Results in indirect connection
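
The choice between these strategies can be sketched as a simple decision function. The `Host` record and `setup_connection` are hypothetical, and the order in which strategies are tried (plain connection, then reverse setup, then routing over hubs) is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Host:
    name: str
    accepts_incoming: bool       # False when behind a firewall or NAT
    hub: Optional[str] = None    # hub this host is connected to, if any


def setup_connection(source: Host, target: Host) -> str:
    """Return the strategy that would connect source to target."""
    if target.accepts_incoming:
        return "direct"          # a plain connection works
    both_on_overlay = source.hub is not None and target.hub is not None
    if both_on_overlay and source.accepts_incoming:
        # Ask the target (via the hubs) to connect back to the source.
        return "reverse"         # still ends in a direct connection
    if both_on_overlay:
        # Neither side accepts connections: forward all data over the hubs.
        return "routed"          # indirect connection
    raise ConnectionError(f"no way to reach {target.name} from {source.name}")
```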

SLIDE 30

SmartSockets

All problems solved

  • Unreachable machines:
    • SSH tunnels
    • Reverse connection setup
    • Routing over hubs
  • Machine identification:
    • Smart addressing
    • Identity check at connection setup

SLIDE 32

Hub Network

SLIDE 33

Evaluation

  • Regular TCP/IP only worked in 6 out of 30
  • SmartSockets worked in 30 out of 30!

SLIDE 35

Summary

  • In Jungle computing, communication is hard!
    • Many connectivity problems occur
    • Takes a lot of work to find the problems and work around them
  • SmartSockets reduces this to a single problem:
    • How to set up a spanning tree of hubs
    • The rest is done automatically!

SLIDE 36

However ...

  • Sockets are too low-level for daily use
  • For Jungle computing we need support for
    • Malleability
    • Fault tolerance
    • Robust and globally unique naming
    • Flexible communication primitives
  • Provided by the Ibis Portability Layer (IPL)

SLIDE 37

Ibis Portability Layer (IPL)

  • Simple API for Jungle communication
  • Flexible communication model
    • Connection-oriented messaging
    • Abstract addressing scheme
  • Resource tracking
    • Notifications when machines join/leave/crash
  • Efficient serialization
    • Send bytes, doubles, objects, etc.
  • Portable:
    • SmartSockets, TCP, UDP, MPI, MX, Bluetooth, …

SLIDE 38

Communication Model

  • Simple communication model
    • Unidirectional pipes
    • Two end points (send and receive ports)
    • Connection oriented
    • Allows streaming (good with high latency)
  • Portable model
    • Easy to implement on Sockets/MPI/MX/…
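
A minimal in-process sketch of this model is given here. The classes `SendPort` and `ReceivePort` borrow the terms used on the slide, but their methods are hypothetical and do not reproduce the IPL API; a real implementation would sit on top of a network transport.

```python
from collections import deque


class ReceivePort:
    """Receiving end of a unidirectional pipe."""

    def __init__(self, name):
        self.name = name
        self._queue = deque()

    def receive(self):
        """Return the oldest undelivered message (IndexError if none)."""
        return self._queue.popleft()


class SendPort:
    """Sending end; must be connected before use (connection oriented)."""

    def __init__(self):
        self._target = None

    def connect(self, receive_port):
        self._target = receive_port

    def send(self, message):
        if self._target is None:
            raise RuntimeError("send port is not connected")
        self._target._queue.append(message)
```

Data only flows from the send port to the receive port. A reply requires a second pipe in the opposite direction.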

SLIDE 39

Communication Model

  • Flexible model!

SLIDE 40

Port Types

  • All ports have a type
    • Defined at runtime
    • Specify set of capabilities
  • Types must match when connecting!

SLIDE 41

Port Types

  • Consists of a set of capabilities:
    • Connection patterns:
      • Unicast, many-to-one, one-to-many, many-to-many
    • Communication properties:
      • FIFO ordering, numbering, reliability
    • Serialization properties:
      • Bytes, primitive types, objects
    • Message delivery:
      • Explicit receipt, automatic upcalls, polling

SLIDE 42

Port Types

  • Forces programmer to specify how each communication channel is used
    • Prevents bugs
    • Exception when contract is breached
  • Allows efficient implementation to be selected
    • Unicast only?
    • Transfer bytes only?
    • Can save a lot of complexity!
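
The contract idea can be sketched by treating a port type as a set of capability names and checking it at connect and write time. The capability strings, `ContractError`, and the helper functions are invented for this example and are not the IPL constants.

```python
class ContractError(Exception):
    """Raised when a port is used in a way its type does not allow."""


def port_type(*capabilities):
    """A port type is modelled as an immutable set of capability names."""
    return frozenset(capabilities)


def check_connect(send_type, receive_type):
    """Both ends of a connection must have been created with the same type."""
    if send_type != receive_type:
        raise ContractError("port types do not match")


def check_write(ptype, value):
    """Writing raw bytes or an object requires the matching capability."""
    is_bytes = isinstance(value, (bytes, bytearray))
    needed = "serialization.bytes" if is_bytes else "serialization.object"
    if needed not in ptype:
        raise ContractError(f"port type lacks {needed}")
```

Because a bytes-only unicast type rules out object serialization and multicast, an implementation can pick a simpler and faster code path for it.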

SLIDE 43

Messages

  • Ports communicate using 'messages'
  • Contain read or write methods for
    • Primitive types (byte, int, ...)
    • Objects
    • Array slices (partial write / read in place)
  • Unlimited message size
    • Streaming

SLIDE 44

Abstract addressing

  • IbisIdentifier:
    • Abstract 'machine address' object
    • Hides network-specific details
    • Examples: SmartSockets addresses, hostnames, IP addresses, MPI ranks, etc.
  • Results in more portable applications
    • Independent of network infrastructure
  • Why don't we use ranks?
    • Hard to support malleability and fault-tolerance!

SLIDE 45

Resource Tracking

  • IPL offers JEL (join, elect, leave) model
    • Application gets signal when a machine joins or leaves
    • Supports elections for distributed decision making
      • Allows machines to be elected as “master”
    • Can ensure totally ordered notifications
  • Implemented using separate registry component
    • Server that tracks application participants
    • Can track multiple applications simultaneously, each in its own pool
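
A centralized registry with pools and join/leave notifications can be sketched as follows. The `Registry` class and its callback signature are hypothetical; the replay of earlier joins to a new member is an assumption made so that every listener in a pool sees the same event order.

```python
class Registry:
    """Toy central registry that tracks members per pool."""

    def __init__(self):
        self._members = {}     # pool name -> member ids in join order
        self._listeners = {}   # pool name -> callbacks taking (event, member)

    def join(self, pool, member, listener=None):
        members = self._members.setdefault(pool, [])
        listeners = self._listeners.setdefault(pool, [])
        if listener is not None:
            for earlier in members:          # replay joins that happened before
                listener("joined", earlier)
            listeners.append(listener)
        members.append(member)
        self._notify(pool, "joined", member)

    def leave(self, pool, member):
        self._members[pool].remove(member)
        self._notify(pool, "left", member)

    def members(self, pool):
        return list(self._members.get(pool, []))

    def _notify(self, pool, event, member):
        for listener in self._listeners.get(pool, []):
            listener(event, member)
```

Since a single server handles all events sequentially, listeners in the same pool observe them in one total order, and events in one pool never reach another pool.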

SLIDE 46

Registry Example

[Figure: a registry tracking three pools (pool-1, pool-2, pool-3) whose members have identifiers such as ibis-34fdw21, ibis-az33zx7, ibis-99wf331, ibis-983qq8f, ibis-bad9955 and ibis-rt66pp2]

SLIDE 47

Registry

  • Many implementations
    • Centralized, broadcast, gossiping, etc.
    • Different tradeoffs in functionality, complexity, robustness, scalability and consistency
  • Application can select the functionality and consistency that is needed
    • Reducing functionality or consistency further improves scalability

SLIDE 48

Elections

  • JEL also offers an 'election'
    • Allows a group to determine who's special
    • Ranks don't work in a malleable Jungle!
  • Each election
    • Has a name (String)
    • Produces IbisIdentifier of the winner
    • Is not democratic
  • You can also be 'an observer'
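
A named election of this kind can be sketched in a few lines. The `Elections` class and its method names are hypothetical, and "first candidate wins" is only one possible non-democratic rule, chosen here for simplicity.

```python
class Elections:
    """Toy election service: each named election has at most one winner."""

    def __init__(self):
        self._winners = {}   # election name -> identifier of the winner

    def elect(self, name, candidate):
        """Take part as a candidate; the first candidate wins (an assumption)."""
        return self._winners.setdefault(name, candidate)

    def observe(self, name):
        """Look up the winner without being a candidate (None if undecided)."""
        return self._winners.get(name)
```

Every participant that calls `elect` with the same name receives the same winner, so the group agrees on a "master" without relying on ranks.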

SLIDE 49

Ibis Capabilities

  • When initializing, the application must specify:
    • The PortTypes it is going to use
      • Defines what kind of communication you need
    • The resource tracking behaviour it needs
      • Totally ordered upcalls, reliable elections, etc.
      • Closed world pool, malleable pool, etc.
    • The IPL implementation it prefers
      • SmartSockets, MX, MPI, etc.
  • This allows the runtime to check if the requested combination is feasible
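
The feasibility check can be sketched as a subset test against what each implementation supports. The implementation names `impl_a` and `impl_b`, the capability strings, and the `SUPPORTED` table are all invented for illustration and say nothing about what real IPL implementations support.

```python
# Invented support table, used only for this example.
SUPPORTED = {
    "impl_a": {"unicast", "many-to-one", "objects", "malleable-pool"},
    "impl_b": {"unicast", "objects", "closed-world-pool", "ordered-upcalls"},
}


def select_implementation(requested, preferred):
    """Return the first preferred implementation that offers every
    requested capability, or raise if the combination is infeasible."""
    wanted = set(requested)
    for name in preferred:
        if wanted <= SUPPORTED.get(name, set()):
            return name
    raise ValueError(f"no implementation supports {sorted(wanted)}")
```

Checking at initialization means an infeasible request fails immediately instead of surfacing later as a communication error.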

SLIDE 50

Example

SLIDE 52

Performance Evaluation

Data Parallel Image Processing

SLIDE 54

Jungle Computing

SLIDE 55

Conclusion

  • SmartSockets provides robust connectivity
    • Solves issues caused by firewalls/NAT/multihoming/...
  • IPL adds high-level communication primitives
    • System-support for malleability and fault-tolerance
    • Globally unique naming
  • The combination is a perfect match to create Jungle proof applications and programming models
