CS199-6: Wide Area Application Design, Deployment, and Management


SLIDE 1

CS199-6: Wide Area Application Design, Deployment, and Management

http://cs199.planet-lab.org/

David Culler, Brent Chun, Timothy Roscoe

Wednesday 22nd January 2003

SLIDE 2

Overview of today’s talk

  • Quick summary of yesterday
    – Planetary-scale services
    – PlanetLab
    – This class
  • The current state of PlanetLab
  • The near future of PlanetLab
  • Assignment for next week
SLIDE 3

Planetary-scale and wide-area distributed systems

SLIDE 4

What is a distributed system?

  • Distinct components running on distinct machines
    – WWW, NFS, CIFS, email, Ultima Online, Quake3, Saber, SS7, etc.
  • Characterized by:
    – Concurrency
    – Partial failures
    – Latency
  • Writing distributed systems is hard
    – Cf. Waldo et al.

SLIDE 5

Distributed systems: concurrency

  • Concurrency can often be dodged in centralized systems
    – Event-driven systems, one-offs
  • Alternatively, locks are available
    – E.g. Java concurrency primitives
  • Distributed systems are inherently concurrent
    – And shared-memory-based synchronization is not an option

SLIDE 6

Distributed systems: latency

  • Typical remote procedure call: ~1ms, vs. ~10ns for a local call
  • High-level system design must take this into account
    – Pipelining
    – Parallelism
    – Etc.
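The point above can be made concrete. A minimal Python sketch, with the remote call faked by a sleep (the 50 ms figure is illustrative, not a measurement): issuing the calls in parallel makes the total cost roughly one round trip instead of the sum of all of them.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY = 0.05  # pretend each remote call costs ~50 ms of network round trip

def remote_call(i):
    time.sleep(LATENCY)  # stand-in for a wide-area round trip
    return i * i

# Sequential: total time is the sum of the per-call latencies.
start = time.time()
seq = [remote_call(i) for i in range(10)]
seq_elapsed = time.time() - start

# Parallel: overlapping the requests hides latency; total ~= one round trip.
start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    par = list(pool.map(remote_call, range(10)))
par_elapsed = time.time() - start

print(f"sequential: {seq_elapsed:.2f}s, parallel: {par_elapsed:.2f}s")
```

The same effect is what pipelining buys: keep many requests in flight rather than paying one round trip per request.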

SLIDE 7

Distributed systems: partial failure

  • “A distributed system is one in which I can’t get my work done because a computer I’ve never heard of has failed”
    – Leslie Lamport
  • Distributed systems are not fail-stop
    – Bits keep running
    – Failures may be undetected
    – Etc.

SLIDE 8

Distributed Systems

  • Despite all this, distributed systems are, these days, relatively commonplace
    – Some are well-engineered
      » e.g. SS7, Ultima, etc.
    – Some are sufficiently simple
      » e.g. WWW
    – Some, people just live with
      » e.g. WWW, NFS, CIFS
    – Some, people are told to just live with
      » e.g. most corporate calendaring systems

SLIDE 9

Wide-Area (or Planetary-Scale) Systems

  • Wide-area applications are for people who find ordinary distributed applications too easy :-)
  • Wide-area applications span a significant portion of the globe
    – Google is not a planetary-scale system
    – Akamai is a planetary-scale system

SLIDE 10

Why build planetary-scale systems?

  • Latency
    – Beating the speed of light
    – Move computation and data closer to users
  • Multilateration
    – Stand in 1000s of viewpoints at the same time
    – Triangulation, correlation, measurement
  • Politics
    – Spanning boundaries
    – Selecting (or avoiding) domains: judicial, financial, administrative, national, etc.

SLIDE 11

Examples of wide-area systems

  • Content-distribution networks
    – Akamai, Inktomi, etc.
  • Overlay routing networks
    – RON (Resilient Overlay Networks), etc.
  • Global storage systems
    – OceanStore, PAST, etc.
  • True peer-to-peer systems
    – FreeNet, KaZaA, etc.

SLIDE 12

What’s hard about these systems?

  • Scalability
    – >100,000s of users
  • Reliability
    – System should never go down
  • Performance
    – It shouldn’t suck
  • Management
    – How does something this big stay manageable?

Just like any other distributed system, but more so.

SLIDE 13

What’s hard (and new) about these?

  • Heterogeneity
    – Lots of different machines, and different components which have to talk to each other
  • Security
    – We’re now spanning organizational boundaries
    – Perimeter-based security doesn’t really work
  • Evolution
    – Parts of the system must change incrementally over time
    – We can’t just restart everything

SLIDE 14

Wide-area application research

  • Lots of recent research work:
    – RON, ESM…
    – Storage: OceanStore, IBP, CFS, PAST…
    – DHTs: Tapestry, Chord, CAN, Pastry…
    – Event systems: Scribe, Herald, Bayeux…
    – CDNs
  • Results tend to be based on:
    – Simulation
    – Emulation (clusters, etc.)
    – Small-scale deployment (call your friends)

SLIDE 15

PlanetLab: What and Why?

SLIDE 16

Doing all this for real…

  • … is hard for a researcher
  • Where do you get access to 1000 geographically dispersed machines?
  • How do you do it legitimately?
    – No worms
    – No cracking
    – No venture capital

SLIDE 17

So what is PlanetLab?

  • An open, shared testbed for developing, deploying, and accessing planetary-scale services
  • http://www.planet-lab.org
  • Boils down to:
    – A set of machines to run your code on all over the world
    – An operating system to make this safe
    – Management software to keep it working
    – Useful services to save you time and effort

SLIDE 18

PlanetLab

[Map of sites: Intel Berkeley, ICIR, MIT, Princeton, Cornell, Duke, UT, Columbia, UCSB, UCB, UCSD, UCLA, UW, Intel Seattle, KY, Cambridge, Harvard, GIT, Uppsala, Copenhagen, CMU, UPenn, WI, Chicago, Utah, Intel OR, UBC, WashU, ISI, Intel, Rice, Bologna, Lancaster, St. Louis, UA, Canterbury, Sydney]

PlanetLab is:

  • A globally distributed testbed for network research
  • A deployment platform for new services
  • An architectural prototype for the next Internet
SLIDE 19

PlanetLab Slices/Slivers

[Diagram: slices 1–5 spanning nodes 1–7]

A network service is broken into components that can be distributed throughout the Internet

  • Slice: total resources for the service
  • Sliver: resources required on a specific node
SLIDE 20

What does PlanetLab give you?

  • PlanetLab gives you (yes, you!)
    – The ability to deploy a service around the world
    – The chance to contribute to the research
  • PlanetLab is under development
    – Management services
    – Naming and location
    – Etc.
  • PlanetLab is shared
    – Responsible use is called for!

SLIDE 21

How do I get onto PlanetLab?

  • Brent will send out details :-)

  1. Make sure you have an ssh key pair
     • Find out what this is if you don't know
  2. Register at http://www.planet-lab.org/
  3. See the course page for the list of 8 nodes for use by the class
  4. Upload your public key to the web site
  5. Use ssh/scp for software distribution and control
  6. See sf.net/planetlab for possibly useful tools
  7. Note: it looks like Linux.
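Step 5 can be scripted. A hedged Python sketch that builds (and here only prints) the scp/ssh command lines for each node; the node names, username, tarball name, and port below are made-up placeholders, not the real class nodes:

```python
import shlex

# Hypothetical node names -- see the course page for the real list of 8.
NODES = ["planetlab1.example.edu", "planetlab2.example.edu"]

def push_and_run(node, tarball, command, user="cs199"):
    """Build the scp/ssh command lines that copy a tarball to one node
    and start the service there (ssh/scp for distribution and control)."""
    copy = ["scp", tarball, f"{user}@{node}:"]
    run = ["ssh", f"{user}@{node}", command]
    return copy, run

plan = [push_and_run(n, "traceroutesvr.tar.gz",
                     "tar xzf traceroutesvr.tar.gz && ./traceroutesvr -p 3355")
        for n in NODES]
for copy, run in plan:
    print(" ".join(map(shlex.quote, copy)))
    print(" ".join(map(shlex.quote, run)))
```

Swapping `print` for `subprocess.run` would actually execute the plan, assuming your public key is already installed on the nodes.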
SLIDE 22

The Administrative Stuff

SLIDE 23

What’s this course about?

  • Writing distributed services to run in the wide-area Internet
    – spread over a large no. of machines and a large geographical area
    – which can be deployed over PlanetLab
    – which might become part of PlanetLab
  • Experience in the design of large systems
    – Network programming
    – Handling failures
    – etc.
  • Introduction to real systems research
  • Emphasis on building rather than lectures
SLIDE 24

Approach

  • Programming experience is assumed
  • Few introductory lectures
  • Reading material
  • Get into design and implementation as soon as possible
  • Work in teams of 2–3 people
SLIDE 25

Tentative Schedule

  • Week 1: Introductory lectures; initial team assignment
  • Week 2: Project discussion; form teams for main project
  • Week 3: Review proposals
  • Thereafter: Implementation, team meetings, project milestones

SLIDE 26

Milestones

  • 28th January: Initial assignment due
  • 4th February: Team project proposals due
  • 24th February: Initial prototype due
  • 17th March: Enhancements to prototype
  • 21st April: Deliver final service
  • 27th April, 6th May: Presentations/demos
  • 16th May: Project writeups due.
SLIDE 27

Reading material

  • See: http://cs199.planet-lab.org/reading.html
  • Required reading for next week:
    – “A Blueprint for Introducing Disruptive Technology into the Internet”
    – “A Note on Distributed Computing”
    – “Hints for Computer System Design”

SLIDE 28

Initial assignment (for next week): Parallel traceroute service

SLIDE 29

Initial assignment: goals

  • Write a simple distributed service
    – And work out what's hard, easy, fast, slow…
  • Get some experience with PlanetLab
    – See Brent for accounts and details
  • Sneak a peek at the structure of the Internet
    – Service can be used for simple network maps

SLIDE 30

Basic Concept: mapping the network

[Diagram: three server nodes tracing routes through routers (R) toward a target machine]

SLIDE 31

Basic Concept: mapping the network

[Diagram: as before, with a client machine collecting the traceroute results from the server nodes]

SLIDE 32

Using traceroute

deleuze> traceroute -n www.cam.ac.uk
traceroute to www.cam.ac.uk (131.111.8.46), 30 hops max, 38 byte packets
 1  10.212.2.1        2.365 ms    1.677 ms    0.483 ms
 2  12.155.161.129    9.423 ms    0.520 ms    9.433 ms
 3  12.124.44.29      9.984 ms    9.933 ms    0.512 ms
 4  12.123.12.206    11.506 ms    7.959 ms    9.938 ms
 5  12.122.11.93      2.379 ms   17.595 ms    9.957 ms
 6  12.123.13.194    10.009 ms    9.974 ms    9.942 ms
 7  209.0.227.29     10.019 ms    0.561 ms    9.664 ms
 8  209.244.3.137    10.405 ms    9.174 ms    9.976 ms
 9  64.159.1.73       9.914 ms    9.914 ms    9.914 ms
10  64.159.1.86      79.918 ms   85.407 ms   79.761 ms
11  212.187.128.137 153.521 ms  150.774 ms  149.966 ms
12  212.187.128.50  151.452 ms  158.274 ms  149.879 ms
13  212.113.3.2     149.990 ms  159.802 ms  149.915 ms
14  195.50.116.206  159.740 ms  149.882 ms  149.875 ms
15  146.97.37.85    159.900 ms  150.626 ms  149.156 ms
16  146.97.33.9     159.945 ms  161.207 ms  148.500 ms
17  146.97.35.10    159.868 ms  159.821 ms  159.935 ms
18  146.97.40.50    159.901 ms  149.872 ms  160.033 ms
19  192.153.213.194 159.738 ms  159.845 ms  159.961 ms
20  131.111.8.74    159.871 ms  159.853 ms  159.927 ms
deleuze>

SLIDE 33

Deliverables

  • Server:
      $ traceroutesvr –p <port>
  • Client:
      $ tracerouteclnt –p <port> <target>
      <server1> <ip1> <ip2> <ip3> … <host>
      <server2> <ip1> <ip2> <ip3> … <host>
      …
      <server8> <ip1> <ip2> <ip3> … <host>
  • See the web site today or tomorrow for more detail on the output
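One way to sketch the server side in Python. The wire protocol here (one target hostname per connection, a space-separated reply of hop IPs) is our own assumption for illustration, not the assignment's spec, and the real tool should take the port from -p as shown above:

```python
import re
import socket
import subprocess

# Matches a numbered hop line from `traceroute -n`, capturing the hop's IP.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+\.\d+\.\d+)")

def parse_hops(output):
    """Extract the hop IPs, in order, from traceroute output."""
    return [m.group(2) for line in output.splitlines()
            if (m := HOP_RE.match(line))]

def serve(port):
    """One target per connection; reply with the space-separated hop IPs."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("", port))
    srv.listen(5)
    while True:
        conn, _ = srv.accept()
        with conn:
            target = conn.recv(256).decode().strip()
            out = subprocess.run(["traceroute", "-n", target],
                                 capture_output=True, text=True).stdout
            conn.sendall((" ".join(parse_hops(out)) + "\n").encode())

# Parsing the first two hops of the sample output from the previous slide:
sample = """ 1  10.212.2.1  2.365 ms  1.677 ms  0.483 ms
 2  12.155.161.129  9.423 ms  0.520 ms  9.433 ms"""
print(parse_hops(sample))
```

Call serve(port) to run it; note this sketch handles one request at a time, which is exactly the performance question raised on the "Extra Kudos #2" slide.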

SLIDE 34

Extra Kudos: tree display

  • Client:
      $ tracerouteclnt –p <port> <target>
      <host> <ip1a> <ip2a> … <ip1b> <ip2a> … <ip2b> … …
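One way the tree could be built: merge the per-server hop lists into a prefix tree, so routes that share their initial hops share a branch. The dict-of-dicts representation and the IPs below are illustrative, not part of the assignment:

```python
def merge_paths(paths):
    """Merge per-server hop lists into a prefix tree keyed by hop IP."""
    tree = {}
    for path in paths:
        node = tree
        for hop in path:
            # Descend into the child for this hop, creating it if needed.
            node = node.setdefault(hop, {})
    return tree

# Two routes that diverge after their second hop:
tree = merge_paths([["10.0.0.1", "10.0.0.2", "10.0.0.3"],
                    ["10.0.0.1", "10.0.0.2", "10.0.0.4"]])
print(tree)
```

A recursive walk over the resulting dict, indenting one level per hop, gives the tree display.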

SLIDE 35

Extra Kudos #2: performance

  • Why is it slow (if it is!)?
    – How many requests does the client have on the go?
    – How many requests does the server have on the go?
  • What happens if a server fails?
  • What happens if traceroute fails?
    – E.g. '*'s in the output
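One answer to the first two bullets: the client can keep a request to every server in flight at once, and bound the cost of a dead or slow server with a timeout. A Python sketch against a stub query function (the stub, its timings, and its IPs are made up for illustration):

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutTimeout

def fan_out(servers, query, timeout=2.0):
    """Query all servers in parallel; a slow or dead server costs at most
    `timeout` seconds instead of stalling the whole client."""
    results = {}
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {s: pool.submit(query, s) for s in servers}
        for s, fut in futures.items():
            try:
                results[s] = fut.result(timeout=timeout)
            except FutTimeout:
                results[s] = None  # mark the server as failed/unreachable
    return results

# Stub query: "dead" hangs past the timeout, the others answer quickly.
def stub_query(server):
    if server == "dead":
        time.sleep(2)
    return ["10.0.0.1", "10.0.0.2"]  # pretend these are hop IPs

results = fan_out(["a", "b", "dead"], stub_query, timeout=0.5)
print(results)
```

The same fan-out with a bounded worker pool answers the server-side question too: how many traceroutes should one server run concurrently?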

SLIDE 36

Other details:

  • Anyone should be able to run your tracerouteclnt from their workstation
  • Pick your favourite programming language (Java, C, C++, Python, etc.)
  • Put up a web page with a writeup, tarball (including source), and sample output
  • See Brent for details on how to get PlanetLab accounts

SLIDE 37

Questions?

  • Any thoughts on how to approach this?
  • If you have problems:
    – Send us email
    – Arrange to meet
    – Tell us how it's going