P2P, DSM, and Other Products from the Complexity Factory

SLIDE 1

P2P, DSM, and Other Products from the Complexity Factory

Willy Zwaenepoel
EPFL

SLIDE 2

Impact of Research

Not so great

  • Many research ideas have lost out
  • Many non-research developments won out

SLIDE 3

Impact of Research

Not so great

  • Many research ideas have lost out
  • Many non-research developments won out

Why is that?

  • We make things too complex
  • Note: not: things are too complex

SLIDE 4

Impact of Research

Not so great

  • Many research ideas have lost out
  • Many non-research developments won out

Why is that?

  • We make things too complex
  • Not: things are too complex

Why?

  • Publishing/reviewing pushes us to complexity

SLIDE 5

Apologies, Caveats and Excuses

Talk is rather polemic in nature … things are said a little crassly
Now a dean: intellectual life prohibited

“There was once a dean who was so dumb, that other deans actually started noticing it”

SLIDE 6

P2P

Peer-to-peer
No (central) server
Easier to operate, maintain, scale, make more reliable …
Started as an application
Proposed as an infrastructure for a large number of applications

SLIDE 7

Research on P2P

Concentrated largely on DHTs
Log(n) access
Chord, Pastry, …
Applications: backup, streaming, …
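
The "Log(n) access" claim can be made concrete with a small simulation. The following is a minimal sketch of Chord-style finger-table routing, not the Chord implementation itself: the 16-bit identifier space, the helper names, and the global view of all nodes used to build the tables are assumptions made for illustration. A real DHT builds and repairs these tables with a distributed join and stabilization protocol, which is where much of the complexity lives.

```python
import bisect

M = 16           # identifier bits (assumption for this sketch)
RING = 1 << M    # size of the identifier ring


def successor(nodes, ident):
    """First node id >= ident on the ring, wrapping around. nodes is sorted."""
    i = bisect.bisect_left(nodes, ident % RING)
    return nodes[i % len(nodes)]


def build_fingers(nodes):
    """finger[n][k] is the node that succeeds n + 2^k on the ring."""
    return {n: [successor(nodes, n + (1 << k)) for k in range(M)] for n in nodes}


def in_open(x, a, b):
    """True if x lies in the ring interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b


def in_half_open(x, a, b):
    """True if x lies in the ring interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b


def lookup(nodes, fingers, start, key):
    """Route from node start to the node responsible for key.

    Returns (responsible node, number of forwarding hops).
    """
    n, hops = start, 0
    while True:
        succ = fingers[n][0]
        if in_half_open(key, n, succ):
            return succ, hops
        # Forward to the closest finger that still precedes the key.
        nxt = n
        for f in reversed(fingers[n]):
            if in_open(f, n, key):
                nxt = f
                break
        if nxt == n:  # defensive: no closer finger known
            return succ, hops
        n, hops = nxt, hops + 1
```

Each hop at least halves the remaining distance to the key's predecessor, so a lookup in this sketch never needs more than M hops, and on a ring with n evenly spread nodes it typically needs about log2(n).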

SLIDE 8

The Problem with P2P

Very little application other than illegal file sharing

SLIDE 9

Reality Check

If we have learned anything about distributed computing over the last 25 years, it is that anything distributed is harder than anything centralized

SLIDE 10

Reasons for Distribution

You cannot handle it in one place

  • Performance: controlled replication
  • Availability: controlled replication

Geographical distribution

  • Google!

Illegality: P2P

  • From Napster to Gnutella, Kazaa, …

“Raw” traffic numbers are high

  • Much of it static
  • Could be handled by conventional replication (?)

SLIDE 11

Difficulties for P2P

Hard to find anything
Hard to make anything secure

  • Open invitation to attack
  • Actively used by RIAA (pollution attacks)

Hard to write anything

SLIDE 12

Advantages for P2P Research

Complex to find anything
Complex to make anything secure
Complex to write anything

SLIDE 13

Advantages for P2P Research

Complex to find anything
Complex to make anything secure
Complex to write anything
Complexity begets papers
P2P = Paper-to-Paper

SLIDE 14

There are Applications

Large file multicast
Can be handled by very simple techniques

  • BitTorrent

It should worry us that these come from non-research corners of the world!

SLIDE 15

DSM

Distributed shared memory
Parallel computing on clusters
Distributed memories abstracted as a single shared memory
Easier to write programs
Usually by page faulting
TreadMarks (ParallelTools)

SLIDE 16

Reality Check

Clusters are only suitable for coarse-grained parallel computation
A fortiori true for DSM

SLIDE 17

Problems with Fine-Grained DSM

Expensive synchronization
Expensive fine-grained data sharing

  • Smaller than a page

False sharing (can be solved)
True sharing
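
One way to read the false/true sharing distinction in a page-based DSM is sketched below. The 4 KB page size, the byte-address model, and the function names are assumptions made for illustration; the sketch only classifies a pair of accesses made by two different nodes, at least one of which is a write.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


def page_of(addr, page_size=PAGE_SIZE):
    """Page number that contains byte address addr."""
    return addr // page_size


def classify_sharing(addr_a, addr_b, page_size=PAGE_SIZE):
    """Classify two accesses by different nodes (at least one is a write)."""
    if addr_a == addr_b:
        # Both nodes really touch the same datum: communication is unavoidable.
        return "true sharing"
    if page_of(addr_a, page_size) == page_of(addr_b, page_size):
        # Different data that merely live on the same page: the page still
        # bounces between nodes even though the program shares nothing.
        return "false sharing"
    return "no sharing"
```

With these assumptions, addresses 100 and 200 fall on the same page and are classified as false sharing, while 100 and 5000 fall on different pages.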

SLIDE 18

Advantages for DSM Research

Complex fine-grain synchronization
Complex fine-grain data sharing

  • Compiler, language, runtime, …

Complexity begets papers …

SLIDE 19

TreadMarks

(Almost) every paper or grant for research on fine-grain DSM was accepted
(Almost) every paper or grant for research on coarse-grained DSM was rejected
It turns out that for real applications a page is not large enough!

SLIDE 20

Coarse-grain Applications

Large (independent) units of computation
Large chunks of data

  • 1 page = 4k
  • Not very large at all
  • Page faulting brings in one page at a time
  • Message passing brings in whole data segment at a time (> page)

Can be and was done with DSM

  • Increase page size (!!)
  • Compiler support
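
The page-at-a-time argument comes down to simple arithmetic. The sketch below counts only the number of transfers needed to bring in a data segment; the function names and the example sizes are assumptions for illustration, and it ignores latency, bandwidth, and any prefetching a real DSM system might do.

```python
PAGE_SIZE = 4096  # "1 page = 4k"


def page_fault_transfers(segment_bytes, page_size=PAGE_SIZE):
    """Transfers needed when page faulting brings in one page at a time."""
    return -(-segment_bytes // page_size)  # ceiling division


def message_passing_transfers(segment_bytes):
    """Transfers needed when one message carries the whole data segment."""
    return 1 if segment_bytes > 0 else 0


if __name__ == "__main__":
    segment = 1 << 20  # a 1 MiB data segment (hypothetical example size)
    print(page_fault_transfers(segment))             # 4 KB pages
    print(page_fault_transfers(segment, 64 * 1024))  # larger 64 KB pages
    print(message_passing_transfers(segment))
```

For a 1 MiB segment this gives 256 transfers with 4 KB pages, 16 with 64 KB pages, and a single message, which is why increasing the page size closes part of the gap.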

SLIDE 21

Competition is Message Passing

MPI (Message Passing Interface)
Low abstraction
No room for complexity fabrication
As a result more successful
It should worry us that MPI did not come from distributed systems research but from linear algebra!

SLIDE 22

Server Performance

At the beginning of the Internet boom, server performance was badly lagging
Multithreaded or multiprocess servers

  • Context switching
  • Locking

Two types of solutions

  • Exokernel
  • Event-driven servers

SLIDE 23

Event-Driven Servers

Events

  • Incoming request, i/o completion, …

Single thread, event loop
Event handler per event

  • Straight code (no blocking)
  • At end:
      • nonblocking or asynchronous i/o
      • create (hand-made) continuation
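
The structure described on this slide can be sketched with Python's standard selectors module. This is a minimal echo server written for illustration, not the design of any particular server: the class and method names are assumptions, and error handling is omitted. Each handler is straight code that never blocks; at its end it issues nonblocking i/o and registers the next handler, which plays the role of the hand-made continuation.

```python
import selectors
import socket


class EventLoop:
    """Single-threaded event loop with one handler per registered socket."""

    def __init__(self):
        self.sel = selectors.DefaultSelector()

    def serve(self, host="127.0.0.1", port=0):
        """Start listening; incoming connections become events."""
        listener = socket.socket()
        listener.bind((host, port))
        listener.listen()
        listener.setblocking(False)
        self.sel.register(listener, selectors.EVENT_READ, self.on_accept)
        return listener

    def on_accept(self, listener):
        conn, _ = listener.accept()
        conn.setblocking(False)
        self.sel.register(conn, selectors.EVENT_READ, self.on_readable)

    def on_readable(self, conn):
        data = conn.recv(4096)
        if not data:  # peer closed the connection
            self.sel.unregister(conn)
            conn.close()
            return
        # Hand-made continuation: remember what is left to send and
        # resume in on_writable once the socket can accept data.
        self.sel.modify(conn, selectors.EVENT_WRITE,
                        lambda c: self.on_writable(c, data))

    def on_writable(self, conn, pending):
        sent = conn.send(pending)
        rest = pending[sent:]
        if rest:  # partial write: continue with the remainder later
            self.sel.modify(conn, selectors.EVENT_WRITE,
                            lambda c: self.on_writable(c, rest))
        else:     # done: go back to waiting for the next request
            self.sel.modify(conn, selectors.EVENT_READ, self.on_readable)

    def run_once(self, timeout=1.0):
        """Dispatch every event that is ready, one handler call per event."""
        for key, _ in self.sel.select(timeout):
            key.data(key.fileobj)
```

A server would call serve() once and then run_once() in a loop. Even in this tiny example the per-connection state has to be threaded by hand through the lambdas, a small taste of the bookkeeping this style demands.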

SLIDE 24

Advantages

No multithreading

  • No context switching
  • No locking (at least on uniprocessor)

Control over order of event handling

  • Not bound by OS scheduler

SLIDE 25

Flash

Most popular event-driven Web server
Combined multithreaded / event-driven
Many follow-ons
iMimic Networking

SLIDE 26

Reality Check

It’s too complex
Maybe Ph.D.s can figure it out
Your average industry programmer cannot
Actually, most Ph.D.s can’t either
Many (expensive) bugs

SLIDE 27

How the Problem was Solved

Linux O(1) thread scheduler
Linux futex

  • User-level locking
  • No overhead if no contention

Benefits of event-driven remain
But too small to warrant complexity

SLIDE 28

How the Problem was Solved

The main servers are all process-based or thread-based (Apache, MySQL)
It should worry us that these servers did not come out of research!

SLIDE 29

Painful Observations (1)

Most of the strong research trends have not found much application
Non-research designs have won out
Has to do with this fabricated complexity

SLIDE 30

Painful Observations (2)

Has to do with publishing/reviewing

  • Simple papers tend to get rejected
  • Complex papers tend to get in

SLIDE 31

Your Average Review Form

Novelty
Excitement
Writing
Confidence

SLIDE 32

Some Questions to Add?

Does the added functionality justify the increase in complexity?
Does the performance improvement justify the increase in complexity?
Could this system be maintained by an above-average programmer in industry?
Does this paper simplify a known solution to a worthwhile problem?

SLIDE 33

Some Likely Review Comments

« Incremental »
« Engineering »
« Nothing new »
« Boring »

SLIDE 34

It IS Possible

Virtual machines
Provide simple solutions to real problems

  • Server consolidation
  • Migration

SLIDE 35

Virtual Machines

Virtual machine monitor
VMM provides a number of VMs

  • IBM VM
  • VMWare
  • Xen
      • Open-source
      • Paravirtualization (VM ~ machine)

SLIDE 36

Provenance

DISCO: a very complex OS for SMPs
VMWare:

  • Simplified to Linux/Windows on one machine
  • Precise virtualization on x86 very complex

Xen

  • Paravirtualization to improve performance and decrease complexity
      • VMM less complex
      • Guest OS (slightly) more complex
      • Performance better (?)

SLIDE 37

The Way of All Technology

All technology

  • Becomes more complex on the inside
  • Becomes less complex on the outside

Example: car, Windows (?!)
Not sure it fully applies to software

  • Most complex systems ever built
  • Rare example of discrete complex system
  • Maybe we are over the limit already

SLIDE 38

Nonetheless

Success = interfaces defined early?
Very successful systems

  • Apache, MySQL, MPI, VMWare, Xen
  • Interfaces stable (few iterations)
  • Internal complexity grew

Less successful systems

  • DSM, event-driven
  • Interfaces unstable, complexified

SLIDE 39

Standardization (!?)

I am afraid some of it is necessary
Find a way through publishing system

SLIDE 40

Other People’s Advice

Lampson: « Keep it simple »

  • True, but somewhat impractical

Einstein: « Everything should be as simple as possible, but no more than that »

  • Implement functionality at the right interface
  • Keep interfaces stable

SLIDE 41

Lessons

Brute force often (not always) works
Our publishing and reviewing system pushes us in the opposite direction

SLIDE 42

More Lessons

It is the interface, stupid
The implementation can be complex
The interface has to be simple and stable

SLIDE 43

NYT, June 26, 2006

SLIDE 44

Thank you