slide-1
SLIDE 1

Battlefield report: Bittorrent protocol implementation

Analysis of using Erlang and Haskell
Jesper Louis Andersen
jesper.louis.andersen@gmail.com
Sep 27, 2010

slide-5
SLIDE 5

Overview

Goal: Tell a story. Give insight.
Priming: What is it, really?
Actors! You have hundreds of independent processes ...
War diary: Musings over the implementations.

slide-6
SLIDE 6

History

Etorrent - A BitTorrent client implemented in Erlang

◮ Erlang/OTP implementation
◮ Initial checkin: 27th Dec 2006
◮ Had first working version around early 2008
◮ 5 KSLOCs

Combinatorrent - A BitTorrent client in Haskell

◮ GHC (Glasgow Haskell Compiler) implementation
◮ Initial checkin: 16th Nov 2009
◮ First working version less than 2.5 months after
◮ Implements an actor-like model on top of STM (Software Transactional Memory)
◮ 4.1 KSLOCs

slide-7
SLIDE 7

Acknowledgements

This is joint work; try to make it easy to contribute:
Etorrent: Tuncer Ayaz, Magnus Klaar
Combinatorrent: Alex Mason, Andrea Vezzozi, “Astro”, Ben Edwards, John Gunderman, Roman Cheplyaka, Thomas Christensen

slide-11
SLIDE 11

Why?

Several reasons:

◮ “To fully understand a programming language, you must implement something non-trivial with it.” – Jesper's Law
  ◮ A priori
  ◮ A posteriori
◮ Gauge the effectiveness of modern functional programming languages for real-world problems.
◮ BitTorrent is a good “Problem Set”

slide-12
SLIDE 12

KSLOCs

[Figure: bar chart of source lines of code per client (wgo, combinatorrent, etorrent, bittornado, rtorrent, ktorrent, transmission, deluge); axis ticks at 20000, 40000, 60000, 80000]

slide-13
SLIDE 13

KSLOCs

[Figure: bar chart of source lines of code per client (wgo, combinatorrent, bittornado, rtorrent, ktorrent, transmission, deluge, Vuze); axis ticks at 0e+00, 1e+05, 2e+05, 3e+05, 4e+05]

slide-14
SLIDE 14

HTTP vs BitTorrent

BitTorrent is about content distribution. Some key differences:

HTTP

◮ Simple
◮ Stateless
◮ One-to-many
◮ “Serial”
◮ Upstream bandwidth heavy

BitTorrent

◮ Complex
◮ Stateful
◮ Peer-2-Peer
◮ “Concurrent”
◮ Upstream bandwidth scales proportionally with number of consumers

In BitTorrent everything is sacrificed for the last point.

slide-15
SLIDE 15

Key concepts

One: A stream of bytes is split into pieces and exchanged among peers with a message-passing protocol.
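
The first key concept can be sketched in a few lines of Haskell. This is an illustration, not code from etorrent or combinatorrent: the `Message` type covers only a subset of the BitTorrent peer messages, and `splitPieces` and `serve` are hypothetical helper names chosen for this example.

```haskell
import qualified Data.ByteString as BS
import Data.Word (Word32)

-- | A subset of the peer wire messages (bitfield and cancel are left out).
data Message
  = Choke
  | Unchoke
  | Interested
  | NotInterested
  | Have Word32                        -- piece index
  | Request Word32 Word32 Word32       -- piece index, offset, length
  | Piece Word32 Word32 BS.ByteString  -- piece index, offset, data
  deriving (Show, Eq)

-- | Split a byte stream into fixed-size pieces; the last one may be shorter.
splitPieces :: Int -> BS.ByteString -> [BS.ByteString]
splitPieces size bytes
  | size <= 0     = error "splitPieces: piece size must be positive"
  | BS.null bytes = []
  | otherwise     = let (p, rest) = BS.splitAt size bytes
                    in p : splitPieces size rest

-- | Answer a Request with a Piece message if the requested range exists.
serve :: [BS.ByteString] -> Message -> Maybe Message
serve pieces (Request idx off len) =
  case drop (fromIntegral idx) pieces of
    (p : _) | o + l <= BS.length p ->
      Just (Piece idx off (BS.take l (BS.drop o p)))
    _ -> Nothing
  where
    o = fromIntegral off
    l = fromIntegral len
serve _ _ = Nothing
```

A peer holding `splitPieces 4 stream` can answer `Request 2 0 2` with the first two bytes of piece 2; any request outside a piece's range yields `Nothing`.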

slide-17
SLIDE 17

Two: Swarm intelligence

Beehives, ant colonies, wasps.
Each client acts independently with a 10-second memory and evaluates only downstream bandwidth, unless it is seeding.
Mantra: Be friendly to your established friends, but be optimistic about gaining new ones.
Mimics human interaction.
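
The mantra can be sketched as a single choke round. This is a simplified illustration under stated assumptions, not the algorithm as implemented in either client: the `Peer` record, the number of `slots`, and the caller-supplied `pick` index (standing in for a random choice) are all hypothetical.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

data Peer = Peer
  { peerName   :: String
  , downRate   :: Double  -- bytes/s received from this peer in the last window
  , interested :: Bool    -- does the peer want data from us?
  } deriving (Show, Eq)

-- | One choke round: unchoke the `slots` fastest interested peers
-- ("established friends") plus one optimistic pick from the remainder.
chokeRound :: Int -> Int -> [Peer] -> [String]
chokeRound slots pick peers = map peerName (friends ++ optimistic)
  where
    candidates      = filter interested peers
    ranked          = sortBy (comparing (Down . downRate)) candidates
    (friends, rest) = splitAt slots ranked
    optimistic      = case rest of
      [] -> []
      _  -> [rest !! (pick `mod` length rest)]
```

Running the round again every 10 seconds with fresh rates gives the short memory described above; a seeding client would rank by upload rate instead, which this sketch leaves out.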

slide-19
SLIDE 19

Actor models

“Island model”

◮ Cheap processes (green, userland based)
◮ Fast context switch
◮ Process isolation, message pass is persistent or a copy
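
An island with a mailbox might look like this in Haskell. It is a minimal sketch assuming GHC green threads (`forkIO`) and the `stm` package's `TQueue`; the `Msg` type and the counter process are hypothetical and not taken from combinatorrent.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM

-- | Messages understood by the counter process (hypothetical protocol).
data Msg = Add Int | Get (TMVar Int)

-- | Spawn a green thread that owns its state privately; the returned
-- mailbox is the only way to talk to it.
spawnCounter :: IO (TQueue Msg)
spawnCounter = do
  mailbox <- newTQueueIO
  let loop total = do
        msg <- atomically (readTQueue mailbox)
        case msg of
          Add n     -> loop $! total + n  -- force the sum to avoid thunk build-up
          Get reply -> atomically (putTMVar reply total) >> loop total
  _ <- forkIO (loop 0)
  return mailbox

-- | Synchronous call: send a Get message and block until the reply arrives.
getTotal :: TQueue Msg -> IO Int
getTotal mailbox = do
  reply <- newEmptyTMVarIO
  atomically (writeTQueue mailbox (Get reply))
  atomically (takeTMVar reply)
```

Because the queue is FIFO, a `Get` sent after a series of `Add` messages from the same sender sees all of them.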

slide-20
SLIDE 20

Communication (Link)

[Diagram: communication links between processes. Nodes: Main, Console, Timer, Listener, FS, PieceMgr, Tracker, HTTP, PeerMgr, ChokeMgr, Status, and P1, P2, P3; for each of Peer #1 and Peer #2: Socket, PeerP, Peer SendQueue, Peer_Receiver, Peer Sender]

slide-21
SLIDE 21

Process Hierarchy (Location)

[Diagram: process tree. Nodes: S0, S1, S2, SPeer1, SPeer2; Main, Timer, Console, PeerMgr, ChokeMgr, FS, Tracker, Status, PieceMgr; P1Receiver, P1SendQ, P1PeerP, P1Sender; P2Receiver, P2SendQ, P2PeerP, P2Sender]

slide-22
SLIDE 22

Bigraphs

Bigraph = Hypergraph + Tree
Do not confuse with bipartite graphs.
Hypergraph is the link-graph
Tree is the location-graph
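
As a data structure, the two halves can be kept side by side. The sketch below is a deliberately naive encoding with hypothetical names (`Bigraph`, `childrenOf`, `linkedTo`); the wiring in `example` is illustrative only and is not read off the diagrams.

```haskell
import Data.List (nub)

type Node = String

data Bigraph = Bigraph
  { parentOf :: [(Node, Node)]  -- location graph (tree): (child, parent)
  , links    :: [[Node]]        -- link graph (hypergraph): one list per hyperedge
  } deriving Show

-- | Nodes located directly under the given node.
childrenOf :: Bigraph -> Node -> [Node]
childrenOf g p = [c | (c, q) <- parentOf g, q == p]

-- | Nodes that share at least one hyperedge with the given node.
linkedTo :: Bigraph -> Node -> [Node]
linkedTo g n = nub [m | e <- links g, n `elem` e, m <- e, m /= n]

-- | Illustrative wiring only (assumed, not taken from the slides).
example :: Bigraph
example = Bigraph
  { parentOf = [("P1Receiver", "SPeer1"), ("P1Sender", "SPeer1"), ("SPeer1", "S2")]
  , links    = [["P1Receiver", "P1Sender", "PieceMgr"]]
  }
```

Where a process lives (who restarts it) and whom it talks to are answered by different halves of the structure: `childrenOf` consults only the tree, `linkedTo` only the hyperedges.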

slide-23
SLIDE 23

Robustness

Robustness is key to good programming:

◮ Semantics (segfault, Null, off-by-one, ...)
◮ Proactive: Haskell
  ◮ Type system
◮ Reactive: Erlang
  ◮ Crashes, restarts
  ◮ Supervisors
  ◮ Redundancy

Ideas from both areas are needed in robust software!
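
The reactive idea can be borrowed in Haskell too. The following is a toy restart loop in the spirit of a supervisor, not OTP's supervisor behaviour and not code from either client; `supervise` and `flaky` are hypothetical names, and catching `SomeException` (which also catches asynchronous exceptions) is a simplification.

```haskell
import Control.Exception (SomeException, try)
import Data.IORef

-- | Run a worker; if it throws, restart it, at most `maxRestarts` times.
-- Returns the result, or the last error once the restart budget is spent.
supervise :: Int -> IO a -> IO (Either SomeException a)
supervise maxRestarts worker = go maxRestarts
  where
    go budget = do
      outcome <- try worker
      case outcome of
        Right v -> return (Right v)
        Left e
          | budget > 0 -> go (budget - 1)  -- crash: restart the worker
          | otherwise  -> return (Left e)  -- give up and report upwards

-- | A worker that fails on its first two attempts, then succeeds.
flaky :: IORef Int -> IO String
flaky attempts = do
  n <- atomicModifyIORef' attempts (\k -> (k + 1, k + 1))
  if n < 3
    then ioError (userError ("attempt " ++ show n ++ " failed"))
    else return ("ok after " ++ show n ++ " attempts")
```

With a budget of 5 restarts, `supervise 5 (flaky ref)` succeeds on the third attempt; with a budget of 1 it returns `Left` and leaves the decision to whoever is one level up.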

slide-28
SLIDE 28

Strings in Haskell and Erlang

◮ Singly linked lists of runes
◮ Simple
◮ Unicode is trivial
◮ List operations are string operations
◮ It is fairly fast
◮ Extremely memory heavy (16+ bytes per char in Erlang!)

Solution: Use ByteString for binary data in Haskell, binaries/iolists in Erlang.
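
A rough size comparison, as a sketch: `listStringBytes` simply applies the 16 bytes per character quoted above as a lower bound, and `packedBytes` measures the payload of the same text as a packed ByteString (which has a small constant header on top). Both function names are hypothetical, and `BC.pack` is only appropriate for 8-bit characters.

```haskell
import qualified Data.ByteString.Char8 as BC

-- | Lower bound for a list-based string, using the figure of
-- 16 bytes per character quoted for Erlang.
listStringBytes :: Int -> Int
listStringBytes chars = 16 * chars

-- | Payload size of the same text as packed bytes (one byte per character).
packedBytes :: String -> Int
packedBytes = BC.length . BC.pack

-- | The protocol identifier sent in the BitTorrent handshake.
protocolName :: String
protocolName = "BitTorrent protocol"
```

For the 19-character handshake string, the list form costs at least 304 bytes against 19 bytes of packed payload.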

slide-32
SLIDE 32

Some cool things in Haskell

◮ Haskell is king of abstraction (sans Proof assistants)
◮ Type system is expressive almost to the point of program proof
◮ Strong Type Zoo
◮ Combinators run at full speed in Haskell
◮ Close to being clay: you can model actors easily
◮ Excellent community - vibrant; practitioners and academics.
◮ QuickCheck - (John Hughes, Wednesday)

slide-36
SLIDE 36

The bad in Haskell

◮ Lazy evaluation - space leaks
  ◮ Heap Profile – Use strictness annotations,
  ◮ Peak Mem:
  ◮ Productivity:
  ◮ CPU/Mb:
◮ Academic compilers; stability suffers
◮ Some libraries are extremely complex type-wise
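
A typical space leak and its fix, as a small sketch. The `Stats` record and the function names are hypothetical and are not taken from combinatorrent; the point is only the contrast between a lazy accumulator and a strict one.

```haskell
import Data.List (foldl')

-- | Lazy accumulator: builds a chain of (+) thunks before adding anything.
-- GHC's optimiser may rescue this at -O; without optimisation it leaks.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- | Strict left fold: the accumulator is forced at every step.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0

-- | Strictness annotations (!) on the fields keep the counters evaluated
-- instead of storing unevaluated additions inside the record.
data Stats = Stats
  { uploaded   :: !Int
  , downloaded :: !Int
  } deriving (Show, Eq)

-- | Add one (uploaded, downloaded) sample to the running totals.
record :: Stats -> (Int, Int) -> Stats
record (Stats u d) (up, down) = Stats (u + up) (d + down)

-- | Fold a stream of samples in constant space.
total :: [(Int, Int)] -> Stats
total = foldl' record (Stats 0 0)
```

Both sums give the same answer; the difference only shows up in a heap profile, where the lazy version grows with the length of the list.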

slide-41
SLIDE 41

Some cool things in Erlang

◮ Crash-oriented programming is bliss, an error might not be fatal
◮ OTP - Actor abstraction: Servers, event drivers, finite state machine, supervision, logging, ...
◮ Processes are individually garbage collected (isolation)
◮ Interpreted language, but implementation is heavily optimized
◮ Again, excellent community!

slide-44
SLIDE 44

The bad in Erlang

◮ Not suited for number crunching (have to choose right algorithm, data structure)
◮ No way to do imperative code (Deliberate choice by the Erlang developers, have to fake it)
◮ Dynamic typing (Dialyzer project helps, processes are small (< 500 lines))

slide-47
SLIDE 47

The Ugly

Haskell:

◮ Take laziness seriously from the start
◮ Be careful when choosing libraries

Erlang:

◮ Be careful about messaging large data between processes
◮ Mnesia has optimistic conflict resolution

Both: Expect to manipulate your process model quite a bit.

slide-48
SLIDE 48

Repositories

We use GitHub for all code: http://www.github.com/jlouis
Look for etorrent and combinatorrent.