beehive: software-defined networking. Soheil Hassas Yeganeh, Yashar Ganjali. PowerPoint PPT Presentation



SLIDE 1

beehive

Soheil Hassas Yeganeh

Towards a simple abstraction for scalable software-defined networking

Yashar Ganjali

University of Toronto

SLIDE 2

Traditional networks

Switch Controller Switch Controller Switch Controller

Hard to Program Distributed Systems

SLIDE 3

Software Defined Networking

Switch Switch Controller Switch Application

Hard to Program Distributed Systems Easy Centralized

SLIDE 4

Hard to Program Distributed Systems

Software Defined Networking

Controller Application Controller Application Switch Switch Switch

Easy Existing Distributed Controllers

  • Excellent in performance & scalability
  • Perfect fit for some specific scenarios

SLIDE 5

Hard to Program Distributed Systems

Software Defined Networking

Controller Application Controller Application Switch Switch Switch

Much better than traditional networks

still: Existing Distributed Controllers

  • Don’t hide the boilerplates of distributed programming
  • Require significant efforts to instrument and optimize apps

SLIDE 6

Hard to Program Distributed Systems

Our GOAL

Controller Application Controller Application Switch Switch Switch

Easy: similar to centralized controllers + optimized placements + application analytics

SLIDE 7

Our GOAL

Application

centralized

SLIDE 8

Our GOAL

Application

centralized

Application Application Application

can be automatically transformed into

SLIDE 9

Our GOAL

Application

centralized

Application Application Application

can be automatically transformed into distributed

SLIDE 10

Our goal

Application Application Application

distributed

Machine Machine Machine

deployed on multiple physical machines.

Application

centralized

Machine

= Very challenging for generic control applications.

SLIDE 11

Overview

Application Application Application Application Control Platform Machine Machine Machine Compiler Abstraction

SLIDE 12

Application

Abstraction

what is a control application?

Dictionaries

Async messages come in; application functions process them using state dictionaries.

Function msg Function Function
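The abstraction above can be sketched in a few lines. Beehive itself is written in Go; this Python stand-in (all names illustrative, not the real API) shows the shape: an application is a set of message handlers plus named state dictionaries, driven by an async message queue.

```python
# Illustrative sketch of the abstraction, not Beehive's real API:
# an application = message handlers + named state dictionaries,
# processing an async queue of messages.
from collections import deque

class Application:
    def __init__(self):
        self.dicts = {}      # named state dictionaries, e.g. "S", "T"
        self.handlers = {}   # message type -> handler function
        self.queue = deque() # async message queue

    def dictionary(self, name):
        return self.dicts.setdefault(name, {})

    def handle(self, msg_type, fn):
        self.handlers[msg_type] = fn

    def emit(self, msg_type, **fields):
        self.queue.append((msg_type, fields))

    def run(self):
        # Process messages one at a time; handlers may emit new ones.
        while self.queue:
            msg_type, fields = self.queue.popleft()
            self.handlers[msg_type](self, fields)

app = Application()
app.handle("SwitchJoined",
           lambda a, m: a.dictionary("S").setdefault(m["si"], {}))
app.emit("SwitchJoined", si=1)
app.run()
```

After `run()`, the `S` dictionary holds an (empty) stats entry for switch 1, mirroring the Init step of the TE example later in the talk.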

SLIDE 13

Abstraction

how do applications communicate?

async messages state dictionaries

Application Function Function

functions of the same application all functions

Application Function Function Function Function msg msg msg

SLIDE 14

Example

Traffic Engineering: functions Init, Query, Collect, Route; dictionaries S (Statistics) and T (Topology).

Init:    on SwitchJoined{si}, initializes the dictionary entry S[si] (si: stats).
Query:   on Timeout, queries switches with StatQuery{si}.
Collect: on StatResult{si}, collects stat results into S[si].
Route:   on Timeout, reroutes flows via FlowMod, if needed.

SLIDE 15

Example

Traffic Engineering Init Query Collect Route si si si

How to transform TE into a distributed application while preserving state consistency?

* * S T

SLIDE 16

Example

Traffic Engineering Init Query Collect Route si si si

Functions create an implicit mapping between messages and dictionary entries: The entries a function needs to process a message.

* * S T

SwitchJoined{si}, StatResult{si}: the dictionary key is in the message.
Timeout (Query): for each entry.
Timeout (Route): all entries.
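As a rough sketch (hypothetical helper, not Beehive's API), the TE mapping can be written as a function from a message to the dictionary entries it needs:

```python
# Illustrative sketch of the implicit message-to-entries mapping for
# the TE example. Message and dictionary names follow the slides;
# cells_for itself is a hypothetical helper.
S = {1: {"flow1": 10}, 2: {"flow3": 7}}  # per-switch flow stats

def cells_for(msg):
    """Return the entries of S a handler needs for this message."""
    if msg["type"] in ("SwitchJoined", "StatQuery", "StatResult"):
        return {("S", msg["si"])}          # the key is in the message
    if msg["type"] == "Timeout":           # Route() reads all of S
        return {("S", k) for k in S}
    return set()
```

A StatResult for switch 2 maps to the single entry ("S", 2), while a Timeout handled by Route maps to every entry of S, which is exactly what makes distribution tricky.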

SLIDE 17

Example

Traffic Engineering Init Query Collect Route si si si

Init(), Query() and Collect() access S on a per switch basis.

S T

SwitchJoined{si} Timeout StatResult{si}

SLIDE 18

Example

Traffic Engineering Init Query Collect Route

Init(), Query() and Collect() access S on a per switch basis.

si si si

Switch  Entry
1       flow1 -> stat, flow2 -> stat
2       flow3 -> stat, flow4 -> stat

S T

SLIDE 19

Example

Traffic Engineering Init Query Collect Route s1 s1 s1 Traffic Engineering Init Query Collect Route s2 s2 s2 Machine 1 Machine 2

Switch  Entry
1       flow1 -> stat, flow2 -> stat

Switch  Entry
2       flow3 -> stat, flow4 -> stat

S T S T

SLIDE 20

Example

Traffic Engineering Init Query Collect Route

Init(), Query() and Collect() access S on a per switch basis.

si si si

Switch  Entry
1       flow1 -> stat, flow2 -> stat
2       flow3 -> stat, flow4 -> stat

*

Route() accesses the whole dictionary S to process the timeout message.

S T

Timeout

SLIDE 21

Example

Traffic Engineering Init Query Collect Route s1 s1 s1 Traffic Engineering Init Query Collect Route s2 s2 s2 Machine 1 Machine 2

Switch  Entry
1       flow1 -> stat, flow2 -> stat

Switch  Entry
2       flow3 -> stat, flow4 -> stat

* *

This will cause inconsistency.

S T S T

SLIDE 22

Example

Traffic Engineering Init Query Collect Route si si si Traffic Engineering Init Query Collect Route Machine 1 Machine 2 *

Switch  Entry
1       flow1 -> stat, flow2 -> stat
2       flow3 -> stat, flow4 -> stat

S T S T

SLIDE 23

consistency

Application Function 1 Function 2

k1 k2 k3 k4 k5

msg 1 msg 2

k4 k2 k2 k3 k5 k1

msg 3

SLIDE 24

consistency

Application Function 1 Function 2

k2 k3 k4 k5

msg 1 msg 2 Machine Machine Application Function 1 Function 2

k1

msg 3

SLIDE 25

We need a runtime that steers messages among application instances while preserving consistency.

Application Function 1 Function 2 Application Function 1 Function 2 Application Function 1 Function 2
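A minimal sketch of that steering idea, assuming a single global view of cell ownership (class and method names are illustrative, not Beehive's API): the runtime routes each message to the bee that exclusively owns the mapped cells, assigning a fresh bee when the cells are unowned.

```python
# Illustrative steering sketch: route a message to the one bee that
# exclusively owns its cells, so no two machines touch the same entry.
class Runtime:
    def __init__(self):
        self.owner = {}    # cell -> bee id
        self.next_bee = 0

    def steer(self, cells):
        owners = {self.owner[c] for c in cells if c in self.owner}
        if len(owners) == 1:
            bee = owners.pop()       # cells already co-owned by one bee
        elif not owners:
            bee = self.next_bee      # unowned cells: assign a fresh bee
            self.next_bee += 1
        else:
            # Real systems must merge ownership here; this sketch bails.
            raise RuntimeError("cells owned by different bees")
        for c in cells:
            self.owner[c] = bee
        return bee
```

Messages mapped to switch 1's entry and switch 2's entry land on different bees, while repeated messages for the same entry keep landing on the same bee, which is what preserves consistency.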

SLIDE 26

control platform

Application

F F

Hive Hive

Application

F F

Hive + Cell + Bee

SLIDE 27

control platform

Application

F F

Hive Hive

Application

F F

Hive

  • is the controller
  • provides the boilerplates (e.g., locking, consistency, …)
  • can run on a separate machine
SLIDE 28

control platform

Application

F F

Hive Hive

Application

F F

Cell
  • an entry in a dictionary of a specific application
  • e.g., (TE, S, si, stats of si)
SLIDE 29

control platform

Application

F F

Hive Hive

Application

F F

Bee

  • a lightweight thread of execution
  • processes messages
  • exclusively owns a set of cells
SLIDE 30

Switch Switch

control platform

TE

C I

Hive Hive

TE

C I

Switch Switch m

SLIDE 31

control platform

TE

C I

Hive Hive

TE

C I

Switch Switch m Switch Switch

SLIDE 32

control platform

TE

C I

Hive Hive

TE

C I

Switch Switch m

How do we infer the cells?

Switch Switch

SLIDE 33

control platform

Hive

TE

C I

m

How do we infer the cells?

func Collect(r, s):
    s.append(flow stats in r)

on StatReply(r):
    Collect(r, S[r.switch])

map StatReply(r):
    return {(S, r.switch)}

map(app, msg) is an application-defined function that maps a message to the set of cells used to process that message. Beehive's compiler can automatically generate the map function. 1-3 lines of code.

Switch Switch
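A runnable Python stand-in for the slide's pseudocode (Beehive's real API is in Go; these names are illustrative): the handler processes a StatReply against the per-switch entry of S, and the map function names that entry as the cell the handler needs.

```python
# Illustrative Python version of the TE pseudocode from the slide.
S = {}  # per-switch flow statistics: switch id -> {flow: count}

def collect(reply, stats):
    # Record the flow stats carried in the reply into this switch's entry.
    for flow, count in reply["flows"].items():
        stats[flow] = count

def on_stat_reply(reply):
    # Handler: process the reply against S[reply's switch].
    collect(reply, S.setdefault(reply["switch"], {}))

def map_stat_reply(reply):
    # The map function: the set of cells used to process this message.
    return {("S", reply["switch"])}
```

Because `map_stat_reply` depends only on the message, the runtime can decide where to run the handler before touching any state, which is what lets the platform shard S by switch.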

SLIDE 34

control platform

Hive

TE

C I

m Switch Switch

  • Function Composition
  • Transactions (State + Messages)
  • Bee Migration
  • Fault tolerance
  • Optimized Placement
  • Runtime Instrumentation
  • Feedback
  • Proxied Hives
SLIDE 35

Migration

TE

C I

Hive Hive

TE

C I

Switch Switch Switch Switch

SLIDE 36

Migration

TE

C I

Hive Hive

TE

C I

Switch Switch Switch Switch m

SLIDE 37

Migration

TE

C I

Hive Hive

TE

C I

Switch Switch Switch Switch m

SLIDE 38

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch

SLIDE 39

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch

This is not optimal and can happen often.

SLIDE 40

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch m m

SLIDE 41

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch m m

SLIDE 42

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch m m m m

SLIDE 43

Migration

TE

C I

Hive Hive

TE

C I

m Switch Switch Switch Switch m m m m

When/where should we migrate bees?

  • NP-Hard problem
  • We use a simple heuristic
SLIDE 44

Optimized Placement

TE

C I

Hive Hive

TE

C I

Switch Switch Switch Switch

Our heuristic: a bee that receives the majority of its messages from bees on another hive is migrated to that hive.
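A sketch of that heuristic under simple assumptions (a hypothetical per-bee traffic matrix of received messages, keyed by sending hive; function and argument names are illustrative):

```python
# Illustrative placement heuristic: migrate each bee to the hive from
# which it receives a strict majority of its messages.
def plan_migrations(traffic, placement):
    """traffic: {bee: {hive: msgs received from that hive}};
    placement: {bee: current hive}. Returns {bee: target hive}."""
    moves = {}
    for bee, by_hive in traffic.items():
        total = sum(by_hive.values())
        best = max(by_hive, key=by_hive.get)
        # Majority of messages come from another hive -> migrate there.
        if best != placement[bee] and by_hive[best] * 2 > total:
            moves[bee] = best
    return moves
```

A bee on h1 that gets 9 of its 10 messages from h2 is moved to h2; a bee already colocated with its main senders, or with no strict majority elsewhere, stays put.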

SLIDE 45

Runtime instrumentation

TE

C I

Hive Hive

TE

C I

Switch Switch Switch Switch

  • traffic matrix among bees
  • resource consumption
  • message provenance
SLIDE 46

Analytics & FeedBack

TE

C I

Hive Hive

TE

Switch Switch Switch Switch

R Q C I R Q

SLIDE 47

Analytics & FeedBack

TE

C I

Hive Hive

TE

Switch Switch Switch Switch

R Q C I R Q

SLIDE 48

Analytics & FeedBack

TE

C I

Hive Hive

TE

Switch Switch Switch Switch

R Q C I R Q

centralized Hives Hives

SLIDE 49

Analytics & FeedBack

TE

C I

Hive Hive

TE

Switch Switch Switch Switch

R Q C I R Q

centralized Hives Hives

m m

SLIDE 50

Analytics & FeedBack

Hive Hive Switch Switch Switch Switch

well-balanced Hives Hives

TE

C I

TE

R Q C I R Q

m m

SLIDE 51

Fault tolerance

Hive Hive Hive

Colony of replicated bees all in consensus about their state.

TE

C I R Q

TE

C I R Q

TE

C I R Q

SLIDE 52

Generality

Centralized Kandoo NIB

func Centralized(msg): …
map Centralized(msg): return {(D, 0)}

func Local(msg): …
map Local(msg): return {(D, hiveid)}

func NIB(msg): …
map NIB(msg): return {(N, nodeid)}

Virtual Networking

func VN(msg): …
map VN(msg): return {(VN, vnid)}

Routing

func Router(msg): …
map Router(msg): return {(Adv, msg.n[0])}

+ you don’t need to think about placement and load balancing in most cases.
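The map functions above can be mimicked in Python to make the point concrete (illustrative names; Beehive's real API is in Go): a constant cell centralizes an application, a hive-keyed cell keeps processing local, and a node-keyed cell partitions state like a NIB.

```python
# Illustrative Python versions of the generality map functions.
def map_centralized(msg):
    return {("D", 0)}            # one fixed cell -> one bee -> centralized

def map_local(msg, hive_id):
    return {("D", hive_id)}      # one cell per hive -> local (Kandoo-style)

def map_nib(msg):
    return {("N", msg["node"])}  # partitioned by network node
```

The application code stays the same in each case; only the 1-line map function changes how the platform distributes it.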

SLIDE 53

Implementation

  • Free & Open Source, written in Go:
  • https://github.com/kandoo/beehive
  • https://github.com/kandoo/beehive-netctrl
  • No external dependencies in the most recent version
  • OpenFlow bindings are generated from high level specs:
  • https://github.com/packet/packet


SLIDE 54

Evaluation

  • The TE application
  • Simulated environment
  • A 40 node cluster on GCE


[Plots: control bandwidth (BW, KB/s, 400-1000) per hive, hives 1-40, Centralized vs. Decoupled]

These spikes are for instrumentation data (periodic at 10s)

SLIDE 55


Evaluation

  • The TE application
  • Simulated environment
  • A 40 node cluster on GCE


[Plots: control bandwidth (BW, KB/s, 400-1000) per hive, hives 1-40, Decoupled]

All artificially centralized, then dynamically optimized

This spike is for replicating cells on 40 hives. (~4sec.)

SLIDE 56

Final Remarks

  • Beehive = Abstraction + Control Platform
  • Almost identical to centralized controllers
  • Dynamically optimized placement
  • Runtime instrumentation and feedback
  • Moving forward
  • Strengthen our evaluation
  • Performance optimizations


SLIDE 57

Distributed programming in SDN doesn’t have to be complicated.

Thanks