The Art of Consistent SDN Updates, Stefan Schmid, Aalborg University



slide-1
SLIDE 1

The Art of Consistent SDN Updates

Stefan Schmid

Aalborg University

slide-2
SLIDE 2

The Art of Consistent SDN Updates

Stefan Schmid

Aalborg University

Smart students in Berlin & Wroclaw: Arne Ludwig, Jan Marcinkowski, Szymon Dudycz, Matthias Rost, Damien Foucard, Saeed Amiri

slide-4
SLIDE 4

SDN: Algorithms with a fundamental twist!

Ctrl

Control Programs Control Programs

Applications and Control Plane … and regarding decoupling / interconnect! Data Plane

slide-5
SLIDE 5

SDN: Flexibilities and Constraints

Ctrl

Control Programs Control Programs

Applications and Control Plane … and regarding interconnect! Data Plane

SDN/OpenFlow is about generality and flexibility:

❏ How packets are matched (L2-L4 header fields and beyond)
❏ How flows are defined (fine vs coarse granularity, proactive vs reactive)
❏ Whether events are handled centrally or in a distributed manner, etc.

But there are also constraints and challenges:

❏ SDN is an inherently asynchronous distributed system (the controller is decoupled)
❏ Switches are simple devices (not a Turing machine, not even a state machine!)
❏ IP routing is prefix-based
❏ Use the dynamic flexibilities carefully: don’t shoot yourself in the foot!

slide-6
SLIDE 6

Applications: Algorithms with a twist!

Ctrl

❏ Let’s consider: Traffic Engineering
  ❏ Circuit routing, call admission
  ❏ Raghavan, Wolsey, Awerbuch, etc.
❏ SDN twist: more general/flexible!
  ❏ Non-shortest paths and more
  ❏ Enables complex network services: steer traffic through middleboxes, i.e., waypoints (firewall, proxy, etc.): paths may contain loops!
  ❏ More than independent routing per segment: all-or-nothing segment admission control, joint optimization
  ❏ E.g., LP relaxation (Raghavan et al.): how to randomly round and decompose complex requests?

Control Programs Control Programs

slide-8
SLIDE 8

Applications: Algorithms with a twist!

Ctrl

❏ Let’s consider: Traffic Engineering
  ❏ Classic routing, call admission
  ❏ Wolsey, Awerbuch, Plotkin, etc.
❏ SDN twist: more general/flexible!
  ❏ Non-shortest paths
  ❏ Enables complex network services: steer traffic through middleboxes, i.e., waypoints (firewall, proxy, etc.): paths may contain loops!
  ❏ More than independent routing per segment: all-or-nothing segment admission control, joint optimization
  ❏ E.g., LP relaxation (Raghavan et al.): how to randomly round and decompose complex requests?

Control Programs Control Programs

SIROCCO 2015, arXiv 2016

Optionally NFV twist: where to place NFV (or hybrid SDN)? Facility location / capacitated dominating set, but: not the distance to, but the distance via function(s) matters!

Service Chain and Virtual Network Embeddings: Approximations using Randomized Rounding. Matthias Rost and Stefan Schmid. ArXiv Technical Report, April 2016.
Online Admission Control and Embedding of Service Chains. Tamás Lukovszki and Stefan Schmid. 22nd International Colloquium on Structural Information and Communication Complexity (SIROCCO), Montserrat, Spain, July 2015.
An Approximation Algorithm for Path Computation and Function Placement in SDNs. Guy Even, Matthias Rost, and Stefan Schmid. ArXiv Technical Report, March 2016.

slide-10
SLIDE 10

Applications: Algorithms with a twist!

Ctrl

❏ Let’s consider: Traffic Engineering
  ❏ Classic routing, call admission
  ❏ Wolsey, Awerbuch, Plotkin, etc.
❏ SDN twist: more general/flexible!
  ❏ Non-shortest paths
  ❏ Enables complex network services: steer traffic through middleboxes, i.e., waypoints (firewall, proxy, etc.): paths may contain loops!
  ❏ More than independent routing per segment: all-or-nothing segment admission control
  ❏ E.g., LP relaxation (Raghavan et al.): how to randomly round and decompose complex requests?

Control Programs Control Programs

SIROCCO 2015, arXiv 2016

Optionally NFV twist: where to place NFV (or hybrid SDN)? Facility location / capacitated dominating set, but: not the distance to, but the distance via function(s) matters! Migration upon each new request is undesirable: want incremental deployment! Related to submodular capacitated set cover and scheduling (Fleischer, Khuller), but end-to-end.

It's a Match! Near-Optimal and Incremental Middlebox Deployment. Tamás Lukovszki, Matthias Rost, and Stefan Schmid. ACM SIGCOMM Computer Communication Review (CCR), January 2016.

slide-13
SLIDE 13

Control Plane: Algorithms with a twist!

Ctrl Ctrl

❏ Reduce latency and overhead: What can be computed locally?
  ❏ Routing vs heavy-hitter detection?
  ❏ LOCAL model! Insights apply: verification vs optimization
❏ SDN twist: pre-processing!
  ❏ Hard in LOCAL: symmetry breaking! But unlike ad-hoc networks: no need to discover the network from scratch
  ❏ Topology events are less frequent than flow-related events
  ❏ If links fail: subgraph! Find precomputed structures that are still useful in the subgraph (e.g., proof labelings)
  ❏ Precomputation is known to help for relevant problems: load-balancing / matching

Ctrl Ctrl

HotSDN 2013

How to make the control plane robust? A software transactional memory problem: network configuration = shared memory, updates = transactions. But with a twist: flows are uncontrolled, real-time transactions: do not abort! (And not only read!)

A Distributed and Robust SDN Control Plane for Transactional Network Updates. Marco Canini, Petr Kuznetsov, Dan Levin, and Stefan Schmid. 34th IEEE Conference on Computer Communications (INFOCOM), Hong Kong, April 2015.

slide-15
SLIDE 15

Control Plane: Algorithms with a twist!

Ctrl Ctrl

❏ Reduce latency and overhead: What can be computed locally?
  ❏ Routing vs heavy-hitter detection?
  ❏ LOCAL model! Insights apply: verification vs optimization
❏ SDN twist: pre-processing!
  ❏ Hard in LOCAL: symmetry breaking! But unlike ad-hoc networks: no need to discover the network from scratch
  ❏ Topology events are less frequent than flow-related events
  ❏ If links fail: subgraph! Find precomputed structures that are still useful in the subgraph (e.g., proof labelings)
  ❏ Precomputation is known to help for relevant problems: load-balancing / matching

Ctrl Ctrl

HotSDN 2013

Careful: independent flow spaces do not imply that controllers can update concurrently without conflict, e.g., due to a shared embedding! Atomic read-modify-write?

How to make the control plane robust? A software transactional memory problem: network configuration = shared memory, updates = transactions. But with a twist: flows are uncontrolled, real-time transactions: do not abort! (And not only read!)

In-Band Synchronization for Distributed SDN Control Planes. Liron Schiff, Petr Kuznetsov, and Stefan Schmid. ACM SIGCOMM Computer Communication Review (CCR), January 2016.

slide-18
SLIDE 18

Data Plane: Algorithms with a twist!

Ctrl

HotSDN 2014

❏ Even in SDN: Keep some functionality in the data plane!
  ❏ E.g., for performance: OpenFlow local fast failover: 1st line of defense
❏ SDN twist: data plane algorithms operate under simple conditions
  ❏ Failover tables are statically (proactively) preconfigured, without knowledge of multiple failures
  ❏ At runtime: local view only, and header space is a scarce resource
  ❏ With tagging: graph exploration
  ❏ Without tagging: combinatorial problem
  ❏ Later: consolidate this with the controller!

With infinite header space, ideal robustness is possible. But what about bounded header space? And the resulting route lengths? Without good algorithms, routing may disconnect way before the physical network does!

Provable Data Plane Connectivity with Local Fast Failover: Introducing OpenFlow Graph Algorithms. Michael Borokhovich, Liron Schiff, and Stefan Schmid. ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN), Chicago, Illinois, USA, August 2014.
How (Not) to Shoot in Your Foot with SDN Local Fast Failover: A Load-Connectivity Tradeoff. Michael Borokhovich and Stefan Schmid. 17th International Conference on Principles of Distributed Systems (OPODIS), Nice, France, Springer LNCS, December 2013.

slide-19
SLIDE 19

Decoupling: Algorithms with a twist!

Ctrl

❏ Decoupling is already challenging for a single switch!
❏ The "Hello World" application of networking: MAC learning
❏ MAC learning has an SDN twist: the MAC-learning SDN controller is decoupled, so it may miss a response and keep flooding!
❏ Need to configure rules s.t. the controller stays informed when necessary!

? ? ?

slide-21
SLIDE 21

Decoupling: Algorithms with a twist!

Ctrl

❏ In-band control: cheap but algorithmically challenging!
  ❏ Distributed coordination algorithms to manage switches?
  ❏ Powerful fault-tolerance concept: self-stabilization
❏ SDN twist: switches are simple!
  ❏ Cannot actively participate in arbitrary self-stabilizing spanning tree protocols
  ❏ Controller needs to install tree rules

[Figure label: unmanaged!]

DISN 2016

Ground Control to Major Faults: Towards a Fault Tolerant and Adaptive SDN Control Network Liron Schiff, Stefan Schmid, and Marco Canini. IEEE/IFIP DSN Workshop on Dependability Issues on SDN and NFV (DISN), Toulouse, France, June 2016.

slide-23
SLIDE 23

Decoupling: Algorithms with a twist!

Ctrl

❏ Researchers proposed to exploit SDN rule-definition flexibilities to solve the growing FIB size problem
  ❏ OpenFlow-based IP router: caching and aggregation
  ❏ Zipf law: many infrequent prefixes at controller
  ❏ Extremely distributed control
❏ Online paging with SDN twist
  ❏ Forwarding semantics: longest-prefix-match forwarding, i.e., dependencies: only offload a root-contiguous set in the trie
  ❏ Can do bypassing

[Figure label: to ctrl]

ICDCS 2014

Online Tree Caching. Marcin Bienkowski, Jan Marcinkowski, Maciej Pacut, Stefan Schmid, and Aleksandra Spyra. ArXiv Technical Report, February 2016.
Competitive FIB Aggregation without Update Churn. Marcin Bienkowski, Nadi Sarrar, Stefan Schmid, and Steve Uhlig. 34th International Conference on Distributed Computing Systems (ICDCS), Madrid, Spain, June 2014.

slide-25
SLIDE 25

Interconnect: Algorithms with a twist!

Ctrl

❏ Another challenge: asynchronous communication channel

asynchronous

He et al., ACM SOSR 2015: without network latency

Not only because of network latency, but also data structures!

slide-27
SLIDE 27

untrusted hosts trusted hosts

Controller Platform

What can possibly go wrong?

Invariant: Traffic from untrusted hosts to trusted hosts via firewall!

asynchronous

slide-28
SLIDE 28

Example 1: Bypassed Waypoint

insecure Internet secure zone

Controller Platform

slide-29
SLIDE 29

Example 2: Transient Loop

insecure Internet secure zone

Controller Platform

slide-31
SLIDE 31

Tagging: A Universal Solution?

[Figure: packets are tagged red along the old route and blue along the new route]

❏ Old route: red
❏ New route: blue
❏ 2-Phase Update:
  ❏ Install blue flow rules internally
  ❏ Flip tag at ingress ports

Cost of extra rules? Where to tag? Header space? Overhead? Time until the new link becomes available?
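
The 2-phase update can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not the implementation behind the talk: the per-switch rule table `rules`, the helpers `forward` and `two_phase_update`, and the four-node example routes are all hypothetical names chosen for illustration.

```python
def forward(rules, tag, src, dst):
    """Follow the rules matching `tag` from src to dst; the tag is set once at the ingress."""
    path = [src]
    while path[-1] != dst:
        path.append(rules[path[-1]][tag])
    return path


def two_phase_update(rules, new_route, new_tag):
    """Hypothetical helper: install the new route under `new_tag`, then return the tag to stamp."""
    # Phase 1: add rules for the new tag on every switch of the new route.
    # Rules for the old tag stay untouched, so packets with the old tag are unaffected.
    for here, nxt in zip(new_route, new_route[1:]):
        rules.setdefault(here, {})[new_tag] = nxt
    # Phase 2: only after all new rules are in place does the ingress flip its tag.
    return new_tag


# Assumed example: old route (red) s -> a -> b -> d, new route (blue) s -> b -> a -> d.
rules = {"s": {"red": "a"}, "a": {"red": "b"}, "b": {"red": "d"}}
before = forward(rules, "red", "s", "d")
ingress_tag = two_phase_update(rules, ["s", "b", "a", "d"], "blue")
in_flight = forward(rules, "red", "s", "d")    # packet stamped before the flip
after = forward(rules, ingress_tag, "s", "d")  # packet stamped after the flip
```

In this model every packet follows either the old or the new route in full, which is the per-packet consistency that tagging buys. The price is visible too: each switch temporarily holds rules for both tags, and the tag consumes header space.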

slide-32
SLIDE 32

Alternative: Weaker Transient Consistency

Idea: Packet may take a mix of old and new path, as long as weaker consistencies are fulfilled transiently, e.g. Loop-Freedom (LF) and Waypoint Enforcement (WPE). Schedule safe subsets in multiple rounds

Controller Platform Controller Platform

Round 1 Round 2 …
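
Whether a round is "safe" can be checked by brute force on small instances. The sketch below assumes one common reading of transient consistency: because the channel is asynchronous, any subset of a round's updates may already be in effect, and loop-freedom (LF) and waypoint enforcement (WPE) must hold for each of them. The function names and the three-switch instance (waypoint `w`) are hypothetical.

```python
from itertools import combinations


def walk(next_hop, start, dst):
    """Follow next hops from start; return the path, or None if a node repeats."""
    path, seen = [start], {start}
    while path[-1] != dst:
        hop = next_hop[path[-1]]
        if hop in seen:
            return None
        seen.add(hop)
        path.append(hop)
    return path


def round_is_safe(old, new, done, round_nodes, src, dst, waypoint):
    """Check LF and WPE for every subset of this round's updates (exponential, small inputs only)."""
    nodes = sorted(round_nodes)
    for k in range(len(nodes) + 1):
        for applied in combinations(nodes, k):
            updated = set(done) | set(applied)
            state = {v: (new[v] if v in updated else old[v]) for v in old}
            if any(walk(state, v, dst) is None for v in state):
                return False  # transient loop somewhere in the network
            if waypoint not in walk(state, src, dst):
                return False  # packets from src could bypass the waypoint
    return True


# Assumed instance: old path s -> w -> a -> d, new path s -> a -> w -> d.
old = {"s": "w", "w": "a", "a": "d"}
new = {"s": "a", "w": "d", "a": "w"}
```

On this instance, updating `s` first bypasses the waypoint, updating `a` first creates a loop, and the schedule `{w}`, `{a}`, `{s}` passes the check in every round.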

slide-33
SLIDE 33

The Spectrum of Consistency

Strong
  per-packet consistency (Reitblatt et al., SIGCOMM 2012)
  correct network virtualization (Ghorbani and Godfrey, HotSDN 2014)
  weak, transient consistency: loop-freedom, waypoint enforcement (Mahajan and Wattenhofer, HotNets 2014; Ludwig et al., HotNets 2014)
Weak

slide-34
SLIDE 34

Going Back to Our Examples: LF Update?

insecure Internet secure zone

slide-36
SLIDE 36

Going Back to Our Examples: LF Update!

[Figure: snapshots of the network between the insecure Internet and the secure zone, with rounds R1 and R2]

LF ok! But: WPE violated in Round 1!

slide-37
SLIDE 37

Going Back to Our Examples: WPE Update?

insecure Internet secure zone

slide-39
SLIDE 39

Going Back to Our Examples: WPE Update!

[Figure: snapshots of the network between the insecure Internet and the secure zone, with rounds R1 and R2]

… ok but may violate LF in Round 1!

slide-40
SLIDE 40

Going Back to Our Examples: Both WPE+LF?

insecure Internet secure zone

slide-42
SLIDE 42

Going Back to Our Examples: WPE+LF!

[Figure: snapshots of the network between the insecure Internet and the secure zone, with rounds R1, R2 and R3]

Is there always a WPE+LF schedule?

slide-43
SLIDE 43

What about this one?

slide-45
SLIDE 45

LF and WPE may conflict!

❏ Cannot update any forward edge in R1: WPE
❏ Cannot update any backward edge in R1: LF

No schedule exists!

Good Network Updates for Bad Packets: Waypoint Enforcement Beyond Destination-Based Routing Policies. Arne Ludwig, Matthias Rost, Damien Foucard, and Stefan Schmid. 13th ACM Workshop on Hot Topics in Networks (HotNets), Los Angeles, California, USA, October 2014.

slide-46
SLIDE 46

What about this one?

slide-47
SLIDE 47

What about this one?

❏ Step 1: Forward edge after the waypoint: safe!
  ❏ No loop, no WPE violation

slide-48
SLIDE 48

What about this one?

❏ Step 2: Now this backward edge is safe too!
  ❏ No loop because of the exit through edge 1

slide-49
SLIDE 49

What about this one?

❏ Step 3: Now this is safe: ready back to WP!
  ❏ No waypoint violation

slide-50
SLIDE 50

What about this one?

❏ Step 4: Ok: loop-free and also not on the path (exit via edge 1)

slide-52
SLIDE 52

What about this one?

[Figure: edges labeled with their update rounds 1, 2, 3, 4, 4, 5]

slide-53
SLIDE 53

Back to the start: What if…. 1

slide-58
SLIDE 58

Back to the start: What if… also this one?!

❏ Update any of the 2 backward edges? Violates LF!
❏ Update any of the 2 other forward edges? Violates WPE!
❏ What about a combination? Nope…

slide-61
SLIDE 61

Back to the start: What if… also this one?!

To update or not to update in the first round? That is the question… which leads to NP-hardness!

Transiently Secure Network Updates. Arne Ludwig, Szymon Dudycz, Matthias Rost, and Stefan Schmid. 42nd ACM SIGMETRICS, Antibes Juan-les-Pins, France, June 2016.

slide-62
SLIDE 62

Let us focus on loop-freedom only: always possible in n rounds! (How?) But how to minimize rounds?
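
One possible answer to the "(How?)", offered as a sketch and not necessarily the argument intended in the talk: update a single node per round, starting at the destination and walking backward along the new path. Each updated node then points to a node that already follows the new path to d, so no loop can form. The helper names and the five-node instance below are assumptions for illustration.

```python
def has_loop(next_hop, dst):
    """True if following next hops from some node revisits a node before reaching dst."""
    for start in next_hop:
        seen, node = {start}, start
        while node != dst:
            node = next_hop[node]
            if node in seen:
                return True
            seen.add(node)
    return False


def one_node_per_round(new_path):
    """Hypothetical schedule: one node per round, from the destination backward along the new path."""
    return [[v] for v in reversed(new_path[:-1])]


# Assumed instance: the new path visits the inner nodes in reverse order.
old_path = ["s", "v2", "v3", "v4", "d"]
new_path = ["s", "v4", "v3", "v2", "d"]
old = dict(zip(old_path, old_path[1:]))
new = dict(zip(new_path, new_path[1:]))

schedule = one_node_per_round(new_path)
state = dict(old)
loop_free_throughout = not has_loop(state, "d")
for round_nodes in schedule:
    for v in round_nodes:
        state[v] = new[v]
    # Check the forwarding state after every round.
    loop_free_throughout = loop_free_throughout and not has_loop(state, "d")
```

This uses one round per node that has a forwarding rule, which matches the "n rounds" bound but says nothing about the minimum number of rounds.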

slide-63
SLIDE 63

Example: Optimal 2-Round Update Schedules

slide-64
SLIDE 64

Clear: in Round 1 (R1), I can only update "forward" links! What about the last round? Observe: an update schedule read backward (i.e., updating from the new to the old policy) must also be legal! I.e., in the last round (R2), I can update all nodes whose old edge points "forward" w.r.t. the new path. Symmetry!

Example: Optimal 2-Round Update Schedules

slide-70
SLIDE 70

Optimal Algorithm for 2-Round Instances: Leveraging Symmetry!

❏ Classify nodes/edges with 2-letter code:
  ❏ First letter (F, B): Does the (dashed) new edge point forward or backward wrt the (solid) old path?
  ❏ Second letter (F, B): Does the (solid) old edge point forward or backward wrt the (dashed) new path?

[Figure: the same instance drawn twice, once with the old policy from left to right (first letters: F F F B B B) and once with the new policy from left to right (second letters: F B B F B F)]

slide-75
SLIDE 75

Optimal Algorithm for 2-Round Instances: Leveraging Symmetry!

❏ Classify nodes/edges with 2-letter code:
  ❏ First letter (F, B): Does the (dashed) new edge point forward or backward wrt the (solid) old path?
  ❏ Second letter (F, B): Does the (solid) old edge point forward or backward wrt the (dashed) new path?

Insight 1: In the 1st round, I can safely update all edges whose first letter is F (forward wrt the old path)! Loop-free for sure.
Insight 2: Valid schedules are reversible! A valid schedule from old to new, read backward, is a valid schedule from new to old!
Insight 3: Hence, in the last round, I can safely update all edges whose second letter is F (forward wrt the new path)! Loop-free for sure.

2-Round Schedule: exists if and only if there are no BB edges! Then I can update the edges with first letter F in the first round and the edges with second letter F in the second round! That is, FB must be in the first round, BF must be in the second round, and FF are flexible!
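
The classification and the 2-round criterion can be sketched as follows. This assumes that the old and the new path visit the same set of nodes and share source and destination; `classify` and `two_round_schedule` are hypothetical names, and placing the flexible FF nodes in round 1 is just one of the allowed choices.

```python
def classify(old_path, new_path):
    """Return {node: two-letter code} for every node except the destination."""
    pos_old = {v: i for i, v in enumerate(old_path)}
    pos_new = {v: i for i, v in enumerate(new_path)}
    old_next = dict(zip(old_path, old_path[1:]))
    new_next = dict(zip(new_path, new_path[1:]))
    codes = {}
    for v in old_next:
        # First letter: does the new edge of v point forward wrt the old path?
        first = "F" if pos_old[new_next[v]] > pos_old[v] else "B"
        # Second letter: does the old edge of v point forward wrt the new path?
        second = "F" if pos_new[old_next[v]] > pos_new[v] else "B"
        codes[v] = first + second
    return codes


def two_round_schedule(old_path, new_path):
    """Return [round1, round2], or None if a BB node rules out two rounds."""
    codes = classify(old_path, new_path)
    if "BB" in codes.values():
        return None
    round1 = [v for v, c in codes.items() if c[0] == "F"]  # FB and (by choice) FF
    round2 = [v for v, c in codes.items() if c[0] == "B"]  # BF
    return [round1, round2]


# Two assumed instances: one without and one with a BB node.
easy = two_round_schedule(["s", "a", "b", "d"], ["s", "b", "a", "d"])
hard_codes = classify(["s", "v2", "v3", "v4", "d"], ["s", "v4", "v3", "v2", "d"])
hard = two_round_schedule(["s", "v2", "v3", "v4", "d"], ["s", "v4", "v3", "v2", "d"])
```

In the second instance, node `v3` is BB: its new edge points backward on the old path and its old edge points backward on the new path, so it fits neither the first nor the last round of a 2-round schedule.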

slide-77
SLIDE 77

Intuition Why 3 Rounds Are Hard

❏ Structure of a 3-round schedule:

                 Round 1            Round 2                      Round 3
Allowed:         F edges: FF, FB    all edges: FF, FB, BF, BB    F edges: FF, BF
Boils down to:   FB                 BB                           BF
FF:              ? (in which round?)

W.l.o.g., can do FB in R1 and BF in R3: moving forward edges does not introduce loops, nor does making the graph sparser.

slide-79
SLIDE 79

A hard decision problem: when to update FF?

[Figure labels: BB; exit from loop]

Intuition Why 3 Rounds Are Hard

❏ We know: BB node v6 can only be updated in R2
❏ When to update FF nodes to enable the update of BB in R2?
❏ E.g., updating FF node v4 in R1 allows updating BB node v6 in R2

slide-80
SLIDE 80

A hard decision problem: when to update FF?

[Figure labels: BB; no exit from loop!]

Intuition Why 3 Rounds Are Hard

❏ We know: BB node v6 can only be updated in R2
❏ When to update FF nodes to enable the update of BB in R2?
❏ E.g., updating FF node v4 in R1 allows updating BB node v6 in R2
❏ But only if FF node v3 is not updated in R1 as well: potential loop

slide-84
SLIDE 84

A hard decision problem: when to update FF?

[Figure label: BB]

Intuition Why 3 Rounds Are Hard

❏ We know: BB node v6 can only be updated in R2
❏ When to update FF nodes to enable the update of BB in R2?
❏ E.g., updating FF node v4 in R1 allows updating BB node v6 in R2
❏ But only if FF node v3 is not updated in R1 as well: potential loop
❏ Smells like a gadget: deciding which FF nodes to update when is hard!

Being greedy is bad! Don't update all FF! The devil lies in the details: the original paths must also be valid! I.e., one has to prove that such a configuration can be reached.

slide-85
SLIDE 85

It's Good to Relax: How to update LF?

… s d v2 v3 vn-1 vn-2 v4

slide-86
SLIDE 86

LF Updates Can Take Many Rounds!

… s d v2 v3 vn-1 vn-2

Invariant: need to update v2 before v3!

v4

slide-87
SLIDE 87

LF Updates Can Take Many Rounds!

… s d v2 v3 vn-1 vn-2

Invariant: need to update v3 before v4!

v4

slide-88
SLIDE 88

LF Updates Can Take Many Rounds!

… s d v2 v3 vn-1 vn-2 v4

Induction: need to update vi-1 before vi (before vi+1, etc.)! Ω(n) rounds?! In principle, yes: a path back out is needed before a backward edge can be updated!

Update rounds per node (figure labels): 1, 1, 2, 3, …, n-3, n-2
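The induction can be reproduced on a small instance. This is a sketch under an assumption: the figure is read as old path s → v2 → … → vn-1 → d and new path s → vn-1 → vn-2 → … → v2 → d, which is reconstructed from the stated invariants and not confirmed by the slides. Nodes are numbered 1..n with s = 1 and d = n; all function names are hypothetical.

```python
def chain_instance(n):
    """Assumed instance (n >= 4): old 1->2->...->n, new 1->(n-1)->...->2->n."""
    old = {i: i + 1 for i in range(1, n)}
    new = {1: n - 1, 2: n}
    new.update({i: i - 1 for i in range(3, n)})
    return old, new


def union_graph(old, new, done, batch):
    """Edges that may be active while `batch` is being updated."""
    succ = {}
    for u in old:
        if u in batch:
            succ[u] = {old[u], new[u]}   # either rule may be in place
        elif u in done:
            succ[u] = {new[u]}
        else:
            succ[u] = {old[u]}
    return succ


def is_acyclic(succ):
    """Kahn's algorithm: the graph is acyclic iff every node can be removed."""
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    indeg = {u: 0 for u in nodes}
    for vs in succ.values():
        for v in vs:
            indeg[v] += 1
    queue = [u for u in nodes if indeg[u] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for v in succ.get(u, ()):
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return removed == len(nodes)


def greedy_strong_schedule(old, new):
    """Per round, add every pending node that keeps the union graph acyclic."""
    done, rounds = set(), []
    pending = sorted(old)
    while pending:
        batch = set()
        for u in pending:
            if is_acyclic(union_graph(old, new, done, batch | {u})):
                batch.add(u)
        if not batch:
            raise ValueError("no node can be updated without risking a loop")
        rounds.append(batch)
        done |= batch
        pending = [u for u in pending if u not in batch]
    return rounds


rounds = greedy_strong_schedule(*chain_instance(6))
print(len(rounds), rounds)
```

For n = 6 the rounds come out as {1, 2}, {3}, {4}, {5}, i.e. n-2 rounds, matching the round labels on the slide.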

slide-89
SLIDE 89

It is good to relax!

… s d v2 v3 vn-1 vn-2 v4

1 1

But: once s has been updated, these nodes are no longer on the (s,d)-path!

slide-90
SLIDE 90

It is good to relax!

… s d v2 v3 vn-1 vn-2 v4

1 1 2 2 2

Could be updated simultaneously! But: once s has been updated, these nodes are no longer on the (s,d)-path!

slide-91
SLIDE 91

It is good to relax!

… s d v2 v3 vn-1 vn-2 v4

1 2 2 2

Could be updated simultaneously!

3 1

Finally put back on path! But: once s has been updated, these nodes are no longer on the (s,d)-path!

slide-92
SLIDE 92

It is good to relax!

… s d v2 v3 vn-1 vn-2 v4

1 2 2 2

Could be updated simultaneously!

3 1

Finally put back on path!

3 rounds only!

But: once s has been updated, these nodes are no longer on the (s,d)-path!
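The three-round schedule can be verified with a relaxed check that only looks at nodes reachable from s. This is a sketch under the same assumption about the figure (old path 1 → 2 → … → n, new path 1 → n-1 → … → 2 → n, with s = 1 and d = n). The check is conservative: it works on the union of old and new rules of the nodes being updated. All names are hypothetical.

```python
def chain_instance(n):
    """Assumed instance: old 1->2->...->n, new 1->(n-1)->...->2->n."""
    old = {i: i + 1 for i in range(1, n)}
    new = {1: n - 1, 2: n}
    new.update({i: i - 1 for i in range(3, n)})
    return old, new


def union_graph(old, new, done, batch):
    """Edges that may be active while `batch` is being updated."""
    return {u: ({old[u], new[u]} if u in batch
                else {new[u]} if u in done
                else {old[u]})
            for u in old}


def is_acyclic(succ):
    """Kahn's algorithm: acyclic iff every node can be removed."""
    nodes = set(succ) | {v for vs in succ.values() for v in vs}
    indeg = {u: 0 for u in nodes}
    for vs in succ.values():
        for v in vs:
            indeg[v] += 1
    queue = [u for u in nodes if indeg[u] == 0]
    removed = 0
    while queue:
        u = queue.pop()
        removed += 1
        for v in succ.get(u, ()):
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return removed == len(nodes)


def reachable(succ, source):
    """All nodes reachable from `source` along possibly active edges."""
    seen, stack = {source}, [source]
    while stack:
        u = stack.pop()
        for v in succ.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def schedule_ok(old, new, schedule, source=None):
    """Strong check if source is None; otherwise only loops reachable from
    `source` count (relaxed loop-freedom)."""
    done = set()
    for batch in schedule:
        succ = union_graph(old, new, done, batch)
        if source is not None:
            keep = reachable(succ, source)
            succ = {u: vs for u, vs in succ.items() if u in keep}
        if not is_acyclic(succ):
            return False
        done |= batch
    return True


n = 8
old, new = chain_instance(n)
three_rounds = [{1, 2}, set(range(3, n - 1)), {n - 1}]
print(schedule_ok(old, new, three_rounds, source=1))     # relaxed check
print(schedule_ok(old, new, three_rounds, source=None))  # strong check
```

On this instance the schedule passes the relaxed check and fails the strong one: the loops that can form in round 2 are not reachable from s.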

slide-93
SLIDE 93

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut

slide-94
SLIDE 94

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut Greedily choose far-reaching (independent) forward edges.

update

slide-95
SLIDE 95

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut R1 generated many nodes in branches which can be updated simultaneously!

update

slide-96
SLIDE 96

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut Line re-established! (all merged with a node on the s-d-path)

slide-97
SLIDE 97

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut Peacock orders nodes with respect to distance: an edge of length x can block at most 2 edges of length x, so distance 2x.

slide-98
SLIDE 98

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut At least 1/3 of nodes merged in each round pair (shorter s-d path): logarithmic runtime!
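A short worked bound behind the logarithmic claim, assuming each pair of rounds (shortcut, then prune) removes at least a third of the nodes remaining on the s-d path. The symbol n_k (nodes left after k round pairs) is introduced here for illustration.

```latex
n_k \le \left(\tfrac{2}{3}\right)^{k} n
\quad\Longrightarrow\quad
n_k \le 1 \ \text{ once } \ k \ge \log_{3/2} n ,
\qquad\text{so roughly } 2\,\lceil \log_{3/2} n \rceil = O(\log n) \text{ rounds suffice.}
```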

slide-99
SLIDE 99

A log(n)-time Algorithm: Peacock in Action

Shortcut Prune Prune Shortcut

slide-100
SLIDE 100

A log(n)-time Algorithm: Peacock in Action Shortcut Prune Prune Shortcut

Scheduling Loop-free Network Updates: It's Good to Relax! Arne Ludwig, Jan Marcinkowski, and Stefan Schmid. ACM Symposium on Principles of Distributed Computing (PODC), Donostia-San Sebastian, Spain, July 2015.

slide-101
SLIDE 101

Remark on the Model

New nodes that do not appear in the old policy are easy to update. And nodes that are not on the new path can simply be kept!

slide-102
SLIDE 102

Loop-Freedom: Summary of Results

❏ Minimizing the number of rounds
  ❏ For 2-round instances: polynomial time
  ❏ For 3-round instances: NP-hard, no approximation known
❏ Relaxed notion of loop-freedom: O(log n) rounds
  ❏ No approximation known
❏ Maximizing the number of updated edges per round: NP-hard (dual feedback arc set) and bad (large number of rounds)
  ❏ dFASP on simple graphs (out-degree 2 and originates from paths!)
  ❏ Even hard on bounded treewidth?
  ❏ Resulting number of rounds up to Ω(n) although O(1) possible
❏ Multiple policies: aggregate updates to given switch!
  ❏ Related to Shortest Common Supersequence Problem

slide-103
SLIDE 103

Loop-Freedom: Summary of Results

Being greedy is bad! And hard 

slide-104
SLIDE 104

Extension: Multiple Policies

At least one node needs to be touched twice: otherwise at least one flow will have a temporary loop. Worst case: k policies require k touches!

slide-105
SLIDE 105

Extension: Multiple Policies

At least one node needs to be touched twice: otherwise at least one flow will have a temporary loop. Worst case: k policies require k touches!

On the positive side: given individual transiently consistent schedules, they can be combined optimally using dynamic programming, independently of the consistency property!

slide-106
SLIDE 106

Extension: Multiple Policies

Can't Touch This: Consistent Network Updates for Multiple Policies Szymon Dudycz, Arne Ludwig, and Stefan Schmid. 46th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Toulouse, France, June 2016.
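To illustrate the link to the Shortest Common Supersequence problem, here is a minimal dynamic-programming sketch for two policies. It assumes each policy comes with a schedule given as a sequence of switches to touch, one per step, and that a single touch of a switch can serve both policies. This is the textbook two-sequence SCS recurrence, not necessarily the algorithm of the DSN paper; the function name and switch names are hypothetical.

```python
from typing import List, Sequence


def shortest_common_supersequence(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Return a shortest sequence containing both a and b as subsequences."""
    m, n = len(a), len(b)
    # length[i][j] = length of a shortest common supersequence of a[i:] and b[j:]
    length = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m, -1, -1):
        for j in range(n, -1, -1):
            if i == m:
                length[i][j] = n - j
            elif j == n:
                length[i][j] = m - i
            elif a[i] == b[j]:
                length[i][j] = 1 + length[i + 1][j + 1]
            else:
                length[i][j] = 1 + min(length[i + 1][j], length[i][j + 1])

    # Walk the table from the start to reconstruct one optimal sequence.
    out: List[str] = []
    i = j = 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])   # one touch serves both policies
            i += 1
            j += 1
        elif length[i + 1][j] <= length[i][j + 1]:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out


policy_a = ["v1", "v2", "v3"]   # touch order required by the first policy
policy_b = ["v2", "v1", "v3"]   # touch order required by the second policy
combined = shortest_common_supersequence(policy_a, policy_b)
print(combined)  # ['v1', 'v2', 'v1', 'v3']
```

In this toy input the two policies need v1 and v2 in opposite orders, so one of them has to be touched twice: four touches instead of three.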

slide-107
SLIDE 107

Conclusion

Ctrl

Control Programs Control Programs

Applications and Control Plane … and regarding inter-connect! Data Plane

E.g., robust failover.
E.g., admission control and routing with waypoints.
E.g., distributed control but also MAC learning (Jen@Dagstuhl)!
E.g., network updates or self-stabilizing in-band control network.

slide-108
SLIDE 108

Own References

Can't Touch This: Consistent Network Updates for Multiple Policies
Szymon Dudycz, Arne Ludwig, and Stefan Schmid. 46th IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Toulouse, France, June 2016.

Transiently Secure Network Updates
Arne Ludwig, Szymon Dudycz, Matthias Rost, and Stefan Schmid. 42nd ACM SIGMETRICS, Antibes Juan-les-Pins, France, June 2016.

Scheduling Loop-free Network Updates: It's Good to Relax!
Arne Ludwig, Jan Marcinkowski, and Stefan Schmid. ACM Symposium on Principles of Distributed Computing (PODC), Donostia-San Sebastian, Spain, July 2015.

Medieval: Towards A Self-Stabilizing, Plug & Play, In-Band SDN Control Network (Demo Paper)
Liron Schiff, Stefan Schmid, and Marco Canini. ACM Sigcomm Symposium on SDN Research (SOSR), Santa Clara, California, USA, June 2015.

A Distributed and Robust SDN Control Plane for Transactional Network Updates
Marco Canini, Petr Kuznetsov, Dan Levin, and Stefan Schmid. 34th IEEE Conference on Computer Communications (INFOCOM), Hong Kong, April 2015.

Good Network Updates for Bad Packets: Waypoint Enforcement Beyond Destination-Based Routing Policies
Arne Ludwig, Matthias Rost, Damien Foucard, and Stefan Schmid. 13th ACM Workshop on Hot Topics in Networks (HotNets), Los Angeles, California, USA, October 2014.

Provable Data Plane Connectivity with Local Fast Failover: Introducing OpenFlow Graph Algorithms
Michael Borokhovich, Liron Schiff, and Stefan Schmid. ACM SIGCOMM Workshop on Hot Topics in Software Defined Networking (HotSDN), Chicago, Illinois, USA, August 2014.