SLIDE 1
HotSwap: Correct and Efficient Controller Upgrades for Software-Defined Networks
Laurent Vanbever, vanbever@cs.princeton.edu
HotSDN, August 16, 2013
Joint work with Joshua Reich, Theophilus Benson, Nate Foster and Jennifer Rexford
SLIDE 2
SLIDE 3
HotSwap: Correct and Efficient Controller Upgrades for Software-Defined Networks
1 Today’s upgrades: disruptive & incorrect
The HotSwap system: record, replay, swap
Scalability & correctness: filter & specify
SLIDE 4
Like any piece of complex software, SDN controllers must be frequently upgraded
SDN controllers must be upgraded to fix bugs, improve performance, and deploy new features or applications
SLIDE 5
Like any piece of complex software, SDN controllers must be frequently upgraded
[Table: # commits and # releases over 2 years per SDN controller (Floodlight, Trema, Ryu, Pox); source: GitHub]
# releases: 7, 33, 15, 3*
# commits: 1349, 897, 2106, 2670
* Pox uses branches instead of releases
SLIDE 6
Like any piece of complex software, SDN controllers must be frequently upgraded
[Table: # commits and # releases over 2 years per SDN controller (Floodlight, Trema, Ryu, Pox); source: GitHub]
# releases: 7, 33, 15, 1
# commits: 1349, 897, 2106, 2670
How is it done today?
SLIDE 7
SDN controllers are usually upgraded by rebooting the controller on the new version
SLIDE 8
SDN controllers are usually upgraded by restarting the controller on the new version
During a controller restart, any network failure, rule timeout, or diverted packet is ignored
SLIDE 9
SDN controllers are usually upgraded by restarting the controller on the new version
After a restart, the controller
resets all network forwarding state to prevent inconsistencies, leading to losses and delays
recreates its state according to the current network traffic, leading to bugs
SLIDE 10
SDN controllers are usually upgraded by rebooting the controller on the new version
After a reboot, the controller
resets all network forwarding state to prevent inconsistencies, leading to losses and delays
recreates its state according to the current network traffic, leading to bugs
Is it really a problem?
SLIDE 11
Restarting a controller can create network-wide disruption
SLIDE 12
[Plot: probes lost (%) versus time (s)]
SLIDE 13
[Plot: probes lost (%) versus time (s); controller stop marked at 15 s]
We stop the controller after 15 seconds
SLIDE 14
[Plot: probes lost (%) versus time (s); controller stop and restart marked]
We restart the controller after 20 seconds
SLIDE 15
[Plot: probes lost (%) versus time (s); controller stop and restart marked]
Soon after the controller restart, the network suffered significant network-wide losses
SLIDE 16
Restarting a controller can create bugs
SLIDE 17
Let’s restart a controller running a stateful firewall which only allows connections initiated from the inside
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
10  H1  H2  fwd
05  H2  H1  fwd
SLIDE 18
Let’s restart a controller running a stateful firewall which only allows connections initiated from the inside
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
10  H1  H2  fwd
05  H2  H1  fwd
SLIDE 19
Upon restart, the controller wipes out all the forwarding entries
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
10  H1  H2  fwd
05  H2  H1  fwd
drop ALL
SLIDE 20
Upon restart, the controller wipes out all the forwarding entries
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table: empty
SLIDE 21
Ongoing flows for which externally originated packets are received first will get dropped by the controller
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table: empty
SLIDE 22
Ongoing flows for which externally originated packets are received first will get dropped by the controller
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
15  H2  H1  drop
SLIDE 23
Ongoing flows for which externally originated packets are received first will get dropped by the controller
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
15  H2  H1  drop
Restarting the controller can cause allowed traffic to be blocked
SLIDE 24
Ongoing flows for which externally originated packets are received first will get dropped by the controller
[Diagram: Controller (stateful firewall), Host 1, Host 2, Internet]
Forwarding table
15  H2  H1  drop
Restarting the controller can also cause forbidden traffic to be allowed
SLIDE 25
HotSwap: Correct and Efficient Controller Upgrades for Software-Defined Networks
Today’s upgrades: disruptive & incorrect
2 The HotSwap system: record, replay, swap
Scalability & correctness: filter & specify
SLIDE 26
HotSwap warms up the upgraded controller before giving it control over the network
Keeping as much traffic as possible in the network, avoiding network-wide disruptions
Recreating state in the upgraded controller in a controlled fashion, guaranteeing correctness
Tolerating different control and forwarding behavior between the new and old controller
SLIDE 27
[Diagram: SDN controller v1 and the network exchange OpenFlow messages]
SLIDE 28
HotSwap is a hypervisor that sits between the network and the controller
[Diagram: network, HotSwap, controller v1]
SLIDE 29
HotSwap proceeds in four stages: record, replay, compare & replace
[Diagram: network, HotSwap, controller v1]
SLIDE 30
In the record stage, HotSwap maintains a copy of the network state
[Diagram: network, HotSwap, controller v1; network state: network events, forwarding rules v1]
SLIDE 31
When an upgrade is initiated, HotSwap sets the upgraded controller as slave
Only the master controller can write to the network
[Diagram: network, HotSwap, controller v1 (master), controller v2 (slave)]
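The master/slave rule above can be pictured with a small sketch. This is not HotSwap's implementation: `WriteGate`, its fields, and the tuple-shaped rules are hypothetical names chosen for illustration. It only shows the idea that writes from the master reach the network while writes from a slave are held back.

```python
from dataclasses import dataclass, field


@dataclass
class WriteGate:
    """Hypothetical gate: only the current master may write to the network."""
    master: str
    installed: list = field(default_factory=list)  # rules passed on to the network
    captured: dict = field(default_factory=dict)   # rules held back, per controller

    def write(self, controller: str, rule: tuple) -> bool:
        """Return True if the rule was forwarded to the network."""
        if controller == self.master:
            self.installed.append(rule)
            return True
        # Slave output is kept aside instead of touching the network.
        self.captured.setdefault(controller, []).append(rule)
        return False


gate = WriteGate(master="v1")
gate.write("v1", ("H1", "H2", "fwd"))  # forwarded: v1 is master
gate.write("v2", ("H1", "H2", "fwd"))  # captured: v2 is slave
```

Changing `gate.master` to `"v2"` would correspond to the later swap, after which v2's writes are forwarded.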
SLIDE 32
HotSwap then replays the recorded network events against the upgraded controller
[Diagram: network, HotSwap, controller v1 (master), controller v2 (slave); network state: network events, forwarding rules v1]
SLIDE 33
During the replay, HotSwap records the forwarding rules generated by the upgraded controller
[Diagram: network, HotSwap, controller v1 (master), controller v2 (slave); network state: network events, forwarding rules v1, forwarding rules v2]
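A rough sketch of the replay step, under heavy simplification: events are plain tuples, a controller is a Python callable that returns the rules it would install for one event, and `replay` and `toy_controller_v2` are hypothetical names, not part of HotSwap.

```python
def replay(events, controller):
    """Feed recorded events to a controller and collect the rules it emits."""
    rules = set()
    for event in events:
        rules.update(controller(event))
    return rules


def toy_controller_v2(event):
    """Toy stand-in for an upgraded controller (assumption for illustration).

    On a packet_in from a host seen on a port, it emits one rule that
    sends traffic for that host out of that port; other events emit nothing.
    """
    kind, host, port = event
    if kind == "packet_in":
        return {(host, f"output:{port}")}
    return set()


recorded = [
    ("packet_in", "H1", 1),
    ("port_status", "H2", 2),
    ("packet_in", "H2", 2),
]
rules_v2 = replay(recorded, toy_controller_v2)
```

Because the upgraded controller is a slave during this step, the collected `rules_v2` are only recorded and never reach the network.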
SLIDE 34
Once the replay is completed, HotSwap computes the deltas between the initial and upgraded rules
[Diagram: network, HotSwap, controller v1 (master), controller v2 (slave); network state: network events, forwarding rules v1, forwarding rules v2, Δ]
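One simple way to picture the delta is as a pair of set differences. The sketch below assumes rules can be modeled as hashable tuples; `compute_delta` and `apply_delta` are hypothetical helpers, and the slides do not say how HotSwap actually represents or installs deltas.

```python
def compute_delta(rules_v1, rules_v2):
    """Return (to_add, to_remove) that turn the v1 rule set into the v2 rule set."""
    to_add = rules_v2 - rules_v1
    to_remove = rules_v1 - rules_v2
    return to_add, to_remove


def apply_delta(installed, to_add, to_remove):
    """Apply a delta to the currently installed rules."""
    return (installed - to_remove) | to_add


rules_v1 = {("H1", "H2", "fwd"), ("H2", "H1", "fwd")}
rules_v2 = {("H1", "H2", "fwd"), ("H2", "H1", "drop")}

to_add, to_remove = compute_delta(rules_v1, rules_v2)
after_swap = apply_delta(rules_v1, to_add, to_remove)
```

Rules common to both versions, here the H1 to H2 entry, are left untouched, which is what allows existing traffic to keep flowing during the swap.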
SLIDE 35
In the replace stage, HotSwap sets the upgraded controller as master and installs the deltas
[Diagram: network, HotSwap, controllers v1 and v2 (v2 now master); network state: network events, forwarding rules v1, forwarding rules v2, Δ]
SLIDE 36
HotSwap finally removes the initial controller and re-enters the record stage
[Diagram: network, HotSwap, controller v2 (master); network state: network events, forwarding rules v2]
SLIDE 37
HotSwap performs upgrades in a disruption-free manner
SLIDE 38
[Plot: probes lost (%) versus time (s), HotSwap versus restart]
Using HotSwap, not a single packet is lost during the upgrade
SLIDE 39
HotSwap: Correct and Efficient Controller Upgrades for Software-Defined Networks
Today’s upgrades: disruptive & incorrect
The HotSwap system: record, replay, swap
3 Scalability & correctness: filter & specify
SLIDE 40
Recording all network events does not scale
SLIDE 41
Recording all network events does not scale
... but is not needed!
SLIDE 42
Most stateful controllers only require some events to be replayed
SLIDE 43
The number and type of events to be recorded depend on the controller category ...
[Grid: network-traffic dependency (yes / no) versus event dependency (last / history)]
SLIDE 44
... whether their state depends on the actual traffic being exchanged
[Grid: network-traffic dependency (yes / no) versus event dependency (last / history)]
SLIDE 45
... whether their state depends on the last network event or on a history of events
[Grid: network-traffic dependency (yes / no) versus event dependency (last / history)]
SLIDE 46
[Grid: Learning-Switch, Shortest-Path Routing, Reliable Routing and Stateful Firewall placed by network-traffic dependency (yes / no) and event dependency (last / history)]
SLIDE 47
[Grid: Learning-Switch, Shortest-Path Routing, Reliable Routing and Stateful Firewall placed by network-traffic dependency (yes / no) and event dependency (last / history)]
HotSwap provides a query language to filter the stream of events at record and replay time
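The slides do not show the query language itself, so the following is only a sketch of the kind of filtering the two axes suggest. The function `select_events`, its two flags, and the dictionary-shaped events are all assumptions made for illustration.

```python
def select_events(events, traffic_dependent, depends_on_history):
    """Pick the recorded events worth replaying for one controller category.

    traffic_dependent:  if False, traffic events (packet_in) are dropped.
    depends_on_history: if False, only the latest event per (type, subject) is kept.
    """
    kept = [e for e in events if traffic_dependent or e["type"] != "packet_in"]
    if depends_on_history:
        return kept
    latest = {}
    for e in kept:
        # Later events overwrite earlier ones for the same (type, subject).
        latest[(e["type"], e["subject"])] = e
    return list(latest.values())


events = [
    {"type": "port_status", "subject": "s1:1", "value": "down"},
    {"type": "packet_in", "subject": "H1", "value": "tcp"},
    {"type": "port_status", "subject": "s1:1", "value": "up"},
]

# A controller whose state ignores traffic and depends only on the last event
# needs a single event out of the three recorded ones.
minimal = select_events(events, traffic_dependent=False, depends_on_history=False)
```

A controller that depends on both traffic and history would still need the full log, which is the case where recording is most expensive.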
SLIDE 48
What does it mean for an upgrade to be correct?
SLIDE 49
When we upgrade from v1 to v2, we would like the network to behave as if v2 had been running since the beginning
SLIDE 50
When we upgrade from v1 to v2, we would like the network to behave as if v2 had been running since the beginning
What does it mean?
SLIDE 51
When we upgrade from v1 to v2, we would like the network to behave as if v2 had been running since the beginning
What does it mean?
same forwarding rules?
same forwarding semantics?
eventual semantic consistency?
SLIDE 52
same forwarding rules?
same forwarding semantics?
eventual semantic consistency?
It depends ...
SLIDE 53
same forwarding rules?
same forwarding semantics?
eventual semantic consistency?
=
≅
The operator defines a relation that captures the acceptable differences on the controller outputs
HotSwap verifies whether the desired correctness criterion is met before swapping controllers
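To make the notion of an operator-defined relation concrete, here is a sketch with two candidate relations: strict rule equality and agreement on a set of sample packets. The rule format `(priority, src, dst, action)`, the `None` wildcard, the default `"drop"` for unmatched packets, and every function name are assumptions for illustration. The sample-based check is also much weaker than a real semantic equivalence proof.

```python
def lookup(rules, packet):
    """Return the action of the highest-priority rule matching (src, dst).

    Rules are (priority, src, dst, action); None matches any host.
    Unmatched packets are assumed to be dropped.
    """
    src, dst = packet
    matches = [r for r in rules if r[1] in (None, src) and r[2] in (None, dst)]
    if not matches:
        return "drop"
    return max(matches, key=lambda r: r[0])[3]


def same_rules(rules_v1, rules_v2):
    """Strict relation: both controllers produced exactly the same rules."""
    return set(rules_v1) == set(rules_v2)


def same_forwarding(rules_v1, rules_v2, packets):
    """Looser relation: both rule sets treat every sample packet alike."""
    return all(lookup(rules_v1, p) == lookup(rules_v2, p) for p in packets)


v1 = [(10, "H1", "H2", "fwd"), (10, "H2", "H1", "fwd")]
v2 = [(5, None, None, "fwd")]  # a single wildcard rule
samples = [("H1", "H2"), ("H2", "H1")]

strict_ok = same_rules(v1, v2)                # False: the rule sets differ
loose_ok = same_forwarding(v1, v2, samples)   # True on these two samples
```

Which relation is acceptable is the operator's call: here the two versions agree on the sampled flows but would differ for a packet from an unlisted host.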
SLIDE 54
HotSwap: Correct and Efficient Controller Upgrades for Software-Defined Networks
Today’s upgrades: disruptive & incorrect
The HotSwap system: record, replay, swap
Scalability & correctness: query language
SLIDE 55
HotSwap enables disruption-free and correct SDN controller upgrades
HotSwap
works in practice: first implementation on top of FlowVisor
is highly general: no assumption on the controller or on the application
is easy to use: minimum input from the network operator
SLIDE 56