

slide-1
SLIDE 1

Scalable Verification of Stateful Networks

Aurojit Panda, Ori Lahav, Katerina Argyraki, Mooly Sagiv, Scott Shenker UC Berkeley, TAU, ICSI

slide-2
SLIDE 2

Roadmap

  • Why consider stateful networks?
  • The current state of stateful network verification?
  • VMN: Our system for verifying stateful networks.
  • Scaling verification.
slide-3
SLIDE 3

Why consider stateful networks?

slide-6
SLIDE 6

Network State Increasingly Common

  • 1/3rd of deployed network devices are middleboxes
  • These are typically stateful (e.g., firewalls, caches, etc.)
  • NFV will only make these more common
  • Later in this conference: stateful programming for P4 switches.
  • SNAP: Stateful Network-Wide Abstractions for Packet Processing
  • Bottom line: Stateful networks are increasingly relevant.
slide-11
SLIDE 11

Verification Checks Invariants

  • We look at Reachability/Isolation invariants (same as stateless verification)
  • Packets from host A cannot reach host B
  • But statefulness raises some important issues:
  • Invariants include temporal aspects.
  • Storing state can result in spooky action at a distance.
slide-13
SLIDE 13

Temporal Invariants

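[Diagram: Server 0, Server 1, Firewall, User 0, User 1]

Invariant: User 1 receives no packets from Server 0 unless a connection is initiated.

  • Standard reachability: "User 1 receives no packets from Server 0"
  • Temporal property: "unless a connection is initiated"

Firewall rule: deny server* user*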
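
The temporal invariant above can be read as a check over an ordered sequence of network events. A minimal sketch, assuming a hypothetical trace format of (sender, receiver, kind) tuples and a "syn" marker for connection initiation; neither comes from the slides:

```python
# Hypothetical event format: (sender, receiver, kind), where kind is
# "syn" for a connection initiation and "data" for any other packet.
def holds_temporal_invariant(trace, server, user):
    """Return True if `user` never receives a packet from `server`
    before `user` has initiated a connection to `server`."""
    initiated = False
    for sender, receiver, kind in trace:
        if sender == user and receiver == server and kind == "syn":
            initiated = True   # the user opened the connection
        elif sender == server and receiver == user and not initiated:
            return False       # delivery with no prior initiation
    return True

ok_trace = [("user1", "server0", "syn"), ("server0", "user1", "data")]
bad_trace = [("server0", "user1", "data")]
```

The check depends on event order, which is what separates it from a plain reachability query.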

slide-18
SLIDE 18

Action at a Distance

[Diagram: Server 0, Server 1, Firewall, Cache, User 0, User 1. A "Secret" label appears at three points in the diagram.]

Firewall rule: deny user1 server0

User 1 receives no packets from Server 0
User 1 receives no data from Server 0
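
The gap between "no packets" and "no data" can be illustrated with a toy model. This is a sketch under stated assumptions: the cache is assumed to sit between the users and the firewall, and the names `DENY`, `ORIGIN`, and `fetch` are hypothetical.

```python
# Toy model: the rule "deny user1 server0" blocks packets between User 1
# and Server 0, but the cache holds state shared by all users.
DENY = {("user1", "server0")}
ORIGIN = {"server0": "Secret"}   # content stored at each server
cache = {}                       # server name -> cached content

def fetch(user, server):
    """Return (content, packet_source) as seen by `user`,
    or (None, None) when the firewall blocks the request."""
    if server in cache:
        return cache[server], "cache"   # served without contacting the server
    if (user, server) in DENY:
        return None, None               # firewall drops the request
    cache[server] = ORIGIN[server]      # the response passes through the cache
    return cache[server], server

blocked = fetch("user1", "server0")     # cold cache: User 1 is blocked
fetch("user0", "server0")               # User 0 is allowed and warms the cache
leaked = fetch("user1", "server0")      # User 1 now gets the data from the cache
```

In this model User 1 never exchanges a packet with Server 0, yet still obtains Server 0's data.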

slide-19
SLIDE 19

Roadmap

  • Why consider stateful networks?
  • The current state of stateful network verification?
  • VMN: Our system for verifying stateful networks.
  • Scaling verification.
slide-24
SLIDE 24

Network Verification Today

  • Lots of existing work has looked at network verification.
  • Switches: Static forwarding rules in switches.
    HSA, Veriflow, NetKAT, etc.
  • SDN Controller: Code generating these rules.
    Vericon, FlowLog, etc.
  • Testing for stateful networks
    Buzz: Generate packets that are likely to trigger interesting behavior.
  • Verification for stateful networks
    SymNet: Uses symbolic execution to verify networks with middleboxes.

slide-25
SLIDE 25

Roadmap

  • Why consider stateful networks?
  • The current state of stateful network verification?
  • VMN: Our system for verifying stateful networks.
  • Scaling verification.
slide-26
SLIDE 26

VMN: System for scalable verification of stateful networks.

slide-27
SLIDE 27

VMN Flow

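[Flow diagram: Model each middlebox in the network and build the network forwarding model. These, together with the logical invariants, go to an SMT solver (Z3 from MSR), which reports either that the invariant holds or an example of a violation.]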

slide-34
SLIDE 34

Modeling Middleboxes

  • One approach: Extract model from code
  • Problem: At the wrong level of abstraction.
  • Code written to match bit patterns in packet, etc.
  • Configuration is in terms of higher level abstractions
  • E.g., source and destination addresses, payload matches regex, etc.
  • Operators think and configure in terms of these abstractions.
  • Verify invariants written in these terms.
slide-35
SLIDE 35

Example Middlebox Configuration

  • Drop all packets from connections transmitting infected files.
  • How to define infected files? A bit pattern covering all worms is not really accurate.
  • It is also not how operators think about this.
slide-37
SLIDE 37

Modeling Middleboxes

  • Take a different tack: model specified in terms of classification oracle.
  • Oracle responsible for classifying packet.
  • We are not verifying implementation (nor is anyone else).
  • Model specifies forwarding behavior in terms of these abstractions.
  • Need to know forwarding behavior to reason about reachability.
  • Require that any state that affects forwarding behavior is also specified.
slide-44
SLIDE 44

Modeling Middleboxes

  • Classify Packet: Determines what application sent a packet, etc. Complex, proprietary processing.
  • Update Classification State: Update state required for classification.
  • Update Forwarding State: Update forwarding state.
  • Forward Packet: Always simple: forward or drop packets.

Oracle: Specify data dependencies and outputs
Forwarding Model: Specify completely

slide-48
SLIDE 48

Modeling Middleboxes

  • Classify Packet
    Outputs: Is packet infected.
    Dependencies: See all packets in connection (flow).
  • Forward Packet
    if (packet.flow not in infected_connections) { forward(packet); }
  • Update Forwarding State
    if (infected) { infected_connections.add(packet.flow) }
  • Update Classification State
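
The infected-file example can be sketched in Python. This is an illustration under assumptions, not the VMN modeling language: the class name, the dictionary packet format, and the `oracle` callable are hypothetical, and the two statements are applied in the order they appear on the slide.

```python
class InfectionFilter:
    """Drop packets on flows that have been marked infected.

    `oracle` is a hypothetical callable standing in for the proprietary
    classifier; it returns True when a packet is infected.
    """

    def __init__(self, oracle):
        self.oracle = oracle
        self.infected_connections = set()   # forwarding state

    def process(self, packet):
        """Return the packet if it is forwarded, or None if it is dropped."""
        infected = self.oracle(packet)                        # classify packet
        forwarded = packet["flow"] not in self.infected_connections
        if infected:                                          # update forwarding state
            self.infected_connections.add(packet["flow"])
        return packet if forwarded else None

box = InfectionFilter(lambda pkt: pkt["payload"] == "worm")
first = box.process({"flow": 1, "payload": "worm"})    # marks flow 1 as infected
later = box.process({"flow": 1, "payload": "ok"})      # dropped: flow 1 is marked
other = box.process({"flow": 2, "payload": "ok"})      # unaffected flow
```

With this ordering the first infected packet of a flow is still forwarded; swapping the two statements would drop it as well. Only `infected_connections` affects forwarding, so it is the only state the forwarding model needs.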

slide-49
SLIDE 49

Modeling Middleboxes

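$$
\begin{aligned}
\mathit{infected\_connection}(\mathit{flow}(p)) &\implies \big(\Diamond\,\mathit{rcv}(n, p') \land \mathit{flow}(p') = \mathit{flow}(p) \land \mathit{infected}(p)\big) \\
\mathit{snd}(n, p) &\implies \big(\Diamond\,\mathit{rcv}(n, p) \land \neg\,\mathit{infected\_connection}(\mathit{flow}(p))\big)
\end{aligned}
$$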

slide-50
SLIDE 50

VMN Flow

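[Flow diagram: Model each middlebox in the network and build the network forwarding model. These, together with the logical invariants, go to an SMT solver (Z3 from MSR), which reports either that the invariant holds or an example of a violation.]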

slide-51
SLIDE 51

Network Transfer Functions

  • Kazemian 2012 developed the idea of a network transfer function.
  • A single function modeling the behavior of the entire network.
  • VMN models static elements in the network using a transfer function.
slide-52
SLIDE 52

Network Transfer Function

[Diagram: hosts A, B, C, D; Firewall (f); Cache (c); two Switches and a Router]

slide-53
SLIDE 53

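Network Transfer Function

[Diagram: hosts A, B, C, D; Firewall (f); Cache (c)]

$$
f(p, \mathit{port}) \equiv
\begin{cases}
(p, f) & \text{if } \mathit{port} = A \land (\mathit{dst}(p) = C \lor \mathit{dst}(p) = D) \\
(p, c) & \text{if } \mathit{port} = f \land (\mathit{dst}(p) = C \lor \mathit{dst}(p) = D) \\
(p, C) & \text{if } \mathit{port} = c \land \mathit{dst}(p) = C \\
(p, D) & \text{if } \mathit{port} = c \land \mathit{dst}(p) = D \\
\ \vdots
\end{cases}
$$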
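
The four listed cases translate directly into code. A minimal sketch, assuming a packet is represented only by its destination and that the cases the slide elides return None; the helper `path` is hypothetical and not part of the slides:

```python
def transfer(dst, port):
    """Return the next port for a packet destined to `dst` arriving at `port`.

    Mirrors the four listed cases; the remaining cases are elided on the
    slide, so any other input returns None as a placeholder.
    """
    if port == "A" and dst in ("C", "D"):
        return "f"      # traffic entering at A goes to the firewall
    if port == "f" and dst in ("C", "D"):
        return "c"      # firewall output goes to the cache
    if port == "c" and dst in ("C", "D"):
        return dst      # the cache delivers to the destination host
    return None

def path(dst, port):
    """Follow the transfer function until delivery or an unhandled case."""
    hops = [port]
    while port != dst:
        port = transfer(dst, port)
        if port is None:
            break
        hops.append(port)
    return hops
```

Here the firewall and cache are treated as pass-through points; their stateful behavior is covered by the separate middlebox models.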

slide-54
SLIDE 54

Roadmap

  • Why consider stateful networks?
  • The current state of stateful network verification?
  • VMN: Our system for verifying stateful networks.
  • Scaling verification.
slide-55
SLIDE 55

Networks are Large

  • Networks are huge in practice
  • For example, Google had approximately 900K machines in 2011.
  • ISPs connect large numbers of machines.
  • Lots of middleboxes in these networks
  • In a datacenter, each machine might run one or more middleboxes.
  • How do we address this?
slide-56
SLIDE 56

Scaling Techniques Thus Far

  • Abstract middlebox models
  • Simplify what needs to be considered per-middlebox.
  • Abstract network
  • Simplify network forwarding.
slide-61
SLIDE 61

Those Techniques are not Enough

  • TACAS 2016: Network verification with state is EXPSPACE-complete.
  • In practice, SMT solvers time out on large instances.
  • Other methods also do not handle such large instances.
  • Symbolic execution is exponential in the number of branches, so it fares no better.
  • Our techniques work for small instances; what should we do about large instances?
slide-63
SLIDE 63

Scaling Verification

  • Challenge: Run verification on a subnetwork of size independent of network.
  • Avoid instability and scale to arbitrary network sizes.
  • Goal: Identify subnetwork where verification results translate to whole network.
slide-64
SLIDE 64

Network Slices

  • Slices: Subnetworks for which a bisimulation with the original network exists.
  • Ensures equivalent step in subnetwork for each step in the original network
  • Slices are selected depending on the invariant being checked.
slide-70
SLIDE 70

Network Slices

[Diagram: ACME Hosting network with hosts Wile E. Coyote, Road Runner, Sylvester, and Tweety, two Firewalls, and a Cache]

Policy: predator ↮ prey server; prey ↮ predator server

Invariant: RR cannot access data from Coyote’s server

[Highlighted in the diagram: Wile E. Coyote, Firewall, Cache]

Establishes a bisimulation between slice and network. Allows us to prove invariants in the slice.

slide-71
SLIDE 71

Cannot always find such a slice.

slide-72
SLIDE 72

Finding Slices: Flow Parallel Middleboxes

  • To achieve performance, many middleboxes are flow parallel
  • State from one connection cannot affect another connection.
  • Example: Stateful firewall.
  • For networks with only flow parallel NFs
  • Only need to consider paths between hosts.
  • Network slices whose size is independent of network size.
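
For flow-parallel middleboxes, the bullets above suggest a slice made of the two hosts in the invariant plus the middleboxes on the paths between them. A minimal sketch, assuming a hypothetical path table `PATHS` with illustrative node names:

```python
# Hypothetical topology: each (src, dst) pair maps to the ordered list of
# flow-parallel middleboxes its packets traverse.
PATHS = {
    ("A", "C"): ["fw1", "ids1"],
    ("C", "A"): ["ids1", "fw1"],
    ("B", "D"): ["fw2"],
    ("D", "B"): ["fw2"],
}

def slice_for(host_a, host_b, paths=PATHS):
    """Return the hosts and middleboxes on paths between the two hosts."""
    nodes = {host_a, host_b}
    for pair in ((host_a, host_b), (host_b, host_a)):
        nodes.update(paths.get(pair, []))
    return nodes
```

Adding more hosts or paths to `PATHS` does not change `slice_for("A", "C")`, which is the sense in which the slice size is independent of network size.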
slide-73
SLIDE 73

Finding Slices: Origin Equivalence

  • Middleboxes like caches don’t distinguish where a request originates
  • More generally, state is shared, but origin does not matter.
  • In this case, need to ensure that all states in the network can appear in a slice.
  • Pick one member from each policy group.
  • Scalable if increasing network size does not increase number of policy groups
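
Picking one member from each policy group can be sketched as follows. The host-to-group mapping and the function name are hypothetical, and keeping the hosts named in the invariant is an assumption of this sketch:

```python
def origin_equivalent_slice(invariant_hosts, policy_group):
    """Hosts to keep: those named in the invariant plus one per policy group.

    `policy_group` maps each host name to its policy group label.
    """
    keep = set(invariant_hosts)
    covered = {policy_group[h] for h in keep}
    for host in sorted(policy_group):        # sorted for a deterministic pick
        if policy_group[host] not in covered:
            keep.add(host)
            covered.add(policy_group[host])
    return keep

groups = {"coyote": "predator", "sylvester": "predator",
          "roadrunner": "prey", "tweety": "prey"}
```

The result grows with the number of policy groups, not with the number of hosts, matching the scalability condition in the last bullet.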
slide-74
SLIDE 74

Symmetry: Going Beyond Slices

  • Slices merely reduce the size of the problem for each invariant
  • Number of invariants is still a problem.
  • Rely on the observation that lots of hosts in networks are symmetric
  • Policies largely applied to groups of hosts (departments, etc.)
  • Can use this symmetry to reduce number of invariants checked
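
The saving from symmetry can be made concrete by counting checks. A sketch under the assumption that hosts in the same policy group are interchangeable for the invariant; the function name and host mapping are hypothetical, and checks between hosts of the same group are left out:

```python
from itertools import permutations

def group_level_checks(policy_group):
    """Return one (src, dst) host pair per ordered pair of distinct groups."""
    rep = {}
    for host in sorted(policy_group):        # deterministic representative
        rep.setdefault(policy_group[host], host)
    return [(rep[a], rep[b]) for a, b in permutations(sorted(rep), 2)]

hosts = {"coyote": "predator", "sylvester": "predator",
         "roadrunner": "prey", "tweety": "prey"}
host_level = list(permutations(sorted(hosts), 2))   # every ordered host pair
group_level = group_level_checks(hosts)
```

For these four hosts in two groups, the host-level list has 12 pairs and the group-level list has 2.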
slide-75
SLIDE 75

Evaluation Setup: Datacenter

  • Consider an AWS-like multi-tenant datacenter.
  • Each tenant has policies for private and public hosts.
  • Three verification tasks
  • Private hosts of one tenant cannot reach private hosts of another.
  • Public hosts of one tenant cannot reach private hosts of another.
  • Public hosts are universally reachable.
slide-76
SLIDE 76

Verification Time (Datacenter)

[Plot: verification time in seconds (log scale, 0.01 to 100000) against number of tenants (Slice, 5, 10, 15, 20); series: Priv-Priv, Pub-Priv, Priv-Pub]

slide-78
SLIDE 78

Role of Symmetry

  • Consider a private datacenter
  • Use verification to prevent some bugs from a Microsoft DC (IMC 2013)
  • Bugs include
  • Misconfigured firewalls
  • Misconfigured redundant firewalls
  • Misconfigured redundant routing
  • Measure time to verify as a function of number of symmetric policy groups
slide-79
SLIDE 79

Verification Time (With Symmetry)

[Plot: verification time in seconds (axis ticks 50 to 350) against number of policy equivalence classes (25, 50, 100, 250, 500, 1000); series: Rules, Redundancy, Traversal]

slide-80
SLIDE 80

Conclusion

  • Verifying stateful networks is increasingly important.
  • The primary challenge is scaling to realistic networks.
  • Splitting the network into smaller verifiable portions is necessary.