

slide-1
SLIDE 1

Reflections on data plane performance, iptables and ipsets

Neil Jerram – Metaswitch & Project Calico @neiljerram www.projectcalico.org

slide-2
SLIDE 2

Who am I?

  • Free software hacker since 1990s
  • Metaswitch (previously Data Connection) since 1995

; line+.el
;
; version 1.1
;
; This has not (yet) been accepted by the Emacs Lisp archive,
; but if it is the archive entry will probably be something like this:
;; line+|Neil Jerram|nj...@cus.cam.ac.uk|
;; Line Numbering & Interrupt Driven Actions|
;; 1993-02-18|1.1|<archive pathname of line+.el>|
; Mished and mashed by Neil Jerram <nj...@cus.cam.ac.uk>,
; Monday 21 December 1992.

slide-3
SLIDE 3

Free software work

  • Emacs
  • Guile
  • Openmoko and GTA04 smartphones
slide-4
SLIDE 4

Metaswitch and Project Calico

  • Provider of high-quality networking software for 30+ years, but mostly proprietary

  • Software -> hardware -> and now back again!
  • Now also leading projects as open source
  • Project Clearwater
  • Project Calico
slide-5
SLIDE 5

So, Calico?

  • Connectivity and security for workloads (aka endpoints, aka micro-services, aka containers or VMs) in an elastic computing environment

  • e.g. a data center
  • Emphasis on simplicity and scalability
  • Based on standard Linux features
  • routing, iptables
  • and Internet protocols (BGP)
  • Mainline case L3 only
slide-6
SLIDE 6

Old, zone-based security

slide-7
SLIDE 7

Services in an elastic environment

slide-8
SLIDE 8

Distributed firewall security

slide-9
SLIDE 9

Calico architecture

slide-10
SLIDE 10

Data plane performance questions

  • Can we get the same bandwidth between endpoints as between those endpoints’ hosts?

  • What is the CPU cost, and how does it compare with other networking approaches?

  • What are the effects of our iptables and ipset programming?
slide-11
SLIDE 11

Testing methodology

  • Two hosts, directly connected by a 10 Gb/s link
  • 8 cores
  • 64 GB RAM
  • Linux 3.13 kernel
  • No tuning
  • qperf, using TCP
  • Measure CPU usage, raw throughput and packet latency
slide-12
SLIDE 12

Configurations

  • Bare metal, i.e. host to host
  • Between OpenStack VMs
  • ‘TAP’ interface between VM and host
  • Between containers
  • veth pair between container namespace and host namespace
  • Between OpenStack VMs using Open vSwitch (OVS) and VXLAN
  • MTU 1500; message (send) sizes of 20,000 and 500 bytes
slide-13
SLIDE 13

Data plane throughput

  • Saturation for 20k messages (red bars) …
  • … but not for 500 messages (blue bars)

  • Why?
  • OpenStack better than bare metal?
  • OVS case reaches >8 Gb/s if MTU is increased to 9000

slide-14
SLIDE 14

CPU usage

  • CPU-limited for small messages
  • OpenStack cases can use more cores

  • Extra CPU cost for virtualization
  • Namespace
  • TAP or veth interface
  • Routing in guest as well as host
slide-15
SLIDE 15

CPU usage per throughput

  • CPU required to drive each Gb/s of throughput

slide-16
SLIDE 16

Latency

  • Tiny extra latency for containers

  • More for VMs
  • But acceptable
  • Note: the units are microseconds
  • Not milliseconds!
slide-17
SLIDE 17

Security rules

slide-18
SLIDE 18

iptables and ipsets

  • iptables on a given host should be the composition of many logical security rules

  • Will this impact data plane performance?
  • Actually, no
  • -A felix-FORWARD -i tap+ -j felix-FROM-ENDPOINT
  • -A felix-FORWARD -o tap+ -j felix-TO-ENDPOINT
  • -A felix-FORWARD -i tap+ -j ACCEPT
  • -A felix-FORWARD -o tap+ -j ACCEPT
  • -A felix-FROM-ENDPOINT -i tap7f470881-51 -g felix-from-7f470881-51
  • -A felix-FROM-ENDPOINT -j DROP
  • -A felix-INPUT -i tap+ -j felix-FROM-ENDPOINT
  • -A felix-INPUT -i tap+ -j ACCEPT
  • -A felix-TO-ENDPOINT -o tap7f470881-51 -g felix-to-7f470881-51
  • -A felix-TO-ENDPOINT -j DROP
  • -A felix-from-7f470881-51 -m conntrack --ctstate INVALID -j DROP
  • -A felix-from-7f470881-51 -m conntrack --ctstate RELATED,ESTABLISHED -j RETURN
  • -A felix-from-7f470881-51 -p udp -m udp --sport 68 --dport 67 -j RETURN
  • -A felix-from-7f470881-51 -s 10.28.0.40/32 -m mac --mac-source FA:16:3E:4E:7A:0E -g felix-p-_6b340324948a39b-o
  • -A felix-from-7f470881-51 -m comment --comment "Anti-spoof DROP (endpoint 7f470881-5156-47ce-a67d-b971ef5e5cde):" -j DROP
  • -A felix-p-_6b340324948a39b-i -p icmp -m set --match-set felix-v4-_6b340324948a39b src -j RETURN
  • -A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p tcp -m multiport --dports 22 -j RETURN
  • -A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p udp -m multiport --dports 5060 -j RETURN
  • -A felix-p-_6b340324948a39b-i -s 172.18.203.20/32 -p tcp -m multiport --dports 80 -j RETURN
  • -A felix-p-_6b340324948a39b-i -m comment --comment "Default DROP rule (72d696a9-f715-495f-9152-7f5e6a69fd0f):" -j DROP
slide-19
SLIDE 19

What saves us?

  • conntrack
  • ipsets scale well, thanks to hash table implementation
  • Nested design for source/destination interface mapping
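
The three points on this slide can be illustrated with a toy model. This is a minimal Python sketch, not Felix's implementation and not how the kernel evaluates netfilter rules. The names `Packet`, `classify`, `established`, `allowed_sources` and `per_interface_chain`, and all sample data, are hypothetical. A set of known flows stands in for conntrack, a dict keyed by interface name stands in for the nested per-endpoint chains, and a Python `set` stands in for a hash-based ipset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    in_iface: str
    src_ip: str
    flow: tuple  # hypothetical flow key: (src_ip, dst_ip, proto, sport, dport)


# Stand-in for the conntrack table: flows that have already been accepted.
established = set()

# Stand-in for an ipset: hash-based, so membership tests do not slow down
# noticeably as the number of members grows.
allowed_sources = {f"10.28.{i // 256}.{i % 256}" for i in range(10_000)}

# Stand-in for the nested chains: one lookup selects the endpoint's own policy,
# so policies for other endpoints on the host are never examined.
per_interface_chain = {"tap7f470881-51": allowed_sources}


def classify(pkt):
    """Return (verdict, checks), where checks counts the lookups performed."""
    checks = 1
    if pkt.flow in established:  # conntrack-style fast path
        return "ACCEPT", checks
    checks += 1
    members = per_interface_chain.get(pkt.in_iface)  # dispatch by interface
    if members is None:
        return "DROP", checks
    checks += 1
    if pkt.src_ip in members:  # ipset-style hash lookup
        established.add(pkt.flow)
        return "ACCEPT", checks
    return "DROP", checks


first = Packet("tap7f470881-51", "10.28.0.40",
               ("10.28.0.40", "10.28.0.41", "tcp", 40000, 80))
print(classify(first))  # first packet of a flow walks the policy
print(classify(first))  # later packets of the same flow take the fast path
```

In this model only the first packet of a flow pays for the policy lookup, and that cost does not depend on how many endpoints or set members exist on the host.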
slide-20
SLIDE 20

Arjan Schaaf’s measurements

slide-21
SLIDE 21

What is happening here?

  • http://www.slideshare.net/ArjanSchaaf/docker-network-performance-in-the-public-cloud

  • Various approaches to networking between containers on AWS hosts
  • For this case Calico uses IP-in-IP between the hosts
  • Calico bandwidth roughly half of native, or less
  • We set up the same system, got same results as Arjan
  • For t2.micro bandwidth = 65.3 MB/sec compared with native = 125 MB/sec.
  • For m4.xlarge bandwidth = 108 MB/sec compared with native = 267 MB/sec
  • Why?
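
To put the gap in proportion, the figures quoted on this slide can be turned into fractions of native bandwidth. The sketch below uses only those four numbers; the names `measurements` and `fraction_of_native` are illustrative.

```python
# Figures quoted on the slide, in MB/sec: (Calico with IP-in-IP, native).
measurements = {
    "t2.micro": (65.3, 125.0),
    "m4.xlarge": (108.0, 267.0),
}


def fraction_of_native(calico, native):
    """Calico bandwidth as a fraction of native bandwidth."""
    return calico / native


for instance, (calico, native) in measurements.items():
    print(f"{instance}: {fraction_of_native(calico, native):.0%} of native")
```

On these numbers Calico reaches roughly half of native on t2.micro and about two fifths on m4.xlarge.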
slide-22
SLIDE 22

It’s all about the MTU

  • Calico in a public cloud uses IP-in-IP, with tunnel MTU = 1440
  • 1440 was optimised for GCE, which has an MTU of 1460 on its VM interfaces

  • But AWS instances have an MTU = 9001!
  • So the native tests were using jumbo frames, and the Calico test was using an MTU of 1440.
  • If Calico’s tunnel MTU is increased to 8980
  • For t2.micro, Calico bandwidth = 114 MB/sec
  • For m4.xlarge, Calico bandwidth = 266 MB/sec
  • Problem solved – Calico throughput is now close to native
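
The MTU arithmetic behind this slide can be sketched as follows. The sketch assumes IPv4 and TCP headers of 20 bytes each with no options, and that IP-in-IP adds exactly one outer IPv4 header. The function names are illustrative, and the packets-per-megabyte figure is a rough indicator of per-packet work, not a measurement.

```python
IPV4_HEADER = 20  # bytes, IPv4 header without options
TCP_HEADER = 20   # bytes, TCP header without options (assumption)
IPIP_OVERHEAD = IPV4_HEADER  # IP-in-IP prepends one outer IPv4 header


def max_tunnel_mtu(interface_mtu):
    """Largest tunnel MTU that still fits in one frame on the interface."""
    return interface_mtu - IPIP_OVERHEAD


def tcp_payload(mtu):
    """TCP payload bytes carried by one full-sized packet at this MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def packets_per_megabyte(mtu):
    """Full-sized packets needed to carry 1,000,000 payload bytes."""
    return -(-1_000_000 // tcp_payload(mtu))  # ceiling division


# (platform, interface MTU from the slide, tunnel MTU from the slide)
for name, iface_mtu, tunnel_mtu in [("GCE", 1460, 1440), ("AWS", 9001, 8980)]:
    assert tunnel_mtu <= max_tunnel_mtu(iface_mtu)
    print(name, max_tunnel_mtu(iface_mtu), tcp_payload(tunnel_mtu),
          packets_per_megabyte(tunnel_mtu))
```

Under these assumptions a tunnel MTU of 1440 is exactly what fits on a 1460-byte interface, 8980 fits within AWS's 9001, and the larger MTU moves the same data in about one sixth as many packets.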
slide-23
SLIDE 23

So what have we learned?

  • With Calico connectivity, VMs or containers can saturate a 10 Gb/s link between hosts, just as the hosts themselves can

  • There is a CPU cost to virtualization
  • But mostly inevitable if you want virtualization at all (non-accelerated)
  • Calico does not add any significant extra cost
  • Conntrack largely saves us from the effects of complex iptables
  • ipsets and clever programming design also help
  • Be humble about performance comparisons
slide-24
SLIDE 24

Further information, and thanks!

  • Project Calico
  • http://www.projectcalico.org/
  • http://docs.projectcalico.org/en/latest/
  • https://github.com/projectcalico
  • Blog on Calico data plane performance
  • http://www.projectcalico.org/calico-dataplane-performance/
  • Thanks!
  • @neiljerram
  • @projectcalico
  • www.metaswitch.com