  1. Networking approaches in a Container World

  2. Who we are
    ● Flavio Castelli — Engineering Manager
    ● Neil Jerram — Senior Software Engineer
    ● Antoni Segura Puimedon — Principal Software Engineer

  3. Disclaimer
    ● There are many container engines; we'll focus on Docker
    ● Multiple networking solutions are available:
      ○ Introduce the core concepts
      ○ Many projects → cover only some of them
    ● Container orchestration engines:
      ○ Often coupled with networking
      ○ Focus on Docker Swarm and Kubernetes
    ● Remember: the container ecosystem moves at a fast pace, things can suddenly change

  4. The problem
    ● Containers are lightweight
    ● Containers are great for microservices
    ● Microservices: multiple distributed processes communicating
    ● Lots of containers that need to be connected together

  5. Single host

  6. Host networking
    ● Containers have full access to the host interfaces!
    (diagram: container-a sharing the host's interfaces: lo, eth0, ...)

  7. Host networking
    Containers are able to:
    ● See all host interfaces
    ● Use all host interfaces
    Containers can't (without extra capabilities):
    ● Modify their IP addresses
    ● Modify their IP routes
    ● Create virtual devices
    ● Interact with iptables/ebtables

    $ docker run --net=host -it --rm alpine /bin/sh
    / # ip link show
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    2: wlp4s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP qlen 1000
        link/ether e4:b3:18:d2:f6:ea brd ff:ff:ff:ff:ff:ff
    3: enp0s31f6: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN qlen 1000
        link/ether c8:5b:76:36:b6:0b brd ff:ff:ff:ff:ff:ff
    4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
        link/ether 02:42:7e:62:3d:37 brd ff:ff:ff:ff:ff:ff
    / #

  8. Bridged networking
    ● Linux bridge
    ● Containers connected to the bridge with veth pairs
    ● Each container gets its own IP and kernel networking namespace
    ● Containers can talk to each other and to the host via IP forwarding
    (diagram: container-a and container-b attached through veth pairs to the docker0 bridge, 172.17.0.0/16, which sits on the host next to eth0)

  9. Bridged networking
    ● Outwards connectivity via IP forwarding and masquerading
    ● The bridge and containers use a private subnet

    $ ip address show dev docker0
    4: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
        link/ether 02:42:7e:62:3d:37 brd ff:ff:ff:ff:ff:ff
        inet 172.17.0.1/16 scope global docker0
           valid_lft forever preferred_lft forever

    $ sudo iptables -t nat -L POSTROUTING
    Chain POSTROUTING (policy ACCEPT)
    target      prot opt source         destination
    MASQUERADE  all  --  172.17.0.0/16  anywhere

    $ docker run --net=bridge -it --rm alpine /bin/sh -c '/sbin/ip -4 address show dev eth0; ip -4 route show'
    50: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1500 qdisc noqueue state UP
        inet 172.17.0.2/16 scope global eth0
           valid_lft forever preferred_lft forever
    default via 172.17.0.1 dev eth0
    172.17.0.0/16 dev eth0 src 172.17.0.2

  10. Bridged networking
    ● Services are exposed with iptables DNAT rules
    ● iptables performance deteriorates as the rule count increases
    ● Limited by the number of host ports that are free to be bound

    $ docker run --net=bridge -d --name nginx -p 8000:80 nginx
    $ sudo iptables -t nat -n -L
    Chain PREROUTING (policy ACCEPT)
    target  prot opt source     destination
    DOCKER  all  --  0.0.0.0/0  0.0.0.0/0    ADDRTYPE match dst-type LOCAL

    Chain OUTPUT (policy ACCEPT)
    target  prot opt source     destination
    DOCKER  all  --  0.0.0.0/0  !127.0.0.0/8 ADDRTYPE match dst-type LOCAL

    Chain POSTROUTING (policy ACCEPT)
    target      prot opt source         destination
    MASQUERADE  all  --  172.17.0.0/16  0.0.0.0/0
    MASQUERADE  tcp  --  172.17.0.2     172.17.0.2   tcp dpt:80

    Chain DOCKER (2 references)
    target  prot opt source     destination
    RETURN  all  --  0.0.0.0/0  0.0.0.0/0
    DNAT    tcp  --  0.0.0.0/0  0.0.0.0/0   tcp dpt:8000 to:172.17.0.2:80
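The effect of the DNAT rule in that transcript can be sketched in a few lines of Python. This is a conceptual model, not Docker or netfilter code: the rule table, packet dictionary, and addresses are illustrative assumptions mirroring the `-p 8000:80` mapping above.

```python
# Sketch of what a DNAT rule like "tcp dpt:8000 to:172.17.0.2:80" does:
# rewrite the destination of a matching packet before routing it.

dnat_rules = {
    # (protocol, host port) -> (container IP, container port)
    ("tcp", 8000): ("172.17.0.2", 80),
}

def apply_dnat(packet):
    """Rewrite the packet's destination if a DNAT rule matches."""
    key = (packet["proto"], packet["dport"])
    if key in dnat_rules:
        packet["dst"], packet["dport"] = dnat_rules[key]
    return packet

# A packet arriving at the host's public address on port 8000
# is redirected to the nginx container on port 80:
packet = {"proto": "tcp", "dst": "192.0.2.10", "dport": 8000}
print(apply_dnat(packet))  # dst becomes 172.17.0.2, dport 80
```

The real kernel path walks the PREROUTING chain linearly, which is why the slide notes that performance degrades as the rule count grows.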

  11. Multi host

  12. Multi host networking scenarios
    (diagram: three networks spanning host-A, host-B and host-C: a frontend network with container-01 and container-02, an application network with container-03 and container-04, and a database network with container-05 and container-06)

  13. Multi host networking scenarios
    (diagram: the same three networks, but the containers run on VM-1, VM-2 and VM-3, all hosted on a single big host-A)

  14. Multi host routing solutions

  15. Routing approach
    ● Managed common IP space at the container level
    ● Assigns a /24 subnet to each host
    ● Inserts routes to each host's /24 into the routing table of each host
    ● Main implementations:
      ○ Calico
      ○ Flannel
      ○ Romana
      ○ Kuryr
        ■ Calico
    (diagram: host-a 172.16.0.4/16 with docker0 10.0.8.1/24 and containers 10.0.8.2/24 and 10.0.8.3/24; host-b 172.16.0.5/16 with docker0 10.0.9.1/24 and containers 10.0.9.2/24 and 10.0.9.3/24; the hosts' eth0 interfaces connect over 172.16.0.0/16)
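The per-host /24 scheme above can be sketched with the standard `ipaddress` module. This is a conceptual model of one host's routing table, not any project's actual code; the subnets and underlay addresses mirror the diagram.

```python
# Sketch of the routing approach: each host owns a /24 of container
# address space, and every host holds a route for every other host's /24.
import ipaddress

routes = {
    ipaddress.ip_network("10.0.8.0/24"): "172.16.0.4",  # host-a's subnet
    ipaddress.ip_network("10.0.9.0/24"): "172.16.0.5",  # host-b's subnet
}

def next_hop(container_ip):
    """Return the underlay address of the host that owns this container's /24."""
    addr = ipaddress.ip_address(container_ip)
    for subnet, host in routes.items():
        if addr in subnet:
            return host
    raise LookupError(f"no route to {container_ip}")

print(next_hop("10.0.9.2"))  # 172.16.0.5 (host-b)
```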

  16. Calico's approach
    ● A Felix agent per node sets up a vRouter:
      ○ Kernel's L3 forwarding
      ○ Handles ACLs with iptables
      ○ Uses BIRD's BGP to keep /32 or /128 routes to each container updated
      ○ etcd as data store
      ○ Replies to container ARP requests with the host hwaddr
    (diagram: the same two-host topology as before, with a BGP vRouter on host-a 172.16.0.4/16 and on host-b 172.16.0.5/16)
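Why per-container /32 routes work alongside broader subnet routes comes down to longest-prefix matching, which can be sketched briefly. The routes below are illustrative (the /32 next hop is hypothetical), not taken from a real Calico deployment.

```python
# Sketch of longest-prefix matching: the kernel prefers the most
# specific route, so a per-container /32 wins over a broader /24.
import ipaddress

routes = [
    (ipaddress.ip_network("10.0.9.0/24"), "172.16.0.5"),  # host-b's subnet
    (ipaddress.ip_network("10.0.9.2/32"), "172.16.0.9"),  # hypothetical: one container pinned to another host
]

def lookup(dest):
    """Pick the matching route with the longest prefix."""
    addr = ipaddress.ip_address(dest)
    matches = [(net, nh) for net, nh in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("10.0.9.2"))  # the /32 route wins: 172.16.0.9
print(lookup("10.0.9.3"))  # falls back to the /24: 172.16.0.5
```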

  17. Flannel approach
    ● Flanneld agent
      ○ etcd as data store
      ○ Keeps /24 routes to hosts up to date
      ○ No ACLs/isolation

  18. Canal
    ● Developed by Tigera
    ● Announced on May 9th, 2016

  19. Multi host overlay solutions

  20. Overlay approach
    ● Encapsulates multiple virtual networks over the physical network:
      ○ UDP
      ○ vxlan
      ○ geneve
      ○ GRE
    ● Connects containers to virtual networks
    ● Main projects:
      ○ Docker's native overlay
      ○ Flannel
      ○ Weave
      ○ Kuryr
        ■ OVS (OVN, Dragonflow)
        ■ MidoNet
        ■ PLUMgrid
    (diagram: host-a 172.16.0.4/16 carries net-x 10.0.8.1/24 and net-y 10.0.7.1/24 with containers 10.0.8.2/24, 10.0.8.3/24 and 10.0.7.2/24; host-b 172.16.0.5/16 also carries net-y with containers 10.0.7.3/24 and 10.0.7.4/24; encapsulated container traffic flows between the hosts over 172.16.0.0/16)
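The encapsulation idea can be made concrete by building a VXLAN-style header by hand. This is a conceptual sketch only: it constructs the 8-byte VXLAN header layout from RFC 7348 around a fake inner frame, ignoring the outer UDP/IP layers a real overlay would add.

```python
# Sketch of overlay encapsulation: prefix the original L2 frame with a
# VXLAN header carrying a 24-bit virtual network identifier (VNI).
# A real datapath would then wrap the result in UDP between hosts.

def vxlan_encap(vni, inner_frame):
    """Prefix a frame with an 8-byte VXLAN header (RFC 7348 layout)."""
    header = (
        bytes([0x08, 0, 0, 0])    # flags byte: "VNI present", plus 3 reserved bytes
        + vni.to_bytes(3, "big")  # 24-bit virtual network identifier
        + b"\x00"                 # final reserved byte
    )
    return header + inner_frame

packet = vxlan_encap(vni=42, inner_frame=b"original ethernet frame")
print(len(packet))  # 8-byte header plus the inner frame
```

The VNI is what keeps net-x and net-y traffic separate even though both cross the same physical 172.16.0.0/16 underlay.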

  21. OpenStack & containers with Kuryr
    ● Allows you to have VMs, containers and containers-in-VMs in the same overlay
    ● Allows reusing VM nets for containers and vice versa
    ● Allows you to have separate overlay nets routed to each other
    ● Isolation from the host networking
    ● Can have Swarm and Kubernetes on the same overlay
    (diagram: the overlay spanning VMs and containers, running on top of the underlay network)

  22. Routing vs Overlay
    Routing
    ● Good:
      ○ Native performance
      ○ Easy debugging
    ● Bad:
      ○ Requires control over the infrastructure
      ○ Hybrid cloud more complicated (requires VPN)
      ○ Can run out of addresses (mitigation: IPv6)
    Overlay
    ● Good:
      ○ Easier inter-cloud
      ○ Easier hybrid workloads
      ○ Doesn't require control over the infrastructure
      ○ More implementation choice
    ● Bad:
      ○ Inferior performance (mitigations: hw acceleration and jumbo frames)
      ○ Debugging more complicated

  23. Competing COE-Networking interaction
    Container Network Model (CNM)
    ● Implemented by Docker's Libnetwork
    ● Separated IPAM and Remote Drivers
    ● Docker ≥ 1.12 Swarm mode only works with the native overlay driver
    ● Some of the Libnetwork remote drivers:
      ○ OpenStack Kuryr
      ○ Calico
      ○ Weave
    Container Network Interface (CNI)
    ● Implemented by Kubernetes, rkt, Mesos, Cloud Foundry and Kurma
    ● Plugins:
      ○ Calico
      ○ Flannel
      ○ Weave
      ○ OpenStack Kuryr (unreleased)

  24. More challenges

  25. Service discovery
    ● Producer: a container that runs a service
    ● Consumer: a container that consumes a service
    ● Need a way for consumers to find producer endpoints

  26. Service discovery challenges
    ● #1 Finding the producer: where is redis?
      (diagram: web-01 on host-A needs to locate redis-01 on host-B)
    ● #2 Moving services
      (diagram: the service moves from redis-01 on host-B to redis-02 on host-C; without service discovery, web-01 keeps pointing at the old location)

  27. Service discovery challenges
    ● #3 Multiple choice: which redis?
      (diagram: web-01 on host-A can pick between redis-01, redis-02 and redis-03 on host-B, host-C and host-D)

  28. Addressing service discovery

  29. Use DNS
    ● Problematic for highly dynamic deployments:
      ○ Containers can die/be moved more often than DNS caches expire
      ○ If we try to improve it by reducing DNS TTLs → more load on the server
      ○ Some clients ignore TTLs → old entries stay cached
    ● Note well:
      ○ Docker < 1.11: updates /etc/hosts dynamically
      ○ Docker ≥ 1.11: integrates a DNS server
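The staleness problem can be sketched with a tiny TTL cache. This is a conceptual model of resolver caching, not any DNS implementation; the name, addresses, and TTL are made up.

```python
# Sketch of why DNS caching hurts dynamic deployments: a cached answer
# keeps being returned until its TTL expires, even if the container
# behind the name has already moved.
import time

class TtlCache:
    def __init__(self):
        self._entries = {}  # name -> (address, expiry timestamp)

    def put(self, name, address, ttl):
        self._entries[name] = (address, time.monotonic() + ttl)

    def get(self, name):
        address, expiry = self._entries.get(name, (None, 0.0))
        return address if time.monotonic() < expiry else None

cache = TtlCache()
cache.put("redis", "10.0.8.2", ttl=30)
# Suppose the container now moves to 10.0.9.3: for up to 30 seconds,
# consumers still resolve the stale address from the cache.
print(cache.get("redis"))
```

Shrinking the TTL narrows the stale window but, as the slide notes, pushes more queries onto the DNS server.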

  30. Key-value store
    ● Rely on a k/v store:
      ○ etcd
      ○ consul
      ○ zookeeper
    ● Producer registers its IP and port
    ● Orchestration engine hands this data to the consumer
    ● At run time, either:
      ○ Change your application to read data straight from the k/v store
      ○ Rely on some helper that exposes the values via an environment file or configuration file
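The register/discover flow above can be sketched with a plain dict standing in for etcd/consul/zookeeper. Function names and endpoints are illustrative assumptions, not any store's actual API; picking among multiple producers (challenge #3) is shown here as simple round-robin.

```python
# Sketch of k/v-based service discovery: producers register their
# endpoint under the service name; consumers list the endpoints and
# cycle through them round-robin.
import itertools

store = {}  # service name -> list of "ip:port" endpoints

def register(service, ip, port):
    """Producer side: announce an endpoint for a service."""
    store.setdefault(service, []).append(f"{ip}:{port}")

def discover(service):
    """Consumer side: iterate over the registered endpoints, round-robin."""
    return itertools.cycle(store[service])

register("redis", "10.0.8.2", 6379)
register("redis", "10.0.9.3", 6379)
endpoints = discover("redis")
print(next(endpoints))  # 10.0.8.2:6379
print(next(endpoints))  # 10.0.9.3:6379
```

A real store adds what this sketch omits: watches so consumers learn about moves immediately, and TTL-based leases so dead producers drop out of the list.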

  31. Changes, multiple choices & ingress traffic
