Running Kubernetes on OpenStack and Bare Metal
OpenStack Summit Berlin, November 2018
Ramon Acedo Rodriguez
Product Manager, Red Hat OpenStack Team
@ramonacedo | racedo@redhat.com
Bare Metal On-Trend
[Chart: bare metal / Ironic adoption, OpenStack User Survey 2017]
Among users who run Kubernetes on OpenStack, adoption of Ironic is even stronger with 37% relying on it.
OpenStack User Survey 2018
Popular Use Cases: Kubernetes on Bare Metal
Particularly on OpenStack bare metal
High-Performance Computing
Direct Access to Dedicated Hardware Devices
Big Data and Scientific Applications
blog.openshift.com/kubernetes-on-metal-with-openshift
[Diagram: Kubernetes across the datacentre: workload-driven, programmatic, scale-out across infrastructure, deeply integrated]
Ironic Introduction
Hardware Lifecycle Management
Hardware Inspection
Servers and network switches (via LLDP)
OS Provisioning
Supports qcow2 images
Routed Spine/Leaf Networking
Provisioning over routed networks
Multi-Tenancy
ML2 networking-ansible plug-in
Node Auto-Discovery
Broad BMC Support
Redfish, iDRAC, iRMC, iLO, IPMI, oVirt, vBMC
Simple Architecture
Highly Available
Run multiple Ironic instances in HA
Mixed VM and Bare Metal Instances
Simply add Nova compute nodes
Register Bare Metal Nodes
Create Networks, Create Flavors, Upload Images
Select OS and Flavor, Select Network
Start VM Instances or Start BM Instances
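The same flow can be sketched with the openstack CLI. A minimal, illustrative example; the BMC address, credentials, image, network and flavor values are placeholders, and the exact driver and flavor properties depend on the release and hardware:

# Register a bare metal node with its BMC credentials
$ openstack baremetal node create --driver ipmi \
    --driver-info ipmi_address=10.0.0.10 \
    --driver-info ipmi_username=admin \
    --driver-info ipmi_password=secret \
    --name bm-node-0

# Create a flavor for bare metal instances
$ openstack flavor create --ram 65536 --disk 100 --vcpus 16 baremetal

# Boot a bare metal instance exactly like a VM instance
$ openstack server create --image rhel7 --flavor baremetal \
    --network tenant-net --key-name mykey bm-instance-0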
Ironic and OpenStack Features
Ironic Multi-Tenancy with Isolation Between Tenants
Dedicated Provider Networks
Instead of a shared flat network
Provisioning Over an Isolated, Dedicated Network
Physical Switch Ports Dynamically Configured
At deployment time and on termination
Support for Neutron Port Groups and Security Groups
For link aggregation and switch ACLs (a port-group sketch follows below)
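As a sketch of the port-group piece, the commands below create a LAG port group on a node and attach a physical port to it; the node UUID, MAC addresses and bond mode are illustrative placeholders:

# Create a port group for link aggregation on a node
$ openstack baremetal port group create --node <node-uuid> \
    --name bond0 --address 52:54:00:aa:bb:cc --mode 802.3ad

# Attach a physical NIC (port) to the port group
$ openstack baremetal port create 52:54:00:aa:bb:cd \
    --node <node-uuid> --port-group bond0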
[Diagram: bare metal nodes attached to an L2 switch; one node bonds two NICs in a LAG (switch side configured by the ML2 plug-in, bond configured by cloud-init using metadata), while single-NIC nodes sit on VLANs set by the ML2 plug-in]
Upstream Docs
https://docs.openstack.org/ironic/latest/admin/multitenancy.html
https://docs.openstack.org/ironic/latest/install/configure-tenant-networks.html
https://docs.openstack.org/ironic/latest/admin/portgroups.html
ML2 Networking Ansible
Neutron ML2 networking-ansible Driver
Multiple Switch Platforms in a Single ML2 Driver
Leveraging the Ansible networking modules
New in OpenStack Rocky
Provisioning flow:
1. ML2 plug-in configures the switch: the provisioning network is configured on the switch port
2. BM is provisioned
3. ML2 plug-in configures the switch: the tenant network is configured on the switch port
4. BM boots on the tenant network
5. BM is ready
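A minimal sketch of enabling the driver, assuming the ini-style per-switch sections used by networking-ansible; the switch name, platform and credentials are illustrative placeholders:

# /etc/neutron/plugins/ml2/ml2_conf.ini
[ml2]
mechanism_drivers = openvswitch,ansible

# One section per managed switch
[ansible:leaf-switch-1]
ansible_network_os = junos
ansible_host = 192.0.2.10
ansible_user = admin
ansible_ssh_pass = secret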
[Diagram: single-NIC bare metal nodes attached to an L2 switch managed by the driver]
[Diagram: L3 routed spine/leaf topology: spine switches interconnect ToR/leaf switches; each leaf has a DHCP relay and a rack of bare metal and Ironic nodes, joined by L3 routed networks]
L3 Routed Networks (Spine/Leaf Topologies)
Ironic provisions bare metal nodes over routed networks
DHCP Relay
Allows PXE booting over L3 routed networks
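On the Neutron side this maps to routed provider networks: one segment per rack, each with its own subnet. A minimal sketch with illustrative names and address ranges:

# Provisioning network with its first segment on rack 1
$ openstack network create --share --provider-network-type flat \
    --provider-physical-network rack1 provisioning

# Add a segment for rack 2 and give it its own subnet
$ openstack network segment create --network provisioning \
    --network-type flat --physical-network rack2 segment-rack2
$ openstack subnet create --network provisioning \
    --network-segment <segment-uuid> \
    --subnet-range 192.168.20.0/24 provisioning-rack2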
Ironic Inspector Node Auto-Discovery
Use Rules to Set Node Properties
E.g. set the Ironic driver (iDRAC, Redfish…) based on the detected hardware vendor, etc.
Just Power On the Nodes
Nodes PXE boot from the provisioning network used by Ironic
Automatic Node Inspection
Nodes boot from the network and their hardware is inspected
Automatically Registered with Ironic
After inspection they are registered with Ironic and ready to be deployed
cat > rules.json << EOF
[
  {
    "description": "Set the vendor driver for Dell hardware",
    "conditions": [
      {"op": "eq", "field": "data://auto_discovered", "value": true},
      {"op": "eq", "field": "data://inventory.system_vendor.manufacturer", "value": "Dell Inc."}
    ],
    "actions": [
      {"action": "set-attribute", "path": "driver", "value": "idrac"},
      {"action": "set-attribute", "path": "driver_info/drac_username", "value": "root"},
      {"action": "set-attribute", "path": "driver_info/drac_password", "value": "calvin"},
      {"action": "set-attribute", "path": "driver_info/drac_address", "value": "{data[inventory][bmc_address]}"}
    ]
  }
]
EOF

$ openstack baremetal introspection rule import rules.json
Data collected during inspection
E.g. use the idrac driver and its credentials if a Dell node is detected
Redfish Support in Ironic
API-Driven Remote Management Platform
Manage large numbers of physical nodes via API. redfish.dmtf.org
Included in Modern BMCs
Most vendors support Redfish in the latest models
Supported in Ironic
Introduced in Pike along with the Sushy library
OpenStack Stein Additions
Out-of-band inspection of nodes, boot from virtual media (without DHCP) and BIOS configurations
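As a sketch of what API-driven management looks like, the standard Redfish REST endpoints can be driven with curl; the BMC address, credentials and system ID are illustrative placeholders:

# Read a system's power state
$ curl -sk -u admin:secret \
    https://bmc.example.com/redfish/v1/Systems/1 | jq .PowerState

# Power the node on via the standard ComputerSystem.Reset action
$ curl -sk -u admin:secret -H "Content-Type: application/json" \
    -X POST -d '{"ResetType": "On"}' \
    https://bmc.example.com/redfish/v1/Systems/1/Actions/ComputerSystem.Reset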
Get and Set BIOS Settings
Retrieve and apply BIOS settings via CLI or REST API
Settings Applied During Node Cleaning
The desired BIOS settings are applied during manual cleaning
Ironic BIOS Configuration
docs.openstack.org/ironic/latest/admin/bios.html

[{"name": "hyper_threading_enabled", "value": "False"},
 {"name": "cpu_vt_enabled", "value": "True"}]
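A minimal sketch of applying such settings through manual cleaning; the node name is an illustrative placeholder, and the node must first be in the manageable state:

$ openstack baremetal node manage bm-node-0
$ openstack baremetal node clean bm-node-0 --clean-steps '[{
    "interface": "bios",
    "step": "apply_configuration",
    "args": {"settings": [{"name": "cpu_vt_enabled", "value": "True"}]}
  }]'
$ openstack baremetal node provide bm-node-0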
[Diagram: a central site (Site A) with three Ironic controllers, and remote sites B, C and D, each with a local Ironic conductor managing its own group of bare metal nodes]
Multi-Site Ironic Conductor and Node Grouping Affinity
Using the conductor/node grouping affinity spec
Each Ironic Conductor Manages a Group of Nodes
No need to expose access to the BMCs (e.g. IPMI) across sites
PXE Boot or Virtual Media Provisioning
Booting nodes without DHCP will be possible (see the Ironic L3-based deployment spec)
Deployment of Kubernetes on the metal
Deploy Kubernetes on OpenStack Ironic-Managed Bare Metal Nodes
1. Deploy OpenStack with Ironic (OpenStack installer)
2. Deploy Kubernetes (Kubernetes installer)
3. Kubernetes cluster runs on master, infra and worker nodes
docs.openshift.com/container-platform/3.11/getting_started/install_openshift.html
Workflow to Install an OpenShift Cluster on Bare Metal
Provision Bare Metal Nodes
Ironic provisions the OS image and configures the network
Add DNS Entries
Wildcard DNS for container apps and fully-qualified names for the nodes
Distribute SSH Keys
Cluster nodes need passwordless SSH access to each other
Install with the OpenShift Ansible Installer
Install the openshift-ansible installer on an admin node and point it at the bare metal nodes (a sample inventory is sketched below)
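A minimal sketch of an openshift-ansible inventory for OCP/OKD 3.11 on the provisioned nodes; the hostnames, SSH user and apps subdomain are illustrative placeholders:

[OSEv3:children]
masters
nodes
etcd

[OSEv3:vars]
ansible_ssh_user=cloud-user
openshift_deployment_type=origin
openshift_master_default_subdomain=apps.example.com

[masters]
master-0.example.com

[etcd]
master-0.example.com

[nodes]
master-0.example.com openshift_node_group_name='node-config-master'
infra-0.example.com openshift_node_group_name='node-config-infra'
worker-0.example.com openshift_node_group_name='node-config-compute'

# Then run the prerequisites and deploy_cluster playbooks
$ ansible-playbook -i inventory playbooks/prerequisites.yml
$ ansible-playbook -i inventory playbooks/deploy_cluster.yml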
OpenShift to the Rescue
[Diagram: a TripleO node, integrating openshift-ansible, deploys a Kubernetes cluster of master, infra and worker nodes]
Deploy an OpenShift/OKD cluster and GlusterFS on bare metal nodes
Provision Nodes and Deploy Kubernetes with Ironic in TripleO
New in Rocky!
[stack@undercloud-0 ~]$ cat /home/stack/openshift_env.yaml
[...]
OS::TripleO::OpenShiftMaster::Net::SoftwareConfig: /home/stack/master-nic.yaml
OS::TripleO::OpenShiftWorker::Net::SoftwareConfig: /home/stack/worker-nic.yaml
OS::TripleO::OpenShiftInfra::Net::SoftwareConfig: /home/stack/infra-nic.yaml
[...]
OpenShiftMasterCount: 3
OpenShiftWorkerCount: 3
OpenShiftInfraCount: 3
[...]
OpenShiftInfraParameters:
  OpenShiftGlusterDisks:
    [...]
Create OpenShift Roles
Master, Worker and Infra nodes in TripleO
Configure the Network Settings in TripleO
E.g. Internal, External and Storage networks and the NIC configuration for each node
Set OpenShift and GlusterFS Options
E.g. number of nodes, disks for Gluster
Deploy with TripleO
Run the usual ‘openstack overcloud deploy’ command
[stack@undercloud-0 ~]$ cat overcloud_deploy.sh
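The script's contents might look like the sketch below; the roles file and environment file names are illustrative and depend on the deployment:

#!/bin/bash
openstack overcloud deploy \
  --templates /usr/share/openstack-tripleo-heat-templates \
  -r /home/stack/openshift_roles_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/openshift.yaml \
  -e /home/stack/openshift_env.yaml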
https://github.com/openstack/tripleo-heat-templates
GlusterFS, Manila/CephFS, NFS
GlusterFS
NFS/Manila (CephFS)
Local HostPath
Storage Should be Highly Available
GlusterFS and CephFS provide HA
Storage Should Allow RWX Mode
ReadWriteMany (RWX) access is required by some apps. GlusterFS and CephFS are supported backends for the RWX access mode
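A minimal sketch of a PersistentVolumeClaim requesting the RWX access mode; the storage class name is an illustrative placeholder:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: shared-data
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: glusterfs-storage
  resources:
    requests:
      storage: 10Gi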
GlusterFS
Kubernetes Cluster on Bare Metal with Converged GlusterFS Storage
[Diagram: three master, three infra and three worker nodes hosting an infra GlusterFS cluster and an apps GlusterFS cluster]
OpenStack Storage Not Required
We deploy with OpenStack (TripleO), but Kubernetes doesn't use OpenStack
TripleO Deploys GlusterFS on Bare Metal
Optionally, we can request TripleO to deploy GlusterFS for the OpenShift cluster
GlusterFS Can Be Hosted On the Infra and Worker Nodes
The GlusterFS Cluster can be hosted in “converged” mode along with the Infra and Worker nodes
Manila with CephFS/NFS
Manila Provides RWX Access
PVs can be created with ReadWriteMany (RWX) access mode
Ceph as a Single Storage Backend
Manila is backed by CephFS/NFS, allowing Ceph to serve both OpenStack and OpenShift workloads and infrastructure
Kubernetes Registry on Object Storage from Ceph
Ceph RadosGW configured with OpenStack for Object Storage can be used for the registry
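A minimal sketch of creating an NFS share on the CephFS/NFS backend and allowing the cluster to mount it, using the manila CLI; the share type, name and CIDR are illustrative placeholders:

# Create a 10 GB NFS share backed by CephFS
$ manila create NFS 10 --name k8s-share --share-type cephfsnfstype

# Allow the Kubernetes nodes' network to mount it
$ manila access-allow k8s-share ip 192.168.20.0/24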
Kubernetes Cluster on Bare Metal Consuming Storage from OpenStack Manila Backed by Ceph
[Diagram: bare metal Kubernetes nodes consuming Manila shares from OpenStack (Ironic + Manila), backed by a Ceph cluster of Ceph storage nodes]
OpenShift Networking Architecture
[Diagram: a Kubernetes cluster on bare metal (three master, three infra and three worker nodes plus load balancers) alongside an OpenStack cluster with three Ironic controllers; the provisioning, data, public and BMC networks span both, VXLAN carries container-to-container traffic, and the BMC network carries IPMI/Redfish/iDRAC traffic]
More info at docs.openshift.com/container-platform/3.11/architecture/networking/sdn.html
BMC Network
Ironic manages the servers via their BMC (IPMI, Redfish, iDrac, iLO, iRMC, etc.)
Provisioning Network
When deploying from Ironic, a NIC is used to DHCP/PXE-boot. This is usually a single NIC (or one NIC from a bond with LACP fallback)
Data Network
Pod-to-pod traffic goes through the data network
Open vSwitch and CNI
OVS handles traffic within the cluster (pod-to-pod and node-to-node) as well as ingress and egress traffic. OVS is used as the Container Network Interface (CNI) plug-in for Kubernetes
Ramon Acedo Rodriguez
Product Manager, Red Hat OpenStack Team
@ramonacedo | racedo@redhat.com