HYPER COOL INFRASTRUCTURE
OPENSTACK SUMMIT BOSTON | MAY 2017
RANDY RUBINS, Sr. Cloud Consultant, Red Hat
May 10, 2017
WHY HYPER "COOL" INFRA?
Did not like "CONVERGED"
Needed to preserve a "C-word"
Could have used "COMPLEX" or "CRAMMED"
Ended up with a four-letter word that helped get my presentation accepted
What is HCI?
Drivers and Use Cases
Red Hat Hyperconverged Solutions
Architectural Considerations
Implementation Details
Performance and Scale Considerations
Futures
Q & A
DRIVERS
Smaller hardware footprint
Lower cost of entry
Standardization
Maximized capacity utilization

USE CASES
D-NFV
vCPE
ROBO
Lab/Sandbox
TRADITIONAL VIRTUALIZATION PRIVATE CLOUD CONTAINERIZED CLOUD APPS
[Diagram: RHV-S hyperconverged cluster: hosted-engine VM running on nodes grafton-0, grafton-1, grafton-2]
RHV-S
1. Valid subscriptions for RHV 4.1 & RHGS 3.2
2. Exactly 3 physical nodes with adequate memory and storage
3. 2 network interfaces (gluster back-end, and ovirtmgmt front-end)
4. RAID10/5/6 supported/recommended
5. 1 hot-spare drive recommended per node
6. RAID cards must use flash-backed write cache
7. 3-4 gluster volumes (engine, vmstore, data, shared_storage geo-replicated volume)
8. 29-40 VMs supported
9. 4 vCPUs / 2TB max per VM supported

Currently in Beta/LA - subject to change in GA version. Full details can be found here: http://red.ht/2qKwMKY
[Diagram: OSP-HCI layout with undercloud (director)]
OSP-HCI
1. Valid subscriptions for RHOSP 10 & RHCS 2.0
2. (1) OSP undercloud (aka "director") - can be a VM
3. (3) "OSP controller + Ceph MON" nodes
4. (3+) "OSP compute + Ceph OSD" nodes with adequate memory and storage
5. 10Gbps network interfaces for Ceph storage and OpenStack tenant networks
6. Up to 1 datacenter rack (42 nodes) for "OSP compute + Ceph OSD"

Currently in Tech Preview, soon to reach fully-supported status, GA being evaluated. Full details can be found here: http://red.ht/2jXvxkB
[Diagram: lab deployment: undercloud, hosted-engine, grafton-0/1/2, ansible-tower, cloudforms]
RHV-S
1. Install RHEL 7.3 and RHV 4.1 on (3) grafton nodes
2. Configure public-key-based SSH authentication between the nodes (see the sketch after this list)
3. Deploy gluster via cockpit plugin / gdeploy
4. Deploy hosted-engine via cockpit plugin
5. Enable gluster functionality on hosted-engine
6. Create networks for gluster storage, provisioning, and the rest of the OSP isolated networks
7. Create master storage domain
8. Add remaining (2) hypervisors to hosted-engine
9. Upload RHEL 7.3 guest image
10. Create RHEL 7.3 template
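A minimal sketch of step 2, assuming the grafton hostnames above (gdeploy and the cockpit plugin expect passwordless root SSH to all three nodes):

    # Generate a key once, then push it to every grafton node
    ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
    for host in grafton-0 grafton-1 grafton-2; do
        ssh-copy-id root@${host}
    done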
OSP-HCI Deploy director (undercloud) on RHV-S using RHEL 7.3 template
OSP-HCI Install and configure director via ansible-undercloud playbook
...
undercloud
├── files
│   ├── certs
│   │   ├── build_undercloud_cert.sh
│   │   ├── cacert.pem
│   │   └── privkey.pem
│   ├── stack.sudo
│   └── undercloud.pem
├── tasks
│   └── main.yml
├── templates
│   ├── hosts.j2
│   ├── instackenv.json.j2
│   ├── resolv.conf.j2
│   └── undercloud.conf.j2
└── undercloud.yml
...
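A plausible way to run the role above (the playbook and inventory names are assumptions drawn from the tree, not shown in the deck):

    # Run the undercloud playbook against the director VM (inventory name assumed)
    ansible-playbook -i hosts undercloud.yml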
OSP-HCI Prepare and upload overcloud images
- become_user: stack
  unarchive:
    copy: false
    src: /usr/share/rhosp-director-images/overcloud-full-latest-{{ osp_version }}.tar
    dest: /home/stack/images/

- become_user: stack
  unarchive:
    copy: false
    src: /usr/share/rhosp-director-images/ironic-python-agent-latest-{{ osp_version }}.tar
    dest: /home/stack/images/

- shell: >
    export LIBGUESTFS_BACKEND=direct &&
    virt-customize -a /home/stack/images/overcloud-full.qcow2
    --root-password password:{{ admin_password }}

- become_user: stack
  shell: source ~/stackrc && openstack overcloud image upload --image-path /home/stack/images --update-existing
  ignore_errors: true
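A quick check that the upload succeeded (standard director commands, not from the deck):

    # List the glance images registered by 'overcloud image upload'
    source ~/stackrc
    openstack image list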
OSP-HCI Customize tripleo heat templates based on Reference Architecture doc
[stack@director ~]$ tree custom-templates/
custom-templates/
├── ceph.yaml
├── certs
│   ├── build_overcloud_cert.sh
│   ├── cacert-oc.pem
│   └── privkey-oc.pem
├── compute.yaml
├── custom-roles.yaml
├── enable-tls.yaml
├── first-boot-template.yaml
├── inject-trust-anchor.yaml
├── layout.yaml
├── network.yaml
├── nic-configs
│   ├── compute-nics.yaml
│   └── controller-nics.yaml
├── numa-systemd-osd.sh
├── post-deploy-template.yaml
├── rhel-registration
│   ├── environment-rhel-registration.yaml
│   ├── rhel-registration-resource-registry.yaml
│   ├── rhel-registration.yaml
│   └── scripts
│       ├── rhel-registration
│       └── rhel-unregistration
└── scripts
    ├── configure_fence.sh
    ├── deploy.sh
    ├── ironic-assign.sh
    ├── nova_mem_cpu_calc.py
    ├── nova_mem_cpu_calc_results.txt
    └── wipe-disk.sh
NOTE: Use Github repo https://github.com/RHsyseng/hci
OSP-HCI Add resource isolation and tuning to custom templates
NOTE: Follow Chapter 7 of the OSP10/RHCS2 Reference Architecture Guide(!)
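For example, the values computed by nova_mem_cpu_calc.py land in the compute template as heat extra config; the numbers below are illustrative, not from the deck:

    parameter_defaults:
      ExtraConfig:
        # Memory held back from nova for the colocated Ceph OSDs and host overhead
        nova::compute::reserved_host_memory: 75000
        # Lowered from the 16.0 default to account for CPU spent servicing OSDs
        nova::cpu_allocation_ratio: 8.2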
OSP-HCI Forced to use a KVM host and the virtualbmc IPMI-to-libvirt proxy due to the lack of an oVirt/RHV ironic driver; the controllers run as KVM instances w/vbmc instead of RHV 4.1 VMs. RFE:
https://bugs.launchpad.net/ironic-staging-drivers/+bug/1564841
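virtualbmc exposes each libvirt domain as an IPMI endpoint; a sketch of wiring up one controller VM, reusing the port and credentials from the instackenv.json excerpt below:

    # Map the hci-ctrl0 libvirt domain to an IPMI listener on port 6230
    vbmc add hci-ctrl0 --port 6230 --username root --password calvin
    vbmc start hci-ctrl0
    # Sanity-check the endpoint from the undercloud
    ipmitool -I lanplus -H 192.168.2.10 -p 6230 -U root -P calvin power status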
OSP-HCI Create the instackenv.json file, register the (3) KVM instances and (3) OSP baremetal nodes, and run introspection
{ "name": "hci-comp0", "pm_type": "pxe_ipmitool", "mac": [ "84:2b:2b:4a:0c:3f" ], "cpu": "1", "memory": "4096", "disk": "50", "arch": "x86_64", "pm_user": "root", "pm_password": "calvin", "pm_addr": "192.168.0.104", "capabilities": "node:comp0,boot_option:local" } { "name": "hci-ctrl0", "pm_type": "pxe_ipmitool", "mac": [ "52:54:00:b7:c2:7d" ], "cpu": "1", "memory": "4096", "disk": "50", "arch": "x86_64", "pm_user": "root", "pm_password": "calvin", "pm_addr": "192.168.2.10", "pm_port": "6230", "capabilities": "node:ctrl0,boot_option:local" }
OSP-HCI Deploy overcloud hci stack using deploy.sh script
source ~/stackrc
time openstack overcloud deploy \
    ...
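One plausible shape of the full command inside deploy.sh, assuming the environment files from the custom-templates tree above (the deck elides the actual flags):

    source ~/stackrc
    time openstack overcloud deploy \
      --templates \
      -e ~/custom-templates/network.yaml \
      -e ~/custom-templates/ceph.yaml \
      -e ~/custom-templates/compute.yaml \
      -e ~/custom-templates/layout.yaml \
      -e ~/custom-templates/enable-tls.yaml \
      -e ~/custom-templates/inject-trust-anchor.yaml \
      --ntp-server pool.ntp.org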
OSP-HCI Validate deployment
[stack@director ~]$ openstack catalog show keystone
+-----------+-------------------------------------------------+
| Field     | Value                                           |
+-----------+-------------------------------------------------+
| endpoints | regionOne                                       |
|           |   publicURL: https://hci.rrubins.lan:13000/v2.0 |
|           |   internalURL: http://172.20.15.102:5000/v2.0   |
|           |   adminURL: http://172.20.16.14:35357/v2.0      |
|           |                                                 |
| name      | keystone                                        |
| type      | identity                                        |
+-----------+-------------------------------------------------+
Stack CREATE COMPLETE (hci): Stack CREATE completed successfully
OSP-HCI
[root@hci-ctrl0 ~]# ceph -s
    cluster aaaabbbb-cccc-dddd-eeee-ff0123456789
     health HEALTH_OK
     monmap e1: 3 mons at {hci-ctrl0=172.20.17.200:6789/0,hci-ctrl1=172.20.17.201:6789/0,hci-ctrl2=172.20.17.202:6789/0}
            election epoch 6, quorum 0,1,2 hci-ctrl0,hci-ctrl1,hci-ctrl2
     flags sortbitwise
      pgmap v1071: 704 pgs, 6 pools, 1132 kB data, 76 objects
            816 MB used, 2764 GB / 2765 GB avail
                 704 active+clean

[stack@director ~]$ openstack server list
+--------------------------------------+-----------+--------+-----------------------+----------------+
| ID                                   | Name      | Status | Networks              | Image Name     |
+--------------------------------------+-----------+--------+-----------------------+----------------+
| e957e67f-a494-4ff0-a274-8b87c86e5bc8 | hci-comp2 | ACTIVE | ctlplane=172.20.16.11 | overcloud-full |
| 37ffa97b-08c9-44db-a597-28b2b2e28b28 | hci-ctrl2 | ACTIVE | ctlplane=172.20.16.9  | overcloud-full |
| 495a1c59-8cd5-48da-84d9-84a17192fdce | hci-comp1 | ACTIVE | ctlplane=172.20.16.13 | overcloud-full |
| 58998f26-48c3-436c-9368-e669f9cf16bd | hci-comp0 | ACTIVE | ctlplane=172.20.16.12 | overcloud-full |
| a4277acf-5862-489b-bba3-6a4b203e70ea | hci-ctrl1 | ACTIVE | ctlplane=172.20.16.17 | overcloud-full |
| e86ac0a0-af08-4a6b-a2a1-a080b8b74cb6 | hci-ctrl0 | ACTIVE | ctlplane=172.20.16.15 | overcloud-full |
+--------------------------------------+-----------+--------+-----------------------+----------------+
ADDITIONAL TASKS
Deploy CloudForms appliance on RHV-S
Configure infrastructure provider for RHV and cloud provider for OSP
Provision some RHV and OSP instances to validate functionality
(6) Dell PowerEdge R710
(2) Intel Xeon E5520 quad-core CPUs
96GB Memory
(1) 120GB SATA SSD Hard Drive
(7) 146GB SAS Hard Drive
(4) 1GbE Port
(2) 10Gb Port (Intel X520-DA2)

NETWORK SWITCHES
(1) Cisco SG300 52-port 1GbE switch
(1) Quanta LB6M 24-port 10GbE switch
RHV-S + OSP-HCI

(3) Dell PowerEdge R630 [OSP-HCI (CTRL+MON)]
(2) Intel E5-2640v4 @ 2.4GHz CPUs
128GB RAM
(1) H730 RAID Controller
(2) 400GB SATA SSD (RAID1)
(1) 1GbE quad-port (on-board)
(1) 10GbE dual-port DA/SFP+ (Intel X520-DA2)

(6) Dell PowerEdge R730XD [RHV-S & OSP-HCI (COMP+OSD)]
(2) Intel E5-2640v4 @ 2.4GHz CPUs
256GB RAM
(1) H730 RAID Controller
(2) 400GB SATA SSD (RAID1)
(12) 1.2TB SAS HDD
(3) 480GB SAS SSD
(1) 1GbE quad-port (on-board)
(1) 10GbE dual-port DA/SFP+ (Intel X520-DA2)

NETWORK SWITCHES
(2) Dell Networking S3048-ON: (48) 1GbE ports
(2) Dell Networking S4048-ON: (48) 10GbE ports
HYPER COOL INFRA

(6) Dell PowerEdge R730XD
(2) Intel E5-2640v4 @ 2.4GHz CPUs
384GB RAM
(1) H730 RAID Controller
(2) 400GB SATA SSD (RAID1)
(12) 1.2TB SAS HDD
(3) 480GB SAS SSD
(1) 1GbE quad-port (on-board)
(2) 10GbE dual-port DA/SFP+ (Intel X520-DA2 or X710-DA2)

NETWORK SWITCHES
(2) Dell Networking S4048-ON: (48) 10GbE SFP+ ports
OSP-HCI
1. 10Gbps interfaces and jumbo frames for storage and tenant traffic
2. Set Nova reserved_memory and cpu_allocation_ratio based on calculations from the reference-architecture-supplied script (nova_mem_cpu_calc.py)
3. Avoid resource congestion with NUMA alignment and CPU pinning
4. Reduce Ceph backfill and recovery operations (see the sketch after this list)
5. Make sure the proper RHEL7 tuned profile is selected (throughput-performance)
6. Can scale osp-compute/ceph-osd nodes (3-42)
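One way to implement item 4, throttling backfill/recovery so client I/O keeps priority (runtime injection; the values are illustrative, not from the deck):

    # Throttle recovery on every OSD; persist the same keys in ceph.conf if desired
    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 3 --osd_recovery_op_priority 3'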
RHV-S
1. Use 10Gbps interfaces and jumbo frames for storage traffic
2. Gluster volumes must be configured with replica 3, features.shard enable, and features.shard-block-size 512MB (see the sketch after this list)
3. Currently a 3-node cluster that CANNOT be scaled(!) Scaling is planned for a future release
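The shard settings from item 2 as gluster CLI calls (volume name vmstore assumed from the earlier slides):

    # Enable sharding on an existing replica-3 volume
    gluster volume set vmstore features.shard enable
    gluster volume set vmstore features.shard-block-size 512MB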
FUTURES

SHORT-TERM
Reduce footprint to 6 nodes for a fully-supported hybrid HCI solution *
* Requires completion of the pxe_ovirt ironic driver implementation (!)
Standardize on SDN (OVN) **
** Already a Tech Preview in RHV 4.1

LONGER-TERM
Containerize OSP services (Kolla, Kubernetes)
Further automate the HCI buildout and configuration using Ansible