Ironic Grenade Blowing up our upgrades
Vlad Drok (vdrok) - Mirantis Vasyl Saienko (vsaienko) - Mirantis John Villalovos (jlvillal) - Intel Corporation
mirantis.com intel.com
How we made Grenade work in Ironic
What is Grenade? Why do we want to use it?
upgrade process between releases.
end-goal of supporting “Rolling Upgrades.”
http://docs.openstack.org/developer/grenade/
works and new patches don’t break cold-upgrades.
Grenade as a plug-in
To use Grenade as a plug-in, a few things are required:
○ The GRENADE_PLUGINRC environment variable must be set up
○ export GRENADE_PLUGINRC="enable_grenade_plugin ironic https://git.openstack.org/openstack/ironic"
Grenade Resource Phases
1. Base devstack setup of current stable branch, start services, and smoke test
2. early_create phase
3. create phase
4. verify pre-upgrade phase
5. State is saved and services shut down
6. verify_noapi phase
7. Upgrade to current master with proposed patch and start services
8. verify post-upgrade phase
9. destroy phase
10. Final smoke test
Networking of Ironic and Devstack
the Ironic Python Agent (IPA) ramdisk during provisioning
Ironic operates with bare-metal servers, but in the gate we use VMs to emulate them. Because of this, the network setup in the gate looks very complex, but only at first glance :)
Phase: base stack.sh
[Diagram: the Neutron integration bridge br-int carries the private network (10.1.0.0/20, tag: 10) with its DHCP (qdhcp) and router (qrouter) namespaces. The baremetal bridge brbm connects the Ironic nodes (KVM VMs node-0 to node-N) through br-node-0 to br-node-N.]
1. Create brbm
2. Create VMs (ironic nodes)
3. Add brbm <----> br-int (ports ovs-tap and brbm-tap)
4. Connect nodes to private net (set tag: 10 on ovs-tap)
5. Run base smoke test
Phase: resources.sh early_create
[Diagram: br-int now carries two networks, each with its own DHCP (qdhcp) and router (qrouter) namespaces: private (10.1.0.0/20, tag: 10) and ironic_grenade (10.2.0.0/20, tag: 20). The Ironic nodes (KVM VMs node-0 to node-N) are attached to brbm through br-node-0 to br-node-N.]
The nodes are moved from private to the ironic_grenade net (set tag: 20 on ovs-tap; before this step ovs-tap carries tag: 10).
Phase: resources.sh create/verify
[Diagram: br-int with private (10.1.0.0/20, tag: 10) and ironic_grenade (10.2.0.0/20, tag: 20). The Ironic nodes (KVM VMs) are attached through brbm, with tag: 20 on ovs-tap.]
Phase: resources.sh shutdown services/verify
[Diagram: br-int with private (10.1.0.0/20, tag: 10) and ironic_grenade (10.2.0.0/20, tag: 20). The Ironic nodes (KVM VMs) are attached through brbm, with tag: 20 on ovs-tap.]
Phase: resources.sh upgrade services/verify
[Diagram: br-int with private (10.1.0.0/20) and ironic_grenade (10.2.0.0/20). The Ironic nodes (KVM VMs) are attached through brbm, with an "Instance" label at node-0 and a "migrations" label on the slide.]
Neutron picked new tags for the networks: tag: 10 → tag: 11 (private), tag: 20 → tag: 21 (ironic_grenade).
The nodes are moved back to ironic_grenade (set tag: 21 on ovs-tap).
Phase: resources.sh destroy
[Diagram: br-int with private (10.1.0.0/20, tag: 11) and ironic_grenade (10.2.0.0/20, tag: 21). The Ironic nodes (KVM VMs) are attached through brbm, with an "Instance" label on the slide.]
The destroy phase removes what was created during resources.sh create.
The nodes are moved back to private (set tag: 11 on ovs-tap).
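The tag changes across the phases can be summarised in a small sketch. The port name ovs-tap and the tag values come from the slides; the helper names are hypothetical, and whether the gate scripts use exactly this `ovs-vsctl set port` invocation is an assumption. The sketch only builds the commands, it does not run them.

```python
# Hypothetical helper: builds, but does not execute, the command that
# would retag the OVS port. The exact command used in the gate is assumed.
def retag_command(tag, port="ovs-tap"):
    """Return the argv that sets a VLAN tag on an Open vSwitch port."""
    return ["ovs-vsctl", "set", "port", port, f"tag={tag}"]


# (phase, network the nodes should be attached to, VLAN tag of that network)
# Tag values are the ones shown on the slides; Neutron may pick others.
TAG_TIMELINE = [
    ("base stack.sh", "private", 10),
    ("resources.sh early_create", "ironic_grenade", 20),
    ("resources.sh upgrade services/verify", "ironic_grenade", 21),
    ("resources.sh destroy", "private", 11),
]


def plan():
    """Pair every phase with the retag command it would need."""
    return [(phase, net, retag_command(tag)) for phase, net, tag in TAG_TIMELINE]


if __name__ == "__main__":
    for phase, net, cmd in plan():
        print(f"{phase}: nodes on {net}: {' '.join(cmd)}")
```

The point of the timeline is that the tag is not stable across the upgrade: Neutron picked new tags after the services were restarted, so the nodes have to be reattached with the new values.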
Ironic hypervisor
Since we cannot create additional bare-metal resources, we have to use what was created during devstack setup. In the case of the grenade job, 7 nodes are created. We have a process called cleaning that happens after the instance deletion request.
[Diagram: the devstack VM and the Ironic nodes (KVM VMs) node-0 to node-N.]
Tests
The resource verification phase uses one node. No single smoke test boots more than three instances, so we should be safe running all of them with concurrency=1. The situation right before the target smoke run might be the following:
[Diagram: Ironic nodes (KVM VMs) in three groups: cleaning after base smoke test, cleaning after resource verify, available.]
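The capacity argument above can be checked with a little arithmetic. The node count, the per-test limit and the single verification node are taken from the slides; the "worst case" (as many nodes still cleaning after the base smoke test as one test can boot) is an assumption made for this sketch, not something the slides state.

```python
TOTAL_NODES = 7            # from the slides: nodes created for the grenade job
MAX_BOOTS_PER_TEST = 3     # from the slides: no smoke test boots more than three
RESOURCE_VERIFY_NODES = 1  # from the slides: resource verification uses one node


def available_nodes(cleaning_after_smoke, cleaning_after_verify):
    """Nodes left for new instances while the others are still cleaning."""
    return TOTAL_NODES - cleaning_after_smoke - cleaning_after_verify


# Assumed worst case: the last base smoke test used its maximum of three
# nodes and the resource verify node is also still being cleaned.
worst_case = available_nodes(MAX_BOOTS_PER_TEST, RESOURCE_VERIFY_NODES)
enough = worst_case >= MAX_BOOTS_PER_TEST

print(worst_case, enough)  # prints: 3 True
```

With concurrency=1 only one test competes for nodes at a time, which is why the comparison is against a single test's maximum.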
Tests
Some smoke tests were skipped or worked around:
After the skips, we have 8 requests to boot an instance. For comparison, in full tempest, an instance is booted 154 times.
Ironic issues
The main one was the lack of versioning of IPA (Ironic Python Agent), which led to a versioning spec proposal [1]. Upgrade testing brought some changes to the devstack plugin too, as some parts of it had to be reused by two different releases of OpenStack. Backward compatibility is important!
[1] https://review.openstack.org/341086
[Diagram: the Ironic conductor calls start_iscsi_target(..., portal_port=2222, wipe_disk_metadata=True) on an old IPA, which fails with TypeError.]
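The failure mode can be reproduced in a few lines of plain Python. Everything here is a stand-in: `old_start_iscsi_target`, its `iqn` parameter and `call_compatibly` are hypothetical names for illustration and are not IPA or Ironic code. Only the `portal_port` and `wipe_disk_metadata` arguments come from the slide. Filtering the arguments by signature is one generic workaround, not what the versioning spec proposes.

```python
import inspect


def old_start_iscsi_target(iqn=None):
    """Stand-in for an older agent command that knows fewer parameters."""
    return {"iqn": iqn}


def call_compatibly(func, **kwargs):
    """Drop keyword arguments the callee does not declare, then call it."""
    accepted = inspect.signature(func).parameters
    return func(**{k: v for k, v in kwargs.items() if k in accepted})


# Arguments a newer caller would send; "iqn" is a made-up example value.
new_style_args = {
    "iqn": "iqn.example",
    "portal_port": 2222,
    "wipe_disk_metadata": True,
}

try:
    # The newer caller passes arguments the older callee has never heard of.
    old_start_iscsi_target(**new_style_args)
except TypeError as exc:
    print("old agent rejected the call:", exc)

# Filtering by signature avoids the crash, at the cost of silently
# ignoring the options the old side cannot honour.
print(call_compatibly(old_start_iscsi_target, **new_style_args))
```

Silently dropping options is rarely what you want for something like disk wiping, which is one reason an explicit version exchange between the two sides is the more robust direction.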
Future work: multinode is reality...
Multinode DevStack networking is even more complex… But a multinode grenade job will help us to:
Review / Call to Action
in the Ocata* cycle!
○ https://etherpad.openstack.org/p/ironic-ocata-multinode-whiteboard
○ Weekly Ironic QA meeting: https://wiki.openstack.org/wiki/Meetings/Ironic-QA
https://etherpad.openstack.org/p/ironic-ocata-summit-grenade-presentation
Grenade phases
Resource phases:
○ Running base stack.sh
○ Running base smoke test
○ Running resource phase: early_create
○ Running resource phase: create
○ Running resource phase: verify
○ Saving current state information (ugly hacks lurk…)
○ Shutting down all services on base devstack...
○ Running resource phase: verify_noapi
○ Preparing the target devstack environment
○ Upgrading services
○ Running upgrade-tempest
○ Running resource phase: verify
○ Dumping new databases
○ Running resource phase: destroy
○ Grenade has completed the pre-programmed upgrade scripts.
○ Target smoke test (final smoke test)
Code pointers
○ Ironic devstack/upgrade/settings file: https://github.com/openstack/ironic/blob/659f951d72e96f39bb967455a6855682e517ca43/devstack/upgrade/settings
○ GRENADE_PLUGINRC openstack-infra/project-config setting: https://github.com/openstack-infra/project-config/blob/589c6c3a9f37de9c278c4c61ac77c2fd666c24e7/jenkins/jobs/ironic.yaml#L1005