Ironic Grenade: Blowing up our upgrades



SLIDE 1

Ironic Grenade Blowing up our upgrades

  • Vlad Drok (vdrok), Mirantis
  • Vasyl Saienko (vsaienko), Mirantis
  • John Villalovos (jlvillal), Intel Corporation

mirantis.com intel.com

SLIDE 3

How we made Grenade work in Ironic

  • What is Grenade and Ironic
  • Grenade as a plug-in
  • Grenade phases
  • Networking of Ironic in Devstack and Grenade
  • Grenade testing difficulties for Ironic
  • Future work
SLIDE 4

What is Grenade? Why do we want to use it?

  • Grenade is a test harness to exercise the OpenStack* upgrade process between releases.
  • It allows us to test a “Cold Upgrade” and is a step towards our end goal of supporting “Rolling Upgrades.”
  • Grenade docs: http://docs.openstack.org/developer/grenade/
  • Ironic is a service for bare-metal provisioning.
  • Ironic + Grenade allows us to verify that a cold upgrade works and that new patches don’t break cold upgrades.

SLIDE 5

Grenade as a plug-in

To use Grenade as a plug-in, a few things are required:

  • Set up the devstack/upgrade/settings file in openstack/ironic
  • Create a ‘grenade’ job in openstack-infra/project-config
  • Update openstack-infra/project-config
      ○ The GRENADE_PLUGINRC environment variable must be set
      ○ export GRENADE_PLUGINRC="enable_grenade_plugin ironic https://git.openstack.org/openstack/ironic"
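
As a rough illustration of the first requirement, a plug-in settings file might look like the fragment below. It is a configuration fragment that Grenade sources, not a standalone script: the helpers register_project_for_upgrade, register_db_to_save, and devstack_localrc are provided by Grenade. The base branch name and the list of services are assumptions made for this sketch and are not taken from the slides.

```shell
# devstack/upgrade/settings (sketch; sourced by Grenade, not run directly)

# Tell Grenade that Ironic takes part in the upgrade and that its
# database should be saved.
register_project_for_upgrade ironic
register_db_to_save ironic

# Enable the Ironic devstack plugin on the base (old) and target (new) side.
# ASSUMPTION: the base branch name is illustrative only.
devstack_localrc base enable_plugin ironic https://git.openstack.org/openstack/ironic stable/newton
devstack_localrc target enable_plugin ironic https://git.openstack.org/openstack/ironic

# ASSUMPTION: service names as commonly used by the Ironic devstack plugin.
devstack_localrc base enable_service ir-api ir-cond ironic
devstack_localrc target enable_service ir-api ir-cond ironic
```

The base lines describe the old release that is installed first; the target lines describe the release that Grenade upgrades to.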

SLIDE 6

Grenade Resource Phases

1. Base devstack setup of the current stable branch, start services, and smoke test
2. early_create phase
3. create phase
4. verify phase (pre-upgrade)
5. State is saved and services are shut down
6. verify_noapi phase
7. Upgrade to current master with the proposed patch and start services
8. verify phase (post-upgrade)
9. destroy phase
10. Final smoke test
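
The resource phases can be pictured as one script that is called once per phase, with the phase name as its first argument. The sketch below shows that dispatch pattern only. The function bodies are placeholders written for illustration and are not Ironic's actual resources.sh; run_phase is a hypothetical helper name.

```shell
#!/bin/bash
# Sketch of a resources.sh-style script: it is called once per phase,
# with the phase name as the first argument.
# ASSUMPTION: the function bodies are placeholders, not Ironic's real logic.

early_create() { echo "early_create: create network for test resources"; }
create()       { echo "create: boot an instance"; }
verify()       { echo "verify: ping the instance (API available)"; }
verify_noapi() { echo "verify_noapi: ping the instance (services stopped)"; }
destroy()      { echo "destroy: delete the resources"; }

# Hypothetical helper: call the function named after the phase.
run_phase() {
    case "$1" in
        early_create|create|verify|verify_noapi|destroy)
            "$1"
            ;;
        *)
            echo "unknown phase: $1" >&2
            return 1
            ;;
    esac
}

# Run the requested phase only when a phase name is given on the command line.
if [ "$#" -gt 0 ]; then
    run_phase "$1"
fi
```

Saved as a file, it could be invoked as `bash resources.sh create`. Keeping one function per phase makes it easy to see which checks must work without the API (verify_noapi) and which may call it (verify).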

SLIDE 7

Networking of Ironic and Devstack

  • L3 (Layer 3) access from the ironic-conductor service to the Ironic Python Agent (IPA) ramdisk during provisioning
  • L3 access from the IPA ramdisk to the ironic-api service

Ironic operates on bare-metal servers, but in the gate we use VMs to emulate them. Because of this, the network setup in the gate looks very complex, but only at first glance :)

SLIDE 8

Phase: base stack.sh

[Diagram: the Neutron integration bridge br-int carries the private network (10.1.0.0/20, tag: 10) with its DHCP namespace (qdhcp) and router namespace (qrouter) tap ports. The baremetal bridge brbm connects the Ironic nodes (KVM VMs node-0 to node-N, with br-node-0 to br-node-N and vs-node-0 to vs-node-N) and is linked to br-int through the ovs-tap and brbm-tap ports.]

1. Create brbm
2. Create VMs (ironic nodes)
3. Add brbm <----> br-int (port: ovs-tap, brbm-tap)
4. Connect nodes to private net (set tag: 10 on ovs-tap)
5. Run base smoke test

SLIDE 9

Phase: resources.sh early_create

[Diagram: br-int carries the private network (10.1.0.0/20, tag: 10) and a new network ironic_grenade (10.2.0.0/20, tag: 20), each with DHCP (qdhcp) and router (qrouter) namespaces. The Ironic nodes (KVM VMs) sit behind brbm; the ovs-tap port moves from tag: 10 to tag: 20.]

1. Create network ironic_grenade
2. Connect nodes to ironic_grenade net (set tag: 20 on ovs-tap)

SLIDE 10

Phase: resources.sh create/verify

[Diagram: networks private (10.1.0.0/20, tag: 10) and ironic_grenade (10.2.0.0/20, tag: 20) on br-int; ovs-tap is set to tag: 20, and an instance is running on one of the Ironic nodes (KVM VMs).]

1. Create resources (boot instance)
2. Verify (ping instance)
SLIDE 11

Phase: resources.sh shutdown services/verify

[Diagram: networks private (10.1.0.0/20, tag: 10) and ironic_grenade (10.2.0.0/20, tag: 20) on br-int; ovs-tap stays on tag: 20, and the instance is still running on an Ironic node (KVM VM).]

1. Shut down all services
2. Verify (ping instance)
SLIDE 12

Phase: resources.sh upgrade services/verify

[Diagram: after the restart, the private network (10.1.0.0/20) uses tag: 11 and ironic_grenade (10.2.0.0/20) uses tag: 21 on br-int; the ovs-tap port is updated to tag: 21 so the Ironic nodes (KVM VMs) and the instance are reachable again.]

1. Upgrade services/run db migrations
2. Start services
   Neutron picked new tags for the networks: tag: 10 → tag: 11, tag: 20 → tag: 21
3. Update tag on ovs-tap to connect nodes back to ironic_grenade (tag: 21)
4. Verify (ping instance)
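
The "update tag on ovs-tap" step comes down to a single Open vSwitch command, `ovs-vsctl set port <port> tag=<n>`. The sketch below only builds and prints that command, so it can be tried on a machine without Open vSwitch. The helper name retag_cmd is hypothetical, and how the new tag is discovered is left out, since the slides do not say.

```shell
#!/bin/bash
# Sketch: build the Open vSwitch command that moves a port to a new VLAN tag.
# ASSUMPTION: retag_cmd is a hypothetical helper; it only prints the command
# and does not change any bridge.

retag_cmd() {
    local port="$1" tag="$2"
    # Reject anything that is not a plain non-negative integer.
    case "$tag" in
        ''|*[!0-9]*)
            echo "invalid tag: $tag" >&2
            return 1
            ;;
    esac
    echo "ovs-vsctl set port $port tag=$tag"
}

# Values from the slide: ironic_grenade moved from tag 20 to tag 21.
retag_cmd ovs-tap 21
```

On a real host the printed command would be run with root privileges. The same helper covers the later destroy phase, where the port is moved back to the private network's tag.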
SLIDE 13

Phase: resources.sh destroy

[Diagram: private (10.1.0.0/20, tag: 11) and ironic_grenade (10.2.0.0/20, tag: 21) on br-int; the ovs-tap port moves from tag: 21 back to tag: 11, reconnecting the Ironic nodes (KVM VMs) to the private network.]

1. Destroy resources that were created during resources.sh create
2. Update tag on ovs-tap to connect nodes back to private (set tag: 11)
3. Run target smoke test

SLIDE 14

Ironic hypervisor

Since we cannot create additional bare-metal resources, we have to use what was created during devstack setup. In the case of the grenade job, 7 nodes are created. Ironic also runs a process called cleaning after an instance deletion request, so the node of a deleted instance is not immediately available again.

[Diagram: the devstack VM and the Ironic nodes (KVM VMs node-0 to node-N).]

SLIDE 15

Tests

The resource verification phase uses one node. No smoke test boots more than three instances, so we should be safe running all of them with concurrency=1. The situation right before the target smoke run might be the following:

[Diagram: Ironic nodes (KVM VMs) in three groups: cleaning after base smoke test, cleaning after resource verify, and available.]
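
The node budget behind this reasoning can be checked with simple arithmetic. The sketch below assumes a worst case in which three nodes are still cleaning after the previous smoke test and one node is still cleaning after resource verification. These per-group counts are inferred from the text, not stated on the slide.

```shell
#!/bin/bash
# Sketch: how many nodes are free right before the target smoke test.
# ASSUMPTION: the per-group counts are a worst case inferred from the text
# (three instances per smoke test, one node for resource verification).

total_nodes=7            # nodes created for the grenade job
cleaning_after_smoke=3   # at most three instances booted by one smoke test
cleaning_after_verify=1  # the resource verification phase uses one node
needed_by_next_test=3    # the next smoke test may again boot three instances

available=$((total_nodes - cleaning_after_smoke - cleaning_after_verify))
echo "available=$available needed=$needed_by_next_test"

if [ "$available" -ge "$needed_by_next_test" ]; then
    echo "enough nodes"
else
    echo "not enough nodes"
fi
```

Under these assumptions three nodes remain available, which is just enough for a test that boots three instances. With higher concurrency the same budget would no longer hold.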

SLIDE 16

Tests

Some smoke tests were skipped or worked around:

  • Some features are not supported by Ironic (disk-config)
  • Networking service ports remain in the down state when using Ironic

After the skips, we have 8 requests to boot an instance. For comparison, in full Tempest, an instance is booted 154 times.

SLIDE 17

Ironic issues

The main one was the lack of versioning of IPA (Ironic Python Agent), which led to a versioning spec proposal [1]. Upgrade testing also brought some changes to the devstack plugin, as some parts of it had to be reused by two different releases of OpenStack. Backward compatibility is important!

[1] https://review.openstack.org/341086

[Diagram: the Ironic conductor calls start_iscsi_target(..., portal_port=2222, wipe_disk_metadata=True) on an old IPA, which fails with TypeError.]

SLIDE 18

Future work: multinode is reality...

Multinode DevStack networking is even more complex… But a multinode grenade job will help us to:

  • Test rolling upgrades!
  • Test multi compute/conductor scenarios (takeover)
  • Increase test concurrency
SLIDE 19

Review / Call to Action

  • Help us finish the multi-node grenade (rolling upgrade testing) work for Ironic in the Ocata* cycle!
      ○ https://etherpad.openstack.org/p/ironic-ocata-multinode-whiteboard
      ○ Weekly Ironic QA meeting: https://wiki.openstack.org/wiki/Meetings/Ironic-QA

  • What is Grenade
  • Grenade phases
  • Grenade as a plug-in
  • Networking of Ironic in Devstack and Grenade
  • Grenade testing difficulties for Ironic
  • Future work
SLIDE 20

Thank you Q&A

SLIDE 21

https://etherpad.openstack.org/p/ironic-ocata-summit-grenade-presentation

SLIDE 22

openstack-infra/project-config job-template snippet

- job-template:
    name: '{pipeline}-grenade-dsvm-ironic{job-suffix}'
    node: '{node}'

    - shell: |
        #!/bin/bash -xe
        export PROJECTS="openstack/ironic $PROJECTS"
        export PROJECTS="openstack/ironic-lib $PROJECTS"
        export PROJECTS="openstack/ironic-python-agent $PROJECTS"
        export PROJECTS="openstack/python-ironicclient $PROJECTS"
        export PROJECTS="openstack-dev/grenade $PROJECTS"
        export PYTHONUNBUFFERED=true
        export DEVSTACK_GATE_TEMPEST=1
        export DEVSTACK_GATE_GRENADE=pullup
        export DEVSTACK_GATE_IRONIC=1
        export DEVSTACK_GATE_NEUTRON=1
        export DEVSTACK_GATE_VIRT_DRIVER=ironic
        export TEMPEST_CONCURRENCY=1
        export DEVSTACK_GATE_OS_TEST_TIMEOUT=2400
        export DEVSTACK_GATE_TEMPEST_BAREMETAL_BUILD_TIMEOUT=1200
        export DEVSTACK_GATE_IRONIC_BUILD_RAMDISK=0
        export GRENADE_PLUGINRC="enable_grenade_plugin ironic https://git.openstack.org/openstack/ironic"
        cp devstack-gate/devstack-vm-gate-wrap.sh ./safe-devstack-vm-gate-wrap.sh
        ./safe-devstack-vm-gate-wrap.sh
SLIDE 23

Grenade phases

Resource phases:

1. Running base stack.sh
2. Running base smoke test
3. Running resource phase: early_create
4. Running resource phase: create
5. Running resource phase: verify
6. Saving current state information (ugly hacks lurk…)
7. Shutting down all services on base devstack...
8. Running resource phase: verify_noapi
9. Preparing the target devstack environment
10. Upgrading services
11. Running upgrade-tempest
12. Running resource phase: verify
13. Dumping new databases
14. Running resource phase: destroy
15. Grenade has completed the pre-programmed upgrade scripts.
16. Target smoke test (final smoke test)

SLIDE 24

Code pointers

Ironic devstack/upgrade/settings file:
https://github.com/openstack/ironic/blob/659f951d72e96f39bb967455a6855682e517ca43/devstack/upgrade/settings

GRENADE_PLUGINRC openstack-infra/project-config setting:
https://github.com/openstack-infra/project-config/blob/589c6c3a9f37de9c278c4c61ac77c2fd666c24e7/jenkins/jobs/ironic.yaml#L1005