SLIDE 1

Deploying a baremetal cloud is hard

Julia Kreger
Open Source Developer Advocate
IBM
Twitter: @ashinclouds
Email: juliaashleykreger@gmail.com
OpenStack Summit Sydney
November 6th, 2017

SLIDE 2

@ashinclouds - Deploying a baremetal cloud is hard - November 6th, 2017

A little about me!

  • Ironic contributor since early 2015.
  • Ironic core and recently elected to the OpenStack Technical Committee.
  • Author of Bifrost, a set of Ansible playbooks for leveraging ironic to deploy baremetal servers.
  • Knows the pain of deploying fleets of servers from many years of experience!
  • Prefers purple bike sheds!

SLIDE 3

So why baremetal?

  • Physical Infrastructure… As in what the cloud is built on.
  • High Performance Computing
  • High Memory
  • Regulatory or Compliance
  • Production-like environments

SLIDE 4

What most imagine deployments are like

SLIDE 5

A deployment is more like...

RDU -> PHL -> RDU -> ATL -> SFO -> SLC -> RDU -> SLC -> SJC -> SFO -> RDU -> ATL -> SLC
-> SEA -> NRT -> LAX -> ATL -> RDU -> AUS -> RDU -> BOS -> RDU -> MSP -> RAP -> MSP ->
ATL -> RDU -> PHL -> ABQ -> DEN -> SLC -> PDX -> SEA -> SJC -> SEA -> PDX -> SMF -> PSP
-> LAS -> ABQ -> SLC -> LAX -> SYD …

SLIDE 6

A few typical realistic steps

  • Unbox
  • Finish base setup, i.e. add cards.
  • Record additional information, such as MACs and WWNs
  • Configure the Baseboard Management Controller
  • Connect all of the cables!
  • Verify all of the cabling!
  • Then begins the burn-in process!
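
The "record additional information" step is where typos tend to creep in, so it can help to normalize addresses as they are captured. The following is a minimal sketch, not part of the original talk: the `ServerRecord` class, its fields, and `normalize_address` are hypothetical names, and it assumes 6-octet MACs and 8-octet WWNs.

```python
import re
from dataclasses import dataclass, field

_HEX_ONLY = re.compile(r"^[0-9a-f]+$")


def normalize_address(raw: str, octets: int) -> str:
    """Return a lowercase, colon-separated address with exactly `octets` octets."""
    # Strip the separators people commonly type: colons, dashes, dots, spaces.
    digits = re.sub(r"[:\-.\s]", "", raw).lower()
    if len(digits) != octets * 2 or not _HEX_ONLY.match(digits):
        raise ValueError(f"expected {octets} hex octets, got {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, len(digits), 2))


@dataclass
class ServerRecord:
    """Hypothetical per-server inventory record captured while unboxing."""
    asset_tag: str
    bmc_address: str
    macs: list = field(default_factory=list)
    wwns: list = field(default_factory=list)

    def add_mac(self, raw: str) -> None:
        # Assumption: Ethernet MACs are 6 octets.
        self.macs.append(normalize_address(raw, 6))

    def add_wwn(self, raw: str) -> None:
        # Assumption: 8-octet WWNs; extended 16-octet forms are not handled.
        self.wwns.append(normalize_address(raw, 8))


record = ServerRecord(asset_tag="rack12-u07", bmc_address="192.0.2.10")
record.add_mac("AA-BB-CC-DD-EE-FF")
record.add_wwn("5001.4380.1234.5678")
```

Rejecting malformed input at capture time is cheaper than discovering a bad MAC when a node fails to PXE boot.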

SLIDE 7

Easy? Right?

SLIDE 8

One would think...

SLIDE 9

Operational constraints are a thing!

What might seem like a simple step could actually take many steps. Depending on the organization, specific processes may vary endlessly.

  • Imagine a ticket per network port!
  • Imagine double-verification of all data!
  • Or mandatory paper checklists per server chassis!

At the end of the day though, self-imposed red tape can slow a deployment to a crawl…

SLIDE 10

Common Headaches

SLIDE 11

Architectural Mandates

Photo credit: torkildr via Foter.com / CC BY-SA

SLIDE 12

Labeling? Lack thereof?

Photo credit: one individual via Foter.com / CC BY-SA

SLIDE 13

Deployment via Human

Photo credit: Jemimus via Foter.com / CC BY

SLIDE 14

Inconsistent Hardware

Photo credit: Julia Kreger - @ashinclouds

SLIDE 15

VLANs, Inside VLANs?

Photo credit: jronaldlee via Foter.com / CC BY

SLIDE 16

Policy, More Humans?

Photo credit: Foter.com

SLIDE 17

Here be Dragons!

SLIDE 18

A bare metal cloud is not Traditional IT!

SLIDE 19

It is a union of self-service and raw infrastructure!

SLIDE 20

It can support traditional workloads, but processes and workflows must adapt!

SLIDE 21

There must be willingness for change!

"Do not meddle in the affairs of dragons for you are crunchy and taste good with ketchup" -- source unknown

SLIDE 22

How to get it right?

SLIDE 23

Step 0: Identify needs, not wants!

SLIDE 24

Treat it as an island, not as an addition.

SLIDE 25

Plan ahead to walk through everything!

At least once!

SLIDE 26

Plan on inconsistencies in hardware!

SLIDE 27

Run current software!

SLIDE 28

Run what the community develops!

Consider NOT running vendor packages!

SLIDE 29

Engage the community!

We don’t read minds!

SLIDE 30

Helpful hints with Ironic

SLIDE 31

Ironic is intended to be Admin-only

SLIDE 32

Nova-initiated and manually deployed baremetal can co-exist!

SLIDE 33

Networking will be the headache!

SLIDE 34

Ironic Networking

Expectations

  • At least one network interface must be registered as a “port”
  • A cleaning network
  • A deployment network
  • Cleaning and deployment networks able to reach the Ironic API
  • A network for the node to live on after deployment

SLIDE 35

Ironic Networking

Three use models

  • “flat” - Preconfigured static networking
  • “neutron” - Dynamic networking through switches controlled by Neutron ML2 drivers
  • “noop” - Short for “no operation”; used by Neutron-less and stand-alone users.
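
As a rough illustration of choosing between the three use models, the sketch below builds the command line for setting a node's network interface. The helper name `network_interface_command` is hypothetical, and it assumes the `openstack baremetal node set --network-interface` option is available in your client version and that the chosen interface is enabled in your Ironic configuration.

```python
# The three use models from the slide, with a one-line reminder of each.
NETWORK_INTERFACES = {
    "flat": "preconfigured static networking",
    "neutron": "dynamic networking via Neutron-controlled switches",
    "noop": "no operation, for Neutron-less and stand-alone use",
}


def network_interface_command(node: str, interface: str) -> list:
    """Build (but do not run) the argv that sets a node's network interface."""
    if interface not in NETWORK_INTERFACES:
        raise ValueError(
            f"unknown network interface {interface!r}; "
            f"expected one of {sorted(NETWORK_INTERFACES)}"
        )
    # Assumption: the client accepts --network-interface on "node set".
    return ["openstack", "baremetal", "node", "set", node,
            "--network-interface", interface]


print(" ".join(network_interface_command("<uuid>", "flat")))
```

Returning an argument list rather than running the command keeps the sketch safe to try; pass the list to `subprocess.run` only once the values have been reviewed.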

SLIDE 36

Ironic Networking

Hardware iPXE support is not always “real”

  • Tools like Wireshark can be extremely useful for troubleshooting.
  • iPXE continues to evolve and add new features.

Some newer hardware will only boot via PXE when in UEFI mode.

SLIDE 37

Hardware drivers are often the next headache!

SLIDE 38

Hardware Drivers

Some hardware needs special drivers! To properly support such hardware, the drivers MUST be in both the “deployment ramdisk” and the “instance image”.

SLIDE 39

Ironic Drivers/Hardware Types

SLIDE 40

Ironic Drivers and Hardware Types

“Classic Drivers”

  • pxe_ipmitool for “iscsi”-based write from the conductor
  • agent_ipmitool for “direct” write to disk via the agent

Hardware Type based drivers

  • Driver set to “ipmi”
  • deploy_interface can now be “iscsi” or “direct”
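
The relationship between the two classic drivers and the “ipmi” hardware type can be written down as a small lookup. This is only a sketch of the mapping described on this slide; the function name `translate_classic_driver` is hypothetical and it covers just the two drivers named here.

```python
# Mapping taken from the slide: both classic drivers become the "ipmi"
# hardware type, differing only in the deploy interface.
CLASSIC_TO_HARDWARE_TYPE = {
    "pxe_ipmitool": {"driver": "ipmi", "deploy_interface": "iscsi"},
    "agent_ipmitool": {"driver": "ipmi", "deploy_interface": "direct"},
}


def translate_classic_driver(name: str) -> dict:
    """Return the hardware-type settings equivalent to a classic driver name."""
    try:
        # Copy so callers cannot mutate the shared table.
        return dict(CLASSIC_TO_HARDWARE_TYPE[name.lower()])
    except KeyError:
        raise ValueError(f"no mapping known for classic driver {name!r}") from None


for classic in CLASSIC_TO_HARDWARE_TYPE:
    print(classic, "->", translate_classic_driver(classic))
```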

SLIDE 41

Ironic’s State Machine

SLIDE 42

Node States

SLIDE 43

What is cleaning?

In production, never turn cleaning off and never turn off a node that is cleaning! Ironic Python Agent (IPA) will utilize shred or ATA Secure Erase to wipe the contents from disks. Custom IPA Hardware Managers can also do things like flash firmware or assert settings.

SLIDE 44

How do I troubleshoot?

SLIDE 45

Troubleshooting Hints

Look at the node “last_error” field: `openstack baremetal node show <uuid>`

If in a “wait” state:

  • Is there connectivity? Heartbeating?
  • Is there a clean_step populated or running?

Deploy failing? Consider the [agent]/deploy_logs_collect setting in ironic.conf.

And feel free to ask us for help in #openstack-ironic on irc.freenode.net
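
The checklist above can be approximated in a few lines once the node record is available as a dictionary, for example from `openstack baremetal node show <uuid> -f json`. This is a minimal sketch under assumptions: the `triage` function is hypothetical, and it relies only on the `last_error`, `provision_state`, and `clean_step` fields being present in that output.

```python
def triage(node: dict) -> list:
    """Return human-readable hints for a node record, following the slide."""
    hints = []

    # First stop: whatever Ironic recorded as the last error.
    if node.get("last_error"):
        hints.append(f"last_error: {node['last_error']}")

    state = node.get("provision_state") or ""
    if "wait" in state:
        hints.append("wait state: check connectivity and agent heartbeats")
        if state == "clean wait":
            if node.get("clean_step"):
                hints.append(f"clean_step populated: {node['clean_step']}")
            else:
                hints.append("no clean_step populated yet")

    if state == "deploy failed":
        hints.append(
            "deploy failed: consider [agent]/deploy_logs_collect in ironic.conf"
        )
    return hints


# Example with a hand-written record standing in for real CLI output.
sample = {"provision_state": "clean wait", "last_error": None, "clean_step": {}}
for hint in triage(sample):
    print(hint)
```

With real output, the record could be loaded using `json.loads` on the command's standard output before being passed to `triage`.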

SLIDE 46

Questions?

SLIDE 47

Thanks!

https://docs.openstack.org/ironic