SLIDE 1

Scaling Nova with CellsV2

The Nova Developer and the CERN Operator perspective

Dan Smith (Red Hat) Belmiro Moreira (CERN)

SLIDE 2

Your deployment probably looks like this:

[Diagram: one API layer, one DB/MQ, many computes]

SLIDE 3

Nova with Cells(v1)

[Diagram: a "nova cells" router, replicating data in Python, sits between the API and the cells]

This special router needs separate code for almost every feature!

SLIDE 4

Native sharding of the contended resources

[Diagram: the API layer on top, with the DB/MQ and computes natively sharded into cells]

SLIDE 5

CellsV2 Services

[Diagram: API, Scheduler, Placement, and a “Super” Conductor at the top level; each cell has its own Conductor and Compute services]
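To make this layout concrete, below is a hedged sketch of the standard nova-manage cell_v2 commands an operator runs to register cell0 and an additional cell; the connection strings, names, and credentials are placeholders, and the flags should be verified against your release's documentation.

    # Hedged sketch: registering cells (placeholder URLs/credentials).
    nova-manage cell_v2 map_cell0 \
        --database_connection mysql+pymysql://nova:secret@db0/nova_cell0
    nova-manage cell_v2 create_cell --name cell1 \
        --transport-url rabbit://nova:secret@mq-cell1:5672/ \
        --database_connection mysql+pymysql://nova:secret@db-cell1/nova
    # Map newly added compute hosts into their cell, then verify:
    nova-manage cell_v2 discover_hosts --verbose
    nova-manage cell_v2 list_cells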

SLIDE 6

Design and Development Tenets

Mainstream

  • CellsV2 should not be opt-in or a different code path
  • Full upstream testing in a reasonably cells-y configuration
  • Cells should be invisible to regular API users

No Python Replication

  • Data should live at either the global or the cell level (not both)
  • Aim for no “unsupported in cells” features

Performance

  • Optimize cross-cell instance-based API operations
  • Introduce caching and fault tolerance as needed

SLIDE 7

Development Challenges

  • Unify two camps of Nova users

○ Those for which CellsV1 will never be a desirable solution
○ Those for which CellsV1 is a necessary evil

  • Must be able to prescribe a transition for both camps

○ Regular operators have minimal tolerance for unnecessary steps and sometimes fewer resources
○ Typical CellsV1 operators often have more resources, but have large existing deployments

  • Major re-architecting of a large amount of Nova internals
  • All of this must happen in parallel with other efforts
  • The world kept changing while we worked on this
SLIDE 8

How’d that go then?

  • Mostly good?

○ Obviously this introduced bugs and churn

  • Some additional operational overhead for regular operators
  • Existing CellsV1 users faced a big transition

○ Deployment assumptions
○ Some of the least-desirable attributes became “features”

  • Resulted in some cleanups and stricter rules around existing Nova code

○ Laid the groundwork for future non-scale-related use cases

SLIDE 9

Status (Rocky)

  • Fully developed and tested in mainstream Nova - there is no “non-cells” deployment arrangement

  • Good multi-cell performance

○ Focus has been on instance operations
○ Some admin-type operations may still need optimizing

  • Some remaining functions fail to work properly in a fully-isolated environment

○ Late affinity check

  • Performance is rapidly improving
  • Fault tolerance is naive but improving
SLIDE 10

What’s next?

  • Cross-cell migrations

○ Further eliminating the restrictions of running with multiple cells

  • Fault tolerance improvements

○ API availability when cells are down
○ Improving quota handling when cells are down
○ Still plenty of room to improve with caching and DB replication

  • Affinity via placement
SLIDE 11
SLIDE 12

[Screenshot: CERN - Cloud resources status board - 06/11/2018 @ 11:26]

SLIDE 13

Cells at CERN

  • CERN has used cells since 2013
  • Why do we use cells?

○ Single endpoint; scale transparently between different Data Centres
○ Availability and resilience
○ Isolate failure domains
○ Dedicate cells to projects
○ Hardware type per cell
○ Easy to introduce new configurations

SLIDE 14

CellsV1 and The Operational Nightmare

  • Unmaintained upstream
  • Only a few deployments use CellsV1
  • Several pieces of functionality missing

○ Flavor propagation
○ No aggregates support
○ No server group support
○ No security groups with nova-network

  • A lot of local patches to make other basic functionality work

○ Examples:

■ Boot more than one instance per request
■ Availability Zones support

  • DBs can get out of sync
  • Upgrade is hard!
SLIDE 15

Journey to CellsV2 at CERN

[Timeline: Grizzly … Newton, Ocata, Pike, Queens (https://www.youtube.com/watch?v=49CFXNIDM3c&t)]

CellsV1 deployed at the CERN Cloud with 2 cells (2013); CellsV2 deployed at the CERN Cloud with 70 cells (2018)

SLIDE 16

Why are we excited about CellsV2?

  • Upstream code
  • All Nova deployments now use cells

○ We are not in the “blackhole” anymore

  • We can finally use Nova's full feature set
  • Promise of sane DBs
  • Rolling upgrades for old CellsV1 users
  • CERN moved quickly to CellsV2

We identified a few interesting issues at scale; most are already fixed in Rocky.

SLIDE 17

HOT Databases

[Diagram: the Nova API servers and the top-cell controllers share the nova_api DB; each cell (CellA through CellZ) has its own controller, compute nodes, nova DB, and RabbitMQ]

SLIDE 18

HOT Databases

  • Cell database activity increased significantly with CellsV2

○ Simple API operations need to connect to all cell DBs, and most of these operations were sequential (see the sketch at the end of this slide)

■ nova list; nova boot

○ Most of the issues are already fixed in Queens/Rocky or in progress

■ For example:

  • https://bugs.launchpad.net/nova/+bug/1771810
  • https://bugs.launchpad.net/nova/+bug/1746558
  • https://bugs.launchpad.net/nova/+bug/1746561

[Chart: number of queries and connections in one cell DB after the Nova Queens upgrade with CellsV2 enabled, while the API was only available to a few users]
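To make the sequential-queries point concrete, here is a minimal Python sketch (illustrative only, not Nova's actual code) of fanning a listing query out to all cell DBs in parallel; query_cell and the cell objects are hypothetical stand-ins.

    # Illustrative sketch, not Nova's implementation: query every cell
    # DB in parallel instead of one at a time.
    from concurrent.futures import ThreadPoolExecutor

    def query_cell(cell):
        # Hypothetical helper: connect to one cell's DB and list its
        # instances.
        return cell.list_instances()

    def list_instances_all_cells(cells):
        # One worker per cell, so a slow cell no longer serializes the
        # whole listing behind every other cell's query.
        with ThreadPoolExecutor(max_workers=max(1, len(cells))) as pool:
            per_cell = list(pool.map(query_cell, cells))
        return [inst for result in per_cell for inst in result]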

SLIDE 19

DB Down! Cloud Down!

  • A fault-tolerant DB solution per cell is recommended by the Nova team

○ Very challenging for CERN considering the number of cells
○ One of the reasons we decided to use cells was failure domains

  • An unavailable cell DB affects the entire cloud

○ Can’t create, list, delete instances…

  • No perfect solution... a few compromises

○ https://review.openstack.org/#/q/topic:bp/handling-down-cell
○ For example:

■ Not all information is available when getting instances

  • nova list; nova show
  • These return a minimal construct built from the information available in the API DB

■ It is not possible to calculate quotas if a project has instances in an unavailable cell

  • Policy: os_compute_api:servers:create:cell_down
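As a hedged illustration, that policy would be set in policy.yaml roughly as below; the rule value here is an assumption for the sketch, not the upstream default, so check the handling-down-cell spec and release notes.

    # Illustrative policy.yaml entry (the "rule:admin_api" value is an
    # assumption): controls who may still create servers when a down
    # cell makes full quota calculation impossible.
    "os_compute_api:servers:create:cell_down": "rule:admin_api"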
SLIDE 20

Scheduling

  • Central Scheduling

○ Filters are not per cell

■ Ex: “PCIPassthroughFilter” runs on every scheduling request because we deploy GPUs in one cell

○ “request-filter” for Placement

■ Allows Placement to be aware of project-to-cell mappings and AVZs
■ Basic filtering in Placement; reduces the number of allocation candidates
■ Uses aggregates and placement aggregates

  • Automatic sync in Rocky

■ https://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#aggregates-in-placement

○ However, we can still get a large number of allocation candidates

■ scheduler/max_placement_results = 10 (see the nova.conf sketch at the end of this slide)

  • Improves scheduling performance
  • But… unveiled some issues… (https://bugs.launchpad.net/nova/+bug/1777591)

○ Live migration with a defined target
○ Rebuild with a new image
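Below is a hedged nova.conf sketch of the knobs discussed on this slide; the option names come from the Rocky-era docs linked above, but verify them against your release before relying on them.

    [scheduler]
    # Request filters: use placement aggregates to pre-filter
    # allocation candidates by tenant mapping and availability zone.
    limit_tenants_to_placement_aggregate = True
    query_placement_for_availability_zone = True
    # Cap how many allocation candidates Placement returns.
    max_placement_results = 10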

SLIDE 21

Miscellaneous

  • Delete "Orphan" request_specs and instance_mappings

○ https://bugs.launchpad.net/nova/+bug/1761198

  • Slow AVZ (availability zone) listing; important for Horizon

○ https://bugs.launchpad.net/nova/+bug/1801897

  • Scheduling time is higher than in CellsV1
  • Don't always expect a consistent state from 5-year-old DBs

○ Deleting aggregate_hosts fails if the service is not available

SLIDE 22

Rocky Upgrade - Nova

  • Control plane upgraded in 1h (Nova API unavailable)

○ VMs (4 vCPUs / 8 GB RAM)
○ Top control plane

■ 16 nova-api
■ 10 nova-conductor; 10 nova-scheduler
■ 10 nova-placement-api

○ 73 cell controllers

■ nova-api; nova-conductor; nova-network

  • DB syncs done the day before
  • upgrade_levels/compute=auto (see the sketch at the end of this slide)

[Chart: number of nova-api requests]
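A hedged sketch of the RPC pin mentioned above, as it would appear in nova.conf:

    [upgrade_levels]
    # Pin compute RPC versions automatically so upgraded controllers
    # keep talking to not-yet-upgraded computes during the rollout.
    compute = auto

The DB syncs done the day before would be the usual schema migrations, typically nova-manage api_db sync followed by nova-manage db sync run against each cell database.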

SLIDE 23

Rocky Upgrade - Nova

  • Compute nodes upgraded during the 24h following the control plane
  • The number of placement requests increased as compute nodes were upgraded
  • Needed to triple the number of placement nodes
  • Impact on VM scheduling time
  • nova-compute (ironic driver) rolled back to Queens!
  • http://lists.openstack.org/pipermail/openstack-dev/2018-November/136251.html
  • https://review.openstack.org/#/c/614886/
SLIDE 24

Summary

CERN Cloud is running Nova Rocky with CellsV2

  • A few issues were found during Queens; most of them are already fixed in Rocky
  • CellsV2 works at scale
  • No more hand-crafted code, as in CellsV1, to get basic functionality
  • Performance is improving
  • Much easier upgrades

Thanks to everyone from the Nova Team!