Containers Infrastructure for Advanced Management, Federico Simoncelli (presentation slides)



slide-1
SLIDE 1

Containers Infrastructure for Advanced Management

Federico Simoncelli Associate Manager, Red Hat October 2016

slide-2
SLIDE 2

About Me

slide-3
SLIDE 3

Kubernetes

  • Decoupling problems to hand out to different teams
      ○ Developers do operations for their application
      ○ Cluster Admins do operations for cluster software
      ○ Kernel and Operating System do operations for nodes
      ○ Hardware operations for clouds
  • Layer of abstraction for Application definition
  • Machines don’t have an identity or a specific function
      ○ “All ...machines are created equal”
  • Developers do not know about Operators’ issues
  • Operators do not know about Applications’ issues
slide-4
SLIDE 4

OpenShift

  • 100% based on and compatible with Kubernetes
  • Kubernetes influencer for new features
      ○ Projects and Namespaces
      ○ Templates
      ○ Routes and Ingress
  • Additional features related to image life-cycle and rolling updates
  • Integrated experience in many areas
      ○ Opinionated metrics and logging solutions
      ○ Developer Web Console

slide-5
SLIDE 5

Application Components Distribution

Traditional and Kubernetes distribution of application components

slide-6
SLIDE 6

New Set Of (Old) Problems for Operators

Diagram (axes: scale, complexity):

  • Dev team. How can we move faster?
  • Dev meets Ops. How do we run at scale?
  • DevOps. Can we turn it into a platform?
  • Production Ops. How do we manage at scale?
  • One developer. How do I containerize?

slide-7
SLIDE 7

Deployment Requirements

  • Standardized and easy to reproduce
      ○ Pick a platform: Atomic vs Traditional
  • Automatic and composable
  • Deploy-and-forget is not enough
  • Maintainable
      ○ Definition of desired state and reconciliation
  • Allow reliable modification of the infrastructure
      ○ Scaling (add and remove nodes)
      ○ Change configurations, etc.
  • Somewhat similar to Kubernetes principles
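
The "desired state and reconciliation" requirement above can be illustrated with a small sketch. It is not taken from any deployment tool: the `ClusterState` model, the action names, and the node names are all hypothetical, and a real tool would run provisioning steps instead of mutating an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    """Hypothetical cluster model: node names plus a flat configuration map."""
    nodes: set = field(default_factory=set)
    config: dict = field(default_factory=dict)


def plan(desired: ClusterState, actual: ClusterState) -> list:
    """Return the ordered actions that move `actual` towards `desired`."""
    actions = []
    for node in sorted(desired.nodes - actual.nodes):
        actions.append(("add_node", node))
    for node in sorted(actual.nodes - desired.nodes):
        actions.append(("remove_node", node))
    for key in sorted(desired.config):
        if actual.config.get(key) != desired.config[key]:
            actions.append(("set_config", key, desired.config[key]))
    return actions


def apply_action(actual: ClusterState, action: tuple) -> None:
    """Apply one action in memory (a real tool would provision or configure)."""
    kind = action[0]
    if kind == "add_node":
        actual.nodes.add(action[1])
    elif kind == "remove_node":
        actual.nodes.discard(action[1])
    elif kind == "set_config":
        actual.config[action[1]] = action[2]


def reconcile(desired: ClusterState, actual: ClusterState) -> list:
    """Apply the full plan and return it; an empty plan means no drift."""
    actions = plan(desired, actual)
    for action in actions:
        apply_action(actual, action)
    return actions
```

Running `reconcile` a second time on the same pair returns an empty list, which is the property that makes a deployment maintainable rather than deploy-and-forget. Configuration keys that exist only in the actual state are left alone in this sketch.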
slide-8
SLIDE 8

Deployment Status

  • Kubernetes
      ○ kube-up based on SaltStack (turning into kube-deploy)
          ■ Mostly for GCE (and Vagrant for development)
      ○ Kargo based on Ansible
      ○ GKE (possible future)
  • OpenShift
      ○ https://github.com/openshift/openshift-ansible
      ○ Supports AWS, GCE, libvirt, OpenStack, Vagrant
  • Containers on OpenStack
      ○ Kubernetes and OpenShift Heat templates
      ○ Magnum container orchestration as first class resources
      ○ https://github.com/redhat-openstack/openshift-on-openstack

slide-9
SLIDE 9

OpenShift-Ansible

  • Actively maintained and feature-rich
  • Based on a healthy Open Source automation project
      ○ Large ecosystem
      ○ Composable with other automations
  • Describe your infrastructure as “inventory”
      ○ Inventory can be versioned and updated
  • Simple interactive installation
      ○ atomic-openshift-installer
  • Advanced installation supporting many advanced features
      ○ Possibly hard to master
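
Because the inventory is a plain text file, two versions of it can be compared to see how the cluster is being re-shaped. The sketch below is only an illustration: it parses a simplified INI-style inventory by hand (it is not Ansible's own parser), it skips `[group:vars]` and `[group:children]` sections, and the host names in the usage note are made up.

```python
def parse_groups(inventory_text: str) -> dict:
    """Map each plain group name to the set of hosts listed under it."""
    groups = {}
    current = None
    for raw in inventory_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # simplification: '#' starts a comment
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            name = line[1:-1]
            # Sections such as [nodes:vars] hold variables, not hosts.
            current = None if ":" in name else name
            if current is not None:
                groups.setdefault(current, set())
            continue
        if current is not None:
            groups[current].add(line.split()[0])  # first token is the host name
    return groups


def diff_inventories(old_text: str, new_text: str) -> dict:
    """Report hosts added to or removed from each group between two versions."""
    old, new = parse_groups(old_text), parse_groups(new_text)
    changes = {}
    for group in sorted(set(old) | set(new)):
        added = sorted(new.get(group, set()) - old.get(group, set()))
        removed = sorted(old.get(group, set()) - new.get(group, set()))
        if added or removed:
            changes[group] = {"added": added, "removed": removed}
    return changes
```

Given an old inventory whose `[nodes]` group lists `node1.example.com` and `node2.example.com`, and a new one that lists `node1.example.com` and `node3.example.com`, `diff_inventories` reports `node3.example.com` as added and `node2.example.com` as removed. This is the kind of change a versioned inventory makes reviewable.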

slide-10
SLIDE 10

Monitoring Objectives

  • Notification of incidents
      ○ Grace period
      ○ Notifications
  • Debug new or unknown issues
      ○ Quickly have at hand the overall status of the cluster
      ○ Easy access to metrics and logging
          ■ Metrics and logging at all levels (infrastructure, etc.)
  • Analyze trends and proactively avoid future incidents
      ○ Scheduled maintenance
      ○ Datacenter hardware upgrades
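
One way to read the "grace period" item above: an incident is only notified once a check has been failing for a minimum amount of time, so short glitches stay silent. The class below is a minimal sketch under that assumption. The name `GracePeriodAlert` and its fields are hypothetical and do not come from any monitoring product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GracePeriodAlert:
    """Notify once per incident, and only after `grace_seconds` of failure."""
    grace_seconds: float
    failing_since: Optional[float] = None
    notified: bool = False

    def observe(self, healthy: bool, now: float) -> bool:
        """Record one check result; return True when a notification is due."""
        if healthy:
            # Recovery resets the incident, so a later failure starts a new one.
            self.failing_since = None
            self.notified = False
            return False
        if self.failing_since is None:
            self.failing_since = now
        if not self.notified and now - self.failing_since >= self.grace_seconds:
            self.notified = True
            return True
        return False
```

Timestamps are passed in explicitly rather than read from the clock, which keeps the behaviour easy to test and leaves the polling loop to the caller.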

slide-11
SLIDE 11

Common Monitoring Architecture

slide-12
SLIDE 12

Monitor Kubernetes-Based Clusters with Heapster

  • Leverage the infrastructure to monitor the same infrastructure
      ○ What if monitoring is failing continuously?
  • Heapster
      ○ Enables Container Cluster Monitoring and Performance Analysis
      ○ Different sinks
  • Autoscaling
      ○ Collected data are then used to autoscale Pods (when configured)
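
To make the autoscaling item concrete, here is a simplified sketch of the proportional rule described in the Kubernetes Horizontal Pod Autoscaler documentation: the target pod count is the summed per-pod utilization divided by the target utilization, rounded up. The function name and parameters are illustrative, and the real controller also applies a tolerance band and rate limits that are left out here.

```python
import math


def desired_replicas(pod_cpu_utilization, target_utilization,
                     min_replicas, max_replicas):
    """Return a replica count from per-pod CPU utilization percentages.

    Simplified proportional rule: ceil(sum(utilization) / target),
    clamped to [min_replicas, max_replicas].
    """
    if target_utilization <= 0:
        raise ValueError("target_utilization must be positive")
    if not pod_cpu_utilization:
        # No metrics collected yet: fall back to the configured minimum.
        return min_replicas
    wanted = math.ceil(sum(pod_cpu_utilization) / target_utilization)
    return max(min_replicas, min(max_replicas, wanted))
```

For example, three pods at 90%, 80% and 100% with a 50% target give `ceil(270 / 50) = 6` replicas.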

slide-13
SLIDE 13

Agile Monitoring

  • Running a data center continuously (24/7) demands more than Metrics collection
  • Contribution to Heapster and cAdvisor is “slow”
  • Integrate additional solutions and technologies
  • Agile addition of new Metrics
      ○ No development involved
  • Monitoring for known issues
      ○ Nodes can self-heal
  • Statistics on most recurring issues
      ○ Identify fragile components or architecture
      ○ Focus development for reliability
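
The "statistics on most recurring issues" item can be sketched as a simple count over recorded incidents. The event shape used here (dictionaries with `component` and `issue` keys) is an assumption made for illustration only.

```python
from collections import Counter


def most_recurring(events, top=3):
    """Return the `top` most frequent (component, issue) pairs with counts."""
    counts = Counter((event["component"], event["issue"]) for event in events)
    return counts.most_common(top)
```

Components that keep appearing at the head of this list are candidates for the reliability work the slide mentions.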

slide-14
SLIDE 14

Application and Infrastructure Monitoring

  • Roles and duties separation (once again)
      ○ Developers should be interested only in metrics and logs of applications
          ■ Developers must see only data of objects they own
      ○ Operators are mostly interested in metrics and logs of the infrastructure (e.g. nodes)
  • Metrics, logging and alerts belong to objects
      ○ Heapster collects metrics per object (node, container, etc.)
  • Security considerations
      ○ Applications and infrastructure in the same data store?
      ○ Is tenancy in the data store enough for you?
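
A minimal sketch of the ownership rule above, assuming every metric sample is tagged with the project that owns its object (infrastructure objects such as nodes carry no project). The `Sample` type, the role names, and the choice to let operators see everything are assumptions for illustration, not a description of how any data store enforces tenancy.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sample:
    kind: str                # e.g. "node", "pod", "container"
    name: str
    project: Optional[str]   # None for infrastructure-level objects
    value: float


def visible_samples(samples, role, projects=()):
    """Filter samples by role: developers see only objects they own."""
    if role == "operator":
        return list(samples)
    if role == "developer":
        owned = set(projects)
        return [s for s in samples if s.project in owned]
    return []  # unknown roles see nothing
```

Filtering at query time like this relies on the data store's tenancy model, which is exactly the trade-off raised under the security considerations.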

slide-15
SLIDE 15

Monitoring Architecture Considerations

  • Reliability and disruptions isolation
  • Scalability of each subsystem
  • Data locality
  • Reuse of existing solutions
  • Security (and isolation of data)
  • Monitoring life-cycle (upgrade and rollback)
  • Cross correlation of multiple clusters and solutions
  • Single technology for Metrics and Logging?
slide-16
SLIDE 16

Direct Monitoring

slide-17
SLIDE 17

Metrics and Logging Federation

slide-18
SLIDE 18

Hawkular and ElasticSearch

  • Open Source solutions for metrics and logging
      ○ Hawkular based on Cassandra
      ○ ElasticSearch based on Lucene
  • Data stores used by many existing projects
  • Technologies of choice for OpenShift
      ○ Work out of the box in OpenShift

  • Hawkular trigger definitions for Alerts
  • Kibana visualization tool for ElasticSearch
slide-19
SLIDE 19

Image and Security

Security assessment

  • How to trust underlying images?
  • How to keep the images safe?
  • How to enforce security policies?

Technologies

  • Signed images
  • OpenSCAP assessment tools
  • Atomic Scan and Blackduck
slide-20
SLIDE 20

Putting It All Together

  • Maintainable deployment solution
      ○ Support cluster re-shaping
      ○ Versionable
  • Monitoring unexpected events and alerts
  • Planning data center evolution over time
  • Ability to monitor and cross-link with the underlying infrastructure
  • Out-Of-The-Box experience
      ○ Knowledge gathered from a community of Operators

slide-21
SLIDE 21

ManageIQ Comprehensive Cloud Management

  • Single-Pane of Glass
      ○ Monitoring
      ○ Management
  • Private and Public All-Around
      ○ VMs, Instances, Containers, Storage, Network
  • Management Framework
      ○ Infrastructure applications

  • Policies and Alerts
  • Reports and Chargeback Reports
  • Automation
  • Capacity Planning
slide-22
SLIDE 22

ManageIQ Project and History

  • Virtualization Management since 2006
  • Acquired by Red Hat in December 2012
  • Open-Sourced in June 2014

  • 7 Technical Leaders
  • 3 Monthly Stable Builds
  • ~50 Core Engineers
  • Nightly Builds
  • ~100 Contributors (and counting)
  • 3 Weeks Sprints
  • 3 Companies Involved
  • 200 Average PR (per Sprint)

slide-23
SLIDE 23

Introducing Containers to ManageIQ 2015 - 2016

  • Inventory collection of major objects
      ○ Nodes, Pods, Services, Replicators, etc.
  • Cross-linking for nodes on known instances
  • Dashboard and Topology
  • Metrics collection from Hawkular
      ○ Utilization aggregation (Project, Service, etc.)
  • Smart-State Analysis
      ○ Collection of image packages

  • OpenSCAP for container images
  • Policies for container objects
  • Chargeback
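
The utilization aggregation and chargeback items can be pictured as a roll-up of per-object usage followed by a rate multiplication. This sketch is not ManageIQ code: the sample fields (`project`, `service`, `cpu_core_hours`) and the flat rate are hypothetical.

```python
from collections import defaultdict


def aggregate(samples, key):
    """Sum CPU core-hours per value of `key` (e.g. "project" or "service")."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample[key]] += sample["cpu_core_hours"]
    return dict(totals)


def chargeback(totals, rate_per_core_hour):
    """Turn aggregated usage into a cost per project or service."""
    return {name: usage * rate_per_core_hour for name, usage in totals.items()}
```

The same samples can be aggregated by project for billing and by service for capacity questions, which is why the grouping key is a parameter.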
slide-24
SLIDE 24
slide-25
SLIDE 25
slide-26
SLIDE 26

ManageIQ Inventory and Relationships

Diagram entities: Service, Container, Pod, Image, Node, Cluster, Instance

slide-27
SLIDE 27
slide-28
SLIDE 28
slide-29
SLIDE 29
slide-30
SLIDE 30
slide-31
SLIDE 31
slide-32
SLIDE 32
slide-33
SLIDE 33

Containers Management in ManageIQ in 2017

Current ongoing efforts for 2017

  • Alerts dashboard and life-cycle
  • Live Metrics and Alerts
      ○ Metrics served by Hawkular to ManageIQ
      ○ Support native Hawkular triggers for Alerts
  • Dynamic Metrics and Alerts
      ○ Custom metrics and alerts on-demand
  • Automation
      ○ Manage and re-provision ManageIQ using Ansible

  • Integration with Logging and ELK stack
slide-34
SLIDE 34

Get Involved!

  • Community http://talk.manageiq.org
  • Code https://github.com/ManageIQ/manageiq providers/containers
  • Documentation http://manageiq.org/documentation
  • Social:
      ○ Twitter @manageiq #manageiq

Federico Simoncelli fsimonce@redhat.com https://twitter.com/simon3z