
Containers Infrastructure for Advanced Management



  1. Containers Infrastructure for Advanced Management
     Federico Simoncelli, Associate Manager, Red Hat
     October 2016

  2. About Me

  3. Kubernetes
     ● Decoupling problems to hand out to different teams
       ○ Developers do operations for their application
       ○ Cluster Admins do operations for cluster software
       ○ Kernel and Operating System do operations for nodes
       ○ Hardware operations for clouds
     ● Layer of abstraction for Application definition
     ● Machines don’t have an identity or a specific function
       ○ “All ...machines are created equal”
     ● Developers do not know about Operators issues
     ● Operators do not know about Applications issues

  4. OpenShift
     ● 100% based on and compatible with Kubernetes
     ● Kubernetes influencer for new features
       ○ Projects and Namespaces
       ○ Templates
       ○ Routes and Ingress
     ● Additional features related to image life-cycle and rolling updates
     ● Integrated experience in many areas
       ○ Opinionated metrics and logging solutions
       ○ Developer Web Console

  5. Application Components Distribution
     [Diagram: Traditional and Kubernetes distribution of application components]

  6. New Set Of (Old) Problems for Operators
     [Diagram: stages of growth along the SCALE and COMPLEXITY axes]
     ● One developer: How do I containerize?
     ● Dev team: How can we move faster?
     ● Dev meets Ops: How do we run at scale?
     ● DevOps: Can we turn it into a platform?
     ● Production Ops: How do we manage at scale?

  7. Deployment Requirements
     ● Standardized and easy to reproduce
       ○ Pick a platform: Atomic vs Traditional
     ● Automatic and composable
     ● Deploy-and-forget is not enough
     ● Maintainable
       ○ Definition of desired state and reconciliation
     ● Allow reliable modification of the infrastructure
       ○ Scaling (add and remove nodes)
       ○ Change configurations, etc.
     ● Somewhat similar to Kubernetes principles
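
The "desired state and reconciliation" requirement on this slide can be illustrated with a minimal sketch. It is not taken from the talk or from any deployment tool: the `Cluster` class, `plan` and `reconcile` are hypothetical names, and the cluster is only an in-memory set of node names.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical in-memory stand-in for the observed state of a cluster."""
    nodes: set = field(default_factory=set)


def plan(desired_nodes, cluster):
    """Diff desired state against observed state and list the actions needed."""
    to_add = sorted(desired_nodes - cluster.nodes)
    to_remove = sorted(cluster.nodes - desired_nodes)
    return [("add", n) for n in to_add] + [("remove", n) for n in to_remove]


def reconcile(desired_nodes, cluster):
    """Apply the plan and return the actions that were carried out."""
    actions = plan(desired_nodes, cluster)
    for verb, node in actions:
        if verb == "add":
            cluster.nodes.add(node)
        else:
            cluster.nodes.discard(node)
    return actions


cluster = Cluster(nodes={"node-1", "node-2"})
desired = {"node-2", "node-3"}
first_run = reconcile(desired, cluster)   # scales node-3 in, node-1 out
second_run = reconcile(desired, cluster)  # already converged: no actions
```

Running the same reconciliation twice is safe because the second pass finds nothing to change, which is the property that separates this approach from deploy-and-forget.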

  8. Deployment Status
     ● Kubernetes
       ○ kube-up based on SaltStack (turning into kube-deploy)
         ■ Mostly for GCE (and Vagrant for development)
       ○ Kargo based on Ansible
       ○ GKE (possible future)
     ● OpenShift
       ○ https://github.com/openshift/openshift-ansible
       ○ Supports AWS, GCE, libvirt, OpenStack, Vagrant
     ● Containers on OpenStack
       ○ Kubernetes and OpenShift Heat templates
       ○ Magnum container orchestration as first class resources
       ○ https://github.com/redhat-openstack/openshift-on-openstack

  9. OpenShift-Ansible
     ● Actively maintained and feature-rich
     ● Based on a healthy Open Source automation project
       ○ Large ecosystem
       ○ Composable with other automations
     ● Describe your infrastructure as “inventory”
       ○ Inventory can be versioned and updated
     ● Simple interactive installation
       ○ atomic-openshift-installer
     ● Advanced installation supporting many advanced features
       ○ Possibly hard to master

  10. Monitoring Objectives
     ● Notification of incidents
       ○ Grace period
       ○ Notifications
     ● Debug new or unknown issues
       ○ Quickly have at hand the overall status of the cluster
       ○ Easy access to metrics and logging
         ■ Metrics and logging at all levels (infrastructure, etc.)
     ● Analyze trending and proactively avoid future incidents
       ○ Scheduled maintenance
       ○ Datacenter Hardware upgrades
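
One way to read the "grace period" item is that a notification fires only when a condition has held for a minimum time, so that short spikes do not page anyone. The sketch below assumes that reading; `should_notify` and its parameters are hypothetical names, not part of any monitoring tool mentioned in the talk.

```python
def should_notify(samples, threshold, grace_period):
    """Return True if the value stayed above threshold for at least grace_period seconds.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    """
    breach_start = None
    for ts, value in samples:
        if value > threshold:
            if breach_start is None:
                breach_start = ts  # the breach begins at this sample
            if ts - breach_start >= grace_period:
                return True
        else:
            breach_start = None  # recovery resets the grace period
    return False


short_spike = [(0, 50), (10, 95), (20, 50)]
sustained = [(0, 95), (10, 96), (20, 97), (30, 98)]
spike_alert = should_notify(short_spike, threshold=90, grace_period=30)
sustained_alert = should_notify(sustained, threshold=90, grace_period=30)
```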

  11. Common Monitoring Architecture

  12. Monitor Kubernetes-Based Clusters with Heapster
     ● Leverage the infrastructure to monitor the same infrastructure
       ○ What if monitoring is failing continuously?
     ● Heapster
       ○ Enables Container Cluster Monitoring and Performance Analysis
       ○ Different sinks
     ● Autoscaling
       ○ Collected data are then used to autoscale Pods (when configured)
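
The autoscaling item can be made concrete with the scaling rule documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies only that rule plus min/max clamping. The function name and the default bounds are assumptions, and the real controller has further behavior (such as a tolerance band) that is left out here.

```python
import math


def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count proportionally to how far the metric is from its target."""
    raw = math.ceil(current_replicas * current_value / target_value)
    # Clamp to the configured bounds (the defaults are illustrative assumptions).
    return max(min_replicas, min(max_replicas, raw))


scale_up = desired_replicas(4, current_value=90, target_value=60)
scale_down = desired_replicas(4, current_value=30, target_value=60)
capped = desired_replicas(4, current_value=200, target_value=50)
```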

  13. Agile Monitoring
     ● Running a data center continuously, 24/7, demands more than Metrics collection
     ● Contribution to Heapster and cAdvisor is “slow”
     ● Integrate additional solutions and technologies
     ● Agile addition of new Metrics
       ○ No development involved
     ● Monitoring for known issues
       ○ Nodes can self-heal
     ● Statistics on most recurring issues
       ○ Identify fragile components or architecture
       ○ Focus development for reliability

  14. Application and Infrastructure Monitoring
     ● Roles and duties separation (once again)
       ○ Developers should be interested only in metrics and logs of applications
         ■ Developers must see only data of objects they own
       ○ Operators are mostly interested in metrics and logs of the infrastructure (e.g. nodes)
     ● Metrics, logging and alerts belong to objects
       ○ Heapster collects metrics per object (node, container, etc.)
     ● Security considerations
       ○ Applications and infrastructure in the same data store?
       ○ Is tenancy in the data store enough for you?
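
The separation of duties on this slide can be sketched as a filter over per-object samples. Everything here is hypothetical: the `Sample` record, the convention that node-level data has no namespace, and the two view functions are illustrative assumptions, not how Heapster or Hawkular store or authorize data.

```python
from collections import namedtuple

# Hypothetical sample record; namespace is None for node-level (infrastructure) data.
Sample = namedtuple("Sample", ["namespace", "obj", "metric", "value"])


def visible_to_developer(samples, owned_namespaces):
    """Developers see only data of objects in namespaces they own."""
    return [s for s in samples if s.namespace in owned_namespaces]


def visible_to_operator(samples):
    """Operators see infrastructure-level data (here: samples without a namespace)."""
    return [s for s in samples if s.namespace is None]


samples = [
    Sample("shop", "pod/web-1", "cpu", 0.5),
    Sample("blog", "pod/db-1", "cpu", 0.25),
    Sample(None, "node/node-1", "cpu", 2.0),
]
dev_view = visible_to_developer(samples, {"shop"})
ops_view = visible_to_operator(samples)
```

A filter like this only helps if it is enforced at the data store, which is the question the slide raises about tenancy.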

  15. Monitoring Architecture Considerations
     ● Reliability and disruptions isolation
     ● Scalability of each subsystem
     ● Data locality
     ● Reuse of existing solutions
     ● Security (and isolation of data)
     ● Monitoring life-cycle (upgrade and rollback)
     ● Cross correlation of multiple clusters and solutions
     ● Single technology for Metrics and Logging?

  16. Direct Monitoring

  17. Metrics and Logging Federation

  18. Hawkular and ElasticSearch
     ● Open Source solutions for metrics and logging
       ○ Hawkular based on Cassandra
       ○ ElasticSearch based on Lucene
     ● Data stores used by many existing projects
     ● Technologies of choice for OpenShift
       ○ Work out of the box in OpenShift
     ● Hawkular trigger definitions for Alerts
     ● Kibana visualization tool for ElasticSearch

  19. Image and Security
     Security assessment
     ● How to trust underlying images?
     ● How to keep the images safe?
     ● How to enforce security policies?
     Technologies
     ● Signed images
     ● OpenSCAP assessment tools
     ● Atomic Scan and Blackduck

  20. Putting It All Together
     ● Maintainable deployment solution
       ○ Support cluster re-shaping
       ○ Versionable
     ● Monitoring unexpected events and alerts
     ● Planning data center evolution over time
     ● Ability to monitor and cross-link with the underlying infrastructure
     ● Out-Of-The-Box experience
       ○ Knowledge gathered from a community of Operators

  21. ManageIQ Comprehensive Cloud Management
     ● Single-Pane of Glass
       ○ Monitoring
       ○ Management
     ● Private and Public All-Around
       ○ VMs, Instances, Containers, Storage, Network
     ● Management Framework
       ○ Infrastructure applications
     ● Policies and Alerts
     ● Reports and Chargeback Reports
     ● Automation
     ● Capacity Planning

  22. ManageIQ Project and History
     ● Virtualization Management since 2006
     ● Acquired by Red Hat in December 2012
     ● Open-Sourced in June 2014
     Project figures:
     ● 7 Technical Leaders
     ● ~50 Core Engineers
     ● ~100 Contributors (and counting)
     ● 3 Companies Involved
     ● 3 Monthly Stable Builds
     ● Nightly Builds
     ● 3 Weeks Sprints
     ● 200 Average PR (per Sprint)

  23. Introducing Containers to ManageIQ
     2015 - 2016
     ● Inventory collection of major objects
       ○ Nodes, Pods, Services, Replicators, etc.
     ● Cross-linking for nodes on known instances
     ● Dashboard and Topology
     ● Metrics collection from Hawkular
       ○ Utilization aggregation (Project, Service, etc.)
     ● Smart-State Analysis
       ○ Collection of image packages
     ● OpenSCAP for container images
     ● Policies for container objects
     ● Chargeback
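
"Utilization aggregation (Project, Service, etc.)" amounts to rolling per-pod metrics up to a parent object. The sketch below shows that roll-up on plain dictionaries. The record layout, the `cpu_cores` field and `aggregate_by` are hypothetical and do not reflect ManageIQ's or Hawkular's actual data model.

```python
from collections import defaultdict


def aggregate_by(samples, key, metric="cpu_cores"):
    """Sum a per-pod metric over all pods that share the same parent (e.g. project)."""
    totals = defaultdict(float)
    for sample in samples:
        totals[sample[key]] += sample[metric]
    return dict(totals)


# Hypothetical per-pod utilization records.
pod_samples = [
    {"pod": "web-1", "project": "shop", "service": "web", "cpu_cores": 0.5},
    {"pod": "web-2", "project": "shop", "service": "web", "cpu_cores": 0.25},
    {"pod": "db-1", "project": "shop", "service": "db", "cpu_cores": 1.0},
    {"pod": "blog-1", "project": "blog", "service": "blog", "cpu_cores": 0.5},
]
by_project = aggregate_by(pod_samples, "project")
by_service = aggregate_by(pod_samples, "service")
```

The same per-pod data feeds both views, which is why aggregated utilization can also serve as an input for chargeback reports.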

  24. ManageIQ Inventory and Relationships
     [Diagram: relationships between Service, Pod, Container, Image, Cluster, Node, Instance]

  25. Containers Management in ManageIQ in 2017
     Current ongoing efforts for 2017
     ● Alerts dashboard and life-cycle
     ● Live Metrics and Alerts
       ○ Metrics served by Hawkular to ManageIQ
       ○ Support native Hawkular triggers for Alerts
     ● Dynamic Metrics and Alerts
       ○ Custom metrics and alerts on-demand
     ● Automation
       ○ Manage and re-provision ManageIQ using Ansible
     ● Integration with Logging and ELK stack

  26. Get Involved!
     ● Community: http://talk.manageiq.org
     ● Code: https://github.com/ManageIQ/manageiq providers/containers
     ● Documentation: http://manageiq.org/documentation
     ● Social:
       ○ Twitter @manageiq #manageiq
     Federico Simoncelli
     fsimonce@redhat.com
     https://twitter.com/simon3z
