Stateful Services on DC/OS | Santa Clara, California | April 23rd – 25th, 2018


SLIDE 1

Santa Clara, California | April 23rd – 25th, 2018

Stateful Services on DC/OS

SLIDE 2

Who Am I?

  • Shafique Hassan
  • Solutions Architect @ Mesosphere
  • “Operator”
SLIDE 3

Agenda

  • DC/OS Introduction and Recap
  • Why Stateful Services on DC/OS?
  • Introduction to the DC/OS SDK
  • Demo

○ Deploying a Data Service on DC/OS

  • Wrap-Up and Summary
SLIDE 4

Takeaways for this session

  • Why DC/OS is the best place to run stateful services
  • Introduction to the DC/OS SDK and how you can leverage it to build your own stateful services on DC/OS
SLIDE 5

Why Stateful Services on DC/OS?

SLIDE 6

DC/OS 101

Powered by Apache Mesos

1. Broad workload coverage: Run today’s and tomorrow’s applications, including traditional J2EE, containers, analytics, and ML
2. Application-aware automation: Automate workload-specific operating procedures to run anything “as-a-Service”, from Kubernetes to data services
3. Intelligent resource pooling: Optimize workload density for highest utilization with resource guarantees
4. Unified hybrid cloud operations: Securely manage cloud, datacenter, and edge infrastructures from a single control plane

SLIDE 7

DC/OS Hybrid Cloud

Edge and Multi-Cloud Federation | Business Continuity & Disaster Recovery | Cloud Bursting

  • Minimize footprint at edge or remote infrastructures
  • Consistent operations across clouds
  • Deploy applications to multiple clouds simultaneously
  • Workloads automatically deployed across fault domains (racks or cloud availability zones)
  • Easily add and remove cloud capacity to on-premises clusters

SLIDE 8

The DC/OS Catalog

  • Fast Data and Big Data
  • Scalable datacenter-wide services
  • Open source & partner-supported packages

Mesosphere Enterprise DC/OS

Over 100 Services Made For Enterprise DC/OS

SLIDE 9

Why Run Stateful Services on DC/OS?

1. On-demand provisioning
2. Simplified operations
3. Elastic data infrastructure

  • Single command install of services
  • Runtime software upgrade
  • Runtime application settings update
  • Monitoring & metrics
  • Managed persistent storage volumes
  • Data services and containerized applications share resources
  • Deploy instances with different versions on the same infrastructure fabric
  • Resize instances
  • Add more instances
SLIDE 10

The SMACK Stack

EVENTS: Ubiquitous data streams from connected devices (sensors, devices, clients)

  • FEEDS (Kafka): Ingest millions of events per second
  • ANALYTICS (Spark): Real-time and batch process data
  • STORAGE (Cassandra): Distributed & highly scalable database
  • REACTIVE APP (Akka): Scalable, resilient, data-driven applications

MESOSPHERE DC/OS

  • Integrated set of data services to ingest, analyze, and store streaming data
  • Simple deployment and operations to get your apps to market faster
  • Highly available so you don’t miss a single customer interaction
  • Increased utilization of hardware and cloud resources through workload consolidation

SLIDE 11

DC/OS Summary

DC/OS Approach: Datacenter-cloud as a single computer

[Diagram labels: Data Analytics, Message Queue, Data Persistence, Container Orchestration, Continuous Integration & Delivery, all on one Datacenter-Cloud Operating System]

1. Application-aware automation for complete lifecycle automation of platform services
2. Workload pooling and density optimization for dramatic cost savings
3. Unified hybrid cloud operations with high availability, security, and multi-tenancy

Traditional Approach: Slow, Expensive, Hard

[Diagram labels: Data Analytics Cluster, Message Queue Cluster, Data Persistence Cluster, Container Orchestration Cluster, CI/CD Cluster]

  • Manual & application-specific configurations are slow and difficult to maintain
  • Cluster sprawl and low utilization
  • High risk with unique “snowflake” configurations in cloud or datacenter silos

SLIDE 12

The DC/OS SDK

SLIDE 13

DC/OS SDK

  • A declarative orchestration abstraction for Apache Mesos and DC/OS
  • An Apache Mesos scheduler factory
  • Simplifies the framework development process
  • Current frameworks include MongoDB, Kubernetes, Kafka, Cassandra, Elastic, HDFS, EdgeLB, ZooKeeper, Jenkins, and Spark, with more on the way

SLIDE 14

Components of a Service

  • Mesos
    ○ Foundation of a DC/OS cluster; resource manager
  • ZooKeeper
    ○ SDK schedulers use ZooKeeper as their persistent store across restarts
  • Marathon
    ○ “Init system” of a DC/OS cluster
  • Scheduler
    ○ Management layer of the service; exposes endpoints and maintains the service's nodes
  • Packaging
    ○ Packaging schema for SDK services; defines how options are exposed

SLIDE 15

Mesos Recap: Anatomy of a Resource Offer

[Diagram: Mesos master(s) and three Mesos agents, each agent with 16 CPUs, 128 GB RAM, and 1 TB disk, and each running an executor with tasks.]

  • Agents report their available compute resources
  • Master offers resources to the scheduler
  • Scheduler accepts or declines an offer
  • When a resource offer is accepted, executors/tasks are launched
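
The accept-or-decline step of the offer cycle can be sketched in a few lines of Python. This is a minimal illustration only, not the Mesos scheduler API: the Offer and TaskSpec types, the evaluate and schedule functions, and the broker task sizes are all hypothetical names and values chosen for this example. Only the agent sizes (16 CPUs, 128 GB RAM, 1 TB disk) come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Resources one agent advertises through the master (hypothetical type)."""
    agent: str
    cpus: float
    mem_gb: float
    disk_gb: float


@dataclass(frozen=True)
class TaskSpec:
    """Resources a single task needs (hypothetical type)."""
    name: str
    cpus: float
    mem_gb: float
    disk_gb: float


def evaluate(offer: Offer, task: TaskSpec) -> bool:
    """Return True if the offer is large enough to run the task."""
    return (offer.cpus >= task.cpus
            and offer.mem_gb >= task.mem_gb
            and offer.disk_gb >= task.disk_gb)


def schedule(offers, pending):
    """Accept at most one pending task per offer; decline every other offer."""
    pending = list(pending)
    launched, declined = [], []
    for offer in offers:
        match = next((t for t in pending if evaluate(offer, t)), None)
        if match is None:
            declined.append(offer.agent)
        else:
            pending.remove(match)
            launched.append((match.name, offer.agent))
    return launched, declined


# Three agents sized as on the slide; the two task sizes are made up.
offers = [Offer(f"agent-{i}", cpus=16, mem_gb=128, disk_gb=1000) for i in range(1, 4)]
tasks = [TaskSpec("broker-0", 2, 8, 100), TaskSpec("broker-1", 2, 8, 100)]

launched, declined = schedule(offers, tasks)
print(launched)  # tasks paired with the agent whose offer was accepted
print(declined)  # agents whose offers were declined
```

With two pending tasks and three offers, the first two offers are accepted and the third is declined because nothing is left to launch. A real scheduler would also track reservations and unused remainder resources, which this sketch leaves out.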

SLIDE 16

DC/OS SDK

[Diagram: three layers labeled Services, SDK, and Platform.]

  • Services: Kafka, Cockroach, Spark
  • Group labels: DC/OS, Documentation, Tools and Utilities, Apache Mesos API, Platform Feature Integration
  • Component labels: Finite State Machine, Execution Plans, Automated Recovery, Universe Packaging, App Configuration, Networking & Discovery, Storage, Security, Monitoring, Offer Evaluation, Resource Accounting, Task Reconciliation, Developer Environment, Integration Test Framework, Developer Guide, Tutorials & Code Samples, API Reference, Best Practices

SLIDE 17

DC/OS SDK Features

  • Horizontal scale out
  • Vertical scaling
  • Service discovery
  • Virtual Networks (CNI)
  • Readiness checks
  • Health checks
  • Custom recovery
  • Persistent volumes
  • Resource sets
  • Operator-friendly tools (API)
  • Sidecars
  • Placement constraints
  • Configuration templating
  • Rolling updates (configuration)
  • Rolling upgrades (binaries)
  • GPUs
  • Fine-grained plan control
  • Secrets (EE)
  • Security (EE)
  • TLS provisioning (EE)
SLIDE 18

DC/OS SDK Anatomy

POD: What?
PLAN: How and When?

SLIDE 19

DC/OS SDK Anatomy: Pods

pods:
  kafka:
    count: {{BROKER_COUNT}}
    placement: {{PLACEMENT_CONSTRAINTS}}
    tasks:
      broker:
        cpus: {{BROKER_CPUS}}
        memory: {{BROKER_MEM}}
        goal: RUNNING
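
The double-brace placeholders in the pod definition are filled in before the specification is used. The Python sketch below shows that substitution idea with a plain regular expression. It is an assumption-laden illustration, not the SDK's own rendering code: the render function is hypothetical, and the values for BROKER_COUNT, PLACEMENT_CONSTRAINTS, BROKER_CPUS, and BROKER_MEM are made up for the example.

```python
import re

# Pod definition from the slide, with its placeholders left intact.
TEMPLATE = """\
pods:
  kafka:
    count: {{BROKER_COUNT}}
    placement: {{PLACEMENT_CONSTRAINTS}}
    tasks:
      broker:
        cpus: {{BROKER_CPUS}}
        memory: {{BROKER_MEM}}
        goal: RUNNING
"""


def render(template: str, values: dict) -> str:
    """Replace every {{NAME}} with values[NAME]; fail loudly on a missing name."""
    def substitute(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for {key}")
        return str(values[key])

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Hypothetical values; a real service would take these from its package options.
values = {
    "BROKER_COUNT": 3,
    "PLACEMENT_CONSTRAINTS": "hostname:UNIQUE",
    "BROKER_CPUS": 1,
    "BROKER_MEM": 2048,
}

rendered = render(TEMPLATE, values)
print(rendered)
```

Raising on a missing name, rather than leaving the placeholder in place, is a design choice in this sketch so that an incomplete configuration fails before anything is deployed.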

SLIDE 20

DC/OS SDK Anatomy: Plans

plans:
  deploy:
    strategy: serial
    phases:
      Deployment:
        strategy: {{DEPLOY_STRATEGY}}
        pod: kafka
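
A plan's strategy decides which steps may start at a given moment. The Python sketch below illustrates the difference between a serial strategy, as set on the deploy plan above, and a parallel one. It is a simplified model, not SDK code: the candidates function and the step names are hypothetical, and "parallel" is included for contrast as an assumed alternative value for DEPLOY_STRATEGY.

```python
def candidates(steps, strategy):
    """Return the names of steps that may start now under the given strategy.

    steps is an ordered list of (name, is_complete) pairs.
    """
    incomplete = [name for name, done in steps if not done]
    if strategy == "serial":
        # Only the first unfinished step may run; later steps wait for it.
        return incomplete[:1]
    if strategy == "parallel":
        # Every unfinished step may run at once.
        return incomplete
    raise ValueError(f"unknown strategy: {strategy}")


# Hypothetical phase with one step per pod instance; the first is already done.
steps = [("kafka-0", True), ("kafka-1", False), ("kafka-2", False)]

serial_next = candidates(steps, "serial")
parallel_next = candidates(steps, "parallel")
print(serial_next)
print(parallel_next)
```

Under the serial strategy only kafka-1 is eligible, so instances come up one at a time. Under the parallel strategy kafka-1 and kafka-2 are both eligible.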

SLIDE 21

Why build Stateful Services using the DC/OS SDK?

  • Ease of install: DC/OS UI and DC/OS CLI
  • Persistent storage volumes: DC/OS reservations and persistent storage volumes for data safety and durability.
  • Runtime configuration update: Update configuration during runtime.
  • Runtime software upgrade: Upgrade software during runtime.
  • Fault domain aware placement and data replication: Frameworks automatically provision nodes and intelligently replicate data across fault domains.
  • Monitoring and metrics: Frameworks send metrics to a customer-provided statsd metrics service for health and capacity monitoring.
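
For the monitoring and metrics point, a statsd metric is a short text line, conventionally sent over UDP. The Python sketch below formats a gauge in the common statsd line format and defines a sender. It is an illustration under assumptions: the metric name, the format_gauge and send_metric functions, and the host and port in the comment are hypothetical and not taken from any DC/OS framework.

```python
import socket


def format_gauge(name: str, value) -> str:
    """Build a statsd gauge line of the form 'name:value|g'."""
    return f"{name}:{value}|g"


def send_metric(line: str, host: str, port: int) -> None:
    """Send one metric line to a statsd listener as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric name and value.
line = format_gauge("kafka.broker.disk_used_bytes", 1024)
print(line)

# To emit it, point send_metric at your own statsd endpoint, for example:
# send_metric(line, "127.0.0.1", 8125)
```

UDP is fire-and-forget, so a sender built this way does not block or fail when the metrics service is briefly unavailable, at the cost of possibly dropping samples.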

SLIDE 22

Runtime Configuration Updates

  • Minimize disruption to running services.
  • Detect errors early and roll back.
  • Tight integration with DC/OS.
SLIDE 23

Software and Configuration Updates

Change settings post installation:

  • Runtime update of configuration options
  • Breakpoints for operator inputs
  • Rollback

dcos kafka update start --options=config.json
dcos kafka update status
dcos kafka update pause
dcos kafka update resume
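
The update is driven by an options file. The Python sketch below writes a config.json and assembles the update start command shown above as an argument list. It is a sketch under assumptions: the option keys in the options dictionary are hypothetical and depend on the package being updated, and write_options and update_command are helper names invented for this example. The command is only printed, not executed.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical option keys; check the package's own option schema before use.
options = {"brokers": {"count": 4}}


def write_options(options: dict, path: Path) -> Path:
    """Serialize the options as JSON so the CLI can read them from a file."""
    path.write_text(json.dumps(options, indent=2) + "\n")
    return path


def update_command(service: str, options_file: str) -> list:
    """Build the argument list for the update start command from the slide."""
    return ["dcos", service, "update", "start", f"--options={options_file}"]


with tempfile.TemporaryDirectory() as workdir:
    options_path = write_options(options, Path(workdir) / "config.json")
    loaded = json.loads(options_path.read_text())  # read back to verify the file
    cmd = update_command("kafka", options_path.name)

print(" ".join(cmd))
```

Keeping the command as a list makes it safe to hand to a process runner later without shell quoting issues.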

SLIDE 24

Demo

SLIDE 25

Summary

  • DC/OS presents a great option to run any application “as-a-Service” on any infrastructure
  • The DC/OS SDK allows technologies to be run as stateful services on DC/OS with reduced operational complexity and increased agility

SLIDE 26

Thank You!

SLIDE 27

Resources

Documentation for data frameworks on DC/OS
https://docs.mesosphere.com/services/

SDK
https://github.com/mesosphere/dcos-commons
https://mesosphere.github.io/dcos-commons/developer-guide/
https://docs.mesosphere.com/services/ops-guide/

SLIDE 28

Rate My Session

SLIDE 29

The “Operator” and “Developer”

The “Operator”

  • Operates the platform
  • IaaS, PaaS or XaaS
  • Responsible for keeping the lights on and effective utilization of infrastructure

The “Developer”

  • And here
  • And maybe even here