An Early Adopters Story About SUSE Cloud Application Platform - - PowerPoint PPT Presentation




slide-1
SLIDE 1

An Early Adopters Story About SUSE Cloud Application Platform

Adfinis SyGroup - Switzerland

slide-2
SLIDE 2

2

Nicolas Christener

CEO/CTO Adfinis SyGroup

nicolas.christener@adfinis-sygroup.ch twitter.com/nikslor linkedin.com/in/christener

slide-3
SLIDE 3

3

Lucas Bickel

Developer @ Adfinis SyGroup

OSS dev by night

lucas.bickel@adfinis-sygroup.ch twitter.com/hairmare

slide-4
SLIDE 4

4

About Adfinis SyGroup

  • Since 2000
  • Berne, Basel, Zürich & Lausanne
  • Over 55 employees
  • 100% Open Source
  • Broad customer base

slide-5
SLIDE 5

5

Our Services

  • Engineering
  • Managed Services
  • DevOps
  • Development

slide-6
SLIDE 6

6

Our Partners

slide-7
SLIDE 7

7

Switzerland

slide-8
SLIDE 8

8

Stereotypical Swiss Icons

slide-9
SLIDE 9

9

Also Swiss

OS4: Early Free Flying Quadcopter at a Swiss Federal Institute of Technology

  • Collision avoidance
  • Autonomous flight
  • Flight planning
  • Helped kickstart the current drone craze

slide-10
SLIDE 10

10

Also Swiss

VS-Code: Source-code Editor by MSFT

  • Most popular developer environment
  • Built in Zürich
  • By Erich Gamma and team
  • Open source made in Switzerland
slide-11
SLIDE 11

11

Not Swiss

Dedicated Regions by Public Cloud Providers

  • Some countries have tailor-made solutions for their government customers - none in Switzerland so far
  • Governance, compliance, and data protection regulations mandate a custom-built solution for Swiss government customers
  • We took the opportunity to build a solution in cooperation with SUSE
slide-12
SLIDE 12

12

Let's talk about CAP baby, let's talk about you and me, …, let's talk about CAP

slide-13
SLIDE 13

13

From Docker to Cloud Foundry

  • Docker introduced the masses to containers
  • Container workloads require a container orchestration solution
  • Kubernetes (K8s) is the de facto container orchestrator
  • Cloud Foundry (CF) development started even earlier
  • SUSE is bringing Cloud Foundry to the K8s ecosystem
slide-14
SLIDE 14

14

Container 101

Diagram: virtual machines (each VM carries its App, Bins/Libs and a Guest OS on top of Hypervisor, Host OS and Bare Metal) compared with containers (each container carries its App and Bins/Libs on top of Container Engine, Host OS and Bare Metal)

slide-15
SLIDE 15

15

Kubernetes 101

Diagram: one Master and three Nodes; each Node runs a Container Engine hosting three Pods with one Container each

slide-16
SLIDE 16

16

CloudFoundry 101 (Eirini Style)

Diagram: code → cf push → running app in a Pod container on Kubernetes

slide-17
SLIDE 17

17

Why is this so cool?

slide-18
SLIDE 18

18

Spoiler alert: SUSE Cloud Application Platform (CAP) is the simplest Cloud Foundry distribution available

slide-19
SLIDE 19

19

A Lightweight Cloud Foundry

Pivotal Cloud Foundry (PCF)

31 Nodes, 43 vCPUs, 122 GB RAM

http://pcfsizer.pivotal.io/#!/sizing/azure/2.2/small

Cloud Application Platform

11 Nodes, 22 vCPUs, 88 GB RAM

Small setup on Azure using AKS
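The footprint difference can be put into numbers. The following is a minimal sketch that uses only the figures quoted on this slide; the variable and function names are illustrative.

```python
# Footprints quoted on the slide (small setups on Azure).
pcf = {"nodes": 31, "vcpus": 43, "ram_gb": 122}
cap = {"nodes": 11, "vcpus": 22, "ram_gb": 88}


def reduction_percent(before: int, after: int) -> float:
    """Return how much smaller `after` is than `before`, in percent."""
    return round((before - after) / before * 100, 1)


# Relative savings of CAP compared with PCF, per resource type.
savings = {key: reduction_percent(pcf[key], cap[key]) for key in pcf}

for key, value in savings.items():
    print(f"{key}: {value}% less than PCF")
```

In other words, roughly two thirds fewer nodes, about half the vCPUs, and a bit more than a quarter less RAM.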

slide-20
SLIDE 20

20

A Portable PaaS Solution

SUSE Cloud Application Platform

  • Officially supports different K8s stacks
  • SUSE CaaSP
  • Amazon EKS
  • Microsoft AKS
  • Google GKE

This is not the case for other PaaS solutions

  • OpenShift is not supported on other K8s platforms
  • Pivotal does not support plain K8s at the moment

slide-21
SLIDE 21

21

A Developer Centric Solution

Cloud Foundry style

  • Developers can use "cf push" and focus on code
  • Sensible amount of settings
  • Open Service Broker API was born in the CF world
  • Perfect for Spring Boot devs

Kubernetes style

  • Developers need to mess with "kubectl", "s2i", and friends
  • Large amount of settings
  • Open Service Broker API is still an alien
  • Generic infrastructure solution
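To illustrate the "sensible amount of settings" on the Cloud Foundry side, here is a minimal sketch of an application manifest built in Python. The app name and sizing values are placeholders, not taken from the project; `name`, `memory` and `instances` are standard CF manifest attributes. The manifest is serialized as JSON, which YAML parsers generally accept.

```python
import json

# Hypothetical app: "my-app" and the sizing values are placeholders.
manifest = {
    "applications": [
        {"name": "my-app", "memory": "256M", "instances": 2}
    ]
}


def render_manifest(data: dict) -> str:
    """Serialize the manifest as JSON (JSON is a subset of YAML 1.2)."""
    return json.dumps(data, indent=2)


manifest_text = render_manifest(manifest)
print(manifest_text)
```

Saved as `manifest.yml` next to the code, a file like this is what "cf push" reads by default; an equivalent plain K8s deployment typically needs several separate objects.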
slide-22
SLIDE 22

22

If one focuses on developer agility, Cloud Foundry (CF) is the answer!

slide-23
SLIDE 23

23

A Story About a Somewhat Bigger Project

slide-24
SLIDE 24

24

Let the Developers do Their Magic!

  • Teams need a platform to deploy their tailor-made applications
  • Time to market is a key factor in enabling customers' success!
  • DevOps mindset requires access to self-service capabilities
slide-25
SLIDE 25

25

More Details About the Project

  • Swiss organizations tend to run their own infrastructure
  • "No one has access to my data" was the #1 strategy for many Swiss services - e.g. banking, pharmaceutical and other industries

  • Swiss Government wants to keep the data in Switzerland as well
  • Broad acceptance of public cloud offerings will not happen soon
slide-26
SLIDE 26

26

Goals

  • Provide a PaaS for Swiss federal offices
  • Integrate provided services into existing service catalog
  • Cloud-like billing, connected to existing SAP landscape
  • We need to separate tenants physically in some cases
  • Make developers and operators happy by building an awesome platform
slide-27
SLIDE 27

27

Lovely Details Worth Mentioning

  • Direct contact with SUSE product management allows influencing the future of the platform
  • The shift to “Open Source first, upstream first” by SUSE was done at the right time for us. It enables us to help fix documentation for our customer at the proper upstream venues!
  • We have the opportunity to help lay the foundation for the future of government cloud computing in Switzerland

slide-28
SLIDE 28

28

Use Cases

slide-29
SLIDE 29

29

Use Cases - Goals

  • End-user self-service portal for non-dev customers
  • Allow automating all the things
  • Offer various in-cluster services
  • Allow users to consume standard services w/o deploying to the cluster
  • Everything needs to be billable
slide-30
SLIDE 30

30

Use Cases - User Scopes

End User

  • Can order services and pay for them
  • Does not have any escalated privileges
  • Uses a well-integrated self-service portal (HP-OO based)
  • Time to market and elasticity are crucial

“Internal” User

  • May also be end user
  • Develops & deploys apps using CF API
  • Automates everything from development to deployment & day two operations
  • May also use space on the development environment to learn about the platform

Operator

  • Maintains the cluster
  • Has full RBAC access

Platform Owner

  • Owns government cloud strategy
  • Responsible for platform & dev lifecycle mgmt
  • Manages commercial & financing aspects of the platform

slide-31
SLIDE 31

31

Use Cases - Solutions

  • Service offerings exposed via Open Service Broker API
  • High-order PaaS features from CAP exposed to users through CF API
  • HP Operations Orchestration (HP-OO) integration as self-service portal
  • Billing integration done by customer

In the future we plan on standardizing on cf-abacus if it reaches general availability
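Exposing service offerings through the Open Service Broker API means each broker publishes a catalog of services and plans. The sketch below checks a catalog document for the fields the OSB specification marks as required; the example catalog contents are invented for illustration and do not describe the project's actual offerings.

```python
# Fields the Open Service Broker API spec requires per service and per plan.
REQUIRED_SERVICE_FIELDS = ("id", "name", "description", "bindable", "plans")
REQUIRED_PLAN_FIELDS = ("id", "name", "description")


def catalog_problems(catalog: dict) -> list:
    """Return human-readable messages for missing required fields."""
    problems = []
    for service in catalog.get("services", []):
        label = service.get("name", "<unnamed>")
        for field in REQUIRED_SERVICE_FIELDS:
            if field not in service:
                problems.append(f"service {label}: missing {field}")
        for plan in service.get("plans", []):
            for field in REQUIRED_PLAN_FIELDS:
                if field not in plan:
                    problems.append(f"service {label}: plan missing {field}")
    return problems


# Hypothetical catalog entry; ids and names are placeholders.
example_catalog = {
    "services": [
        {
            "id": "svc-0001",
            "name": "example-database",
            "description": "Illustrative database offering",
            "bindable": True,
            "plans": [{"id": "plan-0001", "name": "small"}],
        }
    ]
}

print(catalog_problems(example_catalog))
```

Here the only reported problem is the plan's missing `description`.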

slide-32
SLIDE 32

32

Lessons Learned

slide-33
SLIDE 33

33

Integrating into an Existing Environment

slide-34
SLIDE 34

34

Existing Architecture

Compute

  • Hewlett Packard
  • On premise

Storage

  • NetApp NFS storage
  • Cluster on NAS level

Network

  • Edge: F5 Big-IP
  • LAN: Cisco
slide-35
SLIDE 35

35

Architecture Insights

  • Deployment on bare-metal
  • Reuse existing hardware
  • Software defined ... where possible
slide-36
SLIDE 36

36

Compute

slide-37
SLIDE 37

37

Compute - Goals

  • Compute is mostly a no-brainer however…
  • Spectre/Meltdown/etc. led to the "separate physically" requirement

Customers with specific security demands get their own CaaSP / bare-metal setup

  • We want to automate Velum (admin) node installs

CaaSP + CAP installation should be automated as much as possible

  • Automation means faster time to market
slide-38
SLIDE 38

38

Compute - Reality

  • Installation of a CaaSP cluster not fully automated out of the box

Velum node is usually installed manually

new Kubernetes nodes need to be assigned a system role

  • Manual installation is not an option for a service provider
  • With automation we can tick the "reproducible" checkbox
slide-39
SLIDE 39

39

Compute - Solutions

  • Complete CaaSP automation is achievable

Had to create some AutoYaST + cloud-init configuration

  • Integrated pipeline to set up the rest done by customer

Does other things like set up billing, backup, etc.

  • We strive to make documentation & code available so others can use it
slide-40
SLIDE 40

40

Networking

slide-41
SLIDE 41

41

Networking - Goals

  • Exposing an internal service should automatically configure the LB
  • The F5 LB in front of the cluster is integrated into the stack
  • Restrict network connections of Pods
  • Pods should not be able to sniff traffic of their neighbours
slide-42
SLIDE 42

42

Networking - Reality

  • Flannel doesn't offer enough network restriction
  • Network automation has its own pitfalls

processes, governance, licensing, etc.

F5 automation needs "F5 SDN services" (additional $$$)

  • Workloads with specific security demands need their own CaaSP / bare-metal setup

slide-43
SLIDE 43

43

Networking - Solutions

  • Outgoing proxy was a challenge both for the deployment and for operations

This was fixed upstream by SUSE

Check “Using a Proxy Server with Authentication” in the deployment guide

  • Waiting to switch CNI (Flannel → Cilium)
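The goal of restricting Pod network connections is usually expressed with Kubernetes NetworkPolicy objects, which only take effect when the CNI plugin enforces them (Cilium does, plain Flannel does not). A minimal sketch of a default-deny policy, generated as JSON that `kubectl apply -f` accepts; the namespace name is a placeholder:

```python
import json


def default_deny_policy(namespace: str) -> dict:
    """Build a NetworkPolicy that blocks all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty podSelector selects every Pod in the namespace.
            "podSelector": {},
            # Listing both types without rules denies traffic in both directions.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


# "tenant-a" is a hypothetical tenant namespace.
policy = default_deny_policy("tenant-a")
print(json.dumps(policy, indent=2))
```

Additional policies would then allow only the connections each tenant actually needs.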
slide-44
SLIDE 44

44

Storage

slide-45
SLIDE 45

45

Storage - Goals

  • Customers want their own storage volume on the storage cluster
  • Volume provisioning should be automated
  • Existing NetApp infrastructure shall be integrated into the stack
slide-46
SLIDE 46

46

Storage - Reality

  • NetApp offers a K8s storage orchestrator (Trident)

Open Source & works like a charm - thanks NetApp!

  • Current CaaSP can't use Trident snapshot functionality

K8s volume snapshots are an alpha feature - would be nice to have

Snapshots are manageable out of band for now
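With a storage orchestrator such as Trident, automated volume provisioning is driven by PersistentVolumeClaims that reference a storage class. The following sketch builds such a claim; the storage class name `netapp-nfs`, the namespace and the size are assumptions for illustration, not values from the project.

```python
import json


def volume_claim(name: str, namespace: str, size: str, storage_class: str) -> dict:
    """Build a PersistentVolumeClaim for a dynamically provisioned volume."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # NFS-backed volumes can typically be mounted by many Pods at once.
            "accessModes": ["ReadWriteMany"],
            "resources": {"requests": {"storage": size}},
            "storageClassName": storage_class,
        },
    }


# All argument values below are placeholders.
claim = volume_claim("app-data", "tenant-a", "10Gi", "netapp-nfs")
print(json.dumps(claim, indent=2))
```

The provisioner behind the storage class then creates the backing volume on the storage cluster without manual steps.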

slide-47
SLIDE 47

47

Storage - Solutions

  • If you have NetApp, go and try Trident
  • Can't say much about other storage vendors for now

Any feedback from the audience?

  • Can't say much about SUSE Enterprise Storage (SES) in this context

I'm sure Lars Marowsky-Brée et al. made SES + CAP = 💗

slide-48
SLIDE 48

48

Backup/Restore Strategy

slide-49
SLIDE 49

49

Backup/Restore Strategy - Goals

  • Primary areas of focus:

Disaster Recovery (DR)

Business Continuity Management (BCM)

  • Reinstall and restore state rather than restoring full “nodes”
  • Stateful resources (storage, databases) are kept in dedicated external clusters

slide-50
SLIDE 50

50

Backup/Restore Strategy - Reality

  • Internal container images aren't part of the backup

Can be rebuilt using CI/CD

Nevertheless we keep backups of all images for faster restores

  • We don’t have complete control over all images in a user’s container registry, but their images need to be restorable
  • Stateful services (e.g. MongoDB, Cassandra) in the cluster are a reality, even though we would prefer to run those externally

slide-51
SLIDE 51

51

Backup/Restore Strategy - Solutions

  • Multi-pronged approach

K8s State: Heptio Velero (formerly Ark)

CF State: cf-plugin-backup

Storage: PV snapshots through NetApp Trident

etcd State: DIY tooling as an additional fallback should Velero fail

Dedicated solutions for in-cluster services (e.g. MongoDB, Cassandra)

  • All of these need to be understood and managed
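Two of the prongs above can be sketched as command lines. The helper functions below only assemble the arguments for `velero backup create` and `etcdctl snapshot save`; the backup name, namespaces, endpoint and file path are placeholders, and this is not the project's actual DIY tooling.

```python
def velero_backup_cmd(name: str, namespaces: list) -> list:
    """Arguments for a Velero backup limited to the given namespaces."""
    return [
        "velero", "backup", "create", name,
        "--include-namespaces", ",".join(namespaces),
    ]


def etcd_snapshot_cmd(endpoint: str, target_file: str) -> list:
    """Arguments for an etcd v3 snapshot written to target_file."""
    return ["etcdctl", f"--endpoints={endpoint}", "snapshot", "save", target_file]


# Placeholder values for illustration only.
velero_cmd = velero_backup_cmd("nightly", ["tenant-a", "tenant-b"])
etcd_cmd = etcd_snapshot_cmd("https://127.0.0.1:2379", "/backup/etcd.db")

print(" ".join(velero_cmd))
print(" ".join(etcd_cmd))
```

A real run would additionally need credentials (for etcd, typically client certificates) and, on older etcd releases, `ETCDCTL_API=3` in the environment.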
slide-52
SLIDE 52

52

Logging

slide-53
SLIDE 53

53

Logging - Goals

  • Centralized logging is a first-class citizen
  • Batteries included for customers
  • Existing customers can use tooling they are used to (e.g. Splunk)
slide-54
SLIDE 54

54

Logging - Reality

  • Splunk has really high operational expenses
  • Log shippers for Splunk exist but aren’t integrated out of the box
  • Hopping onboard an existing Splunk installation is harder than expected
slide-55
SLIDE 55

55

Logging - Solutions

  • Integrated tooling has support for shipping to Elasticsearch/Logstash/Kibana (ELK) so we use it

  • K8s logs are shipped to ELK using fluentd
  • CF logs get shipped to ELK using firehose nozzle
  • If customer isn’t happy with ELK we can switch to Splunk at any time
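Log shippers that write to Elasticsearch typically batch records through its bulk API, which expects newline-delimited JSON: one action line followed by one document line per record. The sketch below shows that payload format only; the index name and log records are invented, and older Elasticsearch versions additionally expect a document type.

```python
import json


def to_bulk_payload(index: str, events: list) -> str:
    """Render log events as an Elasticsearch bulk request body (NDJSON)."""
    lines = []
    for event in events:
        # Action line: index the following document into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the log record itself.
        lines.append(json.dumps(event))
    # The bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical log records.
events = [
    {"source": "k8s", "message": "pod started"},
    {"source": "cf", "message": "app staged"},
]
payload = to_bulk_payload("platform-logs", events)
print(payload, end="")
```

Because both K8s and CF logs end up as plain JSON documents, swapping the backend later mainly means swapping the shipper's output plugin.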
slide-56
SLIDE 56

56

Monitoring

slide-57
SLIDE 57

57

Monitoring - Goals

  • Establish key performance indicator (KPI) based monitoring
  • Offer monitoring as a service for tenants
  • No single point of failure in monitoring
  • Expose metrics while adhering to policy of least privilege
  • We want to use included best of breed tooling like Prometheus
slide-58
SLIDE 58

58

Monitoring - Reality

  • Prometheus needs to be monitored itself
  • There are no easy ways to enforce authorization in Prometheus
  • Prometheus and Grafana need quite some configuration to be useful
slide-59
SLIDE 59

59

Monitoring - Solutions

  • Use Grafana as frontend to Prometheus

Deployment of dashboards to Grafana is important to help users get up and running

  • Deploy multiple Prometheus stacks for multi-tenancy

This leads to tolerable redundancy over various Prometheus databases

  • An additional external Prometheus instance monitors the in-cluster Prometheus
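One way to let an external Prometheus watch the in-cluster one is to scrape it like any other target and alert when the `up` metric drops to 0. The sketch below generates a scrape configuration and an alerting rule as JSON, which YAML parsers generally accept; the target address, job name and alert name are assumptions for illustration.

```python
import json

# Hypothetical address of the in-cluster Prometheus as seen from outside.
IN_CLUSTER_TARGET = "prometheus.cluster.example:9090"

# Scrape job for the external Prometheus (prometheus.yml fragment).
scrape_config = {
    "scrape_configs": [
        {
            "job_name": "in-cluster-prometheus",
            "static_configs": [{"targets": [IN_CLUSTER_TARGET]}],
        }
    ]
}

# Alerting rule file: fire when the scrape has failed for five minutes.
alert_rules = {
    "groups": [
        {
            "name": "meta-monitoring",
            "rules": [
                {
                    "alert": "InClusterPrometheusDown",
                    "expr": 'up{job="in-cluster-prometheus"} == 0',
                    "for": "5m",
                    "labels": {"severity": "critical"},
                }
            ],
        }
    ]
}

print(json.dumps(scrape_config, indent=2))
print(json.dumps(alert_rules, indent=2))
```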

slide-60
SLIDE 60

60

Security

slide-61
SLIDE 61

61

Security - Goals

  • Best of breed security concepts and tooling are the only option
  • Network isolation is a must
  • We want to automate as much as possible
slide-62
SLIDE 62

62

Security - Reality

  • Network isolation isn't possible with Flannel
  • Helm’s Tiller doesn’t make it as easy as we would like on the K8s side
  • RBAC is a lot of work to set up
slide-63
SLIDE 63

63

Security - Solutions

  • NIST Special Publication 800-190 - Application Container Security Guide (let’s not invent our own standards here)
  • Adapt existing cultural and technical conventions to a new highly dynamic operating model
  • Infrastructure and customer apps are grouped by protection needs, which helps enforce network isolation where needed; if needed, we deploy an additional cluster for sensitive apps

slide-64
SLIDE 64

64

Security - Solutions

  • MicroOS as a small ephemeral base for nodes
  • Vulnerability management by scanning containers in the continuous integration (CI) process
  • A Tiller installation per “RBAC-Group” for K8s users

We’re looking forward to Helm 3

  • CF already has a quite robust security model
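A Tiller installation per “RBAC-Group” can be approximated by giving each group's namespace its own Tiller service account with namespace-scoped rights. The sketch below builds the two K8s objects for that; the namespace is a placeholder and binding to the built-in `admin` ClusterRole is an assumption, not necessarily what the project uses.

```python
import json


def tiller_rbac(namespace: str) -> list:
    """Build a ServiceAccount and RoleBinding for a namespace-scoped Tiller."""
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": "tiller", "namespace": namespace},
    }
    role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": "tiller-admin", "namespace": namespace},
        # A RoleBinding to a ClusterRole grants its rights only in this namespace.
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "admin",
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": "tiller", "namespace": namespace}
        ],
    }
    return [service_account, role_binding]


# "team-a" is a hypothetical RBAC group namespace.
objects = tiller_rbac("team-a")
print(json.dumps(objects, indent=2))
```

With Helm 2, Tiller could then be installed into that namespace with something like `helm init --service-account tiller --tiller-namespace team-a`.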
slide-65
SLIDE 65

65

Enabling DevOps

slide-66
SLIDE 66

66

Enabling DevOps - Goals

  • DevOps as key to enabling automation
  • Allows a high degree of self-service capabilities
  • Both the CF API and K8s are available to users
slide-67
SLIDE 67

67

Enabling DevOps - Solutions

  • Paradigm shift when it comes to adopting existing processes into a highly dynamic container environment

  • Continuous Integration and Deployment is mandatory
  • Three classical environments to aid rollout and testing

development

integration

production

slide-68
SLIDE 68

68

Enabling DevOps - Solutions

A defined reference architecture helps onboard new developers

Diagram: reference architecture showing Gateway, Backing Services, Telemetry, Charging & Billing, DevOps Automation, Managed Containers, Service Mesh, Dev and User

slide-69
SLIDE 69

69

Enabling DevOps - Solutions

  • Users that want full control can use Helm and co.
  • Users that want an opinionated stack use CF tooling
  • Trainings for the customer and users → https://ad-sy.ch/trainings
slide-70
SLIDE 70

70

Outlook & Thanks

slide-71
SLIDE 71

71

To the Cloud and Beyond

  • Federate across multiple data centers
  • Be the default PaaS provider for Swiss government customers
  • The platform is available and builds the foundation for large digital transformation projects

  • Looking forward to new cloud native services from ISVs
slide-72
SLIDE 72

72

Thanks! Merci! Dankeschön!

Early adoption needed help from SUSE that we received in time

  • Markus Wolf, Country Manager SUSE Switzerland
  • Ralf Dannert, Systems Engineer SUSE
  • Carsten Duch, Sales Engineer SUSE
  • Jeff Hobbs, Director of Engineering CaaSP/CAP, SUSE
  • plus countless other awesome chameleons