Continuous availability: from the shift paradigm to unmanned operation

SLIDE 1

Continuous availability: from the shift paradigm to unmanned operation.

Pietro Tiberi 17 January 2018 – TIPS Contact Group

SLIDE 2

Agenda

1. Introduction
2. Continuous Availability
3. Results
4. Conclusions and perspective

SLIDE 3

Introduction

TIPS non-functional requirements: Reliability / Availability

  • RPO = 0 (transactions lost)
  • RTO = 15 minutes (downtime)
  • 99.9%
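
As a rough illustration of what these targets imply, the sketch below converts an availability percentage into a downtime budget. It assumes (the slide does not say so) that 99.9% is an availability target measured over a 365-day year; `downtime_budget_minutes` is a hypothetical helper name, not something from the presentation.

```python
def downtime_budget_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Return the downtime allowed (in minutes) over the period for a given availability %."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)


if __name__ == "__main__":
    budget = downtime_budget_minutes(99.9)  # assumption: 99.9% is a yearly target
    rto_minutes = 15                        # RTO stated on the slide
    print(f"Yearly downtime budget at 99.9%: {budget:.1f} minutes")
    print(f"Outages of {rto_minutes} minutes that fit in the budget: {int(budget // rto_minutes)}")
```

Under that assumption the budget is about 525.6 minutes per year, which leaves room for roughly 35 outages that are each recovered within the 15-minute RTO.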

SLIDE 4

Introduction

Datacenter Operations

  • Human-based (on shifts)
  • Unmanned

SLIDE 5

CONTINUOUS OPERATION

SLIDE 6

Continuous Availability

From high availability to continuous availability

  • Redundancy
  • Fault tolerance
  • Clustering
  • Active-active configuration
  • Proactive monitoring
  • Continuous delivery
  • Automatic remediation
  • Dynamic capacity management

SLIDE 7

Continuous Availability

Proactive Monitoring

  • Infrastructure monitoring
  • Application monitoring
  • Detect events before failures
  • Trigger automatic actions
  • Analyze the event
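
The "detect events before failures" and "trigger automatic actions" points can be sketched as a rule check that fires an action once a metric crosses a warning level set below the point of failure. This is only an illustration: `Rule`, `scale_out`, `evaluate`, the metric name and the 80% level are all hypothetical and do not come from the presentation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """Hypothetical monitoring rule: act when a metric crosses a warning level."""
    metric: str
    warning_level: float  # assumed to sit below the level at which failures occur
    action: Callable[[str, float], str]


def scale_out(metric: str, value: float) -> str:
    # Placeholder for an automatic action (for example, adding a node).
    return f"scale_out triggered: {metric}={value}"


def evaluate(rules: List[Rule], sample: dict) -> List[str]:
    """Detect warning events in one metrics sample and run the matching actions."""
    results = []
    for rule in rules:
        value = sample.get(rule.metric)
        if value is not None and value >= rule.warning_level:
            results.append(rule.action(rule.metric, value))
    return results


if __name__ == "__main__":
    rules = [Rule("cpu_load_pct", 80.0, scale_out)]
    print(evaluate(rules, {"cpu_load_pct": 55.0}))  # below the warning level, no action
    print(evaluate(rules, {"cpu_load_pct": 91.0}))  # warning event, action fires
```

The returned strings stand in for the record that a later "analyze the event" step would work from.
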

SLIDE 8

Continuous Availability

IT Automation

SLIDE 9

Continuous Availability

From Agile to DevOps

SLIDE 10

Continuous Availability

DevOps - Everything as Code

  • Code
  • Virtual Infrastructure

SLIDE 11

Continuous Availability

Dynamic Capacity Management

  • Consumption trend analysis
  • Resource utilization rate optimization
  • What-if scenarios
  • Predict future requirements and trends
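
One simple way to picture trend analysis and what-if scenarios is a straight-line fit over past consumption that is then extrapolated. The sketch below uses an ordinary least-squares line; the presentation does not say which forecasting method was used, and the function names and sample figures are made up for illustration.

```python
from typing import Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line through (0, v0), (1, v1), ...; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forecast(values: Sequence[float], periods_ahead: int) -> float:
    """Extrapolate the fitted trend periods_ahead steps past the last sample."""
    slope, intercept = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + periods_ahead)


if __name__ == "__main__":
    # Illustrative monthly consumption figures (made-up numbers).
    usage = [10.0, 12.0, 14.0, 16.0]
    print(forecast(usage, 3))                     # trend three periods ahead
    print(forecast([u * 1.5 for u in usage], 3))  # what-if: consumption 50% higher
```

A real capacity model would likely account for seasonality and bursts; the linear fit is only the smallest example of the idea.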

SLIDE 12

SLIDE 13

Test Plant

Architecture

[Architecture diagram. Layers: Application Layer, Message Layer, Database Layer. Components: User A, User B, Message Router (two instances), Message Processor, Kafka Broker, Aerospike Database. Arrow labels: write, read, store, put, get.]

SLIDE 14

Results

Test Architecture

Specific tests, which verify the relevant domain functions, are executed on a common simulation layer that reproduces the real operational environment.

SLIDE 15

Results

Simulation – continuous delivery (1)

  • Normal traffic condition (500 msg/s), timeout = 10,000 ms
  • Kafka cluster rolling update
  • 0 messages lost
  • 0 timeouts expired

[Figure: SIMUL.APP.01, message latency (1 s average)]
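
The three quantities reported here (messages lost, timeouts expired, latency averaged over one second) could be derived from per-message send and receive timestamps along the lines below. This is a sketch under assumptions: the record layout, the `summarize` helper, and the definitions of "lost" (never received) and "timeout expired" (latency above the timeout) are illustrative, not taken from the test tooling used in the presentation.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

TIMEOUT_MS = 10_000  # timeout value quoted on the slide

# One record per message: (send time in ms, receive time in ms, or None if never received)
Record = Tuple[int, Optional[int]]


def summarize(records: List[Record], timeout_ms: int = TIMEOUT_MS):
    """Return (lost, timeouts, average latency in ms per one-second send window)."""
    lost = 0
    timeouts = 0
    buckets: Dict[int, List[int]] = defaultdict(list)
    for sent_ms, received_ms in records:
        if received_ms is None:
            lost += 1  # assumed definition: message never arrived
            continue
        latency = received_ms - sent_ms
        if latency > timeout_ms:
            timeouts += 1  # assumed definition: arrived after the timeout
        buckets[sent_ms // 1000].append(latency)  # group by second of sending
    averages = {sec: sum(v) / len(v) for sec, v in sorted(buckets.items())}
    return lost, timeouts, averages


if __name__ == "__main__":
    # Made-up sample: three fast messages and one that exceeds the timeout.
    sample = [(0, 40), (500, 550), (1200, 1260), (1900, 12500)]
    print(summarize(sample))
```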

SLIDE 16

Results

Simulation – continuous delivery (2)

  • Heavy traffic condition (2000 msg/s), timeout = 10,000 ms
  • Kafka cluster rolling update
  • 0 messages lost
  • Some timeouts expired

[Figure: SIMUL.APP.02, message latency (1 s average)]

07 November 2017 – CMG Impact 2017

SLIDE 17

Results

Simulation – proactive monitoring

  • Normal traffic condition (500 msg/s), average E2E processing time = 45 ms
  • High vCPU load added to Message Processor nodes
  • T0-T1: below threshold
  • T2-T3: threshold exceeded
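
Intervals such as T0-T1 and T2-T3 can be located in a series of samples by scanning for runs above the threshold. The sketch below is illustrative only: the slide does not state which metric or threshold value was monitored, so the sample values and the threshold of 100 are invented, and `intervals_above` is a hypothetical helper.

```python
from typing import List, Tuple


def intervals_above(samples: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end inclusive, of runs where samples exceed threshold."""
    intervals = []
    start = None
    for i, value in enumerate(samples):
        if value > threshold and start is None:
            start = i  # run begins
        elif value <= threshold and start is not None:
            intervals.append((start, i - 1))  # run ended on the previous sample
            start = None
    if start is not None:
        intervals.append((start, len(samples) - 1))  # run lasts to the end
    return intervals


if __name__ == "__main__":
    # Made-up samples: baseline near 45, then a period of added load.
    series = [45, 47, 60, 120, 130, 50, 46]
    print(intervals_above(series, threshold=100))
```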

SLIDE 18

Conclusions and perspective

  • Phased Approach
  • Bi-modal Data Center
  • Tool

SLIDE 19

Continuous availability: from the shift paradigm to unmanned operation.

Pietro Tiberi (pietro.tiberi@bancaditalia.it)

Thanks for your attention