FIXING THE FLYING PLANE: Major SaaS Upgrades by a Production DevOps Team



SLIDE 1

FIXING THE FLYING PLANE

Major SaaS Upgrades by a Production DevOps Team

SLIDE 2

Introduction

Calvin Domenico
  • Director

Marie Hetrick
  • Manager of Hosting

Elijah Aydnwylde
  • Sr. Sysadmin, Lead of Operations

Patrick McAndrew
  • Sr. Sysadmin, Lead of Infrastructure

Jesse Campbell
  • Sr. Software Engineer, Lead of Development

Alastair Firth
  • Software Engineer

Brandon Arsenault
  • Project Manager

SLIDE 3

The “Before” Environment

  • ~20 custom-developed services accessed by 10,000+ school districts nationwide
  • Software not designed for SaaS
  • Virtualized environment in Managed Hosting datacenter limited visibility and prevented admin access to infrastructure

SLIDE 4

The “Before” Environment

Problem Scenario

■ Customers reporting networking issues
■ Troubleshooting isolates load balancer
■ MSP says it can't be

Solution

■ Bypass the load balancer

Cost

■ Lost customers
■ Man-weeks of troubleshooting and workarounds (attempts to work with MSP almost doubled this)

SLIDE 5

OPERATORS can’t OPERATE if they can’t SEE

SLIDE 6

The Project

  • SOLVE the Managed Services problem without incurring the business and man-hour costs of colocating
  • DESIGN a datacenter for the purpose of serving this specific software as SaaS
  • PLAN for up to 5x growth within 2 years, as well as upcoming changes to the software (i.e. clustering)
  • PROOF the new datacenter in a local virtualized environment so that as much of it as possible can be "ported" directly to the new hardware

SLIDE 7

The Challenge:

DON’T LAND THE PLANE

SLIDE 8

The Challenge

  • One week of total downtime for all operations
  • Six months maximum limit for datacenter design, code development & implementation
  • Design, Build, Code, Upgrade, and Migrate all at once!

SLIDE 9

The DEVELOPMENT

SLIDE 10

The Development

Requirements

  • What to build?

■ Manage multiple layers

  • Virtual Infrastructure
  • Machine
  • Application
  • Data
  • Why should we build it?

SLIDE 11

The Development

What Did We Build?

  • Automated Control engine for existing technologies
■ NFS, Git, Puppet, vSphere, bash, perl
  • Unified control front-end
  • Extensible framework
  • No recovery: destroy and rebuild
  • Easy to pick up and create a new complete stack
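
The "No recovery: destroy and rebuild" bullet can be illustrated with a small sketch. This is not the team's control engine; `MachineSpec`, `Stack`, and `reconcile` are hypothetical names, and the infrastructure layer is replaced by an in-memory stand-in. The point is only that an unhealthy or missing machine is rebuilt from its declared spec rather than repaired in place.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class MachineSpec:
    """Declarative description of a machine (hypothetical fields)."""
    name: str
    role: str  # e.g. "app" or "db"


@dataclass
class Stack:
    """In-memory stand-in for the virtual infrastructure layer (assumption)."""
    machines: Dict[str, str] = field(default_factory=dict)  # name -> state
    log: List[str] = field(default_factory=list)

    def destroy(self, name: str) -> None:
        self.machines.pop(name, None)
        self.log.append(f"destroy {name}")

    def build(self, spec: MachineSpec) -> None:
        self.machines[spec.name] = "healthy"
        self.log.append(f"build {spec.name} as {spec.role}")


def reconcile(stack: Stack, specs: List[MachineSpec]) -> None:
    """Rebuild every machine that is missing or unhealthy; never repair in place."""
    for spec in specs:
        if stack.machines.get(spec.name) == "healthy":
            continue  # nothing to do for a healthy machine
        if spec.name in stack.machines:
            stack.destroy(spec.name)  # throw away the broken instance
        stack.build(spec)  # recreate it from the spec


# Example: one degraded machine, one machine that does not exist yet.
specs = [MachineSpec("app01", "app"), MachineSpec("db01", "db")]
stack = Stack(machines={"app01": "degraded"})
reconcile(stack, specs)
print(stack.log)
```

Because the spec is the only source of truth in this sketch, running `reconcile` again on a healthy stack does nothing, and the same call can create a complete new stack from an empty one.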

SLIDE 12

The Development

The Team

  • Methodology
  • Mentality
  • Motivation
  • Personality
  • Ownership?
  • Who writes the spec?

SLIDE 13

The Development

The Dev Environment

  • Tight schedule

■ Fast iterations

  • Design, Develop, Deploy, Destroy

■ Feature driven design

  • Communication

■ Oversight / insight

  • Single point of contact

■ Open access for devs
■ Appeasing stakeholders

  • Legitimate concerns

[Diagram labels: Ops, Infra, Devs, Manager/Liaison, Outside Stakeholders]

SLIDE 14

THEN and NOW

SLIDE 15

Then and Now

Time to Create and Deploy a Site

3–5 DAYS vs. 24 HOURS

SLIDE 16

Then and Now

Time to Bring a Virtual Machine Online

30–45 DAYS vs. 1 HOUR

$ Number of words required to get a Virtual Machine online
$ then 23523 words
$ now 5 words

SLIDE 17

Then and Now

Time to Configure an Application Server

3 DAYS vs. <5 HOURS, AUTOMATED

SLIDE 18

Then and Now

Time to Configure a Database Server

1 WEEK vs. <5 HOURS, AUTOMATED

SLIDE 19

Then and Now

Time to Deploy a Patch (Hours)

  • 18 Months Ago: 4,500 HOURS
  • 12 Months Ago: 160 HOURS
  • 6 Months Ago: 40 HOURS
  • Today: 3 HOURS
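
The patch deployment figures on this slide can be read as speedup factors relative to the oldest measurement. The short calculation below is only an illustration of that arithmetic; the hour values are taken from the slide, while the function name `speedups` is made up for this example.

```python
# Patch deployment times from the slide, in hours, oldest first.
patch_hours = [
    ("18 months ago", 4500),
    ("12 months ago", 160),
    ("6 months ago", 40),
    ("today", 3),
]


def speedups(samples):
    """Return (label, hours, speedup relative to the first sample) per sample."""
    baseline = samples[0][1]
    return [(label, hours, baseline / hours) for label, hours in samples]


for label, hours, factor in speedups(patch_hours):
    print(f"{label:>14}: {hours:>5} h  ({factor:.1f}x vs. 18 months ago)")
```

Going from 4,500 hours to 3 hours is a 1,500-fold reduction; most of that gain (about 28x) came in the first six months.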

SLIDE 20

Then and Now

Time to Re-balance Database Layer

1.5 MONTHS OF OVERTIME (2 People) vs. 4/4 DECISION-MAKING/4 HOURS REVIEW (Automated)

SLIDE 21

Then and Now

Time to Recover Our Entire Environment

5+ WEEKS vs. <24 HOURS

SLIDE 22

How did it all COME TOGETHER?

SLIDE 23

How Did it All Come Together?

Abstracting Enterprise Components

  • Abstracting System and Software Components

■ What are our Software Components?

  • Application Agents
  • Customer Databases

■ What are our System Components?

  • Application Servers
  • Database Servers

SLIDE 24

How Did it All Come Together?

Abstracting Harder

  • What are the relationships between these components?
  • How can they be abstracted?
■ Cluster
  • A selection of Customers grouped together and handled by a single Agent
■ Node
  • An instance of a cluster running on an Application Server
  • What do these abstractions allow us to infer by relation?
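
One way to picture the Cluster and Node abstractions is as two small record types, with the "infer by relation" question answered by simple queries over them. The sketch below follows the slide's definitions, but everything else is an assumption made for illustration: the field names, the helper functions, and the example district and server names.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Cluster:
    """A selection of Customers grouped together and handled by a single Agent."""
    name: str
    agent: str
    customers: FrozenSet[str]


@dataclass(frozen=True)
class Node:
    """An instance of a Cluster running on an Application Server."""
    cluster: Cluster
    app_server: str


def servers_for_customer(nodes: List[Node], customer: str) -> Set[str]:
    """Infer which application servers serve a given customer."""
    return {n.app_server for n in nodes if customer in n.cluster.customers}


def customers_on_server(nodes: List[Node], app_server: str) -> Set[str]:
    """Infer which customers are affected if an application server goes away."""
    affected: Set[str] = set()
    for n in nodes:
        if n.app_server == app_server:
            affected |= n.cluster.customers
    return affected


# Hypothetical example data.
east = Cluster("east", "agent-a", frozenset({"district-1", "district-2"}))
west = Cluster("west", "agent-b", frozenset({"district-3"}))
nodes = [Node(east, "app01"), Node(east, "app02"), Node(west, "app02")]

print(sorted(servers_for_customer(nodes, "district-1")))
print(sorted(customers_on_server(nodes, "app02")))
```

Neither query stores a customer-to-server mapping directly; both are derived from the Cluster and Node relations, which is the kind of inference the last bullet asks about.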

SLIDE 25

How Did it All Come Together?

Agile Development

  • Adaptable to
■ Unknown Performance and Needs
■ Changing Requirements
  • High Visibility provides
■ Decreased Risk
■ Increased Business Value
  • Collaborative Design promotes
■ Diverse Viewpoints
■ Shared Experience

SLIDE 26

end