Microservices and DevOps Journey at Wix.com Aviran Mordo - - PowerPoint PPT Presentation


SLIDE 1

Microservices and DevOps Journey at Wix.com

Aviran Mordo, Head of Engineering

@aviranm
www.linkedin.com/in/aviran
http://www.aviransplace.com

SLIDE 2

SLIDE 3

SLIDE 4

Wix In Numbers

  • Over 86M users
  • Static storage is >2 PB of data
  • 3 data centers + 2 clouds (Google, Amazon)
  • 2B HTTP requests/day
  • 1200 people work at Wix

SLIDE 5

Over 200 Microservices in Production

SLIDE 6

Microservices - What Does it Take

  • Continuous Delivery
  • DevOps
  • Circuit Breaker
  • Feature Flags
  • Throttlers
  • Monitoring
  • Testing
  • Message Bus
  • RPC
  • REST
  • SLA
  • Distributed Transactions
  • Backward / Forward Compatibility
  • Clustering
  • Conway’s law
  • Development / product lifecycle
  • Boundary
  • KISS
  • YAGNI
  • LEAN

Do / Use Consider / Understand

SLIDE 7

Microservices - What Does it Take

SLIDE 8

How to Get There? (Wix’s journey)

http://gpstrackit.com/wp-content/uploads/2013/11/VanishingPointwRoadSigns.jpg

SLIDE 9

http://p1.pichost.me/i/11/1339236.jpg

About 6 years ago

SLIDE 10

Initial Architecture

  • One database
  • Stateful login (Tomcat session), Ehcache, file uploads
  • No consideration for performance, scalability and testing
  • Intended for short-term use
  • Tomcat, Hibernate, custom web framework

Lighttpd (file serving) MySQL DB Wix (Tomcat)

SLIDE 11

The Monolithic Giant

  • One monolithic server that handled everything
  • Dependency between features
  • Changes in unrelated areas caused deployment of the whole system
  • Failure in unrelated areas caused system-wide downtime

Lighttpd (file serving) MySQL DB Wix (Tomcat)

SLIDE 12

Breaking the System Apart

https://upload.wikimedia.org/wikipedia/commons/6/67/Broken_glass.jpg

SLIDE 13

SLIDE 14

Concerns and SLA

Edit websites
  • Many feature requests
  • Lower performance requirement
  • Lower availability requirement
  • Write intensive

View sites created by the Wix editor
  • Not many product changes
  • High performance
  • High availability
  • Read intensive

SLIDE 15

Mono-Wix

Phase 1

SLIDE 16

Extract Public Service

Editor service (Mono-Wix) Public service

SLIDE 17

Divide and Conquer

Editor service Public service

Guideline: No runtime, deployment or data dependency

SLIDE 18

Why 2 Monoliths? Baby Steps

Editor service Public service

  • Editor needs fast development (microservices => decoupling)
  • Public needs stability (microservices => scalability / resilience)

SLIDE 19

Separation by Product Lifecycle

  • Decouple architecture => Decouple teams
  • Deployment independence
  • Areas with frequent changes

Editor service Public service

SLIDE 20

Separation by Service Level

  • Scale independently
  • Use different data stores
  • Optimize data per use case (read vs. write)
  • Run on different datacenters / clouds / zones
  • System resiliency (degradation of service vs. downtime)
  • Faster recovery time

Editor service Public service

SLIDE 21

http://blogs.adobe.com/captivate/2011/03/training-adding-interactivity-to-elearning-courses-with-adobe-captivate-5.html/time-to-learn-clock

SLIDE 22

Service Boundary

SLIDE 23

Separation of Databases

  • Copy data between segments
  • Optimize data per use case (read vs. write intensive)
  • Different data stores

Public service Editor service

Copy necessary data
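
A minimal sketch of the "copy necessary data" step, assuming the copy happens when a site is published (the slides do not say when or how Wix copies data). The two dictionaries stand in for separate data stores, and every name below is hypothetical.

```python
import json

editor_store = {}  # stand-in for the write-intensive editor database
public_store = {}  # stand-in for the read-intensive public data store


def save_draft(site_id, document):
    """Editor side: keep the full editable document."""
    editor_store[site_id] = document


def publish(site_id):
    """Copy only the data the public service needs, in a read-friendly form."""
    document = editor_store[site_id]
    public_view = {"title": document["title"], "pages": document["pages"]}
    # Serialize once at publish time so viewing is a plain key lookup.
    public_store[site_id] = json.dumps(public_view)


def view_site(site_id):
    """Public side: reads its own store and never touches the editor store."""
    return public_store[site_id]
```

With this split, later edits to a draft do not affect what viewers see until the next publish, and the public store can be scaled or replaced independently of the editor database.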

SLIDE 24

Serialization

SLIDE 25

Serialization / Protocol

Binary? JSON / XML / Text? HTTP?

Public service Editor service

SLIDE 26

Serialization / Protocol - Tradeoffs

Readability? Performance? Debug? Tools? Monitoring? Dependency?

Public service Editor service
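
To make the readability versus performance trade-off concrete, here is a small sketch that encodes the same record as JSON and as a fixed binary layout. The record fields and the binary layout are assumptions made for illustration, not Wix's actual formats.

```python
import json
import struct

# Hypothetical record exchanged between two services.
record = {"site_id": 12345, "views": 987654, "premium": True}

# Text format: self-describing and easy to read in logs and debuggers.
as_json = json.dumps(record).encode("utf-8")

# Binary format (assumed layout): little-endian, two unsigned 32-bit
# integers followed by one bool. Both sides must agree on this layout.
LAYOUT = struct.Struct("<II?")
as_binary = LAYOUT.pack(record["site_id"], record["views"], record["premium"])

# Decoding the binary form requires knowing the layout up front.
site_id, views, premium = LAYOUT.unpack(as_binary)
```

The binary payload is smaller, but it cannot be inspected without the layout, which is the debugging, tooling and dependency cost the slide points at.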

SLIDE 27

API Transport/Protocol

SLIDE 28

How to Expose an API

REST? RPC? SOAP?

Public service Editor service

SLIDE 29

Wix’s Choices

REST HTTP

Public service Editor service

Binary JSON-RPC HTTP
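
As an illustration of what a JSON-RPC call looks like, the sketch below builds a request body and dispatches it on the receiving side. It assumes the JSON-RPC 2.0 envelope (the slides do not say which version Wix uses), leaves out the HTTP transport, and uses a hypothetical method name.

```python
import json


def make_request(method, params, request_id):
    """Build a JSON-RPC 2.0 request body (sent as an HTTP POST in practice)."""
    return json.dumps(
        {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    )


# Hypothetical method table exposed by the receiving service.
METHODS = {"get_site_title": lambda site_id: f"site-{site_id}"}


def handle(raw_body):
    """Dispatch one request and return the JSON-RPC 2.0 response body."""
    request = json.loads(raw_body)
    method = METHODS.get(request["method"])
    if method is None:
        # -32601 is the code JSON-RPC 2.0 reserves for "Method not found".
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "error": error, "id": request["id"]})
    result = method(*request["params"])
    return json.dumps({"jsonrpc": "2.0", "result": result, "id": request["id"]})
```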

SLIDE 30

API Versioning

SLIDE 31

API Versioning

Public service Editor service

Backward compatibility

Maybe here

API Schema /v1/v2
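
One way to read "/v1/v2" together with backward compatibility is sketched below: the version is the first path segment, and v2 only adds fields to the v1 response. The handlers, field names and path shape are hypothetical; the slides do not show Wix's actual scheme.

```python
def get_site_v1(site):
    """Original response shape."""
    return {"id": site["id"], "name": site["name"]}


def get_site_v2(site):
    """v2 only adds fields; every v1 field keeps its name and type."""
    response = get_site_v1(site)
    response["owner"] = site["owner"]
    return response


HANDLERS = {"v1": get_site_v1, "v2": get_site_v2}


def dispatch(path, site):
    """Route paths shaped like '/v1/site' to the matching handler."""
    version = path.strip("/").split("/")[0]
    handler = HANDLERS.get(version)
    if handler is None:
        raise LookupError(f"unsupported API version: {version!r}")
    return handler(site)
```

Keeping the old handler in place lets existing callers continue to work while new callers move to the newer version at their own pace.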

SLIDE 32

Asynchronous

SLIDE 33

Which Queuing System to Use

Public service Editor service

Threads Kafka? RabbitMQ? ActiveMQ? ???
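
The simplest option on this list, threads, can be sketched with an in-process queue and one worker thread. This is an illustrative sketch rather than Wix's implementation; a broker such as Kafka or RabbitMQ adds durability and cross-process delivery that this version does not have.

```python
import queue
import threading


def start_worker(handler):
    """Start a background thread that passes each queued item to handler."""
    tasks = queue.Queue()

    def run():
        while True:
            item = tasks.get()
            try:
                if item is None:  # sentinel: stop the worker
                    return
                handler(item)
            finally:
                tasks.task_done()

    thread = threading.Thread(target=run, daemon=True)
    thread.start()
    return tasks, thread
```

A caller enqueues work with `tasks.put(item)` and returns immediately; `tasks.join()` blocks until everything queued so far has been handled, and `tasks.put(None)` stops the worker.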

SLIDE 34

Service Discovery

SLIDE 35

Service Discovery

Public service Editor service

Configuration (DNS+LB) Zookeeper? Consul? Etcd? Eureka?
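
A minimal sketch of the configuration (DNS+LB) option, assuming each logical service name maps to a DNS name that resolves to a load balancer. The host names are placeholders and the helper is hypothetical.

```python
# Hypothetical configuration: each logical service name maps to a DNS name
# that resolves to a load balancer in front of the service instances.
SERVICE_HOSTS = {
    "editor": "editor.internal.example.com",
    "public": "public.internal.example.com",
}


def service_url(service, path, hosts=SERVICE_HOSTS):
    """Build a URL for a service; the load balancer picks the instance."""
    try:
        host = hosts[service]
    except KeyError:
        raise LookupError(f"no address configured for service {service!r}") from None
    return f"http://{host}/{path.lstrip('/')}"
```

Callers only know the logical name, so moving a service means changing configuration or DNS rather than code. Registries such as Zookeeper, Consul, Etcd or Eureka replace the static mapping with a dynamic lookup.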

SLIDE 36

Resilience

SLIDE 37

What does the Arrow Mean?

Public service Editor service

SLIDE 38

Failure Points = Network I/O

Public service Editor service

  • Retry policy
  • Circuit breaker
  • Throttlers

Be careful: you may cause downtime. Retry only on idempotent operations.
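
The retry and circuit breaker ideas can be sketched as follows. This is a simplified illustration under stated assumptions (failure counting, a fixed cool-down, and a caller-supplied `idempotent` flag), not Wix's implementation; all names and thresholds are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is not attempted."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: allow one trial call; one failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def call_with_retry(func, *, idempotent, attempts=3, delay=0.0):
    """Retry only idempotent operations; others get exactly one attempt."""
    tries = attempts if idempotent else 1
    for attempt in range(1, tries + 1):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying against an open breaker only adds load
        except Exception:
            if attempt == tries:
                raise
            time.sleep(delay)
```

The breaker fails fast while a dependency is down, so callers stop piling requests onto it, and the retry helper refuses to repeat operations that are not safe to repeat.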

SLIDE 39

Degradation of Service

Public service Editor service

  • Feature killer (Killer feature)
  • Fallbacks
  • Self healing
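
A feature killer and a fallback might look like the sketch below, in which a page is still served when an optional feature is switched off or its backing service fails. The flag store, the `comments` feature and the function names are hypothetical.

```python
class FeatureToggles:
    """In-memory flags; a real system would load these from shared config."""

    def __init__(self, **flags):
        self._flags = dict(flags)

    def is_enabled(self, name):
        return self._flags.get(name, False)

    def kill(self, name):
        self._flags[name] = False


def render_site(site_id, toggles, fetch_comments):
    """Render a page, degrading the comments feature instead of failing."""
    page = {"site": site_id, "comments": []}
    if not toggles.is_enabled("comments"):
        return page  # feature killed: serve the page without it
    try:
        page["comments"] = fetch_comments(site_id)
    except Exception:
        page["comments"] = []  # fallback: an empty list beats a failed page
    return page
```

Either way the viewer gets a page, which is the difference between degradation of service and downtime.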

SLIDE 40

Testing

SLIDE 41

Test a Distributed System (at Wix)

Public service Editor service

Unit Test Integration Test Server E2E Automation

Client

SLIDE 42

Distributed Logging

SLIDE 43

Build visibility into service
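
One common way to build visibility across services is to attach a request ID at the edge and include it in every log line. The slides do not describe Wix's mechanism, so the header name and helpers below are assumptions for illustration only.

```python
import uuid

REQUEST_ID_HEADER = "X-Request-Id"  # assumed header name, not from the slides


def ensure_request_id(headers):
    """Reuse the caller's request ID, or create one at the edge."""
    out = dict(headers)
    out.setdefault(REQUEST_ID_HEADER, uuid.uuid4().hex)
    return out


def log_line(service, headers, message):
    """Format a log line that can be joined across services by request ID."""
    return f"service={service} request_id={headers[REQUEST_ID_HEADER]} {message}"
```

Each service passes the headers it received to the services it calls, so searching the logs for one request ID shows the whole path of a request.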

SLIDE 44

Ownership

SLIDE 45

Team Work

  • Microservice is owned by a team
  • You build it, you run it
  • No microservice is left without a clear owner
  • Microservice is NOT a library: it is a live production system

SLIDE 46

What is the Right Size of a Microservice?

SLIDE 47

The Size of a Microservice is the Size of the Team That is Building it.

“Organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations” Conway, Melvin

SLIDE 48

What did you Learn from Just 2 Services

  • Service boundary
  • Monitoring infrastructure
  • Serialization format
  • Synchronous communication protocol (HTTP/Binary)
  • Asynchronous (queuing infra)
  • Service SLA
  • API definition (REST/ RPC / Versioning)
  • Data separation
  • Deployment strategy
  • Testing infrastructure (integration test, e2e test)
  • Compatibility (backwards / forward)

SLIDE 49

Continue to Extract More Microservices

SLIDE 50

HTML Editor Flash Editor

MSM

Private Media Public Media

Editor Segment Public Segment

Premium Services

List DB

App Builder

App Store App Market

Dashboard

Mailer TimeZone

Public HTML API Public API (Flash)

MSP

Public Server

HTML Renderer HTML SEO Renderer

Flash Renderer

Flash SEO Renderer

Sitemap Renderer

Robots.txt Renderer

User Server

Template Viewer

Contacts HUB Activity Site Members Store Mgr Comments Snapshoter User Pref Feed Me Shout-out Hotels

PETRI

Site Pref Dist Logger Slicer

eCom Renderer eCom Cart eCom Checkout eCom Catalog eCom Orders Payment Facade

Account Info HTML API HTML Embeder

Blog Mobile

Editor Segment
  • Mostly writes
  • 2 data centers
  • DB active-standby (preferably active-active)
  • Performance < 2s 99%
  • Serves mostly site builders
  • Uptime > 99.9%

Public Segment
  • Mostly reads
  • >2 data centers
  • DB active-active(-active)
  • Performance < 500ms 99%
  • Serves mostly site viewers
  • Uptime > 99.99%
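
As a rough worked example, the two uptime targets translate into very different downtime budgets. The calculation assumes the figures are percentages measured over a 365-day year, which the slide does not state.

```python
def downtime_budget_minutes(uptime_percent, days=365):
    """Minutes of allowed downtime over the period for a given uptime target."""
    return (1 - uptime_percent / 100) * days * 24 * 60


editor_budget = downtime_budget_minutes(99.9)   # about 525.6 minutes per year
public_budget = downtime_budget_minutes(99.99)  # about 52.6 minutes per year
```

Under these assumptions, 99.9% allows roughly 8.8 hours of downtime per year while 99.99% allows under an hour, which helps explain why the two segments are built and operated differently.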

SLIDE 51

When to Extract a New Microservice

SLIDE 52

Microservice or Library?

  • Do I create a deployment dependency?
  • What is the DevOps overhead (managing middleware)?
  • Who owns it?
  • Does it have its own development lifecycle?
  • Does it fit the scalability / availability concerns?
  • Can a different team develop it?

I need time zone from an IP address

SLIDE 53

Microservice has Ops, Library is Only Computational

SLIDE 54

Which Technology Stack to Use

SLIDE 55

Free to Choose?

Microservices give the freedom to use different technology stacks. This enables innovation.

SLIDE 56

Default to the Stack You Know how to Operate.

SLIDE 57

Innovate on Non-Critical Microservices and Take Full Responsibility for Their Operation.

SLIDE 58

Polyglot System?

SLIDE 59

Limit your Stack

  • Code reuse
  • Cross cutting concerns (session, security, auditing, testing, logging…)
  • Faster system evolution
  • Development velocity

SLIDE 60

http://wallpaperbeta.com/dogs_kiss_noses_animals_hd-wallpaper-242054/

SLIDE 61

What else will you learn

  • Distributed transactions
  • System monitoring
  • Distributed traces
  • Tradeoff of a new microservice vs. extending an existing one
  • Deployment strategy and dependency
  • Handling cascading failures
  • Team building/splitting

SLIDE 62

Summary

SLIDE 63

Why Microservices

  • Scale engineering
  • Development velocity
  • Scale system

SLIDE 64

Microservices is the First Post-DevOps Architecture

SLIDE 65

Every Microservice is an Overhead

SLIDE 66

It is all about trade-offs

SLIDE 67

Microservices Guidelines & Tradeoffs

  • Each service has its own DB schema (if one is needed)
      Gain: easy to scale microservices based on service level concerns
      Tradeoff: system complexity, performance
  • Only one service should write to a specific DB table(s)
      Gain: decoupling architecture, faster development
      Tradeoff: system complexity / performance
  • May have additional read-only services that access the DB (not recommended)
      Gain: performance gain
      Tradeoff: coupling
  • Services are stateless
      Gain: easy to scale out (just add more servers)
      Tradeoff: performance / consistency

SLIDE 68

SLIDE 69

Thank You

SLIDE 70

Q&A

Aviran Mordo, Head of Engineering

@aviranm
www.linkedin.com/in/aviran
http://www.aviransplace.com

http://goo.gl/32xOTt

http://engineering.wix.com @WixEng