Wix Architecture at Scale Aviran Mordo Head of Back-End Engineering - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Aviran Mordo Head of Back-End Engineering @ Wix

@aviranm linkedin.com/in/aviran aviransplace.com

Wix Architecture at Scale

slide-2
SLIDE 2
slide-3
SLIDE 3

Wix in Numbers

  • Over 45,000,000 users
  • 1M new users/month
  • Static storage is >800TB of data
  • 1.5TB new files/day
  • 3 data centers + 2 clouds (Google, Amazon)
  • 300 servers
  • 700M HTTP requests/day
  • 600 people work at Wix, of which ~200 are in R&D

slide-4
SLIDE 4

Initial Architecture

  • Built for fast development
  • Stateful login (Tomcat session), Ehcache, file uploads
  • No consideration for performance, scalability and testing
  • Intended for short-term use
  • Tomcat, Hibernate, custom web framework

Diagram: Wix (Tomcat), Lighttpd (file serving), MySQL DB

slide-5
SLIDE 5

The Monolithic Giant

  • One monolithic server that handled everything
  • Dependency between features
  • Changes in unrelated areas of the system caused deployment of the whole system
  • Failure in unrelated areas will cause system-wide downtime

slide-6
SLIDE 6

Breaking the System Apart

slide-7
SLIDE 7

Concerns and SLA

Edit websites

  • Data validation
  • Security / Authentication
  • Data consistency
  • Lots of data

Serving media

  • High availability
  • High performance
  • Lots of static files
  • Very high traffic volume
  • Viewport optimization
  • Cacheable data

View sites, created by Wix editor

  • High availability
  • High performance
  • High traffic volume
  • Long tail

slide-8
SLIDE 8

Wix Segmentation

  • 1. Editor Segment
  • 2. Media Segment
  • 3. Public Segment

Networking

slide-9
SLIDE 9

Making SOA Guidelines

  • Each service has its own database (if one is needed)
  • Only one service can write to a specific DB
  • There may be additional read-only services that directly access the DB (for performance reasons)
  • Services are stateless
  • No DB transactions
  • Cache is not a building block, but an optimization

slide-10
SLIDE 10
  • 1. Editor Segment
slide-11
SLIDE 11

Editor Server

  • Immutable JSON pages (~2.5M / day)
  • Site revisions
  • Active-standby MySQL across data centers

Diagram: Editor Server, MySQL Active Sites, MySQL Archive

slide-12
SLIDE 12
slide-13
SLIDE 13

Protect The Data

  • Protect against DB outage with fast recovery = replication
  • Protect against data poisoning/corruption = revisions / backup
  • Make the data available at all times = data distribution to multiple locations / providers

slide-14
SLIDE 14

Saving Editor Data

Diagram: Browser, Editor Server, Static Grid, MySQL Active Sites, MySQL Archive, Google Cloud Storage, Archive (Amazon), Archive (Google). Arrow labels: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page, DC replication.

slide-15
SLIDE 15

Self Healing Process

Diagram: Browser, Editor Server, Static Grid, MySQL Active Sites, MySQL Archive, Google Cloud Storage, Archive (Amazon), Archive (Google). Arrow labels: Save Page(s), 200 OK, Save Page, Upload, Notify, Download Page, DC replication.

slide-16
SLIDE 16

No DB Transactions

  • Save each page (JSON) as an atomic operation
  • Page ID is a content-based hash (immutable/idempotent)
  • Finalize transaction by sending site header (list of pages)
  • Can generate orphaned pages, not a problem in practice
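The save flow on this slide can be sketched roughly as follows. This is a minimal illustration, not Wix's implementation: the in-memory dictionaries, the function names, and the choice of SHA-256 over canonical JSON are all assumptions (the slide only says the page ID is a content-based hash).

```python
import hashlib
import json

# Hypothetical in-memory stores standing in for the page and site-header tables.
pages = {}         # page_id -> serialized page JSON
site_headers = {}  # site_id -> list of page IDs (the current revision)


def page_id(page: dict) -> str:
    """Derive the ID from the page content, so equal content gives an equal ID.

    SHA-256 over canonical JSON is an assumption made for this sketch.
    """
    canonical = json.dumps(page, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save_page(page: dict) -> str:
    """Atomic, idempotent write: saving the same content twice changes nothing."""
    pid = page_id(page)
    pages.setdefault(pid, json.dumps(page, sort_keys=True))
    return pid


def save_site(site_id: str, site_pages: list) -> list:
    """Save every page first, then finalize by writing the site header last."""
    ids = [save_page(p) for p in site_pages]
    site_headers[site_id] = ids  # the single write that makes the revision visible
    return ids
```

If the process stops before the header is written, the already saved pages are simply unreferenced, which is the orphaned-page case the slide describes as harmless.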

slide-17
SLIDE 17
  • 2. Media Segment
slide-18
SLIDE 18

Prospero – Wix Media Storage

  • 800TB user media files
  • 3M files uploaded daily
  • 500M metadata records
  • Dynamic media processing
      • Picture resize, crop and sharpen “on the fly”
      • Watermark
      • Audio format conversion
slide-19
SLIDE 19

Prospero

  • Eventually consistent distributed file system
  • Multi-datacenter aware
  • Automatic fallback across DCs
  • Runs on commodity servers & cloud

slide-20
SLIDE 20

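Prospero – Wix Media Manager

Diagram: flow of a "get image.jpg" request between the CDN, the Austin and Tampa data centers, and Google Cloud. If the image is not in the CDN, the request goes to storage, with a first fallback and a second fallback.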
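The fallback behaviour in the diagram can be sketched as an ordered list of sources that are tried in turn. This is an illustration under assumptions: the `Fetcher` type, `fetch_with_fallback`, and `make_source` are hypothetical names, and the order in which the real system consults Austin, Tampa, and Google Cloud is not stated on the slide.

```python
from typing import Callable, Dict, Optional, Sequence

# A fetcher returns the file bytes, or None when that location does not hold the file.
Fetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(name: str, sources: Sequence[Fetcher]) -> Optional[bytes]:
    """Try each source in order and return the first hit."""
    for fetch in sources:
        try:
            data = fetch(name)
        except OSError:
            # Location unreachable (ConnectionError is an OSError): try the next one.
            continue
        if data is not None:
            return data
    return None


def make_source(files: Dict[str, bytes], down: bool = False) -> Fetcher:
    """Build a fake storage location for the example."""
    def fetch(name: str) -> Optional[bytes]:
        if down:
            raise ConnectionError("location unreachable")
        return files.get(name)
    return fetch
```

A call such as `fetch_with_fallback("image.jpg", [primary, first_fallback, second_fallback])` returns the bytes from the first location that is reachable and has the file, and `None` if none does.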

slide-21
SLIDE 21
  • 3. Public Segment
slide-22
SLIDE 22

Public Segment Roles

  • Routing (resolve URLs)
  • Dispatching (to a renderer)
  • Rendering (HTML, XML, TXT)

Diagram: a request for www.example.com reaches the Public Server, which dispatches to one of: HTML Renderer, HTML SEO Renderer, Flash Renderer, Flash SEO Renderer, Sitemap Renderer, Robots.txt Renderer
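The routing and dispatching roles might look like the sketch below. Only the renderer names come from the slide. The routing table contents, the crawler markers, the `/sitemap.xml` path, and the rule that SEO renderers serve crawlers are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Hypothetical routing table: host name -> site ID.
ROUTES = {"www.example.com": "site-123"}

# Assumed crawler markers; the slide does not say how SEO traffic is detected.
BOT_MARKERS = ("googlebot", "bingbot")


def resolve_site(host: str) -> Optional[str]:
    """Routing: resolve the requested host to a site ID."""
    return ROUTES.get(host.lower())


def choose_renderer(path: str, user_agent: str, site_type: str = "html") -> str:
    """Dispatching: pick a renderer name for the request."""
    if path == "/robots.txt":
        return "Robots.txt Renderer"
    if path == "/sitemap.xml":
        return "Sitemap Renderer"
    is_bot = any(marker in user_agent.lower() for marker in BOT_MARKERS)
    if site_type == "flash":
        return "Flash SEO Renderer" if is_bot else "Flash Renderer"
    return "HTML SEO Renderer" if is_bot else "HTML Renderer"


def handle(host: str, path: str, user_agent: str,
           site_type: str = "html") -> Optional[Tuple[str, str]]:
    """Route, then dispatch; returns (site_id, renderer) or None for unknown hosts."""
    site_id = resolve_site(host)
    if site_id is None:
        return None
    return site_id, choose_renderer(path, user_agent, site_type)
```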

slide-23
SLIDE 23

Public SLA

Response time <100ms at peak traffic

slide-24
SLIDE 24

Publish A Site

  • Publish site header (a map of pages for a site)
  • Publish routing table

Diagram: Editor Segment publishes site header / routes to Public Segment

slide-25
SLIDE 25

Built For Speed

  • Minimize out-of-service hops (2 DB, 1 RPC)
  • Lookup tables are cached in memory, updated every 5 minutes
  • Denormalized data – optimize for read by primary key (MySQL)
  • Minimize business logic
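An in-memory lookup table that is reloaded every five minutes could be sketched like this. The `RefreshingLookup` class and its parameters are hypothetical, and refreshing lazily on read is a simplification chosen for the example; a background refresh would serve the same purpose.

```python
import time
from typing import Callable, Dict, Optional


class RefreshingLookup:
    """In-memory lookup table that reloads itself once its contents are too old."""

    def __init__(self, load: Callable[[], Dict[str, str]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._load = load          # e.g. a function that reads the table from the DB
        self._ttl = ttl_seconds    # 300 s matches the 5 minutes on the slide
        self._clock = clock        # injectable so the behaviour can be tested
        self._table = load()
        self._loaded_at = clock()

    def get(self, key: str, default: Optional[str] = None) -> Optional[str]:
        now = self._clock()
        if now - self._loaded_at >= self._ttl:
            # Stale: reload the whole table, then answer from memory.
            self._table = self._load()
            self._loaded_at = now
        return self._table.get(key, default)
```

Between reloads every lookup is a plain dictionary read, so the request path makes no extra out-of-service hop for this data.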

slide-26
SLIDE 26

How a Page Gets Rendered

  • Bootstrap HTML template that contains only data
  • Only JavaScript imports
  • JSON data (site-header + dynamic data)
  • No “real” HTML view
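A bootstrap page of this kind can be sketched as a template that carries only script imports and embedded JSON. The function name, the `site-data` element ID, and the script URL in the usage note are hypothetical; the slide does not show the real template.

```python
import html
import json


def bootstrap_html(site_header: dict, script_urls: list) -> str:
    """Build a page with no rendered view: only JSON data and JavaScript imports."""
    # Escape "</" so a string inside the JSON cannot close the <script> element early.
    # "\/" is a valid JSON escape, so the data still parses to the same value.
    data = json.dumps(site_header).replace("</", "<\\/")
    scripts = "\n".join(
        f'<script src="{html.escape(url, quote=True)}"></script>'
        for url in script_urls
    )
    return (
        "<!DOCTYPE html>\n<html>\n<head>\n"
        f'<script id="site-data" type="application/json">{data}</script>\n'
        f"{scripts}\n"
        "</head>\n<body></body>\n</html>"
    )
```

For example, `bootstrap_html({"pages": ["home"]}, ["/static/viewer.js"])` yields an empty `<body>`; the imported client code is expected to read the JSON and build the view in the browser.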

slide-27
SLIDE 27

Offload rendering work to the browser

slide-28
SLIDE 28

The average Intel Core i750 can push up to 7 GFLOPS without overclocking

slide-29
SLIDE 29

Why JSON?

  • Easy to parse in JavaScript and Java/Scala
  • Fairly compact text format
  • Highly compressible (5:1 even for small payloads)
  • Easy to fix rendering bugs (just deploy new client code)
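The compressibility claim is easy to check against your own payloads. The sketch below measures the ratio with gzip, which is an assumption (the slide does not name a compression scheme), and the sample payload is made up for illustration.

```python
import gzip
import json


def compression_ratio(payload: dict) -> float:
    """Return uncompressed size divided by gzip-compressed size for a JSON payload."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return len(raw) / len(gzip.compress(raw))


# Made-up site header with repetitive structure, as page lists tend to have.
sample = {
    "pages": [
        {"id": f"page-{i}", "title": f"Page {i}", "layout": "two-column"}
        for i in range(50)
    ]
}

ratio = compression_ratio(sample)
```

The actual ratio depends on how repetitive the data is, so measure real site headers rather than relying on this sample.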

slide-30
SLIDE 30

Minimum Number of Public Servers Needed to Serve 45M Sites

4

slide-31
SLIDE 31

Public SLA

Be available 99.99999%
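To put the availability target in perspective, the downtime it allows can be computed directly. The helper name is hypothetical and a 365-day year is assumed.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds in a 365-day year


def allowed_downtime_seconds(availability_percent: float) -> float:
    """Yearly downtime budget implied by an availability percentage."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


budget = allowed_downtime_seconds(99.99999)
```

At 99.99999% the budget works out to roughly 3.15 seconds of downtime per year.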

slide-32
SLIDE 32

Serving a Site – Sunny Day

Diagram: Browser (http://example.wix.com), LB, Public Renderer, Archive, CDN, Statics. Arrow labels: HTTP Request, HTML, Store HTML to cache, Notify site view, Resources / Media.

slide-33
SLIDE 33

Serving a Site – DC Lost

Diagram: Browser (http://example.wix.com), two LB + Public Renderer pairs, Archive, CDN, Statics. Arrow labels: HTTP Request, Change DNS.

slide-34
SLIDE 34

Serving a Site – Public Lost

Diagram: Browser (http://example.wix.com), LB, Public Renderer, Archive, CDN, Statics. Arrow labels: HTTP Request, Get Cached HTML Version, HTML.
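The "store HTML to cache" step from the sunny-day flow and the "get cached HTML version" step from this one fit together roughly as sketched below. The `html_cache` dictionary and the `serve` function are hypothetical stand-ins; which component performs the fallback in the real system is not stated on the slides.

```python
from typing import Callable, Dict, Optional

# Hypothetical stand-in for the archive that keeps the last rendered HTML per URL.
html_cache: Dict[str, str] = {}


def serve(url: str, render: Callable[[str], str]) -> Optional[str]:
    """Serve freshly rendered HTML, falling back to the cached copy on failure."""
    try:
        page = render(url)
    except Exception:
        # Renderer unavailable (broad catch for this sketch only):
        # return the last cached version, or None if the page was never cached.
        return html_cache.get(url)
    html_cache[url] = page  # sunny day: remember the HTML for later fallback
    return page
```

The first successful request populates the cache, so a later outage of the renderer still returns the last known HTML for that URL.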

slide-35
SLIDE 35

Living in the Browser

Diagram: Browser (http://example.wix.com), LB, Public Renderer, Editor, Archive, CDN, Statics. Arrow labels: HTTP Request, HTML, JSON / Media, Fallback (shown twice).

slide-36
SLIDE 36

Summary

  • Identify your critical path and concerns
  • Build redundancy in critical path (for availability)
  • De-normalize data (for performance)
  • Minimize out-of-process hops (for performance)
  • Take advantage of client’s CPU power

slide-37
SLIDE 37
slide-38
SLIDE 38

Aviran Mordo Head of Back-End Engineering @ Wix

@aviranm linkedin.com/in/aviran aviransplace.com

Q&A

http://goo.gl/Oo3lGr

slide-39
SLIDE 39