CS227 - Silvia Zuffi - Sunil Mallya



slide-1
SLIDE 1

CS227

  • Silvia Zuffi
  • Sunil Mallya

Slides credits: official membase meetings

slide-2
SLIDE 2

Schedule

  • Overview (Silvia)
  • History (Silvia)
  • Data Model (Silvia)
  • Architecture (Sunil)
  • Transaction support (Sunil)
  • Case studies (Silvia)
slide-3
SLIDE 3

Overview, history and data model

slide-4
SLIDE 4

Overview: what is Membase?

  • A distributed key-value database optimized for storing data behind web applications

  • Simple - Fast - Elastic (by design)
slide-5
SLIDE 5

Overview: before

Application Scales Out

Just add more commodity web servers

slide-6
SLIDE 6

Overview: with Membase

[Figure labels: Application user, Web application server, Membase Servers, Data center, Administrator console]

slide-7
SLIDE 7

Overview: after

Application Scales Out

Just add more commodity web servers

Database Scales Out

Just add more commodity data servers

slide-8
SLIDE 8

History

  • Membase was developed by NorthScale, founded by several leaders of the memcached project
  • June 2010: NorthScale and project co-sponsors Zynga and NHN create a new project (membase.org).
  • February 8, 2011: Membase merged with CouchOne. The merged project will be known as Couchbase

slide-9
SLIDE 9

History

James Phillips, senior Vice President

slide-10
SLIDE 10

History

  • Initial release: March 2010
  • Stable release: 1.6.4.1 (28 Dec 2010)
slide-11
SLIDE 11

Data Model

  • Key-value
  • Motivation: applications with natural keys to access data (e.g.: username.birthday)

slide-12
SLIDE 12

Key-value

Key → Value

Data types: byte[], Google protobuf, Thrift, Avro

“Any customer can have a car painted any colour that he wants so long as it is black.”
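A minimal sketch of the key-value idea above: a natural key such as username.birthday addresses a value that the store treats as plain bytes. The helper names (make_key, encode_value, decode_value) and the dict standing in for the server are assumptions made for illustration, and JSON is used only as a standard-library stand-in for protobuf, Thrift or Avro.

```python
import json

def make_key(*parts):
    """Join the parts of a natural key with dots, e.g. username.birthday."""
    return ".".join(parts)

def encode_value(obj):
    """Serialize an object to bytes (JSON stands in for protobuf/Thrift/Avro)."""
    return json.dumps(obj, sort_keys=True).encode("utf-8")

def decode_value(raw):
    """Turn the stored bytes back into a Python object."""
    return json.loads(raw.decode("utf-8"))

# Hypothetical stand-in for the server: the store only ever sees bytes.
store = {}

key = make_key("alice", "birthday")
store[key] = encode_value({"year": 1990, "month": 4, "day": 12})
birthday = decode_value(store[key])
```

Whatever the serialization format, the store itself never interprets the value; the application decides how to encode and decode it.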

slide-13
SLIDE 13

Operators and Programming Languages

  • GET/SET

– getl: get with lock; the lock carries an expiration time

  • Increment/Decrement
  • Append/Prepend
  • Practically every language and application framework is supported (“memcapable”)
  • Data manager: written in C, C++
  • Cluster manager: Erlang/OTP
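The operators listed on this slide can be modelled with a small in-memory class. This is only a sketch of their semantics: TinyStore and its method names are hypothetical, it is not a Membase client, and the clamping of decrement at zero is an assumption that mirrors memcached behaviour.

```python
class TinyStore:
    """Hypothetical in-memory model of the operators listed above."""

    def __init__(self):
        self._data = {}  # key -> bytes

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Returns None when the key is missing.
        return self._data.get(key)

    def append(self, key, suffix):
        self._data[key] = self._data[key] + suffix

    def prepend(self, key, prefix):
        self._data[key] = prefix + self._data[key]

    def incr(self, key, delta=1):
        # Counters are stored as decimal text, as in memcached.
        new = int(self._data[key].decode()) + delta
        self._data[key] = str(new).encode()
        return new

    def decr(self, key, delta=1):
        # Assumption: like memcached, a counter never drops below zero.
        new = max(0, int(self._data[key].decode()) - delta)
        self._data[key] = str(new).encode()
        return new


ops = TinyStore()
ops.set("visits", b"10")
after_incr = ops.incr("visits", 5)
after_decr = ops.decr("visits", 20)

ops.set("log", b"b")
ops.append("log", b"c")
ops.prepend("log", b"a")
```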
slide-14
SLIDE 14

Transactions

  • Based on CAS operations
  • Compare and Swap
  • special instruction that atomically compares the content of a memory location with a given value and, only if they match, replaces it with a new value

[Figure: User 1: Fail! User 2: Success]
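A minimal sketch of how a CAS-based update behaves when two users race on the same item. CasStore, gets and cas are hypothetical names chosen for the example (loosely following memcached vocabulary), and the sketch is single-threaded: it shows the version check, not real atomicity.

```python
class CasStore:
    """Hypothetical store where every write bumps a per-key version (CAS id)."""

    def __init__(self):
        self._data = {}      # key -> (value, cas_id)
        self._next_cas = 1

    def set(self, key, value):
        self._data[key] = (value, self._next_cas)
        self._next_cas += 1

    def gets(self, key):
        """Return (value, cas_id) so the caller can attempt a CAS later."""
        return self._data[key]

    def cas(self, key, new_value, cas_id):
        """Write only if nobody changed the item since cas_id was read."""
        _, current = self._data[key]
        if current != cas_id:
            return False     # stale read: reject the write
        self.set(key, new_value)
        return True


cas_store = CasStore()
cas_store.set("score", 10)

# Both users read the same version of the item.
value1, cas1 = cas_store.gets("score")
value2, cas2 = cas_store.gets("score")

# User 2 writes first and succeeds; User 1 then holds a stale CAS id and fails.
user2_ok = cas_store.cas("score", value2 + 5, cas2)
user1_ok = cas_store.cas("score", value1 + 1, cas1)
```

After a failed CAS, the usual pattern is to read the item again and retry with the fresh CAS id.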

slide-15
SLIDE 15

Architecture and transaction support

slide-16
SLIDE 16

What is the problem being solved?

  • Highly interactive web apps
  • Small amount of data
  • Why doesn’t the traditional architecture work?
  • Is a NoSQL “DB” really a DB?
  • Can a database do what a NoSQL DB does?

– If yes, why not use a database?
– What is it that is really different?

  • Denormalized data
slide-17
SLIDE 17

Membase - A practical path to “NoSQL” adoption

slide-18
SLIDE 18

Physical Structures

  • CA type system: scales linearly and always maintains consistency
  • Clustering based on Erlang/OTP
  • Data is persistent: it is written to disk.

slide-19
SLIDE 19

Elasticity

slide-20
SLIDE 20

Elasticity

slide-21
SLIDE 21

Elasticity

slide-22
SLIDE 22

Architecture

[Figure: node architecture diagram; recoverable labels below]

DATA MANAGER

  • moxi
  • Ports: 11211, 11210
  • memcached protocol listener/sender
  • engine interface
  • membase storage engine
  • memcapable 1.0, memcapable 2.0

CLUSTER MANAGER (Erlang/OTP)

  • http: REST management API/Web UI
  • On each node: Heartbeat, Process monitor, Configuration manager
  • One per cluster: Global singleton supervisor, Rebalance orchestrator, Node health monitor, vBucket state and replication manager
  • HTTP, distributed erlang, erlang port mapper

slide-23
SLIDE 23

vBuckets

Any given vbucket will be in one of the following states on any given server:

http://blog.membase.com/scaling-memcached-vbuckets

slide-24
SLIDE 24

vBuckets mappings
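
A rough sketch of the vBucket mapping idea: a key is hashed to a vBucket, and a separate table maps each vBucket to a server. The number of vBuckets, the server names and the use of CRC32 are assumptions made for this example, not values taken from the slides.

```python
import zlib

NUM_VBUCKETS = 8                                  # assumption: kept small for the sketch
SERVERS = ["server-a", "server-b", "server-c"]    # hypothetical server names

# Table: vBucket id -> index into SERVERS.
VBUCKET_MAP = [i % len(SERVERS) for i in range(NUM_VBUCKETS)]

def vbucket_for(key):
    """Hash the key to a vBucket id; this never depends on the server list."""
    return zlib.crc32(key.encode("utf-8")) % NUM_VBUCKETS

def server_for(key):
    """Look up which server currently owns the key's vBucket."""
    return SERVERS[VBUCKET_MAP[vbucket_for(key)]]

def move_vbucket(vbucket, new_server_index):
    """Reassign one vBucket to another server by editing the table only."""
    VBUCKET_MAP[vbucket] = new_server_index


vb = vbucket_for("alice.birthday")
before = server_for("alice.birthday")

# Add a server and hand it this vBucket: the key's vBucket id does not change.
SERVERS.append("server-d")
move_vbucket(vb, SERVERS.index("server-d"))
after = server_for("alice.birthday")
```

Because keys map to vBuckets rather than directly to servers, adding a server only changes table entries; the hash of every key stays the same.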

slide-25
SLIDE 25

TAP

  • A generic, scalable method of streaming mutations from a given server

– As data operations arrive, they can be sent to arbitrary TAP receivers

  • Leverages the existing memcached engine interface, and the non-blocking IO interfaces to send data
  • Three modes of operation
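
The streaming idea behind TAP can be illustrated with a store that forwards every mutation to registered receivers. This is only a conceptual sketch: TappableStore and register_tap are hypothetical names, and nothing here reflects the actual TAP wire protocol or its three modes of operation.

```python
class TappableStore:
    """Hypothetical store that streams each mutation to registered receivers."""

    def __init__(self):
        self._data = {}
        self._receivers = []

    def register_tap(self, receiver):
        """receiver is any callable taking one (op, key, value) tuple."""
        self._receivers.append(receiver)

    def set(self, key, value):
        self._data[key] = value
        self._notify(("set", key, value))

    def delete(self, key):
        self._data.pop(key, None)
        self._notify(("delete", key, None))

    def _notify(self, mutation):
        # Forward the mutation to every receiver as it arrives.
        for receiver in self._receivers:
            receiver(mutation)


replica = {}
mutation_log = []

def apply_to_replica(mutation):
    """Example receiver: keeps a second copy of the data up to date."""
    op, key, value = mutation
    if op == "set":
        replica[key] = value
    else:
        replica.pop(key, None)

source = TappableStore()
source.register_tap(apply_to_replica)
source.register_tap(mutation_log.append)

source.set("a", 1)
source.set("b", 2)
source.delete("b")
```

The two receivers here (a replica and a log) hint at why such a stream is generic: the same mutations can feed replication, backup or any other consumer.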
slide-26
SLIDE 26

Replication & Failover

  • Multi-model replication support
  • Peer-to-peer replication support with underlying architecture supporting master-slave replication
  • Configurable replication count
  • Balance resource utilization with availability requirements
  • High-speed failover

– Fast failover to replicated items based upon request
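
A small sketch of a configurable replica count and of failover by promotion. The chain layout (one list per vBucket, active server first) and the function names are assumptions chosen for illustration; they are not taken from Membase itself.

```python
def build_chains(servers, num_vbuckets, replicas):
    """Return one chain per vBucket: [active, replica 1, ...].

    Assumes replicas < len(servers) so that every copy lands on a distinct server.
    """
    chains = []
    for vb in range(num_vbuckets):
        chain = [servers[(vb + i) % len(servers)] for i in range(replicas + 1)]
        chains.append(chain)
    return chains

def fail_over(chains, failed):
    """Drop the failed server from every chain; the next entry becomes active."""
    return [[s for s in chain if s != failed] for chain in chains]


chains = build_chains(["a", "b", "c"], num_vbuckets=4, replicas=1)
after_failover = fail_over(chains, "a")
```

A higher replica count tolerates more failures but stores more copies, which is the trade-off between resource utilization and availability mentioned in the list above.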

slide-27
SLIDE 27

Case studies

slide-28
SLIDE 28

Where does Membase fit?

  • Online applications with a lot of users
  • Applications with growing datasets which need quick access

slide-29
SLIDE 29

Users

  • Who uses Membase?
slide-30
SLIDE 30

Users: Zynga

Social game leader: FarmVille, Mafia Wars, Café World

Over 230 million monthly users

  • Membase Server is the 500,000 ops-per-second database behind FarmVille and Café World

slide-31
SLIDE 31

Case Study: Ad targeting

Aol website

  • Target users based on what they have bought and the sites they have visited
  • Target users based on registration information

slide-32
SLIDE 32

Case study: sharing network

slide-33
SLIDE 33

Case study: sharing network

  • 450 million consumers per month
  • ~850 thousand sites
  • 50+ social channels

slide-34
SLIDE 34

Case study: targeting

[Figure labels: Log Files, Search Keywords, Page Views, Sharing Behavior, HDFS, Map/Reduce, Content Analysis, Taxonomy, Membase, Ad Server, User]

slide-35
SLIDE 35

Case Study: Ad targeting

  • Data management challenges:

– analyze billions of user-related events, presented as a mix of structured and unstructured data, to infer demographic, psychographic and behavioral characteristics (“cookie profiles”)
– make hundreds of millions of cookie profiles available to their ad targeting platform quickly
– keep the user profiles updated
slide-36
SLIDE 36

Case Study: Ad targeting

slide-37
SLIDE 37

Thanks