NoSQL: Past, Present, Future
Eric Brewer
Professor, UC Berkeley
VP Infrastructure, Google
QCon SF, November 8, 2012

Charles Bachman, 1973 Turing Award
• Integrated Datastore (IDS)
• (Very) early “No SQL” database

“Navigational” Database
• Tight integration between code and data
• Database = linked groups of records (“CODASYL”)
• Pointers were physical names, today we hash
• Programmer as “navigator” through the links
• Similar to DOM engine, WWW, graph DBs
• Used for its high performance, but …
• Hard to program, maintain
• Hard to evolve the schema (embedded in code)
Wikipedia: “IDMS”

Why Relational? (1970s)
• Need a high-level model (sets)
• Separate the data from the code
• SQL is the (only) API
• Data outlasts any particular implementation because the model doesn’t change
• Goal: implement the top-down model well
• Led to transactions as a tool
• Declarative language leaves room for optimization

Also 1970s: Unix
• “The most important job of UNIX is to provide a file system” – original 1974 Unix paper
• Bottom-up world view
• Few, simple, efficient mechanisms
• Layers and composition
• “Navigational”
• Evolution comes from APIs, encapsulation
• NoSQL is in this Unix tradition
• Examples: dbm (1979 kv), gdbm, Berkeley DB, JDBM

Two Valid World Views
Relational View:
• Top Down
• Clean model, ACID Transactions
• Two kinds of developers: DB authors, SQL programmers
• Values: Clean Semantics, Set operations, Easy long-term evolution
• Venues: SIGMOD, VLDB
Systems View:
• Bottom Up
• Build on top, Evolve modules
• One kind of programmer, Integrated use
• Values: Good APIs, Flexibility, Range of possible programs
• Venues: SOSP, OSDI

NoSQL in Context
• Large reusable storage component
• Systems values:
  – Layered, ideally modular APIs
  – Enable a range of systems and semantics
• Some things to build on top over time:
  – Multi-component transactions
  – Secondary indices
  – Evolution story
  – Returning sets of data, not just values

How did I get here …
• Modern cluster-based server (1995)
  – Scalable, highly available, commodity clusters
  – Inktomi search engine (1996), proxy cache (1998)
• But didn't use a DBMS
  – Informix was 10x slower for the search engine
  – Instead, custom servers on top of file systems
• Led to “ACID vs. BASE” spectrum (1997)
  – Basically Available, Soft State, Eventual Consistency
  – … but BASE was not well received … (ACID was sacred)

Genesis of the CAP Theorem
• I felt the design choices we made were “right”:
  – Sufficient (and faster)
  – Necessary (consistency hinders performance/availability)
• Started to notice other systems that made similar decisions: Coda, Bayou
• Developed CAP while teaching in 1998
  – Appears in 1999
  – PODC keynote in 2000, led to Gilbert/Lynch proof
• … but nothing changed (for a while)

CAP Theorem l Choose at most two for any shared-data system: – C onsistency (linearizable) – A vailability (system always accepts updates ) – P artition Tolerance l Partitions are inevitable for the wide area – => consistency vs. availability l I think this was the right phrasing for 2000 – But probably not for 2010

Things CAP does NOT say..
1. Give up on consistency (in the wide area)
  • Inconsistency should be the exception
  • Many projects give up more than needed
2. Give up on transactions (ACID)
  • Need to adjust “C” and “I” expectations (only)
3. Don’t use SQL
  • SQL is appearing in “NoSQL” systems
  • Declarative languages fit well with CAP

CAP & ACID
No partitions => Full ACID
With partitions:
Atomic:
  • Partitions should occur between operations (!)
  • Each side should use atomic ops
Consistent:
  • Temporarily violate this (e.g. no duplicates?)
Isolation:
  • Temporarily lose this by definition
Durable:
  • Should never forfeit this (and we need it later)

Single-site transactions
Atomic transaction, but only within one site
No distributed transactions
Google BigTable:
Multi-column row operations are atomic … but that part of the row always within one site
CAP allows this just fine:
• Modulo no LAN partitions (reasonable)
• Google MegaStore spans multiple sites
• Slow writes
• Paxos helps availability, but still subject to partitions

Focus on partitions
Claim 1: partitions are temporary
• Provide degraded service for a while
• Then RECOVER
Claim 2: can detect “partition mode”
• Timeout => effectively partitioned
• Commit locally? (A) => partition started
• Fail? (C)
• Retry just means postpone the decision a bit
Claim 3: impacts lazy vs. eager consistency
• Lazy => can’t recover consistency during partition
• Can only choose A in some sense
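A minimal sketch of Claim 2, assuming a replica that forwards each write to a peer through a caller-supplied function. The names `Replica`, `Choice`, `max_retries` and `send_to_peer` are hypothetical and do not come from the talk. A timeout forces the decision: commit locally (choose A, which starts partition mode) or fail the request (choose C). Retrying only postpones that decision.

```python
from enum import Enum


class Choice(Enum):
    AVAILABILITY = "A"  # commit locally when the peer is unreachable
    CONSISTENCY = "C"   # fail the request instead


class Replica:
    def __init__(self, choice, max_retries=2):
        self.choice = choice
        self.max_retries = max_retries
        self.partition_mode = False
        self.local_log = []  # operations committed only on this side

    def write(self, op, send_to_peer):
        # Retries only postpone the decision; they do not remove it.
        for _ in range(self.max_retries + 1):
            try:
                send_to_peer(op)
                return "committed"
            except TimeoutError:
                continue
        # Every attempt timed out: treat the peer as partitioned away.
        if self.choice is Choice.AVAILABILITY:
            self.partition_mode = True   # partition mode starts here
            self.local_log.append(op)    # kept for partition recovery
            return "committed-locally"
        return "failed"
```

The local log is what makes later recovery possible, which is why the A branch records the operation before acknowledging it.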

Life of a Partition
[Figure: timeline of operations on state S]
• Serializable operations on state S
• Available (no partitions)

Life of a Partition
[Figure: timeline of operations on state S; once the partition starts (partition mode), the two sides hold states S1 and S2]
• Both sides available, locally linearizable … but (maybe) globally inconsistent
• No ACID “I”: concurrent ops on both sides
• No ACID “C” either (only local integrity checks)

Life of a Partition
[Figure: timeline of operations on state S; once the partition starts (partition mode), the two sides hold states S1 and S2]
• Commit locally?
• Externalize output? (A says yes)
• Execute side effects? (launch missile?)

Life of a Partition
[Figure: state S splits into S1 and S2 during partition mode; when the partition ends, the sides must converge to a state S' (?)]
Need “Partition Recovery”
• Goal: restore consistency (ACID)
• Similar to traditional recovery
• Move to some self-consistent state
• Roll forward the “log” from each side

Partition Recovery
[Figure: state S splits into S1 and S2 during partition mode, then merges into S' (?)]
1) Merge State (S’)
• Easy: last writer wins
• General: S’ = f (S1 log, S2 log) // the paths matter
2) Detect bad things that you did
• Side effects? Incorrect response?
3) Compensate for bad actions
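The “easy” merge in step 1 can be sketched as one instance of S’ = f(S1 log, S2 log). This is an illustration only: the log entry shape `(timestamp, key, value)`, the function name, and the assumption that both sides have comparable timestamps are all hypothetical, not from the talk.

```python
def merge_last_writer_wins(base, log1, log2):
    """Replay both sides' logs over the pre-partition state `base`.

    Each log entry is a (timestamp, key, value) tuple. Entries are applied
    in timestamp order, so the latest write to each key wins.
    """
    merged = dict(base)
    # sorted() is stable: on equal timestamps log1 entries are applied first,
    # so log2 wins ties. That tie-break is arbitrary (an assumption here).
    for _ts, key, value in sorted(log1 + log2, key=lambda entry: entry[0]):
        merged[key] = value
    return merged


s = {"x": 0, "y": 0}                   # state S before the partition
s1_log = [(1, "x", 10), (4, "y", 7)]   # writes accepted on side 1
s2_log = [(2, "x", 20)]                # writes accepted on side 2
s_prime = merge_last_writer_wins(s, s1_log, s2_log)
# s_prime == {"x": 20, "y": 7}
```

Last writer wins silently discards the earlier write to `x`. The general form exists because the paths matter: a merge function that sees both logs can keep information that a comparison of final values would lose.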

Partition Recovery
[Figure: state S splits into S1 and S2 during partition mode, then merges into S' (?)]
Amazon shopping cart:
1) Merge by union of items
2) Only bad action is deleted item reappears
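A small sketch of the union merge, modelling a cart as a Python set of item names. This is an illustration under that assumption, not a description of Amazon's actual implementation. It reproduces the one bad action named above: an item deleted on one side reappears after the merge.

```python
def merge_carts(cart1, cart2):
    """Merge two divergent carts by taking the union of their items."""
    return cart1 | cart2


base = {"book", "pen"}       # cart before the partition
side1 = base - {"pen"}       # on side 1 the customer deletes the pen
side2 = base | {"lamp"}      # on side 2 the customer adds a lamp
merged = merge_carts(side1, side2)
# merged == {"book", "pen", "lamp"}: the deleted pen is back
```

Union never loses an added item, which is the safer failure for a shopping cart. The cost is that deletions are not preserved across the partition.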

ATM “Stand In” Time
• ATMs have “partition mode”
  – … chooses A over C
  – Commutative atomic ops: incr, decr
  – When partition heals, the end balance is correct
• Partition recovery:
  – Detect: intermediate wrong decisions
  – Side effects (like “issue cash”) might be wrong
  – Exceptions are not commutative (below zero?)
  – Compensate via overdraft penalty
• Bound “wrongness” during partition: (less A)
  – Limit deficit to (say) $200
• When you remove $200, “decr” becomes unavailable
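The stand-in behaviour above can be sketched as follows. The class and function names and the 25-unit overdraft fee are hypothetical. Recovery here checks only the final balance, a simplification of the “detect intermediate wrong decisions” step.

```python
class StandInATM:
    """An ATM in partition mode: it cannot see the real balance."""

    def __init__(self, limit=200):
        self.limit = limit     # bound on "wrongness" during the partition
        self.withdrawn = 0     # total cash issued while partitioned
        self.log = []          # commutative ops to replay at recovery

    def withdraw(self, amount):
        if self.withdrawn + amount > self.limit:
            return False       # "decr" is unavailable past the limit
        self.withdrawn += amount
        self.log.append(("decr", amount))
        return True

    def deposit(self, amount):
        self.log.append(("incr", amount))


def recover(balance, log, overdraft_fee=25):
    """Replay the ATM log on the bank's balance, then compensate."""
    for op, amount in log:
        balance += amount if op == "incr" else -amount
    penalized = balance < 0    # the delayed exception, detected afterwards
    if penalized:
        balance -= overdraft_fee
    return balance, penalized


atm = StandInATM()
atm.withdraw(150)   # allowed
atm.withdraw(100)   # refused: would exceed the 200 limit
atm.withdraw(50)    # allowed, limit now reached
```

Because incr and decr commute, the replay order does not affect the end balance. Only the below-zero exception depends on order, which is why it is handled by compensation afterwards, not by blocking the withdrawal.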

Define your “Partition Strategy”
1) Define detection (start Partition Mode)
2) Partition Mode operation: Determine which operations can proceed
• Can depend on args/access level/state
• Simple example: no updates, read only
• ATM: withdrawal allowed only up to $200 total
3) Partition recovery
• Detect problems via joint logs
• Execute compensations
• Every allowed op should have a compensation
• Calculate merged state (last)
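One way to make step 2 concrete is a per-operation policy table. The sketch below is an assumption-laden illustration: `OpPolicy`, `POLICIES`, the operation names and the compensation names are all hypothetical. Each operation carries a predicate over its arguments and state plus the name of its compensation, so the rule “every allowed op should have a compensation” can be checked mechanically.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class OpPolicy:
    # Predicate over the call's args/state; True means the op may proceed.
    allowed: Callable[[dict], bool]
    # Name of the compensating action to run at recovery (None if missing).
    compensation: Optional[str]


# Hypothetical policy table for an ATM-like service.
POLICIES = {
    "read_balance": OpPolicy(lambda ctx: True, "none_needed"),
    "withdraw": OpPolicy(
        lambda ctx: ctx["withdrawn_so_far"] + ctx["amount"] <= 200,
        "charge_overdraft_penalty",
    ),
}


def can_proceed(op, ctx, partitioned):
    """Outside partition mode everything runs; inside, consult the table."""
    if not partitioned:
        return True
    policy = POLICIES.get(op)
    return policy is not None and policy.allowed(ctx)


def ops_without_compensation():
    """List ops that violate 'every allowed op should have a compensation'."""
    return [name for name, p in POLICIES.items() if p.compensation is None]
```

Operations absent from the table are refused during a partition, which gives the “no updates, read only” default unless an update is explicitly listed.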

Compensation Happens
• Claim: Real world = weak consistency + delayed exceptions + compensation
  – Charge you twice => credit your account
  – Overbook an airplane => compensate passengers that miss out
• This concept is missing from wide-area data systems
  – Except for some workflow
• Compensating transactions can be human response
  – “We just realized we sent you two of the same item”
  – Should be logged just like any other xact

CAP 2010
[Figure: availability (0% to 100%) plotted against consistency (eventual consistency, single-copy consistency, transactions). Labels: BASE, NoSQL, Dynamo, Sherpa, BigTable, ACID databases. Annotation: “CAP only disallows this area!”]

Summary l Net effect of CAP: – Freedom to explore a wide diverse space – Merging of systems and DB approaches l While there are no partitions: – Can have both A and C, and full ACID xact l Choosing A => focus on partition recovery – Need a before, during, and after strategy – Delayed Exceptions seem promising – Applying the ideas of compensation is open
