STRONG WHEN BAD THINGS DO NOT COME IN THREES ZECHAO SHANG JEFFREY - PowerPoint PPT Presentation



  1. MY WEAK CONSISTENCY IS STRONG WHEN BAD THINGS DO NOT COME IN THREES ZECHAO SHANG JEFFREY XU YU

  2. DISCLAIMER: NOT AN OLTP TALK HOW TO GET ALMOST EVERYTHING FOR NOTHING

  3. SHARED-MEMORY SYSTEM IS BACK • Fine-grained mini-jobs • Hard to batch • Low-latency in-place updates • Hard to partition the data space • Applications • Machine learning (SGD and others) • Graph computing (Vertex-centric systems) • Streaming (S-Store) [Diagram labels: shared data; reads; mini-jobs; compute (transaction); in-place update; time. Serializability theory: atomic and isolation → correct behavior]

  4. • SCALABILITY, LATENCY & THROUGHPUT • DATA CONSISTENCY, JOB ISOLATION

  5. DO WE NEED IT? • Approach: remove data consistency controller • Pros: super-fast, yeeeeh! • Cons: could cause data consistency issues • HogWild! & Parameter Server & others • Correctness proofs rely on special properties • Convexity • Lipschitz-continuity • Bounded staleness • PBS: Probabilistic Bounded Staleness • Weak consistency actually provides strong semantics • Single key only • Probabilistic [Diagram: Run the transactions → Serializability theory (atomic and isolation) → Correct behavior, like serial]

  6. THE DATABASE WAY • Fewer assumptions, more applications • Non-convex (deep learning) • Discrete & combinatorial (graph problems) [Diagram: Run the transactions → Serializability theory (atomic and isolation) → Correct behavior, like serial. This talk: Run the transactions → Weak consistency → ? (behavior + anomalies)]

  7. DATA CONFLICT GRAPH • Each vertex represents a txn • An edge if two txns share data • Potential conflicts [Figure: example conflict graph on vertices 1 to 8]
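
A minimal Python sketch of the construction described on this slide, under the assumption that each transaction is known by the set of data items it touches. The function name `build_conflict_graph` and the example workload are hypothetical illustrations, not taken from the talk.

```python
from collections import defaultdict
from itertools import combinations


def build_conflict_graph(txn_items):
    """Return {txn: set of txns that share at least one data item with it}."""
    # Invert the mapping: data item -> transactions that touch it.
    touched_by = defaultdict(set)
    for txn, items in txn_items.items():
        for item in items:
            touched_by[item].add(txn)

    # Every pair of transactions touching the same item gets an edge.
    graph = {txn: set() for txn in txn_items}
    for txns in touched_by.values():
        for a, b in combinations(sorted(txns), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


# Hypothetical workload: txn id -> data items it reads or writes.
txn_items = {1: {"x", "y"}, 2: {"y", "z"}, 3: {"z"}, 4: {"w"}}
graph = build_conflict_graph(txn_items)
```

An edge here only marks a potential conflict: whether it matters depends on whether the two transactions actually run at the same time.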

  8. GOOD AND BAD • Good No. 1: serial execution [Figure: conflict graph on vertices 1 to 8; timeline running txns 1, 2, 3, 4, 5, 6, 7, 8 one after another]

  9. GOOD AND BAD • Good No. 2: a nice scheduler • No direct edge in concurrent txns [Figure: conflict graph on vertices 1 to 8; timeline of a concurrent schedule and its dependency graph]

  10. GOOD AND BAD • Bad: potential conflict • Bad degree (for a transaction) • # of potential conflict transactions • Concurrent • Share same data (adjacent in graph) [Figure: conflict graph on vertices 1 to 8; three example schedules over time, annotated BD=2, BD=1, and "BD=0 always"]
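
The bad degree defined on this slide can be sketched as follows. This is an illustration under stated assumptions: each transaction runs over a half-open time interval `(start, end)`, two transactions are "concurrent" when their intervals overlap, and the conflict graph is given as a symmetric adjacency map. The names `bad_degrees`, `concurrent`, and `serial` are hypothetical.

```python
def bad_degrees(graph, intervals):
    """Bad degree per txn: neighbours in the conflict graph that overlap in time.

    graph:     {txn: set of adjacent txns} (assumed symmetric)
    intervals: {txn: (start, end)} execution interval of each txn
    """
    def overlaps(a, b):
        (s1, e1), (s2, e2) = intervals[a], intervals[b]
        return s1 < e2 and s2 < e1

    return {
        txn: sum(1 for other in graph[txn] if overlaps(txn, other))
        for txn in graph
    }


# Hypothetical conflict graph: a path 1 - 2 - 3 plus an isolated txn 4.
graph = {1: {2}, 2: {1, 3}, 3: {2}, 4: set()}

# A concurrent schedule: txn 2 overlaps both of its neighbours.
concurrent = bad_degrees(graph, {1: (0, 2), 2: (1, 3), 3: (2, 4), 4: (0, 4)})

# A serial schedule: nothing overlaps, so every bad degree is 0.
serial = bad_degrees(graph, {1: (0, 1), 2: (1, 2), 3: (2, 3), 4: (3, 4)})
```

Txn 4 overlaps everything in the concurrent schedule but shares no data, so its bad degree stays 0; both conditions (concurrent and adjacent) must hold.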

  11. BAD DEGREE AND CORRECTNESSES

      MAX BAD DEGREE   CONCURRENCY CONTROL   TXN SEMANTICS     RESULTS ACCURACY
      0                NO                    SERIALIZABILITY   CORRECT
      >0               NO                    NO                DON’T KNOW
      >0               YES                   SERIALIZABILITY   CORRECT

  12. BAD THINGS DO NOT COME IN 3 (BN3) • BN3: bad degree ≤ 1 for all transactions

      MAX BAD DEGREE   CONCURRENCY CONTROL   TXN SEMANTICS     RESULTS ACCURACY
      0                NO                    SERIALIZABILITY   CORRECT
      1 (BN3)          NO                    NO                DON’T KNOW
      >1               YES                   SERIALIZABILITY   CORRECT

  13. IS BN3 TRUE? • Depends on • Data conflict severity: the density of data conflict graph • Job type • Access pattern [Figure: example conflict graph on vertices 1 to 8]

      GRAPH             NAME         |V| (in 10^6)   |E| (in 10^6)   DENSITY (in 10^-4)
      Web Graphs        uk-2007-05   106             3,739           4.2
      Web Graphs        uk-2014      787             47,614          4.7
      Web Graphs        eu-2015      1,070           91,792          5.8
      Web Graphs        claw-2012    3,563           128,736         1.4
      Social Networks   wise         59              265             4.0
      Social Networks   friendster   66              1,806           0.7
      TPC-C New Order                                                >1000

  14. BAD DEGREE DISTRIBUTION ≦ 1 [Figure: two panels, "BN3 (bad degree ≦ 1)" and "0BD (bad degree = 0)". x: Number of Cores (1 to 1024, in powers of two); y: Probability (0.90 to 1.00); one curve per conflict graph density: 10^-6, 10^-5, 10^-4]
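
One way to get a feel for how such probabilities depend on core count and density is a small Monte Carlo experiment. The sketch below is not the authors' methodology: it assumes one running transaction per core and that every pair of concurrent transactions conflicts independently with probability equal to the conflict graph density. The function `estimate_bn3` and its parameters are hypothetical.

```python
import random


def estimate_bn3(cores, density, trials=2000, seed=0):
    """Estimate (P(all bad degrees <= 1), P(all bad degrees == 0)).

    Assumption: each pair of the `cores` concurrent txns conflicts
    independently with probability `density`.
    """
    rng = random.Random(seed)
    bn3_hits = zero_hits = 0
    for _ in range(trials):
        degree = [0] * cores
        for i in range(cores):
            for j in range(i + 1, cores):
                if rng.random() < density:
                    degree[i] += 1
                    degree[j] += 1
        worst = max(degree)
        bn3_hits += worst <= 1   # BN3 holds for this sample
        zero_hits += worst == 0  # no potential conflict at all
    return bn3_hits / trials, zero_hits / trials


# Example call with small, arbitrary parameters.
p_bn3, p_zero = estimate_bn3(cores=16, density=1e-3, trials=500)
```

The inner loop is quadratic in the number of cores, so large core counts need fewer trials or a faster sampling scheme. By construction the first estimate is never below the second, since bad degree 0 implies bad degree ≤ 1.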

  15. WHAT GOOD IS BN3? • TRANSACTIONS EXECUTED WITHOUT ANY CONSISTENCY MECHANISM ARE UNDER SNAPSHOT ISOLATION (SI)

  16. PROOF: A TWO-STEP APPROACH 0. BN3 restricts the size of “mafia” • Two crews (vertices) at most 1. Only two bad transactions case • Proof by enumerating the type of edges 2. Other good transactions • Does not cause more cycles • Adjacent (non-bad) vertices: behind or after • Non-adjacent vertices: none of their business [Figure: conflict graph on vertices 1 to 8 with one edge labelled "bad edge"]

  17. BAD DEGREE AND CORRECTNESSES

      MAX BAD DEGREE   CONCURRENCY CONTROL   TXN SEMANTICS        RESULTS ACCURACY
      0                NO                    SERIALIZABILITY      CORRECT
      1 (BN3)          NO                    SNAPSHOT ISOLATION   WRITE-SKEW
      >1               NO                    NO                   DON’T KNOW
      ANY              YES                   SERIALIZABILITY      CORRECT
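
Write skew, the anomaly listed for the BN3 row, is the classic way snapshot isolation differs from serializability. The toy sketch below is not the system from the talk: it assumes every concurrent transaction reads the same start snapshot and that commits are only rejected on write-write conflicts (first committer wins). The names `run_under_si` and `take_off_call`, and the on-call example, are hypothetical.

```python
def run_under_si(db, txns):
    """Run txns concurrently under a toy snapshot-isolation model."""
    snapshot = dict(db)  # every txn reads the same start snapshot
    write_sets = [txn(snapshot) for txn in txns]

    committed_keys = set()
    for writes in write_sets:  # try to commit in order
        if not committed_keys.isdisjoint(writes):
            continue  # first committer wins: abort on write-write conflict
        db.update(writes)
        committed_keys.update(writes)
    return db


def take_off_call(me):
    """Txn: go off call only if the snapshot shows two people on call."""
    def txn(snapshot):
        if snapshot["alice"] + snapshot["bob"] >= 2:
            return {me: 0}
        return {}
    return txn


# Intended invariant: alice + bob >= 1 (someone is always on call).
concurrent = run_under_si(
    {"alice": 1, "bob": 1}, [take_off_call("alice"), take_off_call("bob")]
)

# Serial execution: the second txn sees the first one's write.
serial = run_under_si({"alice": 1, "bob": 1}, [take_off_call("alice")])
serial = run_under_si(serial, [take_off_call("bob")])
```

The two concurrent transactions write different keys, so neither is aborted, yet together they break the invariant; run one after the other, the second sees the first write and leaves the invariant intact.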

  18. [Figure: two panels, 256 cores and 128 cores. x: vary the conflict graph density (2*10^-3, 1*10^-4, 6*10^-5, 5*10^-5); y: residual after 50 iterations of Page Rank (axis from 10^-4 to 10^-3); lines: vary isolation levels (No Consistency, Read Uncommitted, Read Committed, Serializability). Inset: probability of “BN3-ness” (0.90 to 1.00) against number of cores (1 to 1024), one curve per density]

  19. TAKE HOME MESSAGES • Life is not just all-or-nothing • Flawlessness costs a lot • It is possible to have almost everything for free • BN3: realistic assumption, practical conclusion • Some future work • Runtime: monitor the BN3-ness • BN3 as a new consistency level • Mixed concurrency control

  20. Thank you

  21. EXPERIMENTAL STUDIES (THROUGHPUT) [Figure: two panels. x: conflict graph density (2*10^-3, 1*10^-4, 6*10^-5, 5*10^-5); y: throughput (* 10^6); lines: No Consistency, Read Uncommitted, Read Committed, Serializability]
