Fragmented Log-Structured Merge Trees (Part 1), Presented by Deepak

Apr 20, 2023


PebblesDB: Key-Value Store Using Fragmented Log-Structured Merge Trees (Part 1)
Presented by Deepak Varghese

PebblesDB Overview
• High-performance, write-optimized key-value store
• Built on a new data structure, the Fragmented Log-Structured Merge Tree (FLSM)
• Fragmentation is done using guards
• Fragmentation helps reduce write amplification
• Range searches are supported

Question 1 - “Figure 2 illustrates compaction in a LSM key-value store.” Please use the example to explain why the compaction operation can be very expensive.
• Compacting an sstable into the next level of an LSM key-value store rewrites the existing sstables it overlaps, so the same data is rewritten multiple times as it moves down the levels.
• This leads to high write amplification.
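
The cost described in Question 1 can be illustrated with a toy model. This is a minimal sketch, not the store's actual code: the function name `compact_lsm`, the list-of-lists representation of a level, and the `sstable_size` parameter are all assumptions made for illustration. It counts how many keys must be written when a small sstable is merged into a level whose sstables have disjoint ranges.

```python
def compact_lsm(level, incoming, sstable_size=4):
    """Toy LSM compaction (hypothetical model, not a real API).

    level:    list of sorted sstables (lists of keys) with disjoint ranges
    incoming: sorted list of keys arriving from the level above
    Returns (new_level, keys_written).
    """
    lo, hi = incoming[0], incoming[-1]
    # Every sstable whose range overlaps the incoming range must be rewritten.
    overlapping = [t for t in level if not (t[-1] < lo or t[0] > hi)]
    untouched = [t for t in level if t[-1] < lo or t[0] > hi]

    # Merge-sort the incoming keys with all overlapping sstables.
    merged = sorted(set(incoming).union(*overlapping))

    # Write the merged run back out as fresh sstables.
    rewritten = [merged[i:i + sstable_size]
                 for i in range(0, len(merged), sstable_size)]
    new_level = sorted(untouched + rewritten, key=lambda t: t[0])
    return new_level, len(merged)


level = [[10, 20, 30, 40], [50, 60, 70, 80]]
incoming = [15, 55]
new_level, written = compact_lsm(level, incoming)
print(written)                   # 10 keys written for 2 new keys
print(written / len(incoming))   # 5.0
```

In this toy case, two new keys force both existing sstables to be rewritten, and the same effect repeats each time data moves down a level.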

Question 2 - “Instead of rewriting the sstable, FLSM’s compaction simply appends a new sstable fragment to the next level.” Compared to the LSM-tree in-place rewriting, this appending is more efficient. However, what’s the tradeoff (any negative impact of the appending)?
• Multiple sstables on the same level can contain the same key and can have overlapping key ranges.
• This hurts read performance, because a lookup may have to examine several sstables on one level.
• It also increases the chance of false positives, since more sstables are consulted per lookup.
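
The tradeoff in Question 2 can be sketched in the same toy style. The names `flsm_append` and `sstables_to_check`, and the representation of a level as one list of fragments per guard, are assumptions for illustration only. Slot 0 holds keys smaller than the first guard.

```python
import bisect


def flsm_append(level, guards, incoming):
    """Toy FLSM compaction (hypothetical model): partition the incoming
    keys by guard and append one new fragment per guard. Nothing that is
    already on the level is rewritten. Returns the number of keys written."""
    fragments = {}
    for key in incoming:
        idx = bisect.bisect_right(guards, key)  # which guard owns this key
        fragments.setdefault(idx, []).append(key)
    for idx, frag in fragments.items():
        level[idx].append(frag)
    return len(incoming)


def sstables_to_check(level, guards, key):
    """Number of fragments a point lookup may have to examine on this level."""
    return len(level[bisect.bisect_right(guards, key)])


guards = [50]
level = [[[10, 20, 30, 40]], [[50, 60, 70, 80]]]   # one fragment per guard
written = flsm_append(level, guards, [15, 55])
print(written)                                 # 2 keys written, not 10
print(sstables_to_check(level, guards, 15))    # 2 fragments now overlap
```

The write side is cheap, but the fragments inside a guard now overlap, so a read for a key in that guard has to consider every fragment until it finds a match.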

Question 3 - “FLSM performance is significantly impacted by how guards are selected.” Could you give a criterion of being good guards?
• Guards should partition the key space effectively, so that no single guard accumulates many sstables.
• More guards should fall where keys are denser.
• Guards are selected based on a guard probability value.
• Guard probability is lowest at the low-numbered levels and increases as the level number increases.

Question 4 - “Guard probability gp(key, i) is the probability that key becomes a guard at level i.” Why is the probability a function of level number?
• Making guard selection depend on the level number gives a better key distribution between guards.
• The intervals between guards are better defined as the level number increases.
• As the level number increases, the number of keys and sstables increases, so more guards are needed.
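
One way to obtain a guard probability that grows with the level number is to hash each key and require fewer zero bits at deeper levels. The sketch below is an assumption-laden illustration, not the presented system's implementation: `key_hash`, `is_guard`, `top_level_bits`, and `bits_per_level` are hypothetical names and values, and SHA-256 from the standard library stands in for whatever hash function a real store would use.

```python
import hashlib


def key_hash(key: str) -> int:
    """64-bit hash of the key (SHA-256 used here purely for illustration)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def is_guard(key: str, level: int,
             top_level_bits: int = 10, bits_per_level: int = 2) -> bool:
    """Hypothetical rule: key is a guard at `level` if the lowest
    `required` bits of its hash are all zero. `required` shrinks as the
    level number grows, so gp(key, level) = 2 ** -required increases."""
    required = max(top_level_bits - (level - 1) * bits_per_level, 0)
    mask = (1 << required) - 1
    return key_hash(key) & mask == 0


keys = [f"key{n}" for n in range(20000)]
for level in range(1, 5):
    count = sum(is_guard(k, level) for k in keys)
    print(level, count)   # guard count never decreases with the level number
```

Because the bit test at a deeper level is a relaxation of the test at a shallower one, any key chosen as a guard at level i in this sketch is also a guard at every level below it, and denser key regions naturally receive more guards.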

Question 5 - Guards are continuously generated with key insertions. And “We note that in many of the workloads that were tested, guard deletion was not required.” Could you resolve the contradiction? What’s the consequence of not conducting guard-deletion operations in a store that keeps admitting new keys?
• Skipping guard deletion does not affect read or range-search performance, because get and range-query operations skip over empty guards.
• However, a large number of guards is undesirable when the store holds a small number of keys.
• Deleting guards helps distribute the keys evenly among the remaining guards.
• Consolidating data among fewer guards therefore helps improve performance.

Questions?
