Monkey: Optimal Navigable Key-Value Store

Niv Dayan, Manos Athanassoulis, Stratos Idreos

[Opening slide plots: "storage is cheaper" (price per GB over time) and "inserts & updates" (workload over time).]


  1. Steps: model, relax, optimize. Model the lookup cost $= f(p_0, p_1, \dots)$ and the memory footprint $= f(p_0, p_1, \dots)$ in terms of the false positive rates $p_0, p_1, \dots$, with $0 < p_0 < 1$, $0 < p_1 < 1$, $0 < p_2 < 1$.

  2. Bloom filters with false positive rates $p_0, p_1, p_2, \dots$: lookup cost $= \sum_i p_i$.

  3. Bloom filters with false positive rates $p_0, p_1, p_2, \dots$: lookup cost $= \sum_i p_i$. Memory footprint: false positive rate $= e^{-\frac{\text{bits}}{\text{entries}} \cdot \ln(2)^2}$.

  4. Bloom filters with false positive rates $p_0, p_1, p_2, \dots$: lookup cost $= \sum_i p_i$. Memory footprint: bits $= -\text{entries} \cdot \frac{\ln(\text{false positive rate})}{\ln(2)^2}$.

  5. Bloom filters with false positive rates $p_0, p_1, p_2, \dots$: lookup cost $= \sum_i p_i$. Memory footprint per filter: $\text{bits}(p_0, N)$, $\text{bits}(p_1, N/T)$, $\text{bits}(p_2, N/T^2)$, $\dots$

  6. Bloom filters with false positive rates $p_0, p_1, p_2, \dots$ and memory footprints $\text{bits}(p_0, N)$, $\text{bits}(p_1, N/T)$, $\text{bits}(p_2, N/T^2)$, $\dots$: lookup cost $= \sum_i p_i$ and memory $= -c \cdot N \cdot \sum_i \frac{\ln(p_i)}{T^i}$, where $c$ is a constant, $N$ is the number of entries, $T$ is the size ratio, and the $p_i$ are the false positive rates.

  7. Optimize the false positive rates $p_0, p_1, p_2, \dots$ given lookup cost $= \sum_i p_i$ and memory $= -c \cdot N \cdot \sum_i \frac{\ln(p_i)}{T^i}$.

  8. Monkey Bloom filters: false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$ (exponential decrease).

  9. State-of-the-art Bloom filters: the same false positive rate $p$ for every filter. Monkey Bloom filters: false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$ (exponential decrease).

  10. State of the art vs. Monkey, false positive rates: $p_0 > p$, $p_0/T < p$, $p_0/T^2 < p$, $\dots$ Lookup cost: $\sum_i p_i < \sum p$.

  11. State of the art vs. Monkey, false positive rates: $p_0 > p$, $p_0/T < p$, $p_0/T^2 < p$, $\dots$ Lookup cost: $\sum_i p_i < \sum p$, that is, $O(e^{-M/N})$ for Monkey vs. $O(\log(N) \cdot e^{-M/N})$ for the state of the art, where $N$ is the number of entries and $M$ is the overall memory for Bloom filters.

  12. State of the art vs. Monkey, false positive rates: $p_0 > p$, $p_0/T < p$, $p_0/T^2 < p$, $\dots$ Lookup cost: $\sum_i p_i < \sum p$, that is, $O(e^{-M/N})$ for Monkey vs. $O(\log(N) \cdot e^{-M/N})$ for the state of the art, where $N$ is the number of entries and $M$ is the overall memory for Bloom filters. Asymptotic win: lookup cost increases at a slower rate as data grows.

  13. Monkey Bloom filters: the false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$ form a convergent geometric series.

  14. Monkey Bloom filters with false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$: memory $= c \cdot \text{entries} \cdot \sum_i \frac{-\ln(p_i)}{T^i}$.

  15. Monkey Bloom filters with false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$: memory $= c \cdot \text{entries} \cdot (-\ln(\text{lookup cost}))$.

  16. Monkey Bloom filters with false positive rates $p_0$, $p_0/T$, $p_0/T^2$, $\dots$: memory $= c \cdot \text{entries} \cdot (-\ln(\text{lookup cost}))$. This models the lookups vs. memory trade-off.

  17. Plot (fixed memory): lookup cost vs. update cost for existing systems. Problem 1: suboptimal filters allocation. Problem 2: hard to tune.

  18. Plot (fixed memory): lookup cost vs. update cost for existing systems, with the Pareto frontier. Problem 1: suboptimal filters allocation. Problem 2: hard to tune.

  19. Plot: lookup cost vs. update cost. Bloom filters size: lookups vs. memory. Problem 1: suboptimal filters allocation. Problem 2: hard to tune.

  20. Plot: lookup cost vs. update cost, with max throughput. Merge policy greed: lookups vs. updates. Problem 1: suboptimal filters allocation. Problem 2: hard to tune.

  21. Monkey: Optimal Navigable Key-Value Store. LSM-tree: memory, filters, merge policy; performance: lookups, updates. Observations: ad-hoc trade-offs, fixed false positive rates. Insights: lookup cost $= \sum_i p_i$; suboptimal. Steps: optimize allocation (asymptotically better); answer what-if design questions; navigate updates vs. lookups and memory vs. lookups. Plot: lookup cost vs. update cost, from log to sorted array, existing vs. Monkey.

  23. Identify: merge policy, size ratio.

  24. Identify: merge policy, size ratio. Map: lookups, updates.

  25. Identify: merge policy, size ratio. Map: lookups, updates (from log through LSM-tree to sorted array).

  26. Identify: merge policy, size ratio. Map: lookups, updates (from log to sorted array).

  27. Identify: merge policy, size ratio. Map: lookups, updates (from log to sorted array). Navigate: workload, hardware; optimal, maximum throughput.

  28. Merge policies: Leveling (read-optimized), Tiering (write-optimized).

  29. Leveling (read-optimized), Tiering (write-optimized).

  30. Leveling (read-optimized), Tiering (write-optimized, T runs per level).

  31. Leveling (read-optimized), Tiering (write-optimized, T runs per level); merge & flush.

  32. Leveling (read-optimized), Tiering (write-optimized, T runs per level).

  33. Leveling (read-optimized), Tiering (write-optimized, T runs per level); merge.

  34. Leveling (read-optimized), Tiering (write-optimized, T runs per level); T times bigger; flush.

  35. Leveling (read-optimized), Tiering (write-optimized, T runs per level); T times bigger.
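
The filter model from slides 3 to 16 can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: all function names are hypothetical, the constant c is taken to be 1/ln(2)^2 (following the per-filter formula on slide 4), the number of levels and the example values are illustrative, and the memory budget is assumed large enough that every false positive rate stays below 1. It compares a uniform allocation (the same p at every level) with the geometric allocation p_0, p_0/T, p_0/T^2 at equal total memory.

```python
import math

LN2_SQ = math.log(2) ** 2  # ln(2)^2; assumed value of 1/c in the memory formula


def bloom_bits(fpr, entries):
    """Bits for one Bloom filter: bits = -entries * ln(fpr) / ln(2)^2."""
    return -entries * math.log(fpr) / LN2_SQ


def level_sizes(n, t, levels):
    """Entries per level, largest first: N, N/T, N/T^2, ..."""
    return [n / t**i for i in range(levels)]


def geometric_allocation(p0, t, levels):
    """False positive rates p0, p0/T, p0/T^2, ... (largest level first)."""
    return [p0 / t**i for i in range(levels)]


def memory(fprs, sizes):
    """Total filter memory in bits, summed over all levels."""
    return sum(bloom_bits(p, n) for p, n in zip(fprs, sizes))


def lookup_cost(fprs):
    """Lookup cost modeled as the sum of the false positive rates."""
    return sum(fprs)


def geometric_p0_for_memory(mem_bits, n, t, levels):
    """Solve memory(geometric_allocation(p0, t, levels)) == mem_bits for p0.

    Uses ln(p_i) = ln(p0) - i * ln(T), which makes the memory formula
    linear in ln(p0).
    """
    s0 = sum(1 / t**i for i in range(levels))
    s1 = sum(i / t**i for i in range(levels))
    return math.exp((math.log(t) * s1 - mem_bits * LN2_SQ / n) / s0)


if __name__ == "__main__":
    n, t, levels = 1_000_000, 10, 4  # illustrative values, not from the slides
    sizes = level_sizes(n, t, levels)
    uniform = [0.01] * levels  # same p everywhere
    budget = memory(uniform, sizes)  # give both allocations the same memory
    p0 = geometric_p0_for_memory(budget, n, t, levels)
    geometric = geometric_allocation(p0, t, levels)
    print("uniform   lookup cost:", lookup_cost(uniform))
    print("geometric lookup cost:", lookup_cost(geometric))
```

With these illustrative values the geometric allocation yields a lower sum of false positive rates for the same memory, in line with the comparison on slide 10: p_0 ends up above p, while the smaller levels end up below it.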
