Online Bigtable merge compaction



  1. Online Bigtable merge compaction. Neal E. Young¹, Claire Mathieu, Carl Staelin, Arman Yousefi (CNRS Paris, Google Haifa, UC Riverside, UCLA). Northeastern University, September 17, 2015. ¹ Funded by faculty re$earch award.

  2. BIGTABLE: data storage at Google. Maps, Search/Crawl, Gmail, ... use BIGTABLE to store data. Figures from 2006: 24,500 Bigtable servers; 1.2 million requests per second; 16 GB/s of outgoing RPC traffic; over a petabyte of data just for Google Crawl and Analytics. Similar to other “NoSQL” databases: Accumulo, AsterixDB, Cassandra, HBase, Hypertable, Spanner, ... Used by Adobe, eBay, Facebook, GitHub, Meetup, Netflix, Twitter, ... “Log-structured merge tree” architecture, for high-volume, highly reliable, distributed, real-time data storage.

  3. BIGTABLE: implements the dictionary data type. Operations supported by a Bigtable instance: write(key, value); read(key), which returns the most recent value written for key; ... (there's more, but not today).

  4. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: none. Cache: empty. File sequence: empty.

  5. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a). Cache: (1, a). File sequence: empty.

  6. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b). Cache: (1, a) (2, b). File sequence: empty.

  7. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c). Cache: (1, a) (2, b) (3, c). File sequence: empty.

  8. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c); write(4, d). Cache: (1, a) (2, b) (3, c) (4, d). File sequence: empty.

  9. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c); write(4, d); flush(). Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush).

  10. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c); write(4, d); flush(); write(5, e); write(6, f); write(7, g). Cache: (5, e) (6, f) (7, g). File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush).

  11. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c); write(4, d); flush(); write(5, e); write(6, f); write(7, g); flush(). Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g)] (from 2nd flush).

  12. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Operations so far: write(1, a); write(2, b); write(3, c); write(4, d); flush(); write(5, e); write(6, f); write(7, g); flush(); write(8, h); write(9, i); flush(). Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g)] (from 2nd flush), [(8, h) (9, i)] (from 3rd flush).

  13. BIGTABLE: writes and flushes. write(key, value): store the key/value pair in the cache (e.g. a hash table in RAM). The environment periodically forces a flush of the cache to a new immutable disk file. Example. Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g)] (from 2nd flush), [(8, h) (9, i)] (from 3rd flush). The environment forces flushes at arbitrary times.
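The write and flush steps in the example can be sketched in a few lines of Python. This is a toy model only, not Bigtable's implementation: the class name `TinyTable`, its attributes, and the choice to represent an immutable file as a tuple of sorted pairs are all assumptions made for illustration.

```python
class TinyTable:
    """Toy model of the write/flush path (hypothetical names, not Bigtable's API)."""

    def __init__(self):
        self.cache = {}   # mutable cache in RAM (a hash table)
        self.files = []   # immutable "disk files", oldest first

    def write(self, key, value):
        # A write only touches the in-memory cache.
        self.cache[key] = value

    def flush(self):
        # Freeze the cache into a new immutable file, then empty the cache.
        if self.cache:
            self.files.append(tuple(sorted(self.cache.items())))
            self.cache = {}


table = TinyTable()
for key, value in [(1, "a"), (2, "b"), (3, "c"), (4, "d")]:
    table.write(key, value)
table.flush()
for key, value in [(5, "e"), (6, "f"), (7, "g")]:
    table.write(key, value)
table.flush()
for key, value in [(8, "h"), (9, "i")]:
    table.write(key, value)
table.flush()

print([len(f) for f in table.files])  # [4, 3, 2]
```

In this sketch the caller decides when to flush, which stands in for the environment forcing flushes at arbitrary times.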

  14. BIGTABLE: reads and compactions. Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g)] (from 2nd flush), [(8, h) (9, i)] (from 3rd flush). read(key): 1. Check cache for key. 2. If not found, check files (most recent first); cost = O(number of files).

  15. BIGTABLE: reads and compactions. Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g)] (from 2nd flush), [(8, h) (9, i)] (from 3rd flush). read(key): 1. Check cache for key. 2. If not found, check files (most recent first); cost = O(number of files). compaction(): asynchronous background process, to reduce read costs. Periodically select files to merge.

  16. BIGTABLE: reads and compactions. Cache: empty. File sequence: [(1, a) (2, b) (3, c) (4, d)] (from 1st flush), [(5, e) (6, f) (7, g) (8, h) (9, i)] (merge of 2nd and 3rd). read(key): 1. Check cache for key. 2. If not found, check files (most recent first); cost = O(number of files). compaction(): asynchronous background process, to reduce read costs; cost = O(size of merged files)!! Periodically select files to merge. Goals: (i) keep read costs low; (ii) keep compaction costs low. Constraint: each merge must merge a contiguous subsequence of files.
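The read path and a single contiguous merge might look like the following sketch. The function names `read` and `compact`, the use of dicts as files, and the choice to charge a merge the total number of entries in its input files are assumptions for illustration, not details taken from the slides.

```python
def read(cache, files, key):
    """Return (value, number_of_files_probed); value is None if key is absent."""
    if key in cache:
        return cache[key], 0
    probes = 0
    for f in reversed(files):        # most recent file first
        probes += 1
        if key in f:
            return f[key], probes
    return None, probes


def compact(files, i, j):
    """Merge the contiguous run files[i:j]; return (new_files, cost)."""
    if not (0 <= i and j <= len(files) and j - i >= 2):
        raise ValueError("a merge needs a contiguous run of at least two files")
    merged = {}
    for f in files[i:j]:             # oldest to newest, so newer values win
        merged.update(f)
    cost = sum(len(f) for f in files[i:j])   # assumed cost: total input size
    return files[:i] + [merged] + files[j:], cost


files = [
    {1: "a", 2: "b", 3: "c", 4: "d"},   # from 1st flush
    {5: "e", 6: "f", 7: "g"},           # from 2nd flush
    {8: "h", 9: "i"},                   # from 3rd flush
]
before = read({}, files, 1)             # ("a", 3): all three files are probed
new_files, cost = compact(files, 1, 3)  # merge of 2nd and 3rd, cost 5
after = read({}, new_files, 1)          # ("a", 2): one fewer file to probe
print(before, cost, after)
```

The merge lowers the worst-case read cost from three probes to two, at a one-time cost proportional to the size of the two files merged.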

  17. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost.

  18. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = \infty$, the problem is easy: never merge.

  19. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = \infty$, the problem is easy: never merge. After flush 1: [file-sequence diagram not preserved].

  20. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = \infty$, the problem is easy: never merge. After flush 1 and after flush 2: [file-sequence diagrams not preserved].

  21. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = \infty$, the problem is easy: never merge. After flush 1, after flush 2, after flush 3, after flush 4, ...: [file-sequence diagrams not preserved]. Total compaction cost = 0.
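One way to make the definition concrete is a small checker that replays a compaction schedule and reports its total cost. This is a sketch under stated assumptions: the name `bmc_cost`, the schedule format (a dict from a 0-based flush index to a list of half-open index ranges to merge), and the rule that a merge costs the sum of the sizes merged are illustrative choices, not taken from the slides.

```python
def bmc_cost(x, k, schedule):
    """Replay `schedule` on flush sizes `x`; return the total compaction cost.

    schedule[t] lists merges (i, j) applied, in order, right after flush t
    (t is 0-based here); each merge replaces files[i:j] by a single file.
    Raises ValueError if a merge is invalid or more than k files remain.
    """
    files = []      # current file sizes, oldest first
    total = 0
    for t, size in enumerate(x):
        files.append(size)                       # flush t creates a new file
        for i, j in schedule.get(t, []):
            if not (0 <= i and j <= len(files) and j - i >= 2):
                raise ValueError(f"invalid merge {(i, j)} after flush {t}")
            merged = sum(files[i:j])             # contiguous run only
            total += merged                      # assumed cost: size of result
            files[i:j] = [merged]
        if len(files) > k:
            raise ValueError(f"more than k={k} files after flush {t}")
    return total


x = [4, 3, 2]
print(bmc_cost(x, k=len(x), schedule={}))            # never merge: 0
print(bmc_cost(x, k=2, schedule={2: [(1, 3)]}))      # merge last two files: 5
```

Setting `k` to the number of flushes plays the role of $k = \infty$ here: the limit is never reached, so the empty schedule is feasible and costs nothing.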

  22. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = 1$, the problem is easy: must merge everything each time. After flush 1: [file-sequence diagram not preserved].

  23. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = 1$, the problem is easy: must merge everything each time. After flush 1: [file-sequence diagram not preserved]. After flush 2: too many files!

  24. Bigtable Merge Compaction (bmc): formal definition. Given: a sequence $x_1, x_2, \ldots, x_n$, where $x_t$ is the size of the file resulting from flush $t$; an integer $k > 0$, tuned to the workload, typically 3–40. Choose: compactions, ensuring that the number of files never exceeds $k$. Objective: minimize total compaction cost. If $k = 1$, the problem is easy: must merge everything each time. After flush 1: [file-sequence diagram not preserved]. After flush 2: compaction cost $x_1 + x_2$.
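Extending the pattern on this slide, with $k = 1$ every flush after the first forces a merge of all data written so far, so flush $t$ costs $x_1 + \cdots + x_t$. The sketch below adds these up; the function name `cost_k1` is hypothetical, and it assumes the first flush costs nothing because there is no second file to merge with.

```python
from itertools import accumulate


def cost_k1(x):
    """Total compaction cost when only one file may exist (k = 1).

    Assumes flush 1 needs no merge; each later flush t merges everything,
    costing the prefix sum x_1 + ... + x_t.
    """
    prefix = list(accumulate(x))   # prefix[t-1] = x_1 + ... + x_t
    return sum(prefix[1:])         # skip flush 1, which has nothing to merge


print(cost_k1([4, 3, 2]))  # (4 + 3) + (4 + 3 + 2) = 16
```

For flushes of equal size the total grows quadratically in the number of flushes, which is why a larger $k$ is attractive when read costs allow it.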
