
BORG: Block-reORGanization for Self-optimizing Storage Systems

Medha Bhadkamkar, Jorge Guerra, Luis Useche, Sam Burnett, Jason Liptak, Raju Rangaswami, Vagelis Hristidis. Florida International University. March 9, 2009.


  1. BORG: Block-reORGanization for Self-optimizing Storage Systems
     Medha Bhadkamkar, Jorge Guerra, Luis Useche, Sam Burnett, Jason Liptak, Raju Rangaswami, Vagelis Hristidis
     Florida International University
     March 9, 2009

  2. Problem
     ◮ I/O is the bottleneck
       - Legacy filesystems favor sequential access.
       - Realistic workloads are not necessarily sequential.
     ◮ Proposed Solution
       - Co-locate data based on workload block access patterns.
       - Improve sequentiality.

  3. Workload Characteristics that motivate BORG
     ◮ Workloads
       - office: browser, OpenOffice applications, gnuplot, etc.
       - developer: emacs, gcc, gdb, etc.
       - Subversion (SVN) server: sources and document repository
       - Web server: department web server
     ◮ Workload Statistics Summary

       Workload type   File system size [GB]   Total reads [GB]   Total writes [GB]
       office                           8.29               6.49                0.32
       developer                       45.59               3.82               10.46
       SVN server                       2.39               0.29                0.62
       web server                     169.54              21.07                2.24

  4. Non-uniform Access Frequency Distribution
     ◮ Frequently accessed data is usually a small portion of the entire data.
     ◮ Frequently accessed data is spread over the entire disk area.

       Workload type   File system size [GB]   Unique reads [GB]   Unique writes [GB]   Top 20% data access
       office                           8.29                1.63                 0.22               51.40 %
       developer                       45.59                2.57                 3.96               60.27 %
       SVN server                       2.39                0.17                 0.18               45.79 %
       web server                     169.54                7.32                 0.33               59.50 %

  5. Non-uniform Access Frequency Distribution
     [Figure: access frequency]
     The Opportunity: Co-locating frequently accessed data can improve I/O performance.

  6. Workload Characteristics - Partial Determinism
     ◮ Non-sequential accesses repeat in a block access sequence.

       Workload type   Partial determinism
       office                      65.42 %
       developer                   61.56 %
       SVN server                  50.73 %
       web server                  15.55 %

     The Opportunity: Using partial determinism information can improve sequentiality of accesses.

  7. Temporal Locality
     ◮ There is a substantial overlap in the working sets across days.
     [Figure: data access overlap with Day 1 (%) for each day of the week (Day 1 to Day 7); series: all accesses, top 20% accesses]
     The Opportunity: Using information about past I/O activity to optimize layout can improve performance.

  8. BORG in a nutshell
     ◮ Uses block access patterns to identify hot block sequences in the workload.
     ◮ Reorganizes blocks in a separate BORG OPTimized partition (BOPT).
     ◮ Assimilates write requests into the partition.
     ◮ Operates in the background.
     ◮ Can be dynamically inserted or removed when required.
     ◮ Is independent of filesystems.
     ◮ Maintains consistency through a persistent page-level indirection map.

  9. System Architecture
     [Diagram: I/O stack, top to bottom. User: Application. Kernel: VFS, Page Cache, File Systems (EXT3, JFS...), BORG Layer, I/O Scheduler, Device Driver. Legend: existing components, new components.]

  10. System Architecture
      [Diagram: the I/O stack of slide 9 with the BORG components added.
       User-space components: Analyzer, Planner.
       Kernel-space components (BORG Layer): I/O Profiler, BOPT-space Reconfigurator, I/O Indirector.
       Labels on the connections: I/O Trace, Layout Plan.
       Legend: existing components, new components.]

  11. System Architecture
      [Diagram: same as slide 10.]

  12. I/O Profiler
      ◮ Each I/O operation is logged with:
        - Temporal attribute: timestamp
        - Process-level attributes: process ID, name
        - Block-level attributes: start LBA, length of I/O, mode (R/W)

      Sample Trace

        Timestamp        PID   Exec.     StartLBA  Size  Mode
        705423195774700  5745  screen     6914207    32  R
        705423259644748  5755  utempter  24379775     8  R
        705423379492524  5755  utempter  24787567     8  R
        705423421266908  5753  bash       7498311    24  R
        705423454005104  5755  utempter  24793415     8  R
        705423493292648  5753  bash      34543375    64  R
        705423565122668  5766  stty      34543439    16  R
        ...              ...   ...       ...        ...  ...
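The trace format on this slide can be read back with a few lines of code. The following is a minimal sketch, assuming whitespace-separated fields in the order shown in the sample trace; the names `TraceRecord` and `parse_trace_line` are hypothetical and not part of BORG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceRecord:
    timestamp: int   # temporal attribute
    pid: int         # process ID
    exe: str         # process (executable) name
    start_lba: int   # first logical block address of the request
    size: int        # length of the I/O, in the trace's own units
    mode: str        # "R" for read, "W" for write


def parse_trace_line(line: str) -> TraceRecord:
    """Parse one whitespace-separated line of the sample trace."""
    timestamp, pid, exe, start_lba, size, mode = line.split()
    if mode not in ("R", "W"):
        raise ValueError(f"unexpected mode: {mode!r}")
    return TraceRecord(int(timestamp), int(pid), exe, int(start_lba), int(size), mode)


# First line of the sample trace from the slide.
record = parse_trace_line("705423195774700 5745 screen 6914207 32 R")
print(record.exe, record.start_lba, record.size, record.mode)
```

Grouping the parsed records by `pid` would give the per-process request sequences that the Analyzer works on.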

  13. System Architecture
      [Diagram: same as slide 10.]

  14. Analyzer
      ◮ Builds a per-process directed, weighted graph.
      ◮ A vertex is the per-request LBA range (start LBA, length).
      ◮ An edge is a temporal dependency between two ranges.
      ◮ Weights represent frequency of access.
      ◮ Graphs are merged into a single master access graph.
      [Figure: process graphs and the master access graph after merging.
       Process graph vertices: r1:(0, 3), r2:(4, 2), r3:(8, 2), s1:(1, 6), s2:(9, 1).
       Master access graph vertices: r1:(0, 1), r1,s1:(1, 2), s1:(3, 1), r2,s1:(4, 2), s1:(6, 1), r3:(8, 1), r3,s2:(9, 1).
       Edge weights are not recoverable from the transcript.]
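The graph construction described on this slide can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes an edge links each request to the next request of the same process, it counts edge occurrences as weights, and it merges graphs by summing weights of identical edges. The slide's figure suggests that the real merge also splits overlapping LBA ranges, which this sketch does not attempt. The function names and the sample requests are hypothetical.

```python
from collections import defaultdict


def build_process_graph(requests):
    """Build one process's graph from its ordered (start_lba, length) requests.

    Returns {(src_range, dst_range): weight}, where the weight counts how
    often dst_range was requested immediately after src_range.
    """
    edges = defaultdict(int)
    for prev, curr in zip(requests, requests[1:]):
        edges[(prev, curr)] += 1
    return dict(edges)


def merge_graphs(graphs):
    """Merge per-process graphs by summing the weights of identical edges."""
    master = defaultdict(int)
    for graph in graphs:
        for edge, weight in graph.items():
            master[edge] += weight
    return dict(master)


# Hypothetical request sequences for two processes.
r = build_process_graph([(0, 3), (4, 2), (8, 1), (0, 3), (4, 2)])
s = build_process_graph([(0, 3), (4, 2), (9, 1)])
master = merge_graphs([r, s])
print(master[((0, 3), (4, 2))])  # 3
```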

  15. Planner
      ◮ Uses the master access graph as input.
      ◮ Chooses the most connected node for initial placement.
      ◮ Then repeatedly chooses the node most connected to the already placed node set.
      ◮ Places it according to the direction of the connecting edge.
      [Figure: weighted directed graph over nodes A to J; edge weights are not recoverable from the transcript.]
      Resulting placement shown on the slide: F → H → J → A → G → C → B → E → D
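The greedy placement described on this slide can be sketched as below. Several details are assumptions, because the slide does not define them: "most connected" is taken to mean the largest total weight of incident edges, a node is placed after the current layout when it is mostly reached from placed nodes and before it otherwise, and ties are broken alphabetically. The function `plan_layout` and the example graph are hypothetical; the graph is not the one on the slide.

```python
from collections import deque


def plan_layout(edges):
    """Greedily order the nodes of a {(src, dst): weight} graph on a line."""
    nodes = sorted({n for edge in edges for n in edge})

    def degree(n):
        # Assumed meaning of "most connected": total weight of incident edges.
        return sum(w for (u, v), w in edges.items() if n in (u, v))

    first = max(nodes, key=degree)
    layout = deque([first])
    placed = {first}

    def pull(n):
        # Weight of edges from the placed set to n, and from n to the placed set.
        after = sum(w for (u, v), w in edges.items() if u in placed and v == n)
        before = sum(w for (u, v), w in edges.items() if v in placed and u == n)
        return after, before

    while len(placed) < len(nodes):
        nxt = max((n for n in nodes if n not in placed), key=lambda n: sum(pull(n)))
        after, before = pull(nxt)
        if after >= before:
            layout.append(nxt)       # mostly reached from placed nodes: put it after them
        else:
            layout.appendleft(nxt)   # mostly leads into placed nodes: put it before them
        placed.add(nxt)
    return list(layout)


# Hypothetical graph, not the one on the slide.
edges = {("A", "B"): 10, ("B", "C"): 8, ("D", "A"): 7, ("C", "E"): 2}
print(plan_layout(edges))  # ['D', 'A', 'B', 'C', 'E']
```

In this example the layout follows the edge directions, so the access chain D, A, B, C, E becomes a sequential run on disk.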

  16. System Architecture
      [Diagram: same as slide 10.]

  17. Reconfigurator
      [Diagram: the Planner takes (1) graph G and (2) the current plan, and (3) writes a plan. The Reconfigurator (4) reads the plan, (5) reads from BOPT, and (6) writes to FS. BOPT space holds C’, D’ and W’; FS space holds A, B, C and D. Legend: BOPT Read Cache, BOPT Write Buffer.]

      Plan:
        Operation   Source        Dest.
        Leaving     C’ (BOPT)     C (FS)

  18. Reconfigurator
      [Diagram: as on slide 17, with the Reconfigurator now (5) reading D’ from BOPT and (6) writing D" to BOPT. Legend: BOPT Read Cache, BOPT Write Buffer.]

      Plan:
        Operation   Source        Dest.
        Leaving     C’ (BOPT)     C (FS)
        Relocate    D’ (BOPT)     D" (BOPT)

  19. Reconfigurator
      [Diagram: as on slide 17, with the Reconfigurator now (5) reading FS block B and (6) writing B’ to BOPT. Legend: BOPT Read Cache, BOPT Write Buffer.]

      Plan:
        Operation   Source        Dest.
        Leaving     C’ (BOPT)     C (FS)
        Relocate    D’ (BOPT)     D" (BOPT)
        Incoming    B (FS)        B’ (BOPT)
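The three plan operations on the Reconfigurator slides (leaving, relocate, incoming) can be illustrated by comparing two layouts. This is a sketch under assumptions: it derives the operations from a current and a target mapping of FS blocks to BOPT slots, and it emits them in the order the slides present them. How the actual Planner encodes its plan is not stated in the slides; `plan_operations` and the slot numbers are hypothetical.

```python
def plan_operations(current, target):
    """Return (kind, block, source, destination) copy operations.

    current and target map an FS block name to its BOPT slot (assumed encoding).
    """
    ops = []
    # Leaving: blocks in BOPT now but absent from the target go back to FS.
    for block in sorted(current.keys() - target.keys()):
        ops.append(("leaving", block, ("BOPT", current[block]), ("FS", block)))
    # Relocate: blocks that stay in BOPT but change slot.
    for block in sorted(current.keys() & target.keys()):
        if current[block] != target[block]:
            ops.append(("relocate", block, ("BOPT", current[block]), ("BOPT", target[block])))
    # Incoming: blocks newly copied from FS into BOPT.
    for block in sorted(target.keys() - current.keys()):
        ops.append(("incoming", block, ("FS", block), ("BOPT", target[block])))
    return ops


current = {"C": 0, "D": 1}   # hypothetical slots
target = {"D": 0, "B": 1}
for op in plan_operations(current, target):
    print(op)
```

With these inputs, C leaves BOPT, D is relocated within BOPT, and B comes in from FS, mirroring the Leaving, Relocate and Incoming rows on the slides.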

  20. System Architecture
      [Diagram: same as slide 10.]

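  21. I/O Indirector
      [Diagram: a read request for block B reaches the I/O Indirector, which finds B in borg_map and directs it to B’ in BOPT space. BOPT space holds B’ and D"; FS space holds A, B, C and D. Legend: BOPT Read Cache, BOPT Write Buffer.]

      borg_map:
        FS Block   BOPT Block   Dirty
        B          B’           0
        C          C’           1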

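  22. I/O Indirector
      [Diagram: a read request for block A reaches the I/O Indirector, finds no entry in borg_map (marked X), and is served from A in FS space. BOPT space holds B’ and D"; FS space holds A, B, C and D. Legend: BOPT Read Cache, BOPT Write Buffer.]

      borg_map:
        FS Block   BOPT Block   Dirty
        B          B’           0
        C          C’           1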

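  23. I/O Indirector
      [Diagram: a write request for block W reaches the I/O Indirector and is written to W’ in BOPT space; a new borg_map entry for W is added with the dirty flag set. BOPT space holds B’, D" and W’; FS space holds A, B, C and D. Legend: BOPT Read Cache, BOPT Write Buffer.]

      borg_map:
        FS Block   BOPT Block   Dirty
        B          B’           0
        C          C’           1
        W          W’           1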
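The three I/O Indirector cases (read hit, read miss, write) can be sketched with a small in-memory map. This is an illustration only: the slides describe a persistent, kernel-level map, and the sketch assumes that every write is directed to BOPT and that the write buffer always has room. The class `BorgMap`, its methods, and the convention of naming the BOPT copy of X as X' are hypothetical.

```python
class BorgMap:
    """Toy indirection map: FS block -> [BOPT block, dirty flag]."""

    def __init__(self, entries=None):
        self.entries = {fs: list(v) for fs, v in (entries or {}).items()}

    def read(self, fs_block):
        """Return the (space, block) that should serve a read."""
        if fs_block in self.entries:
            return ("BOPT", self.entries[fs_block][0])
        return ("FS", fs_block)  # no entry: pass the request through to FS

    def write(self, fs_block):
        """Direct a write to BOPT and mark the entry dirty."""
        if fs_block not in self.entries:
            # Assumed naming: the BOPT copy of X is called X'.
            self.entries[fs_block] = [fs_block + "'", 0]
        self.entries[fs_block][1] = 1
        return ("BOPT", self.entries[fs_block][0])


borg_map = BorgMap({"B": ["B'", 0], "C": ["C'", 1]})
print(borg_map.read("B"))   # ('BOPT', "B'")
print(borg_map.read("A"))   # ('FS', 'A')
print(borg_map.write("W"))  # ('BOPT', "W'")
```

The dirty flag records which BOPT copies differ from their FS originals, which is the information needed to copy data back when a block leaves BOPT.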

  24. Evaluation
      Goals
      ◮ How effective is BORG?
      ◮ What are the overheads?
      ◮ When is it not effective?
      ◮ How sensitive is it to different parameters?
      Setup
      ◮ Metric: total disk busy times
      ◮ 5 hosts with different configurations
      ◮ Linux 2.6.22 kernel
      ◮ reiserfs and ext3

  25. Busy times for Webserver
      Setup
      ◮ Over 1.1 million requests to over 255,000 files in one week.
      ◮ BOPT size 8 GB, 4 reconfigurations
      ◮ Evaluated BORG with cumulative and partial traces
      [Figure: disk busy time (sec) for phases N1 to N5; series: Vanilla, BORG-C, BORG-P]
      Summary
      ◮ 14-35% reduction in busy times for cumulative traces and 5-39% for partial traces.

  26. Busy times for Webserver
      Setup
      ◮ Over 1.1 million requests to over 255,000 files in one week.
      ◮ BOPT size 8 GB, 4 reconfigurations
      ◮ Evaluated BORG with cumulative and partial traces
      [Figure: disk busy time (sec) for phases R1 to R4; series: Vanilla, BORG-C, BORG-P]
      Summary
      ◮ Busy times are higher in reconfiguration phases due to copy overheads.

  27. BORG Overhead
      Setup
      ◮ Over 1.1 million requests to over 255,000 files in one week.
      ◮ BOPT size 8 GB, 4 reconfigurations
      ◮ Cumulative and partial traces
      [Figure: time (sec) spent in the Reconfigurator, Planner and Analyzer for cumulative (C) and partial (P) traces at reconfigurations R1 to R4]
      Summary
      ◮ Linear increase in planning and analysis overheads for cumulative traces.
