  1. IMPLEMENTING COLLOCATION GROUPS

  2. About Draper Lab
     • An independent, not-for-profit corporation dedicated to applied research, engineering development, education, and technology transfer
       – Spun off from the Massachusetts Institute of Technology in 1973
       – Expertise in guidance, navigation and control systems
       – Early applications: U.S. Navy's Fleet Ballistic Missile Program and NASA's Apollo Program

  3. Agenda
     • Why collocation groups?
     • ITSM code components
     • Additional tools
     • A process to move 40TB
     • Conclusions

  4. Why do I want collocation groups?
     • Number of nodes vs. number of slots
       1. Nodes < slots: collocate by node or filespace
       2. Nodes > slots: can't collocate
     • If collocation is on with more nodes than slots, there is no control over which nodes get mixed, and migration still needs one mount per node
     • Node size vs. tape capacity
       1. Size > tape capacity: collocation fills the tape
       2. Size < tape capacity: collocation wastes tape
     • Collocation by group makes "supernodes", which turn both situations back into case 1

  5. Server Configuration
     • Sun V480, 4 processors, Solaris 9
     • Raw disk for db, log, backuppool; no RAID
     • TSM server code at 5.3.1.3

     Acronym  Function         DB size (pages)  Number of files  Physical TB
     LM       Library manager  5,500            5,000            6
     SS       For servers      8,500,000        47,500,000       18
     SD       For desktops     45,400,000       230,000,000      40
     SD2      For desktops     4,300,000        33,000,000       2

  6. The starting SD server mess
     • Volumes
       – 417 to process
       – Average nodes / volume is 188
       – Max is 713
       – 25 are over 500
     • Nodes
       – 1635 nodes
       – Average 48 volumes / node
       – Max is 132
       – 25 are over 100

  7. New server commands
     • Def, del, upd, query collocgroup
       – Names and describes the group
     • Def, del collocmember
       – Adds a node to a group
     • Query nodedata
       – Very fast!!
       – Lists tapes which have files for a node or group, no separation by filespace
     • Upd stgpool colloc=group
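
A hedged sketch of how these commands fit together; the group, node, and pool names (DESKGRP01, WS0001, WS0002, TAPEPOOL) are invented for illustration and not from the presentation:

      /* build a group and add members */
      define collocgroup DESKGRP01 description="Desktop group 01"
      define collocmember DESKGRP01 WS0001
      define collocmember DESKGRP01 WS0002
      query collocgroup DESKGRP01

      /* see which tapes currently hold data for a member node */
      query nodedata WS0001 stgpool=TAPEPOOL

      /* switch the tape pool to group collocation */
      update stgpool TAPEPOOL collocate=group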

  8. The secret perl scripts
     • 4 scripts in the bin directory, not documented
     • Used only defgroups.pl
       – Analyzes 'q occ' data, creates define statements to build the groups
       – Execution:
         • ./defgroups.pl id pwd domain size [execute]
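
Reading the usage line, omitting the execute flag should just print the define statements for review rather than running them. The generated output presumably looks roughly like the following; the group and node names here are invented:

      define collocgroup GROUP001
      define collocmember GROUP001 WS0101
      define collocmember GROUP001 WS0217
      define collocgroup GROUP002
      define collocmember GROUP002 WS0054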

  9. Fix the defgroups.pl SQL
     • Eliminate the stgpool subselect
       – Change it to an in-list and name your tape stgpool
     • Eliminate the join between nodes & occupancy
       – Check domain_name with a subselect instead
     • Eliminate the check for a collocgroup
       – It is always null while implementing
     • Runtime drops from "beyond the limits of my patience" to 5 minutes
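
A hedged sketch of what the trimmed-down query might look like after those three changes; the pool and domain names (TAPEPOOL, DESKTOPS) are placeholders, and the actual defgroups.pl SQL may differ in detail:

      select node_name, sum(physical_mb) as total_mb
        from occupancy
       where stgpool_name in ('TAPEPOOL')
         and node_name in
             (select node_name from nodes where domain_name='DESKTOPS')
       group by node_name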

 10. Using Query Nodedata
     • SQL generates a command for each node
       – Alternatively, 'q nodedata * stg=pool_name'
     • Run the file from step 1, directing the output to a second file
       – 'q nodedata' doesn't have a corresponding SQL table (the very expensive volumeusage table is close)
     • Edit the output down to just node name and volume name
     • Load it into MySQL
     • Analyze
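
One way to do the first and last steps, hedged: have a TSM select concatenate the command text per node (assuming the server's SQL accepts the || operator), then let simple GROUP BY queries in MySQL produce the per-node and per-volume counts. The table and column names below are my own invention:

      /* on the TSM server: emit one 'query nodedata' command per node */
      select 'query nodedata ' || node_name || ' stgpool=TAPEPOOL' from nodes

      -- in MySQL, after loading the edited (node, volume) pairs:
      create table nodedata (node_name varchar(64), volume_name varchar(32));
      select node_name, count(distinct volume_name) as vols
        from nodedata group by node_name order by vols desc;
      select volume_name, count(distinct node_name) as nodes
        from nodedata group by volume_name order by nodes desc;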

 11. Tools
     • MySQL desktop development server
       – Very handy to have!
       – There is no SQL select for nodedata on the TSM server
       – Do complex joins without killing the TSM server
       – http://www.mysql.com/
     • UltraEdit editor
       – Sorting, column editing, hex editing
       – http://www.ultraedit.com/

 12. Preliminaries
     • Decide the target number of tapes in each group
       – Convert it to 'size in megs' for defgroups.pl
       – Our goal is 4 tapes
       – We compress at the client, so LTO2 capacity is 200GB, and 'size' is 800,000
     • Run defgroups.pl on the domain(s)
     • Execute the commands from defgroups.pl
     • 'Update stgpool <name> colloc=group'
     • Mark all current tapes readonly
       – Stops migration from filling the uncollocated tapes
       – Makes the SQL easier
     • Have as many scratch tapes as groups
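
As I read it, the whole preliminary sequence condenses to something like the following; the id, password, domain, and pool name are placeholders:

      /* 4 tapes x ~200 GB (LTO2 with client compression) = 800,000 MB per group */
      /* preview the define statements, then run them for real */
      ./defgroups.pl admin secret DESKTOPS 800000
      ./defgroups.pl admin secret DESKTOPS 800000 execute

      /* switch the tape pool to group collocation and freeze the old tapes */
      update stgpool TAPEPOOL collocate=group
      update volume * access=readonly wherestgpool=TAPEPOOL wherestatus=filling,full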

 13. A process to minimize tape mounts
     • With collocation by group turned on, a move or reclaim within the tape pool needs an output tape mount for each collocgroup on the input tape
       – Potentially very slow and stressful for the tape drives
     • The solution is to move data from tape to devt=file pools on disk, where files are regrouped, and then migrate back to tape

 14. Storage pools
     • 3 sequential pools on disk
       – seqdisk3, seqdisk4, seqdisk5
     • 2 pools receive data from tapes
       – Seqdisk3 & 4 each have two 69GB volumes
       – Not collocated; moves don't reconstruct
     • Seqdisk5 receives data from seqdisk3 & 4
       – 170 8GB volumes on ten 146GB drives, each with its own file system
       – Collocated by group; moves reconstruct
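
A hedged sketch of how such pools could be defined; the device class names, directories, scratch-volume approach, and the next-pool wiring back to the tape pool are my assumptions, not details from the presentation:

      /* large FILE volumes for the two intake pools (not collocated) */
      define devclass BIGFILE devtype=file maxcapacity=69G directory=/tsm/seq3
      define stgpool SEQDISK3 BIGFILE maxscratch=2 collocate=no
      define stgpool SEQDISK4 BIGFILE maxscratch=2 collocate=no

      /* small FILE volumes for the staging pool that is collocated by group
         and migrates back to the tape pool */
      define devclass GRPFILE devtype=file maxcapacity=8G directory=/tsm/fs01
      define stgpool SEQDISK5 GRPFILE maxscratch=170 collocate=group nextstgpool=TAPEPOOL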

 15. The schedules and scripts
     • Each script is executed every 10 minutes by schedules
       – 6 similar schedules for each script
       – For example, a script runs at 00:00, 00:10, 00:20, etc.
     • T4_VOLUMES_ODD moves odd-numbered volumes to seqdisk3
     • T4_VOLUMES_EVEN moves even-numbered volumes to seqdisk4
     • T4_MOVES moves seqdisk3 & 4 volumes to seqdisk5
     • T4_MIGRATES starts migration of seqdisk5 to tape
     • T4_VOLUMES_DIRECT moves some tapes directly tape to tape
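
Administrative schedules of that era repeat at most hourly, which is presumably why each script needs six copies offset ten minutes apart. A hedged example for the T4_MOVES script; the schedule names are invented:

      define schedule T4_MOVES_00 type=administrative cmd="run T4_MOVES" active=yes starttime=00:00 period=1 perunits=hours
      define schedule T4_MOVES_10 type=administrative cmd="run T4_MOVES" active=yes starttime=00:10 period=1 perunits=hours
      /* ...and likewise at :20, :30, :40, and :50 */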

 16. SQL to make the scripts
     • Use a file as a macro to create the script
     • The T4_VOLUMES* scripts have a prolog with logic (see the sketch after this list)
       – Check whether backuppool migration is running; exit if yes
       – Check whether SEQDISK3 is being used; exit if yes
       – Check for space in SEQDISK3; run only if there is some
     • Run SQL to select the odd/even volumes ordered by pct_utilized and append it to the file
     • Each volume needs 4 lines in the script:
       – test whether the volume is still full or filling
       – a goto to the script lines that issue a move command
       – the move command itself
       – an exit
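
A hedged, much-abbreviated sketch of what one generated script body might look like; the volume names, labels, and the prolog check are all invented here, and the real prolog was certainly more involved:

      /* prolog: bail out if migration is already running */
      select * from processes where process='Migration'
      if (rc_ok) exit
      /* per-volume block (4 lines each), appended by the SQL step */
      select volume_name from volumes where volume_name='A00017' and status in ('FULL','FILLING')
      if (rc_notfound) goto vol2
      move data A00017 stgpool=SEQDISK3 wait=yes
      exit
      vol2:select volume_name from volumes where volume_name='A00123' and status in ('FULL','FILLING')
      if (rc_notfound) goto done
      move data A00123 stgpool=SEQDISK3 wait=yes
      exit
      done:exit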

 17. Other methods to move all that data
     • Direct tape to tape within the pool
       – Not as bad as I had feared!
       – Analyzed which tapes had the fewest groups on them and moved those tape to tape
         • Of 278 tapes, 219 have 30 or more groups (42 max)
     • Move nodedata directly tape to tape
       – Move nodedata list-of-all-the-nodes-in-group
       – Needs extra scratch tapes because the source tapes aren't emptied quickly
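
For the move nodedata route, the command takes a comma-separated node list, so one invocation per group looks roughly like this; the node and pool names are invented, and with no target pool named the data stays within the same tape pool:

      /* one move nodedata per collocation group; the list is every member of the group */
      move nodedata WS0101,WS0217,WS0054 fromstgpool=TAPEPOOL wait=yes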

 18. The results so far
     • Started on Aug-5, results as of Sep-8
     • Volumes
       – 160 to process
       – Average nodes / volume is 188
       – Max is 485
       – 10 are over 400
     • Nodes
       – 1629 nodes
       – Average 22 volumes / node
       – Max is 63
       – 4 are over 50

 19. Summary
     • Match your process to your resources
       – Does your disk write speed match your tape read speed?
     • The more groups you have, the longer a tape-to-tape move or reclaim will take
     • Do 2 processes?
       – Few cg's on a tape: do tape to tape
       – Lots of cg's on a tape: do tape to file
