
Content Replication in I2-DSI using Rsync+



  1. Content Replication in I2-DSI using Rsync+
     Bert J Dempsey, Debra Weiss
     University of North Carolina at Chapel Hill
     dempsey@ils.unc.edu

     Multiple-site replication in I2-DSI
     http://dsi.internet2.edu/

  2. Replicating Channels
     - Content import from the channel provider to the master I2-DSI node
     - Content replication from the master node to all replication sites (S1, S2, S3) that carry the channel
     - Clients are served by the replication sites

     Rsync+ for Content Replication
     - At M (master): master-side `rsync+ -F srcLatest/ src/` produces the updates
     - The updates are sent from M to the slaves by (m)ftp
     - At each of S1, S2, S3: slave-side `rsync+ -f src/` applies the updates
     - Rsync is a popular filesystem sync tool
     - Rsync+ is our modification of rsync that captures update information locally for store-and-forward communication
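The master-side/slave-side split on this slide can be illustrated with a toy store-and-forward model. The Python sketch below is only an illustration under stated assumptions: it works on in-memory dictionaries at whole-file granularity, which is not how rsync computes deltas, and the names `make_updates` and `apply_updates` are hypothetical rather than part of Rsync+.

```python
def make_updates(src_latest, src):
    """Master side: capture what must change to turn `src` into `src_latest`.

    Both arguments map file names to file contents (bytes). The result is a
    self-contained record that can be shipped to replicas later.
    """
    changed = {name: data for name, data in src_latest.items()
               if src.get(name) != data}
    deleted = sorted(name for name in src if name not in src_latest)
    return {"changed": changed, "deleted": deleted}


def apply_updates(replica, updates):
    """Slave side: apply a captured updates record without contacting the master."""
    result = dict(replica)
    result.update(updates["changed"])
    for name in updates["deleted"]:
        result.pop(name, None)
    return result


# Hypothetical channel content before and after an import at the master.
old = {"a.txt": b"1", "b.txt": b"2", "d.txt": b"4"}
new = {"a.txt": b"1", "b.txt": b"22", "c.txt": b"3"}

updates = make_updates(new, old)                 # computed once, at M
replicas = {site: apply_updates(old, updates)    # applied at S1, S2, S3
            for site in ("S1", "S2", "S3")}
```

The point of the design is that the comparison runs once at the master between two local directories, and every replica site only needs the resulting updates record.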

  3. Server Experiment
     - Instrumented mirror
       - Active Linux repository (8 GB, 25,000 files)
       - Twice-daily synchronization
     - On dsi.ncni.net:
       - `rsync+ -F`: perform master-side rsync+ processing between two local directories to create the updates file
       - `rsync+ -f`: use the updates file to perform slave-side rsync+ processing

     Content Change Patterns
     - Data from a 1-month Linux mirror
     - One update per 12-hour period
     - No files changed in 13 of 60 periods (21%)
     - Average size of updated data (all periods)
       - 0.144% of aggregate archive
       - 0.104% under rsync+
     - Maximum size of updated data
       - 2.42% of mirror
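To give the percentages on this slide a sense of scale, the short Python sketch below converts them into data volumes for the 8 GB mirror. It assumes decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes), which the slide does not state, and the variable names are illustrative only.

```python
MIRROR_BYTES = 8e9  # 8 GB mirror, assuming 1 GB = 10**9 bytes


def percent_of_mirror_mb(percent):
    """Convert a percentage of the mirror into megabytes (10**6 bytes)."""
    return MIRROR_BYTES * percent / 100 / 1e6


avg_update_mb = percent_of_mirror_mb(0.144)         # average changed data per period
avg_rsync_plus_mb = percent_of_mirror_mb(0.104)     # average under rsync+
max_update_mb = percent_of_mirror_mb(2.42)          # largest single update
idle_fraction = 13 / 60                             # periods with no changed files

print(f"average update: {avg_update_mb:.2f} MB")
print(f"average under rsync+: {avg_rsync_plus_mb:.2f} MB")
print(f"maximum update: {max_update_mb:.1f} MB")
print(f"idle periods: {idle_fraction:.1%}")
```

Under these assumptions a typical 12-hour update is on the order of 10 MB, while the worst observed update is close to 200 MB.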

  4. Rsync+ processing cost
     [Figure: run time of `rsync+ -F` and `rsync+ -f` as a % of rsync run time, per mirror update (one update per 12 hours).]

     Rsync+ Local Throughput
     [Figure: runtime (sec), unnormalized throughput, and normalized throughput (Mbits/sec), per mirror update (12 hours/update).]

  5. Network Throughput: ttcp experiment parameters

     | Parameter | Values |
     |---|---|
     | File size | 5.45 MB |
     | Network path | dsi.ncni.net to ils.unc.edu (100 Mbit/s min) |
     | Concurrent ttcp connections | 1, 2, 4, 8, 16, 24, 32 |
     | Receiver socket buffer size (KB) | 240 KB |
     | Buffer policy | 240KB / 240KB shared |

     Network Throughput: concurrent ttcp transmits
     [Figure: throughput (Mbits/s) versus number of concurrent ttcp transmits (1, 2, 4, 8, 16, 24, 32). Data labels on the chart, in descending order: 27.03944, 9.62592, 8.23088, 6.29856, 4.50688, 2.8464, 2.24.]

  6. Network experiments: setting socket buffer sizes
     [Figure: dsi2ils sept14 (1 ttcp, avg. over 6 runs); throughput (Kbytes/s) versus buffer size (bytes).]

     Network Throughput: concurrent ttcp tputs
     [Figure: per-connection and aggregate throughput (Mbits/s, log scale) under Buffer Policy 1 and Buffer Policy 2, versus concurrent ttcp transmits (avg over runs).]

  7. Baseline Scalability Analysis using empirical inputs
     - Update of content
       - 0.1% avg, 2.4% maximum
     - Network tput
       - 8 servers, thus 6.2 Mbits/sec to each
     - Server tput (local rsync actions)
       - Master: 11.4 Mbits/sec
       - Slave: 8.18 Mbits/sec

     Baseline Scalability Analysis: end-to-end update latency

     | Channel size | Content updates | Master processing latency | Network latency | Slave processing latency | End-to-end update latency |
     |---|---|---|---|---|---|
     | 10 GB | Avg: 10 MB | 7 sec | 12.9 sec | 9.7 sec | 29.6 sec |
     | 10 GB | Max: 240 MB | 168 sec | 309 sec | 233 sec | 710 sec |
     | 100 GB | Avg: 100 MB | 70 sec | 129 sec | 97 sec | 296 sec |
     | 100 GB | Max: 2.4 GB | 28 min | 51.5 min | 38.8 min | 118.3 min |
     | 1 TB | Avg: 1 GB | 11.7 min | 21.5 min | 16.1 min | 49.3 min |
     | 1 TB | Max: 24 GB | 280.8 min | 516 min | 386.4 min | 19.72 hrs |
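The latency figures on this slide appear consistent with a simple sequential model: each stage takes the update size divided by that stage's throughput, and the end-to-end latency is the sum of the master, network, and slave stages. The Python sketch below reproduces that arithmetic. The function name and the decimal units (1 MB = 10^6 bytes) are assumptions, and the results differ slightly from the slide, presumably because the slide rounds intermediate values.

```python
# Empirical throughputs from the slide, in Mbits/sec.
MASTER_MBPS = 11.4    # master-side rsync+ processing
NETWORK_MBPS = 6.2    # per-server share with 8 servers
SLAVE_MBPS = 8.18     # slave-side rsync+ processing


def update_latency_seconds(update_bytes):
    """Return per-stage and end-to-end latency for one update, in seconds.

    Assumes the three stages run strictly one after another and that
    1 Mbit = 10**6 bits.
    """
    megabits = update_bytes * 8 / 1e6
    stages = {
        "master": megabits / MASTER_MBPS,
        "network": megabits / NETWORK_MBPS,
        "slave": megabits / SLAVE_MBPS,
    }
    stages["end_to_end"] = sum(stages.values())
    return stages


# Average (0.1%) and maximum (2.4%) updates for each channel size.
for channel_bytes in (10e9, 100e9, 1e12):
    for label, fraction in (("avg", 0.001), ("max", 0.024)):
        result = update_latency_seconds(channel_bytes * fraction)
        print(f"{channel_bytes / 1e9:6.0f} GB {label}: "
              f"{result['end_to_end'] / 60:8.1f} min end to end")
```

For the 10 MB average update of a 10 GB channel this model gives about 29.7 seconds end to end, against 29.6 seconds on the slide; for the 24 GB maximum update of a 1 TB channel it gives about 19.8 hours, against 19.72 hours.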

  8. Conclusions
     - Our work creates a scalable design for a filesystem-level data synchronization tool
     - Measurements on current systems, without tuning, suggest that O(100 GB) of content can be handled for the initial server set
     - For TB-scale content, system advances will need to provide speed-ups
       - Tuning
       - Hardware
       - Distributed processing
