Dynamic Federations: Seamless aggregation of standard-protocol-based storage endpoints


  1. Dynamic Federations: Seamless aggregation of standard-protocol-based storage endpoints.
     Fabrizio Furano, Patrick Fuhrmann, Paul Millar, Daniel Becker, Adrien Devresse, Oliver Keeble, Ricardo Brito da Rocha, Alejandro Alvarez. Credits to ShuTing Liao (ASGC).

  2. WLCG Computing Model [diagram: worker nodes run the experiment application via CernVM-FS and read their data from storage]

  3. Storage Federations: Motivations
     • Currently data lives on islands of storage: catalogues are the maps, FTS/gridFTP are the delivery companies.
     • Experiment frameworks populate the islands; jobs are directed to the places where the needed data is, or should be.
     • Almost all data lives on more than one island.
     • This assumes perfect storage (unlikely to impossible) and perfect experiment workflows and catalogues (unlikely).
     • Strict locality has limitations: a single missing file can derail a whole job, or a series of jobs. Failover to data on another island could help.
     • Replica catalogues impose limitations too: e.g. synchronization is difficult, and so is performance.
     • Hence the quest for direct, Web-like forms of data access.
     • Great plus: other use cases may be fulfilled, e.g. site caching or sharing storage amongst sites.

  4. Storage federations
     • What's the goal?
       – Make different storage clusters be seen as one.
       – Make global file-based data access seamless.
     • How should this be done?
       – Dynamically: easy to set up and maintain, no complex metadata persistency, no DB babysitting (keep that for the experiment's metadata), and no replica catalogue inconsistencies, by design.
       – With light configuration constraints on the participating storage.
       – Using standards: no strange APIs, everything looks familiar, and global data gets global direct access.

  5. The basic idea [diagram] Two storage/metadata endpoints are aggregated into a single view. Endpoint 1 holds /dir1/file1 and /dir1/file2; endpoint 2 holds /dir1/file2 and /dir1/file3. The aggregation shows /dir1 containing file1, file2 (with 2 replicas) and file3. All the metadata interactions are hidden from the client; no persistency is needed here, just efficiency and parallelism. A toy sketch of this merge follows.
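The merge the slide illustrates can be made concrete with a small sketch. This is not the UGR code: the endpoint contents are hard-coded dictionaries standing in for the live HTTP/DAV listings, and all names are invented for illustration.

```python
from collections import defaultdict

# Stand-ins for two live storage/metadata endpoints; the real system
# discovers these listings on the fly over HTTP/DAV, with no persistency.
ENDPOINTS = {
    "endpoint1": {"/dir1": ["file1", "file2"]},
    "endpoint2": {"/dir1": ["file2", "file3"]},
}

def aggregate_listing(path):
    """Merge the per-endpoint listings of one directory, tracking replicas."""
    replicas = defaultdict(list)
    for name, tree in ENDPOINTS.items():
        for entry in tree.get(path, []):
            replicas[entry].append(name)
    return replicas

for entry, where in sorted(aggregate_listing("/dir1").items()):
    print(f"/dir1/{entry}: {len(where)} replica(s) on {where}")
# /dir1/file2 shows up with 2 replicas, exactly as in the slide's diagram.
```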

  6. Dynamic HTTP Federations
     • Federation: simplicity, redundancy, storage/network efficiency, elasticity, performance.
       – Dynamic: does everything on the fly, no DB.
     • Focus on HTTP/DAV: standard clients everywhere, one protocol for everything (WAN/LAN), transparent redirection.
     • Use cases:
       – easy, direct job/user data access, WAN friendly;
       – access to missing files after a job starts;
       – friend sites can share storage;
       – cache integration (future).

  7. What is federated?
     • We federate (meta)data repositories that are 'compatible':
       – HTTP interface;
       – name space (modulo simple prefixes), including catalogues;
       – permissions (they don't contradict across sites);
       – content (the same key or filename means the same file, modulo translations).
     • Metadata is discovered dynamically and transparently:
       – the federation looks like a unique, very fast file metadata system;
       – it properly presents the aggregated metadata views;
       – it redirects clients to the geographically closest endpoint. The local SE is preferred, and the system can also load a "Geo" plugin, as sketched below.
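The "Geo" selection can be pictured with a toy function that picks the replica nearest to the client. The replica URLs, the coordinates and the flat-earth distance metric are all stand-in assumptions; a real plugin would rely on GeoIP-style data.

```python
import math

# Hypothetical replica locations (latitude, longitude); invented for illustration.
REPLICAS = {
    "https://se.cern.example/f": (46.2, 6.1),    # near Geneva
    "https://se.desy.example/f": (53.6, 9.9),    # near Hamburg
    "https://se.asgc.example/f": (25.0, 121.5),  # near Taipei
}

def closest_replica(client_lat, client_lon):
    """Pick the replica URL whose coordinates are nearest to the client."""
    return min(
        REPLICAS,
        key=lambda url: math.hypot(REPLICAS[url][0] - client_lat,
                                   REPLICAS[url][1] - client_lon),
    )

print(closest_replica(48.9, 2.4))   # a client near Paris is sent to CERN
```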

  8. What is federated?
     • Technically, TODAY we can aggregate:
       – SEs with DAV/HTTP interfaces: dCache, DPM (future: Xrootd? EOS? StoRM?);
       – catalogues with DAV/HTTP interfaces: LFC is supported (future: experiment catalogues could be integrated);
       – cloud DAV/HTTP/S3 services;
       – anything else that happens to have an HTTP interface;
       – caches;
       – native LFC and DPM databases.

  9. Why HTTP/DAV?
     • It's everywhere: a very widely adopted technology.
     • It has the right features: redirection, WAN friendliness.
     • Convergence: transfers and data access with no other protocols required.
     • We (humans) like browsers; they give an experience of simplicity, open to direct access and integrated web apps.

  10. DPM/HTTP
     • DPM has invested significantly in HTTP as part of the EMI project:
       – a new HTTP/DAV interface;
       – parallel WAN transfers;
       – 3rd-party copy;
       – solutions for replica fallback ("global access" and Metalink);
       – performance evaluations: experiment analyses, HammerCloud, synthetic tests, ROOT tests.

  11. Demo
     • We have set up a stable demo testbed, using HTTP/DAV:
       – head node at DESY: http://federation.desy.de/myfed/
       – a DPM instance at CERN;
       – a DPM instance at ASGC (Taiwan);
       – a dCache instance at DESY;
       – a cloud storage account by Deutsche Telekom.
     • The feeling it gives is surprising: metadata performance is on average higher than when contacting the endpoints directly.
     • We see the directories as merged, as if they were only one system.
     • There is one test file in 3 sites, i.e. 3 replicas: /myfed/atlas/fabrizio/hand-shake.JPG
       – clients in Europe get the one from DESY/DT/CERN;
       – clients in Asia get the one from ASGC.
     • There is a directory whose content is interleaved between CERN and DESY: http://federation.desy.de/myfed/dteam/ugrtest/interleaved/
     • There is a directory where all the files are in two places: http://federation.desy.de/myfed/dteam/ugrtest/all/
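Since the federation speaks plain HTTP, any stock client can exercise the demo. A minimal sketch, assuming the 2012 testbed URL above is still reachable (it may well not be); the point is only that an ordinary redirect-following GET is all a client needs.

```python
import urllib.request

# The demo file from the slide; urlopen follows the federation's
# redirect to the closest replica automatically.
url = "http://federation.desy.de/myfed/atlas/fabrizio/hand-shake.JPG"
with urllib.request.urlopen(url) as resp:
    print("Served from:", resp.geturl())   # the endpoint chosen for us
    print("HTTP status:", resp.status)
```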

  12. Example [architecture diagram] A client contacts the frontend (Apache2 + DMLite), which hosts the aggregator (UGR). UGR loads plugins: a DMLite plugin towards an LFC or a database, and DAV/HTTP plugins towards the SEs and an LFC, all speaking plain DAV/HTTP.

  13. Design and performance
     • Full parallelism: the aggregated metadata views are composed on the fly by managing parallel information-location tasks, so latencies never stack up (see the sketch below).
     • The endpoints are treated in a completely independent way: no global locks or serialisations. Thread pools and producer/consumer queues are used extensively (e.g. to stat N items in M endpoints while X clients wait for some of the items).
     • Aggressive metadata caching keeps the performance high: peak raw cache performance is ~500K to 1M hits/s per core.
     • A relaxed, hash-based, in-memory partial name space juggles information so that it always contains what is needed. Kept in an LRU fashion, it acts as a fast first-level namespace cache.
     • Clients are stalled only for the minimum time necessary to juggle their information bits.
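A minimal sketch of the "never stack up latencies" point: stat the same item on all endpoints concurrently, so the total wait is roughly the slowest endpoint rather than the sum. The endpoint names and the simulated round-trip time are assumptions; the real UGR does this in C++ with thread pools and producer/consumer queues.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

ENDPOINTS = ["se1.example.org", "se2.example.org", "se3.example.org"]  # invented

def remote_stat(endpoint, path):
    time.sleep(0.2)                      # stand-in for one WAN round trip
    return endpoint, {"path": path, "size": 1024}

def federated_stat(path):
    # One task per endpoint, all in flight at once; no global lock.
    with ThreadPoolExecutor(max_workers=len(ENDPOINTS)) as pool:
        futures = [pool.submit(remote_stat, ep, path) for ep in ENDPOINTS]
        return [f.result() for f in as_completed(futures)]

t0 = time.time()
replies = federated_stat("/dir1/file2")
print(f"{len(replies)} replies in {time.time() - t0:.2f}s, not 3 x 0.2s")
```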

  14. Server architecture
     • Clients come in and are distributed across different machines (DNS alias) and different processes (Apache configuration).
     • Clients are served by the UGR. They can browse/stat, or be redirected for action.
     • The architecture is multi/manycore friendly and uses a fast parallel caching scheme.

  15. Name translation
     • A sophisticated name-translation scheme is the key to being able to federate almost any source of metadata.
       – UGR implements algorithmic translations and can accommodate non-algorithmic ones as well (a sketch of the algorithmic case follows).
       – A plugin could also query an external service (e.g. an LFC or a private DB).
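The algorithmic (prefix-based) case is easy to picture. A minimal sketch, with made-up endpoint prefixes; a non-algorithmic translation would replace the string arithmetic with a lookup in an external service.

```python
# Hypothetical mapping: federated prefix -> endpoint-local base URLs.
PREFIX_MAP = {
    "/myfed/atlas": [
        "https://dpm.cern.example/dpm/cern.ch/home/atlas",
        "https://dcache.desy.example/pnfs/desy.de/atlas",
    ],
}

def to_endpoint_urls(fed_path):
    """Algorithmic translation: swap the federation prefix for each endpoint's."""
    for prefix, bases in PREFIX_MAP.items():
        if fed_path.startswith(prefix):
            return [base + fed_path[len(prefix):] for base in bases]
    raise ValueError("no federated prefix matches " + fed_path)

for url in to_endpoint_urls("/myfed/atlas/fabrizio/hand-shake.JPG"):
    print(url)
```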

  16. Design and performance
     • Horizontally scalable deployment: multithreaded and DNS-balanceable.
     • High-performance DAV client implementation:
       – wraps DAV calls into a POSIX-like API, saving users from composing requests and responses by hand (a rough sketch follows);
       – performance is privileged: it uses libneon with session caching;
       – compound list/stat operations are supported;
       – it is loaded by the core as a "location" plugin.
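What "wrapping DAV calls into a POSIX-like API" means can be sketched with the standard library: a stat()-like call implemented as a single WebDAV PROPFIND with Depth: 0. This is an illustration only; the actual UGR plugin is built on libneon with session caching, not on http.client.

```python
import http.client
import xml.etree.ElementTree as ET

def dav_stat(host, path):
    """A POSIX-stat lookalike backed by one WebDAV PROPFIND request."""
    body = ('<?xml version="1.0"?>'
            '<propfind xmlns="DAV:"><prop>'
            '<getcontentlength/><getlastmodified/>'
            '</prop></propfind>')
    conn = http.client.HTTPConnection(host)
    conn.request("PROPFIND", path, body,
                 {"Depth": "0", "Content-Type": "application/xml"})
    resp = conn.getresponse()
    if resp.status != 207:          # 207 Multi-Status is the DAV success reply
        raise OSError(f"PROPFIND on {path} failed: {resp.status}")
    tree = ET.fromstring(resp.read())
    size = tree.findtext(".//{DAV:}getcontentlength")
    mtime = tree.findtext(".//{DAV:}getlastmodified")
    return {"size": int(size) if size else None, "mtime": mtime}

# e.g. dav_stat("federation.desy.de", "/myfed/atlas/fabrizio/hand-shake.JPG")
```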

  17. A performance test
     • Two endpoints: DESY and CERN (a poor VM).
     • One UGR frontend at DESY.
     • A swarm of test clients at CERN.
     • 10K files in a 4-level-deep directory; the files exist on both endpoints.
     • The test (written in C++) stats each file only once, with many parallel clients calling stat() at maximum pace from 3 machines.

  18. The result, WAN access [chart not transcribed]

  19. Another test, LAN, cache impact [chart not transcribed]

  20. Another test, LAN, access patterns [chart not transcribed]

  21. Get started
     • Get it here: https://svnweb.cern.ch/trac/lcgdm/wiki/Dynafeds
     • What you can do with it:
       – easy, direct job/user data access, WAN friendly;
       – access to missing files after a job starts;
       – friend sites can share storage;
       – diskless sites;
       – federating catalogues, combining catalogue-based and catalogue-free data.

  22. Next steps
     • Release our beta, as the nightlies are good.
     • More massive tests, with many, possibly distant endpoints: we are now looking for partners.
     • Precise performance measurements.
     • Refine the handling of the 'death' of endpoints.
     • Immediate sensing of changes in the endpoints' content (e.g. add, delete): SEMsg in the EMI2 SYNCAT would be the right thing in the right place.
     • Some more practical experience (getting used to the idea, using SQUIDs, CVMFS, EOS, clouds, ... <put your item here>).
