Managing Scientific Data with NDN

Chengyu Fan, Susmit Shannigrahi, Steve DiBenedetto, Catherine Olschanowsky, Christos Papadopoulos. NDNcomm 2015, Sept 28, 2015, Los Angeles, CA. Supported by NSF #13410999 and NSF #1345236.


  1. Managing Scientific Data with NDN
     Chengyu Fan, Susmit Shannigrahi, Steve DiBenedetto, Catherine Olschanowsky, Christos Papadopoulos
     NDNcomm 2015, Sept 28, 2015, Los Angeles, CA
     Supported by NSF #13410999 and NSF #1345236

  2. Introduction
     • Scientific data is often very large and complex
       • Climate - CMIP5: 3.5 PB, CMIP6: 350 PB-3 EB
       • Physics - ATLAS: 4 PB/year
       • Astronomy, bioinformatics, others…
     • Science infrastructure
       • Cutting-edge hardware but often incompatible domain software (ESGF, xrootd, etc.)
       • Complexity, replication, redundancy

  3. Our Project
     • Build and deploy software to evaluate NDN in scientific applications over a dedicated hardware infrastructure
     • Evaluate NDN in the context of:
       • Application services: publishing, discovery, retrieval, access control, load balancing, failover, caching, etc.
       • Network integration (OSCARS, SDN, etc.)
     • Metrics
       • Performance, reduced complexity, ease of deployment, interoperability, reuse, efficiency, routing, security/trust, etc.

  4-10. NDN Layer Structure
     [Diagram, built up over seven slides. The build starts with two hosts on UDP/IP and ends with these labels: host (x2), APP (x2), router (x2), NDN (x4), LINK (x2), and ETH, UDP/IP, Other (x2 each).]

  11. Methodology
     • Investigate the use of NDN as a common platform for scientific data applications by:
       • Understanding data management challenges of various scientific domains
       • Developing and evaluating prototype applications that leverage NDN's features
       • Using prototypes to further drive NDN research

  12. First Step – Build a Catalog
     • Create a shared resource: a distributed, synchronized catalog of names over NDN
     • Provide common operations such as publishing, discovery, access control
     • Catalog only deals with name management, not dataset retrieval
     • Platform for further research and experimentation
     • Research questions:
       • Namespace construction, distributed publishing, key management, UI design, failover, etc.
       • Functional services such as subsetting
       • Mapping of name-based routing to tunneling services (VPN, OSCARS, MPLS)

  13-22. Overview of Catalog Workflow
     [Diagram, built up over ten slides. Components: Publisher, Consumer, Catalog node 1, Catalog node 2, Catalog node 3, and two Data storage nodes, connected over NDN.]
     (1) Publish dataset names
     (2) Sync changes
     (3) Query for dataset names
     (4) Retrieve data
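The publish, sync, and query steps of the workflow can be sketched with a small in-memory model. This is only an illustration under assumptions: `CatalogNode`, `publish`, `sync_with`, and `query` are hypothetical names, the dataset names are made up, and the sync is a naive set union rather than an NDN sync protocol. Retrieval (step 4) is left out because the catalog manages names only.

```python
def is_under(name, prefix):
    """True when `name` equals `prefix` or sits below it, by name component."""
    n = [c for c in name.split("/") if c]
    p = [c for c in prefix.split("/") if c]
    return n[:len(p)] == p


class CatalogNode:
    """Hypothetical in-memory stand-in for one catalog node."""

    def __init__(self, label):
        self.label = label
        self.names = set()

    def publish(self, dataset_names):
        # Step 1: a publisher hands dataset names to one catalog node.
        self.names.update(dataset_names)

    def sync_with(self, other):
        # Step 2: naive sync; both nodes end up with the union of names.
        merged = self.names | other.names
        self.names = set(merged)
        other.names = set(merged)

    def query(self, prefix):
        # Step 3: a consumer asks any node for names under a prefix.
        return sorted(n for n in self.names if is_under(n, prefix))


node1, node2 = CatalogNode("node1"), CatalogNode("node2")
node1.publish(["/cmip5/example/tas", "/cmip5/example/pr"])  # made-up names
node1.sync_with(node2)
results = node2.query("/cmip5/example")
```

After the sync, a consumer can query `node2` even though the names were published at `node1`.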

  23. NDN-Science Testbed
     • NSF CC-NIE campus infrastructure award
     • 10G testbed (courtesy of ESnet, UCAR, and CSU Research LAN)
     • Currently ~50 TB of CMIP5, ~70 TB of HEP data

  24. Demos
     • Search
     • Publication and Sync
     • Access control
     • Retrieval and failover

  25. Conclusions
     • IP encourages common host access, not common data access methods
       • Does not encourage interoperability at the application level
     • NDN has the potential to unify the service interface required by scientific applications
     • Science testbed and prototypes to test hypotheses and drive research and experimentation
     • Ready-to-try catalog; we invite you to try it with your data
       • Catalog is general, supports a variety of applications
       • Currently CMIP5 and HEP applications
       • UI for data search and retrieval

  26. Our sponsors: NSF and ESnet
     Join us @ http://www.netsec.colostate.edu/mailman/listinfo/ndn-sci

  27. Backup Slides

  28-31. Current Example: xrootd
     [Diagram, built up over four slides. Components: a Client; a Manager running xrootd and cmsd (a.k.a. Redirector); three Data Servers A, B, and C, each running xrootd and cmsd. /my/file appears next to A and C. Step shown on the last slide: "4: Try open() at A".]
     • Fragile, fairly complex middleware

  32-36. xrootd under NDN
     [Diagram, built up over five slides. Components: a Client and three Data Servers A, B, and C, each running xrootd and cmsd, attached to NDN. /my/file appears next to A and C. The Client asks for the file by name: "? /my/file".]
     • Significantly reduced system complexity
     • Better service abstraction
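The idea behind these slides, asking for `/my/file` by name instead of opening it at a specific server, can be illustrated with a toy model. This is a sketch under assumptions: `fetch_by_name` and the `servers` dictionary are hypothetical, and the loop merely stands in for whatever forwarding the network would do. It is not an NDN or xrootd API.

```python
def fetch_by_name(name, servers):
    """Return (label, content) from the first reachable server holding `name`.

    `servers` maps a label to {"up": bool, "files": {name: content}}.
    Returns None when no reachable server has the name.
    """
    for label, server in servers.items():
        if server["up"] and name in server["files"]:
            return label, server["files"][name]
    return None


# Hypothetical state mirroring the slide: A and C hold /my/file, B does not.
servers = {
    "A": {"up": False, "files": {"/my/file": b"payload"}},  # A is down
    "B": {"up": True, "files": {}},
    "C": {"up": True, "files": {"/my/file": b"payload"}},
}
answer = fetch_by_name("/my/file", servers)
```

The caller never names a host, so the request is still answered (here by C) when A is unavailable.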

  37-42. Data Publication
     [Diagram between Catalog and Publisher, built up over six slides.]
     1) Listening on /<catalog-prefix>/publish
     2) Generate NDN names for datasets/services
     3) Request publish
     4) Fetch published name list
     5) Authenticate the Data and validate data name against trust model
     6) Share names with other catalogs
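The publication steps can be sketched as a message flow between two objects. Everything here is an assumption made for illustration: the `Publisher` and `Catalog` classes, the `under_prefix` placeholder check, and the policy of rejecting a whole list when one name fails. Listening on the catalog prefix (step 1) and signature authentication are not modeled.

```python
class Publisher:
    """Hypothetical publisher holding the names it wants to announce."""

    def __init__(self, prefix, names):
        self.prefix = prefix
        self._names = list(names)  # step 2: names generated for datasets

    def name_list(self):
        # Step 4: the catalog fetches this list after a publish request.
        return list(self._names)


class Catalog:
    """Hypothetical catalog; `validate(prefix, name)` is the trust check."""

    def __init__(self, validate):
        self.validate = validate
        self.names = set()
        self.peers = []

    def on_publish_request(self, publisher):
        # Steps 3-4: a request arrives, then the catalog pulls the list.
        names = publisher.name_list()
        # Step 5: assumed policy, reject the whole list if any name fails.
        if not all(self.validate(publisher.prefix, n) for n in names):
            return []
        self.names.update(names)
        # Step 6: share accepted names with the other catalogs.
        for peer in self.peers:
            peer.names.update(names)
        return names


def under_prefix(prefix, name):
    # Placeholder trust check: the name must sit below the publisher prefix.
    return name.startswith(prefix.rstrip("/") + "/")


catalog, peer = Catalog(under_prefix), Catalog(under_prefix)
catalog.peers.append(peer)
good = Publisher("/PublisherA", ["/PublisherA/publish/file/3"])
bad = Publisher("/PublisherA", ["/PublisherB/publish/file"])
accepted = catalog.on_publish_request(good)
rejected = catalog.on_publish_request(bad)
```

The catalog pulls the name list rather than accepting pushed names, which matches the order of steps 3 and 4.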

  43-44. Keys for ndn-atmos
     [Key hierarchy diagram, built up over two slides.]
     • Self-signed root key: /cmip5/KEY
     • Site’s keys (the root key signs these): /cmip5/lbl/KEY, /cmip5/nwsc/KEY, …
     • Application’s keys (labels on the slide: Dataset names publishing, NLSR): /cmip5/nwsc/<operator>/KEY, /cmip5/lbl/<DataPublisher>/KEY, /cmip5/nwsc/<router>/KEY
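The key names suggest a hierarchy that can be walked by name alone. The sketch below assumes that every key is signed by the key one level up in the namespace. The slides show this for the root and site keys, and it is an assumption here for application keys. `expected_signer` and `signing_chain` are hypothetical helpers, and `operator` stands in for the `<operator>` placeholder.

```python
def expected_signer(key_name, root="/cmip5/KEY"):
    """Return the name of the key assumed to sign `key_name`.

    Assumption: a key /a/b/KEY is signed by /a/KEY, and the root signs itself.
    """
    if key_name == root:
        return root  # self-signed root key
    parts = key_name.strip("/").split("/")
    if parts[-1] != "KEY" or len(parts) < 3:
        raise ValueError(f"not a key below the root: {key_name}")
    # Drop the component just before KEY to move one level up.
    return "/" + "/".join(parts[:-2] + ["KEY"])


def signing_chain(key_name, root="/cmip5/KEY"):
    """List key names from `key_name` up to the root, inclusive."""
    chain = [key_name]
    while chain[-1] != root:
        chain.append(expected_signer(chain[-1], root))
    return chain


chain = signing_chain("/cmip5/nwsc/operator/KEY")
```

A key outside the root's namespace raises `ValueError` instead of producing a chain.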

  45-46. Trust Model
     • Only namespace owners are allowed to publish data
     • Data provenance built into the data packet
     Valid publish message:
       Name: /PublisherA/publish
       Signature: Publisher A’s signature
       Content (data payload):
         - /PublisherA/publish/file/1
         - /PublisherA/publish/file/2
         + /PublisherA/publish/file/3
         + /PublisherA/publish/file/4
     Invalid publish message:
       Name: /PublisherA/publish
       Signature: Publisher A’s signature
       Content (data payload) includes:
         - /PublisherB/publish/file
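The namespace-ownership rule can be expressed as a check over a publish message. This is a minimal sketch, assuming the signature has already been verified and that the signer owns the namespace `/PublisherA`. The function names are hypothetical, and the `+` and `-` markers are treated as opaque operations (presumably add and remove).

```python
def components(name):
    """Split an NDN-style name such as '/a/b/c' into ['a', 'b', 'c']."""
    return [c for c in name.split("/") if c]


def is_under(name, prefix):
    """True when `name` equals `prefix` or sits below it, by component."""
    n, p = components(name), components(prefix)
    return n[:len(p)] == p


def is_valid_publish(owner_prefix, packet_name, entries):
    """Check a publish message against the signer's namespace.

    `entries` is a list of (op, name) pairs, with op '+' or '-'.
    Signature verification itself is assumed to have happened already.
    """
    if not is_under(packet_name, owner_prefix):
        return False
    return all(op in ("+", "-") and is_under(name, owner_prefix)
               for op, name in entries)


valid_msg = [
    ("-", "/PublisherA/publish/file/1"),
    ("-", "/PublisherA/publish/file/2"),
    ("+", "/PublisherA/publish/file/3"),
    ("+", "/PublisherA/publish/file/4"),
]
invalid_msg = [("-", "/PublisherB/publish/file")]

ok = is_valid_publish("/PublisherA", "/PublisherA/publish", valid_msg)
not_ok = is_valid_publish("/PublisherA", "/PublisherA/publish", invalid_msg)
```

Comparing whole name components, rather than raw string prefixes, keeps `/PublisherAB/...` from passing as part of `/PublisherA`.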
