Science Data and the NDN paradigm



slide-1
SLIDE 1

Science Data and the NDN paradigm

Inder Monga
CTO, ESnet
Division Deputy of Technology, Scientific Networking Division
Lawrence Berkeley National Lab

NDN Comm 2015

slide-3
SLIDE 3

Experimental and observational science deals with big and small instruments, and a lot of data!

  • Data volumes are increasing faster than Moore’s Law
  • New algorithms and methods for analyzing data
  • Infeasible to put a supercomputing center at every experimental facility

Computing Sciences Area

slide-4
SLIDE 4

All too common process of discovery

slide-16
SLIDE 16

‘Superfacility’ Vision: A network of connected facilities, software and expertise to enable new modes of discovery

[Diagram labels]
  • ESnet
  • Extreme Data Science Facility (XDSF)
  • New math; real-time analysis; high-performance software; novel compute/data platforms; data mgmt. and sharing; programmable network
  • MS-DESI, ALS, LHC, JGI, APS, LCLS, other data-producing sources

slide-18
SLIDE 18

ESnet is a dedicated mission network engineered to accelerate a broad range of science outcomes.

We do this by offering unique capabilities, and optimizing the network for data acquisition, data placement, data sharing, data mobility.

slide-19
SLIDE 19

ESnet is designed for different goals than the general Internet.

slide-21
SLIDE 21

August 2015: 29.13 PB

Lots of data to move around

slide-22
SLIDE 22

August 2015: 29.13 PB

Lots of data to move around (contd.)
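
To put the August 2015 figure in perspective, it can be converted to an average bit rate. The sketch below assumes that 29.13 PB is the total volume carried during the month (31 days) and that 1 PB = 10^15 bytes; neither assumption is stated on the slide, and the function name is purely illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_gbps(volume_pb: float, days: int) -> float:
    """Average bit rate in Gb/s for `volume_pb` decimal petabytes moved over `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    bits = volume_pb * 1e15 * 8          # assumption: 1 PB = 10**15 bytes
    seconds = days * SECONDS_PER_DAY
    return bits / seconds / 1e9


if __name__ == "__main__":
    # Assumption: the 29.13 PB figure covers all 31 days of August 2015.
    rate = average_rate_gbps(29.13, 31)
    print(f"{rate:.1f} Gb/s averaged over the month")
```

Under these assumptions the result is roughly 87 Gb/s sustained around the clock; since this is an average, peak rates are at least that high.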

slide-23
SLIDE 23

High-level objectives for scientific data: alignment with NDN approach

9/29/2015 9

  • Radically simplify how scientific users manage, move and manipulate large, distributed science data repositories, while keeping high end-to-end throughput
  • Abstract the storage and network capability and location dependence from the user-data interaction
  • Enable users to specify and retrieve the portions of data the workflow needs
  • Create a secure, scalable framework based on integrated data management and network transport
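
The objective of retrieving only the portion of data a workflow needs maps naturally onto hierarchical, location-independent names. The following is a minimal sketch in plain Python that uses no NDN library; the name layout, the `seg=<n>` component, the 8 MiB segment size, and both function names are hypothetical and chosen only for illustration.

```python
from typing import List

SEGMENT_BYTES = 8 * 1024 * 1024  # hypothetical segment size (8 MiB)


def make_name(*components: str) -> str:
    """Join components into a hierarchical name that carries no host or location."""
    return "/" + "/".join(c.strip("/") for c in components)


def segment_names(prefix: str, start_byte: int, end_byte: int,
                  segment_bytes: int = SEGMENT_BYTES) -> List[str]:
    """Return the names of the segments covering bytes [start_byte, end_byte)."""
    if start_byte < 0 or end_byte <= start_byte:
        raise ValueError("need 0 <= start_byte < end_byte")
    first = start_byte // segment_bytes
    last = (end_byte - 1) // segment_bytes
    return [f"{prefix}/seg={i}" for i in range(first, last + 1)]


if __name__ == "__main__":
    # Hypothetical dataset name; only the requested byte range is named.
    prefix = make_name("climate", "model-x", "precipitation", "2015")
    for name in segment_names(prefix, 20_000_000, 30_000_000):
        print(name)
```

The consumer asks by name for just these segments; which repository or cache answers is left to the network, which is the location independence the second bullet describes.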

slide-24
SLIDE 24

Use Case #1

Researchers from Berkeley Lab and SLAC conducted protein crystallography experiments at LCLS to investigate photoexcited states of PSII, with near-real-time computational analysis at NERSC.

“Taking snapshots of photosynthetic water oxidation using femtosecond X-ray diffraction and spectroscopy,” Nature Communications 5, 4371 (9 July 2014)

50TB moved a night
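
As a rough sanity check on what moving 50 TB in a night asks of the network, the sketch below computes the transfer time at a few link rates. The rates (10, 40, 100 Gb/s), the decimal definition of a terabyte, and the `efficiency` parameter are illustrative assumptions, not figures from the talk.

```python
def transfer_hours(volume_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `volume_tb` decimal terabytes over a link of `link_gbps`.

    `efficiency` is the assumed fraction of the link rate achieved end to end.
    """
    if link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link_gbps must be > 0 and efficiency in (0, 1]")
    bits = volume_tb * 1e12 * 8                      # assumption: 1 TB = 10**12 bytes
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


if __name__ == "__main__":
    for rate in (10, 40, 100):                       # hypothetical link rates in Gb/s
        print(f"{rate:>3} Gb/s: {transfer_hours(50, rate):.1f} h")
```

At a fully used 10 Gb/s the transfer takes about 11 hours, so it only just fits in a night; any loss of efficiency pushes it past that.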

slide-25
SLIDE 25

Use Case #2: LHCONE data – multiple replicas, global reach

slide-26
SLIDE 26

Use Case #3: International Climate Data

slide-29
SLIDE 29

Perception of limitations of NDN motivating research questions

1. If I am moving 50TB of data through a single path, from an experiment to a storage facility, I really do not want to cache it at every intermediate NDN node
   – What is the right strategy for allocating disk resources to caching? What if one data transfer consumes all cache resources or there is not enough space?
2. What is the performance of the end-to-end data transfer? How can I get line rate throughput?
3. How do I leverage the knowledge of network capability in choosing the transfer path? How do I build in the knowledge of underlay into the NDN overlay?
4. How do I leverage network programmability to do the above?
5. And many other questions….
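
Question 1 can be made concrete with a toy content store. The sketch below shows one possible policy and is not a recommendation from the talk: an LRU cache in which any single name prefix (standing in for one bulk transfer) may occupy at most a fixed share of the store. The class name, the `max_share` parameter, and the prefix-as-transfer convention are all assumptions made for illustration.

```python
from collections import OrderedDict


class QuotaLRUCache:
    """Toy LRU content store that caps the bytes any one name prefix may hold."""

    def __init__(self, capacity_bytes: int, max_share: float = 0.5):
        if capacity_bytes <= 0 or not 0 < max_share <= 1:
            raise ValueError("capacity must be > 0 and max_share in (0, 1]")
        self.capacity = capacity_bytes
        self.quota = int(capacity_bytes * max_share)  # per-prefix byte limit
        self.entries = OrderedDict()                  # (prefix, segment) -> size, oldest first
        self.used = 0
        self.used_by_prefix = {}

    def _evict(self, key) -> None:
        size = self.entries.pop(key)
        self.used -= size
        self.used_by_prefix[key[0]] -= size

    def insert(self, prefix: str, segment: int, size: int) -> bool:
        """Cache one segment; return False if it can never fit within the quota."""
        if size > self.quota:
            return False
        key = (prefix, segment)
        if key in self.entries:
            self._evict(key)
        # Over quota: the transfer evicts its own oldest segments first.
        while self.used_by_prefix.get(prefix, 0) + size > self.quota:
            self._evict(next(k for k in self.entries if k[0] == prefix))
        # Store full: fall back to evicting the globally least recently used entry.
        while self.used + size > self.capacity:
            self._evict(next(iter(self.entries)))
        self.entries[key] = size
        self.used += size
        self.used_by_prefix[prefix] = self.used_by_prefix.get(prefix, 0) + size
        return True

    def lookup(self, prefix: str, segment: int) -> bool:
        """Return True on a cache hit and mark the entry as recently used."""
        key = (prefix, segment)
        if key not in self.entries:
            return False
        self.entries.move_to_end(key)
        return True


if __name__ == "__main__":
    store = QuotaLRUCache(capacity_bytes=100, max_share=0.5)
    for seg in range(4):
        store.insert("/climate/popular", seg, 10)    # small, frequently requested data
    for seg in range(10):
        store.insert("/lcls/run1", seg, 10)          # one large bulk transfer
    print(store.used_by_prefix)
```

In this toy run the bulk transfer is held to half of the store and the smaller dataset stays cached. Whether a fixed share is the right policy, or whether bulk single-path transfers should bypass intermediate caches entirely, is exactly the open question the slide raises.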

slide-30
SLIDE 30

Where are we at?

  • Collaboration with Christos and Colorado State – high-powered NDN devices between three representative climate sites as a testbed
    – Susmit working on answering some of the high-level objectives as described
  • HEP and ASCR interest in NDN from a research perspective – paper earlier this year @ CHEP, and Phil will talk about next steps right after
  • Interest in expanding a federation of high-powered NDN devices with the right strategy for caching and data management
  • Combining NDN with SDN – we have a next-gen SDN testbed across US and Europe – can we combine that to provide the right primitives for high-performance NDN?
    – Let's do iterative experimentation and improvement!

slide-31
SLIDE 31

ESnet SDN Testbed

[Map labels]
  • ESnet sites: ALBQ, AMST, ANL, AOFA, ATLA, BNL, BOIS, BOST, CERN, CHIC, DENV, ELPA, FNAL, HOUS, KANS, LANL, LBL, LLNL, LOND, NASH, NERSC, NEWY, ORNL, PNNL, PNWG, SACR, SAND, SLAC, STAR, SUNN, WASH
  • Legend: ESnet PE Router; (2+)x10GE; (n)x10GE; Testbed Host; Deployed SDN Testbed node locations; Deployed SDN Testbed connectivity overlay (using OSCARS circuits)
  • Testbed overlay labels: AMST, CERN, AOFA, WASH, STAR, ATLA, DENV, LBL

Status Update:

  • Testbed deployed at all locations
  • QoS support verified, press release next week
  • ENOS demo on Testbed @ SC

August 2015 iDiscovery 2020 Inder Monga

slide-32
SLIDE 32

Thank you!

  • Please feel free to email me with questions, comments or arrows at imonga at es dot net