LHCnet: Proposal for LHC Network infrastructure extending globally

LHCnet: Proposal for LHC Network infrastructure extending globally to Tier2 and Tier3 sites
Artur Barczyk, Harvey Newman
California Institute of Technology / US LHCNet
LHCT2S Meeting, CERN, January 13th, 2011

THE PROBLEM TO SOLVE

LHC Computing Infrastructure
WLCG in brief:
• 1 Tier-0 (CERN)
• 11 Tier-1s; 3 continents
• 164 Tier-2s; 5 (6) continents
Plus O(300) Tier-3s worldwide

CMS Data Movements (All Sites and Tier1-Tier2)
[Figure: throughput in GBytes/s. Panel 1, 120 days, June-October: daily average total rates reach over 2 GBytes/s. Panel 2, 120 days, June-October: daily average T1-T2 rates reach 1-1.8 GBytes/s. Panel 3, 132 hours (10/6 to 10/10): 1 hour average up to 3.5 GBytes/s.]
• Tier2-Tier2 traffic is ~25% of Tier1-Tier2 traffic, rising to ~50% during dataset reprocessing & repopulation

Worldwide data distribution and analysis (F. Gianotti)
[Figure: total throughput of ATLAS data through the Grid, 1st January to November, in MB/s per day; levels marked at 6 GB/s and ~2 GB/s (design)]
• Peaks of 10 GB/s reached
• Grid-based analysis in Summer 2010: >1000 different users; >15M analysis jobs
• The excellent Grid performance has been crucial for fast release of physics results. E.g. ICHEP: the full data sample taken until Monday was shown at the conference on Friday

Changing LHC Data Models
• 3 recurring themes:
  – Flat(ter) hierarchy: in the future, any site might pull data from any other site hosting it
  – Data caching: analysis sites will pull datasets from other sites “on demand”, including from Tier2s in other regions
    • Possibly in combination with strategic pre-placement of data sets
  – Remote data access: jobs executing locally, using data cached at a remote site in quasi-real time
    • Possibly in combination with local caching
• Expect variations by experiment

Ian Bird, CHEP conference, Oct 2010

Remote Data Access and Local Processing with Xrootd (CMS)
• Useful for smaller sites with less (or even no) data storage
• Only selected objects are read (with object read-ahead); entire data sets are not transferred
• CMS demonstrator: Omaha diskless Tier3, served data from Caltech and Nebraska (Xrootd)
• Strategic decisions: remote access vs data transfers
• Similar operations in ALICE for years
Brian Bockelman, September 2010

Ian Bird, CHEP conference, Oct 2010

Requirements summary (from Kors' document)
• Bandwidth:
  – Ranging from 1 Gbps (Minimal site) to 5-10 Gbps (Nominal) to N x 10 Gbps (Leadership)
  – No need for full-mesh @ full-rate, but several full-rate connections between Leadership sites
  – Scalability is important:
    • Sites are expected to migrate Minimal → Nominal → Leadership
    • Bandwidth growth: Minimal = 2x/yr, Nominal & Leadership = 2x/2yr
• “Staging”:
  – Facilitate good connectivity to so far (network-wise) underserved sites
• Flexibility:
  – Should be able to include or remove sites at any time
• Budget neutrality:
  – Solution should be cost neutral [or at least affordable, A/N]
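
The growth rates in the requirements summary can be turned into a rough projection. The sketch below assumes plain exponential doubling, B(t) = B0 · 2^(t/T), which is one reading of "2x/yr" and "2x/2yr". The starting bandwidths (1, 10 and 20 Gbps) are illustrative picks from the quoted ranges, not figures stated in the requirements, and the function and table names are hypothetical.

```python
def projected_bandwidth_gbps(start_gbps: float, years: float,
                             doubling_period_years: float) -> float:
    """Bandwidth after `years`, assuming it doubles every `doubling_period_years`."""
    return start_gbps * 2 ** (years / doubling_period_years)


# (starting bandwidth in Gbps, doubling period in years).
# Doubling periods come from the slide; starting points are assumptions.
SITE_CLASSES = {
    "Minimal": (1.0, 1.0),      # 1 Gbps, 2x per year
    "Nominal": (10.0, 2.0),     # upper end of 5-10 Gbps, 2x per 2 years
    "Leadership": (20.0, 2.0),  # N x 10 Gbps with N = 2 assumed
}


def projection_table(years):
    """Map each site class to its projected bandwidth for every year in `years`."""
    return {
        name: [projected_bandwidth_gbps(start, y, period) for y in years]
        for name, (start, period) in SITE_CLASSES.items()
    }


if __name__ == "__main__":
    horizon = range(0, 5)
    for name, values in projection_table(horizon).items():
        row = ", ".join(f"{v:.1f}" for v in values)
        print(f"{name:<10} years 0-4 [Gbps]: {row}")
```

Under these assumptions a Minimal site grows faster in relative terms than a Nominal one, which is consistent with the expected Minimal → Nominal migration.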

SOLUTION PROPOSAL

Lessons learned
• The LHC OPN has proven itself; we should learn from it
• Simple architecture
  – Point-to-point Layer 2 circuits
  – Flexible and scalable topology
• Grew organically
  – From star to partial mesh
  – Open to several technology choices
    • each of which satisfies requirements
• Federated governance model
  – Coordination between stakeholders
  – No single administrative body required
  – Made extensions and funding straightforward
• Remaining challenge: monitoring and reporting
  – More of a systems approach

Design Inputs
• Given the scale, geographical distribution and diversity of the sites, as well as funding, only a federated solution is feasible
• The current LHC OPN is not modified
  – OPN will become part of a larger whole
  – Some purely Tier2/Tier3 operations
• Architecture has to be Open and Scalable
  – Scalability in bandwidth, extent and scope
• Resiliency in the core, allow resilient connections at the edge
• Bandwidth guarantees → determinism
  – Reward effective use
  – End-to-end systems approach
• Operation at Layer 2 and below
  – Advantage in performance, costs, power consumption

Design Inputs, cont.
• Most/all R&E networks (technically) can offer Layer 2 services
  – Where not, commercial carriers can
  – Some advanced ones offer dynamic (user controlled) allocation
• Leverage existing infrastructures and collaborations as much as possible
  – GLIF, DICE, GLORIAD, …
• Last but not least:
  – This would be the perfect occasion to start using IPv6; therefore we should (at least) encourage IPv6, but support IPv4
    • Admittedly the challenge is above Layer 3

Design Proposal
• A design satisfying all requirements: Switched Core with Routed Edge
• Sites interconnected through Lightpaths
  – Site-to-site Layer 2 connections, static or dynamic
• Switching is far more robust and cost-effective for high-capacity interconnects
• Routing (from end-site viewpoint) is deemed necessary

Switched Core
• Strategically placed core exchange points
  – E.g. start with 2-3 in Europe, 2 in NA, 1 in SA, 1-2 in Asia
  – E.g. existing devices at Tier1s, GOLEs, GEANT nodes, …
• Interconnected through high capacity trunks
  – 10-40 Gbps today, soon 100 Gbps
• Trunk links can be CBF, multi-domain Layer 1/Layer 2 links, …
  – E.g. Layer 1 circuits with virtualised sub-rate channels, sub-dividing 100G links in early stages
• Resiliency, where needed, provided at Layer 1/Layer 2
  – E.g. SONET/SDH Automated Protection Switching, Virtual Concatenation
• At a later stage, automated Lightpath exchanges will enable a flexible “stitching” of dynamic circuits
  – See demonstration (proof of principle) at last GLIF meeting and SC10

One Possible Core Technology: Carrier Ethernet
• IEEE standard 802.1Qay (PBB-TE)
  – Separation of backbone and customer network through MAC-in-MAC
  – No flooding, no Spanning Tree
  – Scalable to 16 M services
• Provides OAM comparable to SONET/SDH
  – 802.1ag, end-to-end service OAM
    • Continuity Check Message, loopback, linktrace
  – 802.3ah, link OAM
    • Remote loopback, loopback control, remote failure indication
• Cost effective
  – E.g. NSP study indicates TCO ~43% lower for COE (PBB-TE) vs MPLS-TE
• 802.1Qay and the ITU-T G.8031 Ethernet Linear Protection standard provide 1+1 and 1:1 protection switching
  – Similar to SONET/SDH APS
  – Works by Y.1731 message exchange (ITU-T standard)

Routed Edge
• End sites (might) require Layer 3 connectivity in the LAN
  – Otherwise a true Layer 2 solution might be adequate
• Lightpaths terminate on a site’s router
  – Site’s border router, or, preferably,
  – Router closest to the storage elements
• All IP peerings are p2p, site-to-site
  – Reduces convergence time, avoids issues with flapping links
• Each site decides and negotiates which remote sites it peers with (e.g. based on the experiment’s connectivity design)
• Router (BGP) advertises only the SE subnet(s) through the configured Lightpath
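
The last point, advertising only the SE subnet(s) over the Lightpath, amounts to an outbound prefix filter. The sketch below illustrates that filtering logic with Python's standard `ipaddress` module. It is not router configuration, and the prefixes (taken from the documentation address ranges) and the function name `routes_to_advertise` are hypothetical.

```python
import ipaddress


def routes_to_advertise(local_prefixes, se_subnets):
    """Return the local prefixes that lie inside one of the site's SE subnets.

    Everything else (campus LAN, management networks, ...) is withheld from
    the peering that runs over the Lightpath.
    """
    se_nets = [ipaddress.ip_network(s) for s in se_subnets]
    allowed = []
    for prefix in local_prefixes:
        net = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        if any(net.version == se.version and net.subnet_of(se) for se in se_nets):
            allowed.append(prefix)
    return allowed


if __name__ == "__main__":
    # Hypothetical site: one campus prefix, two halves of a /24, one IPv6 prefix.
    local = ["192.0.2.0/24", "198.51.100.0/25", "198.51.100.128/25", "2001:db8:10::/48"]
    storage = ["198.51.100.0/25", "2001:db8:10::/48"]
    print(routes_to_advertise(local, storage))
```

Keeping the advertised set this narrow means only storage traffic is attracted onto the Lightpath, while all other traffic continues to use the site's general IP connectivity.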

Lightpath termination
• Avoid LAN connectivity issues when terminating the lightpath at the campus edge
• The lightpath should be terminated as close as possible to the Storage Elements, but this can be challenging if not impossible (support a dedicated border router?)
• Or, provide a “local lightpath” (e.g. a VLAN with proper bandwidth, or a dedicated link where possible); the border router does the “stitching”

IP backup
• Foresee IP routed paths as backup
  – End-site’s border router (BR) is configured for both default IP connectivity and direct peering through the Lightpath
  – Direct peering takes precedence
• Works also for dynamic Lightpaths
• For full dynamic Lightpath setup, dynamic end-site configuration through e.g. LambdaStation or TeraPaths will be used
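
The precedence rule can be pictured as a simple preference-based choice among the paths that are currently up. The sketch below is a toy model, not the behaviour of any particular router or of LambdaStation/TeraPaths. The `Path` class, the preference values and the path names are assumptions for illustration, loosely analogous to a BGP local preference.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Path:
    name: str
    preference: int  # higher value wins (assumed convention)
    up: bool         # whether the path is currently usable


def select_path(paths: Iterable[Path]) -> Optional[Path]:
    """Pick the usable path with the highest preference, or None if all are down."""
    usable = [p for p in paths if p.up]
    return max(usable, key=lambda p: p.preference) if usable else None


if __name__ == "__main__":
    lightpath = Path("direct peering via Lightpath", preference=200, up=True)
    routed_ip = Path("default routed IP", preference=100, up=True)

    # Normal operation: the direct peering takes precedence.
    print(select_path([lightpath, routed_ip]).name)

    # Lightpath torn down (or a dynamic circuit not yet set up): fall back to IP.
    lightpath_down = Path(lightpath.name, lightpath.preference, up=False)
    print(select_path([lightpath_down, routed_ip]).name)
```

The same selection covers dynamic Lightpaths: while no circuit exists the routed path carries the traffic, and the direct peering takes over once the circuit comes up.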

Resiliency
• Resiliency in the core is provided by protection switching, depending on the technology used between core nodes
  – SONET/SDH or OTN protection switching (Layer 1)
  – MPLS failover
  – PBB-TE protection switching
  – Ethernet LAG
• Sites can opt for additional resiliency (e.g. where protected trunk links are not available) by forming transit agreements with other sites
  – akin to the current LHC OPN use of CBF

Layer 1 through Layer 3

Scalability
• Assuming Layer 2 point-to-point operations, a natural scalability limitation is the 4k VLAN IDs
• This problem is naturally resolved in
  – PBB-TE (802.1Qay), through MAC-in-MAC encapsulation
    Frame layout: B-DA | B-SA | Ethertype 0x88A8 | B-VID | Ethertype 0x88E7 | I-SID | Customer Frame incl. Header+FCS | B-FCS
  – dynamic bandwidth allocation with re-use of VLAN IDs
    • The only constraint is that no two connections through the same network element use the same VLAN
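
Both numbers on this slide follow from field widths: a 12-bit VLAN ID gives the "4k" limit, and the 24-bit I-SID gives the "16 M services" quoted for PBB-TE. The sketch below states those sizes and checks the VLAN re-use constraint for a set of connections. The connection names, network element names and the function `vlan_conflicts` are hypothetical.

```python
from collections import defaultdict

VLAN_ID_SPACE = 2 ** 12  # 4096 values in the 12-bit VLAN ID field ("4k")
ISID_SPACE = 2 ** 24     # 16,777,216 values in the 24-bit I-SID field ("16 M")


def vlan_conflicts(connections):
    """Find network elements where two or more connections share a VLAN ID.

    `connections` maps a connection name to (vlan_id, [network elements on its path]).
    Returns a sorted list of (element, vlan_id, [conflicting connection names]).
    """
    users = defaultdict(list)
    for name, (vlan_id, elements) in connections.items():
        for element in elements:
            users[(element, vlan_id)].append(name)
    return sorted(
        (element, vlan_id, sorted(names))
        for (element, vlan_id), names in users.items()
        if len(names) > 1
    )


if __name__ == "__main__":
    print(VLAN_ID_SPACE, ISID_SPACE)
    connections = {
        "siteA-siteB": (100, ["core1", "core2"]),
        "siteC-siteD": (100, ["core3"]),           # re-uses VLAN 100 on a disjoint path: fine
        "siteE-siteF": (100, ["core2", "core4"]),  # shares core2 with siteA-siteB: conflict
    }
    print(vlan_conflicts(connections))
```

A provisioning system for dynamic circuits could run a check of this kind before assigning a VLAN ID, so that IDs are re-used freely on disjoint paths.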
