LHCONE status and future, Alice workshop, Tsukuba, 7th March 2014



slide-1
SLIDE 1

1

CERN IT Department CH-1211 Genève 23 Switzerland

www.cern.ch/it

LHCONE status and future

Alice workshop

Tsukuba, 7th March 2014 Edoardo.Martelli@cern.ch

slide-2
SLIDE 2

2

Summary

  • Networking for WLCG
  • LHCOPN
  • LHCONE
  • services
  • how to join
  • LHCONE in Asia
slide-3
SLIDE 3

3

Networking for WLCG

slide-4
SLIDE 4

4

Worldwide LHC Computing Grid WLCG sites:

  • 1 Tier0 (CERN)
  • 13 Tier1s
  • ~170 Tier2s
  • >300 Tier3s worldwide
slide-5
SLIDE 5

5

Planning for Run2

“The Network infrastructure is the most reliable service we have”

“Network Bandwidth (rather than disk) will need to scale more with users and data volume”

“Data placement will be driven by demand for analysis and not pre-placement”

Ian Bird, WLCG project leader

slide-6
SLIDE 6

6

Computing model evolution

[Figure: two diagrams, "Original MONARC model" and "Model evolution"]

slide-7
SLIDE 7

7

Technology Trends

  • Commodity Servers with 10G NICs
  • High-end Servers with 40G NICs
  • 40G and 100G interfaces on switches and routers

100Gbps backbones are needed to carry large data flows of >10Gbps and soon >40Gbps

slide-8
SLIDE 8

8

Role of Networks in WLCG

Computer networks are an even more essential component of WLCG

Data analysis in Run 2 will need more network bandwidth between any pair of sites

slide-9
SLIDE 9

9

LHCOPN

LHC Optical Private Network

slide-10
SLIDE 10

10

What LHCOPN is:

Private network connecting Tier0 and Tier1s

Reserved to LHC data transfers and analysis

Dedicated large bandwidth links

Highly resilient architecture

slide-11
SLIDE 11

11

A collaborative effort

Layer3: Designed, built and operated by the Tier0-Tier1s community

Layer1-2: Links provided by Research and Education network providers: Asnet, ASGCnet, CANARIE, DFN, ESnet, GARR, GÉANT, JANET, KREONET, NORDUnet, RedIRIS, RENATER, SURFnet, SWITCH, TWAREN, USLHCNet

slide-12
SLIDE 12

12

Topology

[Figure: LHCOPN topology map, edoardo.martelli@cern.ch, 20131113. Colour legend marks which experiments (Alice, Atlas, CMS, LHCb) each site serves. Sites shown: CH-CERN, TW-ASGC, CA-TRIUMF, US-T1-BNL, US-FNAL-CMS, FR-CCIN2P3, ES-PIC, IT-INFN-CNAF, DE-KIT, NL-T1, UK-T1-RAL, NDGF, RRC-K1-T1, KR_KISTI]

slide-13
SLIDE 13

13

Technology

  • Single and bundled long distance 10G Ethernet links
  • Multiple redundant paths. Star and Partial-Mesh topology
  • BGP routing: communities for traffic engineering, load balancing.
  • Security: only declared IP prefixes can exchange traffic.
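
The security rule above (only declared IP prefixes can exchange traffic) can be illustrated with a minimal Python sketch. It is not the actual LHCOPN filter configuration: the prefixes are documentation ranges chosen for illustration, and the names `DECLARED_PREFIXES`, `is_declared` and `may_exchange` are hypothetical.

```python
import ipaddress

# Hypothetical declared prefixes (documentation ranges, NOT real LHCOPN prefixes).
DECLARED_PREFIXES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("2001:db8:1::/48"),
]


def is_declared(address: str) -> bool:
    """Return True if the address falls inside one of the declared prefixes."""
    ip = ipaddress.ip_address(address)
    return any(
        ip.version == net.version and ip in net for net in DECLARED_PREFIXES
    )


def may_exchange(src: str, dst: str) -> bool:
    """A packet is allowed only if BOTH endpoints are in declared prefixes."""
    return is_declared(src) and is_declared(dst)


print(may_exchange("192.0.2.10", "198.51.100.7"))   # both endpoints declared
print(may_exchange("192.0.2.10", "203.0.113.5"))    # destination not declared
```

In a real deployment this check would typically live in router prefix filters and ACLs rather than in host software; the sketch only shows the logic.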

slide-14
SLIDE 14

14

LHCOPN future

  • The LHCOPN will be kept as the main network to exchange data among the Tier0 and Tier1s
  • Links to the Tier0 may soon be upgraded to multiple 10Gbps (waiting for Run2 to see the real needs)

slide-15
SLIDE 15

15

LHCONE

LHC Open Network Environment

slide-16
SLIDE 16

16

New computing model impact

  • Better and more dynamic use of storage
  • Reduced load on the Tier1s for data serving
  • Increased speed to populate analysis facilities

This requires a faster, predictable, pervasive network connecting Tier1s and Tier2s

slide-17
SLIDE 17

17

Requirements from the Experiments

  • Connecting any pair of sites, regardless of the continent they reside on
  • Site's bandwidth ranging from 1Gbps (Minimal), 10Gbps (Nominal), to 100Gbps (Leadership)
  • Scalability: sites are expected to grow
  • Flexibility: sites may join and leave at any time
  • Predictable cost: well defined cost, and not too high

slide-18
SLIDE 18

18

LHCONE concepts

  • Serving any LHC site according to its needs and allowing it to grow
  • Sharing the cost and use of expensive resources
  • A collaborative effort among Research & Education Network Providers
  • Traffic separation: no clash with other data transfers, resources allocated for and funded by the HEP community

slide-19
SLIDE 19

19

Governance

LHCONE is a community effort.

All stakeholders are involved: TierXs, Network Operators, LHC Experiments, CERN.

slide-20
SLIDE 20

20

LHCONE services

L3VPN (VRF): routed Virtual Private Network - operational

P2P: dedicated, bandwidth guaranteed, point-to-point links - development

perfSONAR: monitoring infrastructure

slide-21
SLIDE 21

21

LHCONE L3VPN

slide-22
SLIDE 22

22

What LHCONE L3VPN is:

Layer3 (routed) Virtual Private Network

Dedicated worldwide backbone connecting Tier1s, Tier2s and Tier3s at high bandwidth

Reserved to LHC data transfers and analysis

slide-23
SLIDE 23

23

Advantages

Bandwidth dedicated to LHC data analysis, no contention with other research projects

Well defined cost tag for WLCG networking

Trusted traffic that can bypass firewalls

slide-24
SLIDE 24

24

LHCONE L3VPN architecture

  • TierX sites connected to National-VRFs or Continental-VRFs
  • National-VRFs interconnected via Continental-VRFs
  • Continental-VRFs interconnected by trans-continental/trans-oceanic links

Acronyms: VRF = Virtual Routing and Forwarding (virtual routing instance)

[Figure: LHCONE hierarchy. TierXs attach to National VRFs over national links, National VRFs attach to Continental VRFs over cross-border links, and Continental VRFs are interconnected by transcontinental links.]

slide-25
SLIDE 25

25

Current L3VPN topology

[Figure: LHCONE L3VPN topology map, April 2012. Credits: Joe Metzger, ESnet. See http://lhcone.net for details.]

Legend: LHCONE VRF domains; end sites (LHC Tier 2 or Tier 3 unless indicated as Tier 1); regional R&E communication nexus; data communication links of 10, 20, and 30 Gb/s.

VRF domains shown: ESnet (USA), Internet2 (USA), CANARIE (Canada), CUDI (Mexico), TWAREN (Taiwan), ASGC (Taiwan), KREONET2 (Korea), KISTI (Korea), TIFR (India), NORDUnet (Nordic), DFN (Germany), GARR (Italy), RedIRIS (Spain), SARA (Netherlands), RENATER (France), CERN (Geneva), GÉANT (Europe)

Tier1 sites shown: BNL-T1, FNAL-T1, TRIUMF-T1, ASGC-T1, NDGF-T1a, NDGF-T1c, DE-KIT-T1, CNAF-T1, PIC-T1, NIKHEF-T1, CC-IN2P3-T1, CERN-T1

Other end sites shown: Harvard, MIT, Caltech, UFlorida, UNeb, PurU, UCSD, UWisc, UMich, UltraLight, SLAC, GLakes, NE, MidW, SoW, UVic, SimFraU, UAlb, UTor, McGilU, UNAM, NCU, NTU, KNU, DESY, GSI, INFN-Nap, GRIF-IN2P3, Sub-IN2P3, CEA

Exchange locations shown: Chicago, New York, Seattle, Washington, Amsterdam, Geneva

slide-26
SLIDE 26

26

Status

Over 15 national and international Research Networks

Several Open Exchange Points including NetherLight, StarLight, MANLAN, CERNlight and others

Trans-Atlantic connectivity provided by ACE, GÉANT, NORDUnet and USLHCNet

~50 end sites connected to LHCONE:

  • 8 Tier1s
  • 40 Tier2s

Credits: Mian Usman, Dante

More Information: https://indico.cern.ch/event/269840/contribution/4/material/slides/0.ppt

slide-27
SLIDE 27

27

Operations

Usual Service Provider operational model: a TierX must refer to the VRF providing the local connectivity

Bi-weekly call among all the VRF operators and concerned TierXs

slide-28
SLIDE 28

28

How to join the L3VPN

slide-29
SLIDE 29

29

Pre-requisites

The TierX site needs to have:

  • Public IP addresses
  • A public Autonomous System (AS) number
  • A BGP capable router
slide-30
SLIDE 30

30

How to connect

The TierX has to:

  • Contact the Network Provider that runs the closest LHCONE VRF
  • Agree on the cost of the access
  • Lease a link from the TierX site to the closest LHCONE VRF PoP (Point of Presence)
  • Configure the BGP peering with the Network Provider

slide-31
SLIDE 31

31

TierX routing setup

  • The TierX announces only the IP subnets used for WLCG servers
  • The TierX accepts all the prefixes announced by the LHCONE VRF
  • The TierX must ensure traffic symmetry: it injects only packets sourced from the announced subnets
  • LHCONE traffic may be allowed to bypass the central firewall (up to the TierX to decide)

slide-32
SLIDE 32

32

Symmetric traffic is essential

Beware: stateful firewalls discard unidirectional TCP connections

[Figure: routing between CERN and a TierX. Elements: CERN LCG backbone, CERN Campus backbone, border network, Internet, LHCONE, LCG hosts, campus hosts, stateless ACLs, and a stateful firewall that drops asymmetric TCP flows. Route labels: default, all destinations, all CERN's destinations, CERN LCG destinations, TierX LCG destinations. Flows shown: LHCONE host to LHCONE host; CERN's LHCONE host to TierX non-LHCONE host; CERN's non-LHCONE host to TierX's LHCONE host.]

slide-33
SLIDE 33

33

Symmetry setup

To achieve symmetry, a TierX can use one of the following techniques:

  • Policy Based Routing (source-destination routing)
  • Physically separated networks
  • Virtually separated networks (VRF)
  • Science DMZ
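
The first technique, source-destination routing, can be sketched in a few lines of Python. This is only an illustration of the decision a policy-based routing rule makes, not a router configuration: the prefixes are documentation ranges, and `ANNOUNCED_SUBNETS`, `LHCONE_PREFIXES` and `next_hop` are hypothetical names.

```python
import ipaddress

# Hypothetical example values (documentation ranges, not real site prefixes).
ANNOUNCED_SUBNETS = [ipaddress.ip_network("192.0.2.0/25")]   # announced to LHCONE
LHCONE_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]  # learned from the VRF


def in_any(address: str, networks) -> bool:
    """True if the address belongs to at least one of the networks."""
    ip = ipaddress.ip_address(address)
    return any(ip.version == net.version and ip in net for net in networks)


def next_hop(src: str, dst: str) -> str:
    """Send a packet via LHCONE only if the source is in an announced subnet
    AND the destination is an LHCONE prefix; everything else takes the
    default (general Internet) path. Applying the same rule at both ends
    keeps the forward and return paths of a flow on the same network."""
    if in_any(src, ANNOUNCED_SUBNETS) and in_any(dst, LHCONE_PREFIXES):
        return "lhcone"
    return "default"


print(next_hop("192.0.2.10", "198.51.100.20"))   # WLCG server to LHCONE site
print(next_hop("192.0.2.200", "198.51.100.20"))  # campus host, not announced
```

Matching on destination alone would send campus hosts over LHCONE while their return traffic arrives through the Internet path, which is the asymmetry a stateful firewall drops.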
slide-34
SLIDE 34

34

Science DMZ

http://fasterdata.es.net/science-dmz/science-dmz-architecture/

slide-35
SLIDE 35

35

LHCONE P2P

Guaranteed bandwidth point-to-point links

slide-36
SLIDE 36

36

What LHCONE P2P is (will be):

On demand point-to-point (P2P) link system over a multi-domain network

Provides P2P links between any pair of TierX

Provides dedicated P2P links with guaranteed bandwidth (protected from any other traffic)

Accessible and configurable via software API

slide-37
SLIDE 37

37

Status

Work in progress: still in design phase

Challenges:

  • multi-domain provisioning system
  • intra-TierX connectivity
  • TierX-TierY routing
  • interfaces with WLCG software
slide-38
SLIDE 38

38

LHCONE perfSONAR

slide-39
SLIDE 39

39

What LHCONE perfSONAR is

LHCONE Network monitoring infrastructure

Probes installed at the VRF interconnection points and at the TierXs

Accessible to any TierX for network health checks

slide-40
SLIDE 40

40

perfSONAR

  • framework for active and passive network probing
  • developed by Internet2, ESnet, GÉANT and others
  • two interoperable flavors: perfSONAR-PS and perfSONAR-MDM
  • WLCG recommended version: perfSONAR-PS
slide-41
SLIDE 41

41

Status

Endorsed by WLCG as a standard WLCG service

Probes already deployed in many TierXs. Being deployed in the VRF networks

Full information: https://twiki.cern.ch/twiki/bin/view/LCG/PerfsonarDeployment

slide-42
SLIDE 42

42

LHCONE-L3VPN in Asia

slide-43
SLIDE 43

43

Connectivity status

Only a few sites in Asia are connected to LHCONE-L3VPN, via ASGC or with a direct link to the US or Europe

Connectivity between Asia and North America is not scarce, but transit to Europe may not be adequate

Un-coordinated effort

slide-44
SLIDE 44

45

Working together

ASGC is willing to share the use of their links to North America and Europe with other Asian TierXs

Anyone interested in connecting to the Asian LHCONE or sharing their trans-continental links, please get in touch with us

slide-45
SLIDE 45

46

Anyway: You have to tune!

TCP Throughput <= TCPWinSize / RTT

  • Tokyo-CERN RTT (Round Trip Time): 280 ms
  • Default Max TCPWinSize for Linux = 256KBytes (= 2.048Mbit)
  • Tokyo-CERN throughput <= 2.048Mbit / 0.280sec = 7.31Mbps :-(

Remote TierXs must tune server and client TCP Kernel parameters to get decent throughput!
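
The arithmetic above can be reproduced, and inverted to estimate the window a given target rate needs, with a short Python sketch. It assumes the slide's convention of 1 KByte = 1000 bytes; the function names and the 1 and 10 Gbps targets are illustrative choices, not part of the presentation.

```python
def max_tcp_throughput_bps(window_bytes: float, rtt_seconds: float) -> float:
    """Upper bound on single-stream TCP throughput: window / RTT, in bit/s."""
    return window_bytes * 8 / rtt_seconds


def required_window_bytes(target_bps: float, rtt_seconds: float) -> float:
    """Window (bandwidth-delay product) needed to sustain target_bps, in bytes."""
    return target_bps * rtt_seconds / 8


RTT_TOKYO_CERN = 0.280     # seconds, from the slide
DEFAULT_WINDOW = 256_000   # bytes: the slide's 256KBytes with K = 1000

bound = max_tcp_throughput_bps(DEFAULT_WINDOW, RTT_TOKYO_CERN)
print(f"Default window bound: {bound / 1e6:.2f} Mbps")

for target_gbps in (1, 10):  # illustrative targets, not from the slide
    window = required_window_bytes(target_gbps * 1e9, RTT_TOKYO_CERN)
    print(f"{target_gbps} Gbps needs a window of about {window / 1e6:.0f} MB")
```

At a 280 ms RTT the required window grows to tens or hundreds of megabytes, which is why the kernel's socket buffer limits (on Linux, sysctl settings such as `net.core.rmem_max` and `net.ipv4.tcp_rmem`) have to be raised on both server and client.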

slide-46
SLIDE 46

47

LHCONE evolution

slide-47
SLIDE 47

48

LHCONE evolution

  • VRFs have started upgrading internal links and links to TierXs to 100Gbps
  • Links interconnecting the VRFs will be upgraded to 100Gbps. A 100Gbps Transatlantic link is being tested.
  • Operations need to be improved, especially how to support a TierX in case of performance issues
  • perfSONAR deployment will be boosted
slide-48
SLIDE 48

49

LHCONE evolution

  • LHCONE-P2P take off still uncertain
  • LHCONE-L3VPN must be better developed in Asia

slide-49
SLIDE 49

50

Conclusions

slide-50
SLIDE 50

51

Conclusions

  • New Computing Models will rely even more on good and abundant network connectivity
  • TierXs need to improve their network connectivity
  • LHCONE-L3VPN is a viable solution already adopted by many Tier1/2s

slide-51
SLIDE 51

52

More information

Last LHCONE workshop: https://indico.cern.ch/event/289679/

LHCONE websites: http://lhcone.net and https://twiki.cern.ch/twiki/bin/view/LHCONE/WebHome

Weekly audio conference: Monday 14:30 GMT, alternating each week between architecture and operations

Mailing lists: lhcone-operations@cern.ch and lhcone-architecture@cern.ch

slide-52
SLIDE 52

53

Questions?