1
Trends in Wide Area Networking
Phil DeMar (remotely)
ISGC Symposium, Wednesday, April 16, 2008
2
Hybrid networks & end-to-end data circuits
- What are hybrid networks?
- No single, concise definition
- My definition: Networks that mix underlying network technologies,
protocols, and/or administrative boundaries to support network connections in a manner other than classic routed IP
- What is an end-to-end (E2E) circuit?
- Again, no single, concise definition
- My definition: a data circuit between two sites that appears at the IP
level to be a direct connection
- E2E circuits are established over hybrid networks
- Usually crossing multiple network boundaries & administrative domains
3
Why end-to-end circuits?
- Convergence of need, capability, & strategic direction
- And sometimes just because our stakeholders ask for them…
- Need:
- Emerging high impact data
movement needs for LHC
- Traffic projections call for rapid
increase in traffic levels
- Predictable network performance
requirements:
- Distributed experiment data storage
- Distributed analysis model
- Isolation of high impact traffic from general internet traffic
- Interactive applications don’t mingle well with bulk data transfers
4
Why end-to-end circuits? - Capability
- Emerging Research & Education (R&E) optical network infrastructure
- Offers flexible & economic growth options
- In US, Internet2 and ESnet backbones based on leased fiber and dense
wavelength division multiplexing (DWDM) equipment
- Additional data channels just require adding DWDM transponders
- Facilitates wider & more flexible service offerings, such as E2E circuits
- Similar evolution of R&E networks occurring across the globe
- Site connectivity migrating along similar paths, where feasible
- Fermilab leased dark fiber to nearby StarLight international optical
network exchange in 2004
- Initial configuration: 1x10GE & 2x1GE channels
- Today, it’s a cornerstone of a metropolitan area network (MAN)
- Fermilab offsite network capacity = 8 x 10GE, with redundant fiber paths
- Fiber connectivity at StarLight offers wide scope of network connection options
5
Why end-to-end circuits? - Strategic Direction
- ESnet strategic direction:
- High bandwidth, scalable, reliable
production IP network service
- Very high-bandwidth network for
large scale science data flows
- ESnet Science Data Network
- MANs for National Labs local access
- The future:
- Routed IP service needs met by 10Gb/s
- SDN service grows to 20-60Gb/s
- MAN bandwidth follows SDN
6
Fermilab End-to-End Circuits
- Deployed starting in 2004
- Serve a wide spectrum of
experiments
- CMS Tier-2s = heavy users
- Implemented on multiple
technologies
- But based on end-to-end
layer-2 paths
- Usefulness has varied
- WestGrid & Apache Pt circuits decommissioned
- IP routed path performance was perfectly acceptable
7
Local Topology for Circuit Support
- Four 10GE channels for offsite traffic:
- One reserved for routed IP service
- One supports LHC traffic with CERN
- Two support E2E circuits to CMS sites
- Additional infrastructure for legacy 1GE
connections
- Circuits based on end-to-end vLANs
- Direct peering with remote site
- Multiple provider domains are the norm
- Technology varies by domains involved
- Complexity higher than IP service
8
Rerouting E2E circuit Traffic
- Identify high impact traffic flows:
- Based on source/dest. netblock pairs
- Deploy alternate path border router for
E2E circuits
- Peer directly across circuits, advertising only
source netblock
- Implement alternate forwarding:
- Policy route outbound on source/dest pairs
- Coordinate a similar policy at the remote end
- Implement inbound policy routing on
alternate path border router
- Unexpected traffic rerouted to general IP path
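The forwarding decision on this slide can be illustrated with a short sketch. It only models the logic (match on source/destination netblock pairs, send everything else over the general IP path) and is not router configuration. The netblocks are documentation prefixes chosen for illustration, and the names `CIRCUIT_PAIRS` and `select_path` are hypothetical.

```python
import ipaddress

# Hypothetical (source netblock, destination netblock) pairs allowed onto an
# E2E circuit. These are documentation prefixes, not real site addresses.
CIRCUIT_PAIRS = [
    (ipaddress.ip_network("192.0.2.0/24"), ipaddress.ip_network("198.51.100.0/24")),
]


def select_path(src: str, dst: str) -> str:
    """Return "circuit" if the flow matches a configured pair, else "routed-ip"."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for src_net, dst_net in CIRCUIT_PAIRS:
        if src_ip in src_net and dst_ip in dst_net:
            return "circuit"
    # Unexpected traffic falls back to the general routed IP path.
    return "routed-ip"
```

In this model the remote end would hold the mirrored pairs, which is why the policy has to be coordinated on both sides.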
9
Usefulness of E2E Circuits
- Traffic characterization shows
FNAL E2E circuit traffic dominates
- Particularly during peak traffic
periods
- July 2007 peak in excess of 2PB
- Equivalent to sustained 7Gb/s for
the month
- A few sites capable of 7-8Gb/s
- Ratio of circuit-based traffic to routed traffic also a reflection of
performance capability
- US Tier-2s (circuit-based) currently sustaining 2-3 Gb/s and higher
- European & Asian Tier-2s (routed) typically sustain sub-gigabit rates
- Performance tuning at longer round trip times still a bit of an art…
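The relation between the monthly volume and the sustained rate quoted above can be checked with a few lines of arithmetic. This sketch assumes decimal petabytes (10^15 bytes) and a 31-day July; the function names are illustrative only.

```python
SECONDS_PER_DAY = 86_400


def sustained_gbps(volume_bytes: float, days: float) -> float:
    """Average rate in Gb/s needed to move volume_bytes in the given number of days."""
    return volume_bytes * 8 / (days * SECONDS_PER_DAY) / 1e9


def volume_pb(rate_gbps: float, days: float) -> float:
    """Volume in PB moved at a sustained rate_gbps over the given number of days."""
    return rate_gbps * 1e9 * days * SECONDS_PER_DAY / 8 / 1e15


print(round(sustained_gbps(2e15, 31), 2))  # 5.97
print(round(volume_pb(7, 31), 2))          # 2.34
```

Under these assumptions exactly 2 PB corresponds to about 6 Gb/s, and a sustained 7 Gb/s corresponds to about 2.3 PB, which is consistent with a peak "in excess of 2PB".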
10
Types of End-to-End Circuits
- Static topology, dedicated link
- Most of our E2E circuits today
- Scalability & cost concerns
- Static topology, dynamically scheduled
- Scheduling policy issues hinder use
- Static topology, dynamic provisioning
- Emerging services becoming increasingly available
- Dynamic topology, dynamic provisioning
- Nirvana, but a long way from being feasible
[Figure labels: "Likely evolution of E2E circuit technology"; "We are somewhere in here"]
11
A Static Hybrid Network – The LHCOPN
- LHCOPN = LHC Optical Private
Network
- A dedicated 10Gb/s network for
LHC T0/T1 data movement
- Based on layer-2 technologies
- Routing only at the end points
- Pass-thru routing at CERN
facilitates Tier-1/Tier-1 transfers
- Redundancy can be difficult/costly
- Each LHCOPN circuit is a
concatenation of layer 2 links:
- Involves multiple
administrative domains
12
Emerging Dynamic Circuit Services
- Internet2 & ESnet now offering integrated dynamic circuit services
- Integrated control plane via functional Inter-Domain Controller (IDC)
- Independent, local domain controllers (DCs) manage within domains
- Dynamic circuit provisioning makes
network paths potentially a service
- Path setup & selection servers:
- Fermilab’s Lambda Station
- Brookhaven’s TeraPaths
- Dynamic circuit trials underway:
- FNAL/Univ. of Nebraska
- BNL/Boston Univ.
13
FNAL / Nebraska Dynamic Circuit
[Figure: a path across the DCN between Fermilab and the University of Nebraska. Labels: Lambda Station Server (two instances), network infrastructure, routed R&E network (default network path), Internet2 DCS, ESnet OSCARS, IDC/ESnet, IDC/Internet2, control plane, flow data, flow analysis]
- FtWatch request for circuit
- Inter-Lambda Station coordination
- Circuit call setup & teardown
- LAN reconfiguration to use circuit
14
Large-scale data recovery via DCN
- Shortly after deploying their dynamic circuit, Nebraska lost their Tier-2 data cache
- 50TB of data recovered by data transfer from FNAL Tier-1
- Largely via Internet2/ESnet dynamic circuit
- Completed in 32 hours
- Graceful cutover to & fall back from the dynamic circuit
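As a rough check on what this recovery implies, the average transfer rate can be worked out directly. This assumes decimal terabytes (10^12 bytes) and that the full 32 hours were spent transferring:

```latex
\[
\frac{50 \times 10^{12}\ \text{bytes} \times 8\ \text{bits/byte}}
     {32\ \text{h} \times 3600\ \text{s/h}}
= \frac{4 \times 10^{14}\ \text{bits}}{115\,200\ \text{s}}
\approx 3.5\ \text{Gb/s}
\]
```

Because the transfer ran only "largely" over the dynamic circuit, this figure is an average across all paths used, not a measurement of the circuit alone.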
15
Operational issues with circuits
- E2E circuit failure modes are different than for IP service
- They are more complex
- Impact of the failure may be severely felt elsewhere
- Operational failures can be “creative” and difficult to troubleshoot
- Monitoring infrastructure is critical, but requires new effort
- Asymmetric paths will occur and will be difficult to detect
- We’re working on flow data analysis to detect this
- Unexpected consequences of changes
- UNL moves several T2 systems to a new subnet
[Figure labels: ESnet IP path; UNL CMS traffic]
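One way flow data analysis might expose an asymmetric path is sketched below. This is not the analysis referred to on the slide, only a minimal illustration under assumptions: the record format, the site labels, and the sample data are all made up, and real flow records would first have to be aggregated into this form.

```python
from collections import defaultdict

# Hypothetical, simplified flow records:
# (source site, destination site, path the flow was observed on).
flows = [
    ("FNAL-T1", "UNL-T2", "circuit"),
    ("UNL-T2", "FNAL-T1", "routed-ip"),  # return traffic on a different path
    ("FNAL-T1", "SITE-B", "circuit"),
    ("SITE-B", "FNAL-T1", "circuit"),
]


def find_asymmetric_pairs(records):
    """Return sorted site pairs whose two directions were seen on different paths."""
    paths = defaultdict(set)
    for src, dst, path in records:
        paths[(src, dst)].add(path)
    asymmetric = set()
    for (src, dst), forward in paths.items():
        reverse = paths.get((dst, src))  # .get() avoids creating new keys
        if reverse and forward != reverse:
            asymmetric.add(tuple(sorted((src, dst))))
    return sorted(asymmetric)


print(find_asymmetric_pairs(flows))  # [('FNAL-T1', 'UNL-T2')]
```

A pair is only reported when both directions have been observed, so one-way flows do not raise false alarms in this sketch.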
16
PerfSonar for Monitoring Capabilities
- Motivation for PerfSonar
- Need for integrated path monitoring capability across multiple
administrative domains
- Not specifically for E2E circuits, but definitely a stronger need
- What is PerfSonar?
- A global collaboration
- An architecture and a set of protocols
- Several interoperable software implementations
- Multi-Domain Monitoring (MDM) utility {Dante}
- PerfSonar-PS {US-based collaboration headed by Internet2}
- A measurement infrastructure
17
PerfSonar Architecture
MP = measurement point
18
E2E Link Monitoring (CERN-LHCOPN-FNAL-001)
- LHCOPN link status monitored by E2ECU (Dante service)
- PerfSonar monitoring within each administrative domain
- Independently-managed Measurement Points (MPs)
- Centralized collection (E2ECU) of circuit status from each domain
- Location of faults identified & displayed
- Appropriate alarms & notifications sent out
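The centralized collection step can be pictured as combining per-domain segment states into one end-to-end state. The sketch below is a simplified model under assumptions, not the E2ECU implementation: the function name, the "up"/"down" states, and the domain names are hypothetical.

```python
def e2e_status(segments):
    """Combine per-domain segment states into an overall circuit state.

    segments: ordered list of (domain, state) tuples, state being "up" or "down".
    Returns (overall_state, list_of_faulty_domains).
    """
    faults = [domain for domain, state in segments if state != "up"]
    return ("up" if not faults else "down", faults)


# Hypothetical per-domain reports for one circuit.
reports = [("domain-A", "up"), ("domain-B", "down"), ("domain-C", "up")]
print(e2e_status(reports))  # ('down', ['domain-B'])
```

The circuit is treated as down if any one segment is down, and the list of faulty domains gives the fault location that would be displayed.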
19
PerfSonar Active Performance Monitoring
- High Level (near term) Objectives:
- Detect & react to changes in the underlying network
- Quickly identify network problems:
- Even if not noticeably impacting applications
- Distinguish between application level & network level problems
- Types of active measurements:
- One-way delay (OWAMP & Hades)
- Round trip time (Pinger)
- Bandwidth measurement (BWCTL)
- Path monitoring (Traceroute)
- Less interesting within E2E circuit environment
- Pending implementations:
- LHCOPN to be monitored by centrally-managed MDM appliance {GEANT}
- (Some) LHC Tier-1/Tier-2 paths with PerfSonar-PS
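As an illustration of detecting a change in the underlying network from active measurements, the sketch below compares recent round trip times against a baseline. It does not use any PerfSonar API; the function name, the 20% tolerance, and the sample values are assumptions made for the example.

```python
from statistics import median


def rtt_changed(baseline_ms, recent_ms, tolerance=0.2):
    """True if the recent median RTT deviates from the baseline median
    by more than the given fraction of the baseline."""
    base = median(baseline_ms)
    return abs(median(recent_ms) - base) > tolerance * base


baseline = [110.2, 109.8, 110.5, 110.1, 111.0]  # made-up RTT samples in ms
after_reroute = [148.9, 150.2, 149.5]           # made-up samples after a path change
print(rtt_changed(baseline, after_reroute))     # True
print(rtt_changed(baseline, [110.4, 109.9]))    # False
```

Using medians keeps a single outlier from triggering an alert, which suits the goal of flagging path changes even when applications are not yet noticeably affected.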
20
Winding It Up…
- End-to-end circuits have proven to be useful at FNAL
- At least for LHC/CMS high impact data movement
- There is certainly additional management & support cost involved
- Complexity is an obvious concern
- Scalability too…
- Additional monitoring infrastructure required
- I suspect that we will see a natural selection process play out
- What works & is worth the effort will remain and grow
- What doesn’t prove to be worth the effort will disappear
- Dynamic E2E circuit service presents an increasingly attractive option
for large scale data movement
- PerfSonar emerging as inter-domain network path monitoring tool
- May present opportunities to applications in the future