Network activity in EGEE-III SA2, Xavier Jeannin (CNRS/UREC)

SLIDE 1

EGEE-III INFSO-RI-222667

Enabling Grids for E-sciencE

www.eu-egee.org

EGEE and gLite are registered trademarks

Network activity in EGEE-III SA2

Xavier Jeannin (CNRS/UREC) SA2 Activity Manager 7th NRENs and Grids Workshop (Dublin) 1/2 September 2008

SLIDE 2

Agenda

  • EGEE size and statistics
  • SA2 Network activity

– Technical Network Liaison Committee (TNLC)
– EGEE Network Operations Centre (ENOC)
– EGEE-III projects

    LHCOPN support / operational model
    Trouble matching and correlation
    Tools for troubleshooting
    Grid site networking needs
    Advanced network services
    IPv6
    Trouble ticket standardization

  • European Grid Initiative, National Grid Initiative

– Lessons learnt from EGEE
– Network activity in EGI/NGI

  • Conclusion

SLIDE 3

EGEE: the largest multi-disciplinary research Grid infrastructure in the world

[Figure: two charts covering April 2004 to April 2008. Left: number of sites (axis 50 to 300). Right: number of cores (axis 20,000 to 80,000).]

SLIDE 4

With the courtesy of Bob Jones

Users and resources distribution

Feb’08

SLIDE 5

With the courtesy of Erwin Laure

Highlights of EGEE-II - Applications

  • >270 VOs from several scientific domains

– Astronomy & Astrophysics
– Civil Protection
– Computational Chemistry
– Comp. Fluid Dynamics
– Computer Science/Tools
– Condensed Matter Physics
– Earth Sciences
– Fusion
– High Energy Physics
– Life Sciences

  • Further applications under evaluation

Applications are moving from testing to routine and daily usage

SLIDE 6

[Figure: effort distribution by activity: NA1 2%, NA2 5%, NA3 8%, NA4 19%, NA5 1%, SA1 49%, SA2 2%, SA3 9%, JRA1 5%]

SA2 in EGEE-III

  • Total of 375 FTEs in EGEE-III

– 9010 person-months (vs. 11165 PMs in EGEE-II; ~20% less)
– Grand total combining funded and unfunded contributions

No difference for execution of program of work!

  • Network activity SA2 = 14 persons + TNLC, 159 PMs
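
The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above, plus one assumption that is not on the slide: a project duration of 24 months, used to convert person-months into FTEs.

```python
# Figures quoted on the slide.
EGEE_II_PM = 11165   # person-months in EGEE-II
EGEE_III_PM = 9010   # person-months in EGEE-III
SA2_PM = 159         # person-months allocated to SA2

# Assumption (not stated on the slide): the project runs for 24 months.
PROJECT_MONTHS = 24

reduction = (EGEE_II_PM - EGEE_III_PM) / EGEE_II_PM   # relative effort reduction
fte = EGEE_III_PM / PROJECT_MONTHS                    # average full-time equivalents
sa2_share = SA2_PM / EGEE_III_PM                      # SA2 fraction of total effort

print(f"Effort reduction vs. EGEE-II: {reduction:.1%}")
print(f"Average FTEs over {PROJECT_MONTHS} months: {fte:.0f}")
print(f"SA2 share of total effort: {sa2_share:.1%}")
```

Under that assumption the results are consistent with the slide: roughly 19% less effort than EGEE-II, about 375 FTEs, and an SA2 share just under 2%.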

SLIDE 7

SA2 – EGEE-III

SA2 Global view

[Diagram: SA2 tasks and responsible partners]

– Overall networking coordination
– ENOC running
– Support for the ENOC
– Operational procedures (CNRS)
– LCG support (CNRS)
– Operational tools and maintenance (RRC-KI, CNRS)
– TNLC
– TT exchange standard (GRNET)
– Advanced network services (GRNET)
– IPv6 (GARR, CNRS)
– Monitoring (DFN)
– Site networking needs (RedIRIS)
– Troubleshooting (DFN)

SLIDE 8

Technical Network Liaison Committee

  • Technical Network Liaison Committee – TNLC

– Facilitate cooperation between EGEE on the one hand and GÉANT2 and the NRENs on the other
– Members: CERN; CNRS, France; DANTE, UK (the GÉANT2 operator); RRC KI, Russia; DFN-Verein, Germany; GARR, Italy; GRNET, Greece; RedIRIS, Spain...

  • Main themes

– Monitoring (E2ECU, monitoring LHCOPN/EGI)
– Standardization of network trouble tickets (assessment of the impact of a trouble ticket on the grid)
– Advanced network services (AMPS/SLA, new advanced network services)

SLIDE 9

June 2008

EGEE’08 conference

  • NRENs are invited to take part in the TNLC

SLIDE 10

Role of the ENOC

  • ENOC ensuring E2E connectivity for Grid sites
  • Assess the impact on the Grid of network trouble
  • Troubleshoot problems

– Provide support to users
– Identify the faulty domain

  • Assess the network connectivity of the Grid sites

[Diagram: end-to-end path Grid site 1, RC 1, NREN A, GÉANT2, NREN B, RC 2, Grid site 2. Each domain is operated by its own NOC: RC1 and RC2 by their NOCs, NREN A and NREN B by their NOCs, GÉANT2 by DANTE. The ENOC ensures E2E connectivity for Grid sites on the whole path.]

SLIDE 11

The ENOC

– A single point of contact between EGEE and the NRENs where EGEE and the network can exchange operational information
– A network support unit in GGUS (the trouble ticket system of EGEE)

[Diagram: the ENOC as the interface between the network side (GÉANT2, NRENs) and the EGEE side (GGUS, users, support units, sites)]

  • Interface with network providers:

– Collect tickets from NRENs
– Assess impact on the grid infrastructure
– Forward to GGUS the tickets that seem relevant

  • Interface with the EGEE user support:

– Receive tickets assigned to the ENOC by the GGUS 1st level support
– Troubleshoot them, provided that the ENOC has access to suitable monitoring tools
– Contact the identified faulty domains, or reassign the ticket to the associated site if it is a local network issue
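
The provider-side workflow above (collect, assess impact, forward) can be illustrated with a minimal sketch. Everything named here is hypothetical: the `NrenTicket` fields, the `SITES_BY_NETWORK` table and the identifiers are made up for illustration and do not describe the actual ENOC software or the GGUS interface.

```python
from dataclasses import dataclass

@dataclass
class NrenTicket:
    ticket_id: str
    nren: str
    affected_networks: set  # hypothetical: network/link identifiers named in the ticket

# Hypothetical topology knowledge: which Grid sites sit behind which network.
SITES_BY_NETWORK = {
    "net-a": {"site-1", "site-2"},
    "net-b": {"site-3"},
}

def impacted_sites(ticket, sites_by_network):
    """Return the Grid sites reachable through any network named in the ticket."""
    sites = set()
    for net in ticket.affected_networks:
        sites |= sites_by_network.get(net, set())
    return sites

def triage(tickets, sites_by_network):
    """Split tickets into those worth forwarding to GGUS and those with no Grid impact."""
    forward, ignore = [], []
    for t in tickets:
        sites = impacted_sites(t, sites_by_network)
        (forward if sites else ignore).append((t.ticket_id, sorted(sites)))
    return forward, ignore

tickets = [
    NrenTicket("T-100", "NREN A", {"net-a"}),
    NrenTicket("T-101", "NREN B", {"net-z"}),  # touches no known Grid site
]
forward, ignore = triage(tickets, SITES_BY_NETWORK)
print("forward to GGUS:", forward)
print("no Grid impact:", ignore)
```

The point of the sketch is that impact assessment needs a mapping from network elements to Grid sites; without it, every NREN ticket would have to be forwarded or read by hand.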

SLIDE 12

Assess the network connectivity of the Grid sites

  • Specific tools developed: Downcollector, see https://ccenoc.in2p3.fr/

[Figure: number of connectivity troubles detected per month on EGEE Grid certified sites, August 2007 to March 2008 (axis 100 to 1000), sorted per supposed location: WAN/MAN; LAN / non-network (power…); unknown. Also shown: number of sites with at least one network trouble. Total: 282 certified Grid sites.]

SLIDE 13

Support of LHCOPN

http://ccenoc.in2p3.fr/ASPDrawer/

The LHC Optical Private Network

15 PB of data per year generated by the LHC
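
To put the 15 PB per year into network terms, the volume can be converted into an average sustained rate. This is a rough sketch that assumes decimal petabytes and a perfectly uniform flow over the year; real transfers are bursty and replicated to several sites, so actual link requirements are higher.

```python
PB = 10**15                          # decimal petabyte, in bytes (assumption)
bytes_per_year = 15 * PB             # figure quoted on the slide
seconds_per_year = 365 * 24 * 3600

# Average rate if the data were shipped once, evenly spread over the year.
avg_gbit_s = bytes_per_year * 8 / seconds_per_year / 1e9
print(f"Average rate: {avg_gbit_s:.1f} Gbit/s")
```

This gives an average of about 3.8 Gbit/s for a single copy of the data.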

SLIDE 14

Support of LHCOPN

  • SA2 objectives in LHCOPN context are:

– Define the operational model

    Define accurately the responsibilities of each actor
    Ensure that problem resolution is not delayed by an unsuitable operational model
    Ensure the LHCOPN is well monitored

– Set up communication channels between this network and the EGEE Grid (scheduled downtimes, incidents, etc.)

  • LHCOPN operational model:

– Federative model, with responsibility shared by the Tier 1s and the Tier 0
– Approach: define the actors and their relationships, where to find the information, and the procedure

    Every actor agrees on the operational model and is aware of their role and of the procedure they should apply

– Draft: operational model wiki

SLIDE 15

LHCOPN Operational model

SLIDE 16

LHCOPN Operational model

SLIDE 17

Trouble matching and correlation (RRC-KI)

  • Trouble matching and correlation for the ENOC

– Starting from a discovered incident, find the related network trouble ticket
– Better trouble localisation
– Different methods will be tested

  • First method

– Another monitoring tool (SmokePing) has been set up, located in Russia
– The results of this tool and those from the ENOC (Downcollector, Lyon) are matched up
– The two tools are located in two different places in order to improve the knowledge of the network topology
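
One way to picture the first method is to match outages seen from the two vantage points by site and time. The sketch below is an illustration only: the tuple format, the 5-minute tolerance and the labels are assumptions, not the method actually implemented by RRC-KI or the ENOC.

```python
from datetime import datetime, timedelta

# Assumed tolerance for polling intervals and clock differences between the probes.
SLACK = timedelta(minutes=5)

def overlaps(a, b):
    """True if outages a and b, each (site, start, end), hit the same site at overlapping times."""
    return a[0] == b[0] and a[1] <= b[2] + SLACK and b[1] <= a[2] + SLACK

def correlate(lyon, russia):
    """Label each outage by which vantage point(s) observed it."""
    results = []
    for a in lyon:
        seen_by_both = any(overlaps(a, b) for b in russia)
        results.append((a[0], "both" if seen_by_both else "lyon-only"))
    for b in russia:
        if not any(overlaps(a, b) for a in lyon):
            results.append((b[0], "russia-only"))
    return results

t0 = datetime(2008, 9, 1, 12, 0)
lyon = [("site-1", t0, t0 + timedelta(minutes=30)),
        ("site-2", t0, t0 + timedelta(minutes=10))]
russia = [("site-1", t0 + timedelta(minutes=2), t0 + timedelta(minutes=28)),
          ("site-3", t0, t0 + timedelta(minutes=15))]
results = correlate(lyon, russia)
print(results)
```

An outage seen from both places suggests a problem at or near the site, while an outage seen from only one place points to the path from that vantage point. That reading is a heuristic, not a conclusion stated in the slides.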

SLIDE 18

Network Operational Database

SLIDE 19

Tools for troubleshooting (DFN)

  • Tools for efficient troubleshooting

– Launch tests on demand from the Grid site under central server control: ping, traceroute, DNS lookup, nmap and bandwidth measurements.

[Diagram: numbered steps 1 to 5 linking the ENOC supervisor, the central ENOC monitoring server, the site administrator, the local light perfSONAR sensor at the site, and Grid sites A and B]
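
A sensor that runs tests on request from a central server should accept only a fixed set of tests and validated targets. The sketch below shows that idea; it is not the DFN tool or a perfSONAR plug-in. The whitelist, the function name and the chosen flags (`ping -c 4`, `dig +short`, as found on common Linux systems) are assumptions, and nmap and bandwidth tests are left out for brevity.

```python
import re

# Hypothetical whitelist: test name -> argv template.
ALLOWED_TESTS = {
    "ping": ["ping", "-c", "4", "{target}"],
    "traceroute": ["traceroute", "{target}"],
    "dns": ["dig", "+short", "{target}"],
}

# Hostnames and IPv4 literals only; rejects anything that could look like an option.
HOSTNAME_RE = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9.-]*[A-Za-z0-9])?$")

def build_command(test, target):
    """Return the argv list for an allowed test, or raise ValueError."""
    if test not in ALLOWED_TESTS:
        raise ValueError(f"test not allowed: {test}")
    if not HOSTNAME_RE.match(target):
        raise ValueError(f"invalid target: {target!r}")
    return [arg.format(target=target) for arg in ALLOWED_TESTS[test]]

cmd = build_command("ping", "se01.example.org")
print(cmd)
```

The resulting list can be passed to `subprocess.run(cmd, capture_output=True, text=True, timeout=30)`. Building an argv list instead of a shell string avoids shell injection from a malformed request.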

SLIDE 20

Tools for troubleshooting (DFN)

  • Active measurement on demand: lightweight perfSONAR version with a specific plug-in
  • Looking for beta-tester sites
  • NRENs can take advantage of the deployment of this software

– To troubleshoot their own grid nodes

SLIDE 21

Grid site networking needs (RedIRIS)

  • Establish empirically the network needs of a site according to the type of

– Site (Tier 0, 1, 2, 3)
– Experiment computed at the site

  • Working plan

– Review of the status of Tier 2 / Tier 3 sites in Spain
– Translate the requirements and needs into network parameters to be measured
– Brief review of the different network performance and monitoring tools that the tiers agree to deploy
– Pilot / service definition for deploying perfSONAR
– Definition of performance and monitoring tests
– Test phase, results and conclusions

SLIDE 22

Advanced network services (GRNET)

  • Enable access for applications to the advanced services provided by the NRENs
  • SLA automation in a multi-domain environment through AMPS (Advance Multi-domain Provisioning System)

– Overcome the lack of automated mechanisms

  • SLA monitoring in EGEE

– Automate the monitoring procedure and generate alarms
– perfSONAR

  • Investigate the new advanced network services that will soon be available

– Dynamic lightpath?
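
Automated SLA monitoring amounts to comparing measurements against agreed bounds and raising an alarm on each violation. The sketch below assumes three illustrative metrics and thresholds; the `Sla` fields and the measurement dictionary are hypothetical and do not reflect the AMPS data model or any perfSONAR API.

```python
from dataclasses import dataclass

@dataclass
class Sla:
    max_rtt_ms: float            # assumed bound on round-trip time
    max_loss_pct: float          # assumed bound on packet loss
    min_throughput_mbps: float   # assumed guaranteed throughput

def check_sla(sla, m):
    """Return one alarm string per SLA bound violated by measurement m."""
    alarms = []
    if m["rtt_ms"] > sla.max_rtt_ms:
        alarms.append(f"RTT {m['rtt_ms']} ms exceeds {sla.max_rtt_ms} ms")
    if m["loss_pct"] > sla.max_loss_pct:
        alarms.append(f"loss {m['loss_pct']}% exceeds {sla.max_loss_pct}%")
    if m["throughput_mbps"] < sla.min_throughput_mbps:
        alarms.append(f"throughput {m['throughput_mbps']} Mbit/s below {sla.min_throughput_mbps} Mbit/s")
    return alarms

sla = Sla(max_rtt_ms=50.0, max_loss_pct=0.1, min_throughput_mbps=900.0)
ok = check_sla(sla, {"rtt_ms": 32.0, "loss_pct": 0.0, "throughput_mbps": 940.0})
bad = check_sla(sla, {"rtt_ms": 75.0, "loss_pct": 0.5, "throughput_mbps": 940.0})
print(ok)
print(bad)
```

In a real deployment the measurements would come from the monitoring infrastructure (the slide names perfSONAR) and the alarms would feed the ticketing process.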

SLIDE 23

IPv6 follow-up (GARR/CNRS)

  • Set up all elements needed to handle IPv6 in EGEE

– Middleware, testbed

    gLite internal dependencies, IPv6 compliance
      • DPM-LFC, BDII
    External dependencies
      • Assessment of IPv6 compliance of external modules
      • In-depth tests of important external modules: GridFTP …

– Validation process of EGEE (SA3)
– IPv6 knowledge dissemination

    Training courses, presentations

  • Assess and make available an operational EGEE IPv6 site, depending on which IPv6 gLite modules are available
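
A first-pass IPv6 compliance assessment of a module can look for socket calls that only work with IPv4. The sketch below is an assumption about how such a check could be done, not the EGEE procedure. The call list follows the protocol-independent replacements defined in RFC 3493 (`getaddrinfo`, `getnameinfo`, `inet_pton`, `inet_ntop`).

```python
import re

# IPv4-only BSD socket calls and their protocol-independent replacements.
IPV4_ONLY = {
    "gethostbyname": "getaddrinfo",
    "gethostbyaddr": "getnameinfo",
    "inet_addr": "inet_pton",
    "inet_aton": "inet_pton",
    "inet_ntoa": "inet_ntop",
}
PATTERN = re.compile(r"\b(" + "|".join(IPV4_ONLY) + r")\s*\(")

def scan(source):
    """Return (line number, call found, suggested replacement) for each IPv4-only call."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for m in PATTERN.finditer(line):
            findings.append((lineno, m.group(1), IPV4_ONLY[m.group(1)]))
    return findings

# Made-up C fragment used only to exercise the scanner.
sample = """\
struct hostent *h = gethostbyname(host);
addr.sin_addr.s_addr = inet_addr("192.0.2.1");
getaddrinfo(host, port, &hints, &res);
"""
findings = scan(sample)
for lineno, call, replacement in findings:
    print(f"line {lineno}: {call} -> consider {replacement}")
```

A textual scan only flags candidates: it misses hard-coded `AF_INET` sockets and fixed-size address buffers, so the deep tests mentioned above are still needed.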

SLIDE 24

Trouble ticket exchange (CNRS/GRNET)

  • Defined by the TNLC (GARR, GRNET, RRC-KI, SRCE)
  • Standard trouble tickets allow better

– Location of the problem
– Assessment of the impact of trouble on the grid

  • The translation can be done in

– The ENOC: a central server translating the NREN’s ticket into a standard ticket
– The NREN domain

  • Software will be available soon
  • The translator can easily be adapted to the requirements of NRENs willing to deliver standard tickets directly
  • Standard trouble tickets will benefit both NRENs and Grid projects
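
The translation step can be pictured as a per-NREN field mapping onto a common ticket layout. The sketch below is illustrative only: the standard field names and the `nren-a` mapping are invented, since the slides do not give the ticket format defined by the TNLC.

```python
# Hypothetical standard ticket fields (the real TNLC format is not shown in the slides).
STANDARD_FIELDS = ("id", "origin", "status", "start_time", "end_time",
                   "location", "description")

# Hypothetical per-NREN mappings: native field name -> standard field name.
MAPPINGS = {
    "nren-a": {"ref": "id", "state": "status", "begin": "start_time",
               "finish": "end_time", "pop": "location", "text": "description"},
}

def translate(nren, native_ticket):
    """Map a native NREN ticket (dict) onto the standard field set."""
    mapping = MAPPINGS[nren]
    std = {field: None for field in STANDARD_FIELDS}
    std["origin"] = nren
    for native_key, value in native_ticket.items():
        std_key = mapping.get(native_key)
        if std_key:                      # unknown native fields are dropped
            std[std_key] = value
    return std

native = {"ref": "A-4711", "state": "open", "begin": "2008-09-01T12:00Z",
          "pop": "Lyon", "text": "Fibre cut", "internal_note": "n/a"}
ticket = translate("nren-a", native)
print(ticket)
```

Keeping the mapping as data rather than code is one way to make the translator easy to adapt per NREN, whether it runs centrally at the ENOC or inside the NREN domain.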

SLIDE 25

The European Grid Initiative

  • There must be no gap in the support of the production grid
  • Need to prepare a permanent Grid infrastructure
  • Coordinate the integration and interaction between National Grid Infrastructures (NGIs)
  • Experimental/research tasks should switch to production phases
  • Establish at EGI level a sustainable collaboration between Grid and network people
  • A major stake for NRENs

SLIDE 26

Lessons learnt from EGEE

  • Future European Grid Initiative network activity:
  • Troubleshooting activity should be kept to a minimum (only big issues)
  • Interaction (process, trouble sharing) and integration (operation design, monitoring…) with the Grid are essential at project level
  • Trouble ticket handling should be turned into a knowledge database and used as part of network quality monitoring
  • Network monitoring is an open subject in EGI-NGI
  • The NGI/EGI will federate several grid projects and therefore handle more sites and more networks
  • Future possibilities offered by networks to the Grid should not be missed: dynamic lightpath provisioning (Internet2, Phosphorus…), IPv6 compliance
  • Network quality control should be fostered (statistics, MoU checking, feedback to network providers…)

SLIDE 27

Network activity in EGI/NGI

  • Network activity key objectives in EGI/NGI
  • Interface between the European Grid Infrastructure and network providers
  • Monitor the quality of networks used by Grid projects:
      • Public: educational and research networks
      • Private: non-educational network providers (commercial…)
      • Dedicated: LHCOPN (LHC Optical Private Network)…
  • Ensure that applications’ network requirements are fulfilled (monitoring)
  • Put new network technologies forward in the Grid process

SLIDE 28

Conclusion

  • Trouble ticket standardization
  • Tools for troubleshooting

– Lightweight perfSONAR deployed on grid sites

  • Network monitoring for EGI
  • Collaboration with NRENs

– Around specific topics (network monitoring of grid sites, trouble tickets, assessment of the impact of trouble on the grid)
– Through the TNLC

  • Establish a future collaboration between NRENs and NGI/EGI

SLIDE 29

Thank you.