NREN NOC - TF-NOC preparation meeting, Copenhagen, May 3, 2010



slide-1
SLIDE 1

NREN NOC

TF-NOC preparation meeting, Copenhagen, May 3, 2010

Håvard Kusslid, NOC manager, UNINETT
hk@uninett.no

slide-2
SLIDE 2

UNINETT NOC, history

  • UNINETT - The Norwegian research network
  • Pre 2002: helpdesk (1st level) outsourced. UNINETT technicians acted on tickets / requests, no engineer on duty outside office hours.
  • Customers felt at a distance / "in the dark", often bypassing established channels and calling their favourite contact directly.
  • Unsatisfactory working conditions for key personnel.
  • 24/7 in-house NOC since March 2002

slide-3
SLIDE 3

UNINETT NOC 24/7

Standard NOC hours: Monday - Friday: 08:00 - 16:00

  • Full monitoring of network and services.
  • 2 persons on daily duty, pool of 20 engineers.
  • A mix of network engineers and system support/ development engineers.
  • Minimum 1 network engineer on each shift initially.

Extended hours/ Weekend:

  • Duty engineer on call for priority issues
  • Periodic status checking of network and services and responding to monitoring system alarms.
  • Requirement for duty engineer: in-depth network knowledge and basic to fair system knowledge.
  • A pool of 12 duty engineers doing one-week periods.
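
The one-week on-call periods described above could be laid out with a simple round-robin rotation. This is only a minimal sketch: the function name `weekly_roster`, the engineer names, and the start date are hypothetical, and it does not represent UNINETT's home-grown calendaring and staffing tool.

```python
from datetime import date, timedelta
from itertools import cycle, islice


def weekly_roster(engineers, first_monday, weeks):
    """Assign one on-call engineer per week, cycling through the pool in order."""
    if not engineers:
        raise ValueError("the pool of engineers must not be empty")
    if first_monday.weekday() != 0:
        raise ValueError("first_monday must be a Monday")
    rotation = islice(cycle(engineers), weeks)
    return [
        (first_monday + timedelta(weeks=i), name)
        for i, name in enumerate(rotation)
    ]


# Hypothetical pool of 12 duty engineers, as on the slide.
pool = [f"engineer-{n:02d}" for n in range(1, 13)]
roster = weekly_roster(pool, date(2010, 5, 3), 14)

for week_start, name in roster[:3]:
    print(week_start.isoformat(), name)
```

With 12 engineers the rotation wraps after 12 weeks, so each engineer is on call roughly four to five weeks per year.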

slide-4
SLIDE 4

NOC operators

  • All our network engineers participate as NOC operators, including the NOC manager and our director of Network and Services. Rotating staff, no designated NOC operators. (Until recently we had two permanent operators, alternating on week duty with one from the "pool".)
  • Recently extended the pool of NOC operators with persons with system background. (Merged system and network dept.)
  • One "veteran" NOC operator always on duty until the level of experience is more "uniform".

slide-5
SLIDE 5

NOC layout

  • NOC centre with 3 work-stations (two manned)
  • Homegrown monitoring environment for status display for network and services.
  • Dual-screen PCs (Ubuntu) + place for personal laptop
  • One overhead monitor with permanent live view
  • One large-screen monitor for visualisations/ show-off
  • Standalone environment for power (UPS with backup generator)
  • Dedicated access switches in the NOC bypassing the "floor" patch room and local net, connecting directly to the core switch in the main server facility in the basement.

slide-6
SLIDE 6

NOC Tasks, overview

  • Problem management, change management.
  • Monitors network equipment and status.
  • Monitors services (inhouse and external systems).
  • Coordinates the distribution and updating of software
  • Coordinates planned work and outages with customers, circuit providers and NORDUnet.
  • Router configuration services, our own and customers'
  • Allocates and manages IP addresses
  • DNS and registry services

slide-7
SLIDE 7

Problem management

The NOC performs the usual steps when an outage or problem occurs:

  • problem identification, troubleshooting
  • notification, escalation (if necessary)
  • problem resolution, confirming status with customers.

Problems and events are logged in a daily watch log

  • if follow-up is needed a ticket is logged.

We do not currently ticket everything (but need to ticket more..)

  • Ticket system/ knowledgebase is being researched.

Problem Management statistics are not currently produced, but are also "on the horizon".
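
The watch-log practice above (log every event, open a ticket only when follow-up is needed) can be sketched as a small data structure. This is an illustrative sketch under assumptions: `WatchLog`, `WatchLogEntry`, and `ticketed_share` are hypothetical names, and the "ticket" is just a reference to the log entry rather than a record in a real ticketing system.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class WatchLogEntry:
    timestamp: datetime
    summary: str
    needs_follow_up: bool = False


@dataclass
class WatchLog:
    entries: list = field(default_factory=list)
    tickets: list = field(default_factory=list)

    def log(self, summary, needs_follow_up=False, when=None):
        """Record an event; open a ticket only if follow-up is needed."""
        entry = WatchLogEntry(when or datetime.now(), summary, needs_follow_up)
        self.entries.append(entry)
        if needs_follow_up:
            self.tickets.append(entry)
        return entry

    def ticketed_share(self):
        """Fraction of logged events that resulted in a ticket."""
        return len(self.tickets) / len(self.entries) if self.entries else 0.0


watch_log = WatchLog()
watch_log.log("Link flap on a customer access circuit, recovered by itself")
watch_log.log("Router line card failure, replacement ordered", needs_follow_up=True)
print(f"{watch_log.ticketed_share():.0%} of events ticketed")
```

A figure such as `ticketed_share` is one example of the kind of problem-management statistic that could later be produced from the log.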

slide-8
SLIDE 8

Change management

The NOC coordinates network installations and maintenance, and assists staff in the field. We also assist customers with configuration changes when asked.

Change review, approval, scheduling, and notification:

  • Changes that will have a considerable effect on the network or topology are reviewed by the section responsible for the work being done.
  • Scheduled maintenance is notified in advance.
  • If the changes meet certain criteria they are performed during set service windows.
  • Emergency changes are handled on a case-by-case basis.
  • Changes that need immediate attention are notified but may be subject to short notice or "post-event" notification.
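
The change-handling rules above can be read as a small decision table. The sketch below encodes them under stated assumptions: the three change kinds (`"scheduled"`, `"immediate"`, `"emergency"`), the `ChangePlan` fields, and the rule that only scheduled changes use service windows are illustrative choices, since the slide does not spell out the actual criteria.

```python
from dataclasses import dataclass

# Notification policy per change kind, paraphrased from the slide.
NOTIFICATION = {
    "scheduled": "in advance",
    "immediate": "short notice or post-event",
    "emergency": "case by case",
}


@dataclass(frozen=True)
class ChangePlan:
    section_review: bool   # reviewed by the section responsible for the work
    service_window: bool   # performed during a set service window
    notification: str


def plan_change(kind, considerable_effect=False, meets_window_criteria=False):
    """Map a change request onto review, scheduling and notification handling."""
    if kind not in NOTIFICATION:
        raise ValueError(f"unknown change kind: {kind!r}")
    return ChangePlan(
        section_review=considerable_effect,
        # Assumption: only scheduled changes are placed in service windows.
        service_window=(kind == "scheduled" and meets_window_criteria),
        notification=NOTIFICATION[kind],
    )


print(plan_change("scheduled", considerable_effect=True, meets_window_criteria=True))
print(plan_change("immediate"))
```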

slide-9
SLIDE 9

Supporting facilities

Test Lab: We have testing facilities for new hardware and software before incorporating it in the network, and for simulation/ troubleshooting. Not run by the NOC.

Inventory system: "Next business day" agreements for larger equipment (larger than Cisco 6500-series and for the larger Juniper routers).

Inhouse stock of smaller routers/ switches - able to recreate and ship replacement equipment from our own premises.

We also collaborate with the universities and have use of emergency replacement from their spare equipment stock.

We have a basic home-grown inventory system with threshold notifications for GBICs, fiber patch cables with different connectors etc.
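
A threshold notification of the kind mentioned for the inventory system can be as simple as comparing stock counts against per-item minimums. This is a minimal sketch, not the actual home-grown system: the function `low_stock`, the item names, and all quantities are made up for illustration.

```python
def low_stock(stock, thresholds):
    """Return the items whose stock count is below the configured minimum."""
    return sorted(
        item
        for item, minimum in thresholds.items()
        if stock.get(item, 0) < minimum
    )


# Hypothetical stock levels and thresholds.
stock = {"gbic-lx": 4, "gbic-sx": 12, "patch-lc-lc-2m": 3}
thresholds = {"gbic-lx": 5, "gbic-sx": 5, "patch-lc-lc-2m": 10, "patch-sc-lc-2m": 2}

for item in low_stock(stock, thresholds):
    print(f"reorder: {item} (have {stock.get(item, 0)}, want {thresholds[item]})")
```

Items missing from the stock table are treated as a count of zero, so a spare part that was never registered still triggers a notification.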

slide-10
SLIDE 10

NOC Tools:

  • Home grown/ open source monitoring environment for net/ services: ZINO for network monitoring (SNMP-based). Hobbit for system monitoring.
  • Home grown traffic engineering and simulation tool: Pymetric (metric adjustments, simulate effects of changes, outages, router failures)
  • Home grown calendaring and staffing tool for duty roster planning/ notification of scheduled duty
  • Open source ticketing systems with some modifications (RT, RTIR)
  • Home grown CMS system: inventory, circuits, service catalogue, customers/ vendors, service agreements, operator instructions etc.: KIND (web front, database)

slide-11
SLIDE 11

NOC «on campus» tools

Campus toolbox: network monitoring servers, physically placed on the campus they are set to manage (30+ deployed). They run Debian Linux; main features:

  • network management system: NAV
  • netflow analysis tool: Nfsen (including NfDump)
  • service monitor: Hobbit
  • tftp with RCS for switch and router configuration archive
  • radius service for routers and switches, syslog server

+ Measuring beacon server, dns/mail/all-purpose server

The UNINETT NOC has the operational responsibility for the servers and keeps spare servers in case of a breakdown. Cfengine and backup enable us to get a replaced server back in operation fairly quickly.

slide-12
SLIDE 12

What the NOC does not run:

We have separate helpdesks for:

  • Internal IT (though personnel overlap partly)
  • Registry operation (support for NORID/ .no)
  • UNINETT FAS helpdesk - FAS coordinates/ develops systems for billing, administrative systems, purchase, wages, personnel administration for the research and educational sector
  • Abuse/ CERT (but same people involved/ coordinating between NOC and CERT)
  • Sales / process of connecting new customers (though we assist in support/ troubleshooting)

slide-13
SLIDE 13

NOC (re-)organizing

  • Merged network and service department
  • Within this new department we are organized in cross-department sections (expert groups)
  • Internal Helpdesk and NOC are standalone sections
  • Coordinating through department meetings, and by key people being members of several sections.
  • Establishing trouble-ticket queues per section.
  • Tickets that are not resolved by NOC are distributed to the responsible expert groups.
  • Handover from the NOC queue to another section queue would be triggered by complexity, lacking or obsolete documentation, or repeating issues.
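
The handover triggers listed above (complexity, lacking or obsolete documentation, repeating issues) could be expressed as a routing rule between per-section queues. The sketch below is hypothetical: the `Ticket` fields, the queue name `"noc"`, the section name, and the `repeat_limit` of 3 are assumptions made for illustration, not values from the slides.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    subject: str
    section: str                   # expert group responsible for the affected area
    complex_issue: bool = False
    documentation_ok: bool = True  # False if documentation is lacking or obsolete
    repeat_count: int = 0          # how many times the same issue has recurred


def handover_reasons(ticket, repeat_limit=3):
    """List the triggers that justify moving a ticket out of the NOC queue."""
    reasons = []
    if ticket.complex_issue:
        reasons.append("complexity")
    if not ticket.documentation_ok:
        reasons.append("lacking or obsolete documentation")
    if ticket.repeat_count >= repeat_limit:
        reasons.append("repeating issue")
    return reasons


def target_queue(ticket, repeat_limit=3):
    """Keep the ticket with the NOC unless at least one handover trigger applies."""
    return ticket.section if handover_reasons(ticket, repeat_limit) else "noc"


ticket = Ticket("BGP session flapping", section="routing", repeat_count=4)
print(target_queue(ticket), handover_reasons(ticket))
```

Recording the reasons alongside the handover makes it possible to see later which trigger (for example, obsolete documentation) causes the most escalations.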

slide-14
SLIDE 14

Uninett NOC - future

Thus far we have not ticketed everything, and have no SLAs. Best-effort policies and a good track record have kept us out of trouble so far.

  • Demand (authorities / government) for contingency plans and quality systems: critical national infrastructure.
  • 15 of our larger customers (University Colleges) have joined resources, implementing FreeTIL ("ITIL light").

We are participating in this project as observers.

This development is expected to further raise demands and expectations towards our NOC. If we are not explicitly implementing ITIL, we need to have a system that addresses the same needs and supports and documents what we do and why we choose to do it this way.

  • Quality control system

slide-15
SLIDE 15

Uninett NOC future - cont.

Quality system and Service catalogue:

Our Service catalogue is part of our CMS system.

  • The NOC is supporting and monitoring an ever growing portfolio of services, which has grown beyond what can be referenced via flat-file "howto-documents"
  • We need to be able to navigate from alarm to service description, to operator instructions and knowledgebase
  • Need to identify the subsystems a service depends on
  • Which customers are affected and what promises have we made regarding the service
  • Be able to review service history/ track record to spot issues/ trends and correct if necessary
  • Be able to manage the life-cycle of services
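
The navigation described above, from an alarm to the affected services, their operator instructions, and the affected customers, amounts to walking a dependency graph in the service catalogue. The following is a minimal sketch with a made-up catalogue. The `Service` fields, the `impact` function, and all service and customer names are hypothetical and do not describe the KIND data model.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    description: str
    instructions: str                      # pointer to operator instructions
    depends_on: list = field(default_factory=list)
    customers: list = field(default_factory=list)


def depends_transitively(catalogue, name, target, seen=None):
    """True if service `name` depends on `target`, directly or indirectly."""
    seen = set() if seen is None else seen
    if name in seen or name not in catalogue:
        return False
    seen.add(name)
    return any(
        dep == target or depends_transitively(catalogue, dep, target, seen)
        for dep in catalogue[name].depends_on
    )


def impact(catalogue, failed):
    """Return (affected services, affected customers) for a failed component."""
    hit = sorted(
        name
        for name in catalogue
        if name == failed or depends_transitively(catalogue, name, failed)
    )
    customers = sorted({c for name in hit for c in catalogue[name].customers})
    return hit, customers


# Hypothetical catalogue entries.
catalogue = {
    "core-router": Service("Backbone router", "howto/core-router"),
    "dns": Service("Resolver service", "howto/dns",
                   depends_on=["core-router"], customers=["college-a"]),
    "mail": Service("Mail relay", "howto/mail",
                    depends_on=["dns"], customers=["college-b"]),
}

services, customers = impact(catalogue, "core-router")
print(services, customers)
```

The `seen` set guards against circular dependencies, which are easy to introduce when catalogue entries are maintained by hand.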

slide-16
SLIDE 16

Uninett NOC future, wrap up..

Main focus of our current NOC development work:

  • Formalizing procedures for problem and change management, including handover of cases internally.
  • Implementing service agreements where applicable
  • Integrating and developing our monitoring systems, CMS, service catalogue, ticket system/ knowledgebase so that these are connected in a logical manner.
  • The ambition is that the product of this work can be documented as our quality control system.

Questions? (links and contact data will be added)