03-Jun-2004
Abilene Observatory Datasets
Matt Zekauskas, matt@internet2.edu
Major Datasets, roughly size order
Flow data (last 11 bits of IP zeroed)
Latency (one-way, 2*11^2 paths (v4, v6))
Near future: IGP (IS-IS updates/node)
SYSLOG from router (internal at least, future)
Router snapshots
1 and 5 min SNMP usage, errors
Throughput (iperf, 1Gb limited, 2*2*11^2 sets (v4, v6, TCP, UDP))
[multicast]
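The path and set counts quoted above are plain arithmetic. A minimal sketch, assuming the 11 in the formulas is the number of router nodes and that every ordered node pair is counted (including a node paired with itself, since the slide writes 11^2); the function names are illustrative only.

```python
# Assumption: 11 router nodes, as implied by the 11^2 in the slide's formulas.
NODES = 11


def latency_paths(nodes: int = NODES, families: int = 2) -> int:
    """Ordered node pairs (nodes^2), once per address family (v4, v6)."""
    return families * nodes ** 2


def throughput_sets(nodes: int = NODES, families: int = 2, protocols: int = 2) -> int:
    """Ordered node pairs, once per address family and per protocol (TCP, UDP)."""
    return protocols * families * nodes ** 2


print(latency_paths())    # 242
print(throughput_sets())  # 484
```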
Common Theme
Summaries available via Web
- Graphs, Tables
- Time series (of summaries) via “web services”
Raw data in diverse formats, only by special request (often on HPSS / MSS)
- (Probably) not enough metadata kept
- Stored by date; some tarballs
- Recovery is a manual process
- Except router snapshots. Have all XML files.
Flow data gone after 30 days
Other problems
Stuff is archived in many places (with different administrative hurdles)
- OSU, Columbus [RAID]
- IU, Bloomington [HPSS]
- Internet2 / Ann Arbor [RAID, tape]
Although everything is “available” from http://abilene.internet2.edu/observatory, it’s a twisty maze of Web links (and there is no access to archived data)
Validation? We know some of the flow data is bad…
Some future plans
New databases: IGP, BGP
Looking to use a DHS grant to help clean up databases & improve access
Hope to contribute to this effort
URLs
http://abilene.internet2.edu/observatory
- Pointers to all measurements/sites/projects
- Particularly
http://abilene.internet2.edu/observatory/data-views.html
http://www.abilene.iu.edu/
- NOC home page. Weathermap, Proxy, SNMP measurements
http://netflow.internet2.edu/weekly/
- Summarized flow data
http://www.itec.oar.net/abilene-netflow/
- “Raw” – matrices; (Anon) feeds available on request
Details on select datasets
Flow data
Collected using Mark Fullmer’s flowtools
- 1/100 sampling; may be losing some, but we don’t note it
Stored (last 11 bits zeroed) on a RAID for 30 days
Summaries stored forever
- http://www.itec.oar.net/abilene-netflow/
- http://netflow.internet2.edu/weekly/
Raw: access via rsync.
- One directory per day per router
- Files with 5 min chunks (so 288/day)
- Not necessarily available in real time, <=24 hr delay possible
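Two details above lend themselves to a short illustration: zeroing the last 11 bits of an address, and the 288 five-minute chunks per day. The sketch below is an assumption-laden example, not the collection pipeline itself: it handles IPv4 only (the slides do not say how IPv6 addresses are treated), and `zero_low_bits` and `chunk_index` are hypothetical helper names.

```python
import ipaddress


def zero_low_bits(addr: str, bits: int = 11) -> str:
    """Return the IPv4 address with its lowest `bits` bits set to zero."""
    value = int(ipaddress.IPv4Address(addr))
    mask = (0xFFFFFFFF << bits) & 0xFFFFFFFF  # keep only the high 32 - bits bits
    return str(ipaddress.IPv4Address(value & mask))


def chunk_index(hour: int, minute: int, chunk_minutes: int = 5) -> int:
    """Index of the chunk within the day that covers hour:minute (0-based)."""
    return (hour * 60 + minute) // chunk_minutes


# 198.51.100.77 is a documentation address used purely as an example.
print(zero_low_bits("198.51.100.77"))  # 198.51.96.0
print(chunk_index(23, 59) + 1)         # 288 chunks in a day
```

With 11 bits zeroed, every address in the same 2048-address block maps to the same stored value.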
Latency
Our own owamp implementation
- 1/sec Poisson; full mesh among router nodes
Summaries
- Stored in MySQL
- Web displays, including graphical and “worst 10”
- XML/SOAP access
http://abilene.internet2.edu/ami/webservices.html using GGF Network Measurement WG schema…
Raw
- Directory per path per day
- Tarred into one large file per day, then compressed
- HPSS, no public access
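The "1/sec Poisson" probing above means probes are sent at random times with exponentially distributed gaps averaging one second. The following is a minimal sketch of such a schedule, not the owamp implementation itself; `poisson_send_times` and its parameters are hypothetical names chosen for illustration.

```python
import random


def poisson_send_times(duration_s: float, rate_per_s: float = 1.0, seed=None):
    """Return send times (seconds from start) for a Poisson probe stream.

    Gaps between consecutive probes are drawn from an exponential
    distribution with mean 1 / rate_per_s.
    """
    rng = random.Random(seed)  # seeded generator so a run can be repeated
    times = []
    t = rng.expovariate(rate_per_s)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_per_s)
    return times


# One minute of probing at an average of one probe per second.
times = poisson_send_times(60.0, seed=1)
print(len(times), "probes scheduled")
```

The count varies from run to run when no seed is given; over long intervals it averages `rate_per_s * duration_s`.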
Router Snapshots
1/hr query of routers using XML & Junoscript; config, routing state, interface status…
Raw
- Stored as compressed XML files forever
- Access via SOAP/XML or fetching files
- http://loadrunner.uits.iu.edu/~gcbrowni/Abilene/raw-data.html
Web page veneer
- http://loadrunner.uits.iu.edu/~gcbrowni/Abilene/
- Current status, and also old available by date
- Some processed more than others (including some into rrd files and graphs [rrd files available in addition to graphs])
5 min SNMP
Polled using custom software, stored into RRD files
Access depends on link type
- Backbone from weathermap
- Access by clicking on map of router nodes
- Either way, end up with a typical MRTG-style page
Raw
- RRD files for current day off MRTG-style page
- RRD files stored per day (I believe) on HPSS
- No public access, but should be available on request
Questions
Brief description of data sets
Tools used (?)
Cataloging and archiving the data
Problems with the data
Desirable database support
Future plans
Observatory
Publishing measurement data
- Stuff we collect for operations
- Stuff we collect for research
The ability for research projects to add their equipment, or run software on our platform
- Peer reviewed
- Why? Passive, collocation makes analysis easier
- AMP, PMA, Planetlab