Tracking the Evolution of Web Traffic: 1995-2003
http://www.cs.unc.edu/Research/dirt
The University of North Carolina at Chapel Hill, Department of Computer Science
11th ACM/IEEE International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS)
Orlando, October 13th, 2003
Félix Hernández-Campos, Kevin Jeffay, F. Donelson Smith
Web Traffic Measurement and Analysis at UNC-Chapel Hill
- In 1997, populating web traffic generators for experimental networking research motivated a large-scale study of web traffic at UNC with three goals:
- Develop a lightweight methodology
  – Based on passive measurement
  – Easy to keep models up to date
- Replace smaller-scale, quickly aging models
  – Mah, 1995 data set
  – Crovella et al., 1995 data set (revised with 1998 data)
- Characterize the use of the HTTP protocol
  – E.g., use of persistent connections
Web Traffic Measurement and Analysis at UNC-Chapel Hill
- Our methodology and first results were published in SIGMETRICS/Performance’01
  – What TCP/IP Protocol Headers Can Tell Us About the Web
- Modeling aspect explored in a series of papers
  – E.g., Variable Heavy Tails in Internet Traffic (with J.S. Marron)
    » (Part I: Understanding Heavy Tails published in MASCOTS’02)
- In this talk, I will describe our approach and our observations on the evolution of web traffic:
  – Three data sets: 1999, 2001 and 2003
  – Comparisons to Mah and Crovella et al.
Methodology
Study of Web Content Consumers
- We studied a large collection of users (~35,000) as web content consumers
- The only source of data for our study was packet header traces
  – Anonymized IP addresses
  – No HTTP headers
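To make concrete what a packet header trace with anonymized addresses might contain, here is a minimal Python sketch. It is illustrative only: the slides do not say which trace format or anonymization scheme was used, so the `HeaderRecord` fields, the `anonymize_ip` and `anonymize_record` helpers, and the keyed-hash (HMAC) mapping are all assumptions. The point is that each record carries only TCP/IP header fields, never the HTTP payload, and that a given address always maps to the same pseudonym, so connections from one client can still be grouped.

```python
import hashlib
import hmac
import ipaddress
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HeaderRecord:
    """One packet as seen in a header-only trace (hypothetical layout)."""
    timestamp: float
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    tcp_flags: str
    payload_len: int  # size of the TCP payload in bytes; the payload itself is not stored


def anonymize_ip(addr: str, key: bytes) -> str:
    """Map an IPv4 address to a stable pseudonym using a keyed hash.

    Assumption: a simple HMAC-based mapping, not necessarily the scheme
    used in the study. The same (addr, key) pair always yields the same output.
    """
    packed = ipaddress.IPv4Address(addr).packed
    digest = hmac.new(key, packed, hashlib.sha256).digest()
    return str(ipaddress.IPv4Address(digest[:4]))


def anonymize_record(rec: HeaderRecord, key: bytes) -> HeaderRecord:
    """Return a copy of the record with both addresses replaced by pseudonyms."""
    return replace(
        rec,
        src_ip=anonymize_ip(rec.src_ip, key),
        dst_ip=anonymize_ip(rec.dst_ip, key),
    )
```

Ports, flags, timestamps and payload sizes pass through unchanged, which is the information a header-only analysis has to work with.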
[Figure: measurement setup. Labels: University of North Carolina at Chapel Hill, Internet, Web Clients, Web Servers, HTTP Requests, HTTP Responses]