A First Look at Modern Enterprise Traffic Ruoming Pang , Princeton - - PowerPoint PPT Presentation


slide-1
SLIDE 1

A First Look at Modern Enterprise Traffic

Ruoming Pang, Princeton University

Mark Allman (ICSI), Mike Bennett (LBNL), Jason Lee (LBNL), Vern Paxson (ICSI/LBNL), and Brian Tierney (LBNL)

slide-2
SLIDE 2

The Question

“What does the traffic look like in today’s enterprise networks?”

  • Previous work

– LAN traffic [Gusella 1990, Fowler et al. 1991]
– More recent work on individual aspects:

  • Role classification [Tan et al. 2003]
  • Community of interest [Aiello et al. 2005]
  • Wide area Internet traffic measurements

– First study: [Cáceres 1989] … when the size of the Internet was ~130,000 hosts … about the size of a large enterprise network today

slide-3
SLIDE 3

Our First Look

  • Which applications account for most traffic?
  • Who is talking to whom?
  • What’s going on inside application traffic?

– Esp. ones that are heavily used but not well studied: Netware Core Protocol (NCP), Windows CIFS and RPC, etc.

  • How often is the network overloaded?

For all of the above, compare internal vs. wide-area traffic

slide-4
SLIDE 4

Trace Collection

  • Where: Lawrence Berkeley National Lab (LBNL)

– A research institute with a medium-sized enterprise network

  • Caveat: one-enterprise study

– “The traffic might look like …”

  • How: tapping links from subnets to the main routers

  • Caveat: only traffic between subnets
slide-5
SLIDE 5

LBNL Trace Data

  • Five data sets
  • Over three months: Oct 2004 -- Jan 2005

                   D0          D1           D2           D3          D4
Date               Oct 4, 04   Dec 15, 04   Dec 16, 04   Jan 6, 05   Jan 7, 05
Duration           10 min      1 hour       1 hour       1 hour      1 hour
Subnets            22          22           22           18          18
Packets            18M         65M          28M          22M         28M
Snaplen (bytes)    1500        68           68           1500        1500
Traced Hosts       2,531       2,102        2,088        1,561       1,558

slide-6
SLIDE 6

LBNL Trace Data

  • Each trace covers a subnet
  • Lasts ten minutes or one hour

slide-7
SLIDE 7

LBNL Trace Data

  • Two sets of subnets
  • About 2,000 hosts traced per data set (1,558 to 2,531)

slide-8
SLIDE 8

LBNL Trace Data

  • Subnets are traced two at a time

– With four NICs on the tracing machine

slide-9
SLIDE 9

LBNL Trace Data

  • Packets with full payloads allow application-level analysis

slide-10
SLIDE 10

Outline of This Talk

  • Traffic breakdown

– Which applications are dominant?

  • Origins and locality
  • Individual application characteristics
slide-11
SLIDE 11

Network Layer: Is IP dominant?

  • Yes, most packets (96-99%) are over IP

– Caveat: inter-subnet traffic only

  • Aside from IP: ARP, IPX (broadcast), etc.
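
A packet-share figure like the one on this slide can be computed from link-layer frame types. The following is a minimal Python sketch, assuming each captured frame has already been reduced to its EtherType value. The function name and the sample data are illustrative and do not come from the talk, and only Ethernet II framing is considered.

```python
from collections import Counter

# EtherType values for the protocols named on the slide (Ethernet II framing only).
ETHERTYPE_NAMES = {
    0x0800: "IP",   # IPv4
    0x0806: "ARP",
    0x8137: "IPX",
}


def network_layer_share(ethertypes):
    """Return the fraction of packets per network-layer protocol.

    `ethertypes` is an iterable of integer EtherType values, one per packet.
    Unknown values are grouped under "other".
    """
    counts = Counter(ETHERTYPE_NAMES.get(e, "other") for e in ethertypes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: n / total for name, n in counts.items()}


if __name__ == "__main__":
    # Made-up sample: 97 IP packets, 2 ARP packets, 1 IPX packet.
    sample = [0x0800] * 97 + [0x0806] * 2 + [0x8137]
    for name, frac in sorted(network_layer_share(sample).items()):
        print(f"{name}: {frac:.0%}")
```

Because the traces only cover inter-subnet links, a count like this would miss most non-IP traffic that stays within a subnet, which is the caveat the slide raises.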
slide-12
SLIDE 12

Transport Layer

  • Protocols seen:

– TCP, UDP, ICMP
– Multicast: IGMP, PIM
– Encapsulation: IPsec/ESP, GRE
– IP protocol 224 (?)

  • Is UDP used more frequently inside the enterprise than over the wide area Internet?

slide-13
SLIDE 13

TCP vs. UDP / WAN vs. Enterprise Breakdown by Payload Bytes

slide-14
SLIDE 14

Breakdown of the first data set (D0) (Bars add up to 100%)

slide-15
SLIDE 15

80% (or more) of payload bytes are sent within the enterprise.

slide-16
SLIDE 16

Yes, UDP is used more frequently inside the enterprise.
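
The TCP/UDP and enterprise/WAN split by payload bytes can be sketched as follows. This is a rough Python illustration, not the authors' tooling: it assumes packets are available as (source, destination, protocol, payload bytes) tuples, and it treats a packet as internal when both endpoints fall inside a configured enterprise prefix. The prefix `10.0.0.0/8` and all names are hypothetical placeholders.

```python
import ipaddress
from collections import defaultdict

# Hypothetical enterprise address space; replace with the site's real prefixes.
ENTERPRISE_NETS = [ipaddress.ip_network("10.0.0.0/8")]


def is_internal(addr):
    """True if `addr` (a string) lies inside one of the enterprise prefixes."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in ENTERPRISE_NETS)


def payload_breakdown(packets):
    """Return the fraction of payload bytes per (protocol, scope) pair.

    `packets` is an iterable of (src, dst, proto, payload_bytes).
    Scope is "ent" when both endpoints are internal, otherwise "wan".
    """
    totals = defaultdict(int)
    for src, dst, proto, nbytes in packets:
        scope = "ent" if is_internal(src) and is_internal(dst) else "wan"
        totals[(proto, scope)] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {key: value / grand_total for key, value in totals.items()}


if __name__ == "__main__":
    # Made-up packets for illustration only.
    packets = [
        ("10.0.0.1", "10.0.1.2", "tcp", 600),
        ("10.0.0.1", "10.0.1.3", "udp", 200),
        ("10.0.0.1", "192.0.2.7", "tcp", 200),
    ]
    for key, frac in sorted(payload_breakdown(packets).items()):
        print(key, f"{frac:.0%}")
```

Summing the "ent" entries gives the internal share of payload bytes, and comparing the UDP fraction within each scope addresses the question posed on the Transport Layer slide.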

slide-17
SLIDE 17

Breakdown by Flows

slide-18
SLIDE 18

Application Breakdown by Bytes

slide-19
SLIDE 19

Application Breakdown by Bytes

net-file: NFS, Netware Core Protocol

slide-20
SLIDE 20

Application Breakdown by Bytes

bulk: FTP, HPSS

slide-21
SLIDE 21

Application Breakdown by Bytes

windows: Port 135, 139, and 445

slide-22
SLIDE 22

Bars for each data set add up to 100%
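
A per-category byte breakdown of this kind can be approximated with a port-to-category map. The Python sketch below is only an illustration: the windows ports (135, 139, 445) come from the slides, while the remaining port assignments are the usual well-known ports and are assumptions about how the categories might be matched. HPSS is omitted because the slides give no port for it, and the function names are hypothetical.

```python
from collections import defaultdict

PORT_CATEGORY = {
    135: "windows", 139: "windows", 445: "windows",  # ports listed on the slide
    2049: "net-file",  # NFS (assumed well-known port)
    524: "net-file",   # Netware Core Protocol (assumed well-known port)
    20: "bulk", 21: "bulk",  # FTP data and control (assumed)
    53: "name",        # DNS (assumed)
    137: "name",       # WINS / netbios-ns (assumed)
}


def category(sport, dport):
    """Classify a flow by whichever port is known, preferring the destination."""
    for port in (dport, sport):
        if port in PORT_CATEGORY:
            return PORT_CATEGORY[port]
    return "other"


def breakdown_by_bytes(flows):
    """Return the fraction of bytes per application category.

    `flows` is an iterable of (sport, dport, nbytes).
    """
    totals = defaultdict(int)
    for sport, dport, nbytes in flows:
        totals[category(sport, dport)] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: value / grand_total for name, value in totals.items()}
```

As the later Windows Traffic slides point out, port numbers alone say little about what runs over ports 139 and 445, so a map like this is only a first-pass classification.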

slide-23
SLIDE 23

Internal Heavy-Weights

net-file: NFS, NCP
backup: Dantz, Veritas

slide-24
SLIDE 24

WAN Heavy-Weights

WAN ≈ web + email

slide-25
SLIDE 25

Breakdown by Flows

name: DNS, WINS
misc: Calendar, CardKey

slide-26
SLIDE 26

Summary of Traffic Breakdown

  • Internal traffic (vs. wide area)

– Higher volume (80% of overall traffic)
– A richer set of applications

  • Traffic heavy-weights

– Internal: network file systems and backup
– WAN: web and email

slide-27
SLIDE 27

Outline

  • Traffic breakdown
  • Origins and locality

– Fan-in/out distribution

  • Individual application characteristics
slide-28
SLIDE 28
slide-29
SLIDE 29

Half of hosts have no wide-area fan-out (in one hour).

slide-30
SLIDE 30

Internal fan-out has a fat tail.

slide-31
SLIDE 31

Most hosts have fan-in of no more than 10.
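
Fan-out and fan-in are counts of distinct peers per host. A minimal Python sketch, assuming each flow is recorded as an (originator, responder) pair; the function name and the sample hosts are illustrative only:

```python
from collections import defaultdict


def fan_out_in(flows):
    """Compute per-host fan-out and fan-in from (originator, responder) pairs.

    Fan-out of a host is the number of distinct hosts it initiates flows to;
    fan-in is the number of distinct hosts that initiate flows to it.
    Hosts that never appear in a role are absent from that dict (count 0).
    """
    peers_out = defaultdict(set)
    peers_in = defaultdict(set)
    for originator, responder in flows:
        peers_out[originator].add(responder)
        peers_in[responder].add(originator)
    fan_out = {host: len(peers) for host, peers in peers_out.items()}
    fan_in = {host: len(peers) for host, peers in peers_in.items()}
    return fan_out, fan_in


if __name__ == "__main__":
    # Made-up flows; repeated pairs count once.
    flows = [("a", "b"), ("a", "c"), ("a", "b"), ("d", "b")]
    fan_out, fan_in = fan_out_in(flows)
    print(fan_out)
    print(fan_in)
```

To separate internal from wide-area fan-out, the same function can be run twice, once on flows whose responder is inside the enterprise and once on the rest. Measuring the share of hosts with no wide-area fan-out additionally needs the full list of traced hosts, since hosts with zero fan-out do not appear in the result.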

slide-32
SLIDE 32

Outline

  • Traffic breakdown
  • Origins and locality

– Fan-in/out distribution

  • Individual application characteristics
slide-33
SLIDE 33

Example Questions

  • Is there a big difference between internal and wide area HTTP traffic?
  • How different are DNS and WINS (netbios/ns)?

  • What does Windows traffic do?
slide-34
SLIDE 34

Internal HTTP traffic

Automated clients vs. the rest:

                     Requests                Bytes
                     D0     D3     D4        D0     D3     D4
Internal Scanners    20%    49%    19%       0.1%   0.9%   1%
Google Devices       37%    8%     5%        96%    69%    48%
Netware iFolder      1%     0.2%   10%       0.0%   0.0%   9%
All other clients    42%    43%    66%       4%     30%    41%

Automated clients dominate the traffic.

slide-35
SLIDE 35

DNS vs. WINS

  • Where do queries come from?

– DNS: both local and remote; most queries come from two mail servers
– WINS: local clients only; queries are more evenly distributed among clients

  • Failure rate (excluding repeated queries)

– DNS: 11-21%
– WINS: 36-50% (!)
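
A failure rate that excludes repeated queries can be sketched as below. The talk does not say how repeats were identified, so this Python example makes two labeled assumptions: a repeat is a query from the same client for the same name (case-insensitive), and a distinct query counts as failed only if none of its repeats succeeded. All names are hypothetical.

```python
def failure_rate(queries):
    """Return the fraction of distinct (client, name) queries that never succeeded.

    `queries` is an iterable of (client, name, succeeded) tuples.
    Assumption: repeats are collapsed by (client, lowercased name).
    """
    outcome = {}
    for client, name, succeeded in queries:
        key = (client, name.lower())
        # A query is considered answered if any of its repeats succeeded.
        outcome[key] = outcome.get(key, False) or succeeded
    if not outcome:
        return 0.0
    failed = sum(1 for ok in outcome.values() if not ok)
    return failed / len(outcome)


if __name__ == "__main__":
    # Made-up log: c1 retries a failing name, c3 fails once and then succeeds.
    log = [
        ("c1", "Foo.example", False),
        ("c1", "foo.example", False),
        ("c2", "bar.example", True),
        ("c3", "baz.example", False),
        ("c3", "baz.example", True),
        ("c4", "qux.example", True),
    ]
    print(f"{failure_rate(log):.0%}")
```

Collapsing repeats matters because a client that retries a failing lookup many times would otherwise inflate the rate.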

slide-36
SLIDE 36

Windows Traffic

[Diagram labels: Port 139, Port 445, Port 135, Dynamic Ports; NETBIOS, CIFS/SMB, DCE/RPC Endpoint Mapper; File Sharing, LAN Browsing, DCE/RPC Services (logon, msgr, etc.)]

Port numbers don’t tell much…

slide-37
SLIDE 37

Windows Traffic

[Diagram labels: Port 139, Port 445, Port 135, Dynamic Ports; NETBIOS, CIFS/SMB, DCE/RPC Endpoint Mapper; File Sharing, LAN Browsing, DCE/RPC Services (logon, msgr, etc.)]

Application level analysis: Bro + binpac

slide-38
SLIDE 38

Windows Traffic Breakdown

  • Majority of CIFS/SMB traffic is for DCE/RPC services

– Rather than file sharing

  • Majority of RPC traffic

– By request: user authentication (netlogon), security policy (lsarpc), and printing (spoolss)
– By size: printing (spoolss)

slide-39
SLIDE 39

Not Covered in This Talk …

  • Characteristics of more applications

– Email
– Network file systems: NFS and NCP
– Backup
– Further details about HTTP, DNS/WINS, and Windows traffic

  • Network congestion
slide-40
SLIDE 40

Conclusion

  • A lot is happening inside the enterprise

– More packets are sent internally than across the border
– A number of applications are seen only within the enterprise

  • Caveats

– One enterprise only
– Inter-subnet traffic
– Hour-long traces
– Subnets not traced all at once

  • Header traces released for download!

– To come: traces with payloads (HTTP, DNS, …)

slide-41
SLIDE 41

The End

To download traces: http://www.icir.org/enterprise-tracing (or search for “LBNL tracing”)