Gigabit Ethernet Interface for the MSR



SLIDE 1

Gigabit Ethernet Interface for the MSR

Fred Kuhns (fredk@arl.wustl.edu)
Applied Research Laboratory
Washington University in St. Louis, Mo.

SLIDE 2

Fred Kuhns - 6/19/2002

Overview

  • Two Example Ethernet Scenarios
  • Relevant Ethernet and ARP specifications
    – General standards
    – ATM AAL5 and Ethernet Frame Formats
    – Ethernet Addressing
    – The Address Resolution Protocol
  • Routing in the MSR
  • GigE Interface protocol processing requirements
    – packet processing
    – ARP

SLIDE 3

Ethernet Forwarding Scenario 1

Figure (network diagram) elements:

  • MSR with ports P0, P1 and P3; Port 1: IP 192.168.204.2, MAC 00:00:5E:04:00:01
  • Ethernet Switch
  • Host: IP 192.168.204.2, MAC 08:00:20:7C:E3:25
  • Host: IP 192.168.204.3, MAC 08:00:20:7C:F2:45
  • Router: Port 0: IP 192.168.204.4, MAC 00:01:03:7C:23:03; Port 1: IP 192.168.150.1, MAC 00:01:03:7C:56:34
  • Ethernet Switch
  • Host: IP 192.168.150.2, MAC 00:40:33:A3:4C:04
  • Host: IP 192.168.150.3, MAC 08:00:20:54:6C:4A

Packet arrives with destination host on the local network. Output port must map the destination IP address to a MAC address.

Destination Addr: 192.168.204.2 (packet: IP hdr, data)

Use the Address Resolution Protocol to map 192.168.204.2 to 08:00:20:7C:E3:25. Encapsulate the datagram in an Ethernet frame and send.

SLIDE 4

Ethernet Forwarding Scenario 2

Figure: the same network diagram as in Scenario 1 (MSR, two Ethernet switches, four hosts and a router).

Destination Addr: 192.168.150.2 (packet: IP hdr, data)

Packet arrives with destination host NOT on a locally attached network. Output port must send to the next hop router. The next hop router's IP address must be used in the ARP request: map 192.168.204.4 to 00:01:03:7C:23:03. Encapsulate the datagram in an Ethernet frame and send. The router then forwards to the final destination host.
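The difference between the two scenarios is only which IP address gets resolved with ARP. A minimal sketch of that decision, assuming a hypothetical helper `arp_target` and the addresses from the scenario captions (in the MSR itself this choice is driven by the route table and sub-port, as described later):

```python
import ipaddress

def arp_target(dst_ip, local_net, next_hop_router):
    """Return the IP address whose MAC must be resolved with ARP."""
    dst = ipaddress.ip_address(dst_ip)
    if dst in ipaddress.ip_network(local_net):
        # Scenario 1: destination host is on the attached network.
        return str(dst)
    # Scenario 2: destination is elsewhere, so resolve the next hop router.
    return next_hop_router

print(arp_target("192.168.204.2", "192.168.204.0/24", "192.168.204.4"))
print(arp_target("192.168.150.2", "192.168.204.0/24", "192.168.204.4"))
```

In both cases the IP datagram keeps its original destination address; only the Ethernet destination differs.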

SLIDE 5

What is required?

  • To simplify the overall system design, Ethernet-specific processing is confined to the Link Interface
  • Interface must:
    – Bridge between ATM and Ethernet networks
    – Map IP addresses to corresponding Ethernet addresses:
        • send ARP requests and maintain an ARP cache
    – Respond to ARP requests from other hosts:
        • send ARP replies in response to requests
  • We are not supporting
    – IEEE 802.1P/Q: VLANs, priorities etc.

SLIDE 6

Related Specifications

  • RFC 1122 - Requirements for Internet Hosts
    – Must send and receive using RFC-894 - compliant
    – Should receive RFC-1042 mixed with RFC-894 - we do not
    – May send using RFC-1042 - we do not
    – Must use ARP - compliant
    – Must flush out-of-date ARP cache entries - not compliant
    – Must prevent ARP floods - we only try once
    – Should have configurable ARP cache timeout - no
    – Should save at least one (latest) unresolved (by ARP) packet - no
    – Must report broadcasts to IP layer - compliant
    – IP layer Must pass TOS to link layer - via the header
    – Must Not report no ARP entry as “destination unreachable” - compliant

SLIDE 7

Related Specifications - continued

  • RFC-826: Address Resolution Protocol
    – Maps <protocol, address> to Ethernet address
    – Minor differences in suggested algorithm
  • RFC 1700: Assigned Numbers - Now an online database
    – Managed by the Internet Assigned Numbers Authority (IANA)
    – Ethertype values and IP to Ethernet multicast address mapping
  • RFC-1812: Requirements for IPv4 Routers
    – Must not believe ARP reply if it contains a multicast or broadcast address - not compliant
    – Must be compliant with RFC 1122 - Partial
  • Support Ethernet V2 only
    – RFC 894: IP encapsulation in Ethernet V2 - Supported
    – RFC 1042: IP encapsulation in 802.3 frames - Not Supported

SLIDE 8

IP over ATM (RFC 791 and RFC 2684)

Figure: an IP datagram carried in an AAL5 frame.

  • IP Header: Version, H-length, TOS, Total length, Identification, flags, Fragment offset, TTL, protocol, Header checksum, Source Address, Destination Address, Options ??
  • IP data (transport header and transport data)
  • AAL5 padding (0 - 40 bytes)
  • AAL5 Trailer: CPCS-UU (0), CPI (0), Length (IP packet + LLC/SNAP), CRC

SLIDE 9

Ethernet Frame Format

Figure: an IP datagram carried in an Ethernet frame.

  • Ethernet Header: Destination Address (6 B), Source Address (6 B), Ether Type (2 B)
  • IP Header: Version, H-length, TOS, Total length, Identification, Flags, Fragment offset, TTL, Protocol, IP Header checksum, IP Source Address, IP Destination Address
  • Transport Header

SLIDE 10

IP Encapsulation in Ethernet Frames

  • Ethernet Encapsulation, RFC 894 - Support
    – dst address (6), src address (6), type 0800, Data (46-1500) including Pad (0-46), FCS (4)
  • IEEE 802.3/802.2 encapsulation, RFC 1042 - Do Not Support
    – dst address (6), src address (6), len (2) with 0 ≤ len ≤ 1500
    – 802.2 LLC: DSAP AA, SSAP AA, ctl 03
    – 802.2 SNAP: Org Code 00, type 0800
    – Data (38 - 1492) including Pad (0-46), FCS (4)
  • Ethernet frame size: 64 - 1518 Bytes
  • if type ≤ 1500, then IEEE frame, otherwise Ethernet V2.
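The type/length rule can be sketched in a few lines. This is an illustration only; the function name `classify_frame` and the sample frames are made up for the example, and the threshold is the one stated on the slide:

```python
import struct

def classify_frame(frame: bytes) -> str:
    """Classify a frame by the 2-byte field that follows the two addresses."""
    _dst, _src, type_or_len = struct.unpack("!6s6sH", frame[:14])
    if type_or_len <= 1500:
        return "IEEE 802.3"      # field is a length (RFC 1042 style)
    return "Ethernet V2"         # field is an EtherType (RFC 894 style)

# Hypothetical sample frames (addresses zeroed, FCS omitted).
frame_v2 = bytes(12) + struct.pack("!H", 0x0800) + bytes(46)
frame_8023 = bytes(12) + struct.pack("!H", 46) + bytes([0xAA, 0xAA, 0x03]) + bytes(43)

print(classify_frame(frame_v2))
print(classify_frame(frame_8023))
```

Since only RFC 894 is supported, an interface following these slides would presumably drop anything classified as IEEE 802.3.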

SLIDE 11

Ethernet Assigned Numbers

  • RFC 1700 obsoleted by online database at IANA:
    – http://www.iana.org/assignments/ethernet-numbers
  • Ethernet Address - 6 octets:
    – 3 high-order octets = Organizationally Unique Identifier (OUI)
    – 3 low-order octets = the interface number
  • Multicast bit = lsb of the most significant byte (xxxx xxx1)
    – first byte odd => multicast or broadcast
    – first byte even => unicast address
    – multicast address = ((OUI | 0x010000) << 24) | Group_ID
  • Ethernet Broadcast: FF:FF:FF:FF:FF:FF
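A small sketch of the two rules above: testing the multicast bit and forming a multicast address from an OUI and a group ID. The helper names are hypothetical, and the group ID is assumed to fit in the 24 low-order bits:

```python
def is_group_address(mac: str) -> bool:
    """True for multicast or broadcast: lsb of the first octet is 1."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)

def multicast_address(oui: int, group_id: int) -> str:
    """Set the multicast bit in the 24-bit OUI and append the group ID."""
    value = ((oui | 0x010000) << 24) | (group_id & 0xFFFFFF)
    return ":".join(f"{b:02X}" for b in value.to_bytes(6, "big"))

print(is_group_address("FF:FF:FF:FF:FF:FF"))   # broadcast
print(is_group_address("08:00:20:7C:E3:25"))   # unicast host address
print(multicast_address(0x00005E, 0x000001))
```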

SLIDE 12

IP and Ethernet Multicast

  • IANA has allocated address block with OUI = 00:00:5E
    – Used for unicast addresses for “IETF standards track protocols”
    – Half of the multicast addresses reserved for IP, the remainder for “special use”. Leaves 23 bits for multicast addresses:
        • 01:00:5E:00:00:00 to 01:00:5E:7F:FF:FF
    – Could use this block for our interface, see ethernet numbers
  • IP Multicast
    – Class D address, 0xE0000000 + 28 Bit Group ID
    – 224.0.0.0 to 239.255.255.255 (0xE0000000 - 0xEFFFFFFF)
  • IP to Ethernet Mapping
    – RFC 1112 - Host Extensions for IP Multicasting
    – Non-unique mapping: 28 bit IP group to 23 bit Ethernet group
        • 32 IP multicast groups per mapped Ethernet multicast address.

SLIDE 13

Multicast: IP to Ethernet Mappings

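  • Network Byte Ordering, Internet Standard Bit order (Big-Endian)
  • Ethernet multicast address (bits 47 to 24 hold the IANA block with the Multicast Bit set; figure labels: Multicast Bit, Internet Bit):
      0000 0001 0000 0000 0101 1110 0xxx xxxx xxxx xxxx xxxx xxxx
  • IP Class D (Multicast) address:
      1110 xxxx xxxx xxxx xxxx xxxx xxxx xxxx
  • The low-order 23 bits of the IP address are copied into the low-order 23 bits of the Block of Ethernet Multicast Addresses; the 5 bits following the Class D prefix are Not Used in IP-to-Ethernet Mapping.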
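The mapping can be written out directly: keep the low-order 23 bits of the group address and place them under the 01:00:5E prefix. A minimal sketch (the function name is an assumption for illustration), which also shows why the mapping is not unique:

```python
import ipaddress

def ip_multicast_to_mac(group: str) -> str:
    """Map an IPv4 Class D address to its Ethernet multicast address (RFC 1112)."""
    addr = ipaddress.IPv4Address(group)
    if not addr.is_multicast:
        raise ValueError(f"{group} is not a Class D address")
    low23 = int(addr) & 0x7FFFFF           # the 5 bits after '1110' are discarded
    value = 0x01005E000000 | low23
    return ":".join(f"{b:02X}" for b in value.to_bytes(6, "big"))

print(ip_multicast_to_mac("224.0.0.1"))
print(ip_multicast_to_mac("224.128.0.1"))   # differs only in the discarded bits
print(ip_multicast_to_mac("239.255.255.255"))
```

The first two groups map to the same Ethernet address, which is one instance of the 32-to-1 overlap.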

SLIDE 14

IP Broadcast

  • No Direct Impact on GigE Interface
  • IP Broadcast: by default, we will not forward directed broadcasts.
    – Limited broadcast:
        • {-1, -1}. Must not be forwarded; destination address only.
    – Directed broadcast:
        • {Network-Number, -1}; destination address only.
    – Subnet Directed Broadcast:
        • {Network-Number, Subnet-Number, -1}
    – Directed Broadcast to all subnets:
        • {Network-Number, -1, -1}
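In the {network, host} notation above, -1 means a field of all ones. A rough sketch of telling the limited broadcast apart from a directed broadcast for one attached network; `broadcast_kind` and the example prefix are assumptions for illustration, not part of the MSR software:

```python
import ipaddress

def broadcast_kind(dst: str, network: str) -> str:
    """Classify a destination address relative to one attached network."""
    addr = ipaddress.IPv4Address(dst)
    net = ipaddress.IPv4Network(network)
    if int(addr) == 0xFFFFFFFF:
        return "limited"          # {-1, -1}: never forwarded
    if addr == net.broadcast_address:
        return "directed"         # {Network-Number, -1}
    return "not broadcast"

print(broadcast_kind("255.255.255.255", "192.168.204.0/24"))
print(broadcast_kind("192.168.204.255", "192.168.204.0/24"))
print(broadcast_kind("192.168.204.2", "192.168.204.0/24"))
```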

SLIDE 15

Unicast - We can use the IANA Block

  0000 0000 0000 0000 0101 1110 0000 0100 xxxx xxxx xxxx xxxx
  (MSB, bit 47, on the left; LSB on the right)

  • Multicast Bit set to 0
  • High-order 32 bits: IANA Block of Ethernet Addresses (00:00:5E:04)
  • Low-order 16 bits: ARL Interface Number

WUARL MAC: 00:00:5E:04:XX:XX

SLIDE 16

ARP Frame

  • Ethernet header:
    – Destination Address (6B)
    – Source Address (6B)
    – Ether Type (2B)
  • ARP message:
    – Hardware Address Space (2B)
    – Protocol Address Space (2B)
    – Byte length of Hardware address = 6 (1B)
    – Byte length of Protocol address = 4 (1B)
    – Operation Code 1|2 (2B)
    – Hardware Address of Sender (6 B)
    – Protocol Address of Sender (4 B)
    – Hardware Address of Destination (6 B)
    – Protocol Address of Destination (4 B)

SLIDE 17

ARP Message Formats

Host A (Eth <eth-A>, IP <ip-A>) sends a Request (01); Host B (Eth <eth-B>, IP <ip-B>) answers with a Reply (02).

ARP Request

  • Ethernet header: dst address ff:ff:ff:ff:ff:ff, src address <eth-A>, type 806
  • ARP message: has 1, pas 0800, hl 6, pl 4, op 01, sha <eth-A>, spa <ip-A>, tha <??>, tpa <ip-B>
  • pad, FCS xx

ARP Reply

  • Ethernet header: dst address <eth-A>, src address <eth-B>, type 806
  • ARP message: has 1, pas 0800, hl 6, pl 4, op 02, sha <eth-B>, spa <ip-B>, tha <eth-A>, tpa <ip-A>
  • pad, FCS xx

Ethernet Frame with ARP Request/Reply - 64 Bytes

  • Ethernet Header (14 B)
  • Ethernet Data - Pad with zeros to 46 Bytes: ARP Message (28 Bytes for Request or Reply) + 18 Byte Pad
  • FCS (4B)
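To make the byte counts concrete, here is a sketch that assembles an ARP Request with the field values listed above. The function name and the example addresses are assumptions; the FCS is left out because it is normally appended by the Ethernet hardware:

```python
import socket
import struct

def build_arp_request(sender_mac: bytes, sender_ip: str, target_ip: str) -> bytes:
    """Build an ARP Request frame without FCS (14 B header + 46 B data)."""
    eth_header = b"\xff" * 6 + sender_mac + struct.pack("!H", 0x0806)
    arp = struct.pack(
        "!HHBBH6s4s6s4s",
        1,                        # has: Ethernet
        0x0800,                   # pas: IP
        6, 4,                     # hl, pl
        1,                        # op: Request
        sender_mac, socket.inet_aton(sender_ip),
        bytes(6),                 # tha: unknown, sent as zeros
        socket.inet_aton(target_ip),
    )
    pad = bytes(46 - len(arp))    # 18 zero bytes
    return eth_header + arp + pad

frame = build_arp_request(bytes.fromhex("00005E040001"), "192.168.204.1", "192.168.204.2")
print(len(frame), len(frame) + 4)   # without and with the 4-byte FCS
```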

SLIDE 18

Packet Routing, SPC and FPX

Figure labels: Ingress and Egress paths, each with a Link Interface, an FPX (FIPL; add shim, rem shim, shim proc.; FPX_VCI) and an SPC (shim demux, shim update; IP proc plugins), joined by the WUGS. Traffic arrives from the previous hop router or endsystem.

  • VCIs across the WUGS: 40 ... 47 (out port + 40) and 40 ... 47 (in port + 40)
  • Outbound VC = SPI + 50, 0 <= SPI <= 3
  • IP eval: IP processing for FPX.
    – 1. Broadcast and Multicast destination address
    – 2. IP options
    – 3. Packet not recognized
  • Current VCI Support: 1) 8 Ports (PN) 2) 4 sub-ports (SP)
  • GigE interface will use all four Sub-Port identifiers (i.e. four VCI values)
  • GigE Interface will only send on one VCI value (currently = 50)

SLIDE 19

Routing in the MSR

  • Route tables must map a given destination address to an output port and sub-port identifier (i.e. Virtual Interface Number or VIN).
    – Route table entry: {prefix/length, Output_VIN}
        • 192.168.204.0/24, 41 (Port 1, Subport 1)
        • Output_VIN = {Port # (10 bits), Sub-port # (6 bits)}.
  • At input port, packet is sent to the indicated output port:
    – VCI = 40 + Port number
  • At output port, the sub-port is mapped to an output VCI value:
    – VCI = 50 + Sub-port Identifier
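The two VCI calculations can be sketched together with a toy route table. This assumes the port number sits in the high-order 10 bits of the VIN; under that assumption Port 1, Subport 1 packs to 0x41, which would match the "41" in the example entry if that value is hexadecimal. The second route and all function names are hypothetical:

```python
import ipaddress

PORT_VCI_BASE = 40      # VCI = 40 + port number (input port to output port)
SUBPORT_VCI_BASE = 50   # VCI = 50 + sub-port identifier (output port to link)

def make_vin(port: int, subport: int) -> int:
    """Pack {Port # (10 bits), Sub-port # (6 bits)}; port-high layout is assumed."""
    return (port << 6) | subport

def split_vin(vin: int):
    return vin >> 6, vin & 0x3F

ROUTES = [
    (ipaddress.ip_network("192.168.204.0/24"), make_vin(1, 1)),   # entry from the slide
    (ipaddress.ip_network("192.168.150.0/24"), make_vin(4, 0)),   # hypothetical entry
]

def lookup(dst: str):
    """Longest-prefix match; returns the Output_VIN or None."""
    addr = ipaddress.ip_address(dst)
    matches = [(net.prefixlen, vin) for net, vin in ROUTES if addr in net]
    return max(matches)[1] if matches else None

def vcis_for(dst: str):
    """Return (VCI used across the switch, VCI used toward the link interface)."""
    vin = lookup(dst)
    if vin is None:
        return None
    port, subport = split_vin(vin)
    return PORT_VCI_BASE + port, SUBPORT_VCI_BASE + subport

print(hex(make_vin(1, 1)), vcis_for("192.168.204.2"), vcis_for("192.168.150.2"))
```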

SLIDE 20

MSR Routing Example

Figure: Control Processor, Switch Fabric, an Input Port Proc. and an Output Port Proc. (each with Flow/Route Lookup, Flow Lookup, Dist. Q. Ctl. and an SPC). The packet moves through these steps:

  • Packet received at input port: [IP hdr | data]
  • Route lookup returns <port = 4, subport = 0>
  • Insert shim with OutVIN (4, 0), send on VCI = 40 + 4 = 44
  • At the output port: remove shim, calculate VCI for subport 0: VCI = 50 + 0 = 50
  • Send to next hop/endsystem: [IP hdr | data]

SLIDE 21

Supporting Ethernet

  • We can leverage the sub-port identifier to facilitate IP to Ethernet address resolution.
    – if packet received on VCI = 50 (subport 0), then use the IP destination address in the header
    – otherwise (subports 1-3), look up the VCI value in a table to obtain the next hop IP address.
  • Once we have the IP address we must map it to the corresponding Ethernet address.
    – We can then implement a simple version of ARP in the GigE interface card.

SLIDE 22

GigE Link Interface

Direction: From FPX/SPC (ATM) To Next Hop or Endstation (Ethernet).

Incoming: [IP Header | data | AAL5 trailer]; outgoing: [Ethernet | IP Header | data]

  • Pkt VC = 50: Endsystem, Broadcast or Multicast address. Send to pkt->dst.
  • if VC != 50: lookup VC in VIN table, which returns the IP used for ARP lookup.
  • if bcast or mcast, map to eaddr (map multicast or broadcast to Ethernet address); else unicast, resolve w/ARP.
  • If ARP table lookup fails, send ARP request to broadcast address, drop packet. No retries are made. No ARP entry aging!
  • Add Ethernet header using the derived destination address and our corresponding source address.
  • Software creates VIN table at boot time by writing to interface.

VIN Table (Simplified), used when sending to a next hop router:

  Entry | Next hop IP | VC
  0     | NhIP0       | NH #0 = Base + 1 = 51
  1     | NhIP1       | NH #1 = Base + 2 = 52
  2     | NhIP2       | NH #2 = Base + 3 = 53

ARP Table (Simplified): IP1 → MAC1, ..., IPM → MACM

SLIDE 23

GigE Link Interface

Direction: From Next Hop or Endstation (Ethernet) to FPX/SPC (ATM, Base VC).

Incoming: [Ethernet | ARP] or [Ethernet | IP Header | data]; outgoing: [IP Header | data | AAL5 trailer]

  if (an ARP packet)
      update mapping in ARP table
      if not for us then drop
      if (ARP Request)
          swap source and target info
          set operation to Reply
          set ether header
          send reply
  else if (an IPv4 packet)
      remove ethernet “stuff”
      add AAL5 trailer/padding
      send on default Base VC
  else
      drop packet

ARP Table (Simplified): IP1 → MAC1, ..., IPM → MACM
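The receive-side logic above could look roughly like the following in software. This is a sketch under several assumptions: the function name, the return convention and the dictionary-based ARP table are invented for the example, and the AAL5 trailer is not generated (the IP datagram is simply extracted using its Total length field):

```python
import struct

ETHERTYPE_IP, ETHERTYPE_ARP = 0x0800, 0x0806
ARP_REQUEST, ARP_REPLY = 1, 2
ARP_FORMAT = "!HHBBH6s4s6s4s"    # 28-byte ARP message for Ethernet/IPv4

def handle_ethernet_frame(frame, my_mac, my_ip, arp_table):
    """Return ("reply", frame), ("to_atm", ip_datagram) or ("drop", None)."""
    _dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    payload = frame[14:]
    if ethertype == ETHERTYPE_ARP:
        has, pas, hl, pl, op, sha, spa, _tha, tpa = struct.unpack(ARP_FORMAT, payload[:28])
        arp_table[spa] = sha                     # update mapping in ARP table
        if tpa != my_ip:
            return ("drop", None)                # not for us
        if op == ARP_REQUEST:
            # Swap source and target info and set operation to Reply.
            reply = struct.pack(ARP_FORMAT, has, pas, hl, pl, ARP_REPLY,
                                my_mac, my_ip, sha, spa)
            header = sha + my_mac + struct.pack("!H", ETHERTYPE_ARP)
            return ("reply", header + reply + bytes(18))
        return ("drop", None)                    # a Reply: mapping stored, frame consumed
    if ethertype == ETHERTYPE_IP:
        total_length = struct.unpack("!H", payload[2:4])[0]
        return ("to_atm", payload[:total_length])   # strip Ethernet header and pad
    return ("drop", None)
```

A caller would pass the interface's own MAC and IP address as raw bytes and would send any "to_atm" result on the Base VC after adding the AAL5 padding and trailer.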

SLIDE 24

Some Details

  • Packet Received on ATM interface:
    – If received on VCI 50 (i.e. the base VCI) then
        • Map IP destination in header (ip->dst_addr) to Ethernet MAC address.
        • Unicast uses ARP table, multicast and broadcast use appropriate mapping.
    – Otherwise (VCI = {51, 52 or 53}),
        • lookup VC in VIN table: Table entry index = RX_VC - Base_VC - 1. For example, packet received on VCI = 53, Index = 53 - 50 - 1 = 2
        • ARP the resulting Next Hop IP address.
    – This permits a simple mechanism for “directing” traffic to a gateway. This allows us to support directed broadcast and provides a convenient mechanism for testing.
  • Packet Received on Ethernet interface:
    – if IPv4 then send all (unicast, multicast and broadcast) to input port processor on VCI 50 (i.e. the Base VCI)

SLIDE 25

ARP Cache

  • IP Address = Network_Prefix.Host or simply Net.Host
    – Assume a prefix length of at least 24 bits, which leaves 8 bits for the host
    – An interface can have at most 3 unique IP addresses
        • Interface may communicate with at most 256 hosts per network
        • Implement ARP cache as a table with 768 entries (3 * 256)
        • See next slide

VIN Table

  Entry Number | Prefix Mask | Local IP Address | Next Hop IP Address
  0            | Mask0       | MyIP0            | NH0
  1            | Mask1       | MyIP1            | NH1
  2            | Mask2       | MyIP2            | NH2

  Net 0 = Mask0 & MyIP0, Net 1 = Mask1 & MyIP1, Net 2 = Mask2 & MyIP2

ARP Table

  Network | IP                | Ethernet
  Net 0   | IP0,0 ... IP0,255 | Ether0,0 ... Ether0,255
  Net 1   | IP1,0 ... IP1,255 | Ether1,0 ... Ether1,255
  Net 2   | IP2,0 ... IP2,255 | Ether2,0 ... Ether2,255

SLIDE 26

Implementing the ARP Table

  ‘get next packet’:
    // received frame from ATM interface
    if (RX_VC == Base_VC)
        ipdst = ip->dst_addr;
    else
        ipdst = VIN_Table[RX_VC - Base_VC - 1].NextHop;
    // ipdst == IP address of host we must send packet to
    // determine network, using the VIN table (Maski, MyIPi = entry i)
    for (i = 0; i < 3; i++) {
        // i corresponds to the Network Number (0 - 2)
        if ((ipdst & Maski) == (MyIPi & Maski)) {
            // low 8 bits of the host part select the entry within network i
            index = (i << 8) | ((ipdst & ~Maski) & 0x000000ff);
            break;
        }
    }
    if (i == 3)
        drop packet, goto ‘get next packet’
    // lookup in ARP table
    if (ArpTable[index].EtherAddress != 00:00:00:00:00:00) {
        construct ethernet frame
        send packet
        goto ‘get next packet’
    } else {
        send ARP Request for ipdst
        drop packet, goto ‘get next packet’
    }

Figure: the VIN Table (Entry Number, Prefix Mask, Local IP Address, Next Hop IP Address) selects the network, and index selects the row of the ARP Table (Net 0, Net 1, Net 2), so the ARP Table does not need to store the IP address.
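A runnable version of the same lookup helps check the index arithmetic. The VIN table contents below are hypothetical placeholders, and `None` stands in for the all-zero Ethernet address that marks an empty entry:

```python
import ipaddress

BASE_VC = 50

def ip(text: str) -> int:
    return int(ipaddress.IPv4Address(text))

# Hypothetical VIN table: (prefix mask, local IP, next hop IP) per entry.
VIN_TABLE = [
    (ip("255.255.255.0"), ip("192.168.204.1"), ip("192.168.204.4")),
    (ip("255.255.255.0"), ip("192.168.150.1"), ip("192.168.150.9")),
    (ip("255.255.255.0"), ip("10.0.0.1"), ip("10.0.0.254")),
]
ARP_TABLE = [None] * 768            # 3 networks * 256 hosts

def arp_index(ipdst: int):
    """Return the ARP table index for ipdst, or None if on no attached network."""
    for i, (mask, my_ip, _next_hop) in enumerate(VIN_TABLE):
        if (ipdst & mask) == (my_ip & mask):
            return (i << 8) | (ipdst & ~mask & 0xFF)
    return None

def resolve(rx_vc: int, ip_dst: int):
    """Decide what to do with a packet received from the ATM side."""
    ipdst = ip_dst if rx_vc == BASE_VC else VIN_TABLE[rx_vc - BASE_VC - 1][2]
    index = arp_index(ipdst)
    if index is None:
        return ("drop", None)
    mac = ARP_TABLE[index]
    if mac is None:
        return ("arp_request", ipdst)   # the packet itself is dropped, no retry
    return ("send", mac)

print(arp_index(ip("192.168.204.2")), arp_index(ip("192.168.150.3")))
```

Because the index is derived from the address, a hit needs no comparison against a stored IP address, which is the point of the "don't need to store IP address" note.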

SLIDE 27

Notes and Issues

  • GigE Control Interface for Software configuration.
    – 1. Reset interface to defaults
    – 2. Clear ARP cache
    – 3. Read ARP table
    – 4. Read VIN table
    – 5. Read Ethernet address
    – 6. Set VIN table entries and other registers
        • Set BASE VC (currently 50)
        • Set Entries in the VIN table
        • Add static ARP entries

SLIDE 28

Hardware and Status

  • Software Simulation completed
  • Hardware implementation and status: Dave ...