Switch ON: Ayush Jain (aj2672), Donovan Chan (dc3095), Shivam Choudhary (sc3973)

slide-1
SLIDE 1

Switch ON

Ayush Jain (aj2672), Donovan Chan (dc3095), Shivam Choudhary (sc3973)

slide-2
SLIDE 2

An extremely efficient Hardware Switch!!

slide-3
SLIDE 3

Architecture

➔ Userspace generates packets.
➔ Input module sorts and places in RAM(s).
➔ Scheduler avoids collisions between packets.
➔ Buffer stores data in output RAM(s).

slide-4
SLIDE 4

Hardware Communication Protocol

Header Format
slide-5
SLIDE 5

Hardware Scheduler - Single Input Queue

➔ Source and destination are each modelled as 4 RAM(s).
➔ Individual scheduling to prevent collisions.
➔ Greedy; no optimization for head-of-line blocking.
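
The single-input-queue behaviour described above can be sketched in software. This is a minimal simulation, not the hardware implementation: the function name `schedule_fifo`, the fixed priority by input index, and the one-packet-per-output-per-slot rule are assumptions made for illustration.

```python
from collections import deque

NUM_PORTS = 4  # the slides model 4 source and 4 destination RAMs


def schedule_fifo(queues):
    """Greedy, collision-free schedule over one FIFO queue per input.

    queues: list of deques, one per input port; each entry is the
    destination port of a queued packet (hypothetical representation).
    Returns a list of time slots, each a list of (input, output) pairs.
    """
    slots = []
    while any(queues):
        busy_outputs = set()
        transfers = []
        for inp, q in enumerate(queues):
            # Only the head packet is eligible, so a blocked head stalls
            # every packet queued behind it (head-of-line blocking).
            if q and q[0] not in busy_outputs:
                out = q.popleft()
                busy_outputs.add(out)
                transfers.append((inp, out))
        slots.append(transfers)
    return slots


if __name__ == "__main__":
    demo = [deque([0, 1]), deque([0, 2]), deque(), deque()]
    for t, transfers in enumerate(schedule_fifo(demo)):
        print(t, transfers)
```

In the demo, input 1 has a packet for the idle output 2, but it cannot be sent in the first slot because the packet ahead of it is waiting for output 0.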

slide-6
SLIDE 6

Hardware Scheduler - PPS

➔ Modelled as 16 RAMs at the input (Parallel Packet Switch).
➔ Destination still modelled as 4 RAMs.
➔ Prevents head-of-line (HOL) blocking, hence improves throughput.
➔ Requires additional hardware complexity and storage.
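
A rough software sketch of the 16-RAM arrangement follows. It assumes the 16 RAMs correspond to one queue per (input, output) pair, which the slides do not state explicitly; the names `make_queues` and `schedule_per_destination` and the lowest-index-first scan order are also illustrative assumptions.

```python
from collections import deque

NUM_PORTS = 4  # 4 inputs x 4 outputs = 16 input-side queues


def make_queues(packets):
    """Sort (input, output) packets into 16 queues, one per port pair."""
    voq = [[deque() for _ in range(NUM_PORTS)] for _ in range(NUM_PORTS)]
    for seq, (inp, out) in enumerate(packets):
        voq[inp][out].append(seq)  # seq is just an arrival index
    return voq


def schedule_per_destination(voq):
    """Greedy schedule: each input sends to the first free output it has data for."""
    slots = []
    while any(q for row in voq for q in row):
        busy_outputs = set()
        transfers = []
        for inp in range(NUM_PORTS):
            for out in range(NUM_PORTS):
                if voq[inp][out] and out not in busy_outputs:
                    voq[inp][out].popleft()
                    busy_outputs.add(out)
                    transfers.append((inp, out))
                    break  # at most one packet per input per slot
        slots.append(transfers)
    return slots


if __name__ == "__main__":
    traffic = [(0, 0), (0, 1), (1, 0), (1, 2)]
    for t, transfers in enumerate(schedule_per_destination(make_queues(traffic))):
        print(t, transfers)
```

Because a packet for a busy output no longer blocks packets for other outputs, input 1 can send to output 2 while output 0 is occupied. The cost is the extra storage: 16 queues instead of 4.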

slide-7
SLIDE 7

PPS vs Single Input Queue

slide-8
SLIDE 8

PPS vs Single Input Queue (contd.)

Worst-case Performance | Average Performance | Best-case Performance

➔ Better average case performance for PPS.
➔ Higher variance.
➔ Same for PPS and Single Input Queue.
➔ Such a case is theoretically less probable to occur.

slide-9
SLIDE 9

Timing Diagrams - RAM(altsync)

Input Signals - wren, wraddress, data, rden, rdaddress.
Output Signals - q (data appears one clock cycle after the read is issued).
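
The one-cycle read latency can be illustrated with a small behavioural model. This is a sketch only: the signal names come from the slide, but the class `SyncRam`, the `clock` method, and the "read returns old data on a simultaneous write" behaviour are assumptions, not a description of the actual RAM block.

```python
class SyncRam:
    """Behavioural model of a dual-port RAM with a registered read output."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.q = 0  # output register; keeps its value when rden is low

    def clock(self, wren=0, wraddress=0, data=0, rden=0, rdaddress=0):
        """Apply the input signals at one rising clock edge.

        After the call, self.q holds the value visible during the
        following cycle, i.e. one clock after the read was requested.
        """
        if rden:
            # Assumed read-during-write behaviour: the read samples the
            # contents from before this edge's write.
            self.q = self.mem[rdaddress]
        if wren:
            self.mem[wraddress] = data


if __name__ == "__main__":
    ram = SyncRam(depth=8)
    ram.clock(wren=1, wraddress=3, data=0xAB)  # write cycle
    ram.clock(rden=1, rdaddress=3)             # read request
    print(hex(ram.q))                          # data is available now
```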

slide-10
SLIDE 10

Timing Diagrams - Scheduler

Case: Signals for different output ports.

slide-11
SLIDE 11

Timing Diagrams - Scheduler

Case: Signals for same output ports.

slide-12
SLIDE 12

Timing Diagrams - Full Suite

slide-13
SLIDE 13

Validator

slide-14
SLIDE 14

DEMO

slide-15
SLIDE 15

Results

slide-16
SLIDE 16

Performance Constraints

  • RAMs take up three clock cycles to change from one read location to another.
  • Transfers are restricted to 32 bits at a time because of ioctl calls.
  • Can also increase performance if we increase the number of parts at the cost of hardware complexity.
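
To make the 32-bit restriction concrete, the sketch below splits a payload into 32-bit words, one word per transfer. The helper names `to_words` and `from_words`, the little-endian byte order, and the zero padding of the last word are assumptions; the actual ioctl interface is not shown in the slides and is not modelled here.

```python
import struct


def to_words(payload):
    """Split a bytes payload into 32-bit words (assumed little-endian).

    The tail is zero-padded to a multiple of 4 bytes (an assumption).
    """
    padded = payload + b"\x00" * (-len(payload) % 4)
    return [word for (word,) in struct.iter_unpack("<I", padded)]


def from_words(words, length):
    """Reassemble the original payload from words and its byte length."""
    return struct.pack("<%dI" % len(words), *words)[:length]


if __name__ == "__main__":
    packet = b"\x01\x02\x03\x04\x05"
    words = to_words(packet)
    # Each word would need its own transfer under the 32-bit limit.
    print(len(words), [hex(w) for w in words])
    print(from_words(words, len(packet)) == packet)
```

A payload of n bytes therefore needs ceil(n / 4) separate transfers, which is why per-call overhead limits throughput.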

slide-17
SLIDE 17

Lessons Learned

It is named hard-ware for a reason.

Timing diagrams save time.

Simulations may be far from reality.

You will often reduce to hard problems in polynomial time.

Documentation needs a lot of work.

slide-18
SLIDE 18

Future Work

Analyze and compare the performance for a maximum bipartite matching solution.

Implement DMA.

Produce test results for larger amounts of data and different scenarios.

Interface with Ethernet ports and test with real network.
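
As an illustration of the maximum bipartite matching item above, the sketch below compares a greedy per-slot assignment with a maximum matching found by augmenting paths (Kuhn's algorithm). The slides do not say which matching algorithm is intended, so this choice, the function names, and the `requests` representation (one set of wanted outputs per input) are all assumptions.

```python
def greedy_matching(requests):
    """Each input takes the lowest-numbered free output it wants."""
    match = {}  # output -> input
    for inp, outs in enumerate(requests):
        for out in sorted(outs):
            if out not in match:
                match[out] = inp
                break
    return match


def maximum_matching(requests):
    """Maximum bipartite matching via augmenting paths (Kuhn's algorithm)."""
    match = {}  # output -> input

    def try_assign(inp, seen):
        for out in sorted(requests[inp]):
            if out in seen:
                continue
            seen.add(out)
            # Take a free output, or re-route its current owner elsewhere.
            if out not in match or try_assign(match[out], seen):
                match[out] = inp
                return True
        return False

    for inp in range(len(requests)):
        try_assign(inp, set())
    return match


if __name__ == "__main__":
    requests = [{0, 1}, {0}]  # input 0 wants outputs 0 or 1; input 1 wants 0
    print(len(greedy_matching(requests)), len(maximum_matching(requests)))
```

In this example the greedy pass gives output 0 to input 0 and leaves input 1 idle, while the maximum matching moves input 0 to output 1 so that both inputs transmit in the same slot.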

slide-19
SLIDE 19

Thank You!!

Code available on GitHub:
https://github.com/shivamchoudhary/SwitchON
https://github.com/shivamchoudhary/SwitchONHW