NetFPGA Summer Course, presented by Noa Zilberman and Yury Audzevich



slide-1
SLIDE 1

Summer Course Technion, Haifa, IL 2015

1

NetFPGA Summer Course

Presented by: Noa Zilberman Yury Audzevich Technion August 2 – August 6, 2015

http://NetFPGA.org

slide-2
SLIDE 2


HARDWARE

NetFPGA SUME

slide-3
SLIDE 3


Outline

  • High Level Block Diagram
  • FPGA
  • Memory Subsystem
  • Serial Interfaces
  • Storage
  • Configuration
  • Clocks
  • Status Indications
  • Power
  • Misc
slide-4
SLIDE 4


Block Diagram

slide-5
SLIDE 5


FPGA

slide-6
SLIDE 6


FPGA- Virtex-7 690T

  • Virtex-7 FPGA introduced in 2012
  • 28nm process
  • 690K Logic cells
  • 866K CLB FF
  • 52Mb RAM
  • 3600 DSP slices
  • 3x PCIe Gen 3 hard cores

  • 850 I/O
  • 36 GTH transceivers
slide-7
SLIDE 7


Virtex 7 CLB

  • CLB – Configurable Logic Block
  • The main logic resource
  • Usually assigned without user intervention
  • Each CLB contains:

– 2 slices
– 8 LUTs (6 inputs)
– 16 Flip Flops
– 2 Arithmetic and carry chains
– 256b distributed RAM
– 128b shift registers

  • Refer to Xilinx’s UG474
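
A 6-input LUT can be thought of as a 64-entry truth table indexed by its inputs. The sketch below models that idea in Python; the function names and the choice of input 0 as the least significant address bit are assumptions made for illustration, not taken from the slides.

```python
def lut6(init: int, inputs: tuple) -> int:
    """Evaluate a 6-input LUT.

    `init` is the 64-bit truth table; `inputs` holds six bits, with
    inputs[0] used as the least significant address bit (an assumption).
    """
    if len(inputs) != 6:
        raise ValueError("a LUT6 takes exactly six inputs")
    address = sum((bit & 1) << i for i, bit in enumerate(inputs))
    return (init >> address) & 1


def truth_table(func) -> int:
    """Build the 64-bit table for any Boolean function of six bits."""
    table = 0
    for address in range(64):
        bits = tuple((address >> i) & 1 for i in range(6))
        if func(*bits):
            table |= 1 << address
    return table


# Two example functions, each fitting in a single LUT.
AND6 = truth_table(lambda a, b, c, d, e, f: a & b & c & d & e & f)
XOR6 = truth_table(lambda a, b, c, d, e, f: a ^ b ^ c ^ d ^ e ^ f)

if __name__ == "__main__":
    print(hex(AND6), lut6(AND6, (1, 1, 1, 1, 1, 1)))
    print(hex(XOR6), lut6(XOR6, (1, 0, 0, 0, 0, 0)))
```

Any function of six variables costs the same single LUT, which is why the tools can usually map logic to CLBs without user intervention.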
slide-8
SLIDE 8


Memory Subsystem

slide-9
SLIDE 9


Memory Interfaces

  • DRAM:

2 x DDR3 SoDIMM 1866MT/s, 4GB (supports up to 32GB)

  • SRAM:

3 x 9MB QDRII+, 500MHz

slide-10
SLIDE 10


DRAM

  • Dynamic RAM
  • Based on capacitors, holding charge
  • SDRAM – Synchronous DRAM
  • DDR – Double Data Rate

– Two data transfers in every clock cycle

  • Rising edge & falling edge

[Figure: DRAM cell with bit line and row select]

slide-11
SLIDE 11


DDR SDRAM – Prefetch Buffer

  • Fetching a single data word takes time…

– Any additional data word on the same row comes with minimal “cost”

  • Idea: with every access, read several adjacent data words

– Without individual column request

  • A prefetch buffer holds the fetched words until they are transmitted
  • Prefetch buffer depth is typically the ratio between the I/O data rate and the core memory frequency
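
The depth rule can be checked numerically. The sketch below is a minimal illustration, assuming a DDR interface (two transfers per I/O clock) and a 64-bit data bus for the bandwidth figure; the function names are hypothetical, and the 800MHz / 200MHz values are typical DDR3-1600 numbers.

```python
def prefetch_depth(io_clock_mhz: float, core_clock_mhz: float,
                   transfers_per_clock: int = 2) -> int:
    """Words fetched per core access so the I/O bus stays busy."""
    data_rate_mt_s = io_clock_mhz * transfers_per_clock
    depth, remainder = divmod(data_rate_mt_s, core_clock_mhz)
    if remainder:
        raise ValueError("data rate is not a multiple of the core clock")
    return int(depth)


def peak_bandwidth_gbytes(data_rate_mt_s: float, bus_width_bits: int = 64) -> float:
    """Peak transfer rate in GB/s; the 64-bit bus width is an assumption."""
    return data_rate_mt_s * 1e6 * bus_width_bits / 8 / 1e9


if __name__ == "__main__":
    # 800MHz I/O clock, double data rate, 200MHz core: 8 words per access.
    print(prefetch_depth(800, 200))
    print(peak_bandwidth_gbytes(1600))
```

Note that the ratio uses the data rate (1600MT/s), not the I/O clock (800MHz); dividing the clocks alone would give 4 rather than 8.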


slide-12
SLIDE 12


DDR SDRAM

  • DDR3 SDRAM - Prefetch buffer size is 8n

– Example:

  • Clock rate is 800MHz
  • Data rate is 1600Mbps x bus width
  • Core rate is 200MHz

slide-13
SLIDE 13


DRAM Modules

  • Consumer and networking applications typically use DRAM devices (components).
  • Computing applications typically use DRAM modules.
  • DIMM – Dual In-line Memory Module

– Replaced SIMM – Single In-line Memory Module
– DIMM has separate electrical contacts on each side of the module.

  • SO-DIMM

– Small Outline DIMM
– Usually used in mobile computers


slide-14
SLIDE 14


DRAM Frequency Errata

  • Xilinx currently has an erratum for MIG 7 series DDR3
  • AR#59167
  • Triggered by aggressive data patterns (PRBS23)
  • Caused by loss on the channel and skew with the FPGA
  • Workaround: max data rate is 1700MT/s
slide-15
SLIDE 15


SRAM

  • Static RAM
  • Based on transistors (Flip-Flops)
  • Saving state
  • Less dense and more expensive than DRAM
slide-16
SLIDE 16


QDR SRAM

  • QDR – Quad Data Rate

– Synchronous
– Separate busses for Write and Read
– Each bus: Double data rate
– Total of 4 transactions per clock
– 500MHz → 2000MT/s
– Constant latency

  • QDR II+

– Uses QVLD signal for sampling

  • Rather than free running clock
slide-17
SLIDE 17


QDR SRAM – Burst Length

  • Similar concept as DRAM burst length
  • Valid options: 2 or 4

– Part number specific

  • BL=2

– Can access a different address every clock
– Ideal for short queries (e.g. lookups)

  • BL=4

– Can change address every 2 clocks
– Achieves higher frequency
– Supports half the number of entries of BL=2

  • For the same SRAM density
  • This is a design trade-off
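
The burst-length trade-off can be put into numbers. The following is a rough sketch, assuming a 72Mbit (9MB) device with a 36-bit data bus; the bus width and the function names are assumptions for illustration, not figures from the slides.

```python
def qdr_transfer_rate_mt_s(clock_mhz: float) -> float:
    """Two ports (read and write), each double data rate: 4 transfers per clock."""
    return clock_mhz * 4


def address_rate_mhz(clock_mhz: float, burst_length: int) -> float:
    """New addresses accepted per port: every clock for BL=2, every 2 clocks for BL=4."""
    if burst_length not in (2, 4):
        raise ValueError("valid burst lengths are 2 and 4")
    return clock_mhz * 2 / burst_length


def entries(density_bits: int, data_width_bits: int, burst_length: int) -> int:
    """Addressable entries: each address covers burst_length words."""
    return density_bits // (data_width_bits * burst_length)


if __name__ == "__main__":
    density = 9 * 8 * 2**20          # 9MB expressed in bits
    width = 36                       # assumed data bus width
    print(qdr_transfer_rate_mt_s(500))
    for bl in (2, 4):
        print(bl, address_rate_mhz(500, bl), entries(density, width, bl))
```

With these assumptions BL=2 gives twice as many, narrower entries and a new address every clock, while BL=4 returns wider entries at half the address rate.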
slide-18
SLIDE 18


QDR’s Bank Sharing

  • QDR A and QDR B Share Bank 17

– For controls

  • Xilinx MIG currently does not support bank sharing
  • Manual manipulation of the PHY is required in order to use it

– Calling for a contributed project

slide-19
SLIDE 19


DRAM vs. SRAM

                      DRAM            SRAM
Density               High            Low
Latency               Variable        Constant
                      High            Low
Bandwidth             High            High
Effective bandwidth   Varies, <100%   100%

  • Usage examples:

– Output queues
– Lookup tables
– Storing buffer descriptors

slide-20
SLIDE 20


Serial Interfaces

slide-21
SLIDE 21


Serial Interfaces

  • Used for data transfer at high rates
  • GTH Transceiver (Transmitter/Receiver)
  • 13.1Gb/s

– Speed grade: -3

  • FPGA selection

– GTH vs. GTZ

  • 13.1Gb/s vs. 28.05Gb/s

– I/O vs. Serial I/F Rate

  • I/O equals RAM

– RAM won

slide-22
SLIDE 22


Host Interface

  • PCIe Gen. 3
  • x8 (only)

– x4 requires changes to the clock circuitry

  • Hard core IP
slide-23
SLIDE 23


Front Panel Ports

  • 4 SFP+ Cages
  • Directly connected to the FPGA
  • Supports 10GBase-R transceivers (default)
  • Also supports 1000Base-X transceivers and direct attach cables

slide-24
SLIDE 24


Expansion Interfaces

  • FMC HPC connector

– VITA-57 Standard
– Supports FPGA Mezzanine Cards (FMC)
– 10 x 12.5Gbps serial links

  • QTH-DP

– 8 x 12.5Gbps serial links

  • 12.5Gb/s is the validation rate
  • Actual performance depends on the full channel

– Insertion loss, return loss, crosstalk, …

slide-25
SLIDE 25


Serial Interfaces

  • Summary:

– 4 transceivers connect to SFP+
– 8 transceivers connect to PCIe
– 10 transceivers connect to FMC
– 8 transceivers connect to QTH
– 2 transceivers connect to SATA (see later)

  • Total: 32

– 4 transceivers are unused

  • Transceivers are grouped in quads, with shared clocking
  • 2 unused transceivers on SATA quad and 2 on FMC last quad
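
The transceiver budget above can be verified with a few lines of arithmetic. This is a minimal sketch, assuming each interface occupies whole quads of four transceivers; the dictionary and function names are illustrative only.

```python
TOTAL_GTH = 36          # GTH transceivers on the Virtex-7 690T
QUAD_SIZE = 4           # transceivers per quad (shared clocking)

ALLOCATION = {"SFP+": 4, "PCIe": 8, "FMC": 10, "QTH": 8, "SATA": 2}


def quads_needed(lanes: int) -> int:
    """Whole quads required for a given number of lanes (ceiling division)."""
    return -(-lanes // QUAD_SIZE)


def spare_lanes(lanes: int) -> int:
    """Transceivers left unused in the last quad of an interface."""
    return quads_needed(lanes) * QUAD_SIZE - lanes


used = sum(ALLOCATION.values())
unused = TOTAL_GTH - used
spares = {name: spare_lanes(lanes) for name, lanes in ALLOCATION.items()}

if __name__ == "__main__":
    print(used, unused)
    print(spares)
```

Under the whole-quad assumption the spare lanes fall on the SATA and FMC quads, matching the slide's summary.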
slide-26
SLIDE 26


STORAGE

slide-27
SLIDE 27


Storage

  • 128MB FLASH
  • 2 x SATA connectors
  • Micro-SD slot
  • Enables standalone operation
slide-28
SLIDE 28


FLASH

  • Non Volatile RAM (NVRAM)
  • NOR based

– High reliability (vs. NAND FLASH)

  • Can read “single” data
  • Write:

– Erase blocks (write ‘1’)
– Write ‘0’
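
The erase-then-write behaviour can be modelled in a few lines. The class below is a toy model, not a driver for any real device: the block size, the class name and the method names are all assumptions chosen for illustration.

```python
class NorFlashModel:
    """Toy NOR flash: erase sets a whole block to 1s, programming only clears bits."""

    def __init__(self, num_blocks: int = 4, block_size: int = 16):
        self.block_size = block_size
        # A fresh (erased) device reads back all 1s.
        self.cells = bytearray([0xFF]) * (num_blocks * block_size)

    def erase_block(self, block: int) -> None:
        start = block * self.block_size
        self.cells[start:start + self.block_size] = b"\xff" * self.block_size

    def program(self, address: int, value: int) -> None:
        # Programming can only move bits from 1 to 0, modelled as a bitwise AND.
        self.cells[address] &= value & 0xFF

    def read(self, address: int) -> int:
        # Single locations can be read directly.
        return self.cells[address]


if __name__ == "__main__":
    flash = NorFlashModel()
    flash.program(0, 0xA5)
    print(hex(flash.read(0)))
    flash.program(0, 0x5A)      # without an erase, bits can only be cleared
    print(hex(flash.read(0)))
    flash.erase_block(0)
    print(hex(flash.read(0)))
```

The second `program` call shows why an erase is needed before rewriting: overwriting in place can never turn a 0 back into a 1.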

slide-29
SLIDE 29


FLASH – SUME

  • Using a parallel FLASH

– 16 bit wide

  • Used to store the FPGA’s image

– Loaded upon power up

  • Using 2 FLASH devices in parallel

– To achieve PCIe required configuration time

  • Additional storage space available for more bitstream files and user defined purposes

slide-30
SLIDE 30


SATA

  • 2 on board SATA connectors
  • SATA-III compatible (6Gb/s)
  • Connects to standard HDD/SSD

– Use standard SATA cables

  • Uses 2 transceivers

– One per connector

  • Enables stand-alone computing unit operation
slide-31
SLIDE 31


Micro-SD

  • SD – “Secure Digital”
  • Non volatile memory device
  • Uses a parallel interface:

– 4 bit data
– 1 bit command
– …and a protocol

  • Supports UHS-I

– But not UHS-II

  • Supports SC, HC and XC class cards
  • Located at the reverse side (print side) of the board

slide-32
SLIDE 32


CONFIGURATION

slide-33
SLIDE 33


FPGA Configuration

  • FPGA configuration data is stored in files called bitstreams

– Have the .bit file extension.

  • Stored in dedicated CMOS Configuration Latches (CCL)
  • Defines the FPGA’s logic functions and circuit connections
  • Remains valid until:

– Erased
– Power down

Reset does not affect the FPGA configuration!

slide-34
SLIDE 34


FPGA Configuration

  • Multiple ways to configure the FPGA:
  • 1. Through the JTAG chain, using USB-JTAG

– J16, labelled PROG

  • 2. Through the JTAG chain, using 14-pin JTAG header

– J9

  • 3. From the FLASH

– Loading one of four possible bitstream files

slide-35
SLIDE 35


FPGA Configuration

slide-36
SLIDE 36


CPLD

  • CPLD – Complex Programmable Logic Device
  • Non-volatile
  • A combination of

– Programmable AND/OR array
– Macrocells

  • Macrocells

– Functional blocks
– Perform combinatorial or sequential logic
– True or complement
– Varied feedback path
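
As a rough illustration of the AND/OR array feeding a macrocell, the sketch below evaluates a sum of products with optional output inversion and an optional register. It is a simplified model with hypothetical names; the feedback paths of a real macrocell are not modelled.

```python
from dataclasses import dataclass


@dataclass
class Macrocell:
    # Each product term maps an input name to the level it must have:
    # True selects the true form, False the complement.
    product_terms: list
    invert: bool = False        # "true or complement" output selection
    registered: bool = False    # sequential (flip-flop) or combinatorial output
    state: bool = False

    def combinatorial(self, inputs: dict) -> bool:
        """OR of all AND terms, optionally inverted."""
        sum_of_products = any(
            all(inputs[name] == level for name, level in term.items())
            for term in self.product_terms
        )
        return sum_of_products != self.invert

    def clock(self, inputs: dict) -> bool:
        """Capture the combinatorial result in the flip-flop."""
        self.state = self.combinatorial(inputs)
        return self.state

    def output(self, inputs: dict) -> bool:
        return self.state if self.registered else self.combinatorial(inputs)


# XOR of a and b as two product terms: (a AND NOT b) OR (NOT a AND b).
xor_cell = Macrocell(product_terms=[{"a": True, "b": False},
                                    {"a": False, "b": True}])

if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            print(a, b, xor_cell.output({"a": a, "b": b}))
```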

slide-37
SLIDE 37


CPLD

  • An example of Xilinx CPLD (block diagram):

Source: http://www.xilinx.com/cpld/

slide-38
SLIDE 38


CPLD

  • CoolRunner II XC2C512

– 512 macro cells

  • Same JTAG chain as the FPGA
  • Used as an interface converter between FLASH and FPGA

– CPLD ↔ 2xFLASH: Master BPI (16 bit)
– CPLD ↔ FPGA: 32bit SelectMAP

  • See UG470
  • Goal: Respond to PCI enumeration commands within 200 milliseconds of power up
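
The 200 millisecond goal explains the wide configuration path. The sketch below estimates the configuration clock needed for a given bus width. It assumes a bitstream of roughly 230Mbit for the 690T (a figure that should be confirmed against UG470) and ignores power-up and CPLD overhead; the constant and function names are illustrative.

```python
BITSTREAM_BITS = 229_878_496     # assumed bitstream length; confirm in UG470
FLASH_BYTES = 128 * 2**20        # 128MB of FLASH on the board


def configuration_time_ms(bitstream_bits: int, bus_width_bits: int,
                          clock_mhz: float) -> float:
    """Time to shift the bitstream in, one bus word per clock."""
    cycles = bitstream_bits / bus_width_bits
    return cycles / (clock_mhz * 1e6) * 1e3


def min_clock_mhz(bitstream_bits: int, bus_width_bits: int,
                  budget_ms: float) -> float:
    """Slowest configuration clock that still meets the time budget."""
    return bitstream_bits / bus_width_bits / (budget_ms * 1e-3) / 1e6


if __name__ == "__main__":
    for width in (16, 32):
        print(width, round(min_clock_mhz(BITSTREAM_BITS, width, 200), 2))
    # Four bitstreams of this size still fit in the FLASH.
    print(4 * BITSTREAM_BITS // 8 <= FLASH_BYTES)
```

Doubling the bus from 16 to 32 bits halves the clock needed for the same budget, which is consistent with using two FLASH devices in parallel behind the CPLD.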

slide-39
SLIDE 39


Clocks

slide-40
SLIDE 40


Clocks

Clock Name       Frequency                  Common Usage
FPGA_SYSCLK      200MHz                     General purpose, shares source with QDR devices
QDRII_SYSCLK     200MHz                     Used by MIG for QDRIIA, QDRIIB
QDRIIC_SYSCLK    200MHz                     Used by MIG for QDRIIC
SATA_SYSCLK      150MHz                     Used by SATA transceivers
SFP_CLK          156.25MHz (Configurable)   Shared by transceivers of SFP+ and QTH
DDR3_SYSCLK      233.33MHz                  Shared by MIG for DDR3A, DDR3B
PCIE-CLK         100MHz                     Used by PCI-Express Core
USER_CLK         156.25MHz (Configurable)   User defined purposes
FMC_GBT_CLK0     -                          Used by FMC transceivers (From FMC)
FMC_GBT_CLK1     -                          Used by FMC transceivers (From FMC)
FMC_CLK0         -                          Used by FMC card (From FMC)
FMC_CLK1         -                          Used by FMC card (From FMC)
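
Several of the reference frequencies above are integer divisions of the serial line rate they support. The sketch below shows that relationship for three clocks, assuming the standard line rates of 10GBase-R (10.3125Gb/s), SATA-III (6Gb/s) and PCIe Gen 3 (8GT/s); the actual PLL settings used on the board are not given in the slides, so the multipliers are illustrative only.

```python
REFERENCE_CLOCKS_MHZ = {
    "SFP_CLK": 156.25,
    "SATA_SYSCLK": 150.0,
    "PCIE-CLK": 100.0,
}

# Assumed serial line rates in Gb/s for the interface each clock serves.
LINE_RATES_GBPS = {
    "SFP_CLK": 10.3125,     # 10GBase-R with 64b/66b encoding
    "SATA_SYSCLK": 6.0,     # SATA-III
    "PCIE-CLK": 8.0,        # PCIe Gen 3
}


def multiplier(line_rate_gbps: float, reference_mhz: float) -> float:
    """Ratio between the serial line rate and the reference clock."""
    return line_rate_gbps * 1000 / reference_mhz


def payload_rate_gbps(line_rate_gbps: float, data_bits: int, coded_bits: int) -> float:
    """Useful data rate after removing line-coding overhead."""
    return line_rate_gbps * data_bits / coded_bits


if __name__ == "__main__":
    for name, ref in REFERENCE_CLOCKS_MHZ.items():
        print(name, multiplier(LINE_RATES_GBPS[name], ref))
    print(payload_rate_gbps(10.3125, 64, 66))
```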
slide-41
SLIDE 41


Clocks

  • Special clocks:
  • Configurable – through I2C:

– USER clock
– SFP (& QTH) clock

  • External Clock

– PCIe clock
– FMC clocks

  • Recovered Clock

– SFP (& QTH) clock

slide-42
SLIDE 42


Miscellaneous

slide-43
SLIDE 43


Status and Indications

  • 10G ports indication LEDs

– 4x Green
– 4x Yellow
– Programmable

  • 5 Programming status LEDs

– Controlled by CPLD

  • LD5 – No valid bitstream in indicated section
  • LD6 – Fallback LED (load bitstream from default boot section)

  • LD7-LD9 – Indicate current boot section
  • 2 General purpose LEDs
slide-44
SLIDE 44


Power

  • Power Supply

– 12V
– From a dedicated power connector

  • Not through PCIe connector
  • On-Off connector
  • 6 major power rails

– Different voltages: 1.0V, 1.2V, 1.5V, 1.8V, 3.3V
– Several dedicated, isolated, power rails

  • Maximum power consumption ~150W

– Typical power consumption is much lower

slide-45
SLIDE 45


Power

slide-46
SLIDE 46


Power Control

  • On board power sequencing and supervision

slide-47
SLIDE 47


Sensors

  • Per power rail:

– Output voltage
– Current consumption
– Temperature

  • Can set alerts (fault or warning):

– Over/under voltage
– Over/under current
– Over temperature
– Timeout reaching or exceeding a certain voltage

  • Using I2C (PMBus)
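
PMBus devices commonly report current and temperature in the LINEAR11 format: a 5-bit signed exponent and an 11-bit signed mantissa packed into 16 bits. The sketch below decodes such a word and classifies it against warning and fault limits. Whether the regulators on this board use LINEAR11 for a given reading is an assumption, and the function names and thresholds are hypothetical.

```python
def decode_linear11(raw: int) -> float:
    """Decode a 16-bit PMBus LINEAR11 word into a real value."""
    if not 0 <= raw <= 0xFFFF:
        raise ValueError("LINEAR11 words are 16 bits wide")
    exponent = raw >> 11            # upper 5 bits, two's complement
    mantissa = raw & 0x7FF          # lower 11 bits, two's complement
    if exponent > 0x0F:
        exponent -= 0x20
    if mantissa > 0x3FF:
        mantissa -= 0x800
    return mantissa * 2.0 ** exponent


def classify(value: float, warn_limit: float, fault_limit: float) -> str:
    """Return 'fault', 'warning' or 'ok' for an over-limit check."""
    if value >= fault_limit:
        return "fault"
    if value >= warn_limit:
        return "warning"
    return "ok"


if __name__ == "__main__":
    reading = decode_linear11(0xD3C0)   # exponent -6, mantissa 960
    print(reading, classify(reading, warn_limit=12.0, fault_limit=20.0))
```

In practice the raw word would come from an I2C read of the relevant PMBus command; that transport layer is left out here because it depends on the host setup.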
slide-48
SLIDE 48


Conclusion

slide-49
SLIDE 49


Acknowledgments (I)

NetFPGA Team at Stanford University (Past and Present): Nick McKeown, Glen Gibb, Jad Naous, David Erickson, G. Adam Covington, John W. Lockwood, Jianying Luo, Brandon Heller, Paul Hartke, Neda Beheshti, Sara Bolouki, James Zeng, Jonathan Ellithorpe, Sachidanandan Sambandan, Eric Lo

NetFPGA Team at University of Cambridge (Past and Present): Andrew Moore, David Miller, Muhammad Shahbaz, Martin Zadnik, Matthew Grosvenor, Yury Audzevich, Neelakandan Manihatty-Bojan, Georgina Kalogeridou, Jong Hun Han, Noa Zilberman, Gianni Antichi, Charalampos Rotsos, Marco Forconesi, Jinyun Zhang, Bjoern Zeeb

All Community members (including but not limited to): Paul Rodman, Kumar Sanghvi, Wojciech A. Koszek, Yashar Ganjali, Martin Labrecque, Jeff Shafer, Eric Keller, Tatsuya Yabe, Bilal Anwer, Lisa Donatini, Sergio Lopez-Buedo

Kees Vissers, Michaela Blott, Shep Siegel, Cathal McCabe

slide-50
SLIDE 50


Acknowledgments (II)

Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project. This effort is also sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA8750-11-C-0249. This material is approved for public release, distribution unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.