27 March 2018 Mikael Arguedas and Morgan Quigley



SLIDE 1

27 March 2018 Mikael Arguedas and Morgan Quigley

SLIDE 2
SLIDE 3
SLIDE 4
SLIDE 5

SLIDE 6

SLIDE 7
SLIDE 8

  • Separate devices (prototypes 0-3): two cameras on USB3 and an IMU on USB2, each attached to the USB host
  • Unified camera (prototypes 4-5): one FPGA with two imagers and an IMU, attached to the USB host over USB3
  • Unified system (prototypes 6+): one FPGA with two imagers and an IMU, attached to the PCIe root over PCIe

SLIDE 9
SLIDE 10
  • Global-shutter 1.3 MPix imagers, 20cm baseline
  • FPGA+DRAM+USB3 on daughterboard
  • InertialSense µIMU-2
SLIDE 11

USB3-based design: imagers, FPGA, DRAM, USB3 PHY, any PC

PCIe-based design: imagers, FPGA, PCIe PHY, PCIe root on SBC (NVIDIA TX2)

Numeric annotations on the diagram links: 40, 40, 30, 8, 12, 30

PCIe x2 bandwidth is similar to (low-end) FPGA DRAM bus!

  • Artix-7: 16 bit @ 400 MHz DDR = ~13 Gbit/s - overhead
  • PCIe Gen 2 x2 = 8 Gbit/s full-duplex
  • DRAM buffer required since USB3 = 4 Gbit/s - scheduling
  • Imagers capable of ~2 Gbit/s pixel rate (each)
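The bandwidth figures on this slide can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-envelope check: the function names are hypothetical, the 5 GT/s line rate and 8b/10b encoding are the standard values for PCIe Gen 2 and USB 3.0 SuperSpeed, and the count of two imagers is taken from the architecture diagram rather than from this slide.

```python
def ddr_raw_gbit_per_s(bus_bits, clock_mhz):
    """Raw DDR bandwidth: data moves on both clock edges."""
    return bus_bits * clock_mhz * 2 / 1000.0


def serial_payload_gbit_per_s(gigatransfers_per_s, lanes, efficiency=0.8):
    """Payload rate of an 8b/10b-encoded serial link (8 data bits per 10 line bits)."""
    return gigatransfers_per_s * lanes * efficiency


dram = ddr_raw_gbit_per_s(16, 400)          # Artix-7 DRAM bus: 16 bit @ 400 MHz DDR
pcie = serial_payload_gbit_per_s(5.0, 2)    # PCIe Gen 2, two lanes, per direction
usb3 = serial_payload_gbit_per_s(5.0, 1)    # USB 3.0 SuperSpeed, before scheduling losses
imagers = 2 * 2.0                           # two imagers at ~2 Gbit/s each (assumed count)

print(f"DRAM {dram:.1f}, PCIe x2 {pcie:.1f}, USB3 {usb3:.1f}, imagers {imagers:.1f} Gbit/s")
```

Under these assumptions the two imagers together can match the best-case USB3 payload rate, which is consistent with the slide's point that a DRAM buffer is needed on the USB3 design, while PCIe Gen 2 x2 leaves roughly a factor of two of headroom.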

SLIDE 12
  • Designed around TX2
  • FPGA is PCIe endpoint
  • Self-contained computer vision: "just supply power"
SLIDE 13
SLIDE 14
SLIDE 15

https://cad.onshape.com/documents/12b7e4d13bded8c95b2b0603/w/4fe61ad3cda4bc70ee895fc7/e/e46ce3bf3f9ed56eb291d5e2

  • Imager active area is not centered
  • Use 3D-printed lens holders
  • PLA is OK. Carbon-fiber is better
  • Heat-set inserts for mounting + lens lock
SLIDE 16
  • Stereo systems need to be very stiff
  • PCB is clamped to carbon-fiber tube
SLIDE 17
SLIDE 18
  • Always-on MCU waits for TX2 boot
  • During TX2 boot: MCU loads stage-1 FPGA image
  • TX2-FPGA PCIe link established
  • TX2 loads stage-2 FPGA image over PCIe
  • Sensors initialized over PCIe MMIO

Diagram labels: TX2, MCU, FPGA, IMU, two imagers; links: UART, PCIe, SPI
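The boot sequence listed on this slide is strictly ordered, which can be modeled as a small state machine. The sketch below is illustrative only: the stage and event names are hypothetical labels for the slide's bullets, not identifiers from the actual firmware or driver.

```python
from enum import Enum


class BootStage(Enum):
    MCU_WAITING = 0      # always-on MCU waits for the TX2 to boot
    STAGE1_LOADED = 1    # MCU has loaded the stage-1 FPGA image
    PCIE_LINK_UP = 2     # TX2-FPGA PCIe link established
    STAGE2_LOADED = 3    # TX2 has loaded the stage-2 FPGA image over PCIe
    SENSORS_READY = 4    # sensors initialized over PCIe MMIO


# Hypothetical event names, one per transition, in the order given on the slide.
EVENT_ORDER = [
    "stage1_loaded_by_mcu",
    "pcie_link_established",
    "stage2_loaded_over_pcie",
    "sensors_initialized_over_mmio",
]


def advance(stage, event):
    """Move to the next stage, rejecting any event that arrives out of order."""
    index = stage.value
    if index >= len(EVENT_ORDER) or EVENT_ORDER[index] != event:
        raise ValueError(f"unexpected event {event!r} in stage {stage.name}")
    return BootStage(index + 1)


def run_boot(events):
    """Replay a list of events from power-on and return the final stage."""
    stage = BootStage.MCU_WAITING
    for event in events:
        stage = advance(stage, event)
    return stage


print(run_boot(EVENT_ORDER).name)
```

Modeling the sequence this way makes the dependency explicit: the stage-2 image cannot be loaded until the stage-1 image has brought up the PCIe link.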

SLIDE 19

Image timing via IMU sync

Diagram labels: TX2, FPGA, DRAM, PCIe, image sensors, DMA write arbiter, trigger, IMU, sync, decimate
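One plausible reading of the sync, decimate, and trigger blocks is that the FPGA divides the IMU's sync pulse train down to the camera frame rate, so that each exposure starts on a known IMU sample. The sketch below illustrates only that idea. The 500 Hz IMU rate and the divide-by-10 factor are hypothetical values chosen for the example, not figures from the slides.

```python
def decimate_sync(pulse_times_us, factor):
    """Keep every `factor`-th IMU sync pulse as a camera trigger time."""
    if factor < 1:
        raise ValueError("decimation factor must be at least 1")
    return pulse_times_us[::factor]


IMU_PERIOD_US = 2000   # hypothetical 500 Hz IMU sync pulse train
DECIMATION = 10        # hypothetical: trigger the imagers on every 10th pulse

# Integer microsecond timestamps of 31 consecutive IMU sync pulses.
imu_pulses = [n * IMU_PERIOD_US for n in range(31)]
triggers = decimate_sync(imu_pulses, DECIMATION)

print(triggers)
```

Because every trigger coincides with an IMU pulse, image timestamps and inertial samples share one clock without any software-side interpolation.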

SLIDE 20

Diagram labels: TX2, FPGA, DRAM, PCIe, image sensors, DMA arbiter

Pixel processing stages: deserialize, decode framing, FIFO, fix column ordering
SLIDE 21

Extreme close-up of typical indoor navigation feature (sprinkler pipe joint)

SLIDE 22

corner 7x7 circle

SLIDE 23

corner 7x7 circle (discretized)

SLIDE 24

corner 7x7 circle (discretized) "unrolled" discretized circle

SLIDE 25

Pipeline: "unrolled" circle, subtract center, threshold

1) Find (in parallel) if there is a contiguous sequence of >= 9 pixels above threshold value.

2) For non-max suppression, find (in parallel) "how far" the sequence is above the threshold
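The two steps above can be sketched in software. This is a sequential illustration under stated assumptions, not the FPGA implementation: the 16-pixel circle of radius 3 is the standard discretization used by FAST-style detectors, the function names are hypothetical, and only the "brighter than center" case described on the slide is tested (the usual FAST detector also checks for darker arcs).

```python
# (dx, dy) offsets of the 16-pixel discretized circle inside a 7x7 patch,
# listed in order around the circle so list neighbours are circle neighbours.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]
ARC_LENGTH = 9


def unroll(patch):
    """'Unroll' the circle of a 7x7 patch and subtract the center pixel."""
    center = patch[3][3]
    return [patch[3 + dy][3 + dx] - center for dx, dy in CIRCLE]


def corner_score(patch, threshold):
    """How far the best contiguous 9-pixel arc lies above the threshold.

    Each of the 16 possible arcs (with wrap-around) is independent of the
    others, which is what lets hardware evaluate them all in parallel.
    """
    diffs = unroll(patch)
    arc_minima = [min(diffs[(start + k) % 16] for k in range(ARC_LENGTH))
                  for start in range(16)]
    return max(arc_minima) - threshold


def is_corner(patch, threshold):
    """True if some contiguous arc of 9 pixels is entirely above the threshold."""
    return corner_score(patch, threshold) > 0


def make_patch(bright_indices, center=10, bright=100):
    """Build a test patch with the chosen circle positions set bright."""
    patch = [[center] * 7 for _ in range(7)]
    for i in bright_indices:
        dx, dy = CIRCLE[i]
        patch[3 + dy][3 + dx] = bright
    return patch


print(is_corner(make_patch(range(9)), threshold=20))
print(is_corner(make_patch(range(8)), threshold=20))
```

The score doubles as the value compared between neighboring candidates during non-max suppression: a larger score means the arc clears the threshold by a wider margin.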

SLIDE 26

Imagers produce 4 pixels per clock. Solution: search in parallel.

SLIDE 27
  • An example of FPGA reducing latency in (simple) pixel-wise operations
  • 100s of operations per clock: 8-bit subtractions, comparisons, etc.
  • Deterministic timing, keeps up with pixel rate
  • Many other algorithms are FPGA-friendly: pyramids, gradients, ...
SLIDE 28

System block diagram. Labels: TX2, CPU, GPU, DRAM, PCIe, FPGA, image sensors, register file, pixel array, ADC, serializers, clock, sync, data, link train, deserializer, image decoders, pixel FIFOs, corner detectors, image stats, DMA write arbiter, trigger, sync decimate, SPI sequencer, IMU BRAM, control register BRAM, SPI, IMU

SLIDE 29
SLIDE 30

  • Initialization
      ○ allocate PCIe-visible RAM block
      ○ configure FPGA core, imager SPI registers, IMU registers
  • Every frame
      ○ FPGA writes pixels via DMA to TX2 RAM, sends interrupt
      ○ kernel re-syncs CPU caches
      ○ kernel unblocks userland thread in ROS node
      ○ ROS node copies image into ROS message, sends it downstream

Diagram labels: imagers, FPGA, PCIe (DMA, MSI), TX2 RAM, kernel module, ROS driver, image consumer nodes
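The per-frame hand-off can be mimicked in pure Python to show the blocking pattern. This is a simulation under stated assumptions, not the real kernel module or ROS driver: the class and method names are hypothetical, a `bytearray` stands in for the PCIe-visible RAM block, and a `threading.Event` stands in for the interrupt that unblocks the userland thread.

```python
import threading

FRAME_BYTES = 16  # tiny stand-in for a real frame size


class FakeCameraDevice:
    """Simulated DMA target buffer plus a 'frame ready' signal."""

    def __init__(self):
        self.dma_buffer = bytearray(FRAME_BYTES)  # stands in for the PCIe-visible RAM
        self._frame_ready = threading.Event()     # stands in for the interrupt

    def fpga_writes_frame(self, value):
        """'Hardware' side: fill the buffer, then signal that a frame is ready."""
        self.dma_buffer[:] = bytes([value]) * FRAME_BYTES
        self._frame_ready.set()

    def wait_for_frame(self, timeout=1.0):
        """'Userland' side: block until signaled, then copy the frame out."""
        if not self._frame_ready.wait(timeout):
            raise TimeoutError("no frame arrived")
        self._frame_ready.clear()
        return bytes(self.dma_buffer)  # copy, so the buffer can be reused


def receive_one_frame(device, value):
    """Run the producer in a separate thread and return the copied frame."""
    producer = threading.Thread(target=device.fpga_writes_frame, args=(value,))
    producer.start()
    frame = device.wait_for_frame()
    producer.join()
    return frame


device = FakeCameraDevice()
frame = receive_one_frame(device, 7)
print(len(frame), frame[0])
```

The copy in `wait_for_frame` mirrors the last step on the slide: the frame is copied out of the shared buffer before being sent downstream, so the next DMA write cannot overwrite data a consumer is still reading.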

SLIDE 31

sensor hardware actuator hardware

  • Dynamic distributed message-passing framework
  • Huge collection of open-source nodes
  • Tools to parameterize, configure, and debug nodes
SLIDE 32
SLIDE 33

Diagram labels: camera, camera driver, vision node, downstream nodes, evil node

  • Evil data sniffing: prevented by encryption
  • Publish evil image data: prevented by authorization

SLIDE 34

https://github.com/osrf/tensorflow_object_detector

SLIDE 35

https://github.com/osrf/tensorflow_object_detector

SLIDE 36
SLIDE 37

"Traditional" system: PCIe root with SSD, USB, GbE, GPU, USB (etc)

  • all peripherals on PCIe
  • PCIe cannot reset / re-enumerate
  • PCIe devices must be ready within 100 ms

TX2-based system: TX2 (Tegra) SoC system fabric with Flash, USB, GbE, GPU, USB; PCIe root with FPGA

  • Only the FPGA hangs off PCIe
  • PCIe kernel driver can be reloaded
  • FPGA configure/reconfigure at any time
  • Elaborate "fast" configuration not needed

SLIDE 38
  • all connectors on same side
  • no configuration MCU
  • FPGA upgrade (?)
  • stack boards to reduce footprint
SLIDE 39
SLIDE 40

For more information: http://open.vision.computer Morgan Quigley morgan@openrobotics.org