Silicon retina technology, Tobi Delbruck, Inst. of Neuroinformatics, University of Zurich and ETH Zurich



SLIDE 1

Sensors Group sensors.ini.uzh.ch

Sponsors: Swiss National Science Foundation NCCR Robotics project, EU projects SEEBETTER and VISUALISE, Samsung, DARPA

Silicon retina technology

Tobi Delbruck

  • Inst. of Neuroinformatics, University of Zurich and ETH Zurich
SLIDE 2

Sponsors: Swiss National Science Foundation NCCR Robotics, EU projects CAVIAR, SEEBETTER, VISUALISE, Samsung, DARPA, University of Zurich and ETH Zurich

sensors.ini.uzh.ch inilabs.com

SLIDE 3

Golf-guides.blogspot.com

Conventional cameras (Static vision sensors) output a stroboscopic sequence of frames

Muybridge 1878 (150 years ago)

Good

  • Compatible with 50+ years of machine vision
  • Allows small pixels (1um for consumer, 3-5um for machine vision)

Bad

  • Redundant output
  • Temporal aliasing
  • Limited dynamic range (60dB)
  • Fundamental “latency vs. power” trade-off

SLIDE 4

The Human Eye as a digital camera

  • 100M photoreceptors
  • 1M output fibers carrying max 100Hz spike rates
  • 180dB (10^9) operating range
  • >20 different “eyes”
  • Many GOPs computing
  • 3mW power consumption
  • Output is sparse, asynchronous stream of digital spike events

SLIDE 5

This talk has 4 parts

  • Dynamic Vision Sensor Silicon Retinas
  • Simple object tracking by algorithmic processing of events
  • Using probabilistic methods for state estimation
  • “Data-driven” deep inference with CNNs

SLIDE 6

DVS (Dynamic Vision Sensor) Pixel

[Pixel circuit diagram: photoreceptor (I → log I); change amplifier (bipolar cells, ±Δlog I); ON/OFF comparators with threshold (ganglion cells) emitting brightness change events; event reset]

Lichtsteiner et al., ISSCC 2007, JSSC 2009

From Rodieck 1998
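
As a rough illustration of the pixel principle on this slide, the sketch below models one idealized DVS pixel: it memorizes log I at the last event and emits an ON or OFF event each time log I has moved by one threshold step. The function name `dvs_pixel_events`, the sample format, and the default threshold are assumptions made for illustration. The real pixel is an analog circuit with noise, mismatch and finite bandwidth, none of which is modeled here.

```python
import math

def dvs_pixel_events(samples, threshold=0.11):
    """Idealized single DVS pixel (illustrative model, not the circuit).

    samples: iterable of (time, intensity) pairs with intensity > 0.
    threshold: assumed contrast threshold in units of ln(intensity).
    Returns a list of (time, polarity), polarity +1 (ON) or -1 (OFF).
    """
    events = []
    reference = None  # log intensity memorized at the last event reset
    for t, intensity in samples:
        log_i = math.log(intensity)
        if reference is None:
            reference = log_i
            continue
        # Emit one event per threshold step, moving the reference each
        # time (this plays the role of the "event reset" in the diagram).
        while log_i - reference >= threshold:
            reference += threshold
            events.append((t, +1))
        while reference - log_i >= threshold:
            reference -= threshold
            events.append((t, -1))
    return events

# A 3.3x brightness step produces a burst of ON events.
print(len(dvs_pixel_events([(0.0, 1.0), (1.0, 3.3)])))
```

Because the model works on log intensity, the same contrast step gives the same number of events at any absolute brightness level.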

SLIDE 7

DVS pixel has wide dynamic range

[Figure: Edmund 0.1 density chart, illumination ratio = 135:1 (780 lux vs. 5.8 lux); ON events shown]

ISSCC 2007

SLIDE 8

Using DVS for high speed (low data rate) imaging

  • Data rate <1MBps
  • “Frame rate” equivalent to 10 kHz but 100x less data (10 kHz image sensor x 16k pixels = 160 MBps)

ISSCC 2007
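
The data-rate comparison on this slide can be checked with a few lines of arithmetic. This sketch assumes 1 byte per pixel for the frame-based sensor and reads "16k pixels" as 16,000. Neither assumption is stated on the slide.

```python
FRAME_RATE_HZ = 10_000   # "frame rate" equivalent quoted on the slide
PIXELS = 16_000          # "16k pixels" (assumed to mean 16,000)
BYTES_PER_PIXEL = 1      # assumption: 8-bit samples

# Data rate of a conventional sensor read out at the equivalent frame rate.
frame_camera_bps = FRAME_RATE_HZ * PIXELS * BYTES_PER_PIXEL
# The slide's "100x less data" applied to that figure.
hundredth_bps = frame_camera_bps // 100

print(f"frame camera: {frame_camera_bps / 1e6:.0f} MBps")
print(f"100x less:    {hundredth_bps / 1e6:.1f} MBps")
```

Under these assumptions the frame-based figure is 160 MBps, so an event data rate below 1 MBps corresponds to a reduction of more than 100x.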

SLIDE 9

DAVIS (Dynamic and Active Pixel Vision Sensor) Pixel

[Pixel circuit diagram: photoreceptor (I → log I) with intensity reset and intensity value readout; change amplifier (bipolar cells, ±Δlog I); ON/OFF comparators with threshold (ganglion cells) emitting change events; event reset]

Brandli et al., Symp VLSI, JSSC 2014

From Rodieck 1998

SLIDE 10

SLIDE 11

DVS/DAVIS +IMU demo

Brandli, Berner, Delbruck et al., Symp. VLSI 2013, JSSC 2014, ISCAS 2015

Start DAVIS Demo

SLIDE 12

DAVIS (Dynamic and Active Pixel Vision Sensor) Pixel

[Pixel circuit diagram: photoreceptor (I → log I) with intensity reset and intensity value readout; change amplifier (bipolar cells, ±Δlog I); ON/OFF comparators with threshold (ganglion cells) emitting change events; event reset]

Brandli et al., Symp VLSI, JSSC 2014

From Rodieck 1998

SLIDE 13

DAVIS346

[Die photo labels: bias generator; AER DVS asynch. event readout; APS col-parallel ADCs and scanner; 8mm]

180nm CIS, 346x260, 18.5um pixel DAVIS

SLIDE 14

Important layout considerations

  • 1. Post layout simulations to minimize parasitic coupling
  • 2. Shielding parasitic photodiodes

SLIDE 15

Event threshold matching measurement

Experiment: Apply slow triangle wave LED stimulus to entire array, measure number of events that pixels generate

Conclusion: Pixels generate 11±3 events per factor 3.3 contrast. Since ln(3.3)=1.19 and 1.19/11=0.11, contrast threshold=11% ± 4%
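
The arithmetic in the conclusion can be reproduced as follows. The slide does not say how the ±4% spread was derived, so evaluating the threshold at one standard deviation below and above the mean event count is only an assumption about what it represents.

```python
import math

events_mean, events_sd = 11, 3   # events per sweep, from the slide
contrast_ratio = 3.3             # stimulus contrast factor, from the slide

log_contrast = math.log(contrast_ratio)   # ln(3.3)
threshold = log_contrast / events_mean    # log-intensity step per event
# Assumed spread: thresholds for event counts one standard deviation
# below and above the mean.
threshold_hi = log_contrast / (events_mean - events_sd)
threshold_lo = log_contrast / (events_mean + events_sd)

print(f"ln({contrast_ratio}) = {log_contrast:.2f}")
print(f"threshold = {threshold:.3f} (range {threshold_lo:.3f} to {threshold_hi:.3f})")
```

The resulting range of roughly 0.09 to 0.15 is in line with the 11% ± 4% quoted on the slide.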

SLIDE 16

Measuring DVS pixel latency

Experiment: Stimulate small area of sensor with flashing LED spot, measure response latencies from recorded event stream

[Figure labels: latency, jitter, time]

Conclusion: Pixels can have minimum latency of about 12us under bright illumination. But “real world” latencies are more like 100us-1ms.

SLIDE 17

DVS pixel has built-in temperature compensation

Photoreceptor: Vp ∝ T ln(Ip)

Threshold (ON) ∝ T ln(Ion/Id)

Since photoreceptor gain and threshold voltage both scale with absolute temperature T, T cancels out of the contrast threshold

Nozaki, Delbruck 2017 (unpublished)
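
A small numerical check of the cancellation argument on this slide. The sketch assumes both voltages are proportional to the thermal voltage kT/q. The `gain` and `bias_ratio` values are hypothetical placeholders, not taken from the slide or the chip.

```python
import math

K_BOLTZMANN = 1.380649e-23     # J/K
Q_ELECTRON = 1.602176634e-19   # C

def thermal_voltage(temp_kelvin):
    """Thermal voltage kT/q in volts."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON

def log_contrast_threshold(temp_kelvin, gain=20.0, bias_ratio=9.0):
    """Contrast threshold in ln(intensity) units for an idealized pixel.

    gain: hypothetical photoreceptor-plus-amplifier gain factor.
    bias_ratio: hypothetical Ion/Id bias current ratio.
    """
    u_t = thermal_voltage(temp_kelvin)
    volts_per_log_unit = gain * u_t               # photoreceptor ∝ T ln(Ip)
    threshold_volts = u_t * math.log(bias_ratio)  # threshold ∝ T ln(Ion/Id)
    return threshold_volts / volts_per_log_unit   # kT/q divides out

for temp in (250.0, 300.0, 350.0):
    print(temp, log_contrast_threshold(temp))
```

The printed threshold is the same at every temperature because kT/q appears in both the numerator and the denominator.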

SLIDE 18

Integrated bias generator and circuit design enables operation over extended temperature range

Nozaki, Delbruck 2017 (submitted)

SLIDE 19

DVS pixel size trend

[Plot labels: process nodes 350nm, 90nm, 180nm, 350/180nm; reference curves for global shutter APS and rolling shutter consumer APS]

https://docs.google.com/spreadsheets/d/1pJfybCL7i_wgH3qF8zsj1JoWMtL0zHKr9eygikBdElY/edit#gid=0

SLIDE 20

Event camera silicon retina developments

[Figure labels: DVS/DAVIS, CeleX, ATIS/CCAM, DVS]

Commercial entities

  • Inilabs (Zurich) – R&D prototypes
  • Insightness (Zurich) – Drones and Augmented Reality
  • Samsung (S Korea) – Consumer electronics
  • Pixium Vision (Paris) – Retinal implants
  • Inivation (Zurich) – Industrial applications, Automotive
  • Chronocam (Paris) – Automotive
  • Hillhouse (Singapore) – Automotive

SLIDE 21

  • Neuromorphic sensor R&D prototypes
  • Open source software, user guides, app notes, sample data
  • Shipped devices based on multiproject wafer silicon to 100+ organizations

Founded 2009

www.iniLabs.com

Run as not-for-profit

SLIDE 22

  • Dynamic Vision Sensor Silicon Retinas
  • Simple object tracking by algorithmic processing of events
  • Using probabilistic methods for state estimation
  • “Data-driven” deep inference with CNNs

SLIDE 23

Tracking objects from DVS events using spatio-temporal coherence

  • 1. For each event, find nearest cluster
      • If event within a cluster, move cluster
      • If event not within cluster, seed new cluster
  • 2. Periodically prune starved clusters, merge clusters, etc. (lifetime mgmt)

Advantages

  • 1. Low computational cost (e.g. <5% CPU)
  • 2. No frame memory (~100 bytes/object).
  • 3. No frame correspondence problem

Litzenberger 2007
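
The two steps above can be sketched as a minimal event-driven tracker. This is an illustration only, not the cited implementation: the class names, the circular cluster shape, and the `radius`, `mix` and `max_idle` parameters are all assumptions, and cluster merging is left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Cluster:
    x: float
    y: float
    last_t: float      # time of the most recent supporting event
    n_events: int = 1

class ClusterTracker:
    def __init__(self, radius=10.0, mix=0.1, max_idle=0.05):
        self.radius = radius      # assumed cluster radius in pixels
        self.mix = mix            # assumed fraction moved toward each event
        self.max_idle = max_idle  # assumed starvation timeout in seconds
        self.clusters = []

    def add_event(self, x, y, t):
        # Step 1: find the nearest cluster that contains this event.
        nearest, best_d2 = None, self.radius ** 2
        for c in self.clusters:
            d2 = (c.x - x) ** 2 + (c.y - y) ** 2
            if d2 <= best_d2:
                nearest, best_d2 = c, d2
        if nearest is None:
            # Event is not within any cluster: seed a new one.
            self.clusters.append(Cluster(x, y, t))
        else:
            # Event is within a cluster: move the cluster toward it.
            nearest.x += self.mix * (x - nearest.x)
            nearest.y += self.mix * (y - nearest.y)
            nearest.last_t = t
            nearest.n_events += 1

    def prune(self, now):
        # Step 2: drop clusters that have received no recent events.
        self.clusters = [c for c in self.clusters
                         if now - c.last_t <= self.max_idle]
```

Each cluster stores only a handful of numbers and each event touches only the cluster list, which is why no frame memory or frame-to-frame correspondence is needed.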

SLIDE 24

Robo Goalie

Delbruck et al, ISCAS 2007, Frontiers 2013

SLIDE 25

Using DVS allows 2 ms reaction time at 4% processor load with USB bus connections

SLIDE 26

This talk has 4 parts

  • Dynamic Vision Sensor Silicon Retinas
  • Simple object tracking by algorithmic processing of events
  • Using probabilistic methods for state estimation
  • “Data-driven” deep inference with CNNs

SLIDE 27

Simultaneous Mosaicing and Tracking with DVS

Hanme Kim, A. Handa, … Andy J. Davison, BMVC 2014.

SLIDE 28

Simultaneous Mosaicing and Tracking with DVS

Hanme Kim, A. Handa, … Andy J. Davison, BMVC 2014.

SLIDE 29

Goal: To do event-based, semi-dense visual odometry

  • We want to estimate state vector s (camera pose, visual scene spatial brightness gradients and sensor event thresholds) using Bayesian filtering from the events e: p(s|e)
  • Sensor likelihood p(e|s) is modeled as a mixture of an inlier Gaussian distribution and an outlier uniform distribution
  • A tractable posterior q(s|e) ≈ p(s|e) is obtained by minimizing the Kullback-Leibler (KL) divergence
  • Leads to closed-form update equations in the form of a classical Kalman filter, thus computationally efficient (unlike particle filtering)

G. Gallego, E. Mueggler, D. Scaramuzza (submitted to PAMI, 2016)
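
To make the sensor likelihood on this slide concrete, the sketch below evaluates a mixture of an inlier Gaussian and an outlier uniform distribution for a scalar measurement residual, plus the probability that a given residual came from the inlier component. The scalar residual, the function names, and all parameter values are assumptions for illustration. The KL-divergence approximation and the Kalman-form update from the paper are not reproduced here.

```python
import math

def gaussian_pdf(residual, sigma):
    """Zero-mean Gaussian density evaluated at residual."""
    return math.exp(-0.5 * (residual / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_likelihood(residual, sigma, inlier_prob, outlier_range):
    """Inlier Gaussian plus outlier uniform over an interval of width outlier_range."""
    inlier = inlier_prob * gaussian_pdf(residual, sigma)
    outlier = (1.0 - inlier_prob) / outlier_range
    return inlier + outlier

def inlier_responsibility(residual, sigma, inlier_prob, outlier_range):
    """Posterior probability that the measurement is an inlier."""
    inlier = inlier_prob * gaussian_pdf(residual, sigma)
    return inlier / mixture_likelihood(residual, sigma, inlier_prob, outlier_range)

for r in (0.0, 2.0, 5.0):
    print(r, inlier_responsibility(r, sigma=1.0, inlier_prob=0.9, outlier_range=10.0))
```

The uniform term keeps the likelihood from collapsing to zero for large residuals, so a single noise event cannot dominate the state estimate.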

SLIDE 30

Towards event-based, semi-dense SLAM: 6-DOF pose estimation

  • G. Gallego et al., PAMI (submitted 2016).
SLIDE 31

This talk has 4 parts

  • Dynamic Vision Sensor Silicon Retinas
  • Simple object tracking by algorithmic processing of events
  • Using probabilistic methods for state estimation
  • “Data-driven” deep inference with CNNs

SLIDE 32

Demo - RoShamBo

SLIDE 33

RoShamBo CNN architecture

Input: 240x180 DVS “frames”; 64x64 DVS 2D rectified histogram of 2k events (0.1Hz – 1kHz rate)

Layer                       Output
Conv 5x5                    16x60x60
MaxPool 2x2                 16x30x30
Conv 3x3                    32x28x28
MaxPool 2x2                 32x14x14
Conv 3x3                    64x12x12
MaxPool 2x2                 64x6x6
Conv 3x3                    128x4x4
MaxPool 2x2                 128x2x2
Conv 1x1 + MaxPool 2x2      128x1x1

Output classes: Paper, Scissors, Rock, Background

Total 18MOp (~9M MAC)

Compute times:

  • On 150W Core i7 PC in Caffe: 2ms
  • On 1W CNN accelerator on FPGA: 8ms

Conventional 5-layer LeNet with ReLU/MaxPool and 1 FC layer before output.

I.-A. Lungu, F. Corradi, and T. Delbruck, “Live Demonstration: Convolutional Neural Network Driven by Dynamic Vision Sensor Playing RoShamBo,” in 2017 IEEE Symposium on Circuits and Systems (ISCAS 2017), Baltimore, MD, USA, 2017.
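
The layer sizes and operation count on this slide can be cross-checked by tracing the shapes through the network. The sketch assumes a single-channel 64x64 input, stride-1 convolutions without padding, and non-overlapping pooling. It ignores biases and the final fully connected layer. These assumptions are inferred from the listed output sizes, not stated on the slide.

```python
# (kind, kernel, output channels) for conv; (kind, window) for pooling.
LAYERS = [
    ("conv", 5, 16), ("pool", 2),
    ("conv", 3, 32), ("pool", 2),
    ("conv", 3, 64), ("pool", 2),
    ("conv", 3, 128), ("pool", 2),
    ("conv", 1, 128), ("pool", 2),
]

def trace_network(layers, in_channels=1, in_size=64):
    """Return (shapes, total_macs) for stride-1, unpadded convolutions."""
    channels, size, total_macs, shapes = in_channels, in_size, 0, []
    for layer in layers:
        if layer[0] == "conv":
            _, kernel, out_channels = layer
            size = size - kernel + 1
            # One MAC per kernel tap, input channel, output channel, output pixel.
            total_macs += kernel * kernel * channels * out_channels * size * size
            channels = out_channels
        else:
            size //= layer[1]
        shapes.append((channels, size, size))
    return shapes, total_macs

shapes, total_macs = trace_network(LAYERS)
for layer, shape in zip(LAYERS, shapes):
    print(layer, "->", "x".join(map(str, shape)))
print(f"total: {total_macs / 1e6:.2f} M MAC, {2 * total_macs / 1e6:.1f} MOp")
```

Counting two operations per multiply-accumulate, the total comes out close to the 18MOp (~9M MAC) quoted on the slide.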

SLIDE 34

RoShamBo training images

I.-A. Lungu, F. Corradi, and T. Delbruck, “Live Demonstration: Convolutional Neural Network Driven by Dynamic Vision Sensor Playing RoShamBo,” in 2017 IEEE Symposium on Circuits and Systems (ISCAS 2017), Baltimore, MD, USA, 2017.
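
The network input is described as a 64x64 2D rectified histogram of 2k events. One plausible way to build such a frame is sketched below. Reading "rectified" as "ON and OFF events both count positively" and normalizing by the peak count are assumptions, as are the function name and the event tuple format. Subsampling from the full sensor resolution to 64x64 is not shown.

```python
def rectified_histogram(events, width=64, height=64, n_events=2000):
    """Accumulate a fixed number of events into a normalized 2D histogram.

    events: iterable of (x, y, polarity) with 0 <= x < width, 0 <= y < height.
    Returns a height x width list of lists with values in [0, 1].
    """
    frame = [[0] * width for _ in range(height)]
    count = 0
    for x, y, _polarity in events:
        frame[y][x] += 1   # rectified: polarity is ignored (assumption)
        count += 1
        if count == n_events:
            break
    peak = max(max(row) for row in frame)
    if peak == 0:
        return frame
    return [[value / peak for value in row] for row in frame]
```

Because each frame holds a fixed number of events rather than a fixed time slice, the frame rate follows scene activity, which is consistent with the 0.1Hz – 1kHz range quoted for the network input.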

SLIDE 35

  • A. Aimar, E. Calabrese, H. Mostafa, A. Rios-Navarro, R. Tapiador, I.-A. Lungu, A. Jimenez-Fernandez, F. Corradi, S.-C. Liu, A. Linares-Barranco, and T. Delbruck, “Nullhop: Flexibly efficient FPGA CNN accelerator driven by DAVIS neuromorphic vision sensor,” in NIPS 2016, Barcelona, 2016.

SLIDE 36

Conclusions

  • 1. The DVS was developed by following a neuromorphic approach of emulating key properties of biological retinas
  • 2. The wide dynamic range and sparse, quick output make these sensors useful in real time uncontrolled conditions
  • 3. Applications could include vision prosthetics, surveillance, robotics and consumer electronics
  • 4. The precise timing could improve learning and inference
  • 5. The main challenges are to reduce pixel size and to develop effective algorithms. Only industry can do the first but academia has plenty of room to play for the second.
  • 6. Event sensors can nicely drive deep inference. There is a lot of room for improvement of deep inference power efficiency at the system level!