Hardware-accelerated CCD readout smear correction for Fast Solar Polarimeter


Welcome. Hardware-accelerated CCD readout smear correction for Fast Solar Polarimeter. Stefan Tabel, Walter Stechele and Korbinian Weikl

SLIDE 1

Hardware-accelerated CCD readout smear correction for Fast Solar Polarimeter

IEEE ASAP 2017

Monday July 10th, Session 3: Image Processing

IEEE ASAP 2017 Stefan Tabel, MPG Semiconductor Laboratory

Welcome

Stefan Tabel and Korbinian Weikl, Semiconductor Laboratory of the Max Planck Society, Munich, Germany

Walter Stechele, Chair for Integrated Systems, Technical University of Munich, Munich, Germany

SLIDE 2

Related projects


Fast Solar Polarimeter (FSP)
• Full custom camera
• Solar ground-based observations

1 m solar telescope SUNRISE
• On a stratosphere balloon
• Same image quality as satellites
• Lower costs

Can we install FSP on SUNRISE?
• No, readout smear will hinder the post-facto correction of image jitter.
• An online correction can solve this problem…

SLIDE 3

Readout smear models for the FSP camera


• Ground-based observations
• Constant scene
• 4 polarization states
• Circularly appearing images
• Accumulation and inversion

S: smeared column, Y: unsmeared column, k: time index

• General solution for corrected image column
• Not constant scene
• For a jittered balloon flight
• 2 x 1024 half-columns
• 512 pixel/half-column
• 400 images/second
• 1 hour burst length
• How to compute?

δ: relative transfer-time, α: relative switching-time
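To get a feeling for the scale of the computation, the sensor figures above can be turned into data rates. The sketch below is only arithmetic on the numbers from this slide. It assumes that "2 x 1024 half-columns" means two hemispheres of 1024 half-columns each and that every sample is 16 bit wide (Uint16, as stated on a later slide). All constant names are made up for this illustration.

```python
# Sensor figures from the slide. Assumptions: two hemispheres of 1024
# half-columns each, and 16 bit per sample (Uint16 image data).
HEMISPHERES = 2
HALF_COLUMNS_PER_HEMISPHERE = 1024
PIXELS_PER_HALF_COLUMN = 512
IMAGES_PER_SECOND = 400
BITS_PER_SAMPLE = 16
BURST_SECONDS = 3600  # 1 hour burst length

samples_per_hemisphere = (
    HALF_COLUMNS_PER_HEMISPHERE * PIXELS_PER_HALF_COLUMN * IMAGES_PER_SECOND
)
total_samples = samples_per_hemisphere * HEMISPHERES
bits_per_second = total_samples * BITS_PER_SAMPLE
burst_bytes = bits_per_second // 8 * BURST_SECONDS

print(f"samples/s per hemisphere: {samples_per_hemisphere / 1e6:.1f} M")
print(f"raw data rate:            {bits_per_second / 1e9:.2f} Gbit/s")
print(f"one-hour burst:           {burst_bytes / 1e12:.2f} TB")
```

Under these assumptions the result is roughly 209.7 M samples per second per hemisphere and about 6.7 Gbit/s in total, which agrees with the figures quoted on the design space exploration slide.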

SLIDE 4

Optimization of the algorithm


• Quadratic complexity
• Undefined length of series
• Convergent
• Correction via n successors only
• Approximation with fake assumption of periodicity
• Matrix becomes circulant
• The inverse of a circulant matrix is circulant
• Matrix-vector multiplication with a circulant matrix is a convolution
• A block of a circulant matrix is of Toeplitz type
• Each Toeplitz matrix can be extended to a circulant matrix
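The circulant-matrix facts listed above can be checked numerically. The following NumPy sketch is not the camera's smear model: the kernel values are hypothetical and all function names are chosen for this illustration. It shows that a circulant matrix-vector product is a circular convolution (computable with FFTs), that the inverse is again circulant, and that a Toeplitz matrix can be embedded in a larger circulant one.

```python
import numpy as np

def circulant(first_col):
    """Dense circulant matrix with the given first column."""
    n = len(first_col)
    idx = (np.arange(n)[:, None] - np.arange(n)[None, :]) % n
    return np.asarray(first_col)[idx]

def circulant_matvec(first_col, x):
    """C @ x as a circular convolution, evaluated with FFTs."""
    return np.fft.ifft(np.fft.fft(first_col) * np.fft.fft(x)).real

def circulant_solve(first_col, s):
    """Solve C @ y = s by dividing in the Fourier domain."""
    return np.fft.ifft(np.fft.fft(s) / np.fft.fft(first_col)).real

def embed_toeplitz(first_col, first_row):
    """First column of a 2n x 2n circulant whose top-left n x n block
    equals the Toeplitz matrix given by first_col and first_row."""
    return np.concatenate([first_col, [0.0], first_row[:0:-1]])

# Hypothetical kernel (illustrative values only, not from the slides).
kernel = np.array([1.0, 0.05, 0.03, 0.0, 0.0, 0.0, 0.0, 0.0])
rng = np.random.default_rng(0)
y = rng.random(8)            # "unsmeared" test vector
C = circulant(kernel)
s = C @ y                    # dense product, quadratic cost

# Same product and its inversion through FFTs.
print(np.allclose(s, circulant_matvec(kernel, y)))
print(np.allclose(circulant_solve(kernel, s), y))

# The inverse of a circulant matrix is circulant.
inverse_first_col = np.fft.ifft(1.0 / np.fft.fft(kernel)).real
print(np.allclose(np.linalg.inv(C), circulant(inverse_first_col)))

# A 3 x 3 Toeplitz matrix recovered as a block of a 6 x 6 circulant.
t_col = np.array([4.0, 3.0, 2.0])
t_row = np.array([4.0, 5.0, 6.0])
toeplitz_block = circulant(embed_toeplitz(t_col, t_row))[:3, :3]
print(toeplitz_block)
```

The FFT route replaces the quadratic-cost dense product with transforms of cost O(N log N), which is the motivation for the FFT, multiplication, IFFT structure used later in the talk.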


SLIDE 5

Design space exploration 1


• Study and single unit => no ASIC
• FPGA instead of CPU / GPGPU:
  • Power dissipation in the stratosphere
  • 10G Ethernet peripherals on-chip
  • No need for hosts

=> Focus on Xilinx FFT cores:
• Uint16 image data should be transformed using a 31 bit fixed-point transform
• The correction needs to be done in single precision floating point
• Choose a mixed model with n ∈ [4:6]
• Twiddle factor width is 24 bit
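The mixed number format described above (integer image data, a transform, a single precision multiplication, and a transform back) can be emulated in software to see where rounding enters. The sketch below is a rough NumPy emulation only, not a bit-true model of the Xilinx FFT core: the double precision `np.fft` calls merely stand in for the fixed-point transforms, and `correct_column` and `rom_constants` are hypothetical names.

```python
import numpy as np

def correct_column(column_u16, rom_constants):
    """Emulate: FFT -> cast to single precision -> multiply with
    constants -> IFFT -> round back to Uint16."""
    # Stand-in for the fixed-point forward transform.
    spectrum = np.fft.fft(column_u16.astype(np.float64))
    # Cast to single precision and apply the constant correction values.
    product = spectrum.astype(np.complex64) * rom_constants.astype(np.complex64)
    # Stand-in for the fixed-point inverse transform.
    restored = np.fft.ifft(product).real
    # Round and saturate to the Uint16 output range.
    return np.clip(np.rint(restored), 0, 65535).astype(np.uint16)

# Hypothetical test data: one half-column of 512 pixels.
rng = np.random.default_rng(1)
column = rng.integers(0, 60000, size=512, dtype=np.uint16)

# With all constants equal to 1 the pipeline should return its input,
# apart from rounding introduced by the single precision stage.
identity = np.ones(512, dtype=np.complex64)
out = correct_column(column, identity)
max_error = int(np.max(np.abs(out.astype(np.int64) - column.astype(np.int64))))
print("max round-trip error in Uint16 counts:", max_error)
```

A model like this can help to judge how much precision the multiplication stage needs before committing to a hardware word width.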

SLIDE 6

Design space exploration 2


• NetFPGA SUME offers QDR II+ SRAM
• 6.7 Gbps Ethernet stream
• 209 M samples per hemisphere
• Requirements:
  • Rotation of the image
  • Parallelization
  • FFT, multiplication, IFFT
• Degrees of freedom
  • DDR3 vs. QDR II+ => simple design for feasibility study targeting a single-unit camera
  • Sequential vs. parallel algorithm => parallel version is always fast, slightly more expensive in logic, can be built in before the RAM, and can be easily configured to different depths of correction
  • Order of RAM and FFT => FFT before RAM would increase memory costs
• Tasks
  • Use one RAM-module per hemisphere, rotate image during write access
  • Readout of parallel image-data
  • Parallel fixed-point FFT
  • Cast to single precision floating point, multiply with constants, cast to fixed-point, IFFT
  • Interface 10Gig Ethernet
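Rotating the image during write access can be pictured as an address calculation: pixels arrive row by row, but each one is written to an address that places the pixels of one column next to each other. The sketch below is a software illustration under assumptions, not the actual memory controller. It assumes the row index occupies the least significant address bits, and `write_address`, `store_frame` and `read_column` are hypothetical helper names.

```python
def write_address(row, col, row_bits):
    """Address with the row index in the least significant bits, so that
    consecutive addresses walk down one column."""
    return (col << row_bits) | row

def store_frame(frame, row_bits):
    """Write a frame pixel by pixel in arrival order (row by row)."""
    rows, cols = len(frame), len(frame[0])
    assert rows <= 1 << row_bits, "row index must fit into row_bits"
    memory = [0] * (cols << row_bits)
    for row in range(rows):
        for col in range(cols):
            memory[write_address(row, col, row_bits)] = frame[row][col]
    return memory

def read_column(memory, col, rows, row_bits):
    """Read one complete column as a single contiguous burst."""
    base = col << row_bits
    return memory[base:base + rows]

# Tiny hypothetical frame: 4 rows x 3 columns, pixel value = 10 * row + col.
frame = [[10 * r + c for c in range(3)] for r in range(4)]
memory = store_frame(frame, row_bits=2)
column_1 = read_column(memory, col=1, rows=4, row_bits=2)
print(column_1)  # [1, 11, 21, 31]
```

For the sensor described earlier, 512 pixels per half-column would need 9 row bits under this layout. Single-pixel writes to scattered addresses are the price for contiguous column reads.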

SLIDE 7

Memory and logic design


• 209 M pix./sec. @ 225 MHz write
• Write single pixel for image rotation
• Row index @ LSB for column access during read burst
• Each word serves as ring buffer for image bursts
• A crossbar is necessary at read side
• n times higher throughput @ read
• n parallel and synchronous inputs
• Correction values are constant (ROM)
• Synchronous calculations
• Higher throughput than in stream
• FFT modules are extended with typecasts
• One FFT module transforms 2 signals
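The slide does not say how one FFT module transforms two signals. One common way, shown here purely as an assumption, is to pack two real-valued signals into the real and imaginary part of a single complex input and to separate the two spectra afterwards using conjugate symmetry. The function name `fft_two_real` is made up for this sketch.

```python
import numpy as np

def fft_two_real(a, b):
    """Spectra of two real signals obtained from one complex FFT."""
    z = np.fft.fft(a + 1j * b)
    # conj(Z[(-k) mod N]) for every k
    z_rev = np.conj(np.roll(z[::-1], 1))
    spectrum_a = 0.5 * (z + z_rev)
    spectrum_b = -0.5j * (z - z_rev)
    return spectrum_a, spectrum_b

# Two hypothetical real-valued columns of 512 pixels each.
rng = np.random.default_rng(2)
a = rng.random(512)
b = rng.random(512)
spectrum_a, spectrum_b = fft_two_real(a, b)

print(np.allclose(spectrum_a, np.fft.fft(a)))
print(np.allclose(spectrum_b, np.fft.fft(b)))
```

Since image columns are real-valued, a scheme of this kind would halve the number of transforms needed per pair of columns.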

SLIDE 8

Parallelization


• Throughput and capacity require one RAM per hemisphere
• Parallel algorithm forces temporal multiplexing on two logic pipelines per RAM (zero insertion)
• Sequential variant can be built with lower logic resources at the cost of RAM
• Twice the clock rate at 2 pipelines did not meet timing constraints
• No buffers at the memory interfaces, straightforward stream

[Figure: 1 sensor, 2 RAMs, 4 pipelines, 1 stream]

SLIDE 9

Results and tests


• Implementation: n = 4
• SRAM not included in table
• Correction for n = 4 => max. error = 2 (Uint16)
• Correction for n = 6 => max. error = 1 (Uint16)
• Cutoff due to noise => 3 bit in Uint16
• Model-based, co-design with camera
• Separate throughput test, later testing
• Readout smear is a convolution
• Stepwise correction removes copies of the image
• FPGA module allows the FSP camera to be used on the SUNRISE balloon mission

SLIDE 10

That's it!

Thank you very much for your interest! Your questions, please.