Jason Manley, Aaron Parsons, Don Backer, Henry Chen, Terry Filiba, David MacMahon, Peter McMahon, Arash Parsa, Andrew Siemion, Dan Werthimer, Mel Wright
Outline
- What is a correlator?
- Scalable packetized correlators:
– The architecture
– The hardware
– The software
– The cost
- Closing thoughts
- Walk through actual design
- Questions and comments
Interferometry…
Basic idea
[Figure: signal amplitude versus time. Block diagram of the correlation of antenna voltages Vi, Vj and Vk, with delay elements (Z^-n), 90° phase shifts and accumulators (∑) producing the products Vii, Vij, Vik, Vjj, Vjk and Vkk.]
“Actual” FX Correlator
[Figure: each antenna input passes through a delay (Z^-n) and an FFT; the FFT outputs feed accumulators (∑).]
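The FX idea (Fourier transform each antenna first, then cross-multiply and accumulate per frequency channel) can be sketched in a few lines of NumPy. This is only an illustrative software model, not the FPGA design: the function name `fx_correlate`, the plain FFT (no polyphase filter, no delay correction) and the floating-point arithmetic are all assumptions made for the example.

```python
import numpy as np

def fx_correlate(voltages, n_chan):
    """Toy FX correlator (hypothetical helper, not CASPER code).

    voltages: real array of shape (n_ant, n_samples).
    Returns {(i, j): accumulated cross-power spectrum with n_chan channels}.
    """
    n_ant, n_samp = voltages.shape
    block = 2 * n_chan                      # real samples per spectrum
    n_spectra = n_samp // block
    # F stage: cut each antenna stream into blocks and FFT every block.
    blocks = voltages[:, :n_spectra * block].reshape(n_ant, n_spectra, block)
    spectra = np.fft.rfft(blocks, axis=2)[:, :, :n_chan]   # drop Nyquist bin
    # X stage: multiply every antenna pair channel by channel, then accumulate.
    vis = {}
    for i in range(n_ant):
        for j in range(i, n_ant):
            vis[(i, j)] = np.sum(spectra[i] * np.conj(spectra[j]), axis=0)
    return vis
```

For three antennas this yields six products (three auto-correlations and three cross-correlations), matching the Vii, Vij, Vik, Vjj, Vjk, Vkk terms of the basic idea.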
CASPER DSP backend concept
[Figure: ADCs feed polyphase filter banks (PFBs) on FPGA DSP modules. A commercial off-the-shelf multicast 10 Gbps (10GbE or InfiniBand) switch connects them to a reconfigurable compute cluster of FPGA DSP modules and general-purpose CPUs, which implements the correlator, beamformers/spectrometers and pulsar timer.]
Design Philosophy
- Standardized processing hardware
- Commercial interconnect
- Asynchronous compute engines
- Synchronization using common 1PPS
- UDP output delivery over Ethernet network
- Correlator scales with your array
CASPER FX Architecture
[Figure: F Engines 0 to N-1 connect through a 10GbE switch to X Engines 0 to N-1.]
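One way to picture what the switch does between the F and X engines is as a "corner turn": every F engine holds all channels of one antenna, while every X engine needs all antennas for some of the channels. The sketch below assumes each X engine takes an equal, contiguous slice of the band; that split, and the names `channels_for_x_engine` and `corner_turn`, are assumptions for illustration and are not taken from the slides.

```python
import numpy as np

def channels_for_x_engine(x_id, n_chan, n_xeng):
    """Channel slice handled by X engine x_id (assumed contiguous, equal split)."""
    if n_chan % n_xeng:
        raise ValueError("n_chan must divide evenly among the X engines")
    per_engine = n_chan // n_xeng
    return slice(x_id * per_engine, (x_id + 1) * per_engine)

def corner_turn(f_outputs, n_xeng):
    """Model the switch: regroup F engine outputs by destination X engine.

    f_outputs: array of shape (n_ant, n_chan), one spectrum per F engine.
    Returns a list with one (n_ant, n_chan / n_xeng) array per X engine.
    """
    n_ant, n_chan = f_outputs.shape
    return [f_outputs[:, channels_for_x_engine(x, n_chan, n_xeng)]
            for x in range(n_xeng)]
```

With 8 antennas, 16 channels and 4 X engines, each X engine would receive an 8 × 4 block: all antennas, one quarter of the band.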
Implementation
[Figure: F Engines 0 to N-1, a 10GbE switch, and X Engines 0 to N-1.]
Architecture to hardware mapping
Example 8 Antenna system
[Figure: four iBOBs, each hosting two F engines, connect through a 10GbE switch to a BEE2 whose four user FPGAs each host two X engines.]
F Engine Operations
[Figure: F engine signal chain with the blocks ADC, DDC, Channelize, Quantize and Reformat, with output to the X Engine.]
- Two F engines per iBOB
- Dual polarization design
- Currently uses ASTRO library
- Currently processes data at native clock rate (<200 MHz on iBOB or <400 MHz on ROACH)
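The channelize and quantize steps can be modelled in software as follows. This is a minimal sketch under stated assumptions: a 4-tap Hamming-windowed sinc prototype filter, real input, and 4-bit quantization are choices made for the example, and `pfb_channelize` and `quantize` are hypothetical names, not functions from the ASTRO or CASPER libraries.

```python
import numpy as np

def pfb_channelize(x, n_chan, n_taps=4):
    """Critically sampled polyphase filter bank for a real input stream.

    Returns complex spectra of shape (n_spectra, n_chan).
    """
    m = 2 * n_chan                               # FFT length for real input
    n_blocks = len(x) // m
    # Prototype low-pass filter: windowed sinc spanning n_taps FFT blocks.
    t = np.arange(n_taps * m)
    proto = np.sinc((t - (n_taps * m - 1) / 2) / m) * np.hamming(n_taps * m)
    proto = proto.reshape(n_taps, m)
    blocks = np.asarray(x[:n_blocks * m]).reshape(n_blocks, m)
    out = []
    for k in range(n_blocks - n_taps + 1):
        # Weight n_taps consecutive blocks, sum them, then take one FFT.
        summed = np.sum(blocks[k:k + n_taps] * proto, axis=0)
        out.append(np.fft.rfft(summed)[:n_chan])
    return np.array(out)

def quantize(spectra, n_bits=4, scale=1.0):
    """Round real and imaginary parts to signed n_bits integers (assumed width)."""
    lim = 2 ** (n_bits - 1) - 1
    re = np.clip(np.round(spectra.real * scale), -lim, lim).astype(np.int8)
    im = np.clip(np.round(spectra.imag * scale), -lim, lim).astype(np.int8)
    return re, im
```

A tone centred on channel 5 of a 16-channel filter bank should appear as a peak in channel 5 of every output spectrum; the `scale` factor stands in for whatever gain setting precedes quantization.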
Setup and Control
- Clocks:
– X engines each run off an independent clock
– Sampling is synchronized at the F engines, but the clock is not distributed to the X engines
- Synchronized using a global 1PPS signal at the ADCs
– Propagated to the X engines using out-of-band signaling on XAUI links
– Headers label the 10GbE packet data
- System control: separate 100Mbps Ethernet network on BEE2
- F engines configured from BEEs through XAUI links
- Control packets: CASPER UDP framework on BEE2 control FPGA
- Execute Python scripts for configuration, control and debugging
F engine development
- 2008:
– Coarse delays (cable length compensation)
– Fringe-stopping & fine delays
– Walsh code generation and phase switching
– Real sampling (low bandwidth)
– Parallel streams (high bandwidth)
- Future:
– Ability to output subset of band
– Spectral zoom modes
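Walsh-code phase switching, listed above, relies on giving each antenna a ±1 sequence that is orthogonal to every other antenna's sequence. A rough sketch of the principle, assuming the Sylvester (Hadamard) construction in natural order rather than sequency order; `walsh_codes` is a hypothetical helper and says nothing about how the F engine generates its codes.

```python
import numpy as np

def walsh_codes(n):
    """Return an n x n matrix of +1/-1 with mutually orthogonal rows.

    n must be a power of two (Sylvester construction of a Hadamard matrix).
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h

codes = walsh_codes(8)
signal = np.full(8, 3.0)                   # wanted signal over one code period
crosstalk = 0.5                            # pickup added after the phase switch
received = signal * codes[1] + crosstalk   # signal is modulated, pickup is not
recovered = np.mean(received * codes[1])   # demodulate with the same code
```

Demodulating restores the signal (the code squared is 1) while the unmodulated pickup is multiplied by a zero-mean code and averages away over a full code period.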
X Engine Operations
- Using CASPER library
- Scales with 2^N antennas
- Fit as many X engines on an FPGA as possible (2 × 16-antenna X engines on a BEE2 user FPGA)
[Figure: X engine data path with the blocks F Engine, 10GbE, Buffer, X Eng and Accum.]
Backend Software
- UDP packets received
- Currently received, parsed and saved in MIRIAD file format by a single computer
- Computing requirements depend on the experiment
- A single computer is usually sufficient: 128 antennas, 1 s integrations, 2k channels = 512 MB/s
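The quoted output rate can be checked with a short calculation. The sketch assumes dual polarization (four polarization products per baseline), 8 bytes per complex visibility (two 32-bit values) and that auto-correlations are included; none of these are stated on the slide, and `output_rate_bytes_per_s` is a name invented for the example.

```python
def output_rate_bytes_per_s(n_ant, n_chan, int_time_s,
                            n_pol_products=4, bytes_per_vis=8):
    """Correlator output rate under the assumptions named above."""
    n_baselines = n_ant * (n_ant + 1) // 2      # cross plus auto-correlations
    bytes_per_dump = n_baselines * n_pol_products * n_chan * bytes_per_vis
    return bytes_per_dump / int_time_s

rate = output_rate_bytes_per_s(n_ant=128, n_chan=2048, int_time_s=1.0)
print(f"{rate / 2**20:.0f} MiB/s")
```

Under these assumptions the result is 516 MiB/s, in line with the roughly 512 MB/s quoted above (using N²/2 = 8192 baselines instead of N(N+1)/2 = 8256 gives exactly 512 MiB/s).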
Pending systems
- Bench sys: 8ant, DP, 200MHz, 2k ch
- PAPER: 128ant, DP, 100MHz, 2k ch
- KAT-7: 8ant, DP, 256MHz, 2k ch
- meerKAT: 80ant, DP, 1GHz, 16k ch
- Bologna: 32ant, SP, 32MHz, 1k ch
- GMRT: 32ant, DP, 400MHz, 4k-8k ch
How does it scale
[Figure: scaling plot for F Engines and X Engines, with a logarithmic axis from 1 to 1,000,000.]
FPGA Roadmap
- Processing power doubling every two years
- V4 = ½ power requirements of V2Pro*
* Manufacturer's claim (Xilinx Inc.)
[Figure: Xilinx Virtex family logic cells (thousands) by year: 43 (2000), 100 (2002), 200 (2004), 330 (2006).]
Coming soon…
- Optional 10 Gbps output, allowing integrations of ~10 ms
- More efficient use of hardware DSP slices
- High speed, scalable, distributed data capture software
- Walsh codes and phase switching
- Phase rotation
- 64 antenna design
- Upgrade to 4096 channels
- ROACH hardware:
– <400 MHz bandwidth
– 16 384 channels
– 128 antennas
– no architectural changes
Questions and Comments
Visit the CASPER correlator page:
http://casper.berkeley.edu/wiki/index.php?title=Correlator
Add your own requirements:
http://casper.berkeley.edu/wiki/index.php?title=International_Correlator_Collaboration
Email me: jason_manley@hotmail.com
PFB-FFT response
Current uses: Pocket Spectrometer
- Using ATMEL ADCs at 2 Gsamples/sec
- Performing 4 real FFTs in 1 (complex) biplex pipelined FFT module
- 2048 channels
- Uses just 1 ADC, 1 IBOB, and your