Efficient Imaging in Radio Astronomy using GPUs

Netherlands Institute for Radio Astronomy. Efficient Imaging in Radio Astronomy using GPUs. Bram Veenboer, Matthias Petschow and John W. Romein. Tuesday 9th May, 2017, GTC 2017, San Jose, USA. ASTRON is part of the Netherlands Organisation for Scientific Research (NWO).

Radio Astronomy. Array of antennas and/or dishes. Radio frequencies (30-240 MHz). Map of radio sources. Figures: LOFAR, The Netherlands; Boötes field, > 1000 megapixel. Image credits: Wendy Williams, Reinout van Weeren and Huub Röttgering.

Square Kilometre Array. SKA1 Mid, Africa; SKA1 Low, Australia.

Square Kilometre Array.

Imaging in Radio Astronomy. Convert measurements (visibilities) into a sky-image. Diagram labels: incoming radio waves, station, baseline (pair of stations), correlator, visibilities, calibration, imaging, imager, sky-model, sky-image.

Measurement equation:

$$V_{pq} = \iint_{l,m} A_p(l, m) \times B(l, m) \times e^{-2\pi i (u_{pq} l + v_{pq} m + w_{pq} n)} \, dl \, dm$$

Here $V_{pq}$ is the visibility, $(l, m)$ are the sky coordinates, $B(l, m)$ is the source brightness and $(u, v, w)$ is the visibility coordinate. $A_p(l, m)$ is the A-term, the exponential is the phasor $e^{-i\phi}$, and $w_{pq} n$ is the W-term.
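
As an illustration of the measurement equation above, the integral can be approximated by a sum over discrete point sources. This is a minimal sketch, not the authors' code: the function name `predict_visibility` is hypothetical, the A-term is assumed to be 1, and `n` is assumed to be `sqrt(1 - l^2 - m^2)`, which the slide does not define.

```python
import cmath
import math


def predict_visibility(sources, u, v, w):
    """Sum the measurement equation over discrete point sources.

    sources: list of (l, m, brightness) tuples; l and m are sky coordinates.
    u, v, w: visibility coordinate of one baseline, in wavelengths.
    Assumptions: A-term = 1 and n = sqrt(1 - l^2 - m^2).
    """
    total = 0j
    for l, m, brightness in sources:
        n = math.sqrt(1.0 - l * l - m * m)
        phase = -2.0 * math.pi * (u * l + v * m + w * n)
        total += brightness * cmath.exp(1j * phase)  # brightness times phasor
    return total


# A unit source at l = m = 0 observed with w = 0 has zero phase.
vis_centre = predict_visibility([(0.0, 0.0, 1.0)], u=100.0, v=50.0, w=0.0)

# An offset source rotates the phase: u * l = 0.25 gives a quarter turn.
vis_offset = predict_visibility([(0.001, 0.0, 1.0)], u=250.0, v=0.0, w=0.0)
```

Evaluating this sum for every baseline and time step is the direct (and expensive) form of the "predict" direction; gridding and degridding exist to avoid it.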

Fourier sampling. Panels: instantaneous u,v-coverage; u,v-coverage for one hour. Every baseline contributes one point (visibility). ‘Earth rotation synthesis’.
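
The way the u,v-coverage fills in over time can be sketched with the standard textbook projection of a baseline vector onto the (u, v, w) axes as a function of hour angle and declination. This is a sketch under stated assumptions: the station positions, the declination and the one-sample-per-minute cadence are invented for illustration, and baseline components are assumed to be given in wavelengths.

```python
import math


def baseline_uvw(bx, by, bz, hour_angle, declination):
    """Project a baseline (bx, by, bz), in wavelengths, onto u, v, w."""
    sh, ch = math.sin(hour_angle), math.cos(hour_angle)
    sd, cd = math.sin(declination), math.cos(declination)
    u = sh * bx + ch * by
    v = -sd * ch * bx + sd * sh * by + cd * bz
    w = cd * ch * bx - cd * sh * by + sd * bz
    return u, v, w


def uv_coverage(baselines, hour_angles, declination):
    """Every baseline contributes one (u, v) point per time sample."""
    points = []
    for bx, by, bz in baselines:
        for h in hour_angles:
            u, v, _ = baseline_uvw(bx, by, bz, h, declination)
            points.append((u, v))
    return points


# Three hypothetical stations give three baselines (pairs of stations).
stations = [(0.0, 0.0, 0.0), (300.0, 0.0, 0.0), (0.0, 400.0, 100.0)]
baselines = [
    (bx - ax, by - ay, bz - az)
    for i, (ax, ay, az) in enumerate(stations)
    for (bx, by, bz) in stations[i + 1:]
]

# One hour of Earth rotation is 15 degrees of hour angle; sample every minute.
hour_angles = [math.radians(15.0 * k / 60.0) for k in range(61)]
points = uv_coverage(baselines, hour_angles, math.radians(45.0))
```

With a single hour angle the list holds one point per baseline (the instantaneous coverage); with many hour angles each baseline traces an arc.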

‘Gridding’ visibilities. Place visibilities onto a regular Fourier grid:

$$V_{pq} = \iint_{l,m} A_p(l, m) \times B(l, m) \times e^{-2\pi i (u_{pq} l + v_{pq} m + w_{pq} n)} \, dl \, dm$$

Here $V_{pq}$ is the visibility, $(l, m)$ are the sky coordinates and $B(l, m)$ is the source brightness. The visibility coordinate $(u, v, w)$ consists of floating-point numbers. The phasor $e^{-i\phi}$ is a phase correction, the A-term corrects for ‘direction-dependent effects’, and the W-term corrects for earth curvature. Traditional approach: apply ‘convolution’ to each visibility.
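
The traditional approach can be sketched as follows: because the (u, v) coordinate is a floating-point number, each visibility is spread over the nearby grid cells with a convolution kernel. This is a toy sketch, not a production gridder: the Gaussian `gaussian_kernel` is a hypothetical stand-in for the real W- or A-projection kernels, and the grid size, support and sample values are invented.

```python
import math


def gaussian_kernel(du, dv, sigma=0.7):
    """Hypothetical stand-in for a real gridding convolution kernel."""
    return math.exp(-(du * du + dv * dv) / (2.0 * sigma * sigma))


def grid_visibility(grid, u, v, value, kernel, support):
    """Add one visibility to the grid over (2 * support + 1)^2 cells."""
    size = len(grid)
    u0, v0 = round(u), round(v)  # nearest grid cell
    for dv in range(-support, support + 1):
        for du in range(-support, support + 1):
            x, y = u0 + du, v0 + dv
            if 0 <= x < size and 0 <= y < size:
                # Weight depends on the distance from the cell to the sample.
                grid[y][x] += kernel(x - u, y - v) * value


grid = [[0j] * 8 for _ in range(8)]
grid_visibility(grid, u=3.3, v=4.6, value=1 + 0j,
                kernel=gaussian_kernel, support=1)
```

The cost per visibility grows with the square of the kernel support, which is one reason the gridding step dominates the runtime.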

Imaging example. Simulated three point sources, observed by 30 stations for 4 hours: gridded visibilities → 2D FFT → sky image.
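
The step from gridded visibilities to a sky image can be illustrated with a toy example. This sketch assumes a fully sampled 8 × 8 grid (unlike a real observation), ignores the W-term and A-term, and uses a naive inverse DFT in place of an FFT; the source positions and brightnesses are invented.

```python
import cmath
import math

N = 8  # toy grid size

# Three hypothetical point sources: (x, y, brightness) in pixel coordinates.
SOURCES = [(2, 5, 1.0), (6, 1, 0.5), (4, 4, 0.25)]


def gridded_visibilities(sources):
    """Visibilities of point sources, sampled on every cell of the grid."""
    return [
        [
            sum(b * cmath.exp(-2j * math.pi * (u * x + v * y) / N)
                for x, y, b in sources)
            for u in range(N)
        ]
        for v in range(N)
    ]


def inverse_dft_2d(grid):
    """Naive O(N^4) inverse DFT; a real imager would use an FFT."""
    n = len(grid)
    image = [[0j] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            acc = 0j
            for v in range(n):
                for u in range(n):
                    acc += grid[v][u] * cmath.exp(
                        2j * math.pi * (u * x + v * y) / n)
            image[y][x] = acc / (n * n)
    return image


image = inverse_dft_2d(gridded_visibilities(SOURCES))
```

In this idealised case `image[y][x]` recovers each source brightness at its pixel and is numerically zero elsewhere; with incomplete u,v-coverage the result is instead a "dirty" image that needs deconvolution.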

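Efficient Imaging in Radio Astronomy. Imaging cycle (diagram): the model visibilities are subtracted from the measured visibilities; gridding places the result on the Fourier grid and an iFFT produces the residual image (“image”); CLEAN extracts the bright sources into the model image; an FFT gives the Fourier grid of the model and degridding produces the model visibilities (“predict”); the output is the sky-image. Problem: the ‘gridding’ and ‘degridding’ steps are computationally very expensive. Solution: use the novel Image-Domain Gridding (IDG) algorithm on accelerators. Algorithm credits: Bas van der Tol.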

Placing visibilities onto a regular Fourier grid. Fourier domain gridding using convolution kernels: visibilities → gridder (kernel) → Fourier grid. Image domain gridding using subgrids: visibilities → gridder (kernel) → image subgrids → FFT → Fourier subgrids → adder → Fourier grid. Figure legend: visibility, convolution, pixel in subgrid.

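Image domain gridding: subgrids. Figure labels: grid, subgrid, and the visibilities $V_j$ indexed from $(1, 1)$ through $(\tilde{T}, 1)$ and $(1, \tilde{C})$ to $(\tilde{T}, \tilde{C})$. A subset ($\tilde{T} \times \tilde{C}$) of visibilities from baseline $j$ is placed onto a subgrid.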
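
The gridder → FFT → adder chain for one subgrid can be sketched as below. This is a simplified illustration, not the IDG implementation: the W-term and A-term are omitted, a naive DFT stands in for the subgrid FFT, and the names `grid_onto_subgrid`, `subgrid_to_fourier` and `add_to_grid`, the sizes and the sample visibility are all hypothetical.

```python
import cmath
import math

N = 8  # subgrid size in pixels (hypothetical)


def grid_onto_subgrid(visibilities, u0, v0):
    """Gridder: each image-domain pixel sums visibility times phasor."""
    subgrid = [[0j] * N for _ in range(N)]
    for y in range(N):
        m = (y - N // 2) / N
        for x in range(N):
            l = (x - N // 2) / N
            pixel = 0j
            for u, v, value in visibilities:
                # Phase relative to the subgrid centre (u0, v0).
                phase = 2.0 * math.pi * ((u - u0) * l + (v - v0) * m)
                pixel += value * cmath.exp(1j * phase)
            subgrid[y][x] = pixel
    return subgrid


def subgrid_to_fourier(subgrid):
    """Naive DFT of the subgrid; stands in for the subgrid FFT."""
    fourier = [[0j] * N for _ in range(N)]
    for b in range(N):
        for a in range(N):
            acc = 0j
            for y in range(N):
                m = (y - N // 2) / N
                for x in range(N):
                    l = (x - N // 2) / N
                    phase = -2.0 * math.pi * (
                        (a - N // 2) * l + (b - N // 2) * m)
                    acc += subgrid[y][x] * cmath.exp(1j * phase)
            fourier[b][a] = acc / (N * N)
    return fourier


def add_to_grid(grid, fourier, u0, v0):
    """Adder: accumulate the Fourier subgrid at its position in the grid."""
    for b in range(N):
        for a in range(N):
            grid[v0 + b - N // 2][u0 + a - N // 2] += fourier[b][a]


grid = [[0j] * 32 for _ in range(32)]
visibilities = [(18.0, 15.0, 2 + 1j)]  # one (u, v, value) sample
subgrid = grid_onto_subgrid(visibilities, u0=16, v0=16)
add_to_grid(grid, subgrid_to_fourier(subgrid), u0=16, v0=16)
```

A visibility that sits exactly on a grid point, as here, ends up in a single grid cell. A visibility at a non-integer (u, v) spreads over the whole subgrid, which plays the role of the convolution in the traditional approach.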

Image domain gridding: work distribution. (1) Work (all subgrids for a few baselines). (2) Subset of work (a number of subgrids). (3) Work element (one subgrid). (4) Pixels.

Optimizations. General: coarse-grained parallelism, vectorization, libraries; double buffering, shared memory. Application specific: fine-grained parallelism; data transpose (visibilities); data alignment (uvw coordinates). Architecture specific: computation of the phasor term ($e^{-i\phi}$). Nvidia: one special function unit (SFU) for every four/six cores. GCN: one transcendental operation per SIMD per four clock cycles.

GPU implementation. CPU threads offload subsets of work to the GPU. GPU threads perform the computation (in registers). Diagram labels: cpu, gpu, subset of work; HtoD queue, execute queue, DtoH queue; global memory, shared memory; preload into cache, load from cache, precompute / store result; gpu cores, sfu.
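
The three-queue structure (HtoD, execute, DtoH) can be mimicked on the host to show why it lets copies and computation overlap. This is an analogy in plain Python threads, not GPU code and not the authors' implementation: a real version would use the queue or stream mechanism of the GPU programming model. The stage functions and the work subsets are invented, and the queue depth of 2 is an assumption chosen to resemble double buffering.

```python
import queue
import threading

STOP = object()  # sentinel that shuts the pipeline down


def stage(work, inbox, outbox):
    """Run one pipeline stage until the sentinel arrives."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)
            return
        outbox.put(work(item))


def run_pipeline(subsets, copy_in, execute, copy_out):
    """Push subsets of work through HtoD -> execute -> DtoH stages."""
    q_htod = queue.Queue(maxsize=2)  # at most two subsets in flight
    q_exec = queue.Queue(maxsize=2)
    q_dtoh = queue.Queue(maxsize=2)
    q_done = queue.Queue()           # unbounded, drained at the end
    threads = [
        threading.Thread(target=stage, args=(copy_in, q_htod, q_exec)),
        threading.Thread(target=stage, args=(execute, q_exec, q_dtoh)),
        threading.Thread(target=stage, args=(copy_out, q_dtoh, q_done)),
    ]
    for t in threads:
        t.start()
    for subset in subsets:
        q_htod.put(subset)  # blocks while the input queue is full
    q_htod.put(STOP)
    for t in threads:
        t.join()
    results = []
    while True:
        item = q_done.get()
        if item is STOP:
            return results
        results.append(item)


# Hypothetical work: each subset is a list of numbers to square and sum.
results = run_pipeline(
    [[1, 2], [3, 4], [5]],
    copy_in=list,                                    # "host to device" copy
    execute=lambda subset: sum(x * x for x in subset),
    copy_out=lambda value: value,                    # "device to host" copy
)
```

Because each stage runs in its own thread, the copy-in of one subset can proceed while the previous subset is being executed, and results still come out in submission order.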

Results: throughput/runtime. Throughput: number of visibilities processed per second. [Figure: bar charts for Haswell, Pascal and Fiji. One panel shows runtime [seconds], broken down into gridder, subgrid-ifft, adder, grid-fft, splitter, subgrid-fft and degridder; the other shows throughput [MVisibilities/s] for gridding and degridding.] Most time is spent in the gridder/degridder. GPUs perform more than an order of magnitude better than the CPU.

Roofline analysis: overview. [Figure: roofline plot of performance [TOp/s] against operational intensity [Op/Byte] for Pascal, Fiji and Haswell, with the gridder and degridder marked for each.]
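
For readers unfamiliar with the roofline model: attainable performance is bounded by the lower of the compute peak and the memory bandwidth multiplied by the operational intensity. The sketch below uses an invented device (10 TOp/s peak, 0.5 TB/s bandwidth); these are not the measured figures for Pascal, Fiji or Haswell.

```python
def roofline(peak_ops, bandwidth, intensity):
    """Attainable performance [Op/s] at a given intensity [Op/Byte]."""
    return min(peak_ops, bandwidth * intensity)


def ridge_point(peak_ops, bandwidth):
    """Intensity [Op/Byte] above which a kernel is compute bound."""
    return peak_ops / bandwidth


# Hypothetical device, for illustration only.
PEAK = 10e12        # Op/s
BANDWIDTH = 0.5e12  # Byte/s

memory_bound = roofline(PEAK, BANDWIDTH, intensity=4)    # limited by memory
compute_bound = roofline(PEAK, BANDWIDTH, intensity=64)  # limited by compute
ridge = ridge_point(PEAK, BANDWIDTH)
```

A kernel plotted to the right of the ridge point cannot be sped up by faster memory alone, which is why the later slides look at the instruction mix and at shared memory instead.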

Performance for FMA/sincos instruction mix. [Figure: performance [GOp/s] against ρ [fma/sincos] for Pascal, Fiji and Haswell; annotations: 98 %, 59 %, 22 % and 4×.]

Roofline analysis: instruction mix. [Figure: roofline plot of performance [TOp/s] against operational intensity [Op/Byte] for Pascal, Fiji and Haswell, with the gridder and degridder marked and the device memory (DRAM) bound shown.]

Roofline analysis: shared memory. [Figure: roofline plot of performance [TOp/s] against operational intensity [Op/Byte] for Pascal, Fiji and Haswell, with the gridder and degridder marked against both the shared memory bound and the device memory (DRAM) bound.]

Results: energy consumption/efficiency. [Figure: bar charts for Haswell, Pascal and Fiji. One panel shows energy consumption [kJ], broken down into gridder, subgrid-ifft, adder, grid-fft, splitter, subgrid-fft, degridder and host; the other shows energy efficiency [GFlop/W] for the gridder and degridder.] Most energy is spent in the gridder/degridder. GPUs perform more than an order of magnitude better than the CPU.

Results: AW-projection at the cost of W-projection. [Figure: throughput [MVisibilities/s] against W-kernel size $N_W$ (64 down to 8) for W-projection and Image-Domain Gridding; the annotated ratios range from 1.1× to 2.4×.]

Conclusion. First implementations of the IDG algorithm on CPUs and GPUs. First efficient degridding implementation on GPUs ever. A thorough (roofline) analysis of the achieved performance. An assessment of energy efficiency. IDG on GPUs is a candidate to meet the demanding computational and energy efficiency constraints imposed by future telescopes such as the Square Kilometre Array (SKA). Reference: Image-Domain Gridding on Graphics Processors, Bram Veenboer, Matthias Petschow and John W. Romein, IPDPS 2017.
