FFTs of Arbitrary Dimensions on GPUs: Xiaobai Sun and Nikos Pitsianis

SLIDE 1

DUKE MIT-LL HPEC-2007

FFTs of Arbitrary Dimensions on GPUs

Xiaobai Sun and Nikos Pitsianis
Duke University
September 19, 2007
At High Performance Embedded Computing 2007, MIT-LL

SLIDE 2

Overview

  • Motivation
    – FFTs of arbitrary dimensions and their applications
    – Graphics processing units (GPUs)
  • Basic facts on dimensionality
  • FFTs on GPUs
    1. 2D FFT is chosen as the primitive one at API level
    2. 2D FFT performance is conveyed to FFTs of other dimensions
  • Experimental results
  • Discussion of related issues and works

SLIDE 3

Motivation: FFT Applications

[Diagram: processing blocks labeled Polar Format NUFFT, Remove Keystone, Motion Comp, Spatially Variant Refocus, Auto-Focus]

From S. Bellofiore and H. Schmitt at

SLIDE 4

Image degradation by traditional interpolation prior to 2-D FFT

– Loss of resolution
– Loss of data

[Figure labels: Synthetic Aperture, Ground Plane, Slant Plane, CRP, Δθ, Range Samples, PRFs, Polar Format, 2-D FFT; sizes 1800, 1800, 2048, 2048; ratios 1:1 Range, 1.25:1 Azimuth]

From S. Bellofiore and H. Schmitt at

SLIDE 5

Motivation: GPU Architecture

GPU: Graphics Processing Unit

  • Highly parallel multi-processors
  • Affordable commodity product
  • Initially dedicated to graphics processing and rendering
  • Presently capable of co-processing on desktops and laptops
  • Increasing programmability and API support
  • Image processing & rendering
  • General-purpose computing on GPUs (GP-GPU)

SLIDE 6

Basic Facts on Dimensionality

  1. In mathematics, FFTs are considered dimensionless in the sense that the factorizations can be described in a unified, recursive representation with provided scaling factors, some of which are dimension dependent. In computation, trivial scaling may be skipped.
  2. In application, it is often required that phase-frequency information, or spatial and geometric relation, be provided explicitly at input and output. FFT data are not shapeless.
  3. In architecture, extra dimensions are induced by the data access patterns most efficiently supported. FFT data are transfigured at different memory levels. GPUs support fine-granularity, 2D access to memory frames at the API level.
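The first fact can be made concrete with the standard index form of the DFT, written here as an illustration rather than as the authors' own notation. Assume n = n_1 n_2 and ω_n = e^{-2πi/n}; the symbols A, x, X, Y and the index splitting are introduced only for this sketch. A length-n vector x is viewed as an n_1-by-n_2 array A(j_1, j_2) = x(j_1 + n_1 j_2).

```latex
\begin{align*}
\text{2D DFT:}\quad
Y(k_1,k_2) &= \sum_{j_1=0}^{n_1-1} \omega_{n_1}^{j_1 k_1}
              \sum_{j_2=0}^{n_2-1} \omega_{n_2}^{j_2 k_2}\, A(j_1,j_2), \\
\text{1D DFT:}\quad
X(n_2 k_1 + k_2) &= \sum_{j_1=0}^{n_1-1} \omega_{n_1}^{j_1 k_1}
              \Bigl[\, \omega_{n}^{j_1 k_2}
              \sum_{j_2=0}^{n_2-1} \omega_{n_2}^{j_2 k_2}\, x(j_1 + n_1 j_2) \Bigr].
\end{align*}
```

The two expressions share the same pair of stages and differ only in the scaling factor ω_n^{j_1 k_2}, which depends on how the data are interpreted dimensionally.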

SLIDE 7

2D FFT as the Primitive

  • 2D FFT
  • 2D data placement
    – Two complex numbers per pixel vector (4 floating point numbers): one at the front, one at the back
    – Even columns at the front layers, odd columns at the back layers
  • 2D array operations through
    – making best use of the architectural support for 2D data access at the API level
    – Radix-2, radix-3 and mixed radices
  • Direct 2D bit-reversal
    – Up to certain sub-array size
    – 2D data partitioning in large data array

[Figure labels: Re(X), Im(X)]
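The data placement above can be sketched on the host side as follows. This is a minimal illustration, not the authors' GPU code: the function names `pack_columns` and `unpack_columns` are hypothetical, and the channel order (Re, Im of the even column in front, Re, Im of the odd column at the back) is an assumption.

```python
def pack_columns(x):
    """Pack an m-by-n complex array (list of rows, n even) into m-by-n/2 pixels.

    Each pixel is a 4-tuple (Re even, Im even, Re odd, Im odd): the even
    column occupies the front pair of channels, the odd column the back pair.
    The channel order is an assumption made for illustration.
    """
    pixels = []
    for row in x:
        if len(row) % 2:
            raise ValueError("row length must be even")
        pixels.append([
            (row[j].real, row[j].imag, row[j + 1].real, row[j + 1].imag)
            for j in range(0, len(row), 2)
        ])
    return pixels


def unpack_columns(pixels):
    """Inverse of pack_columns: rebuild the m-by-n complex array."""
    x = []
    for pixel_row in pixels:
        row = []
        for re_even, im_even, re_odd, im_odd in pixel_row:
            row.append(complex(re_even, im_even))
            row.append(complex(re_odd, im_odd))
        x.append(row)
    return x
```

Packing halves the number of pixels per row, so an m-by-n complex array occupies an m-by-n/2 frame of 4-component pixels.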

SLIDE 8

Radix 2, Radix 3 and Mixed Radices
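The slide's figure did not survive extraction. As a stand-in, here is a textbook recursive decimation-in-time sketch for lengths of the form 2^a 3^b. It is not the authors' GPU kernel; the names `dft` and `fft_mixed` are hypothetical, and `dft` is only a naive reference.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, used only as a reference."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def fft_mixed(x):
    """Recursive decimation-in-time FFT for lengths 2**a * 3**b."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2 == 0:
        r = 2
    elif n % 3 == 0:
        r = 3
    else:
        raise ValueError("length must have only factors 2 and 3")
    m = n // r
    # r sub-transforms of length m on the strided subsequences x[s], x[s+r], ...
    subs = [fft_mixed(x[s::r]) for s in range(r)]
    out = [0j] * n
    for k in range(m):
        # Scale the s-th sub-transform output by the twiddle factor w_n^(s*k).
        t = [subs[s][k] * cmath.exp(-2j * cmath.pi * s * k / n) for s in range(r)]
        for q in range(r):
            # Radix-r butterfly: a length-r DFT across the r scaled values.
            out[k + q * m] = sum(
                t[s] * cmath.exp(-2j * cmath.pi * s * q / r) for s in range(r)
            )
    return out
```

The radix is chosen per recursion level, so a length such as 12 = 2 · 2 · 3 mixes radix-2 and radix-3 butterflies.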

SLIDE 9

Direct Two Dimensional Bit Reversal

  • Not one dimension after another
  • No recursion up to a certain frame block size (low bits)
  • For large data size, block swaps (bit reversal in high bits)

X(i, j) ↔ X(R_m(i), R_n(j))
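The mapping X(i, j) ↔ X(R_m(i), R_n(j)) can be sketched as a single pass of swaps over the 2D array, instead of a row pass followed by a column pass. This is a plain host-side illustration with hypothetical names; the block partitioning for large arrays mentioned above is not shown.

```python
def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def bit_reverse_2d(x):
    """In-place 2D bit reversal of a list of rows with 2**p rows, 2**q columns."""
    rows, cols = len(x), len(x[0])
    p, q = rows.bit_length() - 1, cols.bit_length() - 1
    if rows != 1 << p or cols != 1 << q:
        raise ValueError("both dimensions must be powers of two")
    rm = [bit_reverse(i, p) for i in range(rows)]
    rn = [bit_reverse(j, q) for j in range(cols)]
    for i in range(rows):
        for j in range(cols):
            ii, jj = rm[i], rn[j]
            # Swap each pair exactly once; fixed points are left alone.
            if (ii, jj) > (i, j):
                x[i][j], x[ii][jj] = x[ii][jj], x[i][j]
    return x
```

Because bit reversal is its own inverse, applying `bit_reverse_2d` twice restores the original array.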

SLIDE 10

2D Bit Reversal

[Figure labels: 2K × 2K, 30.4 ms; 2K × 2K, 19.6 ms]

SLIDE 11

2D Bit Reversal

[Figure labels: 1K, 2K; 17.5 ms, 9.8 ms, 14.8 ms]

SLIDE 12

[Plot: 2D FFT Times. x-axis: log2 of data volume (18 to 23); y-axis: time in msec; series: Arithmetic, Bit Reversal]

SLIDE 13

[Plot: Total 2D FFT Times. x-axis: log2 of data volume (18 to 23); y-axis: time in msec; series: Write to GPU, Arithmetic, Bit Reversal, Read from GPU]

SLIDE 14

FFTs of Other Dimensions

  • 1D FFT of size n
    – Add a scaling stage in 2D FFT
  • 3D FFT of dimensions
    – Skip a scaling stage in 2D FFT (in a simple case)
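One way to read the two cases above is sketched below, under stated assumptions. A 1D transform of length n = n1·n2 runs the two stages of a 2D transform with a twiddle scaling stage in between. A 3D transform runs the same two stages on each flattened slab with the scaling skipped, followed by one more stage along the remaining axis. The names `dft`, `two_stage`, `fft1d` and `fft3d` are hypothetical, the index layout is an assumption, and the naive `dft` stands in for a real FFT stage.

```python
import cmath


def dft(x):
    """Naive O(n^2) DFT, standing in for one FFT stage."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]


def two_stage(x, n1, n2, scaling):
    """Two DFT stages over x viewed as a[j1][j2] = x[j1 + n1*j2]."""
    n = n1 * n2
    a = [[x[j1 + n1 * j2] for j2 in range(n2)] for j1 in range(n1)]
    b = [dft(row) for row in a]                      # stage 1: j2 -> k2
    if scaling:                                      # optional twiddle scaling
        b = [[b[j1][k2] * cmath.exp(-2j * cmath.pi * j1 * k2 / n)
              for k2 in range(n2)] for j1 in range(n1)]
    cols = [dft([b[j1][k2] for j1 in range(n1)]) for k2 in range(n2)]  # stage 2
    return [[cols[k2][k1] for k2 in range(n2)] for k1 in range(n1)]    # c[k1][k2]


def fft1d(x, n1, n2):
    """1D DFT of length n1*n2: the two stages plus the scaling stage."""
    c = two_stage(x, n1, n2, scaling=True)
    return [c[k1][k2] for k1 in range(n1) for k2 in range(n2)]  # X[n2*k1 + k2]


def fft3d(vol):
    """3D DFT of vol[i1][i2][i3]: scaling skipped inside each flattened slab."""
    n1, n2, n3 = len(vol), len(vol[0]), len(vol[0][0])
    slabs = []
    for i1 in range(n1):
        # flat[i2 + n2*i3] = vol[i1][i2][i3]
        flat = [vol[i1][i2][i3] for i3 in range(n3) for i2 in range(n2)]
        slabs.append(two_stage(flat, n2, n3, scaling=False))
    out = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
    for k2 in range(n2):
        for k3 in range(n3):
            col = dft([slabs[i1][k2][k3] for i1 in range(n1)])  # stage along i1
            for k1 in range(n1):
                out[k1][k2][k3] = col[k1]
    return out
```

The only difference between the 1D and the multi-dimensional paths is the `scaling` flag, so the same 2D primitive serves both.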

SLIDE 15

[Plot: 1D, 2D and 3D FFTs. x-axis: log2 of data volume (18 to 23); y-axis: compute time in msec; series: 1-D FFT, 2-D FFT, 3-D FFT]

SLIDE 16

Other Issues and Works

  • Twiddle factors:
    – Pre-calculated, partially calculated, calculated on the fly
    – Numerical behavior
  • Data loading and unloading
    – Data placement in main memory
    – A sequence of successive FFTs
  • Automated tuning
  • Other commodity products
    – IBM Cell
  • FPGAs
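The three twiddle-factor options listed above can be compared with a small host-side sketch. The function names and the coarse/fine table split are assumptions made for illustration; the slides do not say how the factors are actually generated on the GPU.

```python
import cmath


def twiddles_table(n):
    """Pre-calculated: each factor from a direct exponential evaluation."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n)]


def twiddles_on_the_fly(n):
    """Calculated on the fly by repeated multiplication.

    No table is stored, but rounding error accumulates as k grows.
    """
    w1 = cmath.exp(-2j * cmath.pi / n)
    out, w = [], 1 + 0j
    for _ in range(n):
        out.append(w)
        w *= w1
    return out


def twiddles_partial(n, block):
    """Partially calculated: a coarse table of w**(block*h) and a fine table
    of w**l, combined with one multiplication per factor."""
    n_coarse = (n + block - 1) // block
    coarse = [cmath.exp(-2j * cmath.pi * (block * h) / n) for h in range(n_coarse)]
    fine = [cmath.exp(-2j * cmath.pi * l / n) for l in range(block)]
    return [coarse[k // block] * fine[k % block] for k in range(n)]


def max_error(approx, exact):
    """Largest absolute deviation between two twiddle sequences."""
    return max(abs(a - b) for a, b in zip(approx, exact))
```

Calling `max_error(twiddles_on_the_fly(n), twiddles_table(n))` for growing n is one simple way to observe the numerical behavior. The partial scheme trades two small tables for a single multiplication per factor.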