

SLIDE 1

GPU-based Massively Parallel Implementation of Metaheuristic Algorithms

GPU-BASED MASSIVELY PARALLEL IMPLEMENTATION OF METAHEURISTIC ALGORITHMS

Robert Nowotniak, Jacek Kucharski

Computer Engineering Department Technical University of Lodz

SŁOK, June 15-17, 2011

SLIDE 2

THREAD HIERARCHY ON CUDA GPU

In CUDA, threads are grouped into blocks, and blocks constitute a grid. The unit of thread scheduling is the warp (32 threads).

[Figure: Grid of Thread Blocks]
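The index arithmetic behind this hierarchy can be sketched on the CPU. The snippet below is a minimal illustration in plain Python, assuming a one-dimensional grid and one-dimensional blocks; the function names are hypothetical, and the parameters merely mirror the CUDA built-ins `blockIdx.x`, `blockDim.x` and `threadIdx.x`.

```python
WARP_SIZE = 32  # threads per warp, as stated on the slide


def global_thread_id(block_idx: int, block_dim: int, thread_idx: int) -> int:
    """Grid-wide index of a thread, mirroring
    blockIdx.x * blockDim.x + threadIdx.x for a 1-D launch."""
    if not 0 <= thread_idx < block_dim:
        raise ValueError("thread_idx must lie in [0, block_dim)")
    return block_idx * block_dim + thread_idx


def warp_index(thread_idx: int) -> int:
    """Warp (within its block) that a thread of a 1-D block belongs to."""
    return thread_idx // WARP_SIZE


def warps_per_block(block_dim: int) -> int:
    """Number of warps needed to cover a block; the last warp may be partial."""
    return (block_dim + WARP_SIZE - 1) // WARP_SIZE
```

For example, `global_thread_id(2, 128, 5)` is 261, and a block of 100 threads occupies 4 warps, the last of which is only partly filled.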

SLIDE 3

PROPOSED APPROACH TO PARALLELIZATION

SLIDE 4

GPU-BASED IMPLEMENTATION OF METAHEURISTICS

Two levels:

1. Coarse-grained parallelization

In a grid, several hundred blocks can simultaneously evolve independent populations with the same or different parameters.

2. Fine-grained parallelization

On the population level, each individual can be evaluated and transformed in a separate GPU thread, so the whole population can be represented as one block of threads.
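The two levels can be illustrated with a small CPU-side sketch. It is not the authors' implementation: sequential Python loops stand in for GPU parallelism, and the toy OneMax fitness, the bit-flip mutation, and all function names are assumptions chosen only to make the block/thread mapping concrete.

```python
import random


def onemax(bits):
    """Toy fitness (an assumption for illustration): number of 1 bits."""
    return sum(bits)


def thread_step(individual, mutation_rate, rng):
    """Work of one 'thread': transform and evaluate a single individual."""
    child = [b ^ 1 if rng.random() < mutation_rate else b for b in individual]
    # Keep the child only if it is at least as fit as its parent.
    return child if onemax(child) >= onemax(individual) else individual


def block_evolve(block_idx, pop_size, n_bits, mutation_rate, generations):
    """Work of one 'block': evolve one independent population."""
    rng = random.Random(block_idx)  # per-block seed, so populations differ
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Fine-grained level: on a GPU, each element of this comprehension
        # would be handled by its own thread within the block.
        population = [thread_step(ind, mutation_rate, rng) for ind in population]
    return max(onemax(ind) for ind in population)


def grid_run(mutation_rates, pop_size=8, n_bits=16, generations=20):
    """Work of the 'grid': one block per parameter setting.
    Coarse-grained level: on a GPU, the blocks would run simultaneously."""
    return [block_evolve(b, pop_size, n_bits, rate, generations)
            for b, rate in enumerate(mutation_rates)]
```

Calling `grid_run([0.01, 0.05, 0.1])` evolves three populations, one per mutation rate, and returns the best fitness found in each. On a GPU the list of parameter settings would correspond to the grid dimension and `pop_size` to the block dimension.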

SLIDE 6

PERFORMANCE COMPARISON

SLIDE 8

RESULTS

1. Pentium III 500 MHz (Visual C++ 6.0)

0.723 experiments / second (according to [1])

2. Intel Core i7 2.93 GHz (1 core, ANSI C)

7.33 experiments / second

3. NVIDIA GTX 295 (CUDA C)

890 experiments / second (about 120x speedup relative to item 2)

4. 8 GPUs (GTX 295 + GTX 285 + Tesla S1070 + Tesla C2070)

3089 experiments / second (over 400x speedup relative to item 2)
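The speedup figures can be checked from the throughputs listed above. The sketch below takes the single Core i7 core as the baseline; the slide does not name the baseline explicitly, so this is an assumption, chosen because it is the reading consistent with the quoted "about 120x" and "over 400x". The dictionary keys and the function name are illustrative.

```python
# Throughput in experiments per second, copied from the slide.
THROUGHPUT = {
    "Pentium III 500 MHz": 0.723,
    "Core i7 2.93 GHz (1 core)": 7.33,
    "GTX 295": 890.0,
    "8 GPUs": 3089.0,
}


def speedup(platform, baseline="Core i7 2.93 GHz (1 core)"):
    """Ratio of a platform's throughput to the baseline's throughput.
    The default baseline is an assumption, not stated on the slide."""
    return THROUGHPUT[platform] / THROUGHPUT[baseline]
```

With this baseline, `speedup("GTX 295")` is roughly 121 and `speedup("8 GPUs")` roughly 421.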

[1] Han, K. H., Kim, J. H.: Genetic quantum algorithm and its application to combinatorial optimization problem. Proceedings of the 2000 Congress on Evolutionary Computation, 2000

SLIDE 9

CORRECTNESS VERIFICATION

SLIDE 10

CORRECTNESS VERIFICATION

SLIDE 11

Thank you for your attention
