SLIDE 1

Optimising In-Field Processing using GPUs

Tarik Saidani

Senior Software Engineer, PGS

Peng Wang

DevTech, Nvidia

SLIDE 2

From a Seismic Acquisition Survey

SLIDE 3

To a High Resolution Image of The Sea Subsurface

SLIDE 4

Problem: Source and Receiver Ghost

[Figure: marine acquisition geometry, showing the source and its far field, the direct ray-paths and the ghost ray-paths reflecting off the sea surface, the cable (receiver), and the resulting source ghost and receiver ghost]

SLIDE 5

A Ghost Free Marine Acquisition System

SLIDE 6
  • Method

    – Combine pressure and velocity sensors in a solid streamer
    – Use the complementary ghost patterns of the two sensors to remove the receiver ghost
    – Tow the dual-sensor streamer deep for low-frequency content

  • Result

    – The bandwidth of the data is increased at both low and high frequencies compared with conventional streamer data
    – There is better low-frequency penetration of the source signal
    – The acquisition method is less sensitive to weather conditions

Solution: Dual Sensor Streamer Acquisition
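The summation idea can be illustrated with a toy trace: because the receiver ghost arrives with opposite polarity on the pressure sensor (hydrophone) and the vertical-velocity sensor (geophone), averaging the two calibrated recordings cancels the ghost. A minimal sketch, with an illustrative delay and unit amplitudes, not the PGS production algorithm:

```python
# Toy dual-sensor (P + Z) summation. The receiver ghost is modelled as a
# delayed copy of the direct arrival whose polarity differs per sensor.

def make_trace(direct, ghost_polarity, ghost_delay):
    """Direct arrival plus a delayed ghost of the given polarity."""
    trace = list(direct)
    for i in range(len(direct) - ghost_delay):
        trace[i + ghost_delay] += ghost_polarity * direct[i]
    return trace

direct = [0.0] * 16
direct[2] = 1.0                      # impulsive direct arrival

p = make_trace(direct, -1.0, 5)      # hydrophone: ghost flips polarity
z = make_trace(direct, +1.0, 5)      # geophone: ghost keeps polarity

# Summing the calibrated sensors cancels the ghost, keeping the up-going field
up = [0.5 * (pi + zi) for pi, zi in zip(p, z)]
```

In practice the combination is applied per frequency with depth-dependent calibration filters; the time-domain toy above only shows the polarity argument.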

SLIDE 7

The Big Data Challenge In the Seismic Business

1995: 6 streamers
2005: 16 streamers
2015: 24 streamers

SLIDE 8

Seismic Acquisition Data Volumes

  • A typical streamer is 8000 meters long and contains 1280 receivers
  • Data is recorded in time chunks or as a continuous series at a 2 ms sample interval, generating 500 samples per second per receiver
  • A streamer (single sensor) generates 640,000 samples per second
  • One streamer spread (10 streamers) generates 6,400,000 samples per second
  • A big spread (20 streamers, dual sensor) generates 25,600,000 samples per second
  • A typical acquisition can generate multiple TBs of data per day
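These rates follow from simple arithmetic; a quick back-of-the-envelope check of the numbers as stated on the slide:

```python
# Back-of-the-envelope check of the stated acquisition data rates.
receivers_per_streamer = 1280
sample_interval_s = 0.002                       # 2 ms sampling
samples_per_receiver = 1 / sample_interval_s    # 500 samples/s

single_streamer = receivers_per_streamer * samples_per_receiver
spread_10 = 10 * single_streamer                # 10 single-sensor streamers
big_spread_dual = 20 * 2 * single_streamer      # 20 streamers, 2 sensors each

print(single_streamer)    # 640000.0 samples/s
print(spread_10)          # 6400000.0 samples/s
print(big_spread_dual)    # 25600000.0 samples/s
```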
SLIDE 9

3D Wave-field Separation Workflow

[Figure: runtime breakdown of the workflow between upsampling and receiver deghosting, with an 84 % / 16 % split of the total]

SLIDE 10

Getting the Best Possible Image from the Early Stages

[Figure: comparative images, streamer-wise wavefield separation vs. 3D wavefield separation]

SLIDE 11

Upsampling

  • Iterative process
  • Frequency (wavenumber) domain
  • Not enough parallelism in the inner loop (a few thousand threads)
  • Window parallelism not exposed in the CPU code
  • Loops restructured to expose window parallelism
  • After the code change, enough parallelism for the GPU (millions of threads)
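The restructuring can be sketched as follows; the function and variable names are illustrative, not the actual PGS code. The point is that flattening the (window, sample) loop nest turns a few thousand units of parallel work per window into one large index space that a GPU grid can cover:

```python
# Before: the outer window loop is serial, so only the inner loop's few
# thousand iterations are available as parallel work.
def process_serial_windows(windows, kernel):
    out = []
    for w in windows:                            # serial over windows
        out.append([kernel(x) for x in w])       # parallelism limited to len(w)
    return out

# After: flatten (window, sample) into one index space, so a GPU grid can
# launch n_windows * window_len independent threads at once.
def process_flat(windows, kernel):
    n, m = len(windows), len(windows[0])
    flat = [0.0] * (n * m)
    for idx in range(n * m):                     # one flat index space
        w, i = divmod(idx, m)                    # recover (window, sample)
        flat[idx] = kernel(windows[w][i])
    return [flat[w * m:(w + 1) * m] for w in range(n)]
```

Both functions compute the same result; only the iteration structure, and hence the parallelism exposed to the GPU, differs.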

SLIDE 12

Receiver Deghosting

  • Large volume of data (hydrophone and geosensor data)
  • Frequency-domain computations
  • Parallelism over traces and frequency samples
  • Fairly straightforward parallel code
  • Parallelism available at many loop levels, with large iteration counts
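As an illustration of why this maps well to the GPU, the sketch below applies an independent filter per (trace, frequency) pair, so the natural parallel domain is n_traces x n_freqs, large and flat. The ghost operator g(f) here is a generic stand-in, not PGS's actual deghosting filter:

```python
import cmath

def deghost(spectra, dt, depth, c=1500.0):
    """spectra: list of per-trace frequency samples (complex).
    Every (trace, frequency) element is processed independently."""
    n_traces, n_freqs = len(spectra), len(spectra[0])
    out = [[0j] * n_freqs for _ in range(n_traces)]
    for t in range(n_traces):          # both loops are independent:
        for k in range(n_freqs):       # parallelise over traces x frequencies
            f = k / (n_freqs * dt)
            tau = 2.0 * depth / c      # two-way delay to the sea surface
            g = 1.0 - cmath.exp(-2j * cmath.pi * f * tau)   # ghost response
            if abs(g) > 1e-3:          # naive stabilised inverse
                out[t][k] = spectra[t][k] / g
    return out
```

On a GPU each (t, k) pair becomes one thread, which is why this kernel was "fairly straightforward" to parallelise compared with the upsampling.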

SLIDE 13

Infield Constraints


  • Although she looks big in the picture, the ship has very limited space to host a compute cluster
  • Power and cooling are also limited on board a vessel
  • A CPU-based solution was considered but was quickly discarded because of the constraints described above

SLIDE 14

But Also Facing Up to New Realities …

SLIDE 15

Phase 1: Getting the Most out of the CPU Cycles

  • CPU code profiling and analysis
  • Hotspot analysis showed that not much could be improved
  • The vectorizer was not doing a great job; we had to write vector intrinsics …
  • Reached an upper bound in terms of CPU performance … not enough!

SLIDE 16

Phase 2: What Can We Do Next?

  • Parallelism already present at different levels: thread, process, vectorization …
  • We cannot rely on increasing the CPU core count because of the above constraints
  • GPU accelerators were the most obvious way forward
  • GPU prototype code:

    – Ported the streamer-wise deghosting code to the GPU
    – 25x speedup compared to a single CPU core (Haswell)
    – 7x speedup on the entire flow: interesting …

SLIDE 17

Phase 3: The Bigger Picture

  • With the streamer-wise deghosting code ported to the GPU, upsampling was the new hotspot in the flow
  • Two parallel development branches:

    – Porting the upsampling code to the GPU
    – Porting the 3D deghosting code to the GPU

  • At the end of this phase the processing flow was 15x faster
  • In the meantime an additional processing step, extrapolation, was added to the deghosting code
  • It increased the runtime and changed the application profile (50% of the runtime)

SLIDE 18

Phase 4: Putting it All Together

  • Ported the extrapolation to the GPU
  • Very similar compute kernel to the upsampling
  • The first benchmarks showed a throughput 40x faster than one CPU core
  • After running more production-like tests we achieved an impressive 100x!

SLIDE 19

Hardware Footprint


[Figure: hardware footprint, CPU-based vs. GPU-based, a 20:1 replacement ratio using the Nvidia Tesla K80]

SLIDE 20

Summary

  • Wavefield separation is a fundamental step in marine data acquisition and processing
  • It is a very demanding process in terms of compute power
  • Infield constraints rule out large-scale systems
  • To deliver an acceptable throughput within an acceptable footprint, the only viable solution is GPU-based
  • The final result showed an impressive throughput along with a very small footprint
  • It improves the geophysical quality of PGS field acquisition deliverables
  • Real-time 3D processing of data during acquisition
  • GPU deployment started on vessels in Q1 2016

SLIDE 21

Titan Class Tethys, Now With GPU-Based “3D Wavefield Separation Appliance”

SLIDE 22

Acknowledgment

Peng Wang
Ty Mckercher
Ken Hester
