
Multi-parameter Waveform Inversion with GPUs for the Cloud: A Pipelined Implementation
Huy Le*, Stewart A. Levin, and Robert G. Clapp
Geophysics Department, Stanford University
March 28, 2018

Waveform inversion
$\chi(m) = \frac{1}{2} \| f(m) - d \|_2^2$
$\chi(m)$: objective function
$m$: subsurface parameters to recover
$f(m)$: modeled data, obtained by solving wave equations
$d$: observed seismic data
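A minimal sketch of the objective function, assuming the modeled and observed data are flat lists of samples. The function name `misfit` is hypothetical and is not taken from the slides.

```python
def misfit(modeled, observed):
    """Least-squares objective: 0.5 * ||f(m) - d||_2^2.

    `modeled` stands in for f(m) and `observed` for d (assumed flat lists).
    """
    if len(modeled) != len(observed):
        raise ValueError("modeled and observed data must have the same length")
    return 0.5 * sum((f - d) ** 2 for f, d in zip(modeled, observed))
```

In practice the modeled data come from a wave-equation solve for the current model m, so each evaluation of this objective costs one forward propagation.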

Gradient-based optimization
$g(m) = \int_0^T u(m)\, v(m)\, dt$
$g(m)$: gradients
$u(m)$: source wavefields, obtained by solving forward wave equations
$v(m)$: receiver wavefields, obtained by solving adjoint wave equations in reverse time with data residuals as sources
$\int_0^T$: zero-lag temporal cross-correlation
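A minimal sketch of the zero-lag cross-correlation, assuming both wavefields are stored as lists of time snapshots indexed by the same physical time, and approximating the integral with a rectangle rule. The name `gradient` and the storage layout are assumptions made for illustration only.

```python
def gradient(u, v, dt):
    """Approximate g[x] = integral over [0, T] of u(x, t) * v(x, t) dt.

    u, v: lists of snapshots, each snapshot a list over grid cells (assumed layout).
    dt:   time step used for the rectangle-rule sum.
    """
    if len(u) != len(v):
        raise ValueError("u and v must have the same number of time steps")
    n_cells = len(u[0]) if u else 0
    g = [0.0] * n_cells
    for u_t, v_t in zip(u, v):
        for x in range(n_cells):
            g[x] += u_t[x] * v_t[x] * dt
    return g
```

Because the receiver wavefield is computed in reverse time, a real implementation has to make the source wavefield available at matching times, which is the concern the later IO slide addresses.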

Multi-parameters for better physics
Solving wave equations with multiple parameters requires more memory. For a 1000 × 1000 × 500 volume:

Physics              Parameters   Wavefields   Memory (GB)
Acoustic isotropic   1            1            4
Acoustic VTI         3            2            12
Elastic isotropic    3            9            54
Elastic VTI          6            9            108

(VTI: vertical transverse isotropic.)

Conventional domain decomposition
Divide volumes among multiple GPUs, which are potentially on different nodes.
More parameters demand more GPUs or GPUs with larger memory.
Two-way communication among devices to exchange halos.
Fast inter-nodal connection is not guaranteed, particularly on the cloud.

Pipelined approach
Thor Johnsen and Alex Loddoch (GTC 2014).
Divide computational domain along one axis into blocks.
A single GPU streams through domain block by block and updates as many time steps as possible.
Multiple updates significantly overlap host-device IO.

Stencil for 2nd-order time difference
Divide along z-axis.
Each block contains half-stencil-length number of depth slices.
Need three consecutive blocks for second derivatives.
[Figure: volume with X, Y, Z axes, divided along z. Block i at t=2 is computed from v and from block i at t=0 and t=1, together with blocks i-1 and i+1 at t=1. Horizontal axis: time.]
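The blocking rule can be sketched as follows. This assumes an 8th-order spatial stencil (half-length 4, the order quoted in the later acoustic isotropic example) and 500 depth slices. Both function names are hypothetical, and clipping the neighbor list at the domain edges is an assumption about boundary handling.

```python
def z_blocks(nz, stencil_order):
    """Split nz depth slices into blocks of half-stencil-length slices each.

    Returns a list of (z_start, z_end) pairs with z_end exclusive.
    """
    half = stencil_order // 2
    if half < 1:
        raise ValueError("stencil_order must be at least 2")
    return [(z0, min(z0 + half, nz)) for z0 in range(0, nz, half)]


def blocks_needed(i, n_blocks):
    """Blocks required to update block i: itself and its two neighbors (clipped at edges)."""
    return [j for j in (i - 1, i, i + 1) if 0 <= j < n_blocks]
```

With these assumptions, `z_blocks(500, 8)` yields 125 blocks of 4 slices, and updating an interior block such as block 5 reads blocks 4, 5, and 6.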

Pipeline iteration 0
[Figure: CPU holds blocks 0 and 1 (each with v, t=0, t=1); GPU holds block 0 (v, t=0, t=1). Legend: transfer in, update, transfer out.]

Pipeline iteration 1
[Figure: CPU holds blocks 0 to 2; GPU holds blocks 0 and 1 (v, t=0, t=1).]

Pipeline iteration 2
[Figure: CPU holds blocks 0 to 3; GPU holds blocks 0 to 2, with block 0 advanced to t=2.]

Pipeline iteration 3
[Figure: CPU holds blocks 0 to 4; GPU holds blocks 0 to 3, with block 0 at t=3 and block 1 at t=2.]

Pipeline iteration 4
[Figure: CPU holds blocks 0 to 5; GPU holds blocks 0 to 4, with blocks 0 and 1 at t=3 and block 2 at t=2.]

Pipeline iteration 5
[Figure: CPU holds blocks 0 to 6, and block 0 on the CPU now holds v, t=2, t=3 after transfer out; GPU holds blocks 0 to 5, with blocks 0 to 2 at t=3 and block 3 at t=2.]
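The staggered schedule in the iteration slides can be reproduced with a short sketch. It assumes, from the block and time-level labels on the slides, that block b is transferred in at iteration b with time levels t=0 and t=1, and that update j (j = 1..n_updates) is applied to it at iteration b + 1 + j. The function name and the generalization to other values of `n_updates` are assumptions.

```python
def latest_time_level(block, iteration, n_updates=2):
    """Latest time level held on the GPU for `block` at `iteration`.

    Returns None if the block has not been transferred in yet.
    Assumed schedule: transfer in at iteration == block (levels t=0, t=1),
    then one more update per iteration starting at iteration block + 2.
    """
    if iteration < block:
        return None
    updates_done = min(max(iteration - block - 1, 0), n_updates)
    return 1 + updates_done


# State of blocks 0..6 at pipeline iteration 5
print([latest_time_level(b, 5) for b in range(7)])
# prints [3, 3, 3, 2, 1, 1, None]
```

The one-iteration lag between neighboring blocks reflects the three-block requirement: a block can only advance once the block after it has reached the same time level.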

Streams and threads to overlap transfer and compute
The pipeline takes some iterations to initialize and drain.
Stagger tasks so that transfers and updates overlap.
cudaMemcpyAsync to copy between host and devices.
Two CPU threads to copy between swappable buffers and pinned buffers.

Pipeline for 2 GPUs
[Figure: blocks 0 to 11 passing through two GPUs chained in one pipeline. GPU0 receives blocks from the CPU and advances them from t=1 to t=3, then hands them to GPU1, which advances them to t=5. Block 0 on the CPU holds v, t=4, t=5 after transfer out. Legend: transfer in, update, transfer out.]

IO bottleneck
Computation of the gradients requires reverse-time propagation.
Absorbing boundary condition and checkpoints require three propagations, but are IO- and memory-intensive.
Solution: random boundary condition (Clapp, SEG 2009; Shen, SEG 2011).
Trade-off: gradients computed on the fly and on device but require four propagations.

Pipelines for source and receiver wavefields

Acoustic isotropic wave equation
One medium parameter and one wavefield at two consecutive time steps: 12 bytes per cell.
Example: 6 GB for a 1000 × 1000 × 500 volume and an 8th-order stencil.
CPU code: blocked, Intel Threading Building Blocks (TBB), Intel SPMD Program Compiler (ISPC), single Xeon machine with 12 cores and 24 threads.
"Optimal" speed is reached when the volume fits in one Tesla K80 GPU (12 GB global memory, 2500 threads), i.e. no domain decomposition or host-device transfer.
Pipelined code updates 94 times per host-device transfer for the same memory.
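The bytes-per-cell figure can be checked with a short sketch, assuming single-precision (4-byte) values and 1 GB = 10^9 bytes. Neither assumption is stated on the slide, and the function names are hypothetical.

```python
def bytes_per_cell(n_params, n_wavefields, n_time_levels=2, bytes_per_value=4):
    """Storage per grid cell: parameters plus wavefields at each stored time level."""
    return (n_params + n_wavefields * n_time_levels) * bytes_per_value


def volume_gb(shape, per_cell):
    """Total memory in GB (1 GB = 1e9 bytes, assumed) for a 3D grid."""
    nx, ny, nz = shape
    return nx * ny * nz * per_cell / 1e9


per_cell = bytes_per_cell(n_params=1, n_wavefields=1)
print(per_cell, volume_gb((1000, 1000, 500), per_cell))
# prints 12 6.0
```

The same arithmetic with three parameters and two wavefields gives the 28 bytes per cell quoted for the acoustic VTI case.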

Acoustic isotropic wave equation: forward modeling
[Figure: bar chart of throughput in GCells/s for three cases: CPU, Pipeline 1 GPU, and Optimal. Bar labels as extracted: 2.380, 2.220, 1.000.]

Acoustic VTI wave equations
System of two second-order wave equations, three medium parameters and two wavefields, each at two consecutive time steps: 28 bytes per cell.
Example: 28 GB for a 1000 × 1000 × 1000 volume.

Number of updates   GPU memory (GB)
2                   0.736
4                   1.024
8                   1.6
16                  2.752
32                  5.056
64                  9.664
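The tabulated GPU memory grows linearly with the number of updates. The sketch below checks this; the constants 0.144 GB per update and 0.448 GB fixed are fitted to the table values and are not stated on the slide.

```python
# GPU memory (GB) by number of updates, copied from the slide's table
TABLE_GB = {2: 0.736, 4: 1.024, 8: 1.6, 16: 2.752, 32: 5.056, 64: 9.664}


def gpu_memory_gb(n_updates, per_update_gb=0.144, fixed_gb=0.448):
    """Linear model fitted to the table: fixed cost plus a cost per update."""
    return per_update_gb * n_updates + fixed_gb


for n, gb in TABLE_GB.items():
    # Each tabulated value is reproduced to within rounding error
    assert abs(gpu_memory_gb(n) - gb) < 1e-9
```

Under this fitted model, each additional update kept in flight costs a constant amount of device memory, so the number of updates can be chosen to fit a given GPU.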

Acoustic VTI wave equations: forward modeling
[Figure: bar chart of throughput in GCells/s versus number of updates (2, 4, 8, 16, 32, 64). Bar labels as extracted: 1.834, 1.828, 1.816, 1.790, 1.003, 0.579.]

Acoustic VTI wave equations: forward modeling
8 updates (1.6 GB on GPU) completely overlap host-device transfers.
$\text{max. speed} = \dfrac{\text{bandwidth}}{\text{bytes per cell}} \times N_{\text{update}}$
$\dfrac{7\ \text{GB/s}}{28\ \text{bytes per cell}} \times 8 = 2\ \text{GCell/s}$
Achieved 1.79 GCell/s.
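The transfer-bound estimate can be written as a one-line function. The name `max_speed_gcells` is hypothetical, and the efficiency ratio is simple arithmetic on the two numbers quoted on the slide.

```python
def max_speed_gcells(bandwidth_gb_s, bytes_per_cell, n_updates):
    """Upper bound on throughput (GCell/s) when host-device transfer is the limit.

    Each cell crosses the bus once per n_updates time steps.
    """
    return bandwidth_gb_s / bytes_per_cell * n_updates


bound = max_speed_gcells(7, 28, 8)
print(bound)         # prints 2.0
print(1.79 / bound)  # achieved fraction of the bound, about 0.895
```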
