
Bifrost: Easy GPU Pipeline Development


  1. Bifrost: Easy GPU Pipeline Development • github.com/ledatelescope/bifrost • Presenter: Miles Cranmer (CfA/McGill) • On behalf of: Ben Barsdell (NVIDIA), Danny Price (Berkeley), Jayce Dowell (UNM), Hugh Garsden (CfA), Frank Schinzel (NRAO), Greg Taylor (UNM), Lincoln Greenhill (CfA) • 8/14/17 Miles Cranmer

  2. Stream-processing and real-time GPU computing • Stream-processing: operating on data which is potentially unlimited in extent • E.g., time stream of digitized voltages • Nontrivial for CPU/GPU systems: • Creation of data structures for buffer memory management, packet capture • Additional complexities for asynchronous copies and kernel execution • Manual parallelization/core binding of algorithms and pipelines • Potential issues include memory leaks and race conditions
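
To make "data which is potentially unlimited in extent" concrete, here is a minimal plain-Python sketch (not Bifrost code). A generator yields fixed-size chunks of an endless voltage-like stream, and the consumer keeps only running sums, so memory use stays bounded. The names `voltage_stream`, `gulps`, and `running_power` are hypothetical and chosen only for this illustration.

```python
import itertools
import math

def voltage_stream():
    """Hypothetical endless source of digitized samples (a sine wave here)."""
    for n in itertools.count():
        yield round(100 * math.sin(2 * math.pi * n / 16))

def gulps(stream, gulp_size):
    """Group an iterator into fixed-size lists without reading it all at once."""
    it = iter(stream)
    while True:
        chunk = list(itertools.islice(it, gulp_size))
        if len(chunk) < gulp_size:
            return  # source ended; a trailing partial chunk is dropped
        yield chunk

def running_power(stream, gulp_size, ngulp):
    """Mean power over the first `ngulp` chunks, holding one chunk at a time."""
    total = 0
    count = 0
    for chunk in itertools.islice(gulps(stream, gulp_size), ngulp):
        total += sum(v * v for v in chunk)
        count += len(chunk)
    return total / count

mean_power = running_power(voltage_stream(), gulp_size=64, ngulp=10)
```

The source never terminates, yet the consumer only ever holds 64 samples; a real pipeline adds the buffer management, asynchronous copies, and threading listed on the slide.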

  3. Bifrost is deployed in the wild: • Backend for newest LWA station in NM • Bifrost-powered data capture for live all-sky image • Google: “LWA TV 2” • Pulsar detection: • Validation timing within 0.0001 ms of the canonical value for PSR B0834+06 (well within 1σ of the measurement)

  4. Bifrost core concepts • Blocks • Independent thread • “Black box” algorithm • Ring buffers (Rings) • Emulates wrap-around in memory • Memory spaces • Rings assigned to specific “space” • Pipelines • Combination of the above
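
The block/ring/pipeline idea can be sketched with the Python standard library alone. This is an illustration of the concept, not the Bifrost API: each "block" is a function running in its own thread, and a bounded `queue.Queue` stands in for the ring that connects two blocks. All names here (`source_block`, `scale_block`, `sink_block`, `run_pipeline`) are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks end of stream

def source_block(out_q, ngulp):
    """Produce `ngulp` small chunks of data, then signal end of stream."""
    for i in range(ngulp):
        out_q.put([i, i + 1, i + 2])
    out_q.put(SENTINEL)

def scale_block(in_q, out_q, factor):
    """A "black box" transform: multiply every sample by `factor`."""
    while True:
        gulp = in_q.get()
        if gulp is SENTINEL:
            out_q.put(SENTINEL)
            return
        out_q.put([factor * x for x in gulp])

def sink_block(in_q, results):
    """Collect everything that arrives until end of stream."""
    while True:
        gulp = in_q.get()
        if gulp is SENTINEL:
            return
        results.append(gulp)

def run_pipeline(ngulp=4, factor=10):
    """Connect the three blocks and run each in its own thread."""
    q1 = queue.Queue(maxsize=2)  # bounded, so a fast producer must wait
    q2 = queue.Queue(maxsize=2)
    results = []
    threads = [
        threading.Thread(target=source_block, args=(q1, ngulp)),
        threading.Thread(target=scale_block, args=(q1, q2, factor)),
        threading.Thread(target=sink_block, args=(q2, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

results = run_pipeline()
```

A queue delivers each item to exactly one consumer, which is one reason a ring with multiple readers is the more flexible connector for branched pipelines.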

  5. The Bifrost framework • Python frontend wraps fast C/C++/CUDA backend • Frontend: • Blocks and Pipelines are Python object abstractions for the backend • ND-array object for memory management (span of ring buffer) • ctypes wraps all C calls • Backend: • Common type definitions and “BFarray” generic data structure • “Ring buffer” used for inter-block communication • Several common modules implemented

  6. Ring Buffer implementation • What’s unique? • Multiple readers, single writer ⇒ branched pipelines OK • Thread safe • Allocated in system (CPU), cuda (GPU), or cuda_host (pinned CPU) memory
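
A toy version of a single-writer, multiple-reader ring can be written in a few lines of Python. This is a sketch under simplifying assumptions, not Bifrost's implementation: items are Python objects rather than raw memory spans, and the writer simply waits for the slowest reader. The class `ToyRing` and its methods are hypothetical names.

```python
import threading

class ToyRing:
    """Toy ring buffer: one writer, any number of independent readers."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0    # total number of items ever written
        self.tails = {}  # reader name -> total number of items read
        self.cond = threading.Condition()

    def open_reader(self, name):
        with self.cond:
            self.tails[name] = self.head  # start at the current write position

    def write(self, item):
        with self.cond:
            # Wait until the slowest reader has freed at least one slot.
            self.cond.wait_for(
                lambda: self.head - min(self.tails.values(), default=self.head)
                < self.capacity)
            self.buf[self.head % self.capacity] = item  # wrap-around in memory
            self.head += 1
            self.cond.notify_all()

    def read(self, name):
        with self.cond:
            # Wait until the writer is ahead of this reader.
            self.cond.wait_for(lambda: self.tails[name] < self.head)
            item = self.buf[self.tails[name] % self.capacity]
            self.tails[name] += 1
            self.cond.notify_all()
            return item

def demo(nitems=10):
    """One writer thread, two reader threads, ring smaller than the stream."""
    ring = ToyRing(capacity=4)
    out = {"a": [], "b": []}
    for name in out:
        ring.open_reader(name)

    def reader(name):
        for _ in range(nitems):
            out[name].append(ring.read(name))

    def writer():
        for i in range(nitems):
            ring.write(i)

    threads = [threading.Thread(target=reader, args=(n,)) for n in out]
    threads.append(threading.Thread(target=writer))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out

ring_output = demo()
```

Both readers see every item in order even though the ring holds only four slots, which is the property that makes branched pipelines possible.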

  7. API example 1: block

        from copy import deepcopy

        import bifrost as bf
        from bifrost.pipeline import TransformBlock

        class QuantizeBlock(TransformBlock):
            def __init__(self, iring, dtype, scale=1., *args, **kwargs):
                TransformBlock.__init__(self, iring, *args, **kwargs)
                self.dtype = dtype
                self.scale = scale

            def on_sequence(self, isequence):
                # Copy the input header and declare the new output data type
                ohdr = deepcopy(isequence.header)
                ohdr['_tensor']['dtype'] = self.dtype
                return ohdr

            def on_data(self, ispan, ospan):
                # Quantize one span of input data into the output span
                bf.quantize.quantize(ispan.data, ospan.data, self.scale)
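
The two responsibilities of the block above (rewrite the header once per sequence, transform the data once per span) can be mimicked without a GPU. The sketch below assumes that quantization means scale, round, then clip to the range of the output type; that is an assumption about the operation, not a statement of what `bf.quantize.quantize` does internally. `update_header` and `quantize_i8` are hypothetical helper names.

```python
from copy import deepcopy

def update_header(header, dtype):
    """Return a modified copy of the header; the input header is left untouched."""
    ohdr = deepcopy(header)
    ohdr['_tensor']['dtype'] = dtype
    return ohdr

def quantize_i8(samples, scale=1.0):
    """Scale, round, and clip floats to the signed 8-bit range (assumed behavior)."""
    lo, hi = -128, 127
    out = []
    for x in samples:
        q = int(round(x * scale))  # Python rounds exact halves to even
        out.append(max(lo, min(hi, q)))
    return out

in_header = {'_tensor': {'dtype': 'f32'}}
out_header = update_header(in_header, 'i8')
quantized = quantize_i8([0.2, -1.6, 3.5, 1000.0, -1000.0], scale=10.0)
```

Copying the header before editing it matters because several downstream readers may share the same upstream sequence.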

  8. API example 2: pipeline

        import bifrost as bf

        bc = bf.BlockChainer()
        bc.blocks.read_wav(['audio_file.wav'], gulp_nframe=4096)  # Read in file
        bc.blocks.copy(space='cuda')                              # Copy to GPU
        bc.views.split_axis('time', 256, label='fine_time')
        bc.blocks.fft(axes='fine_time', axis_labels='freq')       # FFT
        bc.blocks.detect(mode='scalar')                           # Square modulus
        bc.blocks.transpose(['time', 'pol', 'freq'])              # Transpose
        bc.blocks.copy(space='cuda_host')                         # Copy back to CPU
        bc.blocks.quantize('i8')                                  # Convert to 8-bit integer
        bc.blocks.write_sigproc()                                 # Save
        pipeline = bf.get_default_pipeline()
        pipeline.shutdown_on_signals()
        pipeline.run()                                            # Run the pipeline
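
The core of this pipeline is a small spectrometer: split the time axis into short windows, Fourier transform each window, and take the squared modulus. The sketch below reproduces only those three steps in plain Python, with a naive DFT and a window of 8 samples instead of 256, so it runs anywhere. It is a stand-in for illustration and makes no claim about how the Bifrost blocks are implemented; all function names are hypothetical.

```python
import cmath
import math

def split_axis(samples, n):
    """Reshape a 1-D time series into rows of n "fine time" samples."""
    nrow = len(samples) // n
    return [samples[r * n:(r + 1) * n] for r in range(nrow)]

def dft(row):
    """Naive O(n^2) discrete Fourier transform of one row."""
    n = len(row)
    return [sum(row[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def detect(spectrum):
    """Squared modulus of each complex channel."""
    return [abs(z) ** 2 for z in spectrum]

def spectrometer(samples, nfine):
    """split_axis -> FFT -> detect, giving one power spectrum per row."""
    return [detect(dft(row)) for row in split_axis(samples, nfine)]

# Test signal: a real tone with exactly 2 cycles per 8-sample window.
nfine = 8
tone = [math.cos(2 * math.pi * 2 * t / nfine) for t in range(4 * nfine)]
power = spectrometer(tone, nfine)

# For real input the spectrum is symmetric, so search only channels 0..nfine/2.
peak_channels = [max(range(nfine // 2 + 1), key=row.__getitem__) for row in power]
```

Each of the four windows peaks in channel 2, matching the tone frequency; the transpose, quantize, and file-writing steps of the slide are omitted here.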

  9. bf.map • Easy CUDA kernel generation from Bifrost • JIT compiler uses NVRTC

        import bifrost as bf

        # Create three arrays on the GPU: A and B, and an empty output C
        a = bf.ndarray([1, 2, 3, 4, 5], space='cuda')
        b = bf.ndarray([1, 0, 1, 0, 1], space='cuda')
        c = bf.empty(5, space='cuda')
        # Add A and B together
        bf.map("c = a + b", data={'c': c, 'a': a, 'b': b})

  10. bf.map • Explicit indexing also supported. Outer product:

        bf.map("c(i,j) = a(i) * b(j)",
               {'c': c, 'a': a, 'b': b},
               axis_names=('i', 'j'))
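
For checking GPU results, it helps to have CPU reference values for the two `bf.map` expressions in these slides. The plain-Python sketch below spells out what each expression means element by element, using the same input values as the slide; it does not call Bifrost, and the names `c_sum` and `c_outer` are chosen only for this illustration.

```python
a = [1, 2, 3, 4, 5]
b = [1, 0, 1, 0, 1]

# Implicit indexing, "c = a + b": the same expression applied at every index.
c_sum = [a[i] + b[i] for i in range(len(a))]

# Explicit indexing, "c(i,j) = a(i) * b(j)": one output element per (i, j) pair,
# so the output is a 5 x 5 table rather than a length-5 vector.
c_outer = [[a[i] * b[j] for j in range(len(b))] for i in range(len(a))]
```

Note that the outer product needs a two-dimensional output, unlike the length-5 `c` used for the sum.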

  11. Why Bifrost?

  12. Why Bifrost? Astronomy-specific • Bifrost developed in parallel with LWA-SV, driven by radio astronomy applications • ⇒ Core structural advantages for astronomy • Ring features • Metadata describes the units of ring buffer dimensions; used in algorithms (e.g., dedispersion) • Multi-sequence ring buffers, useful for different observations. The metadata will propagate down the pipeline. • Time-tagged sequences in ring buffers ⇒ can dump section of data to disk based on time range, observation name • Useful for detections of transient phenomena • Ndarray is a child of numpy.ndarray ⇒ compatibility with many numpy functions, matplotlib, etc.
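
The idea of dumping data by time range and observation name can be illustrated with a small sketch. It assumes each sequence carries a header with a name, a start time, a frame interval, and a frame count; the field names `name`, `t0`, `dt`, and `nframe` are hypothetical and are not Bifrost's actual header layout.

```python
import math

def frame_range(header, t_start, t_stop):
    """Frame indices [first, last) whose timestamps fall in [t_start, t_stop)."""
    t0, dt = header["t0"], header["dt"]
    first = max(0, math.ceil((t_start - t0) / dt))
    last = min(header["nframe"], math.ceil((t_stop - t0) / dt))
    return first, max(first, last)

def dump(sequences, name, t_start, t_stop):
    """Collect frames from sequences called `name` within the time range."""
    out = []
    for header, frames in sequences:
        if header["name"] != name:
            continue
        first, last = frame_range(header, t_start, t_stop)
        out.extend(frames[first:last])
    return out

# Two sequences in one "ring": different observations, same time span.
sequences = [
    ({"name": "obs1", "t0": 100.0, "dt": 0.5, "nframe": 10}, list(range(10))),
    ({"name": "obs2", "t0": 100.0, "dt": 0.5, "nframe": 10}, list(range(100, 110))),
]

# Frames of obs1 with timestamps 101.0, 101.5, and 102.0 seconds.
selected = dump(sequences, "obs1", 101.0, 102.5)
```

Because the time axis is described in the header, the selection needs no knowledge of the data itself, which is what makes triggered dumps for transient detections practical.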

  13. Why Bifrost? Block library • Many astronomy and general processing blocks already built • State of the art and flexible high-performance implementations • Metadata rich • Well-documented • Flexible dimensions • These include: • accumulate • audio • binary_io • detect • fdmt • fft • fftshift • guppi_raw • quantize • reduce • reverse • serialize • sigproc • transpose • unpack • wav

  14. Why Bifrost? Logging and performance benchmarking • getirq • getsiblings • like_bmon • like_ps • like_top • pipeline2dot • setirq

  15. Why Bifrost? Rapid development speed; high performance • [Figure: Bifrost code vs. C++ legacy]

  16. Why Bifrost? Rapid development speed; high performance

  17. Why Bifrost? Rapid development speed; high performance

  18. Conclusion • Future work • PSRDADA – Bifrost block • To enable capture with PSRDADA to a Bifrost ring for post-processing • Additional options for visualization, “ScopeBlock” • Visualize ring contents in real-time • Aiming for full support of correlation, pulsar/transient backend pipelines • github.com/ledatelescope/bifrost (or, Google: “leda telescope bifrost”)
