BIFROST HIGH-THROUGHPUT CPU/GPU PIPELINES MADE EASY Ben Barsdell

April 4-7, 2016 | Silicon Valley BIFROST HIGH-THROUGHPUT CPU/GPU PIPELINES MADE EASY Ben Barsdell, 4/7/2016

DISAMBIGUATION
The ‘Bifrost’ presented here is…
- NOT the stellar atmospheres code of the same name
- NOT the fluid simulation framework of the same name
- NOT a burning rainbow bridge that connects Midgard and Asgard (although that’s where the name comes from)
https://www.youtube.com/watch?v=K7qM7l7GE5E

OUTLINE
- Background
- What Bifrost is
- What’s inside
- Future work

ACKNOWLEDGEMENTS
- Stems from many useful discussions with Lincoln Greenhill, Danny Price and Hugh Garsden @ Harvard CFA (the LEDA project)
- Work related to the LWA project based at UNM

BACKGROUND
Application areas
- Pipeline processing
- Soft real-time constraints
- High throughput demands (latency not a big concern)
- Experimental science, computer vision
- Can’t afford to be inefficient

BACKGROUND
Example: Radio astronomy correlator pipeline
[Diagram; stage labels: ADC + FPGA, UDP capture, Cross-mult accum, Gain solve, Beamform, Triggered dump]

BACKGROUND
Current approaches

Approach                 PRODUCTIVITY   PERFORMANCE
Numpy, Matlab etc.       High           Low
Monolithic C/C++/CUDA    Low            Medium
Pipeline C/C++/CUDA      Very low       High

BACKGROUND
Motivation
- We know GPUs are great at signal processing
- Many efficient kernels have been written
BUT:
- Sharing of code within the community could be improved
- Stitching together a pipeline is still a hard problem
- Debugging a pipeline can be very painful

BACKGROUND
Existing software
- PSRDADA
- HashPipe
- Pelican
- GNU Radio
- CASPER toolflow
- Plus many standalone processing pipelines for individual projects…

BIFROST
What it aims to be
- A framework for flexible CPU/GPU pipelines + a library of common operations
- Productivity: high-level API, rapid prototyping and debugging
- Performance: competitive with best-in-class, suitable for instant deployment

BIFROST
What it aims to be
- Describe pipelines in, e.g., JSON or simple Python
- Iterate quickly on new ideas, watch results in real time
- Share and reuse common operations within the community
- Reduce total development time by 10x

BIFROST
What it actually is
Still very early in development! Lots more work to be done.
Currently consists of:
- Flexible ring buffer implementation (the heart of the framework)
- Small selection of useful functions
- Prototype packet capture functionality
- Portable C API with C++ and Python wrappers

BIFROST
Ring buffer
- CPU or GPU memory space
- Independent access to contiguous spans of any size at any offset
- Fully thread-safe, including resize at any time
- Multiple readers, guaranteed or commensal
- ‘Ringlets’ (aka channels) allow time to be the fastest-changing dimension
- Sequence management with random access by name or time tag
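One common way to offer contiguous spans at any offset in a circular buffer is to mirror the start of the buffer in a "ghost" region past its end. The sketch below illustrates that idea only. It is single-threaded and CPU-only, and the class and method names (ToyRing, write, read_span) are hypothetical, not Bifrost's API. Whether Bifrost uses this exact technique is not stated in the slides.

```python
class ToyRing:
    """Toy single-threaded ring buffer (illustrative only, not Bifrost's API)."""

    def __init__(self, capacity, max_span):
        self.capacity = capacity
        self.max_span = max_span
        # The extra max_span bytes mirror the start of the buffer, so a span
        # that wraps past the end can still be returned as one contiguous view.
        self.buf = bytearray(capacity + max_span)
        self.head = 0  # total number of bytes written so far

    def write(self, data):
        for byte in bytes(data):
            pos = self.head % self.capacity
            self.buf[pos] = byte
            if pos < self.max_span:  # keep the ghost region in sync
                self.buf[self.capacity + pos] = byte
            self.head += 1

    def read_span(self, offset, size):
        """Return a contiguous view of `size` bytes at absolute `offset`."""
        if size > self.max_span:
            raise ValueError("span larger than max_span")
        if offset + size > self.head:
            raise ValueError("span not written yet")
        if offset < self.head - self.capacity:
            raise ValueError("span already overwritten")
        start = offset % self.capacity
        return memoryview(self.buf)[start:start + size]


ring = ToyRing(capacity=8, max_span=4)
ring.write(range(10))            # wraps around the 8-byte ring
span = ring.read_span(6, 4)      # crosses the wrap point, still contiguous
print(bytes(span))
```

The price of this layout is that writes near the start of the buffer are duplicated; in exchange, readers never need to stitch a span together from two pieces.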

BIFROST
Library functions
- Memcpy/memset wrappers
- General ND array transpose (1-16 byte elements)
- Under development: CMAC, delay-and-sum, gain solve
- Eventually: filtering, imaging, RFI mitigation, transient searching…
- Existing implementations can be wrapped for integration into pipelines

BIFROST
Asynchronous execution model
- Launch processing operations in different CPU threads
- Communicate via ring buffers, copy-free
- Pass metadata via sequence headers in the ring
- Execute synchronously within each thread, but don’t block the GPU (use local stream + cudaStreamSynchronize)
- IO + CPU + H2D + GPU + D2H in separate threads => full pipelining
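The one-thread-per-stage model can be sketched with the Python standard library. This is only an analogy under stated assumptions: bounded queue.Queue objects stand in for ring buffers, there is no GPU work or CUDA stream, and the names stage and run_pipeline are hypothetical.

```python
import queue
import threading

SENTINEL = None  # marks end of stream; inputs must not contain None


def stage(func, inq, outq):
    """Apply `func` to every item from `inq` and push results to `outq`."""
    while True:
        item = inq.get()
        if item is SENTINEL:
            outq.put(SENTINEL)  # propagate shutdown downstream
            break
        outq.put(func(item))


def run_pipeline(funcs, inputs, depth=4):
    """Run each function in its own thread, connected by bounded queues."""
    queues = [queue.Queue(maxsize=depth) for _ in range(len(funcs) + 1)]
    threads = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    results = []

    def drain():
        while True:
            item = queues[-1].get()
            if item is SENTINEL:
                break
            results.append(item)

    threads.append(threading.Thread(target=drain))
    for t in threads:
        t.start()
    for x in inputs:
        queues[0].put(x)  # blocks when the first queue is full (back-pressure)
    queues[0].put(SENTINEL)
    for t in threads:
        t.join()
    return results


print(run_pipeline([lambda x: x + 1, lambda x: x * 2], range(5)))
```

Each stage runs synchronously inside its own thread, and the stages overlap in time because every thread blocks only on its own input and output queues.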

BIFROST
Packet capture
- Fast UDP packet capture very important for radio telescope backends
- Want to achieve line rate on 10 or 40 Gbps ethernet NICs
- Catch packets and scatter into correct order in ring buffer
- Keep up to 3 ‘spans’ open for writing, commit the earliest when the latest is touched
- Auto-segment based on header changes or timeouts
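The scatter-and-commit scheme can be illustrated with a small sketch. It rests on several assumptions: every packet carries a sequence number and a fixed-size payload, missing packets are left as zeros, and the earliest open span is committed when a packet needs a span beyond the open window (one possible reading of the rule above). ToyCapture and its methods are hypothetical names, and no real socket I/O is involved.

```python
class ToyCapture:
    """Toy packet reorderer (illustrative only, not Bifrost's implementation)."""

    def __init__(self, pkts_per_span, payload_size, max_open=3):
        self.pkts_per_span = pkts_per_span
        self.payload_size = payload_size
        self.max_open = max_open
        self.open_spans = {}      # span index -> bytearray being filled
        self.committed = []       # (span index, bytes) in commit order
        self.last_committed = -1

    def _commit_earliest(self):
        idx = min(self.open_spans)
        self.committed.append((idx, bytes(self.open_spans.pop(idx))))
        self.last_committed = idx

    def receive(self, seq, payload):
        """Scatter one packet into its slot; return False if it is too late."""
        if len(payload) != self.payload_size:
            raise ValueError("unexpected payload size")
        span_idx, slot = divmod(seq, self.pkts_per_span)
        if span_idx <= self.last_committed:
            return False  # its span has already been committed
        if span_idx not in self.open_spans:
            if len(self.open_spans) == self.max_open:
                if span_idx < min(self.open_spans):
                    return False  # older than every open span
                self._commit_earliest()
            size = self.pkts_per_span * self.payload_size
            self.open_spans[span_idx] = bytearray(size)  # gaps stay zero
        start = slot * self.payload_size
        self.open_spans[span_idx][start:start + self.payload_size] = payload
        return True

    def flush(self):
        """Commit all remaining open spans, earliest first."""
        while self.open_spans:
            self._commit_earliest()


cap = ToyCapture(pkts_per_span=2, payload_size=1)
for seq in [1, 0, 3, 2, 5, 4, 7, 6]:      # packets arrive out of order
    cap.receive(seq, bytes([seq]))
cap.flush()
print(cap.committed)
```

Keeping several spans open tolerates reordering across span boundaries, while committing the earliest span bounds how long downstream readers must wait.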

BIFROST
Triggered dump operation
- “Triggered baseband dumps” are a common feature of radio telescopes
- Use large ring buffer to keep the past X seconds in memory
- Ring sequences enable random access to buffered points in time
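A triggered dump can be sketched with a bounded history of time-tagged frames. This is a minimal illustration, assuming integer time tags and a fixed number of buffered frames; ToyDumpBuffer and its methods are hypothetical names, and a collections.deque stands in for the ring buffer.

```python
from collections import deque


class ToyDumpBuffer:
    """Keeps the most recent frames so a trigger can dump a past time window."""

    def __init__(self, history_frames):
        # deque(maxlen=...) silently discards the oldest frame when full,
        # much as a ring buffer overwrites its oldest data.
        self.frames = deque(maxlen=history_frames)

    def append(self, time_tag, frame):
        self.frames.append((time_tag, frame))

    def dump(self, t_start, t_stop):
        """Return buffered frames with t_start <= time_tag < t_stop."""
        return [f for t, f in self.frames if t_start <= t < t_stop]


buf = ToyDumpBuffer(history_frames=5)
for t in range(10):
    buf.append(t, "frame%d" % t)

print(buf.dump(6, 8))   # still buffered
print(buf.dump(0, 5))   # too old, already overwritten
```

The linear scan in dump is for clarity only; lookup by time tag is what makes the dump practical on a large buffer.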

BIFROST
The importance of metadata
- Sequence headers can be used to store metadata
- Enables strong decoupling of processing operations
- Allows ‘smart’ operations; avoids manual configuration/adjustment of parameters
- Using a standard encoding (e.g., JSON) simplifies mixed-language pipelines
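The decoupling idea can be shown with a header that tells a downstream operation how to interpret raw bytes. This is a sketch under assumptions: the header keys 'dtype' and 'frame_shape' follow the names used in the Python example in this talk, but storing a struct format code in 'dtype' is an assumption made here, and make_header and split_frames are hypothetical helpers.

```python
import json
import struct


def make_header(fmt, frame_shape):
    """Encode metadata as JSON bytes; `fmt` is a struct format code like '<h'."""
    hdr = {"dtype": fmt, "frame_shape": list(frame_shape)}
    return json.dumps(hdr).encode("utf-8")


def split_frames(header_bytes, payload):
    """Interpret `payload` using only the header; return a list of flat frames."""
    hdr = json.loads(header_bytes.decode("utf-8"))
    item = struct.Struct(hdr["dtype"])
    frame_len = 1
    for n in hdr["frame_shape"]:
        frame_len *= n
    values = [v for (v,) in item.iter_unpack(payload)]
    return [values[i:i + frame_len] for i in range(0, len(values), frame_len)]


header = make_header("<h", (2, 3))             # little-endian int16, 2x3 frames
payload = struct.pack("<12h", *range(12))      # two frames of six samples
print(split_frames(header, payload))
```

The consumer here takes no configuration arguments of its own: changing the element type or frame shape upstream only changes the header.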

BIFROST
Python operation example

    import json
    import numpy as np

    class TransposeOp(object):
        def main(self):
            with self.oring.begin_writing() as oring:
                for iseq in self.iring:
                    # Metadata handling: read the input header, build the output header
                    ihdr = json.loads(iseq.header.tostring())
                    dtype = np.dtype(ihdr['dtype'])
                    ohdr = {}
                    ohdr['frame_shape'] = ihdr['ringlet_shape']
                    …
                    ohdr = json.dumps(ohdr)
                    # Ring handling: size the output ring, then stream spans through it
                    self.oring.resize(ogulp_nbyte)
                    with oring.begin_sequence(iseq.name, ohdr, onringlet) as oseq:
                        for ispan in iseq.read(ogulp_nbyte, self.guarantee):
                            with oseq.reserve(igulp_nbyte) as ospan:
                                # Processing: transpose the input span into the output span
                                src = ispan.data_view(dtype)
                                dst = ospan.data_view(dtype)
                                bfTranspose(dst, src, axes=[1,0])

BIFROST
Ring buffer C API sample

    BFstatus bfRingCreate
    BFstatus bfRingDestroy
    BFstatus bfRingResize
    BFstatus bfRingSequenceBegin
    BFstatus bfRingSequenceEnd
    BFstatus bfRingSequenceOpen
    BFstatus bfRingSequenceOpenAt
    BFstatus bfRingSequenceOpenLatest
    BFstatus bfRingSequenceOpenEarliest
    BFstatus bfRingSequenceOpenNext
    BFstatus bfRingSequenceClose
    BFstatus bfRingSpanReserve
    BFstatus bfRingSpanCommit
    BFstatus bfRingSpanAcquire
    BFstatus bfRingSpanRelease

FUTURE WORK
Current plans
- Abstractions for quickly writing new ops
- Automated pipeline construction (threads, ring allocation, metadata handling etc.)
- Large library of operations that can be strung together
- Fast and customizable UDP packet capture
- Live streaming data visualization (‘scopes’)

FUTURE WORK
Contributions
- Looking for feedback, suggestions, contributions
- Planning to push new code to GitHub soon
http://beingevil.tumblr.com/post/10980294735/horrible-thor-pickup-lines-1

April 4-7, 2016 | Silicon Valley THANK YOU JOIN THE NVIDIA DEVELOPER PROGRAM AT developer.nvidia.com/join
