SymEngine: Symbolic Execution of OpenCL Kernels, Alberto Magni - PowerPoint PPT Presentation

Jun 07, 2023


SymEngine: Symbolic Execution of OpenCL Kernels. Alberto Magni
Optimize code for GPUs: Optimize Memory Accesses
GPU Memory Transactions, Coalesced Access: GPU Core, 32 Threads; 1 Load Request = 4 Bytes per Thread; 128 Bytes through the L1 Cache; 1 Cache Line from GPU Memory
GPU Memory Transactions, Uncoalesced Access: GPU Core, 32 Threads; 1 Load Request = 4 Bytes per Thread; 512 Bytes through the L1 Cache; 4 Cache Lines from GPU Memory; Wasted Bandwidth
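The two cases on these slides can be reproduced with a small host-side model. This is only a sketch: the 128-byte cache line is inferred from the "128 Bytes / 1 Cache Line" figures on the slide, and the helper names `cache_lines_touched` and `warp_addresses` are hypothetical, not part of SymEngine.

```python
WARP_SIZE = 32          # threads per warp, as on the slides
CACHE_LINE_BYTES = 128  # assumed L1 line size (128 bytes = 1 cache line on the slide)


def cache_lines_touched(addresses, line_bytes=CACHE_LINE_BYTES):
    """Return how many distinct cache lines a set of byte addresses falls into."""
    return len({addr // line_bytes for addr in addresses})


def warp_addresses(base, stride_bytes, warp_size=WARP_SIZE):
    """Byte address loaded by each thread of one warp for a fixed per-thread stride."""
    return [base + tid * stride_bytes for tid in range(warp_size)]


# Coalesced: consecutive threads load consecutive 4-byte words.
coalesced = cache_lines_touched(warp_addresses(base=0, stride_bytes=4))

# Uncoalesced: each thread loads a 4-byte word 16 bytes after its neighbour,
# so the warp spans 32 * 16 = 512 bytes.
uncoalesced = cache_lines_touched(warp_addresses(base=0, stride_bytes=16))

print(coalesced, uncoalesced)
```

With a stride of 16 bytes the warp still needs only 128 useful bytes, but four full cache lines are fetched, which is the wasted bandwidth the slide points at. A misaligned base (for example `base=64` with a 4-byte stride) also raises the count, from one line to two.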
SymEngine: Statically Detect Suboptimal Accesses to Memory
OpenCL Kernel (slide callouts: Resolve Address, Compute Number of Transactions):
    int threadId = get_global_id(0);
    sX = x[threadId];
    sY = y[threadId];
    sZ = z[threadId];
    sQr = Qr[threadId];
    sQi = Qi[threadId];
    for (int kIndex = 0; kIndex < KERNEL_ELEMS_PER_GRID; kIndex++, kGlobalIndex++) {
        float expArg = PIx2 * (ck[kIndex].Kx * sX +
                               ck[kIndex].Ky * sY +
                               ck[kIndex].Kz * sZ);
        sQr += ck[kIndex].PhiMag * cos(expArg);
        sQi += ck[kIndex].PhiMag * sin(expArg);
    }
    Qr[threadId] = sQr;
    Qi[threadId] = sQi;
Symbolic Execution (diagram labels): OpenCL Code, Warp-Id, Hardware, Memory, Threads, Input Values; SymEngine; Number of Transactions
Symbolic Execution: Threads in a Warp (0, 1, 2, 3, 4, …, 29, 30, 31). For each thread: Memory Instruction -> SCEV -> Address. Across the warp: Number of cache lines touched -> Transaction Number
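The per-thread pipeline on this slide (memory instruction, SCEV, address, cache lines touched) can be sketched as follows. This is an illustration under stated assumptions, not SymEngine's implementation: `AffineAddress` is a hypothetical stand-in for a scalar-evolution expression, the base addresses are made up, the cache line is assumed to be 128 bytes, and the 16-byte element size used for `ck` assumes a struct of four 4-byte floats.

```python
from dataclasses import dataclass

WARP_SIZE = 32
CACHE_LINE_BYTES = 128  # assumption, matching the 128-byte line on the earlier slides


@dataclass(frozen=True)
class AffineAddress:
    """Hypothetical stand-in for a SCEV: base + thread_stride*tid + loop_stride*k (bytes)."""
    base: int
    thread_stride: int
    loop_stride: int = 0

    def evaluate(self, thread_id, loop_index=0):
        return self.base + self.thread_stride * thread_id + self.loop_stride * loop_index


def transactions_for_warp(expr, warp_id, loop_index=0):
    """Evaluate the address for every thread of one warp and count distinct cache lines."""
    first = warp_id * WARP_SIZE
    lines = {
        expr.evaluate(tid, loop_index) // CACHE_LINE_BYTES
        for tid in range(first, first + WARP_SIZE)
    }
    return len(lines)


# x[threadId] with 4-byte elements: the address advances 4 bytes per thread.
load_x = AffineAddress(base=0, thread_stride=4)

# ck[kIndex].Kx: independent of the thread id, advances with the loop index only
# (16 bytes per element is an assumption about the struct layout).
load_ck = AffineAddress(base=4096, thread_stride=0, loop_stride=16)

# A deliberately strided access for contrast (not taken from the kernel).
load_strided = AffineAddress(base=0, thread_stride=16)

print(transactions_for_warp(load_x, warp_id=0))
print(transactions_for_warp(load_ck, warp_id=0, loop_index=3))
print(transactions_for_warp(load_strided, warp_id=0))
```

Because every address is a closed-form function of the thread id and loop index, the count can be obtained without running the kernel on a GPU, which is what makes a static prediction possible.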
Validation – Nvidia GTX480: Against Hardware Performance Counters. [Chart: Total HW Transactions for Black-Scholes, HW Counter vs. Prediction, across Program Versions]
Validation – Nvidia GTX480: 0.99 correlation with HW counters
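The slide does not say which correlation measure was used. Assuming it is the Pearson coefficient, a comparison between predicted and measured transaction counts could be computed as in the sketch below; `pearson_correlation` is a hypothetical helper, and no values from the talk are reproduced here.

```python
import math


def pearson_correlation(predicted, measured):
    """Pearson correlation between two equally long series of transaction counts."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equally long series with at least two points")
    mean_p = sum(predicted) / len(predicted)
    mean_m = sum(measured) / len(measured)
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_p * var_m)
```

Calling it with one predicted count and one hardware-counter reading per program version yields a value in [-1, 1]; values near 1 mean the prediction tracks the counter closely.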
It's on GitHub! http://github.com/HariSeldon/SymEngine
