

  1. SVM on Intel Graphics – Jesse Barnes, Intel Open Source Technology Center

  2. ● What is SVM?
     ● Discussion of current practices
     ● SVM OS and driver modifications
     ● Device options and implications

  3. SVM defined: pointer sharing between CPU and GPU

  4. But wait, there’s more!
     ● Pointer sharing with buffers
       ○ Offset in device address space matches offset in process address space
       ○ Can use a buffer allocation API to manage device page tables
       ○ Allows the OpenCL “fine grained, buffered” model
     ● Pointer sharing with a bufferless API
       ○ Requires pinning or page fault support
       ○ Allows the OpenCL “fine grained, bufferless” model
       ○ Requires core OS and driver support
       ○ Ideal for application programmers
     ● Important to be clear when discussing “SVM”
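     As a concrete illustration of the two fine-grained models (not from the deck), the OpenCL 2.0 calls look roughly like this; context/queue/kernel creation and error handling are omitted:

        /* Minimal sketch of OpenCL 2.0 fine-grained SVM, both buffered and
         * bufferless (system) variants.  Assumes an OpenCL 2.x device; the
         * system variant additionally needs CL_DEVICE_SVM_FINE_GRAIN_SYSTEM. */
        #define CL_TARGET_OPENCL_VERSION 220
        #include <CL/cl.h>
        #include <stdlib.h>

        void svm_models(cl_context ctx, cl_kernel kernel)
        {
            /* Fine grained, buffered: allocation goes through the runtime, but
             * the returned pointer is valid at the same address on CPU and GPU. */
            float *buf = clSVMAlloc(ctx,
                                    CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER,
                                    4096 * sizeof(float), 0);
            buf[0] = 1.0f;                              /* CPU writes directly */
            clSetKernelArgSVMPointer(kernel, 0, buf);   /* GPU sees the same pointer */

            /* Fine grained, bufferless: any malloc'd pointer can be handed over. */
            float *sys = malloc(4096 * sizeof(float));
            sys[0] = 2.0f;
            clSetKernelArgSVMPointer(kernel, 1, sys);

            /* ... enqueue the kernel, wait, then clSVMFree(ctx, buf) / free(sys) */
        }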

  5. SVM PCIe and VT-d extensions
     • ATS – Address Translation Services
       • Basic IOMMU support
     • PASID – Process Address Space ID
       • Tells the IOMMU which page tables to use, equivalent to the ASID on the CPU side
     • PRI – Page Request Interface
       • Allows functions to raise page faults to the IOMMU
     • VT-d SVM
       • Extends the root complex IOMMU to comprehend x86 page table formats

  6. Good, old days
     [Diagram: CPU (with CPU page tables) and a PCIe endpoint / GFX (with GPU page tables) both reach DRAM through the root complex]
     Generally, devices used physical addresses

  7. VT-d (and big servers) make it more complicated...
     [Diagram: same topology, but the root complex now applies DMAR page tables in addition to the CPU and GPU page tables]
     The DMA address is indirect

  8. Current interfaces (sans softpin)
     ● Buffer alloc
     ● Buffer map – allow direct CPU access to the buffer
     ● Buffer read/write – just like read/write on I/O buffers
     ● Buffer share – create a handle for inter-process sharing
     ● Buffer query – check status of a buffer
     ● Exec buffer – execute code, pass in the whole buffer list
       ○ Synchronizes with all of the above
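     Not part of the deck: a simplified sketch of how the alloc and exec steps map onto the existing i915 GEM uAPI (include/uapi/drm/i915_drm.h). Real code goes through libdrm and needs batch contents, relocations, and error handling, all omitted here:

        /* Buffer-oriented flow against the i915 GEM ioctls; assumes an opened
         * /dev/dri/renderD* fd and libdrm headers on the include path. */
        #include <stdint.h>
        #include <xf86drm.h>
        #include <i915_drm.h>

        int gem_alloc_and_exec(int fd)
        {
            /* "Buffer alloc": ask the kernel for a GEM object, get back a handle. */
            struct drm_i915_gem_create create = { .size = 4096 };
            drmIoctl(fd, DRM_IOCTL_I915_GEM_CREATE, &create);

            /* "Exec buffer": submit work referencing the whole buffer list.  The
             * kernel resolves handles to GPU addresses and synchronizes against
             * any outstanding maps, reads, and writes on those buffers. */
            struct drm_i915_gem_exec_object2 obj = { .handle = create.handle };
            struct drm_i915_gem_execbuffer2 execbuf = {
                .buffers_ptr  = (uintptr_t)&obj,
                .buffer_count = 1,
                .batch_len    = 4096,
            };
            drmIoctl(fd, DRM_IOCTL_I915_GEM_EXECBUFFER2, &execbuf);
            return create.handle;
        }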

  9. [Diagram: process address space, kernel buffer, and device, showing the copy path (3a) and the direct path (3b)]
     1. Alloc buffer: syscall or ioctl to the kernel (maybe for both)
     2. Alloc buffer: request device to initiate DMA
     3a-1. Device DMAs to/from the kernel buffer
     3a-2. Kernel copies to the user buffer
     3b. Device DMAs directly to the translated process address (pinned!)
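     Path 3b implies the driver pins the user pages before programming the DMA. A rough kernel-side sketch, assuming the generic get_user_pages_fast()/dma_map_page() helpers (their signatures have shifted across kernel versions, so treat this as illustrative only):

        #include <linux/mm.h>
        #include <linux/dma-mapping.h>

        /* Pin one user page in place and hand its bus address to the device. */
        static int pin_and_map(struct device *dev, unsigned long uaddr,
                               struct page **pages, dma_addr_t *dma)
        {
            int got = get_user_pages_fast(uaddr, 1, FOLL_WRITE, pages);
            if (got != 1)
                return -EFAULT;

            *dma = dma_map_page(dev, pages[0], 0, PAGE_SIZE, DMA_BIDIRECTIONAL);
            if (dma_mapping_error(dev, *dma)) {
                put_page(pages[0]);
                return -EIO;
            }
            return 0;   /* device can now DMA to *dma; unmap and unpin when done */
        }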

  10. SVM changes

  11. Possible SVM model
      [Diagram: CPU, root complex with VT-d hardware, and PCIe EP / GFX; VT-d looks up the correct tables with the PASID, walking CPU page tables and DMAR page tables over DRAM]
      Potential to share page tables

  12. [Diagram: process address space, kernel buffer, and device, with the device reaching the user buffer directly]
      1. Alloc buffer: syscall or ioctl to the kernel (maybe for both)
      2. Alloc buffer: request device to initiate DMA
      3. Device DMAs directly to the translated process address (with faulting!)

  13. Driver implications
      • Must alloc/track PASIDs
        • Either linked to the process or to a device-specific context struct
      • Optionally design new APIs
        • Potentially just “execute starting at this address” or “write to this address”
      • Device<->CPU synchronization is flexible
        • PCIe atomic ops
        • Memory polling
        • Interrupts passed from device to process through a driver-specific mechanism
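      As a sketch of the first bullet, a device-specific context struct carrying the PASID might look like the following; the struct and function names are invented for illustration, and only the IDA allocator and mmgrab() calls are real kernel interfaces:

        #include <linux/idr.h>
        #include <linux/mm_types.h>
        #include <linux/sched/mm.h>
        #include <linux/slab.h>

        static DEFINE_IDA(svm_pasid_ida);

        struct i915_svm_context {            /* hypothetical per-context state */
            u32 pasid;                       /* tells VT-d which tables to walk */
            struct mm_struct *mm;            /* process address space being shared */
        };

        static struct i915_svm_context *svm_context_create(struct mm_struct *mm)
        {
            struct i915_svm_context *ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
            int pasid;

            if (!ctx)
                return NULL;

            /* PASID 0 is reserved; the upper bound depends on the IOMMU. */
            pasid = ida_alloc_range(&svm_pasid_ida, 1, 0xfffff, GFP_KERNEL);
            if (pasid < 0) {
                kfree(ctx);
                return NULL;
            }

            ctx->pasid = pasid;
            ctx->mm = mm;
            mmgrab(mm);                      /* keep the mm alive while bound */
            return ctx;
        }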

  14. Possible SVM driver interfaces
      • malloc, mmap, etc. – normal libc interfaces for memory management
      • Context create ioctl takes a flag to indicate you want an SVM context
        • Can mix & match SVM and non-SVM execution
      • Single interface for submission: the i915_exec_mm ioctl
        • struct drm_i915_exec_mm { batch_ptr; ctx_id; ring; flags; fence; deps; }
      • Synchronization through interrupt forwarding
        • Command buffer contains an interrupt command; the driver maps that back to an fd event for the app
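      The slide gives only field names for the proposed struct. One speculative C rendering, with types guessed by analogy to the existing execbuffer2 uAPI (everything beyond the field names is an assumption):

        #include <linux/types.h>

        struct drm_i915_exec_mm {
            __u64 batch_ptr;   /* CPU pointer to the batch, valid in the shared address space */
            __u32 ctx_id;      /* context created with the SVM flag */
            __u32 ring;        /* engine to execute on */
            __u64 flags;
            __u64 fence;       /* completion fence, returned to the app */
            __u64 deps;        /* pointer to a dependency list */
        };

      The point of the interface is that batch_ptr is an ordinary process pointer: no buffer objects, no relocation list, just "execute starting at this address" against the PASID-bound address space.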

  15. SVM for devices

  16. SVM device options
      ● Adding PASID support
        ○ Can get you a shared address space on supported platforms
        ○ Application-to-device interaction is still potentially complex – need to manage pinning, potentially include buffer alloc APIs
      ● Adding page faults
        ○ Allows bufferless APIs – simple malloc and use of pointers across the CPU/device boundary
        ○ Major and minor faults can be handled
        ○ What to do while servicing the fault?
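      For the page-fault option, the driver's PRI service routine ends up resolving the fault against the bound process mm much as a CPU fault would, then answering the device so it can retry. A hypothetical sketch; only the core mm calls are real, and handle_mm_fault()'s signature has varied across kernel versions:

        #include <linux/mm.h>
        #include <linux/sched/mm.h>

        static int svm_service_device_fault(struct mm_struct *mm,
                                            unsigned long addr, bool write)
        {
            struct vm_area_struct *vma;
            vm_fault_t fault;

            mmap_read_lock(mm);
            vma = find_vma(mm, addr);
            if (!vma || vma->vm_start > addr) {
                mmap_read_unlock(mm);
                return -EFAULT;     /* invalid request: fail the PRI response */
            }

            /* Populates the PTE for a minor fault, pages data in for a major one. */
            fault = handle_mm_fault(vma, addr, write ? FAULT_FLAG_WRITE : 0, NULL);
            mmap_read_unlock(mm);

            if (fault & VM_FAULT_ERROR)
                return -EFAULT;

            return 0;               /* success PRI response; device retries the DMA */
        }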

  17. Context handling
      ● Wait for fault handling
        ○ Simple, but potentially poor device utilization, depending on the use model
      ● Restart/abort on fault
        ○ Simply re-submit the work after the fault is handled, starting from the top
        ○ Also simple to implement, but potentially even worse utilization than waiting
      ● Context switch on fault
        ○ Save the device context on fault and switch to a new context, like on the CPU
        ○ Potentially very complex for device designers
        ○ Added complexity for drivers

  18. Q & A
