GPU OPEN ANALYTICS INITIATIVE END-TO-END ACCELERATED ANALYTICS Brad - - PowerPoint PPT Presentation


GPU OPEN ANALYTICS INITIATIVE END-TO-END ACCELERATED ANALYTICS Brad Rees, Ph.D. - Senior Solution Architect - NVIDIA GTC DC, November 2017 The AI Computing Company AGENDA TWO PARTS Discuss Analysis from the Perspective of Data Science


slide-1
SLIDE 1

GPU OPEN ANALYTICS INITIATIVE

END-TO-END ACCELERATED ANALYTICS

Brad Rees, Ph.D. - Senior Solution Architect - NVIDIA GTC DC, November 2017

The AI Computing Company

slide-2
SLIDE 2

AGENDA – TWO PARTS

  • Part 1
  • Big Data and Spark
  • GPU Barriers
  • Part 2
  • GOAI

Discuss Analysis from the Perspective of Data Science

Better Exploration ∝ Better Science. Fail Fast Needs to be Embraced.

I have not failed. I've just found 10,000 ways that won't work.

  • Thomas A. Edison

“Data science, also known as data-driven science, is an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data …”

  • Wikipedia

Faster Analytics yield better Exploration

slide-3
SLIDE 3
slide-4
SLIDE 4

the Big Data Catalyst

The Glue that Binds Big Data

  • Spark has become synonymous with Hadoop and Big Data
  • It’s the interface/API for big data app to app communication
  • The processing layer for big data and leading ML framework
slide-5
SLIDE 5

SPARK IS NOT ENOUGH

We Want More Efficiency and Speed

  • Common issue is speed at scale
  • Scaling out to get the necessary speed for mission-critical workloads is prohibitively expensive

  • Clients want core ML on GPU

Commercial Government HPC

We need a GPU-equivalent to Spark … But there are some Barriers

slide-6
SLIDE 6

GPU ADOPTION BARRIERS

  • Too much data movement
  • Too many makeshift data formats
  • No inter-GPU communication
  • No Python API for data manipulation
  • No all inclusive Machine Learning Library

Concerns:

  • Too Hard to Integrate GPUs
  • Not suited for Data Science
slide-7
SLIDE 7

DATA MOVEMENT AND TRANSFORMATION

  • Too much time spent Moving data
  • Data movement and conversion hinder any performance gains

  • No Inter-GPU Communication

The bane of productivity / performance


slide-8
SLIDE 8

DATA FORMATS

  • Avro, XML, JSON, GML, ProtoBuf, HDFS, Pickle, CSV, Parquet, Pandas, Numpy
  • Plain Text vs Binary
  • Compressed vs Uncompressed
  • CSR, COO, CSC

* Not a complete list

slide-9
SLIDE 9

ARE THE GPU BARRIERS TOO GREAT?

  • ☹️ Data movement
  • ☹️ Data formats
  • ☹️ Inter-GPU communication
  • ☹️ No Python API for data manipulation
  • ☹️ No all-inclusive Machine Learning Library

Is there any hope?

slide-10
SLIDE 10

GPU OPEN ANALYTICS INITIATIVE

  • Formed in March at Strata SJ; Launched at GTC in May
  • Goal: GOAI seeks to foster open collaboration between GPU analytics projects and products to enable data scientists to efficiently combine the best tools for their workflows.

Luckily, others were also thinking about the problems

slide-11
SLIDE 11

Data Manipulation

ACCELERATED ANALYTICS ECOSYSTEM

Diagram layers: STORAGE, IN GPU MEMORY DATA STRUCTURE, PROCESSING AND ANALYTICS, INTERACTION

  • MapD (GPU RAM)
  • BlazingDB (Disk)
  • MapD, BlazingDB (“SQL”)
  • Many Columnar Data Frames (everyone has their own makeshift data frame)
  • Anaconda * (Dask, “Python”)
  • Fast Data (Streaming)
  • NV Graph
  • Graphistry
  • MapD Immerse
  • Jupyter NB

Key: Open Source, Free to Use, Closed Source

* Primarily x86 w/ some GPU acceleration

  • Fragmented with too many holes
  • Still too reliant on CPU for moving data between applications
  • 80-90% of data science is accelerated analytics, not deep learning yet

Prior State (pre-March 2017)

slide-12
SLIDE 12

ACCELERATED ANALYTICS ECOSYSTEM

Post-March 2017

Data Manipulation

Diagram layers: STORAGE, IN GPU MEMORY DATA STRUCTURE, PROCESSING AND ANALYTICS, INTERACTION

  • MapD (GPU RAM)
  • BlazingDB (Disk)
  • MapD, BlazingDB (“SQL”)
  • Standard Columnar Data Frame (Open Sourced/Free to Use from MapD)
  • H2O (data.table, “R”)
  • Anaconda (Dask, “Python”)
  • Fast Data (Streaming)
  • H2O.ai (GPU MLlib)
  • NV Graph
  • MapD + BlazingDB System Memory
  • Graphistry
  • MapD Immerse
  • Jupyter NB

Key: Open Source, Free to Use, Closed Source

slide-13
SLIDE 13

LEARNING FROM APACHE ARROW

Interoperability

Big Data ecosystem facing similar issues

Major push in the big data world to remove bottlenecks of copying & converting data between systems

Apache Arrow™

  • Enables execution engines to take advantage of the latest SIMD (Single Instruction, Multiple Data) operations
  • Columnar layout is optimized for data locality for better performance on modern hardware like CPUs and GPUs.
  • The Arrow memory format supports zero-copy reads for lightning-fast data access without serialization overhead.
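The two ideas above, a columnar layout and zero-copy reads, can be illustrated with a minimal sketch. It uses only the Python standard library (`array` and `memoryview`) rather than Arrow itself, and the records are made-up toy data, so treat it as an analogy for the memory layout and not as the Arrow API.

```python
from array import array

# Hypothetical toy records, used only to illustrate the layout difference.
rows = [(1, 10.0), (2, 20.0), (3, 30.0), (4, 40.0)]

# Row-oriented layout: each record is a separate Python tuple (see `rows`).
# Column-oriented layout: one contiguous, typed buffer per column.
ids = array("q", (r[0] for r in rows))      # signed 64-bit integers
values = array("d", (r[1] for r in rows))   # 64-bit floats

# A memoryview exposes the column's buffer without copying it.
view = memoryview(values)
tail = view[2:]   # zero-copy slice: shares memory with `values`

# A write through the original column is visible through the slice,
# which shows that the slice did not copy the data.
values[3] = 99.0
assert tail[1] == 99.0

# A column scan walks one contiguous buffer, the access pattern that
# SIMD units and GPUs handle well.
total = sum(view)
print(len(ids), total)
```

The same principle is what lets two processes or libraries share a column: they agree on the buffer layout and pass a reference instead of serializing the values.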

slide-14
SLIDE 14

THE GPU DATA FRAME

First GOAI Project


  • ✓ Data movement
  • ✓ Data formats
  • ✓ Inter-GPU communication
  • ✓ Python API
  • ✓ Machine Learning Library

So... what does this get me?

slide-15
SLIDE 15

SEAMLESS CALLS BETWEEN APPLICATIONS

  • Load data into MapD
  • Call an H2O ML algorithm
  • All via Anaconda Python
  • Within a Jupyter Notebook

What does GOAI get me? Big improvement for Data Science

Demos available on the GOAI GitHub

slide-16
SLIDE 16


pygdf: Python library for manipulating GDFs

  • Creating GDFs from numpy arrays and Pandas DataFrames
  • Performing math operations on columns
  • Import/export via CUDA IPC
  • Sort, join, reductions
  • JIT compilation of group by and filter kernels using Numba
slide-17
SLIDE 17

SIMPLE DATA CONVERSION

Convert from Pandas and Numpy

slide-18
SLIDE 18

Several Examples Available on GOAI GitHub

slide-19
SLIDE 19

GOAL OF GOAI

Better Adoption with Better Usability and TCO

Hadoop Processing, Reading from disk

  • HDFS Read → SQL Query → HDFS Write → HDFS Read → ETL → HDFS Write → HDFS Read → Train

Spark In-Memory Processing

  • HDFS Read → SQL Query → ETL → ML Train
  • 25-100x Improvement, Less code, Language flexible, Primarily In-Memory
  • Large TCO benefit over Hadoop; Large Adoption

GPU + Spark In-Memory Processing

  • HDFS Read → GPU Read → SQL Query → CPU Write → GPU Read → ETL → CPU Write → GPU Read → ML Train
  • 5-10x Improvement, More code, Language rigid, Substantially on GPU
  • Small TCO benefit over Spark; Small Adoption

End-to-End GPU Processing (GOAI)

  • Arrow Read → SQL Query → ETL → ML Train
  • 25-100x Improvement, Same code, Language flexible, Primarily on GPU
  • Large TCO benefit over Spark; Large Adoption?
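One way to read the four pipelines on this slide is to count how many steps only move data rather than compute on it. The sketch below does that count. The stage names come from the slide; the order in which reads and writes interleave with the compute stages, and the rule that any stage ending in "Read" or "Write" counts as data movement, are assumptions made for illustration.

```python
# Stage lists as read from the slide's diagram (interleaving is assumed).
PIPELINES = {
    "Hadoop": ["HDFS Read", "SQL Query", "HDFS Write", "HDFS Read",
               "ETL", "HDFS Write", "HDFS Read", "Train"],
    "Spark": ["HDFS Read", "SQL Query", "ETL", "ML Train"],
    "GPU + Spark": ["HDFS Read", "GPU Read", "SQL Query", "CPU Write",
                    "GPU Read", "ETL", "CPU Write", "GPU Read", "ML Train"],
    "GOAI": ["Arrow Read", "SQL Query", "ETL", "ML Train"],
}

def count_movement(stages):
    """Count stages that only move data (assumed: names ending in Read/Write)."""
    return sum(1 for stage in stages if stage.endswith(("Read", "Write")))

movement = {name: count_movement(stages) for name, stages in PIPELINES.items()}

for name, count in movement.items():
    print(f"{name:12s} data-movement steps: {count} of {len(PIPELINES[name])}")
```

Under these assumptions the GPU + Spark pipeline has the most movement steps, because every compute stage is bracketed by a copy to and from the GPU, while the end-to-end pipeline reads the data once.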

slide-20
SLIDE 20

INITIAL LIBRARIES

github.com/gpuopenanalytics GPU Data Frame

  • libgdf: C library of helper functions:
      • Copying GDF metadata block to the host and parsing it to a host-side struct
      • Importing/exporting via CUDA IPC
      • CUDA kernels to perform element-wise math operations on GDF columns.
      • CUDA sort, join, and reduction operations on GDFs.
  • pygdf: Python library for manipulating GDFs
      • Creating GDFs from numpy arrays and Pandas DataFrames
      • Performing math operations on columns
      • Import/export via CUDA IPC
      • Sort, join, reductions
      • JIT compilation of group by and filter kernels using Numba
  • dask_gdf: Extension for Dask to work with distributed GDFs.
      • Same operations as pygdf, but working on GDFs chunked onto different GPUs and different servers.
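The idea of running the same operation on a data frame that is chunked across devices can be sketched on the CPU. The example below is not the dask_gdf API: it uses a standard-library thread pool as a stand-in for multiple GPUs, and the helper names (`split_into_chunks`, `partial_stats`, `distributed_mean`) are hypothetical. It shows the general pattern of computing a partial result per chunk and then combining the partials.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(column, n_chunks):
    """Split a column into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(column), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(column[start:end])
        start = end
    return chunks

def partial_stats(chunk):
    """Per-chunk work: return (sum, count) so that results can be combined."""
    return sum(chunk), len(chunk)

def distributed_mean(column, n_workers=4):
    """Mean of a column computed from per-chunk partial results."""
    chunks = split_into_chunks(column, n_workers)
    # Each worker stands in for one device holding one chunk.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_stats, chunks))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

column = list(range(1, 101))
mean = distributed_mean(column)
print(mean)
```

Only the small `(sum, count)` pairs travel between workers, which is why reductions of this kind scale well when the chunks live on different GPUs or servers.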

slide-21
SLIDE 21

  • ~100x speedup using MapD on half a DGX to analyze census data vs a 20-node Spark cluster
  • >50x speedup in performing PageRank on a graph on half a DGX vs an 8-node Spark cluster
  • ~8.5x speedup on half a DGX to produce a robust GLM via 10-fold cross-validation vs an 8-node Spark cluster
  • ~100x more cyber security data interactively visualized using an intuitive layout algorithm on a single GPU as a connected graph
  • ~5x faster than Redshift to utilize full disk storage and system memory
  • Python on GPU... Numba and Pandas

ABOUT

slide-22
SLIDE 22

MapD

GPU-accelerated analytics platform

Consists of MapD Core database and MapD Immerse

MapD Core database is an in-GPU-memory, columnar, open-source, GPU-accelerated SQL database. MapD Enterprise brings distributed and high availability modes, GPU-accelerated backend rendering, Kerberos/LDAP security, and ODBC/JDBC. MapD Immerse is a visual analytics platform on top of the MapD Core database that allows data scientists and analysts to interactively explore large datasets.

slide-23
SLIDE 23

1.1 BILLION TAXI RIDES BENCHMARK

Time in Milliseconds

System             Query 1   Query 2   Query 3   Query 4
MapD DGX-1              21        80       150       372
Kinetica DGX-1         596       518       795      1209
Redshift 6-node       1560      1250      2250      2970
Spark 11-node        10190      8134     19624     85942

Source: MapD Benchmarks on DGX-1 from internal NVIDIA testing following guidelines of Mark Litwintschik’s blogs: Redshift, 6-node ds2.8xlarge cluster & Spark 2.1, 11 x m3.xlarge cluster w/ HDFS

@marklit82

GPU memory-based databases are 8x to 15x faster than CPU in-memory databases such as Redshift, and 100x to 485x faster than Spark on 11 servers.
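The speedup ranges quoted above can be checked against the chart's numbers with a few lines of arithmetic. The sketch below assumes the chart values group as MapD, Kinetica, and Redshift for each query, with the four large values belonging to Spark; that grouping is a reading of the chart, not something the slide states explicitly.

```python
# Query times in milliseconds for Query 1..4, as read from the chart.
# Assumption: the assignment of values to systems described in the lead-in.
times_ms = {
    "MapD DGX-1": [21, 80, 150, 372],
    "Kinetica DGX-1": [596, 518, 795, 1209],
    "Redshift 6-node": [1560, 1250, 2250, 2970],
    "Spark 11-node": [10190, 8134, 19624, 85942],
}

def speedups(baseline, target):
    """Per-query ratio of baseline time to target time."""
    return [b / t for b, t in zip(times_ms[baseline], times_ms[target])]

redshift_vs_mapd = speedups("Redshift 6-node", "MapD DGX-1")
spark_vs_mapd = speedups("Spark 11-node", "MapD DGX-1")

for name, ratios in (("Redshift", redshift_vs_mapd), ("Spark", spark_vs_mapd)):
    print(name, ", ".join(f"{r:.1f}x" for r in ratios))
```

Under this reading, the Spark ratios span roughly 100x to 485x, matching the slide. The Redshift ratios for queries 2 to 4 fall between about 8x and 16x, while query 1 comes out noticeably higher.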

Open Source core DBMS Free Community Edition

slide-24
SLIDE 24

BlazingDB

GPU-accelerated petabyte scale data warehouse

Consists of BlazingDB database

BlazingDB database is a disk-based, columnar, GPU-accelerated SQL database. BlazingDB has distributed and high availability modes, JDBC, and Python/C# APIs. BlazingDB offers a Community Edition that can be downloaded for free and has an Enterprise Edition that you can launch today on AWS.

slide-25
SLIDE 25

BlazingDB: high-performance SQL at petabyte scale

BlazingDB SQL is built on a columnar relational data model. Enterprise-grade security through Spring Security. BlazingDB distributes both data and computation to multiple instances, for more data or faster query speeds.

  • https://blazingdb.com/

Blazing speedup

slide-26
SLIDE 26

Anaconda Python

Open-source focused, GPU-accelerated data science platform

Contains Anaconda Accelerate, Numba, and Dask

Anaconda Accelerate provides access to libraries optimized for performance on NVIDIA GPUs such as CUDA Sorting and cuBLAS. Numba is a compiler for Python functions that generates native code for GPU hardware. Dask is a parallel computing library for analytic computing in Python. It enables distributed computing in pure Python and integrates with Anaconda Accelerate and Numba.

slide-27
SLIDE 27

NUMBA PERFORMANCE

Jeremy Howard

Deep learning researcher & educator.

Founder: fast.ai
Faculty: USF & Singularity University
Previously: CEO, Enlitic; President, Kaggle; CEO, Fastmail

Rewrote PolynomialFeatures from scikit-learn in Numba. Got a 40x speedup in only 12 lines of code.

How Fast
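For readers unfamiliar with the operation mentioned above, here is a minimal pure-Python sketch of a degree-2 polynomial feature expansion written as explicit loops. It is not the rewrite referenced on the slide, and the function name is hypothetical. Loop-heavy numeric code of this shape is the kind that a JIT compiler such as Numba is designed to speed up, typically on a NumPy-array version of the function.

```python
def polynomial_features_degree2(rows):
    """Expand each row [x0, x1, ...] into
    [1, x0, x1, ..., x0*x0, x0*x1, ..., x1*x1, ...]."""
    expanded = []
    for row in rows:
        out = [1.0]                          # bias term
        out.extend(float(x) for x in row)    # degree-1 terms
        n = len(row)
        for i in range(n):                   # degree-2 terms with i <= j
            for j in range(i, n):
                out.append(float(row[i] * row[j]))
        expanded.append(out)
    return expanded

features = polynomial_features_degree2([[2, 3]])
print(features)
```

For an input row `[a, b]` the output columns are `1, a, b, a², ab, b²`, which is the same column order scikit-learn's `PolynomialFeatures` produces for degree 2 with the bias column included.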

slide-28
SLIDE 28

H2O.ai

Open-source GPU-accelerated machine learning platform

Contains H2O.ai platform

H2O.ai has a working implementation of GPU-accelerated generalized linear modeling. H2O.ai is working to GPU-accelerate additional machine learning algorithms such as random forests, gradient boosting machines, and clustering. H2O.ai is working on porting data.table, a columnar data frame library, along with the world's fastest implementation of the sort algorithm to NVIDIA GPUs.

slide-29
SLIDE 29

MACHINE LEARNING LIBRARY

H2O4GPU Roadmap

slide-30
SLIDE 30

Graphistry

GPU-accelerated graph visualization engine

Consists of Graphistry graph visualization engine

Graphistry uses GPUs in the backend for layout calculation and machine learning. Graphistry uses GPUs in the frontend for rendering the visualization in a web browser. Graphistry allows a user to interactively visualize orders of magnitude more data than traditional solutions in an intuitive way.

slide-31
SLIDE 31

Different Graphs, Different Questions

  • Fraud: Tracking Embezzlers
  • Hunting: Daily Anomalies
  • SecOps: Shadow IT Use
  • Ops/NOC: Outage Root Cause
  • Threat Intel: Botnet Analysis
  • IR: Killchain Analysis

slide-32
SLIDE 32

Gunrock

Open-source GPU-accelerated graph analytics library

Consists of Gunrock graph analytics library

Gunrock has multi-GPU implementations of graph algorithms such as PageRank, Breadth First Search, Single Source Shortest Path, etc. Gunrock has a high-level API in C that is accessible from Python.
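As a reminder of what one of the named algorithms computes, here is a minimal CPU-only sketch of PageRank by power iteration. It is plain Python and does not use or represent Gunrock's API; the function signature, the damping factor of 0.85, the fixed iteration count, and the toy graph are all illustrative assumptions.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank on a dict {node: [out-neighbors]}.

    Every neighbor must also appear as a key of `graph`.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread over all nodes.
        dangling = sum(rank[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {v: base for v in nodes}
        for v in nodes:
            if graph[v]:
                share = damping * rank[v] / len(graph[v])
                for w in graph[v]:
                    new_rank[w] += share
        rank = new_rank
    return rank

# Hypothetical toy graph: a three-node cycle plus one node pointing into it.
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
ranks = pagerank(graph)
for node in sorted(ranks, key=ranks.get, reverse=True):
    print(node, round(ranks[node], 4))
```

Each iteration touches every edge once and updates are independent per edge, which is why the algorithm maps well onto GPUs.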

slide-33
SLIDE 33

JOIN THE REVOLUTION

Everyone Can Help!

APACHE ARROW
https://arrow.apache.org/ @ApacheArrow

APACHE PARQUET
https://parquet.apache.org/ @ApacheParquet

GPU Open Analytics Initiative
http://gpuopenanalytics.com/ @Gpuoai

Integrations, feedback, documentation support, pull requests, new issues, or donations welcomed!

slide-34
SLIDE 34

GOAI PARTNER SESSION LINE-UP AT GTC DC 2017

Wednesday 11/1, 2:00pm, Hemisphere A
Session DC7213: World's Fastest Machine Learning With GPUs
Jon Mckinney - Senior Developer, H2O.ai

Wednesday 11/1, 2:30pm, Hemisphere A
Session DC7212: Interpretable AI: Not Just For Regulators
Patrick Hall - Director of Data Science, H2O.ai

Wednesday 11/1, 5:00pm, Polaris
Session DC7189: The Impact of GPUs in Geovisualization for Government
Todd Mostak - CEO & Founder, MapD

Thursday 11/2, 2:00pm, Hemisphere B
Session DC7133: Scaling Event Data Investigations with GPU Visual Graph Analytics
Leo Meyerovich - CEO, Graphistry, Inc

Thursday 11/2, 4:30pm, Atrium Hall
Session DC7111: Accelerating Cyber Threat Detection with GPUs
Josh Patterson - NVIDIA

slide-35
SLIDE 35

NVIDIA DEEP LEARNING INSTITUTE

Fundamentals, Autonomous Vehicles, Media & Entertainment, Finance, Machine Vision - IVA, Healthcare …and more

Training available as online self-paced labs and instructor-led workshops

  • Take self-paced labs at www.nvidia.com/dlilabs
  • Find or request an instructor-led workshop at www.nvidia.com/dli
  • Educators: download the Teaching Kit at developer.nvidia.com/teaching-kit and contact nvdli@nvidia.com for info on the University Ambassador Program

slide-36
SLIDE 36

Thank You!

http://gpuopenanalytics.com/