OPERATIONALIZING MACHINE LEARNING USING GPU ACCELERATED, IN-DATABASE ANALYTICS



SLIDE 1

OPERATIONALIZING MACHINE LEARNING USING GPU ACCELERATED, IN-DATABASE ANALYTICS


SLIDE 2

Why GPUs?


A Tale of Numbers

  • Performance: 100x gains over traditional RDBMS / NoSQL / in-memory databases
  • Cores: modern GPUs can consist of 3,000+ cores, compared to 32 in a CPU
  • Costs: 75% reduction in infrastructure costs, licensing, staff, etc.
  • More with Less: increase performance, throughput, and capability while minimizing the cost to support the business

SLIDE 3

Why a GPU Database?

  • Leverage Innovations in CPUs and GPUs
  • Single Hardware Platform
  • Simplified Software Stack


SLIDE 4

What are AI, ML, and Deep Learning?


[Diagram: how AI, ML, and Deep Learning relate]

ML: predict y using a function of the data x
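
As a minimal illustration of "predict y using a function of the data x", the sketch below fits an example function f to observed (x, y) pairs; scikit-learn and the toy data are illustrative choices, not part of the deck.

# A minimal sketch of "predict y using a function f of the data x".
# scikit-learn's LinearRegression is just an illustrative choice of f.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(100, 1))              # toy inputs
y = 3.0 * x[:, 0] + 2.0 + rng.normal(0, 0.5, 100)  # toy targets: y is roughly 3x + 2

f = LinearRegression().fit(x, y)                   # learn f from the data
print(f.predict([[4.2]]))                          # predict y for a new x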

SLIDE 5

AI/ML/Deep Learning Cheat Sheet


No shortage of techniques and programming languages

SLIDE 6


ML Cheat Sheet

Python and SQL cover almost all the algorithms in that scary spider chart, and Kinetica supports all Python libraries!

SLIDE 7


ML/AI/Deep Learning Lifecycle

SLIDE 8

ML/AI/Deep Learning Lifecycle

  • Create, extract, transform, and process big data: batch and streams
  • Apply ML to data (see the sketch after this list):
  • Model pre-processing
  • Model execution
  • Model post-processing
  • Within an ecosystem of general analytics
  • Supporting a range of human and machine consumers
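
A minimal, self-contained sketch of the pre-process / execute / post-process steps named above; the column and function names are hypothetical, not taken from the deck.

# Hypothetical sketch of the apply-ML steps: pre-process, execute, post-process.
import pandas as pd

def pre_process(raw: pd.DataFrame) -> pd.DataFrame:
    # Clean and transform: drop incomplete rows, standardize the feature columns.
    clean = raw.dropna()
    feats = (clean[["x1", "x2"]] - clean[["x1", "x2"]].mean()) / clean[["x1", "x2"]].std()
    return feats.assign(member_id=clean["member_id"].values)

def execute_model(features: pd.DataFrame, model) -> pd.DataFrame:
    # Score every row with a trained model (any object exposing .predict()).
    return features.assign(score=model.predict(features[["x1", "x2"]]))

def post_process(scored: pd.DataFrame) -> pd.DataFrame:
    # Shape the results for downstream human and machine consumers.
    return scored[["member_id", "score"]].sort_values("score", ascending=False)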


SLIDE 9


Typical AI Process: High Latency, Rigid, Complex Tech Stack

[Diagram: data is extracted from the enterprise database as a subset for specialized AI / data science tools used by data scientists, with business users left at the end of the chain.]

  • Extracting data for AI is expensive and slow
  • Enterprises struggle to make AI models available to the business

SLIDE 10

Kinetica: A More Ideal AI Process


[Diagram: UDFs such as Monte Carlo Risk, Custom Function 2, and Custom Function 3 run inside Kinetica; the API exposes these custom functions so they can be made available to business users as well as data scientists.]
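
To make that concrete, here is a hypothetical sketch of invoking a registered UDF ("proc") through Kinetica's Python client; the proc and table names are invented, and the exact keyword arguments of execute_proc should be treated as an assumption to verify against the client version in use.

# Hypothetical sketch: calling a registered Kinetica UDF ("proc") from Python.
# The proc/table names are invented; the exact execute_proc arguments are an
# assumption to be checked against the gpudb client version in use.
import gpudb

db = gpudb.GPUdb(host="http://kinetica-host:9191")   # connection details assumed

response = db.execute_proc(
    proc_name="monte_carlo_risk",             # hypothetical UDF name
    input_table_names=["positions"],           # hypothetical input table
    output_table_names=["positions_risk"],     # results a business tool can query
)
print(response["run_id"])                      # handle for tracking the run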

SLIDE 11

Current Inefficient Use of Python


Python:

  • Interpreted
  • Single threaded
  • Clean, transform
  • Flow, for each member: pre-process, model execute, post-process (see the sketch below)
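
A self-contained sketch of that row-at-a-time pattern; the data and helper functions are toy stand-ins, not code from the deck.

# Hypothetical sketch of the inefficient pattern: one interpreted,
# single-threaded loop that pre-processes, scores, and post-processes
# each member individually instead of operating on whole sets in parallel.
members = [{"id": i, "spend": float(i)} for i in range(1000)]   # toy data

def pre_process(m):            # clean / transform a single member
    return [m["spend"] / 100.0]

def model_execute(features):   # stand-in for a real model's predict()
    return 1.0 if features[0] > 5.0 else 0.0

def post_process(m, score):    # shape one result row
    return {"id": m["id"], "score": score}

results = []
for m in members:              # for each member ...
    features = pre_process(m)
    score = model_execute(features)
    results.append(post_process(m, score))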

SLIDE 12

Optimized SQL and Python UDF with Kinetica


SQL + Python UDF (see the sketch below):

  • Pre-process (SQL): binary executable code, superior optimization, declarative SQL
  • Model execute (Python UDF): only essential imperative model code, no relational set processing
  • Post-process (SQL): binary executable code, superior optimization, declarative SQL
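
An illustrative sketch of that split, with the set-oriented pre- and post-processing expressed as declarative SQL and Python reduced to the model call; all table, column, and function names here are hypothetical.

# Illustrative split of the work (all table/column names are hypothetical).

# 1) Pre-process declaratively in SQL: set-based and optimized by the database.
PRE_PROCESS_SQL = """
CREATE TABLE member_features AS
SELECT member_id,
       AVG(spend) AS avg_spend,
       COUNT(*)   AS txn_count
FROM transactions
GROUP BY member_id
"""

# 2) Model execute in a Python UDF: only the essential imperative model code.
def score_batch(avg_spend, txn_count, model):
    """Run the trained model over whole columns at once, not row by row."""
    import numpy as np
    X = np.column_stack([avg_spend, txn_count])
    return model.predict(X)

# 3) Post-process declaratively in SQL again.
POST_PROCESS_SQL = """
CREATE TABLE member_scores_ranked AS
SELECT member_id, score,
       CASE WHEN score > 0.5 THEN 'target' ELSE 'hold' END AS segment
FROM member_scores
"""
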
SLIDE 13

Comprehensive Solution Architecture: Major U.S. Retailer

[Diagram: various ETL/ELT feeds and full model pipelines 1..N (including a Prompts project) flow into a 10-node Kinetica cluster (head node plus Workers 1-9); Apache Tomcat application servers sit in front, serving fast streaming and fast analytics projects.]

  • Fact and dimension tables for various use cases; billions of rows
  • Massive stream ingestion; massive, fast analytics
  • Spring endpoint-oriented architecture
  • Horizontal elastic scaling

SLIDE 14

Use Case Example

SLIDE 15

MNIST: Simple Image Processing Use Case


A parametric model in Python using TensorFlow

Model Training

  • Set of image files stored in a Kinetica database table
  • Python UDF in Kinetica using TensorFlow (training core sketched below)

Model Serving

  • Python UDF in Kinetica using TensorFlow
  • Input = TFModel table
  • Output = mnist_inference_out table

Model Analytics

  • SQL!
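
For reference, a minimal TensorFlow/Keras training core for this use case, assuming the image pixels and labels have already been read from the mnist_training table into NumPy arrays; this is only a sketch, not the deck's actual train_nd_udf.py.

# Minimal MNIST training core in TensorFlow/Keras. In the deck's setup this
# logic runs inside a Kinetica Python UDF; here it is shown standalone,
# assuming `images` (N x 784, float32 in [0, 1]) and `labels` (N,) were
# already pulled from the mnist_training table.
import numpy as np
import tensorflow as tf

def train(images: np.ndarray, labels: np.ndarray) -> tf.keras.Model:
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(784,)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    model.fit(images, labels, epochs=5, batch_size=128, verbose=0)
    # In the deck's pipeline the trained model would then be serialized and
    # written to the TFModel table for the serving UDF to read.
    return model
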
SLIDE 16

UDF: train_nd_udf.py

[Diagram: MPP sharding across eight TOM processes (TOM 0 through TOM 7) on Machine 0 / Rank 0. Each TOM n holds shard n of the tables mnist_training, TFModel, mnist_inference, and mnist_inference_out, and runs its own instance of the UDF against its local shards.]

Model Training & Inference Data Model: MPP Sharding
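
A skeletal view of how one instance of such a distributed UDF typically looks: each instance sees only its own shard of the input tables and writes to the matching shard of the output table. The kinetica_proc usage below is an assumption about Kinetica's UDF API, not the deck's actual train_nd_udf.py.

# Hypothetical skeleton of a distributed Kinetica Python UDF. One instance runs
# per TOM, sees only that TOM's shard of the input tables, and writes to the
# matching shard of the output table. The kinetica_proc calls are an assumption
# about the UDF API, not code taken from the deck.
from kinetica_proc import ProcData

proc_data = ProcData()

in_table = proc_data.input_data[0]     # e.g. this TOM's shard of mnist_inference
out_table = proc_data.output_data[0]   # e.g. this TOM's shard of mnist_inference_out
out_table.size = in_table.size         # one output record per input record

# ... load the model for this shard, run inference over in_table's columns,
# and fill out_table's columns with the predictions ...

proc_data.complete()                   # signal that this UDF instance finished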

SLIDE 17

info@kinetica.com

Thank You!

Come get your copy of the O’Reilly Book at Booth G.01!