HOW TO ACHIEVE REAL-TIME ANALYTICS ON A DATA LAKE USING GPUS
Mark Brooks - Principal System Engineer @ Kinetica
May 09, 2017
The Challenge:
How to maintain analytic performance while dealing with:
- Larger data volumes
- Streaming data with minimal end-to-end latency
- Ad-hoc drill down (you can’t pre-aggregate everything)
Architectural and Design Approaches
1. One database to rule them all
2. SQL on Hadoop (or directly on the data lake)
3. Data Lake + NoSQL + Spark + Search + Cache + …
4. Lambda Architecture
5. Kappa Architecture
6. Next-generation hardware acceleration
One Database To Rule Them All
SQL on a Data Lake
Credit: https://www.slideshare.net/Bigdatapump/sql-on-hadoop-49494494
Hadoop + NoSQL + Search + Memory Cache +…
Credit: Matt Turck - https://www.slideshare.net/mjft01/big-data-landscape-matt-turck-may-2014
Lambda Architecture
Credit: Nathan Marz http://nathanmarz.com/blog/how-to-beat-the-cap-theorem.html James Kinley http://jameskinley.tumblr.com/tagged/Lambda
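The Lambda pattern can be sketched in miniature: a batch layer periodically recomputes views from the immutable master dataset, a speed layer covers events that arrived since the last batch run, and the serving layer merges the two at query time. Everything below is a toy illustration with made-up names, not any real system's API.

```python
# Toy Lambda architecture: slow batch layer over the full master dataset,
# fast speed layer over recent events, merged at query time.

master_dataset = []   # immutable log of all events (batch layer input)
batch_view = {}       # precomputed aggregates, rebuilt periodically
speed_view = {}       # incremental aggregates since the last batch run

def rebuild_batch_view():
    """Recompute aggregates from the full master dataset (slow, periodic)."""
    batch_view.clear()
    for user, amount in master_dataset:
        batch_view[user] = batch_view.get(user, 0) + amount
    speed_view.clear()  # speed layer restarts after each batch run

def ingest(user, amount):
    """New events land in both the master dataset and the speed layer."""
    master_dataset.append((user, amount))
    speed_view[user] = speed_view.get(user, 0) + amount

def query(user):
    """Serving layer: merge the batch view with the real-time view."""
    return batch_view.get(user, 0) + speed_view.get(user, 0)

ingest("alice", 10)
rebuild_batch_view()     # batch run covers the first event
ingest("alice", 5)       # arrives after the batch run, only in speed layer
print(query("alice"))    # 15: 10 from the batch view + 5 from the speed view
```

The cost Lambda imposes is visible even in the toy: the same aggregation logic effectively exists twice, once per layer, which is the complexity Kappa later removes.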
Kappa Architecture
Credit: Jay Kreps https://www.oreilly.com/ideas/questioning-the-lambda-architecture
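Kappa's core move, dropping the batch layer and treating reprocessing as a replay of the durable event log through a new processor version, can be sketched as follows. The log contents and processor names are illustrative toys:

```python
# Kappa sketch: keep only an event log; "reprocessing" means replaying the
# log through a new stream-processor version into a fresh output table.
from collections import defaultdict

event_log = [("alice", 10), ("bob", 3), ("alice", 5)]  # durable, replayable

def process_v1(events):
    """Original logic: sum amounts per user."""
    totals = defaultdict(int)
    for user, amount in events:
        totals[user] += amount
    return dict(totals)

def process_v2(events):
    """New logic: count events per user instead of summing amounts."""
    counts = defaultdict(int)
    for user, _ in events:
        counts[user] += 1
    return dict(counts)

live_table = process_v1(event_log)    # what the app serves today
reprocessed = process_v2(event_log)   # replay all history with the new code
# Once the replay catches up, the application cuts over to the new table.
print(live_table)     # {'alice': 15, 'bob': 3}
print(reprocessed)    # {'alice': 2, 'bob': 1}
```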
"Stream processing systems already have a notion of parallelism; why not just handle reprocessing by increasing the parallelism and replaying history very, very fast?" (Jay Kreps)
Next Generation Hardware Acceleration
Consider a system with these characteristics:
- Horizontally Scalable
- Low end-to-end latency
- Powerful enough to not require pre-aggregation
This is now possible…
GPU Accelerated Compute
DATA WAREHOUSE
RDBMS and data warehouse technologies enable organizations to store and analyze growing volumes of data on high-performance machines, but at high cost.
DISTRIBUTED STORAGE
Hadoop and MapReduce enable distributed storage and processing across multiple machines. Storing massive volumes of data becomes more affordable, but performance is slow.
AFFORDABLE MEMORY
Affordable memory allows for faster data reads and writes. HANA, MemSQL, and Exadata provide faster analytics.
Timeline: 1990s-2000s (data warehouse), 2005 (distributed storage), 2010 (affordable memory), 2017 (GPU-accelerated compute). At scale, processing becomes the bottleneck.
GPU ACCELERATED COMPUTE
GPU cores bulk-process tasks in parallel, which is far more efficient for many data-intensive tasks than CPUs, which process those tasks largely serially.
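The contrast can be illustrated by decomposing a single aggregation: a serial pass versus independent per-chunk partial reductions of the kind each GPU core would own. This pure-Python sketch shows only the decomposition, not actual GPU execution or speed:

```python
# Data-parallel decomposition: the kind of bulk work GPU cores do well.
# Each "core" independently reduces one chunk; partials are then combined.

data = list(range(1_000_000))

# CPU-style serial pass: one instruction stream visits every element.
serial_sum = 0
for x in data:
    serial_sum += x

# GPU-style decomposition: thousands of cores would each take one chunk.
chunk_size = 1000
chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
partials = [sum(chunk) for chunk in chunks]   # each chunk is independent work
parallel_sum = sum(partials)                  # final reduction step

assert serial_sum == parallel_sum
```

Aggregations, group-bys, and joins all decompose this way, which is why they benefit most from GPU acceleration.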
Kinetica: Core
ANALYTICS DATABASE ACCELERATED BY GPUs
KINETICA
[Diagram: commodity hardware with GPUs; data sharded into columns (A1-C4) in a GPU-accelerated columnar in-memory database, persisted to disk, fronted by an HTTP head node.]
- Columnar in-memory database
- Data is presented much like a traditional RDBMS: rows and columns
- Data is held in memory and persisted to disk
- Interact with Kinetica through its native REST API; Java, Python, JavaScript, Node.js, and C++ APIs; SQL; and various connectors
- Native GIS and IP address object support
- Very fast: ideal for OLAP workloads
Typical hardware setup: 256GB - 1TB memory with 2-4 GPUs per node.
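Why a columnar layout favors OLAP can be seen in a toy comparison: an aggregate over a row store must touch every field of every row, while a column store scans one dense array. The tables below are made-up illustrations, not Kinetica's storage format:

```python
# Row store vs. column store for an OLAP aggregate (SUM of one column).

rows = [
    {"user": "alice", "region": "east", "amount": 10.0},
    {"user": "bob",   "region": "west", "amount": 3.0},
    {"user": "carol", "region": "east", "amount": 5.0},
]

# Row store: SUM(amount) must visit every row object, skipping over
# the fields the query does not need.
row_total = sum(r["amount"] for r in rows)

# Column store: each column is a dense array; SUM(amount) scans exactly
# one contiguous array, which is also ideal for GPU bulk processing.
columns = {
    "user":   [r["user"] for r in rows],
    "region": [r["region"] for r in rows],
    "amount": [r["amount"] for r in rows],
}
col_total = sum(columns["amount"])

assert row_total == col_total == 18.0
```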
Multi-Head Ingest and Scale-Out Architecture
ON-DEMAND SCALE OUT
[Diagram: cluster scale-out by adding identical nodes, each with commodity hardware with GPUs, a columnar in-memory store sharded into columns (A1-C4), disk persistence, and its own HTTP head node.]
MULTI-HEAD INGEST
Real-Time Data Handlers for Structured & Unstructured Data
VISUALIZATION via ODBC/JDBC APIs
Java API, JavaScript API, REST API, C++ API, Node.js API, Python API
OPEN SOURCE INTEGRATION
Apache NiFi, Apache Kafka, Apache Spark, Apache Storm
GEOSPATIAL CAPABILITIES
Geometric Objects, Tracks, Geospatial Endpoints, WMS, WKT
KINETICA CLUSTER
On-Demand Scale
[Diagram: Kinetica cluster scaling on demand across identical nodes, each with commodity hardware with GPUs, columnar in-memory storage (shards A1-C4), disk persistence, and an HTTP head node.]
OTHER INTEGRATION
Message Queues, ETL Tools, Streaming Tools
Parallel Ingest Provides High Performance Streaming
Each node of the system (e.g., 1TB of memory and 2 GPUs per node) can share the task of data ingest, providing higher throughput; ingest can be made faster simply by adding more nodes. No GPU compute is consumed on ingest.
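Multi-head ingest can be sketched as clients spreading batches across every node's ingest endpoint rather than funneling through a single head node. The `Node` class and round-robin policy below are stand-ins, not Kinetica's client API:

```python
# Multi-head ingest sketch: concurrent clients round-robin their batches
# across all node endpoints so no single head node is a bottleneck.
import threading

class Node:
    """Stand-in for one Kinetica node's ingest endpoint."""
    def __init__(self, name):
        self.name = name
        self.rows = []
        self.lock = threading.Lock()

    def ingest(self, batch):
        with self.lock:
            self.rows.extend(batch)

nodes = [Node(f"node{i}") for i in range(3)]

def client(client_id, batches):
    # Each client spreads its batches across all ingest endpoints.
    for i, batch in enumerate(batches):
        nodes[(client_id + i) % len(nodes)].ingest(batch)

threads = [
    threading.Thread(target=client, args=(c, [[(c, i)] for i in range(100)]))
    for c in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(len(n.rows) for n in nodes)
print(total)  # 400 rows, spread across all three nodes
```

Adding a node simply adds another entry to `nodes`, which is the sense in which ingest throughput scales with node count.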
Speed Layer for the Data Lake
Parallel Ingestion
- Parallel ingestion of events
- Kinetica is the speed layer, with real-time analytic capabilities
- HDFS serves as the archival store
- Much looser coupling than the traditional Lambda architecture
- Batch-mode Spark or MapReduce jobs can push data to Kinetica as needed for fast query on data loaded from the data lake
[Diagram: events flow from message brokers (e.g., Amazon Kinesis) through stream processing and Kinetica connectors into the cluster (put, get, scan; complex analytics executed on the fly), backed by HDFS / AWS S3 / GCS / Azure Data Lake and serving analysts, mobile users, dashboards and applications, and alerting systems.]
Real-Time, Advanced Analytics, Speed Layer for Teradata or Oracle
- Parallel ingestion of events
- Lambda-type architecture for Teradata or Oracle
- Kinetica is the speed layer, with near-real-time analytic capabilities
- Converge machine learning, streaming, and location analytics with fast query and analytics across Kinetica and the RDBMS
[Diagram: data in motion and at rest: streams (e.g., Amazon Kinesis) and the data warehouse / transactional systems feed Kinetica through stream/ETL processing and Kinetica connectors; the fast GPU-accelerated in-memory database converges ML, AI, and streaming, serving analysts, mobile users, dashboards and applications, and alerting systems.]
Advanced In-Database Analytics
1. User-defined functions (UDFs) can receive table data, perform arbitrary computations, and save output to a separate table, all in a distributed manner.
2. UDFs have direct access to CUDA APIs, enabling compute-to-grid analytics for logic deployed within Kinetica.
3. Works with custom or packaged code. This opens the way for machine learning / artificial intelligence libraries such as TensorFlow, BIDMach, Caffe, and Torch to work on data directly within Kinetica.
4. Available now with C++ and Java bindings.
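The distributed UDF model described above, a function applied to each node's shard of an input table with results landing in a matching shard of an output table, can be sketched like this. It is a pure-Python stand-in; Kinetica's actual C++/Java UDF bindings and endpoints are not shown:

```python
# Distributed UDF sketch: the same function runs against every node's
# shard of an input table; each result becomes the corresponding shard
# of the output table.

input_shards = [                       # one shard of a "trades" table per server
    [("AAPL", 10.0), ("MSFT", 5.0)],
    [("AAPL", 2.0), ("GOOG", 7.0)],
]

def udf_double_amount(shard):
    """Arbitrary per-shard computation; in Kinetica this logic could
    also call into CUDA libraries on that node's GPU."""
    return [(sym, amt * 2) for sym, amt in shard]

# The orchestration layer maps the UDF over every shard; no shard needs
# to see another shard's data, so the work is embarrassingly parallel.
output_shards = [udf_double_amount(s) for s in input_shards]

print(output_shards[0])  # [('AAPL', 20.0), ('MSFT', 10.0)]
```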
ORCHESTRATION LAYER WITH USER-DEFINED FUNCTIONS (UDFs)
[Diagram: orchestration across n Kinetica servers: on each physical/virtual server, UDFs (UDF_A, UDF_B, … UDF_n) run in a proc server with access to CUDA libraries and the GPU; UDFs are exposed from a RESTful endpoint (/exec/proc/UDF_A/) against tables A, B, C, … n, with data returned to an output table for further analysis.]
Kinetica Architecture
[Diagram: streaming data and ERP/CRM/transactional data enter via ETL/stream processing and parallel ingest (SQL, native APIs, geospatial WMS, custom connectors) into an on-demand scale-out cluster (1TB memory / 2 GPU cards per node); in-database processing runs custom logic, UDFs, and ML libraries such as BIDMach; output feeds BI dashboards, BI/GIS apps, custom and geospatial apps, and Kinetica 'Reveal'.]
AI & BI on One GPU-Accelerated Database
HIGH PERFORMANCE ANALYTICS DATABASE
[Diagram: one high-performance analytics database serving both audiences: business users reach it via SQL, ODBC/JDBC, the native REST API, and WMS for business intelligence, custom applications, and a high-fidelity geospatial pipeline; data scientists and developers run UDFs for machine learning and deep learning (e.g., BIDMach) and GPU-accelerated data science, building predictive models such as risk management, sales volume, and fraud.]
50-100x Faster on Queries with Large Datasets
- A large retailer tested complex SQL queries on 3 years of retail data (150bn rows)
- A 10-node Kinetica cluster was tested against a 30TB+ cluster from the next best alternative
- The GPU is able to perform many instructions in parallel, yielding huge performance gains on aggregations, group-bys, joins, etc.
- Kinetica sustained ingest of 1.3bn objects/minute with 70 attributes per row
WHEN COMPARED TO LEADING IN-MEMORY ALTERNATIVES
[Chart: query times for SUM (Q1), GROUP BY (Q5), and SELECT (Q10), Kinetica vs. a leading in-memory DB.]
More Details
Distributed Geospatial Pipeline
- NATIVE VISUALIZATION IS DESIGNED FOR FAST MOVING, LOCATION-BASED DATA
Native Geospatial Object Types
- Points, Shapes, Tracks, Labels
Native Geospatial Functions
- Filters (by area, by series, by geometry, etc.)
- Aggregation (histograms)
- Geofencing - triggers
- Video generation (based on dates/times)
Generate Map Overlay Imagery (via WMS)
- Rasterize points
- Style based on attributes (class-break)
- Heat maps
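A geofencing trigger of the kind listed above can be sketched as a point-in-bounding-box test over a track. The fence coordinates and track points below are made-up illustration values; real geofences would typically be arbitrary geometries, not just boxes:

```python
# Geofence trigger sketch: flag track points that fall inside a fence.

def in_fence(lon, lat, fence):
    """True if (lon, lat) lies inside the axis-aligned bounding box."""
    min_lon, min_lat, max_lon, max_lat = fence
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

# A box over part of San Francisco (illustrative coordinates).
fence = (-122.45, 37.75, -122.40, 37.80)

track = [(-122.50, 37.70), (-122.43, 37.77), (-122.41, 37.78)]
alerts = [p for p in track if in_fence(*p, fence)]
print(len(alerts))  # 2 of the 3 track points fall inside the fence
```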
Full-Text Search
“Rain Tire” ~5
Kinetica includes powerful text search functionality, including:
- Exact Phrases
- Boolean – AND / OR
- Wildcards
- Grouping
- Fuzzy Search (Damerau-Levenshtein optimal string alignment algorithm)
- N-Gram Term Proximity Search
- Term Boosting Relevance Prioritization
"Union Tranquility"~10 [100 TO 200]
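The fuzzy-search distance named above, the Damerau-Levenshtein optimal string alignment (OSA) variant, counts insertions, deletions, substitutions, and transpositions of adjacent characters. A minimal implementation, independent of Kinetica's internals:

```python
# Optimal string alignment (OSA) distance: the Damerau-Levenshtein
# variant that additionally allows adjacent-character transpositions.

def osa_distance(a, b):
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                # adjacent transposition, e.g. "rian" -> "rain"
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + cost)
    return d[len(a)][len(b)]

print(osa_distance("rain", "rian"))  # 1: one adjacent transposition
print(osa_distance("tire", "tyre"))  # 1: one substitution
```

A query like `"Rain Tire" ~5` would match terms within some edit-distance budget of this kind, so common typos still hit the intended documents.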
INTELLIGENCE: US Army - INSCOM
The US Army's in-memory computational engine for any data with a geospatial or temporal attribute, part of a major joint cloud initiative within the Intelligence Community (IC ITE). Intel analysts are able to conduct near-real-time analytics, fuse SIGINT, ISR, and GEOINT streaming big-data feeds, and visualize them in a web browser. For the first time, military analysts are able to query and visualize billions to trillions of near-real-time objects in a production environment. Major executive military and congressional visibility.
U.S. Army INSCOM shift from Oracle to GPUdb: query time dropped from 92 minutes (Oracle Spatial) to 20ms (GPUdb). One GPUdb server replaced 42 servers running Oracle 10gR2 (2011), with 42x lower space, 28x lower cost, and 38x lower power cost.
CASE STUDY : LOCATION BASED ANALYTICS
LOGISTICS: Workforce optimization
DISTRIBUTED ANALYSIS
USPS’ parallel cluster is able to serve up to 15,000 simultaneous sessions, providing the service’s managers and analysts with the capability to instantly analyze their areas of responsibility via dashboards.
AT SCALE
With 200,000 USPS devices emitting location once every minute, that amounts to more than a quarter billion events captured and analyzed daily… tracked on 10 nodes.
USPS is the single largest logistics entity in the country, moving more individual items in four hours than UPS, FedEx, and DHL combined move all year.
CASE STUDY : LOCATION BASED ANALYTICS
LOGISTICS & FLEET MANAGEMENT
Kinetica enables agile tracking of shipments, helping store managers track inventory and arrival times.
- Visibility and tracking of deliveries and trucks for store managers
- ETA and notifications: estimated time of delivery, notifications, and custom location-based alerting
- Route optimization based on truck size and whether cargo is perishable or contains hazardous materials

LARGE RETAILER
CASE STUDY : LOCATION BASED ANALYTICS
RISK MANAGEMENT
Large financial institution moves counterparty risk analysis from overnight to real-time.
- Data is collected by an XVA library, which computes risk metrics for each trade
- Risk computations are becoming more complex and computationally heavy; xVA analysis needs to project years into the future
- Kinetica enables banks to move from batch/overnight analysis to a streaming/real-time system, giving traders, auditors, and management flexible real-time monitoring

MULTINATIONAL BANK
CASE STUDY : ADVANCED IN-DATABASE ANALYTICS
Scale Out on Industry Standard Hardware
Kinetica typically results in 1/10 the hardware cost of standard in-memory databases.
IN THE CLOUD WITH: CERTIFIED ON PREMISE WITH:
Runs on industry-standard servers with 512GB of memory and GPUs (e.g., NVIDIA K80).
COMING SOON: