Big Data, Little Cluster: Using a Small Footprint of GPU Servers to Interactively Query and Visualize Massive Datasets
May 9, 2017 Todd Mostak | Co-founder + CEO, MapD @toddmostak | @mapd
The data explosion is just beginning
[Chart: global data volume in exabytes, 2014-2020, doubling in less than 3 years to roughly 40,000 exabytes. Segments: Enterprise Data, VOIP, Social Media & Web, Sensors + Devices. Source: IDC and EMC Digital Universe Report]
[Chart: amount of SSD storage $1 buys (terabytes), 2015-2020, rising toward roughly 0.12 TB by 2020. Four-year cost per TB of SSD including packaging, power, cooling, maintenance + space. Source: Wikibon 2015]
Data Growth: 40% per year
CPU Processing Power: 20% per year
GPU Processing Power: 50% per year
Data Growth: 40% per year
CPU Processing Power: 20% per year
Ability to Read Data
[Charts, 2007-2016: memory bandwidth (GB/sec) and compute throughput (teraflops, floating point operations/sec), both climbing steeply]
[Chart: GPU vs. CPU compute power and ability to read data]
MapD Core: an in-memory, relational, column store database powered by GPUs. 100x faster queries.
MapD Immerse: a visual analytics engine that leverages the speed + rendering capabilities of MapD Core. Speed-of-thought visualization.
Tableau or 3rd-party viz and non-viz output connect via JDBC/Hadoop.
Platform architecture: streaming data flows in via Kafka; bulk data loads from a data lake, data warehouse, or system of record. The GPU-accelerated MapD Core database serves MapD Immerse and external clients over JDBC, ODBC, and Thrift.

The world's fastest in-memory GPU database powers the world's most immersive data exploration experience.
Memory hierarchy (compute layer over storage layer):

Hot data (L1, GPU RAM): 24 GB to 384 GB at 3,000-5,000 GB/sec; 1,500x to 5,000x speedup over cold data
Warm data (L2, CPU RAM): 32 GB to 3 TB at 70-120 GB/sec; 35x to 120x speedup over cold data
Cold data (L3, SSD or NVRAM storage): 250 GB to 20 TB at 1-2 GB/sec
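The speedup figures for each tier follow directly from the bandwidth ranges. A minimal back-of-envelope sketch (illustrative function names; midpoint bandwidths assumed from the slide's ranges):

```python
# Back-of-envelope model of the three-tier memory hierarchy described
# above. Bandwidths are midpoints of the slide's ranges; the function
# names are illustrative, not part of any MapD API.

BANDWIDTH_GBS = {
    "gpu_ram": 4000,  # L1: 3,000-5,000 GB/sec
    "cpu_ram": 95,    # L2: 70-120 GB/sec
    "ssd":     1.5,   # L3: 1-2 GB/sec
}

def scan_seconds(column_gb, tier):
    """Time to stream a column of `column_gb` gigabytes from a tier."""
    return column_gb / BANDWIDTH_GBS[tier]

def speedup(fast_tier, slow_tier):
    """Relative speedup of one tier over another for a pure scan."""
    return BANDWIDTH_GBS[fast_tier] / BANDWIDTH_GBS[slow_tier]
```

At these midpoints, GPU RAM over SSD lands inside the 1,500x-5,000x band quoted above, and CPU RAM over SSD inside the 35x-120x band.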
Traditional databases can be highly inefficient: they interpret a query operator by operator. MapD instead compiles each query with LLVM into one custom machine-code function.
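The idea behind compiling a whole query into one function can be sketched without LLVM: instead of evaluating the plan operator by operator and materializing intermediates, a single fused loop does all the work in one pass. A toy Python illustration (not MapD's actual codegen; the query and data are invented):

```python
# Toy sketch of query fusion for something like:
#   SELECT sum(fare) FROM trips WHERE distance > 2
# Operator-at-a-time evaluation materializes each intermediate result;
# the fused version does everything in one pass over the data, which is
# the effect that compiling the query to one function achieves.

rows = [(1.0, 4.50), (3.2, 12.75), (0.5, 3.25), (5.1, 18.00)]  # (distance, fare)

def operator_at_a_time(rows):
    filtered = [r for r in rows if r[0] > 2]   # materialize filter output
    projected = [r[1] for r in filtered]       # materialize projection
    return sum(projected)                      # then aggregate

def fused(rows):
    total = 0.0
    for distance, fare in rows:                # one pass, no intermediates
        if distance > 2:
            total += fare
    return total
```

Both return the same answer; the fused form avoids the intermediate allocations and extra passes that make interpreted execution inefficient.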
Noted DB blogger Mark Litwintschik has benchmarked MapD against major CPU-based systems and found it to be between 74x and 3,500x faster than CPU databases.
[Diagram: the GPU Acceleration Zone, in which result sets pass between the database, ML frameworks, and custom functions before the output result set is returned]
Lightning fast visual analytics for the MapD Core database
Basic charts are frontend-rendered using D3 and related toolkits.
Scatterplots, pointmaps + polygons are backend-rendered using the Iris Rendering Engine on GPUs.
Geo-viz is composited over a frontend-rendered basemap.
Backend query-to-render pipeline: the frontend sends SQL plus a Vega spec (a visualization grammar describing visualization designs, which can be driven by data columns and mapped by scales). A shader compilation framework handles multiple types (floats, colors, etc.) and multiple continuities (discrete, continuous). Data goes from the compute (CUDA) pipeline to the graphics (OpenGL) pipeline without a copy, and comes back to the frontend as a compressed PNG (~100 KB) rather than raw data (>1 GB).
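A sketch of what an SQL-driven Vega-style spec for this pipeline might look like, built as a Python dict. The field names here are illustrative assumptions, not the exact MapD render API:

```python
import json

# Hypothetical shape of a "SQL + Vega in, PNG out" request of the kind
# the slide describes: a data source driven by a SQL query, a scale
# mapping a column to color, and a mark drawing the points.
# All names here are illustrative, not the actual MapD render API.

spec = {
    "width": 800,
    "height": 600,
    "data": [{
        "name": "points",
        "sql": "SELECT lon, lat, fare_amount FROM trips",  # column-driven
    }],
    "scales": [{
        "name": "fare_color",
        "type": "linear",               # continuous scale over a column
        "domain": [0, 50],
        "range": ["blue", "red"],
    }],
    "marks": [{
        "type": "points",
        "from": {"data": "points"},
        "properties": {
            "x": {"field": "lon"},
            "y": {"field": "lat"},
            "fillColor": {"scale": "fare_color", "field": "fare_amount"},
        },
    }],
}

payload = json.dumps(spec)  # what a client would send alongside the query
```

The server would execute the embedded SQL, compile shaders from the scales and mark properties, render on the GPU, and return a PNG.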
Scaling the MapD Analytics Platform to analyze big data on small clusters: more RAM for caching bigger datasets in memory, and better throughput.
Distributed architecture: a MapD Aggregator holding the cluster metadata fronts multiple MapD Leaf nodes, each storing a shard of the data across its GPUs (GPU1 through GPUN). The leaves share a dictionary.

Query flow: the MapD Handler accepts the query; the aggregator parses and validates the SQL, generates an algebraic sequence, and prepares execution. Each leaf (Leaf 1, Leaf 2, ..., Leaf N) identifies and loads the needed data if it is not already resident, then executes the query on its GPUs. The aggregator reduces the partial results; if the query is not yet done, execution continues with another round, otherwise the result is returned.
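The aggregator/leaf flow amounts to scatter-gather: each leaf runs the query on its shard, and the aggregator reduces the partial aggregates. A minimal Python sketch of that reduction for a cab_type GROUP BY query (toy shard data; in reality the leaves run GPU kernels):

```python
# Scatter-gather sketch of the aggregator/leaf flow: each leaf computes
# a partial GROUP BY on its shard, the aggregator merges the partials.
# Shard contents are invented for illustration.

from collections import Counter

shards = [
    ["yellow", "yellow", "green"],           # Leaf 1's cab_type column
    ["green", "yellow"],                     # Leaf 2
    ["yellow", "green", "green", "yellow"],  # Leaf N
]

def leaf_execute(shard):
    """SELECT cab_type, count(*) ... GROUP BY cab_type, on one shard."""
    return Counter(shard)

def aggregator_reduce(partials):
    """Merge per-leaf partial aggregates into the final result."""
    total = Counter()
    for partial in partials:
        total += partial
    return dict(total)

result = aggregator_reduce(leaf_execute(s) for s in shards)
```

Because count (like sum, min, and max) is decomposable, the leaves never exchange raw rows with each other: only small partial results cross the network, which is what keeps the cluster's network overhead low.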
1.1B record NYC Taxi Dataset benchmark (conducted by Mark Litwintschik)
Timings, single AWS p2.8xlarge vs. a 2 x p2.8xlarge cluster:

Query 1: SELECT cab_type, count(*) FROM trips GROUP BY cab_type;
  single node 0.022 s, cluster 0.034 s

Query 2: SELECT passenger_count, avg(total_amount) FROM trips GROUP BY passenger_count;
  single node 0.156 s, cluster 0.061 s

Query 3: SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, count(*) FROM trips GROUP BY passenger_count, pickup_year;
  single node 0.309 s, cluster 0.178 s

Query 4: SELECT passenger_count, extract(year from pickup_datetime) AS pickup_year, cast(trip_distance as int) AS distance, count(*) AS the_count FROM trips GROUP BY passenger_count, pickup_year, distance ORDER BY pickup_year, the_count desc;
  single node 0.771 s, cluster 0.499 s

Load time: single node 48 minutes, cluster 26 minutes
Polling smartphones on demand to assess network health; previously had to respond in 24+ hours.
Running complex queries in real time for customers to drive insights and ad buys; previously took hours on Oracle.
npm looks at over 8B records at a given moment to identify trends, segments + anomalies in the JavaScript world; Splunk couldn't scale economically.
We are at an inflection point in compute, and GPUs are set to dominate the coming decade.
GPUs allow users to scale up before needing to scale out, lowering performance-killing network overheads and decreasing hardware and administration costs.
Integrated analytics on GPUs, comprising querying, visualization, and ML, provides critical efficiencies and capabilities not found in siloed systems.