How to run SQL queries on TBs of data using GPUs
Jake Wheat, Lead Architect, SQream Technologies
@sqreamtech SIL7138

1. A toy SQL query engine
2. Support wide range of SQL queries
3.
Query: select a+b, c * 5 from t
Operator: select (a.k.a. project/extend/rename)
Thrust: thrust::transform

Query: select a, count(*), sum(b), avg(b) from t group by a
Operator: stream aggregate
Thrust: thrust::reduce_by_key

Query: select a, b from t where a > 0.5
Operator: filter
Thrust: thrust::remove_if

Query: select distinct a from t
Operator: stream distinct
Thrust: thrust::unique

Query: select a, b, c, d from t
Operator: sort
Thrust: thrust::sort

Query: select * from t union all select * from u
Operator: union all

Query: inner join u using (a)
Operator: sort merge join (smj)
Thrust (simple implementation): thrust::upper_bound, thrust::lower_bound, unnest, gather
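The simple sort merge join listed above (bounds, unnest, gather) can be sketched on the CPU as follows. This is a minimal Python stand-in, not the engine's code: `bisect_left` and `bisect_right` play the role of the lower and upper bound searches, and the names `sort_merge_join` and `gather` are hypothetical. It assumes both key columns are already sorted.

```python
from bisect import bisect_left, bisect_right

def sort_merge_join(left_keys, right_keys):
    """Inner equi-join on sorted key columns.

    Returns two index lists (left_idx, right_idx), one entry per output row.
    Assumes left_keys and right_keys are sorted ascending.
    """
    # Bounds step: for every left key, find the run [lo, hi) of equal keys on the right.
    lows = [bisect_left(right_keys, k) for k in left_keys]
    highs = [bisect_right(right_keys, k) for k in left_keys]

    # Unnest step: expand each run into one (left, right) index pair per match.
    left_idx, right_idx = [], []
    for i, (lo, hi) in enumerate(zip(lows, highs)):
        for j in range(lo, hi):
            left_idx.append(i)
            right_idx.append(j)
    return left_idx, right_idx

def gather(column, indices):
    """Gather step: pick the rows of a column named by an index list."""
    return [column[i] for i in indices]

# Example: t(a, b) joined with u(a, c) using (a).
t_a, t_b = [1, 2, 2, 4], ["p", "q", "r", "s"]
u_a, u_c = [2, 2, 3, 4], [10, 20, 30, 40]
li, ri = sort_merge_join(t_a, u_a)
joined = list(zip(gather(t_a, li), gather(t_b, li), gather(u_c, ri)))
```

The join itself produces only index lists, so any number of payload columns can be materialized afterwards with the same two gathers.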
GPU
Chunk the data
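As a rough illustration of chunking, the sketch below splits a column into fixed-size chunks and applies a filter one chunk at a time. It is plain Python under stated assumptions: `chunks`, `filter_chunked`, and the `chunk_rows` parameter are hypothetical names, and the list comprehension stands in for work that would run on the GPU.

```python
def chunks(column, chunk_rows):
    """Yield consecutive slices of at most chunk_rows rows."""
    for start in range(0, len(column), chunk_rows):
        yield column[start:start + chunk_rows]

def filter_chunked(column, predicate, chunk_rows):
    """Apply a filter chunk by chunk, returning one output chunk per input chunk."""
    out = []
    for chunk in chunks(column, chunk_rows):
        # In a GPU engine this is where one chunk would be copied to the device,
        # processed, and copied back; only one chunk needs to fit in memory at a time.
        out.append([x for x in chunk if predicate(x)])
    return out

# Example: select a from t where a > 0.5, two rows per chunk.
kept = filter_chunked([0.1, 0.9, 0.7, 0.2, 0.6], lambda a: a > 0.5, chunk_rows=2)
```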
GPU
Use external sorting algorithms and a variety of spools
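One common shape for an external sort is: sort each chunk on its own, spool the sorted run to storage, then merge the runs. The sketch below follows that shape in plain Python, assuming integer values; `spool_sorted_run`, `read_run`, and `external_sort` are hypothetical names, temporary files stand in for the spools, and `sorted` stands in for the per-chunk GPU sort.

```python
import heapq
import tempfile

def spool_sorted_run(values):
    """Sort one chunk and spool it to a temporary file, one integer per line."""
    f = tempfile.TemporaryFile(mode="w+")
    for v in sorted(values):        # stand-in for the per-chunk sort on the device
        f.write(f"{v}\n")
    f.seek(0)
    return f

def read_run(f):
    """Stream a spooled run back, one value at a time."""
    for line in f:
        yield int(line)

def external_sort(values, chunk_rows):
    """Sort integers by spooling sorted runs and k-way merging them."""
    runs = [spool_sorted_run(values[i:i + chunk_rows])
            for i in range(0, len(values), chunk_rows)]
    try:
        # heapq.merge keeps only one pending value per run in memory while merging.
        return list(heapq.merge(*(read_run(f) for f in runs)))
    finally:
        for f in runs:
            f.close()

result = external_sort([5, 3, 9, 1, 4, 8, 2], chunk_rows=3)
```

The final `list(...)` is only for the example; a streaming consumer would iterate over the merge directly so the full result never has to be held in memory.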
[Diagram: host workers, a GPU task queue, and GPU 1 and GPU 2, each with multiple GPU workers]
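The components named in this diagram (host workers, a GPU task queue, GPU workers) suggest a producer/consumer arrangement. The sketch below is one possible reading, not the actual architecture: threads stand in for both kinds of worker, `queue.Queue` stands in for the GPU task queue, `sorted` stands in for a GPU kernel, and the name `run_pipeline` and its parameters are hypothetical.

```python
import queue
import threading

def run_pipeline(chunks, num_host_workers=2, num_gpu_workers=2):
    """Process chunks through a shared task queue and return results in chunk order."""
    tasks = queue.Queue()            # the shared "GPU task queue"
    results = {}
    results_lock = threading.Lock()

    def host_worker(worker_id):
        # Each host worker prepares its share of the chunks and enqueues them.
        for chunk_id in range(worker_id, len(chunks), num_host_workers):
            tasks.put((chunk_id, chunks[chunk_id]))

    def gpu_worker():
        while True:
            task = tasks.get()
            if task is None:         # sentinel: no more work
                return
            chunk_id, chunk = task
            processed = sorted(chunk)    # stand-in for work done on the GPU
            with results_lock:
                results[chunk_id] = processed

    gpu_threads = [threading.Thread(target=gpu_worker) for _ in range(num_gpu_workers)]
    host_threads = [threading.Thread(target=host_worker, args=(i,))
                    for i in range(num_host_workers)]
    for t in gpu_threads + host_threads:
        t.start()
    for t in host_threads:
        t.join()
    for _ in gpu_threads:            # one sentinel per GPU worker
        tasks.put(None)
    for t in gpu_threads:
        t.join()
    return [results[i] for i in range(len(chunks))]

processed_chunks = run_pipeline([[3, 1, 2], [9, 8], [5, 4, 7, 6]])
```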
select a, b+c as d from t where b > 5 order by a

Logical: TableScan a,b,c -> Transform d:=b+c -> Remove If b>5 -> Sort by a -> Sort Merge
Direct: TableScan a,b,c -> To device -> Transform d:=b+c -> To host -> To device -> Remove If b>5 -> To host -> To device -> Sort by a -> To host -> Sort Merge
Combined: TableScan a,b,c -> To device -> Transform d:=b+c -> Remove If b>5 -> Sort by a -> To host -> Sort Merge
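The step from the direct plan to the combined plan amounts to dropping every "To host" that is immediately followed by "To device". A small sketch of that rewrite, with plans modeled as lists of step names (the function name `combine_transfers` is hypothetical, and a real planner would work on operator trees rather than strings):

```python
def combine_transfers(plan):
    """Remove host round trips: a 'To host' directly followed by 'To device'."""
    out = []
    for step in plan:
        if step == "To device" and out and out[-1] == "To host":
            out.pop()            # data is already on the device; skip the round trip
        else:
            out.append(step)
    return out

def count_transfers(plan):
    """Count the host/device copies in a plan."""
    return sum(step in ("To host", "To device") for step in plan)

direct = [
    "TableScan a,b,c", "To device", "Transform d:=b+c", "To host",
    "To device", "Remove If b>5", "To host",
    "To device", "Sort by a", "To host", "Sort Merge",
]
combined = combine_transfers(direct)
```

For this query the rewrite cuts the number of transfers per chunk from six to two.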
[Figure labels: Small, Big, Big; slow, good, fast, too big to fit, fast, good, split]
High selectivity / Low selectivity
input -> remove if -> sort: fast
input -> remove if -> sort: slow
input -> remove if -> rechunk -> sort: fast
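A rechunk step can be read as regrouping the small chunks a filter leaves behind into chunks of a target size, so the following sort sees fewer, fuller chunks. A minimal sketch under that assumption (the name `rechunk` and the `target_rows` parameter are hypothetical):

```python
def rechunk(chunks, target_rows):
    """Regroup a stream of chunks into chunks of exactly target_rows rows.

    Only the final chunk may be shorter. Row order is preserved.
    """
    buffer = []
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= target_rows:
            yield buffer[:target_rows]
            buffer = buffer[target_rows:]
    if buffer:
        yield buffer

# Example: uneven chunks left over after a filter, regrouped to 3 rows each.
after_filter = [[1], [2, 3], [], [4], [5, 6, 7]]
regrouped = list(rechunk(after_filter, target_rows=3))
```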
9 chunks: loop and load 9 times
3 chunks: loop and load 3 times
Reducing PCI transfer amounts in NINLJ
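The effect of chunk count on transfers can be illustrated with a chunked nested-loop join, assuming NINLJ here refers to a non-indexed nested-loop join. In the sketch below each outer chunk triggers one full pass over the inner table, so fewer, larger outer chunks mean fewer passes. The function name `nested_loop_join` and the pass counter are hypothetical, and the equality test stands in for the join predicate.

```python
def nested_loop_join(outer_chunks, inner_chunks):
    """Join on equality; also report how many times the inner table was loaded."""
    matches = []
    inner_passes = 0
    for outer in outer_chunks:
        # The outer chunk stays resident while the whole inner table streams past it.
        inner_passes += 1
        for inner in inner_chunks:
            matches.extend((o, i) for o in outer for i in inner if o == i)
    return matches, inner_passes

inner = [[0, 2], [4, 6, 8]]
nine_chunks = [[v] for v in range(9)]                        # 9 outer chunks of 1 row
three_chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]             # 3 outer chunks of 3 rows

matches_9, passes_9 = nested_loop_join(nine_chunks, inner)
matches_3, passes_3 = nested_loop_join(three_chunks, inner)
```

Both runs produce the same join result; only the number of passes over the inner table changes.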