Beyond the Wall:
Near-Data Processing for Databases
Sam Xi, Ore Babarinsa, Manos Athanassoulis, Stratos Idreos
Harvard University
Memory Wall
[Figure: a tuple in a row store and in a column store]
We are not the first to visit this pyramid!
Intelligent RAM, DIVA, Logic-in-memory, Terasys, RADram
Process   Leakage   Switching speed
DRAM      Low       Slow
Logic     High      Fast

Fabrication processes are incompatible.
Moore's Law and Dennard scaling provided consistent performance scaling for years.

Metric                  Scaling factor
Area                    1/κ²
Delay                   1/κ
Power (per unit area)   1
Not the case anymore!
Ibex
Our approach
HARP Q100 Widx
Intro
NDP for data systems: Past and present
The architecture of JAFAR
Experimental results
Conclusion
…
[Figure labels: Host server, Database, Query, Lots of data]
Many rows fail the query predicate and are discarded.
Filter data before it is sent to the CPU.
[Figure: system overview with four CPUs, last-level cache, system bus + memory controller, and two DRAM modules, each paired with JAFAR]
[Figure: DRAM organization. Labels: rank, chip, Bank 0, sense amps, row address decoder, column address decoder]
[Figure: DRAM organization, continued. Labels: rank, bank, Array 0 to Array 3, Bank 0, sense amps, row address decoder, column address decoder]
[Figure: JAFAR inside the DRAM module. Labels: Bank 0, sense amps, IO buffer, JAFAR, memory access arbiter, From CPU, RAS, CAS]
[Figure: JAFAR architecture. Labels: From IO buffer, data latch, opcode, left, right, ALU (x2), Comparison is true?, page offset counter, page offset bitmask, write enable, output buffer]
/* Returns an error code (0 on success). */
int select_jafar(void* col_data, int range_low, int range_high, uint8_t*,
                 size_t num_input_rows, size_t* num_output_rows);
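For reference, the filter that select_jafar offloads can be approximated in software. The following is a minimal sketch under stated assumptions, not JAFAR's implementation: the name select_range_bitmap is hypothetical, the bounds are assumed to be inclusive, and the uint8_t* argument is assumed to be an output bitmap with one bit per input row.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical software stand-in for the filter that select_jafar offloads.
 * Assumptions (not from the slides): bounds are inclusive, and out_bitmap
 * holds one bit per input row, least significant bit first. */
int select_range_bitmap(const int32_t *col_data, int32_t range_low,
                        int32_t range_high, uint8_t *out_bitmap,
                        size_t num_input_rows, size_t *num_output_rows)
{
    if (col_data == NULL || out_bitmap == NULL || num_output_rows == NULL)
        return -1; /* invalid argument */

    /* Clear the bitmap, rounded up to whole bytes. */
    memset(out_bitmap, 0, (num_input_rows + 7) / 8);

    size_t matches = 0;
    for (size_t i = 0; i < num_input_rows; i++) {
        if (col_data[i] >= range_low && col_data[i] <= range_high) {
            out_bitmap[i / 8] |= (uint8_t)(1u << (i % 8));
            matches++;
        }
    }
    assert(matches <= num_input_rows);
    *num_output_rows = matches;
    return 0;
}
```

The caller is expected to allocate (num_input_rows + 7) / 8 bytes for the bitmap. On a CPU, this loop pulls every column value across the memory bus; the point of the JAFAR interface is that only the result would need to travel.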
[Figure: CPUs, last-level cache, and system bus + memory controller with two DRAM modules, each with JAFAR]
Fill up each module first
Interleave data across modules
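The two placement policies, filling each module first versus interleaving across modules, differ in how a physical address maps to a DRAM module. The sketch below illustrates that difference; the mem_layout struct, its field names, and both function names are hypothetical, and real memory controllers may use more elaborate mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout parameters; in a real system these are fixed by the
 * memory controller configuration. */
typedef struct {
    size_t num_modules;     /* number of DRAM modules */
    uint64_t module_bytes;  /* capacity of one module */
    uint64_t stripe_bytes;  /* interleaving granularity */
} mem_layout;

/* Fill-first: consecutive addresses stay in one module until it is full. */
size_t module_fill_first(const mem_layout *m, uint64_t addr)
{
    assert(addr < m->module_bytes * m->num_modules);
    return (size_t)(addr / m->module_bytes);
}

/* Interleaved: consecutive stripes rotate across the modules. */
size_t module_interleaved(const mem_layout *m, uint64_t addr)
{
    assert(addr < m->module_bytes * m->num_modules);
    return (size_t)((addr / m->stripe_bytes) % m->num_modules);
}
```

With two 1 GiB modules and a 4 KiB stripe, a contiguous 16 KiB column chunk starting at address 0 lies entirely in module 0 under fill-first, but alternates between modules 0 and 1 under interleaving, so each module holds only part of the column.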
The CPU and JAFAR cannot simultaneously attempt to access memory.
CPU grants JAFAR ownership of a DRAM rank for a period of time.
Possible mechanism: DRAM mode registers
Simulation framework
Out-of-order CPU, classic cache model, SimpleDRAM
1M
Queries, input data, and database
In-house column-store database
4 million rows of unsorted integers
select * from table where column < n;
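For comparison, a CPU-only evaluation of this query over a single integer column might look like the sketch below. The function name scan_less_than and the position-list output are assumptions for illustration; the slides do not describe how the in-house database represents results.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CPU-side scan for: select * from table where column < n;
 * Writes the positions of qualifying rows to out_pos, which must have room
 * for num_rows entries, and returns how many rows qualified. */
size_t scan_less_than(const int32_t *column, size_t num_rows, int32_t n,
                      size_t *out_pos)
{
    assert(column != NULL && out_pos != NULL);
    size_t count = 0;
    for (size_t i = 0; i < num_rows; i++) {
        out_pos[count] = i;        /* write unconditionally ... */
        count += (column[i] < n);  /* ... advance only on a match */
    }
    return count;
}
```

The loop is written without a data-dependent branch, a common choice for scans over unsorted data. Either way, every value in the column has to reach the CPU before it can be tested.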
Scheduling of ownership transfers will be important.
What would JAFAR's performance look like without a scheduler?
[Figure: CPU timeline. Between bursts of memory requests there is an idle period in which JAFAR can execute.]
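One way to reason about the idle periods between CPU memory requests is to scan a trace of request times and collect the gaps long enough to be useful. This is a rough sketch under simplifying assumptions: count_idle_windows and its parameters are hypothetical, requests are treated as instantaneous, and the trace is assumed to be sorted by cycle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Counts the gaps between consecutive CPU memory requests that are at
 * least min_gap cycles long, i.e. windows in which a rank could be lent
 * to JAFAR. request_cycles must be sorted in ascending order. The summed
 * length of those windows is stored in *total_idle_cycles. */
size_t count_idle_windows(const uint64_t *request_cycles, size_t num_requests,
                          uint64_t min_gap, uint64_t *total_idle_cycles)
{
    assert(request_cycles != NULL && total_idle_cycles != NULL);
    size_t windows = 0;
    *total_idle_cycles = 0;
    for (size_t i = 1; i < num_requests; i++) {
        assert(request_cycles[i] >= request_cycles[i - 1]);
        uint64_t gap = request_cycles[i] - request_cycles[i - 1];
        if (gap >= min_gap) {
            windows++;
            *total_idle_cycles += gap;
        }
    }
    return windows;
}
```

The min_gap threshold stands in for the cost of an ownership transfer: a gap shorter than that cost is not worth handing over.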
More operators: aggregations, projections, sort, joins
Data types and layouts
Row-stores and hybrids
  Multiple filters per row
  Efficient projections
Variable-length datatypes
  Process on CPU?
NDP is a promising solution to the memory wall for data systems.
JAFAR provides up to 9x speedup on simple select queries.
JAFAR is built on an extensible framework for accelerating data systems.