FPGA Acceleration for the Frequent Item Problem
Jens Teubner, René Müller, Gustavo Alonso (ETH Zurich, Systems Group)
This talk is not (only) about FPGAs, and not about a new solution to the frequent item problem.
Frequent Item Problem: Given a …
foreach stream item x ∈ S do
    find bin b_x with b_x.item = x                  (lookup by item)
    if such a bin was found then
        b_x.count ← b_x.count + 1
    else
        b_min ← bin with minimum count value        (lookup by count)
        b_min.count ← b_min.count + 1
        b_min.item ← x
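The loop above can be sketched in plain Python. This is a minimal illustration only, not the code measured in the talk: the function name `space_saving`, the `EMPTY` sentinel, and the choice to start with k empty bins of count 0 are assumptions of this sketch, and both lookups are simple linear scans.

```python
EMPTY = object()  # hypothetical sentinel marking a bin that holds no item yet


def space_saving(stream, k):
    """Monitor at most k items; returns {item: count} for the occupied bins."""
    items = [EMPTY] * k
    counts = [0] * k
    for x in stream:
        try:
            i = items.index(x)              # lookup by item
        except ValueError:
            i = counts.index(min(counts))   # lookup by count (first minimum)
            items[i] = x                    # take over the minimum-count bin
        counts[i] += 1                      # increment in both cases
    return {it: c for it, c in zip(items, counts) if it is not EMPTY}


if __name__ == "__main__":
    print(space_saving(["a", "b", "a", "c", "a", "b", "d"], k=3))
```

The two lookups (by item and by count) are the parts that the rest of the talk tries to make cheap.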
[Plot: throughput [million items / sec] (10 to 50) vs. number of items monitored (16 to 1024); curves for z = ∞, 2, 1.5, 1, 0.]
(Intel T9550 @ 2.66 GHz; code by Cormode and Hadjieleftheriou, VLDB 2008)
“hash table on steroids”
min-heap maintenance speed-up
[Plot: throughput [million items / sec] (10 to 50) vs. number of items monitored (16 to 1024); curves: hardware, software.]
[Diagram: coordinator and bins 1, 2, 3, ..., k − 1, k. Each bin tests item ?= x_i and reports its count; a reduction determines b_x and b_min, followed by an update.]
1. Broadcast input item x_i to all bins.
2. Reduce to determine b_x and b_min.
3. Update b_x / b_min.
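The broadcast / reduce / update steps can be mimicked in software as follows. This is only a sequential sketch: the function name `data_parallel_step` and the `[item, count]` list representation are assumptions, and the fold done by `functools.reduce` stands in for whatever reduction network the hardware actually uses.

```python
from functools import reduce


def data_parallel_step(bins, x):
    """Process one input item x in place; bins is a list of [item, count]."""
    # 1. Broadcast: every bin sees x and reports
    #    (its index if it matches, else None; (count, index)).
    reports = [(i if item == x else None, (count, i))
               for i, (item, count) in enumerate(bins)]

    # 2. Reduce: combine reports pairwise into
    #    (index of b_x or None, (count, index) of b_min).
    def combine(left, right):
        match = left[0] if left[0] is not None else right[0]
        return (match, min(left[1], right[1]))

    match, (_, min_index) = reduce(combine, reports)

    # 3. Update: increment b_x if it exists, otherwise take over b_min.
    target = match if match is not None else min_index
    bins[target][0] = x
    bins[target][1] += 1
    return target


if __name__ == "__main__":
    bins = [["a", 3], ["b", 1], ["c", 2]]
    data_parallel_step(bins, "c")   # match: bin 2 is incremented
    data_parallel_step(bins, "d")   # no match: bin 1 (minimum) is replaced
    print(bins)
```

Every item needs one full broadcast and one full reduction across all bins before the next item can be handled, which is where communication cost enters.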
[Plot: throughput [million items / sec] (10 to 50) vs. number of items monitored (16 to 1024); curves for z = ∞, 1.5, 0.]
[Diagram: coordinator and bins 1, 2, 3, 4, ..., k − 1, k.]
[Diagram: array of bins ..., b_{i−1}, b_i, b_{i+1}, b_{i+2}, ..., each holding an item and a count; the input item x1 travels along the array. Tests shown: b_i.item ?= x1 and b_i.count ?< b_{i+1}.count.]
1. Compare input item x1 to the content of bin b_i (and increment the count value if a match was found).
2. Order bins b_i and b_{i+1} according to count values.
3. Move x1 forward in the array and repeat.
→ Drop x1 into the last bin if no match can be found.
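The idea of letting an item travel along the bin array can be simulated sequentially as below. This is a hedged sketch, not the FPGA design: the names `pipeline_pass`, `run_pipeline` and `EMPTY` are hypothetical, only one item is in flight at a time, and, as an assumption of this sketch, each neighbour pair is ordered before the item is compared with the bin that stays behind, so that no bin is skipped after a swap.

```python
EMPTY = object()  # hypothetical sentinel for a bin without an item


def pipeline_pass(bins, x):
    """Send one item x through the array; bins is a list of [item, count]."""
    matched = False
    last = len(bins) - 1
    for i in range(last):
        # Order the neighbours so the smaller count moves forward with x.
        if bins[i][1] < bins[i + 1][1]:
            bins[i], bins[i + 1] = bins[i + 1], bins[i]
        # Compare x with the bin that is left behind at position i.
        if not matched and bins[i][0] == x:
            bins[i][1] += 1
            matched = True
    if not matched:
        # The last bin now holds a minimum count: match it or take it over.
        bins[last][0] = x
        bins[last][1] += 1


def run_pipeline(stream, k):
    bins = [[EMPTY, 0] for _ in range(k)]
    for x in stream:
        pipeline_pass(bins, x)
    return {item: count for item, count in bins if item is not EMPTY}


if __name__ == "__main__":
    print(run_pipeline(["a", "b", "a", "c", "a", "b", "d"], k=3))
```

Each step touches only two neighbouring bins, so in a hardware pipeline a new item could enter while earlier items are still moving toward the end of the array.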
[Plot: throughput [million items / sec] (20 to 100) vs. number of items monitored (16 to 1024); curves: software, FPGA (data parallel), FPGA (pipeline parallel).]
Straightforward s/w → h/w mapping will not do the job.
Signal propagation delays will limit scalability.
Keep communication and synchronization cheap.
This work was supported by the Swiss National Science Foundation.