slide-1
SLIDE 1

Adaptive FPGA-based Database Accelerators – Achievements, Possibilities, and Challenges

Daniel Ziener and Jürgen Teich

slide-2
SLIDE 2

Database Acceleration – Overview

Daniel Ziener | 07.03.2017 | Dagstuhl | FPGA-based Database Accelerators – Achievements, Possibilities, and Challenges

Idea: Translate each SQL query into an FPGA-based accelerator circuit through run-time assembly of dynamically reconfigurable hardware modules

Example SQL query: SELECT Price, Volume FROM Trades WHERE Symbol = 'UBSN' INTO UBSTrades

[Figure: the SQL query is assembled from the hardware module library into a module chain on the FPGA (DynSoC): Trades → WHERE a (a: Symbol = 'UBSN') → SELECT Price, Volume → UBSTrades]
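The run-time assembly idea can be mimicked in software. The following is only a sketch: `Module`, `where_equals`, `select`, and `assemble` are hypothetical names that do not appear on the slides, Python generators stand in for streaming hardware modules, and the trade records are made-up sample data.

```python
from dataclasses import dataclass
from typing import Callable, Iterator

Row = dict  # one tuple of a table, keyed by attribute name


@dataclass
class Module:
    """Hypothetical software stand-in for one module of the library."""
    name: str
    process: Callable[[Iterator[Row]], Iterator[Row]]


def where_equals(attr, value):
    # Restriction module: passes only tuples with attr == value.
    return Module(f"WHERE {attr} = {value!r}",
                  lambda rows: (r for r in rows if r[attr] == value))


def select(*attrs):
    # Projection module: keeps only the listed attributes.
    return Module("SELECT " + ", ".join(attrs),
                  lambda rows: ({a: r[a] for a in attrs} for r in rows))


def assemble(modules):
    """Chain the modules so that tuples stream through them in order."""
    def run(rows):
        stream = iter(rows)
        for module in modules:
            stream = module.process(stream)
        return list(stream)
    return run


# Made-up sample data for the Trades table.
trades = [
    {"Symbol": "UBSN", "Price": 17.5, "Volume": 100},
    {"Symbol": "NESN", "Price": 72.0, "Volume": 50},
    {"Symbol": "UBSN", "Price": 17.6, "Volume": 200},
]

accelerator = assemble([where_equals("Symbol", "UBSN"),
                        select("Price", "Volume")])
ubs_trades = accelerator(trades)
```

Each query yields a different module list, while `assemble` stays the same. This loosely mirrors a static system that is re-populated with different modules per query.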

slide-3
SLIDE 3

Database Acceleration – Architecture


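[Figure: a host connected via PCIe to an FPGA with two reconfigurable areas (each with IN and OUT interfaces) and a reconfiguration manager; a module library provides comparator (>, <) and AND modules; data streams through the areas.]

Example queries shown in the figure:

  • SELECT * FROM table WHERE age > 20 (one comparator module)
  • SELECT * FROM table WHERE salary > 10000 AND year < 1990 (two comparator modules and one AND module)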
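The two example queries differ only in which comparator and AND modules are combined. A minimal sketch of that composition follows; `comparator` and `and_module` are hypothetical helper names, the table rows are made-up, and evaluating one query per list comprehension is merely an analogy for one query per reconfigurable area.

```python
import operator

# Comparison operators that appear in the figure.
_OPS = {">": operator.gt, "<": operator.lt}


def comparator(attr, op, const):
    """One comparator module: evaluates `attr <op> const` per tuple."""
    fn = _OPS[op]
    return lambda row: fn(row[attr], const)


def and_module(left, right):
    """AND module: combines the outputs of two upstream modules."""
    return lambda row: left(row) and right(row)


# WHERE age > 20
query_1 = comparator("age", ">", 20)
# WHERE salary > 10000 AND year < 1990
query_2 = and_module(comparator("salary", ">", 10000),
                     comparator("year", "<", 1990))

# Made-up sample rows.
table = [
    {"age": 25, "salary": 12000, "year": 1985},
    {"age": 18, "salary": 15000, "year": 1995},
    {"age": 40, "salary": 9000, "year": 1980},
]

result_1 = [row for row in table if query_1(row)]
result_2 = [row for row in table if query_2(row)]
```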

slide-4
SLIDE 4

Database Acceleration – Overview Module Library

Module | Operator coverage | Number of slots | Throughput
Restriction | Arithmetic (+, -), comparators (<, >, =, ≠), bitwise functions (AND, OR, NOT, XOR, ...) | 2 | 1 sample/cycle
Aggregation | SUM(), MIN(), MAX(), COUNT() | 2 | 1 sample/cycle
Reorder | Reorders the attributes of a tuple | 4 | 1 sample/cycle
Join | Hash and merge join | (not given) | 1 sample/cycle
Sort line | Sorts 2 KB (64 KB) of data | 16 | 1 sample/cycle
Sort tree | Merges sorted blocks | (not given) | 1 sample/cycle

  • Each reconfigurable area consists of 16 slots
  • 4 reconfigurable areas available on our prototype
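
Since every module occupies a fixed number of slots, whether a module chain fits into one reconfigurable area is a simple budget check. The sketch below uses only the slot counts that are legible in the table; Join and Sort tree are left out because their counts are not given. The function names are hypothetical, and the check ignores any placement constraints the real system may have.

```python
SLOTS_PER_AREA = 16  # from the slide: 16 slots per reconfigurable area
NUM_AREAS = 4        # from the slide: 4 areas on the prototype

# Slot costs taken from the module table (Join and Sort tree omitted:
# their slot counts are not given in the source).
SLOT_COST = {
    "restriction": 2,
    "aggregation": 2,
    "reorder": 4,
    "sort_line": 16,
}


def slots_needed(modules):
    """Total number of slots occupied by a chain of modules."""
    return sum(SLOT_COST[m] for m in modules)


def fits_in_area(modules):
    """True if the whole chain fits into a single reconfigurable area."""
    return slots_needed(modules) <= SLOTS_PER_AREA


filter_chain = ["restriction", "restriction", "reorder", "aggregation"]
sort_chain = ["sort_line", "restriction"]

filter_slots = slots_needed(filter_chain)  # 2 + 2 + 4 + 2
sort_slots = slots_needed(sort_chain)      # 16 + 2
```

Under these numbers, a sort line fills an entire area by itself, so any additional module has to be placed in another area.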
slide-5
SLIDE 5

Database Acceleration – Lessons Learned

  • High processing throughput achievable
  • Pipelined modules have a throughput of 2 GByte/s per reconfigurable area (125 MHz x 16 Bytes)

  • The throughput is independent of the number of concatenated modules
  • I/O turns out to define the bottleneck
  • PCIe Gen2 x4: 1.7 GByte/s
  • Only one interface to feed all reconfigurable areas
  • Flexibility is the key feature
  • For each query different decisions can be taken at run-time
  • All processing alternatives can be executed on the same static system


[Figure: examples of run-time alternatives: hash join vs. merge join, row-based vs. column-based processing]

  • New architecture: 12.8 GByte/s processing throughput at 64 Bytes per clock cycle
  • New architecture: DDR3 memory with 12.8 GByte/s
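The throughput figures on this slide follow from simple arithmetic, sketched below. The values are taken from the slide; the helper name is hypothetical, and the aggregate figure assumes that all four reconfigurable areas of the prototype stream data at the same time, which the slide does not state explicitly.

```python
def throughput_gbyte_per_s(clock_mhz, bytes_per_cycle):
    """Peak throughput of a pipeline that accepts one word per cycle."""
    return clock_mhz * 1e6 * bytes_per_cycle / 1e9


# One reconfigurable area: 125 MHz x 16 Bytes per cycle.
area_throughput = throughput_gbyte_per_s(125, 16)

# PCIe Gen2 x4 as measured on the slide.
pcie_throughput = 1.7

# The slower stage bounds the end-to-end rate.
effective = min(area_throughput, pcie_throughput)
area_utilization = effective / area_throughput

# Assumption: all 4 areas active at once behind the single interface.
aggregate_demand = 4 * area_throughput
interface_share = pcie_throughput / aggregate_demand

# Clock implied by the new architecture: 12.8 GByte/s at 64 Bytes/cycle.
implied_clock_mhz = 12.8e9 / 64 / 1e6

print(f"per area: {area_throughput} GByte/s, effective: {effective} GByte/s")
print(f"implied clock of new architecture: {implied_clock_mhz} MHz")
```

A single area can already consume more data than the PCIe link delivers, which is why I/O is identified as the bottleneck.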

slide-6
SLIDE 6

Database Acceleration – New High-Performance Architecture


[Figure: new architecture. Labels: incoming queries; database tables; FPGA with alignment unit, reconfigurable area (comparator modules and a Bloom filter), and configuration manager; host performing hash join and aggregation. Timeline: query analysis and filter configuration, followed by repeated data processing steps on FPGA and host.]
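The figure places a Bloom filter in the reconfigurable area and the hash join on the host. One plausible reading, which is an assumption and not spelled out on the slide, is that the Bloom filter discards tuples that cannot find a join partner before the host performs the exact join. The sketch below illustrates that pattern in software with a hypothetical `BloomFilter` class and made-up tables.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, key):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


# Made-up tables: the small one is the build side of the join.
customers = [{"id": 1, "name": "A"}, {"id": 3, "name": "C"}]
orders = [{"cust": i, "amount": 10 * i} for i in range(1, 6)]

# Filter configuration: insert all join keys of the build side.
bloom = BloomFilter()
for customer in customers:
    bloom.add(customer["id"])

# Pre-filter (FPGA side in this reading): drop tuples without a partner.
candidates = [o for o in orders if bloom.might_contain(o["cust"])]

# Exact hash join (host side): also removes any false positives.
by_id = {c["id"]: c for c in customers}
joined = [(by_id[o["cust"]]["name"], o["amount"])
          for o in candidates if o["cust"] in by_id]
```

Because the filter never produces false negatives, the join result is the same as without pre-filtering. Only the amount of data reaching the host changes.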

slide-7
SLIDE 7

Database Acceleration – Results (FPT’15)

  • Comparing energy/power consumption of an Intel Core i7 with our approach based on an embedded Xilinx Zynq-SoC
  • Analysis of an example query based on the TPC-DS benchmark (1 GB scale), including restrictions, aggregations, and joins

Measure | Accl@Zynq | ARM – MySQL | Intel i7 – MySQL
Execution time | 44.2 ms | 6900 ms | 420 ms
Overall energy | 190 mJ | 1.47 J | 5.33 J
Improvement in execution time (factor) | - | 156 | 9.5
Improvement in energy (factor) | - | 7.72 | 27.97
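
The improvement factors are ratios of each MySQL baseline to the accelerator. The sketch below recomputes them from the values on the slide. The dictionary layout and function name are hypothetical. Because the slide rounds the accelerator energy to 190 mJ, the recomputed energy factors deviate slightly from the reported 7.72 and 27.97.

```python
# Values as given on the slide (energy converted to joules).
measurements = {
    "Accl@Zynq": {"time_ms": 44.2, "energy_j": 0.190},
    "ARM - MySQL": {"time_ms": 6900.0, "energy_j": 1.47},
    "Intel i7 - MySQL": {"time_ms": 420.0, "energy_j": 5.33},
}

accelerator = measurements["Accl@Zynq"]


def improvement(system, metric):
    """Factor by which the accelerator improves on `system` for `metric`."""
    return measurements[system][metric] / accelerator[metric]


speedup_arm = improvement("ARM - MySQL", "time_ms")
speedup_i7 = improvement("Intel i7 - MySQL", "time_ms")
energy_gain_arm = improvement("ARM - MySQL", "energy_j")
energy_gain_i7 = improvement("Intel i7 - MySQL", "energy_j")

for name, value in [("speedup vs ARM", speedup_arm),
                    ("speedup vs i7", speedup_i7),
                    ("energy gain vs ARM", energy_gain_arm),
                    ("energy gain vs i7", energy_gain_i7)]:
    print(f"{name}: {value:.2f}")
```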

slide-8
SLIDE 8

Database Acceleration – Results (FPT’15)


More Information:

[1] D. Ziener, F. Bauer, A. Becher, C. Dennl, K. Meyer-Wegener, U. Schürfeld, J. Teich, J. Vogt, and H. Weber. FPGA-Based Dynamically Reconfigurable SQL Query Processing. ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol. 9, no. 4, Article 25, July 2016.

[2] A. Becher, D. Ziener, K. Meyer-Wegener, and J. Teich. A Co-Design Approach for Accelerated SQL Query Processing via FPGA-based Data Filtering. In Proceedings of the 2015 International Conference on Field-Programmable Technology (FPT '15), Queenstown, New Zealand, December 7-9, 2015.

slide-9
SLIDE 9

Current Database Management Systems

  • Database management systems are multi-user systems
  • Different queries of differing complexity have to be processed on different data at the same time

  • Response time is very important
  • A wide range of different operations
  • Query processing
  • Sorting
  • Data analytics
  • Data update
  • Changing load scenarios over time
  • E.g., day: query processing; night: data analytics


slide-10
SLIDE 10

HW Accelerators for Big Data Applications

  • Current software solutions
  • Multi-Core server systems with many nodes
  • On each core, data processing is done with data or time slices
  • Advantages:
  • OS support (task switching, mapping onto processing places)
  • Easy to extend with new operators or analytic functions


Question: How can we achieve such a flexibility for HW-based accelerators?

[Figure: host with a module library (comparator and AND modules) and an FPGA; labeled interfaces: PCIe/CAPI and SATA; attached SSDs and external memory]