Reconfigurable hardware for big data, Gustavo Alonso (PowerPoint PPT Presentation)



SLIDE 1

Reconfigurable hardware for big data

Gustavo Alonso Systems Group Department of Computer Science ETH Zurich, Switzerland

SLIDE 2

www.systems.ethz.ch

Systems Group

  • 7 faculty
  • ~40 PhD
  • ~8 postdocs

Researching all aspects of system architecture, software and hardware

SLIDE 3

The team behind the work:

David Sidler, Zsolt Istvan, Kaan Kara, Muhsen Owaida

SLIDE 4

Data processing today: Appliances and Data Centers (Cloud)

SLIDE 5

What is a database engine?

  • As complex as, or more complex than, an operating system
  • Full software stack, including:
      • Parsers, compilers, optimizers
      • Its own resource management (memory, storage, network)
      • Plugins for application logic
      • Infrastructure for distribution, replication, notifications, recovery
      • Extract, Transform, and Load (ETL) infrastructure
  • Large legacy, backward compatibility, standards
  • Hugely optimized
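To make that stack concrete, here is a toy sketch of the parse, optimize, execute pipeline every engine implements. This is pure illustration: all names and the micro-grammar are made up, and no real engine is remotely this simple.

```python
# Toy illustration of the classic engine stack: parse -> optimize -> execute,
# over an in-memory table. Not any real engine's API.

def parse(query):
    # Hypothetical micro-grammar: "SELECT <col> WHERE <col> > <n>"
    tokens = query.split()
    return {"select": tokens[1], "filter_col": tokens[3], "value": int(tokens[5])}

def optimize(plan, table):
    # A real optimizer would consult statistics; here we only record an estimate.
    plan["estimated_rows"] = len(table)
    return plan

def execute(plan, table):
    col, val = plan["filter_col"], plan["value"]
    return [row[plan["select"]] for row in table if row[col] > val]

table = [{"id": 1, "price": 5}, {"id": 2, "price": 15}, {"id": 3, "price": 25}]
plan = optimize(parse("SELECT id WHERE price > 10"), table)
print(execute(plan, table))  # [2, 3]
```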
SLIDE 6

Databases are blindingly fast at what they do well

SLIDE 7

Databases = think big

ORACLE EXADATA (from Oracle documentation)

SLIDE 8

Database engine trends: Appliances

  • Oracle: T7, SQL in hardware, RAPID
  • SAP: OLTP+OLAP on main memory; Hana on an SGI supercomputer

SAP Hana on SGI UV 300H (from SGI documentation)

SLIDE 9

The challenge of hardware acceleration

SLIDE 10

If it sounds too good to be true…

SLIDE 11

Usual unspoken caveats in HW acceleration

  • Where is the data to start with?
  • Where does the data have to be at the end?
  • What happens with irregular workloads?
  • What happens with large intermediate states?
  • What is the architecture?
  • Is the design preventing the system from doing something else?
  • Can the accelerator be multithreaded?
  • Is the gain big enough to justify the additional complexity?
  • Can the gains be characterized?
SLIDE 12

Do not replace, enhance

Help the CPU to do what it does not do well

SLIDE 13

Text search in databases

INTEL HARP: This is an experimental system provided by Intel; any results presented are generated using pre-production hardware and software, and may not reflect the performance of production or future systems.

Istvan et al, FCCM’16
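The FCCM'16 work compiles text patterns into hardware state machines. As a software sketch of the underlying idea, the classic bit-parallel shift-and algorithm keeps every NFA state in one machine word and advances all of them with a shift and a mask per input character, exactly the kind of regular, branch-free computation that maps well onto FPGA logic. This is a simplified illustration, not the paper's actual operator.

```python
def shift_and(text, pattern):
    """Bit-parallel substring search; bit i of `state` says the first
    i+1 pattern characters match ending at the current position."""
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (len(pattern) - 1)
    state, hits = 0, []
    for pos, ch in enumerate(text):
        # One shift + one AND per character: regular work, ideal for hardware.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            hits.append(pos - len(pattern) + 1)
    return hits

print(shift_and("database hardware database", "base"))  # [4, 22]
```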

SLIDE 14

100% processing on FPGA

SLIDE 15

Hybrid Processing CPU/FPGA
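One way to read "hybrid": give the accelerator the regular, streaming part of the query and keep the irregular part on the CPU. The sketch below simulates that split in plain Python; the function names and the work division are illustrative assumptions, not the system's actual interface.

```python
# Hypothetical hybrid split: a simulated "FPGA" stage does the regular,
# data-parallel predicate, the CPU does the irregular aggregation.

def fpga_filter(rows, threshold):
    # Stand-in for an accelerator: a streaming, branch-free predicate.
    return [r for r in rows if r["value"] > threshold]

def cpu_aggregate(rows):
    # Irregular work (hashing, grouping, large intermediate state) stays on the CPU.
    groups = {}
    for r in rows:
        groups[r["key"]] = groups.get(r["key"], 0) + r["value"]
    return groups

rows = [{"key": "a", "value": 3}, {"key": "b", "value": 7}, {"key": "a", "value": 9}]
print(cpu_aggregate(fpga_filter(rows, 5)))  # {'b': 7, 'a': 9}
```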

SLIDE 16

Inside a real database …

Sidler et al., SIGMOD’17

SLIDE 17

Accelerating real engines

SLIDE 18

Accelerators to come

From Oracle M7 documentation

SLIDE 19

If the data moves, do it efficiently

Bumps in the wire(s)

SLIDE 20

(Woods, VLDB’14)

IBEX
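The idea behind a bump in the wire like Ibex is that filtering and projection happen on the path between storage and the query engine, so only useful bytes reach the host. A generator pipeline gives a rough software analogue; the row data and stage names below are invented for illustration.

```python
# Rough software analogue of bump-in-the-wire filtering: stages are chained
# like the datapath, and rows are dropped before they reach the "host".

def storage_scan():
    # Stand-in for rows streaming off storage: (id, name, total).
    for row in [(1, "alice", 120), (2, "bob", 40), (3, "carol", 300)]:
        yield row

def inline_filter(stream, min_total):
    # The "smart storage" stage: filter and project in-line.
    for row in stream:
        if row[2] >= min_total:
            yield (row[0], row[2])  # only the projected columns cross the wire

result = list(inline_filter(storage_scan(), 100))
print(result)  # [(1, 120), (3, 300)]
```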

SLIDE 21

Sounds good?

The goal is to be able to do this at all levels:

  • Smart storage
  • On the network switch (SDN-like)
  • On the network card (smart NIC)
  • On the PCI Express bus
  • On the memory bus (active memory)

Every element in the system (a node, a computer rack, a cluster) will be a processing component

SLIDE 22

Disaggregated data center

Near Data Computation

SLIDE 23

18-Nov-16

Consensus in a Box (Istvan et al, NSDI’16)

[Diagram: Xilinx VC709 Evaluation Board with four SFP+ ports, 8 GB of DRAM, and an FPGA layered as networking, atomic broadcast, and a replicated key-value store; reads and writes arrive from SW clients and other nodes over TCP and direct connections]
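Consensus in a Box puts Zookeeper-style atomic broadcast into hardware. The commit rule at its core, a write commits once a majority of the group has it, fits in a few lines of software. This is a toy, single-process sketch with invented names; real atomic broadcast also handles ordering, leader election, and recovery.

```python
# Toy majority-quorum commit, the rule behind ZAB/Paxos-style replication.
# Replicas are plain objects; the "network" is a function call.

class Replica:
    def __init__(self):
        self.log = []

    def ack(self, entry):
        self.log.append(entry)
        return True

def leader_put(replicas, key, value):
    entry = (key, value)
    acks = 1  # the leader counts itself
    for r in replicas:
        if r.ack(entry):
            acks += 1
    # Commit once a majority of the full group (leader + replicas) holds the entry.
    group_size = len(replicas) + 1
    return acks >= group_size // 2 + 1

followers = [Replica(), Replica()]
print(leader_put(followers, "k", "v"))  # True: 3 of 3 nodes acked
```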

SLIDE 24

The system

[Diagram: 12 clients, a 10 Gbps switch, and a 3-FPGA cluster]

  • Drop-in replacement for memcached with Zookeeper’s replication
  • Standard tools for benchmarking (libmemcached)
  • Simulating 100s of clients
  • Communication over TCP/IP
  • Communication over direct connections
  • + Leader election, + Recovery

SLIDE 25

Latency of puts in a KVS

[Diagram: consensus 15-35 µs; memaslap (ixgbe) over TCP / 10 Gbps Ethernet ~10 µs; direct connections ~3 µs]

SLIDE 26

The benefit of specialization…

[Chart: throughput (consensus rounds/s, 10^3 to 10^7) vs. consensus latency (µs, 1 to 1000) for FPGA (Direct), FPGA (TCP), DARE* (Infiniband), Libpaxos (TCP), Etcd (TCP), Zookeeper (TCP)]

Specialized solutions outperform general-purpose solutions by 10-100x.

[1] Dragojevic et al. FaRM: Fast Remote Memory. In NSDI’14.
[2] Poke et al. DARE: High-Performance State Machine Replication on RDMA Networks. In HPDC’15.
* = We extrapolated from the 5-node setup to a 3-node setup.

SLIDE 27

This is the end …

  • There is a killer application (data science / big data)
  • The infrastructure for data processing is evolving very fast (appliances, data centers)
  • Conventional processors and architectures are not good enough
  • FPGAs are great tools to:
      • Explore parallelism
      • Explore new architectures
      • Explore Software Defined X/Y/Z
      • Prototype accelerators