Cheaper, Faster Computing with hardware accelerators and NVM




SLIDE 1

Cheaper, Faster Computing

with hardware accelerators and NVM storage

Sang-Woo Jun Assistant Professor Department of Computer Science University of California, Irvine

2018-10-05

SLIDE 2

About Me

• Sang-Woo Jun
• Ph.D. (2018) @ MIT
• Research Interests

  • Systems architecture
  • Accelerators
  • NVM storage
  • Applications!
  • Graphs, Bioinformatics, Machine learning…

• Some Nice Papers

  • (ISCA, VLDB, FAST, FPGA, …)

• Some Nice Media Coverage

  • Engadget, The Next Platform, …
SLIDE 3

Exciting Time to Be a Computer Architect

Google TPU Microsoft Azure Samsung Reconfigurable Processor

SLIDE 4

Not the most exciting time to be an architect…

A Computer – Some History

CPU Memory

Program Data

John Hennessy and David Patterson, “Computer Architecture: A Quantitative Approach”, 2018 (Cropped)
Bon-jae Koo, “Understanding of semiconductor memory architecture”, 2007 (Cropped)

Same program runs faster on more data tomorrow

SLIDE 5

Running Into the Power Wall


SLIDE 6

Crisis Averted With Manycores?

Bernd Hoefflinger, “ITRS 2028—International Roadmap of Semiconductors”, 2015

CPU

Program Data

Memory CPU

SLIDE 7

Memory/Storage Worries Too!

“[…] per gigabit (Gb) has declined from $11 in 2006 to less than $1 [in 2013]”

We are still around $0.5–$1/Gb as of 2018

Western Digital, “CPU Bandwidth – The Worrisome 2020 Trend”, 2016

Processing requirements are still increasing exponentially!

SLIDE 8

The Exascale Challenge

Lynn Freeny, Department of Energy

Department of Energy requests an exaflop machine by 2020

  • 1,000,000,000,000,000,000 floating point operations per second
  • Using 2016 technology: 200 MW
  • MIT Research nuclear reactor: 6 MW
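The figures on this slide can be put side by side with a little arithmetic. The sketch below is illustrative only: the constant and function names are made up for this note, and it treats the reactor's 6 MW as directly comparable to the machine's 200 MW draw, which the slide implies but does not spell out.

```python
# Figures quoted on the slide; names are hypothetical, chosen for this sketch.
EXAFLOP = 10**18             # floating point operations per second
POWER_2016_TECH_W = 200e6    # 200 MW using 2016 technology
MIT_REACTOR_W = 6e6          # 6 MW MIT research reactor


def flops_per_watt(flops: float, watts: float) -> float:
    """Energy efficiency implied by a performance target and a power draw."""
    return flops / watts


def reactors_needed(load_watts: float, reactor_watts: float) -> float:
    """How many reactors of the given size would match the load."""
    return load_watts / reactor_watts


efficiency = flops_per_watt(EXAFLOP, POWER_2016_TECH_W)
reactors = reactors_needed(POWER_2016_TECH_W, MIT_REACTOR_W)

print(f"{efficiency / 1e9:.1f} GFLOPS per watt")   # 5.0 GFLOPS per watt
print(f"{reactors:.1f} reactors")                  # 33.3 reactors
```

Under these assumptions, an exaflop machine built with 2016 technology delivers about 5 GFLOPS per watt and needs the output of roughly 33 such reactors.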

SLIDE 9

Smaller Challenges Near Us

Smartphones IoT Devices AI Assistants

SLIDE 10

No Better Time to Be an Architect!

Photo: Peg Skorpinski,UC Berkeley

“There are Turing Awards waiting to be picked up if people would just work on these things.” —David Patterson, 2018

SLIDE 11

A Big Data Application: Personalized Genome

Cancer Patient Normal Genome Tumor Genome Next-Generation Sequencing Identified Mutations

“Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads,” Moncunill V. & Gonzalez S., et al., 2014

SLIDE 12

Cluster System for Personalized Genome

  • Complex Algorithm
  • Terabytes of Data
  • 16 Machines (2 TB DRAM)
  • 6 Hours
  • $100,000
  • 7,000 Watts

SLIDE 13

A Cheaper Alternative Using Hardware-Accelerated SSD

  • $2,000
  • 80 Watts
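To make the comparison with the cluster on the previous slide concrete, here is a small sketch that computes the cost and power ratios from the quoted figures. The `System` class and variable names are hypothetical. The slide gives no runtime for the accelerated system, so none is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    """Cost and power figures as quoted on the slides (hypothetical container)."""
    name: str
    cost_usd: float
    power_watts: float


cluster = System("16-machine cluster", cost_usd=100_000, power_watts=7_000)
accelerated = System("hardware-accelerated SSD system", cost_usd=2_000, power_watts=80)

# How many times cheaper and lower-power the alternative is.
cost_ratio = cluster.cost_usd / accelerated.cost_usd
power_ratio = cluster.power_watts / accelerated.power_watts

print(f"cost:  {cost_ratio:.1f}x lower")    # cost:  50.0x lower
print(f"power: {power_ratio:.1f}x lower")   # power: 87.5x lower
```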

SLIDE 14

Reconfigurable Hardware Acceleration

Field Programmable Gate Array (FPGA)

  • Program application-specific hardware
  • High performance, Low power
  • Reconfigurable to fit the application

[Figure labels: FPGA, GPU]

Bracco Filippo, “Rationale behind FPGA”, 2017

SLIDE 15

Storage for Analytics

  • Terabytes in size
  • Fine-grained, irregular access

TB of DRAM: $$$ ($8000/TB, 200W)

Our goal: $ ($500/TB, 10W)

SLIDE 16

Research Topics Galore

General Specific Programming Systems System Design OS Support Machine Learning Accelerator Libraries Climate Simulation Bioinformatics

SLIDE 17

Project: Accelerated Object Storage

[Figure labels: Client, PCIe/Ethernet, Object, Virtual Object, FPGA Acceleration]

  • Storage exposes high-level object store abstraction to software
  • Computation offloaded to accelerator using “virtual objects”, not breaking object store abstraction
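One way to picture the “virtual object” idea is a store where some keys name a computation over a stored object instead of stored bytes, so the client still only issues ordinary reads. The sketch below is a toy, in-memory illustration and not the project's actual interface: `ObjectStore`, `define_virtual`, and `count_newlines` are hypothetical names, and a plain Python function stands in for the FPGA kernel.

```python
from typing import Callable, Dict, Tuple


class ObjectStore:
    """Toy in-memory object store standing in for the storage device."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}
        # virtual key -> (source object key, kernel applied on read)
        self._virtual: Dict[str, Tuple[str, Callable[[bytes], bytes]]] = {}

    def put(self, key: str, data: bytes) -> None:
        self._objects[key] = data

    def define_virtual(self, key: str, source_key: str,
                       kernel: Callable[[bytes], bytes]) -> None:
        """Register a virtual object computed from an existing object."""
        if source_key not in self._objects:
            raise KeyError(source_key)
        self._virtual[key] = (source_key, kernel)

    def get(self, key: str) -> bytes:
        """Same read call for both kinds of object; virtual ones run the kernel."""
        if key in self._virtual:
            source_key, kernel = self._virtual[key]
            return kernel(self._objects[source_key])
        return self._objects[key]


def count_newlines(data: bytes) -> bytes:
    """Stand-in for an accelerator kernel: returns the line count as bytes."""
    return str(data.count(b"\n")).encode()


store = ObjectStore()
store.put("reads.txt", b"ACGT\nTTGA\nCCAT\n")
store.define_virtual("reads.txt#lines", "reads.txt", count_newlines)

print(store.get("reads.txt#lines"))   # b'3'
```

The client never calls the kernel directly; it reads `reads.txt#lines` like any other object, which is the sense in which the abstraction stays intact.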

SLIDE 18

Project: Accelerating Stencil Computation for Climate Simulation
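For readers unfamiliar with the term, a stencil computation updates every grid cell from a fixed pattern of its neighbors. The slide does not say which stencil the project targets, so the sketch below uses a generic 5-point averaging (Jacobi-style) step purely as an illustration. The function and variable names are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def jacobi_step(grid: Grid) -> Grid:
    """Apply one 5-point stencil step; boundary cells are left unchanged."""
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Each interior cell becomes the average of its four neighbors.
            new[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                + grid[i][j - 1] + grid[i][j + 1])
    return new


# A single hot cell in the middle of a 5x5 grid spreads to its neighbors.
grid = [[0.0] * 5 for _ in range(5)]
grid[2][2] = 100.0
after = jacobi_step(grid)

print(after[1][2], after[2][1], after[2][2])   # 25.0 25.0 0.0
```

Every cell applies the same small, regular access pattern, which is the property that makes stencils a natural fit for hardware pipelines.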

SLIDE 19

Project: Distributed FPGA Cluster

SLIDE 20

Project: Applications For Accelerator Platform

• Platform for efficient fine-grained acceleration
• Goal: 10x performance against baseline
• Claim: Easy to develop!
• Candidate applications: Dynamic Time Warping, Smith-Waterman, Cosine Similarity, N-body simulation, …
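As a reference point for what a software baseline might look like for one of the candidate applications, here is a minimal sketch of Dynamic Time Warping using the textbook dynamic-programming recurrence. It is not the project's implementation; the function name and the absolute-difference cost are assumptions made for this example.

```python
from typing import List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) DTW with absolute difference as local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost: List[List[float]] = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],       # a[i-1] repeats a match
                                 cost[i][j - 1],       # b[j-1] repeats a match
                                 cost[i - 1][j - 1])   # both advance
    return cost[n][m]


# The second series is the first with one sample stretched, so the cost is zero.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))   # 0.0
```

Each table cell depends only on three neighboring cells, a fine-grained dependency pattern of the kind the platform is meant to accelerate.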

Ideas?

SLIDE 21

Things To Come!

Thank you!