CS6453 Data-Intensive Systems: Technology trends, Emerging challenges & opportunities



SLIDE 1

Data-Intensive Systems:

Technology trends, Emerging challenges & opportunities

CS6453

Rachit Agarwal

Slides based on: many many discussions with Ion Stoica, his class, and many industry folks

SLIDE 2

Servers — Typical node

[Diagram: a typical server node, connected via the memory bus, PCI, SATA, and Ethernet]

SLIDE 3

Servers — Typical node

              Bandwidth    Capacity      Time to read
  Memory bus  80 GB/s      100s of GB    10s of seconds

SLIDE 4

Servers — Typical node

              Bandwidth    Capacity      Time to read
  Memory bus  80 GB/s      100s of GB    10s of seconds
  PCI         1 GB/s       100s of GB    10s of minutes

SLIDE 5

Servers — Typical node

              Bandwidth    Capacity      Time to read
  Memory bus  80 GB/s      100s of GB    10s of seconds
  PCI         1 GB/s       100s of GB    10s of minutes
  SATA (SSD)  600 MB/s     100s of GB    10s of minutes
  SATA (HDD)  100 MB/s     1s of TB      hours
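The "time to read" column is just capacity divided by link bandwidth. A minimal back-of-envelope sketch in Python (the concrete capacities below are illustrative assumptions; the slide only gives orders of magnitude):

    # Time to read a device's full capacity over its link: capacity / bandwidth.
    GB, TB = 1e9, 1e12
    links = {
        # name: (link bandwidth, bytes/s; capacity, bytes -- assumed values)
        "Memory bus": (80 * GB, 800 * GB),   # "100s of GB"
        "PCI":        (1 * GB,  800 * GB),
        "SATA (SSD)": (600e6,   800 * GB),
        "SATA (HDD)": (100e6,   3 * TB),     # "1s of TB"
    }
    for name, (bw, cap) in links.items():
        secs = cap / bw
        print(f"{name:10s} {secs:8.0f} s  (~{secs / 60:.1f} min)")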

SLIDE 6

Trends — Moore’s law slowing down?

  • Stated 50 years ago by Gordon Moore
  • Number of transistors on a microchip doubles every ~2 years
  • Why is this interesting for systems people?
  • Brian Krzanich: today, closer to 2.5 years (see the sketch below)
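A doubling period of T years translates to an annual growth factor of 2^(1/T). A quick sketch of what the shift from 2 to 2.5 years means:

    # Annual growth implied by a doubling period of T years: 2 ** (1 / T).
    for T in (2.0, 2.5):
        rate = 2 ** (1 / T) - 1
        print(f"doubling every {T} years -> +{rate:.0%} per year")  # +41%, +32%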

SLIDE 7

Trends — CPU (#cores)

Today, +20% every year

SLIDE 8

Servers — Trends

[Diagram: typical node, annotated with core-count growth: +20% per year]

SLIDE 9

Trends — CPU (performance per core)

Today, +10% every year

SLIDE 10

Trends — CPU scaling

  • Number of cores: +20% per year
  • Performance per core: +10% per year
  • Overall: +30-32% per year (see the sketch below)
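The overall figure is just the two rates compounding, as a quick check shows:

    # Aggregate CPU throughput: core count and per-core performance compound.
    cores, per_core = 1.20, 1.10          # +20% and +10% per year
    print(f"overall: +{cores * per_core - 1:.0%} per year")  # +32%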

SLIDE 11

Servers — Trends

[Diagram: typical node, annotated: CPU +30% per year]

SLIDE 12

Trends — Memory

+29% every year

SLIDE 13

Servers — Trends

[Diagram: typical node, annotated: CPU +30%, DRAM capacity +30% per year]

SLIDE 14

Trends — Memory Bus

+15% every year

SLIDE 15

Servers — Trends

[Diagram: typical node, annotated: CPU +30%, DRAM capacity +30%, memory bus +15% per year]

SLIDE 16

Trends — SSD

SSDs cheaper than HDD

SLIDE 17

Trends — SSD capacity scaling

  • Following Moore’s law (late start)
  • 3D technologies
  • May even outpace Moore’s law

SLIDE 18

Servers — Trends

[Diagram: typical node, annotated: CPU +30%, DRAM capacity +30%, memory bus +15%, SSD capacity >30% per year]

SLIDE 19

Trends — PCI bandwidth (and ~SATA)

+15-20% every year

SLIDE 20

Servers — Trends

[Diagram: typical node, annotated: CPU +30%, DRAM capacity +30%, memory bus +15%, SSD capacity >30%, PCI/SATA +15-20% per year]

SLIDE 21

Trends — Ethernet bandwidth

+33-40% every year

SLIDE 22

Servers — Trends

[Diagram: typical node, annotated: CPU +30%, DRAM capacity +30%, memory bus +15%, SSD capacity >30%, PCI/SATA +15-20%, Ethernet +40% per year]
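Taken together, these rates quantify the coming bottleneck: if DRAM capacity grows ~30% per year but the memory bus only ~15%, the time to scan all of memory grows every year. A small sketch of that divergence:

    # Capacity outgrowing the bus: the full-memory scan time grows by the
    # ratio of the two growth rates each year.
    import math
    cap_growth, bus_growth = 1.30, 1.15
    gap = cap_growth / bus_growth                      # ~1.13 per year
    print(f"scan time: +{gap - 1:.0%} per year, "
          f"doubling every {math.log(2) / math.log(gap):.1f} years")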

SLIDE 23

Trends — Implications?

  • Intra-server bandwidth is an increasing bottleneck
  • How could we overcome this?
  • Reduce the size of the data? (see the sketch below)
  • What does that mean for applications?
  • Prefer remote over local?
  • Challenges?
  • Non-intuitive; we always prefer locality
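One way to make "reduce the size of the data" concrete: read compressed bytes off the device and decompress on the fly; this wins whenever the decompressor keeps up with the (shrunken) read. The compression ratio and speeds below are illustrative assumptions:

    # Raw scan vs. pipelined read-compressed-then-decompress. Assumed numbers.
    size = 100e9             # 100 GB of raw data
    bw = 600e6               # SATA SSD link, bytes/s
    ratio = 3.0              # assumed compression ratio
    decomp = 1.5e9           # decompressor output, bytes/s (lz4-class)

    t_raw = size / bw
    t_comp = max((size / ratio) / bw, size / decomp)   # slower stage dominates
    print(f"raw: {t_raw:.0f} s   compressed: {t_comp:.0f} s")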

SLIDE 24

Trends — Emergence of new technologies

  • Non-volatile memory
  • 8-10x the density of DRAM (close to SSD)
  • 2-4x higher latency
  • But who cares? Bandwidth is the bottleneck… (see the sketch below)
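"Who cares" has a concrete justification: for bandwidth-bound access patterns, the fixed latency is amortized over the transfer. A sketch with assumed, illustrative latencies and bandwidth:

    # NVM's 2-4x latency penalty shrinks as requests get larger.
    dram_lat, nvm_lat = 100e-9, 300e-9   # seconds, assumed ~3x gap
    bw = 10e9                            # bytes/s per channel, assumed
    for req in (64, 4096, 1 << 20):      # request size, bytes
        ratio = (nvm_lat + req / bw) / (dram_lat + req / bw)
        print(f"{req:>8} B request: NVM/DRAM time ratio = {ratio:.2f}")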

SLIDE 25

Trends — Emergence of new technologies

SLIDE 26

Trends — Emergence of new technologies

https://www.youtube.com/watch?v=IWsjbqbkqh8

SLIDE 27

Trends & Implications

  • HDD is the new tape
  • SSD/NVRAM is the new persistent storage
  • But the increasing gap between capacity and bandwidth is concerning…
  • Deeper storage hierarchy (L1, L2, L3, DRAM, NVRAM, SSD, HDD); see the sketch after this list
  • Do CPU caches even matter?
  • How do we design the software stack to work with a deeper hierarchy?
  • CPU-storage “disaggregation” is going to be the norm
  • Easier to overcome bandwidth bottlenecks
  • Google and Microsoft have already realized this
  • What happens to locality?
  • Re-think software design?
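To see why the deeper hierarchy (and CPU caches) still matters, a toy average-access-time calculation; the latencies and the 90% per-level hit rate are illustrative orders of magnitude, not measurements:

    # Average access time across a deep hierarchy, assuming each level
    # catches 90% of the accesses the level above it missed.
    hierarchy = [("L1", 1e-9), ("L2", 4e-9), ("L3", 20e-9), ("DRAM", 100e-9),
                 ("NVRAM", 300e-9), ("SSD", 100e-6), ("HDD", 10e-3)]
    miss, amat = 1.0, 0.0
    for name, lat in hierarchy:
        hit = miss if name == "HDD" else miss * 0.9   # last level always hits
        amat += hit * lat
        miss -= hit
    print(f"average access time: {amat * 1e9:.1f} ns")   # ~12 ns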
SLIDE 28

Paper 1 — Memory-centric design

  • SSD/NVRAM is the new persistent storage (+archival)
  • Not just the persistent storage, THE storage
  • +(private memory), deep storage hierarchy
  • CPU-storage “disaggregation”
  • NVRAM shared across CPUs
  • Challenges?
  • How to manage/share resources?
  • NVM: accelerators and controllers
  • Addressing? Flat virtual address space? (see the sketch after this list)
  • NVM sharing in multi-tenant scenarios?
  • NVM+CPU+Network: software-controlled?
  • Storage vs compute heavy workloads?
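For the addressing question, one naive, purely hypothetical answer is to carve a flat global address into a node id and a local offset; everything below, including the bit widths, is an assumption for illustration:

    # Hypothetical flat address space over disaggregated NVM nodes.
    NODE_BITS, OFFSET_BITS = 8, 40       # 256 nodes x 1 TB each, assumed

    def locate(global_addr: int) -> tuple[int, int]:
        """Split a flat global address into (nvm_node, local_offset)."""
        return global_addr >> OFFSET_BITS, global_addr & ((1 << OFFSET_BITS) - 1)

    node, off = locate((3 << OFFSET_BITS) | 0xDEAD)
    print(node, hex(off))                # 3 0xdead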
SLIDE 29

Paper 1 — Memory-centric design

  • New failure modes? [very interesting direction!!]
  • CPU-storage can fail independently
  • Very different from today’s “servers”
  • Good? Bad?
  • Transparent failure mitigation…?
  • How about the OS?
  • Where should the OS sit?
  • What functionalities should be implemented within the OS?
  • Application-level semantics
  • ?
SLIDE 30

Paper 2 — Nanostores (An alternative view)

  • DRAM is dead
  • SSD/NVRAM is the new persistent storage (+archival)
  • Not just the persistent storage, THE storage
  • No storage hierarchy
  • CPU-storage “convergence” is going to be the norm
  • CPU-storage hyper-convergence
  • Berkeley IRAM project (late 90s)
  • Challenges?
  • Network? (topology, intra-nanostore latency, throughput)
  • How does this bypass the trends discussed earlier? (see the sketch below)
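A back-of-envelope for how convergence bypasses the bandwidth trends: instead of pulling data across the network to the CPU, ship the query to the data and return only the answer. All numbers are illustrative assumptions:

    # Ship-data vs. ship-query for a 1 TB scan. Assumed numbers throughout.
    data, result = 1e12, 1e6     # dataset and answer sizes, bytes
    net_bw = 5e9                 # ~40 Gb/s network, bytes/s
    local_bw = 80e9              # on-package memory bandwidth, bytes/s

    t_ship_data = data / net_bw                        # 200 s
    t_ship_query = data / local_bw + result / net_bw   # ~12.5 s
    print(f"ship data: {t_ship_data:.0f} s   ship query: {t_ship_query:.1f} s")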
SLIDE 31

Trends — The missing piece?

  • Data volume is increasing significantly faster than Moore’s law (see the sketch after this list)
  • 56x increase in Google indexed data in 7 years
  • 173% increase in enterprise data
  • Uber, Airbnb, Orbitz, Hotels, …
  • Data types
  • Images, audio, videos, logs, logs, logs, genetics, astronomy, ….
  • YouTube: ~50TB of data every day
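The "56x in 7 years" figure, converted to an annual rate and set against Moore's-law doubling:

    # Annualized growth: 56x over 7 years vs. doubling every 2 years.
    data_rate = 56 ** (1 / 7) - 1        # ~ +78% per year
    moore_rate = 2 ** (1 / 2) - 1        # ~ +41% per year
    print(f"data: +{data_rate:.0%}/yr   Moore: +{moore_rate:.0%}/yr")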
SLIDE 32

Trends — Discussion

  • Other missing pieces?
  • Software overheads
  • Application workloads
  • Specialization vs. generalization?