HeteroOS - Heterogeneous Memory Management in Datacenter (Sudarsun Kannan) - PowerPoint PPT Presentation



SLIDE 1

HeteroOS - Heterogeneous Memory Management in Datacenter

Sudarsun Kannan (University of Wisconsin-Madison), Ada Gavrilovska (Georgia Tech), Vishal Gupta (VMware), Karsten Schwan (Georgia Tech)

SLIDE 2

(Figure: application stack - malloc() reaches DRAM through the OS virtual memory, memory controller, and cache; write()/read() reach a SATA hard drive through the file system.)

  • OSes had to make simple binary data placement decisions
  • Keep volatile hot data in memory
  • Move cold or persistent data to storage


Systems and OS for Decades

Introduction Motivation Analysis Design Conclusion


SLIDE 3


Data Explosion Across Environments


SLIDE 4

(Figure: the memory hierarchy is changing - more CPUs and accelerators on one side, and NVM and 3D-DRAM joining DRAM behind the controller and cache on the other: memory heterogeneity.)

Changing Memory Hierarchy


SLIDE 5

Property       DRAM      3D-DRAM   NVM
Capacity       Limited   Very low  High
Latency        1x        0.75x     4x
Bandwidth      1x        2x        0.25x
$/GB           1x        ~4x       0.3x
Persistence?   No        No        Yes
Programmable   No        No        No

Significant difference in latency, bandwidth, and capacity across devices

Changing Memory Hierarchy


SLIDE 6

Simply add more memory like NVM and applications will automatically scale?

(Figure: OS virtual memory spanning DRAM, 3D-DRAM, and NVM.)

To truly embrace memory heterogeneity:

  • 1. OSes must seamlessly scale heap capacity across heterogeneous memory
  • Our prior work – pVM (persistent virtual memory) EuroSys ‘16


OS for Heterogeneous Memory Systems


SLIDE 7


Virtual memory must provide optimal memory placement


OS for Heterogeneous Memory Systems

To truly embrace memory heterogeneity:

  • 1. OSes must seamlessly scale heap capacity across heterogeneous memory
  • Our prior work – pVM (persistent virtual memory) EuroSys ’16
  • 2. Efficiently place data across memories with different characteristics
  • HeteroOS


SLIDE 8

Complex multi-layered software stack:
  • Heterogeneity increases complexity
  • Hypervisors do not expose even NUMA info to the guest OS
  • Where to manage heterogeneity - the app, the OS, or the hypervisor (VMM)?

Complex Datacenter System Stack


SLIDE 9

State-of-the-art Systems

  • Application-level data placement across memory [X-Mem, EuroSys '16]
    ✓ Works well for heap-intensive applications with exclusive memory access
    ✓ Lacks a holistic view of the system and is ineffective for non-exclusive access
    ✓ Cannot place OS-level pages
  • OS or hypervisor-level hotness tracking and page migration [HeteroVisor, VEE '15]
    ✓ Application-agnostic data placement
    ✓ Reactive technique - significant data tracking and movement overhead


SLIDE 10
HeteroOS Key Idea

  • Designed for virtualized datacenter systems
  • Expose heterogeneity to guest-OS
  • Capture demand for different memory page types at the guest-OS
    ✓ Page types include heap, IO cache, network buffers
  • Directly allocate/place pages on faster memory based on demand for page type
  • Migration only if direct OS allocation fails


SLIDE 11
Talk Outline

  • Our study
    ✓ Analysis of memory heterogeneity impact on applications
    ✓ Page migration cost analysis
  • HeteroOS guest-OS management
    ✓ OS design for direct memory placement and management
  • Coordinated management
    ✓ Coordinated management between the guest-OS and the hypervisor
  • Conclusion

SLIDE 12

(Figure: evaluated applications and the resources they stress - X-Stream: compute + memory + IO; Metis: CPU + memory; others: memory + network, network.)

Applications


SLIDE 13

(Figure: two-socket machine with processors attached to Node 0 (FastMem) and Node 1 (SlowMem); the SlowMem node's memory is thermally throttled.)

  • Prior techniques used simulators or delayed all memory instructions
  • Infeasible for long-running applications
  • Two memory sockets used to represent fast (FastMem) and slow (SlowMem) memory
  • SlowMem node thermally throttled, reducing bandwidth by 9x

Emulating Heterogeneous Memory


SLIDE 14

(Figure: slowdown relative to using FastMem-only for Graphchi, X-Stream, Redis, and Nginx under configurations L2:B2, L3:B2, L5:B5, and L5:B9; left group shows latency sensitivity, right group bandwidth sensitivity. Taller bars mean more application slowdown.)

✓ X-axis - factor by which latency is increased and bandwidth is reduced

  • E.g., L2:B2 indicates 2x increase in latency, 2x reduction in bandwidth

✓ Y-axis - application slowdown relative to using only FastMem

Memory Latency and Bandwidth Impact?


SLIDE 15

(Figure: same slowdown chart as the previous slide.)

Heterogeneity has a significant impact on applications; current OS-level NUMA mechanisms are not sufficient.

Memory Latency and Bandwidth Impact?


SLIDE 16

  • Initially, all pages are placed in slower memory [HeteroVisor, VEE '15]

(Figure: the hypervisor moves hot pages from SlowMem to FastMem and cold pages back to SlowMem.)

  • Hypervisor identifies hot and cold pages passively
  • Moves hot pages to faster memory and cold pages to slower memory
  • Management dependent on page migration across memories
  • Memory heterogeneity is hidden from guest [Intel Memory Drive Technology]

Page migrations cause significant performance overhead

Migration-based Techniques


SLIDE 17

Hot page scan in software requires

  • Traversing page table and setting page reference bit
  • Invalidating TLB to force CPU to access page table
  • Clearing bits and traversing page table to count references
  • Repeating until hotness threshold reached

Page movement requires

  • Allocating memory pages at destination
  • Copying pages and invalidating old TLB entries
  • Releasing old memory pages

Page migration = hotness tracking + page movement
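The two phases above can be sketched as a toy model - hypothetical Python over a simulated page table, not the HeteroOS implementation:

```python
# Toy model of software hot-page scanning and page movement.
# All names are illustrative, not from the HeteroOS code base.

HOTNESS_THRESHOLD = 3  # scans a page must be referenced in to count as hot

class Page:
    def __init__(self, pfn):
        self.pfn = pfn
        self.referenced = False  # hardware-set reference bit (simulated)
        self.ref_count = 0       # software-maintained hotness counter

def scan_round(page_table, workload_refs):
    """One scan: clear bits, let the workload run, then count references."""
    for page in page_table:          # traverse page table, clear reference bits
        page.referenced = False
    # (a real scanner would invalidate the TLB here so the CPU walks the
    #  page table again and re-sets reference bits on access)
    for pfn in workload_refs:        # simulate the workload touching pages
        page_table[pfn].referenced = True
    for page in page_table:          # count references seen this round
        if page.referenced:
            page.ref_count += 1

def hot_pages(page_table):
    return [p.pfn for p in page_table if p.ref_count >= HOTNESS_THRESHOLD]

def migrate(page, dst_mem, src_mem):
    """Move one page: allocate at destination, copy, release the old page."""
    dst_mem.append(page)    # allocate destination page + copy contents
    src_mem.remove(page)    # invalidate old TLB entries, free old page

pages = [Page(i) for i in range(4)]
for _ in range(3):                   # repeat until threshold can be reached
    scan_round(pages, [0, 2])        # pages 0 and 2 are touched every round
print(hot_pages(pages))              # pages 0 and 2 exceed the threshold
```

The repeated clear-and-count loop is why tracking is costly: every round touches the whole page table and forces TLB invalidations.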

Understanding Page Migration Overheads


SLIDE 18

(Figure: runtime overhead (%) for Graphchi vs. sampling frequency and pages sampled - 100ms/32K, 300ms/32K, 500ms/32K - split into migration and hotness-tracking costs.)

Hotness tracking is more expensive than migration! Expensive data migrations can impact seamless capacity-scaling benefits.

Understanding Page Migration Overheads


SLIDE 19

Design goals:
  • Application-transparent heterogeneity management
  • Reduce page migrations and maximize application performance

Design steps:
  • Step 1 (HeteroOS-guest)
    ✓ Directly allocate and manage heterogeneity at the guest-OS
  • Step 2 (HeteroOS-coordinated)
    ✓ Guest-OS coordinates with the hypervisor for hotness tracking and migration

HeteroOS


SLIDE 20

  • Expose heterogeneity to guest-OS via the NUMA abstraction
  • Perform direct allocation to FastMem and reduce page migration

(Figure: unmodified applications on a guest OS with FastMem and SlowMem managers and on-demand allocation; the hypervisor provides a driver backend. Allocation flow: allocate FastMem; on failure, find inactive FastMem pages; otherwise allocate SlowMem.)
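The allocation fallback in the figure can be sketched as follows; the dictionary-based memory model and function names are illustrative assumptions, not HeteroOS code:

```python
# Sketch of the on-demand allocation flow: FastMem first, reclaim an
# inactive FastMem page if full, SlowMem as a last resort.

def alloc_page(fastmem, slowmem, find_inactive):
    """Return which memory the new page lands in."""
    if fastmem["free"] > 0:                 # direct FastMem allocation
        fastmem["free"] -= 1
        return "fastmem"
    victim = find_inactive(fastmem)         # look for an inactive FastMem page
    if victim is not None:
        slowmem["free"] -= 1                # evict victim to SlowMem,
        return "fastmem"                    # reuse its FastMem frame
    slowmem["free"] -= 1                    # last resort: allocate in SlowMem
    return "slowmem"

fast = {"free": 1}
slow = {"free": 8}
no_inactive = lambda mem: None              # pretend no page is inactive
print(alloc_page(fast, slow, no_inactive))  # fastmem (direct allocation)
print(alloc_page(fast, slow, no_inactive))  # slowmem (FastMem full, none inactive)
```

Direct allocation up front is what lets HeteroOS avoid most of the reactive migrations of hypervisor-only schemes.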

Guest-OS Management


SLIDE 21

(Figure: breakdown of page types - heap, I/O cache, network buffers, slab - as a percentage of pages for Graphchi, Metis, Redis, and X-Stream.)

Current OSes always prioritize heap pages to faster memory.

How Do Applications Use Memory?

  • Applications and OS allocate different types of pages
  • Placement of OS-level pages critical in addition to application’s heap pages


SLIDE 22

Heterogeneity-aware OS Placement

Traditional OS virtual memory:
  • Heap pages always prioritized even when demand for I/O pages is high
  • I/O pages moved to SlowMem even when I/O page demand is high

Heterogeneity-aware OS virtual memory:
  • All page types prioritized based on the demand

(Figure: FastMem (DRAM) holds mostly heap pages under the traditional policy, but mostly I/O pages - matching demand - under the heterogeneity-aware policy.)


SLIDE 23

Heterogeneity-aware OS Placement

  • Principle: OS knows how applications and subsystems use memory
  • OS knows demand for heap, IO cache, network buffer pages
  • Demand represents the number of pages of a page type requested in an epoch
  • Directly allocate to the "right memory" based on current demand
  • Use migration selectively when direct placement is not possible
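As a minimal sketch of the principle, assuming a simple proportional split of FastMem by per-epoch demand (the slide states the principle, not a concrete formula, so this split rule is a stand-in):

```python
# Hypothetical demand-driven apportioning of FastMem across page types.

def fastmem_shares(demand, fastmem_pages):
    """Split FastMem across page types in proportion to per-epoch demand."""
    total = sum(demand.values())
    return {ptype: (count * fastmem_pages) // total
            for ptype, count in demand.items()}

# Demand observed this epoch: I/O cache pages dominate, so they receive
# the larger share of FastMem instead of heap pages getting it by default.
demand = {"heap": 100, "io_cache": 300, "net_buf": 100}
print(fastmem_shares(demand, 1000))
```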


SLIDE 24
  • X-axis shows the FastMem-to-SlowMem capacity ratio
    ✓ 1/4 -> FastMem 8GB, SlowMem 32GB; 1/8 -> FastMem 4GB, SlowMem 32GB
  • Y-axis shows the gains (%) relative to using only SlowMem


Guest-OS Management Impact


(Figure: gains (%) relative to SlowMem-only for Graphchi, X-Stream, Redis, and Metis at 1/4 and 1/8 FastMem-to-SlowMem ratios, comparing Migration-only, HeteroOS-guest, and FastMem-only; taller bars show better performance.)

SLIDE 25

  • On-demand allocation is not sufficient for memory-hungry apps
  • 69% average gains relative to the naïve method, 20% over Migration-only for the 1:4 ratio

Guest-OS Management Impact


(Figure: same gains chart as the previous slide.)

SLIDE 26


  • Delegate hotness tracking to the hypervisor and also provide insights

(Figure: HeteroOS-coordinated architecture - the guest OS keeps the FastMem/SlowMem managers and on-demand allocation and adds a migration component; the hypervisor's driver backend runs the hot-page mechanism that performs hotness tracking, sharing results with the guest via shared memory. A failed FastMem allocation triggers a search for inactive FastMem pages and, failing that, hotness tracking.)

Hypervisor-Guest Coordinated


SLIDE 27

Where to perform hotness tracking?

  • Hotness tracking in the hypervisor and not guest-OS
  • Frequent TLB invalidation and page table updates
  • Hypervisor has direct hardware control, hence lower cost

Why perform migration in the guest-OS?

  • Page-level info avoids false positives (e.g., inactive or freed pages)

When to do page migrations?

  • When cache misses are high

✓ Hot pages in slower memory may already be in the processor cache
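The migration trigger above can be sketched as a simple check; the threshold value is a hypothetical placeholder, since the slide only states the principle:

```python
# When-to-migrate check: migrate hot pages only when the cache miss rate
# is high. The cutoff below is an illustrative assumption.

MISS_RATE_THRESHOLD = 0.10  # hypothetical cutoff

def should_migrate(cache_misses, cache_accesses):
    """If misses are low, hot SlowMem data is likely being served from the
    CPU cache, so migrating it to FastMem would buy little."""
    if cache_accesses == 0:
        return False
    return cache_misses / cache_accesses > MISS_RATE_THRESHOLD

print(should_migrate(5, 1000))    # low miss rate -> skip migration
print(should_migrate(300, 1000))  # high miss rate -> migrate hot pages
```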

Hypervisor-Guest Coordinated


SLIDE 28

(Figure: gains (%) relative to SlowMem-only for Graphchi, X-Stream, Redis, and Metis at 1/4 and 1/8 ratios, now adding HeteroOS-coordinated alongside Migration-only, HeteroOS-guest, and FastMem-only; annotated improvements of 9%, 21%, 32%, and 43%.)

Coordinated approach provides more gains when FastMem capacity is low.

  • Avg. of 80% gains over naïve placement and 31% over Migration-only for 1:4 ratio

Coordinated Management Impact


SLIDE 29

Reduction in Migrations

(Figure: pages migrated (millions) for Graphchi, X-Stream, Redis, and Metis - HeteroOS-coordinated migrates far fewer pages than Migration-only.)


SLIDE 30

Summary

  • Goal: Application-transparent data placement with reduced page migration
  • Key idea: Expose heterogeneity to guest-OS and make direct page placements
    ✓ Reduce page migrations with coordinated management
  • Outcome: 80% average gains over naïve placement and 31% over the migration-only approach


SLIDE 31

Conclusion


  • Time to think about software heterogeneity management
  • Like accelerators, exposing memory heterogeneity to OS (or S/W) critical
  • Avoiding page migrations critical for best performance
  • S/W and H/W must be co-designed for efficient heterogeneity management



SLIDE 33


Thank You!

Sudarsun Kannan sudarsun@cs.wisc.edu

HeteroOS @ ISCA ’17

SLIDE 34

HeteroOS - OS design for heterogeneous memory management in datacenter

Sudarsun Kannan (University of Wisconsin-Madison), Ada Gavrilovska (Georgia Tech), Vishal Gupta (VMware), Karsten Schwan (Georgia Tech)

SLIDE 35

Backup
