HeteroOS - Heterogeneous Memory Management in Datacenter
Sudarsun Kannan (University of Wisconsin-Madison), Ada Gavrilovska (Georgia Tech), Vishal Gupta (VMWare), Karsten Schwan (Georgia Tech)
SLIDE 1
SLIDE 2
[Diagram: the traditional stack - the application calls malloc() into DRAM through the cache, memory controller, and OS virtual memory, and write()/read() onto the hard drive through the file system and SATA]
- OSes had to make a simple, binary data-placement decision
- Keep volatile hot data in memory
- Move cold or persistent data to storage
Systems and OS for Decades
Introduction Motivation Analysis Design Conclusion
SLIDE 3
Data Explosion Across Environments
SLIDE 4
[Diagram: the changing hierarchy - the application and OS virtual memory now sit over DRAM, 3D-DRAM, and NVM behind the cache and controller, alongside more CPUs and accelerators: memory heterogeneity]
Changing Memory Hierarchy
SLIDE 5
              DRAM     3D-DRAM   NVM
Latency       1x       0.75x     4x
Bandwidth     1x       2x        0.25x
Capacity      Limited  Very low  High
$/GB          1x       ~4x       0.3x
Persistence?  No       No        Yes
Programmable  No       No        No
Significant difference in latency, bandwidth, and capacity across devices
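As a rough illustration, the relative numbers in the table can be encoded as a small lookup. The values are copied from the table above; the device table and helper functions are illustrative assumptions, not part of HeteroOS:

```python
# Relative device characteristics (DRAM = 1x baseline), per the table above.
MEMORY_DEVICES = {
    # name:    (latency, bandwidth, cost_per_gb, persistent)
    "DRAM":    (1.0,  1.0,  1.0, False),
    "3D-DRAM": (0.75, 2.0,  4.0, False),
    "NVM":     (4.0,  0.25, 0.3, True),
}

def fastest_device(devices):
    """Return the device name with the lowest relative latency."""
    return min(devices, key=lambda name: devices[name][0])

def cheapest_device(devices):
    """Return the device name with the lowest relative $/GB."""
    return min(devices, key=lambda name: devices[name][2])
```

No single device wins on every axis (3D-DRAM is fastest, NVM cheapest and largest), which is exactly why placement becomes a policy decision.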
Changing Memory Hierarchy
SLIDE 6
Can we simply add more memory like NVM and expect applications to automatically scale?
To truly embrace memory heterogeneity:
- 1. OSes must seamlessly scale heap capacity across heterogeneous memory
- Our prior work – pVM (persistent virtual memory) EuroSys ‘16
OS for Heterogeneous Memory Systems
SLIDE 7
Virtual memory must provide optimal memory placement
OS for Heterogeneous Memory Systems
To truly embrace memory heterogeneity:
- 1. OSes must seamlessly scale heap capacity across heterogeneous memory
- Our prior work – pVM (persistent virtual memory) EuroSys ’16
- 2. Efficiently place data across memories with different characteristics
- HeteroOS
SLIDE 8
- Complex multi-layered software stack: applications over a guest OS over a hypervisor
- Heterogeneity increases complexity
- Hypervisors do not expose even NUMA info to the guest OS
- Where to manage heterogeneity - the APP, the OS, or the hypervisor (VMM)?
Complex Datacenter System Stack
SLIDE 9
State-of-the-art Systems
- Application-level data placement across memory [X-mem, EuroSys ‘16]
  - Works well for heap-intensive applications with exclusive memory access
  - Lacks a holistic view of the system; ineffective for non-exclusive access
  - Cannot place OS-level pages
- OS or Hypervisor-level hotness tracking and page migration [HeteroVisor, VEE ‘15]
  - Application-agnostic data placement
  - Reactive technique – significant data tracking and movement overhead
SLIDE 10
- Capture demand for different memory page types at the guest-OS
  - Page types include heap, IO cache, network buffers
- Expose heterogeneity to the guest-OS
- Directly allocate/place pages on faster memory based on demand for each page type
- Migrate only if direct OS allocation fails
HeteroOS Key Idea
- Designed for virtualized datacenter systems
SLIDE 11
- Our study
  - Analysis of memory heterogeneity impact on applications
  - Page migration cost analysis
- HeteroOS guest-OS management
  - OS design for direct memory placement and management
- Coordinated management
  - Coordinated management between the guest-OS and the hypervisor
- Conclusion
Talk Outline
SLIDE 12
[Table: evaluated applications and their resource intensity - X-Stream: compute + memory + IO; Metis: CPU + memory; other workloads: memory + network, network]
Applications
SLIDE 13
[Diagram: two-socket machine - processors over Node 0 (FastMem) and Node 1 (SlowMem, thermally throttled)]
- Prior techniques used simulators or delayed every memory instruction
  - Infeasible for long-running applications
- Two memory sockets represent fast (FastMem) and slow (SlowMem) memory
  - SlowMem node thermally throttled, reducing bandwidth by 9x
Emulating Heterogeneous Memory
SLIDE 14
[Chart: slowdown relative to FastMem-only for Graphchi, X-Stream, Redis, Nginx under configurations L2:B2, L3:B2, L5:B5, L5:B9; latency-sensitivity group on the left, bandwidth-sensitivity group on the right; taller bars mean more slowdown]
- X-axis: factor by which latency is increased and bandwidth is reduced
  - E.g., L2:B2 indicates a 2x increase in latency and a 2x reduction in bandwidth
- Y-axis: application slowdown relative to using only FastMem
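The configuration labels follow a simple pattern; a minimal helper to decode them (this parser is an illustrative assumption, not something from the paper):

```python
def parse_config(label):
    """Decode a configuration label like 'L2:B2' into
    (latency multiplier, bandwidth divisor)."""
    lat, bw = label.split(":")
    if not (lat.startswith("L") and bw.startswith("B")):
        raise ValueError(f"bad configuration label: {label}")
    return int(lat[1:]), int(bw[1:])
```

For example, `parse_config("L5:B9")` yields a 5x latency increase and a 9x bandwidth reduction, the most extreme point on the chart.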
Memory Latency and Bandwidth Impact?
SLIDE 15
Heterogeneity has a significant impact on applications
Current OS-level NUMA mechanisms are not sufficient
Memory Latency and Bandwidth Impact?
SLIDE 16
[Diagram: hot and cold pages moving between FastMem and SlowMem]
- Initially, all pages are placed in slower memory [HeteroVisor:Gupta:VEE15]
- Hypervisor identifies hot and cold pages passively
- Moves hot pages to faster memory and cold pages to slower memory
- Management dependent on page migration across memories
- Memory heterogeneity is hidden from guest [Intel Memory Drive Technology]
Page migrations cause significant performance overhead
Migration-based Techniques
SLIDE 17
Hot page scan in software requires
- Traversing page table and setting page reference bit
- Invalidating TLB to force CPU to access page table
- Clearing bits and traversing page table to count references
- Repeating until hotness threshold reached
Page movement requires
- Allocating memory pages at destination
- Copying pages and invalidating old TLB entries
- Releasing old memory pages
Page migration = hotness tracking + page movement
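The software scan loop above can be sketched as a simulation (pure Python, not kernel code; the per-epoch reference-bit samples stand in for what a real scanner reads and clears via page-table walks and TLB invalidations):

```python
def find_hot_pages(ref_bits_per_epoch, threshold):
    """Simulated software hotness tracking: each epoch the scanner
    reads the per-page reference bits (in a real kernel this means
    walking the page table, then clearing the bits and invalidating
    the TLB so the CPU sets them again), accumulating a per-page
    reference count. Pages whose count reaches the threshold are hot.

    ref_bits_per_epoch: list of {page_id: bool}, one dict per scan.
    """
    counts = {}
    for epoch_bits in ref_bits_per_epoch:
        for page, referenced in epoch_bits.items():
            if referenced:
                counts[page] = counts.get(page, 0) + 1
    return {page for page, count in counts.items() if count >= threshold}
```

Even in this toy form, the cost structure is visible: every epoch touches every tracked page, which is why the chart on the next slide shows tracking dominating migration cost.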
Understanding Page Migration Overheads
SLIDE 18
[Chart: runtime overhead (%) of hotness tracking vs. migration for Graphchi, at sampling configurations 100ms/32K, 300ms/32K, and 500ms/32K pages sampled]
Hotness tracking is more expensive than migration!
Expensive data migrations can undermine seamless capacity-scaling benefits
Understanding Page Migration Overheads
SLIDE 19
Design goals
- Application-transparent heterogeneity management
- Reduce page migrations and maximize application performance
Design steps
- Step 1 (HeteroOS-guest)
  - Directly allocate and manage heterogeneity at the guest-OS
- Step 2 (HeteroOS-coordinated)
  - Guest-OS coordinates with the hypervisor for hotness tracking and migration
HeteroOS
SLIDE 20
- Expose heterogeneity to the guest-OS via the NUMA abstraction
- Perform direct allocation to FastMem and reduce page migration
[Diagram: unmodified applications on a guest OS with FastMem and SlowMem managers; on-demand allocation driver/backend in the hypervisor - allocate FastMem; if that fails, find inactive FastMem pages, else allocate SlowMem]
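The allocation path in the diagram can be sketched as a simulation of the policy (the `MemPool` class and the reclaim hook are invented for illustration; this is not the HeteroOS implementation):

```python
class MemPool:
    """A fixed-capacity memory node, counted in pages."""
    def __init__(self, name, capacity):
        self.name, self.free = name, capacity

    def alloc(self, n=1):
        """Take n pages from the pool; return False if it cannot."""
        if self.free >= n:
            self.free -= n
            return True
        return False

def guest_alloc(fast, slow, reclaim_inactive):
    """On-demand allocation: try FastMem first; on failure, try to
    reclaim inactive FastMem pages (delegated to the hypervisor's
    driver in the diagram); otherwise fall back to SlowMem."""
    if fast.alloc():
        return fast.name
    if reclaim_inactive(fast) and fast.alloc():
        return fast.name
    if slow.alloc():
        return slow.name
    raise MemoryError("guest out of memory")
```

The point of this ordering is that migration becomes a last resort: pages land on the right memory at allocation time instead of being moved after the fact.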
Guest-OS Management
SLIDE 21
[Chart: breakdown of pages (%) by type - heap, I/O cache, network buffers, slab - for Graphchi, Metis, Redis, X-Stream]
Current OSes always prioritize heap pages to faster memory
How Do Applications Use Memory?
- Applications and OS allocate different types of pages
- Placement of OS-level pages critical in addition to application’s heap pages
SLIDE 22
Heterogeneity-aware OS Placement
Traditional OS virtual memory
- Heap pages are always prioritized for FastMem (DRAM), even when demand for I/O pages is high
- I/O pages are moved to SlowMem even when I/O page demand is high
Heterogeneity-aware OS virtual memory
- All page types are prioritized for FastMem based on demand
SLIDE 23
Heterogeneity-aware OS Placement
- Principle: the OS knows how applications and subsystems use memory
  - The OS knows the demand for heap, IO cache, and network buffer pages
  - Demand is the number of pages of each page type requested in an epoch
- Directly allocate to the "right memory" based on current demand
- Use migration selectively, only when direct placement is not possible
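The demand-driven principle can be sketched as a toy placement planner, under the assumption (from the bullets above) that demand is a per-epoch page count per type; the function and field names are invented:

```python
def plan_placement(demand, fastmem_pages):
    """Split a FastMem page budget across page types in proportion
    to per-epoch demand (heap, I/O cache, network buffers, ...),
    rather than always prioritizing heap pages; whatever does not
    fit in FastMem is placed on SlowMem.

    demand: {page_type: pages requested this epoch}
    """
    total = sum(demand.values())
    plan = {}
    for ptype, pages in demand.items():
        share = (pages * fastmem_pages) // total if total else 0
        fast = min(pages, share)  # never place more than was requested
        plan[ptype] = {"FastMem": fast, "SlowMem": pages - fast}
    return plan
```

With demand of 100 heap pages and 300 I/O-cache pages and a 100-page FastMem budget, I/O-cache pages get three quarters of FastMem; a heap-first policy would have given heap all of it.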
SLIDE 24
- X-axis shows the FastMem to SlowMem capacity ratio
  - 1/4 -> FastMem 8GB, SlowMem 32GB; 1/8 -> FastMem 4GB, SlowMem 32GB
- Y-axis shows the gains (%) relative to using only SlowMem
Guest-OS Management Impact
[Chart: gains (%) relative to SlowMem-only for Graphchi, X-Stream, Redis, Metis at 1/4 and 1/8 capacity ratios - Migration-only vs. HeteroOS-guest vs. FastMem-only; taller bars show better performance]
SLIDE 25
On-demand allocation is not sufficient for memory-hungry apps
69% average gains relative to the naïve method, 20% over Migration-only for the 1:4 ratio
Guest-OS Management Impact
SLIDE 26
- Delegate hotness tracking to the hypervisor, with the guest-OS also providing insights
[Diagram: coordinated design - the guest OS keeps its FastMem/SlowMem managers plus a migration component; the hypervisor's hot-page mechanism performs hotness tracking and shares results with the guest via shared memory; on failed FastMem allocation, find inactive FastMem pages through the on-demand allocation driver/backend]
Hypervisor-Guest Coordinated
SLIDE 27
Where to perform hotness tracking?
- In the hypervisor, not the guest-OS
  - Guest-level tracking requires frequent TLB invalidations and page table updates
  - The hypervisor has direct hardware control, hence lower cost
Why perform migration in the guest-OS?
- Page-level info avoids false positives (e.g., inactive or freed pages)
When to do page migrations?
- When cache misses are high
  - Hot pages in slower memory may already be in the processor cache
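These three answers combine into a small decision sketch (a simulation; the threshold value and all names are invented for illustration):

```python
def select_migrations(hyp_hot_pages, guest_page_state, miss_rate,
                      miss_threshold=0.05):
    """Coordinated migration decision: the hypervisor supplies
    candidate hot pages (cheap tracking with direct hardware
    control); the guest OS filters false positives using its
    page-level state (freed or inactive pages) and migrates only
    when the cache miss rate is high, since hot SlowMem pages may
    already be served from the processor cache."""
    if miss_rate < miss_threshold:
        return set()  # data is cache-resident; migration is not worth it
    return {page for page in hyp_hot_pages
            if guest_page_state.get(page) == "active"}
```

The division of labor is the point: tracking where it is cheap (hypervisor), filtering and migrating where the semantic information lives (guest OS).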
Hypervisor-Guest Coordinated
SLIDE 28
[Chart: gains (%) relative to SlowMem-only for Graphchi, X-Stream, Redis, Metis at 1/4 and 1/8 capacity ratios - Migration-only, HeteroOS-guest, HeteroOS-coordinated, FastMem-only; annotated values: 9%, 21%, 32%, 43%]
The coordinated approach provides more gains when FastMem capacity is low
- Avg. of 80% gains over naïve placement and 31% over Migration-only for the 1:4 ratio
Coordinated Management Impact
SLIDE 29
Reduction in Migrations
[Chart: pages migrated (millions) for Graphchi, X-Stream, Redis, Metis - Migration-only vs. HeteroOS-coordinated]
SLIDE 30
Summary
- Goal: application-transparent data placement with reduced page migration
- Key idea: expose heterogeneity to the guest-OS and make direct page placements
  - Reduce page migrations with coordinated management
- Outcome: 80% average gains over naïve placement and 31% over the migration-only approach
SLIDE 31
Conclusion
- Time to think about software heterogeneity management
- As with accelerators, exposing memory heterogeneity to the OS (or software) is critical
- Avoiding page migrations is critical for best performance
- S/W and H/W must be co-designed for efficient heterogeneity management
SLIDE 33
Thank You!
Sudarsun Kannan sudarsun@cs.wisc.edu
HeteroOS @ ISCA ’17
SLIDE 34
HeteroOS - OS design for heterogeneous memory management in datacenter
Sudarsun Kannan (University of Wisconsin-Madison), Ada Gavrilovska (Georgia Tech), Vishal Gupta (VMWare), Karsten Schwan (Georgia Tech)
SLIDE 35
Backup