Coordinated and Efficient Huge Page Management with Ingens
Youngjin Kwon, Hangchen Yu, Simon Peter, Christopher J. Rossbach, and Emmett Witchel
High address translation cost
Modern applications: large memory footprint, low memory access
[Figure: % of CPU cycles spent by page walk (axis 0% to 70%) for 429.mcf, graph analytics, SVM, and MongoDB. The page table translates a virtual address to a physical address.]
[Figure: the same measurement under virtualization, split into guest page table walk and host page table walk. The guest page table translates a virtual address to a guest physical address; the host page table translates that to a host physical address.]
TLB coverage proportional to 64 GB DRAM
[Figure: TLB coverage (axis 0% to 5%) with 4KB and 2MB pages on Sandy Bridge, Ivy Bridge, Haswell, and Skylake, released between 2011 and 2015. Bar labels: 0.11%, 0.05%, 4.6%, 0.01%, 0.01%, 0.1%, 3.2%, 0.1%.]
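The coverage figure is simple arithmetic: number of TLB entries times page size, divided by DRAM size. The sketch below works it through for 4KB and 2MB pages. The entry count `ENTRIES` is a hypothetical value chosen for illustration, not a number taken from the slides or from any specific CPU.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_coverage(entries: int, page_size: int, dram_size: int) -> float:
    """Fraction of DRAM that the TLB can map at one time."""
    return entries * page_size / dram_size


# Hypothetical entry count, for illustration only.
ENTRIES = 1024
DRAM = 64 * GIB

coverage_4k = tlb_coverage(ENTRIES, 4 * KIB, DRAM)
coverage_2m = tlb_coverage(ENTRIES, 2 * MIB, DRAM)

print(f"4KB pages: {coverage_4k:.4%} of DRAM")
print(f"2MB pages: {coverage_2m:.4%} of DRAM")
```

For a fixed entry count, 2MB pages cover 512 times as much memory as 4KB pages, which is why the two bar series differ by orders of magnitude.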
Linux FreeBSD LWN.net, 2011
[Figure: speedup from huge pages (axis 0% to 60%, with a "Better" direction arrow) for 429.mcf (SPEC CPU), Canneal (PARSEC), SVM (Liblinear), graph analytics (PowerGraph), machine learning (Spark MLlib), web server (Cloudstone), Redis, MongoDB, and the average.]
How to allocate huge pages?

Linux                     Ingens
Synchronous allocation    Asynchronous allocation
Greedy allocation         Spatial utilization based allocation

Problems (Linux): high page fault latency, memory bloating
Page fault latency
Application pause -> page fault handler: allocate page(s) -> physical memory manager: get page(s) from free page list -> zero the page(s) -> map the page(s) to page table -> application resume
The physical memory manager may find that there is not enough contiguous memory.
[Figure: virtual-to-physical mapping around a huge page boundary. Allocated base pages (B) are scattered across physical memory, so memory is fragmented and there is not enough contiguous memory. Compaction moves the base pages to create contiguous pages, after which a huge page (H) can be allocated.]
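The fragmentation problem in the figure can be sketched with a toy free-page map: there may be plenty of free base pages in total, yet no aligned huge-page-sized run of them until compaction moves the allocated pages together. The function names and the tiny region size (4 base pages instead of 512) are assumptions for illustration. Real compaction also migrates page contents and updates page tables, which this sketch leaves out.

```python
def find_free_huge_region(free, pages_per_huge):
    """Return the index of the first aligned, fully free region, or None."""
    for start in range(0, len(free) - pages_per_huge + 1, pages_per_huge):
        if all(free[start:start + pages_per_huge]):
            return start // pages_per_huge
    return None


def compact(free):
    """Toy compaction: pack all allocated pages at the low end of memory."""
    allocated = free.count(False)
    return [False] * allocated + [True] * (len(free) - allocated)


# Toy memory: 2 regions of 4 base pages each.
# False = allocated base page (B), True = free page.
memory = [False, True, True, True, True, False, True, True]

before = find_free_huge_region(memory, 4)
after = find_free_huge_region(compact(memory), 4)

print("free pages:", memory.count(True))
print("huge region before compaction:", before)
print("huge region after compaction:", after)
```

Six of the eight pages are free before compaction, but each region contains one allocated page, so neither can back a huge page until the allocated pages are moved.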
When there is not enough contiguous memory, the physical memory manager compacts physical memory before it can get page(s) from the free page list. This happens inside the page fault path, between application pause and application resume.
Compaction may
Asynchronous promotion
The page fault handler allocates base pages only, which gives fast page fault handling.
The page fault handler and a promotion kernel thread read and update a bit vector with 1 bit per base page.
The promotion kernel thread runs in the background.
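The division of work described above can be sketched as follows: the fault path only maps a base page and sets one bit, and a separate pass (standing in for the promotion kernel thread) later scans the bit vectors and promotes regions. This is a minimal sketch under stated assumptions, not the Ingens implementation. The `Region` class, `promotion_pass`, and the `min_populated` parameter are hypothetical names.

```python
PAGES_PER_HUGE = 512  # 2MB huge page / 4KB base page


class Region:
    """One huge-page-sized virtual region with a 1-bit-per-base-page vector."""

    def __init__(self):
        self.bits = 0          # bit i set => base page i is mapped
        self.promoted = False

    def fault(self, page_index):
        """Fault path: record one base page; no huge page work is done here."""
        if not 0 <= page_index < PAGES_PER_HUGE:
            raise IndexError("page index outside the region")
        self.bits |= 1 << page_index

    def populated(self):
        """Number of base pages currently mapped in this region."""
        return bin(self.bits).count("1")


def promotion_pass(regions, min_populated):
    """Background pass: promote regions with at least min_populated pages."""
    promoted = []
    for region_id, region in regions.items():
        if not region.promoted and region.populated() >= min_populated:
            region.promoted = True
            promoted.append(region_id)
    return promoted


regions = {0: Region(), 1: Region()}
for i in range(PAGES_PER_HUGE):
    regions[0].fault(i)      # region 0 is fully touched
for i in range(10):
    regions[1].fault(i)      # region 1 is barely touched

print(promotion_pass(regions, min_populated=461))
```

Because promotion happens in the separate pass, the cost of finding or creating contiguous memory is kept out of the fault path in this sketch.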
https://www.soasta.com/blog/page-bloat-average-web-page-2-mb/
performs 461,383 memory compactions
Throughput (requests/s): Linux 922.3, Ingens 1091.9 (+18%)
[Figure: latency in milliseconds (axis 100 to 600), average and 90th percentile, for the "View event" and "Visit home page" operations, Linux vs. Ingens.]
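Memory bloating
Application occupies more memory than it uses: a huge page is allocated on a fault to a huge page region, but the region may not be fully used, which causes internal fragmentation.
[Figure: virtual-to-physical mapping across huge page boundaries. Huge page regions that contain both used and unused virtual addresses are each backed by a huge page (H).]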
populating 8KB objects
1KB object with YCSB
Physical memory consumption

           Using huge page    Using only base page
Redis      20.7GB (+69%)      12.2GB
MongoDB    12.4GB (+23%)      10.1GB

Bloating makes memory consumption unpredictable
Memory-intensive applications can't provision to avoid swap
Ingens performs promotion only when the utilization of a huge page region is beyond a threshold (e.g., 90%), which bounds internal fragmentation.
[Figure: virtual-to-physical mapping for three huge page regions at 100%, 75%, and 25% utilization, backed by a huge page (H) or by base pages (B).]
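The three regions in the figure can be checked against a 90% threshold with a few lines of arithmetic. The sketch also derives the worst-case number of never-touched bytes inside one promoted huge page, which is the sense in which a threshold bounds bloat. The function names are hypothetical, and treating "beyond" as "greater than or equal to" is an assumption.

```python
import math

HUGE_PAGE = 2 * 1024 * 1024
BASE_PAGE = 4 * 1024
PAGES_PER_HUGE = HUGE_PAGE // BASE_PAGE  # 512


def utilization(used_base_pages):
    """Fraction of a huge page region that is backed by touched base pages."""
    return used_base_pages / PAGES_PER_HUGE


def should_promote(used_base_pages, threshold=0.90):
    """Assumption: promote when utilization reaches the threshold."""
    return utilization(used_base_pages) >= threshold


def max_unused_bytes(threshold=0.90):
    """Upper bound on untouched bytes inside one promoted huge page."""
    min_used = math.ceil(threshold * PAGES_PER_HUGE)
    return (PAGES_PER_HUGE - min_used) * BASE_PAGE


# The three regions from the figure: 100%, 75%, and 25% utilization.
for used in (512, 384, 128):
    print(f"{utilization(used):.0%} utilized -> promote: {should_promote(used)}")

print("worst-case unused bytes per huge page:", max_unused_bytes())
```

With a 90% threshold only the fully used region qualifies, and any promoted region wastes at most 51 base pages.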
                     Physical memory consumption    GET throughput
Linux (base only)    12.2 GB                        19.0K
Linux (huge)         20.7 GB                        21.7K
Ingens               12.3 GB                        20.9K (+10%)

Huge: 2MB page
Base: 4KB page
429.mcf    Graph    Spark    Canneal    SVM     Redis    MongoDB
0.9%       0.9%     0.6%     1.9%       1.3%    0.2%     0.6%

Kernel build    Grep    Parsec 3.0 Benchmark
0.2%            0.4%    0.8%

Ingens overhead is negligible
Linux                     Ingens
Synchronous allocation    Asynchronous allocation
Greedy allocation         Spatial utilization based allocation

Advantages (Ingens): no extra page fault latency, bound memory bloating
           SVM     Canneal    Redis
FreeBSD    1.28    1.13       1.02
Linux      1.30    1.21       1.15
Ingens     1.29    1.19       1.15
fragmented