TLB misses - The Missing Issue of Adaptive Radix Tree?
Petrie Wong Ziqiang Feng Wenjian Xu Eric Lo Ben Kao
Department of Computer Science, The University of Hong Kong Department of Computing, The Hong Kong Polytechnic University
TLB misses - the Missing Issue of Adaptive Radix Tree? - presented by Petrie Wong
[Figure: ART node types. Node4 (key array and pointer array), Node48 (index array and child pointer array), Node256 (pointer array), with leaf pointers to Data.]
small node type (Node4) for nodes with few child pointers
large node type (Node256) for nodes with many child pointers
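The two node types above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: the class and function names (`Node4`, `Node256`, `lookup`) and the use of Python lists are assumptions made for readability, and Node16/Node48, path compression, and node growth are left out.

```python
from bisect import bisect_left


class Node4:
    """Small node: up to 4 sorted key bytes with a parallel child array."""
    CAPACITY = 4

    def __init__(self):
        self.keys = []      # sorted key bytes (0..255)
        self.children = []  # children[i] belongs to keys[i]

    def insert(self, byte, child):
        i = bisect_left(self.keys, byte)
        if i < len(self.keys) and self.keys[i] == byte:
            self.children[i] = child  # overwrite an existing entry
            return
        if len(self.keys) >= self.CAPACITY:
            raise OverflowError("Node4 is full; a real ART would grow the node")
        self.keys.insert(i, byte)
        self.children.insert(i, child)

    def find(self, byte):
        # Linear scan over at most 4 keys.
        for k, c in zip(self.keys, self.children):
            if k == byte:
                return c
        return None


class Node256:
    """Large node: one child slot per possible key byte, indexed directly."""

    def __init__(self):
        self.children = [None] * 256

    def insert(self, byte, child):
        self.children[byte] = child

    def find(self, byte):
        return self.children[byte]


def lookup(root, key: bytes):
    """Follow one key byte per level; return the leaf value or None."""
    node = root
    for b in key:
        if not isinstance(node, (Node4, Node256)):
            return None  # hit a leaf before the key was consumed
        node = node.find(b)
        if node is None:
            return None
    # Only report a hit if the walk ended on a leaf value.
    return None if isinstance(node, (Node4, Node256)) else node


root = Node256()
inner = Node4()
root.insert(0x01, inner)
inner.insert(0xEE, "Data")
result = lookup(root, bytes([0x01, 0xEE]))
```

Each level of the walk dereferences a pointer to a node that may live on a different memory page, which is why the lookup path matters for TLB behaviour.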
for program for CPU
to
Very skewed
very skewed (Zipf=2 to 3)
table entries in TLB
incurred (0% to 2% of stall time)
[Figure: stall time due to TLB misses as a share of index lookup latency (%) versus Zipf parameter (0 to 3), for Dense and Sparse data. Highlighted range: 0% to 2%.]
Uniform
not skewed (Zipf=0 to 1)
accessed
(5% to 7%)
[Figure: stall time due to TLB misses as a share of index lookup latency (%) versus Zipf parameter (0 to 3), for Dense and Sparse data. Highlighted range: 5% to 7%.]
up to 23%
workload possesses realistic skewness (Zipf = 1 to 2)
spatial locality
23%)
[Figure: stall time due to TLB misses as a share of index lookup latency (%) versus Zipf parameter (0 to 3), for Dense and Sparse data.]
E5)
[Figure: L2 cache contents when using regular pages: Page Table Entries, ART Data, Others.]
[Figure: L2 cache contents when using huge pages: Page Table Entries, ART Data, Others.]
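To make the regular-page versus huge-page comparison concrete, the arithmetic below counts how many pages (and so how many leaf page table entries) are needed to map an index. It assumes x86-64 page sizes of 4 KiB for regular pages and 2 MiB for huge pages; the 1 GiB index size and the helper name `pages_needed` are hypothetical and not taken from the slides.

```python
REGULAR_PAGE = 4 * 1024        # 4 KiB regular page (x86-64)
HUGE_PAGE = 2 * 1024 * 1024    # 2 MiB huge page (x86-64)


def pages_needed(index_bytes, page_size):
    """Pages required to map index_bytes, rounded up (ceiling division)."""
    return -(-index_bytes // page_size)


index_bytes = 1 * 1024 ** 3    # hypothetical 1 GiB index, for illustration only

regular_entries = pages_needed(index_bytes, REGULAR_PAGE)  # 262144
huge_entries = pages_needed(index_bytes, HUGE_PAGE)        # 512
ratio = regular_entries // huge_entries                    # 512
```

With 512 times fewer translations to hold, the page table entries take up far less space, which is consistent with the smaller Page Table Entries share shown for huge pages.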
entries
quite skewed (Zipf < 2)
(Zipf > 2)
miss
[Figure: throughput improvement (%) versus Zipf parameter (0 to 3), for Dense and Sparse data.]
reorganization
[Figure: page labels P1, P2, Phot, Pcold.]
skew
byte) are used
[Figure: throughput (lookup/s, 5M to 50M) versus Zipf parameter (0 to 3), ART with reorganization compared with ART.]
reorganization applied
few pages
need to be cached (for hot nodes)
throughput increase
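The idea that only a few pages need to be cached once hot nodes are grouped together can be illustrated with a small page-counting sketch. It is not the presenters' reorganization algorithm: the fixed 64-byte node size, the 4 KiB page size, the layout in which every 64th node is hot, and the names `pages_touched` and `reorganize` are all assumptions chosen to keep the arithmetic easy to check. How hot nodes are identified is not covered here.

```python
PAGE_SIZE = 4096   # assumed regular page size in bytes
NODE_SIZE = 64     # hypothetical fixed node size in bytes


def pages_touched(addresses, page_size=PAGE_SIZE):
    """Number of distinct pages covered by the given node addresses."""
    return len({addr // page_size for addr in addresses})


def reorganize(node_addrs, hot_ids, node_size=NODE_SIZE):
    """Assign new addresses: hot nodes packed first, cold nodes after them."""
    hot = [n for n in node_addrs if n in hot_ids]
    cold = [n for n in node_addrs if n not in hot_ids]
    return {node_id: slot * node_size
            for slot, node_id in enumerate(hot + cold)}


# 6400 nodes in allocation order; every 64th node is hot (100 hot nodes),
# so each hot node initially sits on a different page.
old_addrs = {i: i * NODE_SIZE for i in range(6400)}
hot_ids = set(range(0, 6400, 64))

new_addrs = reorganize(old_addrs, hot_ids)

before = pages_touched(old_addrs[i] for i in hot_ids)  # 100 pages
after = pages_touched(new_addrs[i] for i in hot_ids)   # 2 pages
```

In this toy layout the hot set shrinks from 100 pages to 2, so far fewer translations are needed to cover the frequently accessed nodes.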
[Figure: throughput (lookup/s, 5M to 50M) versus Zipf parameter (0 to 3), ART with reorganization compared with ART.]
stay in TLB
reorganization immaterial
[Figure: throughput (lookup/s, 5M to 35M) versus Zipf parameter (0 to 3), ART with reorganization compared with ART.]
possess realistic skew
lookup throughput improvement over the use of regular pages
does help when the data to be indexed is sparse