Implementation of Direct Segments on a RISC-V Processor
Nikhita Kunati, Michael M. Swift University of Wisconsin-Madison
Key Points
Past analysis shows that applications can spend 5%-50% of execution cycles on TLB misses.
The rich features of paged virtual memory are not needed by most applications.
Use paged VM as usual where needed and segmentation where possible.
Perform the Direct Segment lookup on a TLB miss.
Use a contiguous memory allocator to reserve and use a contiguous region of physical memory.
Allocate primary regions (contiguous ranges of virtual addresses).
[Figure: percentage of execution cycles wasted for graph500, memcached, MySQL, NPB:BT, NPB:CG, and GUPS with 4KB, 2MB, and 1GB pages and with a Direct Segment. The axis runs from 0 to 35; value labels surviving in the extracted text: 83. (truncated), 51.1, 51.3.]
[Figure: virtual address (VA) space. Paging not needed: dynamically allocated heap region. Paging valuable: code, constants, stack, guard pages, shared memory, mapped files.]
BASE = Start VA of Direct Segment
LIMIT = End VA of Direct Segment
OFFSET = Start PA of Direct Segment - BASE, so that PA = VA + OFFSET inside the segment
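The three registers can be modeled in a few lines of C. This is a minimal sketch, not the hardware: the struct and function names are hypothetical, LIMIT is treated as an exclusive end address, and OFFSET is assumed to be the value that is added to a virtual address (start PA minus BASE, with unsigned wraparound when the PA is below the VA).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the three direct-segment registers. */
struct direct_segment {
    uint64_t base;   /* start VA of the direct segment */
    uint64_t limit;  /* end VA of the direct segment (assumed exclusive) */
    uint64_t offset; /* assumed: start PA - base, modulo 2^64 */
};

/* Build register values from a VA range and the start PA backing it. */
static struct direct_segment ds_make(uint64_t start_va, uint64_t end_va,
                                     uint64_t start_pa)
{
    struct direct_segment ds = { start_va, end_va, start_pa - start_va };
    return ds;
}

/* True when va falls inside [base, limit). */
static bool ds_contains(const struct direct_segment *ds, uint64_t va)
{
    return va >= ds->base && va < ds->limit;
}

/* Translate va through the segment; returns false on a segment miss. */
static bool ds_translate(const struct direct_segment *ds, uint64_t va,
                         uint64_t *pa)
{
    if (!ds_contains(ds, va))
        return false;
    *pa = va + ds->offset; /* unsigned wraparound covers PA < VA */
    return true;
}
```

Addresses outside the range report a miss, so the caller can fall back to ordinary paging.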
[Figure: virtual address (VA) space split into paging-valuable and paging-not-needed regions. In the dynamically allocated heap region, TLB misses are avoided.]
[Figure: a VPN is translated to a PPN through the TLB lookup and the DS lookup; the page table walker is invoked when both miss.]
The original Direct Segment paper proposes this design.
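Original Design
[Figure: two VPN-to-PPN lookup paths built from the TLB lookup, the DS lookup, and the page table walker. In the first, the walker runs when both lookups miss; in the second, the DS lookup follows a TLB miss and the walker follows a DS miss.]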
Our Implementation
[Figure: the original design next to ours. In ours, the DS lookup is performed on a TLB miss and the page table walker runs on a DS miss.]
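[Figure: baseline translation. The VPN goes through the TLB lookup (hit/miss); on a miss, the page table walk supplies the PPN. The page offset passes through unchanged.]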
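[Figure: translation with a direct segment. Alongside the TLB lookup (hit/miss) and the page table walk, the address is compared against Base (≥ ?) and Limit (< ?), and Offset is added (+) to form the physical address.]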
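The lookup order in this datapath can be sketched as a small software model: TLB first, the direct-segment range check on a TLB miss, and a page table walk only when both miss. Everything here is an assumption made for illustration: the direct-mapped toy TLB, the `toy_walk` stand-in for the page table walker, the 4 KiB page size, and the choice not to refill the TLB on a direct-segment hit.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  12  /* assumed 4 KiB pages */
#define TLB_ENTRIES 4   /* toy direct-mapped TLB */

enum xlate_source { FROM_TLB, FROM_DS, FROM_PTW };

struct tlb_entry { bool valid; uint64_t vpn; uint64_t ppn; };

struct mmu_model {
    struct tlb_entry tlb[TLB_ENTRIES];
    uint64_t ds_base, ds_limit; /* direct segment covers [base, limit) */
    uint64_t ds_offset;         /* assumed: start PA - base, modulo 2^64 */
    uint64_t (*walk)(uint64_t vpn); /* hypothetical page table walker */
};

/* Toy walker for the example: maps VPN n to PPN n + 0x100. */
static uint64_t toy_walk(uint64_t vpn) { return vpn + 0x100; }

static enum xlate_source mmu_translate(struct mmu_model *m, uint64_t va,
                                       uint64_t *pa)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    uint64_t off = va & ((1ULL << PAGE_SHIFT) - 1);
    struct tlb_entry *e = &m->tlb[vpn % TLB_ENTRIES];

    /* 1. TLB lookup. */
    if (e->valid && e->vpn == vpn) {
        *pa = (e->ppn << PAGE_SHIFT) | off;
        return FROM_TLB;
    }
    /* 2. On a TLB miss, check the direct segment: base <= va < limit. */
    if (va >= m->ds_base && va < m->ds_limit) {
        *pa = va + m->ds_offset;
        return FROM_DS;
    }
    /* 3. On a DS miss, walk the page table and refill the TLB. */
    e->valid = true;
    e->vpn = vpn;
    e->ppn = m->walk(vpn);
    *pa = (e->ppn << PAGE_SHIFT) | off;
    return FROM_PTW;
}
```

In this model an address inside the segment never reaches the walker, which is the effect the design is after; addresses outside it behave exactly as in the baseline.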
Segment Limit (SDSL), and Supervisor Direct Segment Offset (SDSO) to store the base, limit and offset.
ease of integrating the Direct Segment lookup into the existing TLB unit in Rocket.
base and limit.
[Figure: TLB state machine with states s_ready, s_request, s_wait, and s_wait_inv. Transition labels: TLB request; Req && tlb_miss && ds_miss; PTW req ready; PTW req ready && sfence; sfence; PTW resp (refill TLB).]
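One possible reading of this state machine is sketched below as a next-state function. The state and signal names come from the slide, but which label belongs to which edge, and the priority when several signals are asserted in the same cycle, are assumptions; the only part taken directly from the slide is that leaving s_ready requires a request that misses in both the TLB and the direct segment.

```c
#include <stdbool.h>

enum tlb_state { S_READY, S_REQUEST, S_WAIT, S_WAIT_INV };

struct tlb_inputs {
    bool req;           /* TLB request this cycle */
    bool tlb_miss;      /* request missed in the TLB */
    bool ds_miss;       /* request missed in the direct segment */
    bool sfence;        /* fence invalidating translations */
    bool ptw_req_ready; /* page table walker accepted the request */
    bool ptw_resp;      /* walker response valid (refill TLB) */
};

/* Assumed edge assignment; see the lead-in for what is guessed. */
static enum tlb_state tlb_next_state(enum tlb_state s,
                                     const struct tlb_inputs *in)
{
    switch (s) {
    case S_READY:
        /* Start a walk only when both the TLB and the segment miss. */
        return (in->req && in->tlb_miss && in->ds_miss) ? S_REQUEST
                                                        : S_READY;
    case S_REQUEST:
        if (in->ptw_req_ready)
            return in->sfence ? S_WAIT_INV : S_WAIT;
        return in->sfence ? S_READY : S_REQUEST;
    case S_WAIT:
        if (in->ptw_resp)
            return S_READY;
        return in->sfence ? S_WAIT_INV : S_WAIT;
    case S_WAIT_INV:
        /* Wait for the response of a walk made stale by the fence. */
        return in->ptw_resp ? S_READY : S_WAIT_INV;
    }
    return S_READY;
}
```

With this reading, a direct-segment hit keeps the unit in s_ready, so the walker is never engaged for addresses inside the segment.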
dma_contiguous_reserve(phys_addr_t limit); Default is 16MB
encountering a primary process
struct page *dma_alloc_from_contiguous(struct device *dev, int count, unsigned int align);
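In the Linux CMA interface, `count` is a number of pages and `align` is an alignment expressed as a page order. The user-space sketch below only shows the arithmetic for turning a region size in bytes into those two values; the helper names are hypothetical, a 4 KiB page size is assumed, and whether the kernel changes described here request any particular alignment is not stated in the slides.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL /* assumed 4 KiB base pages */

/* Number of base pages needed to cover `bytes`, rounded up. */
static uint64_t pages_for_bytes(uint64_t bytes)
{
    return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}

/* Smallest order such that (1 << order) pages >= pages. */
static unsigned int order_for_pages(uint64_t pages)
{
    unsigned int order = 0;
    while (order < 63 && (1ULL << order) < pages)
        order++;
    return order;
}
```

For the 16MB default mentioned above, this gives 4096 pages, which is exactly an order-12 block.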
switch
function.
The Direct Segment logic and the RISC-V Linux kernel changes were tested on Spike and QEMU first because of the challenges faced with Verilator:
Very slow: booting the Linux kernel takes about 1 day.
Lack of useful debug prints in Verilator.
Spike, RISC-V QEMU, Verilator.