Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication
Jinyu Gu, Xinyue Wu, Wentai Li, Nian Liu, Zeyu Mi, Yubin Xia, Haibo Chen
Monolithic Kernel and Microkernel
Microkernel’s philosophy: Moving most OS components into isolated user processes
Benefits and Usages of Microkernel
- Achieves good extensibility, security, and fault isolation
- Succeeds in safety-critical scenarios (Airplane, Car)
- For more general-purpose applications (Google Zircon)
Expensive Communication Cost
- Tradeoff: Performance and Isolation
– Inter-process communication (IPC) overhead
[Figure: App, File System, and Disk Driver communicating via IPC through the Microkernel]
IPC Overhead is Considerable
[Figure: SQLite, xv6FS, and Ramdisk running on the Microkernel]
[Figure: percentage breakdown (0% to 100%) of IPC Cost versus Real Work in Servers, for Zircon, seL4 w/ KPTI, and seL4 w/o KPTI]
Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU
- Direct cost: privilege switch, process switch, …
- Indirect cost: CPU internal structures pollution
Goal: Both Ends
- Harmonize the tension between Performance and Isolation in microkernels
– Reducing the IPC overhead
– Maintaining the isolation guarantee
New Hardware Brings Opportunities
- PKU: Protection Key for Userspace (a.k.a. MPK, Memory Protection Keys)
– Assign each page one PKEY (i.e., memory domain ID)
– A new register PKRU stores read/write permission
[Figure: PKRU register, labeled [0:15]]
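The PKEY/PKRU check described on this slide can be sketched as a small model. This is an illustration rather than code from the talk: it assumes the documented PKRU layout, where key i owns two bits (bit 2i is access-disable, bit 2i+1 is write-disable), and the function name `pkru_allows` is hypothetical.

```python
# Minimal model of the PKRU permission check (illustrative, not from the talk).
# Assumed layout: for key i, bit 2*i is Access-Disable, bit 2*i+1 is Write-Disable.
NUM_PKEYS = 16


def pkru_allows(pkru: int, pkey: int, write: bool) -> bool:
    """Return True if a data access to a page tagged `pkey` is permitted."""
    if not 0 <= pkey < NUM_PKEYS:
        raise ValueError("pkey must be in 0..15")
    access_disabled = (pkru >> (2 * pkey)) & 1
    write_disabled = (pkru >> (2 * pkey + 1)) & 1
    if access_disabled:
        return False              # neither reads nor writes are allowed
    if write and write_disabled:
        return False              # reads are fine, writes are not
    return True


# Example value: key 1 is read-only, key 2 is inaccessible, other keys are open.
example_pkru = (1 << (2 * 1 + 1)) | (1 << (2 * 2))
```

With `example_pkru`, a read of a key-1 page passes the check, while a write to it, or any access to a key-2 page, does not.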
Efficient Intra-Process Isolation
- ERIM [Security’19] & Hodor [ATC’19]
– Based on Intel PKU
– Build isolated domains in the same process efficiently
– Domain switch only takes 28 cycles (modify PKRU)
[Figure: one process divided into App Part, Library-1, and Library-2]
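A domain switch in this style of isolation amounts to loading a different PKRU value. The sketch below builds such values; it is not ERIM's or Hodor's code. The helper `pkru_for` and the key assignments (key 1 for the app part, key 2 for Library-1) are assumptions made for illustration.

```python
# Sketch: building per-domain PKRU values (helper name and key numbers are assumptions).
NUM_PKEYS = 16


def pkru_for(allowed_rw=(), allowed_ro=()):
    """Build a PKRU value that denies every key except the listed ones."""
    pkru = 0
    for key in range(NUM_PKEYS):
        if key in allowed_rw:
            bits = 0b00      # read and write allowed
        elif key in allowed_ro:
            bits = 0b10      # write-disable bit set: read-only
        else:
            bits = 0b01      # access-disable bit set: no access
        pkru |= bits << (2 * key)
    return pkru


# Hypothetical assignment: the app part owns key 1, Library-1 owns key 2
# and may read (but not write) the app part's pages.
APP_PART_PKRU = pkru_for(allowed_rw={1})
LIBRARY1_PKRU = pkru_for(allowed_rw={2}, allowed_ro={1})
```

Switching from the app part to Library-1 then means writing `LIBRARY1_PKRU` into the register, which is why the switch is so cheap.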
Intra-Process Isolation + Microkernel
[Figure: Hardware with Intel PKU; a Microkernel (Process, IPC, Sched) above it; Apps and System Servers (FS, MM, Net, Drv, …) on top]
Design Choice #1
Isolate different system servers in a single process.
[Figure: App and Server-1, Server-2, Server-3, … above the Microkernel; labels: "Isolated domains", "Just as traditional IPCs"]
Design Choice #2
Let’s get more aggressive!
[Figure: above the Microkernel, two groups, each containing an App (App-1, App-2) together with Server-1, Server-2, Server-3, …]
Drawbacks
- 1. Updating server mappings is costly
- 2. IPC connection is also costly
- 3. Less flexibility for applications
- n address space and using PKU
An Observation on Intel PKU
- A misleading name
– Protection Key for Userspace
- It still takes effect when in kernel (ring-0)
– The “Userspace” means user-accessible memory
– U/K bit in PTE
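The observation can be made concrete with a deliberately simplified model: the key check is selected by the page's user-accessible bit, not by the ring the CPU is running in. The function name is hypothetical, and the model assumes CR0.WP is set and ignores SMAP/SMEP and instruction fetches.

```python
# Simplified model of when PKU applies (assumes CR0.WP = 1; ignores SMAP/SMEP).
def access_permitted(pkru: int, pkey: int, page_is_user: bool,
                     write: bool, ring: int) -> bool:
    if not page_is_user:
        # Supervisor-only page: no protection-key check, only ring 0 may touch it.
        return ring == 0
    # User-accessible page: the key check runs for ring 3 AND ring 0 alike.
    access_disabled = (pkru >> (2 * pkey)) & 1
    write_disabled = (pkru >> (2 * pkey + 1)) & 1
    if access_disabled:
        return False
    return not (write and write_disabled)


# Hypothetical PKRU value that disables all access to key 3.
DENY_KEY3 = 1 << (2 * 3)
```

Under this model, ring-0 code running with `DENY_KEY3` is blocked from a key-3 page that is marked user-accessible, which is the property the next slides build on.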
UnderBridge: Sinking System Servers
[Figure: Hardware with Intel PKU; Microkernel; Apps; System Servers (FS, MM, Net, Drv, …) placed under intra-kernel isolation]
Design Choice #3: UnderBridge
[Figure: user space holds the Apps; kernel space holds Dom-0 (Microkernel), Dom-1 (Server-1), Dom-2 (Server-2), and Dom-3 (Server-3)]
- Build execution domains in the kernel page table
Execution Domain
- Execution domain 0 is for the microkernel
– Use memory domain 0
– Can access all the memory
- Others own a private memory domain
– A private MPK memory domain ID
- Shared memory
– Allocate a free MPK memory domain ID
[Figure: Dom-0 (Microkernel), Dom-1 (Server-1), Dom-2 (Server-2)]
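The domain rules above can be sketched as a small key allocator. This is an illustrative model, not ChCore code: the class `DomainAllocator` and its method names are hypothetical, and it assumes the microkernel runs with a PKRU of 0 (nothing disabled) while each server's PKRU opens only its private key and the keys of shared regions it takes part in.

```python
# Sketch of execution-domain bookkeeping (class and method names are hypothetical).
NUM_PKEYS = 16


class DomainAllocator:
    def __init__(self):
        self.free_keys = list(range(1, NUM_PKEYS))  # key 0 is kept for the microkernel
        self.private = {}                           # server name -> private key
        self.shared = {}                            # frozenset of two names -> key

    def add_server(self, name):
        """Give a server its own private memory domain."""
        key = self.free_keys.pop(0)                 # IndexError when keys run out
        self.private[name] = key
        return key

    def share(self, a, b):
        """Allocate a free memory domain for memory shared by servers a and b."""
        key = self.free_keys.pop(0)
        self.shared[frozenset((a, b))] = key
        return key

    def pkru(self, name):
        """PKRU value to load while `name` is running."""
        if name == "microkernel":
            return 0                                # domain 0 can access all memory
        open_keys = {self.private[name]}
        open_keys |= {k for pair, k in self.shared.items() if name in pair}
        value = 0
        for key in range(NUM_PKEYS):
            if key not in open_keys:
                value |= 1 << (2 * key)             # set access-disable for this key
        return value


def can_access(pkru, key):
    """True if the access-disable bit for `key` is clear."""
    return (pkru >> (2 * key)) & 1 == 0
```

For example, after adding servers "FS" and "Drv" and sharing a region between them, each server can reach its private key and the shared key, but neither can reach key 0 or the other's private key.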
IPC Gate
- Connect two servers
– Generated by the microkernel
– Resides in memory domain 0 (execute-only for servers)
- Transfer control flow during IPC invocations
– Context switch and domain switch
- Connect the microkernel and servers
– System calls
[Figure: IPC gates connecting Dom-1 (Server-1) with Dom-2 (Server-2), and Dom-2 (Server-2) with Dom-0 (Microkernel)]
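The control transfer performed by a gate can be sketched as a toy simulation. A real gate is a short piece of machine code generated by the microkernel; the `Cpu` class, `make_ipc_gate`, and the PKRU constants below are hypothetical stand-ins used only to show the order of the context switch and the domain switch.

```python
# Toy simulation of an IPC gate (all names and constants are illustrative).
class Cpu:
    def __init__(self, pkru, stack):
        self.pkru = pkru      # current memory-domain permissions
        self.stack = stack    # stands in for the execution context


def make_ipc_gate(callee_pkru, callee_stack, handler):
    """Model of the microkernel generating a gate bound to one callee."""
    def gate(cpu, arg):
        saved_pkru, saved_stack = cpu.pkru, cpu.stack   # save the caller's state
        cpu.stack = callee_stack                        # context switch
        cpu.pkru = callee_pkru                          # domain switch
        try:
            return handler(cpu, arg)                    # run the callee's handler
        finally:
            cpu.pkru, cpu.stack = saved_pkru, saved_stack  # return to the caller
    return gate


SERVER1_PKRU = 0x55555551   # only key 1 open (hypothetical assignment)
SERVER2_PKRU = 0x55555545   # only key 2 open (hypothetical assignment)

observed = []


def server2_read(cpu, block):
    observed.append(cpu.pkru)   # record which domain the handler ran in
    return f"data-{block}"


gate_to_server2 = make_ipc_gate(SERVER2_PKRU, "server2-stack", server2_read)
```

Calling `gate_to_server2(cpu, 7)` from Server-1's context runs the handler under Server-2's permissions and restores Server-1's permissions on return, with no process switch involved.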
Server Migration
- The number of execution domains is limited
– Hardware only provides 16 memory domains
– Time-multiplexing is expensive
- Move servers between user and kernel space
– Disjoint virtual memory regions
– Runtime migration
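The placement constraint behind migration can be sketched as simple bookkeeping. This is not the talk's mechanism: the `Placement` class is hypothetical, and the default capacity of 15 assumes one of the 16 keys is reserved for the microkernel and ignores keys consumed by shared memory.

```python
# Sketch of user/kernel placement under a limited number of domains
# (class name and capacity are assumptions; shared-memory keys are ignored).
MAX_KERNEL_DOMAINS = 15  # 16 hardware keys minus key 0 for the microkernel


class Placement:
    def __init__(self, capacity=MAX_KERNEL_DOMAINS):
        self.capacity = capacity
        self.in_kernel = set()
        self.in_user = set()

    def add(self, server):
        """Place a new server in the kernel if a domain is free, else in user space."""
        if len(self.in_kernel) < self.capacity:
            self.in_kernel.add(server)
        else:
            self.in_user.add(server)

    def migrate_to_kernel(self, server, evict=None):
        """Move a user-space server into the kernel, evicting one if no domain is free."""
        if server not in self.in_user:
            raise ValueError("server is not running in user space")
        if len(self.in_kernel) >= self.capacity:
            if evict not in self.in_kernel:
                raise ValueError("no free domain; name an in-kernel server to evict")
            self.in_kernel.remove(evict)
            self.in_user.add(evict)
        self.in_user.remove(server)
        self.in_kernel.add(server)
```

Which server to evict is left to the caller here; the sketch takes no position on the policy.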
Privilege Deprivation
- In-kernel servers have supervisor privilege
– Can affect the whole system if compromised
– CFI (with binary scanning) incurs runtime overhead
– Binary rewriting alone is infeasible
- Prevent servers from executing privileged instructions
– Add a tiny secure monitor in hypervisor mode
– For rarely executed instructions: VMExits
– For frequently required instructions: rewriting
Other Designs and Implementations
- IPC capability authentication
- Seamless server migration
- Privilege deprivation details
Cross-server IPC Round-Trip Latency
[Figure: bar chart of cross-server IPC round-trip latency in cycles. Systems: Monolithic, ChCore (UnderBridge), SkyBridge, seL4, seL4 - KPTI, Fiasco.OC, Fiasco.OC - KPTI, Zircon. Bar values (cycles): 24, 109, 437, 1450, 2035, 3057, 4145, 8151]
Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU
SQLite Throughput under YCSB-A
[Figure: SQLite throughput under YCSB-A on Zircon, Fiasco.OC, and seL4. Series: Native w/ KPTI, Native w/o KPTI, SkyBridge, UnderBridge, Monolithic, Monolithic w/o KPTI. Annotation: 1×∼8×]
Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU
Conclusion & Thanks!
- UnderBridge
– A redesign of the runtime structure of microkernel OSes for faster OS services
– The efficient intra-kernel isolation mechanism may also be used to harden the isolation of monolithic kernels