Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication



slide-1
SLIDE 1

Harmonizing Performance and Isolation in Microkernels with Efficient Intra-kernel Isolation and Communication

Jinyu Gu, Xinyue Wu, Wentai Li, Nian Liu, Zeyu Mi, Yubin Xia, Haibo Chen

slide-2
SLIDE 2

Monolithic Kernel and Microkernel


slide-3
SLIDE 3

Monolithic Kernel and Microkernel


Microkernel’s philosophy: Moving most OS components into isolated user processes

slide-4
SLIDE 4

Benefits and Usages of Microkernel

  • Achieves good extensibility, security, and fault isolation
  • Succeeds in safety-critical scenarios (Airplane, Car)
  • For more general-purpose applications (Google Zircon)


slide-5
SLIDE 5

Expensive Communication Cost

  • Tradeoff: Performance and Isolation

– Inter-process communication (IPC) overhead

[Figure: an App, a File System, and a Disk Driver communicating through the Microkernel via IPC]

slide-6
SLIDE 6

IPC Overhead is Considerable

[Figure: breakdown of execution time into IPC cost and real work in servers for SQLite on xv6FS and a Ramdisk over a microkernel, shown for Zircon, seL4 w/ KPTI, and seL4 w/o KPTI; axis ticks from 20% to 100%]

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU

  • Direct cost: privilege switch, process switch, …
  • Indirect cost: CPU internal structures pollution

slide-7
SLIDE 7

Goal: Both Ends

  • Harmonize the tension between Performance and Isolation in microkernels

– Reducing the IPC overhead
– Maintaining the isolation guarantee

slide-8
SLIDE 8

New Hardware Brings Opportunities

  • PKU: Protection Keys for Userspace (a.k.a. MPK)

– Assign each page one PKEY (i.e., memory domain ID)
– A new register, PKRU, stores the read/write permission of each PKEY

[Figure: PKEYs indexed [0:15]]
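The PKRU layout can be sketched in a few lines of C. On x86, PKRU holds two bits per PKEY: bit 2k is the access-disable (AD) bit and bit 2k+1 is the write-disable (WD) bit for key k. The helper names below are hypothetical and only illustrate the bit arithmetic; they do not execute WRPKRU or touch any real register.

```c
#include <assert.h>
#include <stdint.h>

#define PKU_NUM_KEYS 16u /* PKEYs 0..15, one per memory domain */

/* Access-disable (AD) bit of `key`: bit 2*key of PKRU. */
static inline uint32_t pkru_ad_bit(unsigned key) {
    assert(key < PKU_NUM_KEYS);
    return UINT32_C(1) << (2u * key);
}

/* Write-disable (WD) bit of `key`: bit 2*key + 1 of PKRU. */
static inline uint32_t pkru_wd_bit(unsigned key) {
    assert(key < PKU_NUM_KEYS);
    return UINT32_C(1) << (2u * key + 1u);
}

/* Reads are allowed when AD is clear. */
static inline int pkru_can_read(uint32_t pkru, unsigned key) {
    return (pkru & pkru_ad_bit(key)) == 0;
}

/* Writes are allowed when both AD and WD are clear. */
static inline int pkru_can_write(uint32_t pkru, unsigned key) {
    return (pkru & (pkru_ad_bit(key) | pkru_wd_bit(key))) == 0;
}

/* Hypothetical helper: return a PKRU value in which `key` is read-only. */
static inline uint32_t pkru_make_read_only(uint32_t pkru, unsigned key) {
    pkru &= ~pkru_ad_bit(key); /* allow reads   */
    pkru |= pkru_wd_bit(key);  /* forbid writes */
    return pkru;
}

/* Hypothetical helper: return a PKRU value in which `key` is inaccessible. */
static inline uint32_t pkru_revoke(uint32_t pkru, unsigned key) {
    return pkru | pkru_ad_bit(key) | pkru_wd_bit(key);
}
```

A PKRU value of 0 leaves every memory domain readable and writable. Protection keys only gate data accesses; instruction fetches are not checked against PKRU.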

slide-9
SLIDE 9

Efficient Intra-Process Isolation

  • ERIM [Security’19] & Hodor [ATC’19]

– Based on Intel PKU
– Build isolated domains in the same process efficiently
– A domain switch takes only 28 cycles (modify PKRU)

[Figure: one process divided into an App part, Library-1, and Library-2]

slide-10
SLIDE 10

Intra-Process Isolation + Microkernel

[Figure: Hardware (Intel PKU); Microkernel (Process, IPC, Sched); Apps; System Servers (FS, MM, Net, Drv)]

slide-11
SLIDE 11

Design Choice #1

Isolate different system servers in a single process.

[Figure: an App, Server-1, Server-2, and Server-3 on top of the Microkernel; labels: "Isolated domains", "Just as traditional IPCs"]

slide-12
SLIDE 12

Design Choice #2

Let’s get more aggressive!

[Figure: App-1 and App-2, each shown together with Server-1, Server-2, and Server-3, on top of the Microkernel]

Drawbacks

  • 1. Updating the server mapping is costly
  • 2. IPC connection is also costly
  • 3. Less flexibility for applications
  • […]n address space and using PKU

slide-13
SLIDE 13

An Observation on Intel PKU

  • A misleading name

– Protection Keys for Userspace

  • It still takes effect when in kernel (ring-0)

– The “Userspace” means user-accessible memory
– Decided by the U/K (user/supervisor) bit in the PTE

slide-14
SLIDE 14

UnderBridge: Sinking System Servers

[Figure: Hardware (Intel PKU); Microkernel; Apps; System Servers (FS, MM, Net, Drv); label: "Intra-kernel isolation"]

slide-15
SLIDE 15

Design Choice #3: UnderBridge

  • Build execution domains in the kernel page table

[Figure: Apps in user space; in kernel space, Dom-0 holds the Microkernel, Dom-1 holds Server-1, Dom-2 holds Server-2, and Dom-3 holds Server-3]

slide-16
SLIDE 16

Execution Domain

  • Execution domain 0 is for the microkernel

– Uses memory domain 0
– Can access all the memory

  • Others own a private memory domain

– A private MPK memory domain ID

  • Shared memory

– Allocate a free MPK memory domain ID

[Figure: Dom-0 (Microkernel), Dom-1 (Server-1), Dom-2 (Server-2)]
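The access rules above can be illustrated with a small C sketch that computes the PKRU value an execution domain would run with. This is an illustration under stated assumptions, not the authors' implementation: the function name `pkru_for_domain` and the representation of shared memory as a 16-bit mask are hypothetical. It relies only on the x86 PKRU layout (two bits per key: access-disable at bit 2k, write-disable at bit 2k+1).

```c
#include <assert.h>
#include <stdint.h>

#define NUM_KEYS 16u /* hardware memory domains (PKEYs 0..15) */

/*
 * Hypothetical helper: build the PKRU value for one execution domain.
 *   own_key     - the domain's private memory domain ID (0 = microkernel)
 *   shared_mask - bit k set if memory domain k is shared memory that this
 *                 execution domain may access (an assumed representation)
 */
static uint32_t pkru_for_domain(unsigned own_key, uint16_t shared_mask) {
    assert(own_key < NUM_KEYS);

    /* Execution domain 0 (the microkernel) can access all the memory. */
    if (own_key == 0)
        return 0;

    /* Servers never get data access to memory domain 0. */
    assert((shared_mask & 1u) == 0);

    /* Start with access-disable and write-disable set for every key ... */
    uint32_t pkru = UINT32_MAX;
    uint32_t allowed = shared_mask | (1u << own_key);

    /* ... then clear both bits for the private key and each shared key. */
    for (unsigned k = 0; k < NUM_KEYS; k++) {
        if (allowed & (1u << k))
            pkru &= ~(UINT32_C(3) << (2u * k));
    }
    return pkru;
}
```

For example, a server that owns memory domain 2 and shares memory domain 5 with another server would run with `pkru_for_domain(2, 1u << 5)`, which evaluates to `0xFFFFF3CF`.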

slide-17
SLIDE 17

IPC Gate

  • Connect two servers

– Generated by the microkernel
– Resides in memory domain 0 (execute-only for servers)

  • Transfer control flow during IPC invocations

– Context switch and domain switch

  • Connect the microkernel and servers

– System calls

[Figure: IPC gates between Server-1 (Dom-1) and Server-2 (Dom-2), and between Server-2 (Dom-2) and the Microkernel (Dom-0)]
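The control transfer through an IPC gate can be sketched as a simulation in C. Everything here is an assumption made for illustration: the `ipc_gate` structure, the function names, and the use of a plain variable `sim_pkru` in place of the hardware PKRU register. A real gate would also switch stacks and registers and would write PKRU with the WRPKRU instruction; none of that is modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the hardware PKRU register (simulation only). */
static uint32_t sim_pkru;

typedef long (*ipc_handler_t)(long arg);

/* Hypothetical gate descriptor, assumed to be set up by the microkernel. */
struct ipc_gate {
    uint32_t callee_pkru; /* PKRU value of the callee's execution domain */
    ipc_handler_t entry;  /* callee's IPC entry point */
};

/* Transfer control to the callee and back, switching domains each way. */
static long ipc_gate_call(const struct ipc_gate *gate, long arg) {
    uint32_t caller_pkru = sim_pkru; /* remember the caller's domain     */
    sim_pkru = gate->callee_pkru;    /* domain switch into the callee    */
    long ret = gate->entry(arg);     /* callee runs in its own domain    */
    sim_pkru = caller_pkru;          /* domain switch back to the caller */
    return ret;
}

/* Demo callee: records the PKRU value it observed while running. */
static uint32_t seen_pkru;
static long demo_server2_handler(long arg) {
    seen_pkru = sim_pkru;
    return arg + 1;
}
```

In this model, a call from Server-1 to Server-2 is a plain function call bracketed by two PKRU updates, with no privilege switch or address-space switch in between.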

slide-18
SLIDE 18

Server Migration

  • The number of execution domains is limited

– Hardware only provides 16 memory domains
– Time-multiplexing is expensive

  • Move servers between user and kernel space

– Disjoint virtual memory regions
– Runtime migration
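Why the 16-domain limit forces migration can be seen from a toy allocator in C. This is a sketch under assumptions, not the system's actual policy: the `key_pool` type, the function names, and the rule "run in the kernel if a memory domain ID is free, otherwise in user space" are hypothetical. Memory domain 0 is reserved for the microkernel, as on the Execution Domain slide; shared memory would consume IDs from the same pool.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_KEYS 16u /* hardware only provides 16 memory domains */

enum placement { PLACE_KERNEL, PLACE_USER };

/* Bit k of `used` is set while memory domain ID k is taken. */
struct key_pool {
    uint16_t used;
};

/* Memory domain 0 always belongs to the microkernel. */
static void key_pool_init(struct key_pool *p) {
    p->used = 1u;
}

/* Return a free memory domain ID in 1..15, or -1 if none is left. */
static int key_alloc(struct key_pool *p) {
    for (unsigned k = 1; k < NUM_KEYS; k++) {
        if (!(p->used & (1u << k))) {
            p->used |= (uint16_t)(1u << k);
            return (int)k;
        }
    }
    return -1;
}

/* Give a memory domain ID back, e.g. after a server moves to user space. */
static void key_free(struct key_pool *p, int key) {
    assert(key > 0 && key < (int)NUM_KEYS);
    p->used &= (uint16_t)~(1u << key);
}

/* Assumed policy: a server runs in the kernel only if it gets its own ID. */
static enum placement place_server(struct key_pool *p, int *key_out) {
    int key = key_alloc(p);
    *key_out = key;
    return key >= 0 ? PLACE_KERNEL : PLACE_USER;
}
```

With this pool, at most 15 servers fit in the kernel at once; a 16th stays in user space until another server is migrated out and its ID is released with `key_free`.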

slide-19
SLIDE 19

Privilege Deprivation

  • In-kernel servers have supervisor privilege

– Can affect the whole system if compromised
– CFI (with binary scanning) incurs runtime overhead
– Binary rewriting alone is infeasible

  • Prevent servers from executing privileged instructions

– Add a tiny secure monitor in hypervisor mode
– For instructions that are rarely executed: VMExits
– For instructions that are frequently required: rewriting

slide-20
SLIDE 20

Other Designs and Implementations

  • IPC capability authentication
  • Seamless server migration
  • Privilege deprivation details


slide-21
SLIDE 21

Cross-server IPC Round-Trip Latency

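[Figure: bar chart of cross-server IPC round-trip latency in cycles, with a broken axis (ticks 1000 to 5000 and 7500 to 8500).
Systems: Monolithic, ChCore (UnderBridge), SkyBridge, seL4, seL4 w/ KPTI, Fiasco.OC, Fiasco.OC w/ KPTI, Zircon.
Bar labels: 24, 109, 437, 1450, 2035, 3057, 4145, and 8151 (on the upper axis segment).]

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU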

slide-22
SLIDE 22

SQLite Throughput under YCSB-A

[Figure: SQLite throughput under YCSB-A on Zircon, Fiasco.OC, and seL4; series: Native w/ KPTI, Native w/o KPTI, SkyBridge, UnderBridge, Monolithic, Monolithic w/o KPTI; axis ticks from 2 to 10; annotation: 1×∼8×]

Evaluated on Dell PowerEdge R640 server with Intel Xeon Gold 6138 CPU

slide-23
SLIDE 23

Conclusion & Thanks!

  • UnderBridge

– A redesign of the runtime structure of microkernel OSes for faster OS services
– The efficient intra-kernel isolation mechanism may also be used to harden the isolation of monolithic kernels

Q&A: gujinyu@sjtu.edu.cn