
Hardware/Software Co-Design for Efficient Microkernel Execution
Martin Děcký, martin.decky@huawei.com
February 2019

Who Am I
- Passionate programmer and operating systems enthusiast
  - With a specific inclination towards multiserver microkernels
- HelenOS developer since 2004
- Research Scientist from 2006 to 2018
  - Charles University (Prague), Distributed Systems Research Group
- Senior Research Engineer since 2017
  - Huawei Technologies (Munich), German Research Center, Central Software Institute, OS Kernel Lab

Martin Děcký, FOSDEM, February 3rd 2019

Microkernel Multiserver Systems are better than Monolithic Systems

Monolithic OS Design is Flawed
Biggs S., Lee D., Heiser G.: The Jury Is In: Monolithic OS Design Is Flawed: Microkernel-based Designs Improve Security, ACM 9th Asia-Pacific Workshop on Systems (APSys), 2018

“While intuitive, the benefits of the small TCB have not been quantified to date. We address this by a study of critical Linux CVEs, where we examine whether they would be prevented or mitigated by a microkernel-based design. We find that almost all exploits are at least mitigated to less than critical severity, and 40% completely eliminated by an OS design based on a verified microkernel, such as seL4.”

Problem Statement

Problem Statement
- Microkernel design ideas go as far back as 1969
  - RC 4000 Multiprogramming System nucleus (Per Brinch Hansen)
  - Isolation of unprivileged processes, inter-process communication, hierarchical control
  - Even after 50 years they are not fully accepted as mainstream
- Hardware and software used to be designed independently
  - Designing CPUs used to be an extremely complicated and costly process
  - Operating systems used to be written after the CPUs were designed
  - Hardware designs used to be rather conservative

Problem Statement (2)
- Mainstream ISAs used to be designed in a rather conservative way
  - Can you name some really revolutionary ISA features since IBM System/370 Advanced Function?
- Requirements on the new ISAs usually follow the needs of the mainstream operating systems running on the past ISAs
- No wonder microkernels suffer performance penalties compared to monolithic systems
  - The more fine-grained the architecture, the more penalties it suffers
- Let us design the hardware with microkernels in mind!

The Vicious Cycle
1. CPUs do not support microkernels properly
2. Microkernels suffer performance penalties
3. Microkernels are not in the mainstream
4. No requirements on CPUs from microkernels (which leads back to step 1)

Any Ideas?

Communication between Address Spaces
- Control and data flow between subsystems
  - Monolithic kernel
    - Function calls
    - Passing arguments in registers and on the stack
    - Passing direct pointers to memory structures
  - Multiserver microkernel
    - IPC via microkernel syscalls
    - Passing arguments in a subset of registers
    - Privilege level switch, address space switch
    - Scheduling (in case of asynchronous IPC)
    - Data copying or memory sharing with page granularity

Communication between Address Spaces (2)
- Is the kernel round-trip of the IPC necessary?
- Suggestion for synchronous IPC: Extended Jump/Call and Return instructions that also switch the address space
  - Communicating parties identified by a “call gate” (capability) containing the target address space and the PC of the IPC handler (implicit for return)
  - Call gates stored in a TLB-like hardware cache (CLB)
  - CLB populated by the microkernel similarly to TLB-only memory management architecture
- Suggestion for asynchronous IPC: Using CPU cache lines as the buffers for the messages
  - Async Jump/Call, Async Return and Async Receive instructions
  - Using the CPU cache like an extended register stack engine
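
The synchronous IPC suggestion can be illustrated with a small software model. This is only a sketch under stated assumptions, not a description of any real instruction set: the names CallGate, CLB, CPU, xcall and xreturn, the 4-byte instruction size, the oldest-first eviction and the hardware return stack are all hypothetical. The refill callback stands in for the microkernel handling a CLB miss, in the same way a software-managed TLB is refilled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

INSN_SIZE = 4  # assumed fixed instruction width in bytes


@dataclass(frozen=True)
class CallGate:
    """A call gate (capability): target address space plus IPC handler entry."""
    target_asid: int   # hypothetical address space identifier
    handler_pc: int    # program counter of the IPC handler in the target


class CLB:
    """Toy model of a small call-gate cache that software refills on a miss."""

    def __init__(self, capacity: int, refill: Callable[[int], Optional[CallGate]]):
        self.capacity = capacity
        self.refill = refill               # stands in for the microkernel miss handler
        self.entries: Dict[int, CallGate] = {}
        self.misses = 0

    def lookup(self, gate_id: int) -> CallGate:
        gate = self.entries.get(gate_id)
        if gate is None:
            # Miss: the microkernel validates the gate and installs it.
            self.misses += 1
            gate = self.refill(gate_id)
            if gate is None:
                raise PermissionError(f"no call gate {gate_id} for this caller")
            if len(self.entries) >= self.capacity:
                # Evict the oldest entry (dicts preserve insertion order).
                self.entries.pop(next(iter(self.entries)))
            self.entries[gate_id] = gate
        return gate


class CPU:
    """Tracks only the state that the extended call and return would change."""

    def __init__(self, asid: int, pc: int, clb: CLB):
        self.asid, self.pc, self.clb = asid, pc, clb
        self.return_stack = []             # makes the return target implicit

    def xcall(self, gate_id: int) -> None:
        gate = self.clb.lookup(gate_id)
        self.return_stack.append((self.asid, self.pc + INSN_SIZE))
        self.asid, self.pc = gate.target_asid, gate.handler_pc

    def xreturn(self) -> None:
        self.asid, self.pc = self.return_stack.pop()


# Gates the microkernel has granted to the caller (made-up values).
granted = {7: CallGate(target_asid=2, handler_pc=0x4000)}
cpu = CPU(asid=1, pc=0x1000, clb=CLB(capacity=4, refill=granted.get))

cpu.xcall(7)                       # first call misses in the CLB
in_server = (cpu.asid, cpu.pc)     # now executing the handler in the server
cpu.xreturn()
cpu.xcall(7)                       # second call hits, no microkernel involvement
cpu.xreturn()
```

In this model only the first call through a gate involves the microkernel; later calls switch the address space and program counter directly, which is the round-trip the slide asks about.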

Communication between Address Spaces (3)
- Bulk data
  - Observation: Memory sharing is actually quite efficient for large amounts of data (multiple pages)
    - Overhead is caused primarily by creating and tearing down the shared pages
    - Data needs to be page-aligned
  - Sub-page granularity and dynamic data structures
    - Suggestion: Using CPU cache lines as shared buffers
    - Much finer granularity than pages (typically 64 to 128 bytes)
    - A separate virtual-to-cache mapping mechanism before the standard virtual-to-physical mapping
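
A short calculation makes the granularity argument concrete. The 4096-byte page size and the 200-byte payload are assumptions chosen for illustration; the 64 and 128 byte line sizes are the ones quoted above.

```python
PAGE_SIZE = 4096        # assumed page size in bytes
LINE_SIZES = (64, 128)  # cache line sizes quoted on the slide


def shared_bytes(payload: int, granule: int) -> int:
    """Bytes exposed when the buffer is rounded up to whole granules."""
    granules = -(-payload // granule)   # integer ceiling division
    return granules * granule


payload = 200  # hypothetical message size in bytes
for granule in (PAGE_SIZE, *LINE_SIZES):
    total = shared_bytes(payload, granule)
    print(f"granule {granule:4d} B: share {total:4d} B, {total - payload:4d} B unused")
```

For this payload, page-granular sharing exposes a full 4096-byte page, while either cache line size exposes 256 bytes.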

Fast Context Switching
- Current microsecond-scale latency hiding mechanisms
  - Hardware multi-threading
    - Effective
    - Does not scale beyond a few threads
  - Operating system context switching
    - Scales for any thread count
    - Too slow (order of 10 µs)
- Goal: Finding a sweet spot between the two mechanisms

Fast Context Switching (2)
- Suggestion: Hardware cache for contexts
  - Again, similar mechanism to TLB-only memory management
  - Dedicated instructions for context store, context restore, context switch, context save, context load
  - Context data could be potentially ABI-optimized
  - Autonomous mechanism for event-triggered context switch (e.g. external interrupt)
- Efficient hardware mechanism for latency hiding
  - The equivalent of fine/coarse-grained simultaneous multithreading
  - The software scheduler is in charge of setting the scheduler policy
  - The CPU is in charge of scheduling the contexts based on ALU, cache and other resource availability

User Space Interrupt Processing
- Extension of the fast context switching mechanism
  - Efficient delivery of interrupt events to user space device drivers
  - Without the routine microkernel intervention
  - An interrupt could be directly handled by a preconfigured hardware context in user space
- A clear path towards moving even the timer interrupt handler and the scheduler from kernel space to user space
- Going back to interrupt-driven handling of peripherals with extremely low latency requirements (instead of polling)
- The usual pain point: Level-triggered interrupts
  - Some coordination with the platform interrupt controller is probably needed to automatically mask the interrupt source
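
The level-triggered pain point can be shown with a toy model. A level-triggered line stays asserted until the driver quiesces the device, so if it is delivered straight to user space without masking, it fires again before the driver has had a chance to run. The InterruptController class, its auto_mask flag and the helper below are hypothetical and only sketch the idea of masking on delivery; they do not model any real interrupt controller.

```python
class InterruptController:
    """Toy interrupt controller with level-triggered lines (hypothetical model)."""

    def __init__(self, auto_mask: bool):
        self.auto_mask = auto_mask
        self.asserted = set()   # lines currently held active by their devices
        self.masked = set()

    def pending(self):
        return sorted(self.asserted - self.masked)

    def deliver(self):
        """Pick one pending line to hand to a user space context, or None."""
        lines = self.pending()
        if not lines:
            return None
        line = lines[0]
        if self.auto_mask:
            self.masked.add(line)   # hardware masks the source on delivery
        return line

    def unmask(self, line):
        self.masked.discard(line)


def deliveries_before_driver_runs(ctrl, line, limit=5):
    """Count how often the line fires before the driver can quiesce the device."""
    ctrl.asserted.add(line)
    count = 0
    while count < limit and ctrl.deliver() == line:
        count += 1
    return count


# Without automatic masking the same line keeps firing (capped by limit here).
storm = deliveries_before_driver_runs(InterruptController(auto_mask=False), line=3)

# With automatic masking the line is delivered once and then stays quiet.
ctrl = InterruptController(auto_mask=True)
single = deliveries_before_driver_runs(ctrl, line=3)

# The user space driver services the device, then re-enables the line.
ctrl.asserted.discard(3)
ctrl.unmask(3)
```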

Capabilities as First-Class Entities
- Capabilities as unforgeable object identifiers
  - But eventually each access to an object needs to be bounds-checked and translated into the (flat) virtual address space
- Suggestion: Embedding the capability reference in pointers
  - RV128 (128-bit variant of RISC-V) would provide 64 bits for the capability reference and 64 bits for object offset
  - 128-bit flat pointers are probably useless anyway
- Besides the (somewhat narrow) use in the microkernel, this could be useful for other purposes
  - Simplifying the implementation of managed languages’ VMs
  - Working with multiple virtual address spaces at once
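
The 64 + 64 bit split can be sketched as follows. Placing the capability reference in the upper half, the ObjectEntry table and the function names are assumptions made for illustration; the slides only state the field widths and that each access must be bounds-checked and translated into the flat virtual address space.

```python
from dataclasses import dataclass

MASK64 = (1 << 64) - 1


def pack(cap_ref: int, offset: int) -> int:
    """Build a 128-bit pointer: capability reference in the upper 64 bits
    (an assumed layout), object offset in the lower 64 bits."""
    if not (0 <= cap_ref <= MASK64 and 0 <= offset <= MASK64):
        raise ValueError("fields must fit in 64 bits")
    return (cap_ref << 64) | offset


def unpack(ptr: int):
    """Split a 128-bit pointer back into (capability reference, offset)."""
    return ptr >> 64, ptr & MASK64


@dataclass(frozen=True)
class ObjectEntry:
    base: int   # virtual address of the object in the flat address space
    size: int   # object size in bytes


def translate(ptr: int, table: dict) -> int:
    """Bounds-check the offset and translate to a flat virtual address."""
    cap_ref, offset = unpack(ptr)
    entry = table.get(cap_ref)
    if entry is None:
        raise PermissionError("unknown capability reference")
    if offset >= entry.size:
        raise IndexError("offset outside object bounds")
    return entry.base + offset


table = {0x2A: ObjectEntry(base=0x7000_0000, size=256)}  # made-up object table
addr = translate(pack(0x2A, 16), table)
```

Because the offset is checked against the object size on every translation, an out-of-bounds offset is rejected before any flat address is produced.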
