LIBMPK: SOFTWARE ABSTRACTION FOR INTEL MEMORY PROTECTION KEYS (INTEL MPK)

SLIDE 1

Soyeon Park, Sangho Lee, Wen Xu, Hyungon Moon and Taesoo Kim

LIBMPK: SOFTWARE ABSTRACTION FOR INTEL MEMORY PROTECTION KEYS (INTEL MPK)

SLIDE 2

INTRODUCTION

SECURITY CRITICAL MEMORY REGIONS NEED PROTECTION

▸ JIT page


“To achieve code execution, we can simply locate one of these RWX JIT pages and overwrite it with our own shellcode.” - [1]

▸ Personal information
▸ Password
▸ Private key

“We confirmed that all individuals used only the Heartbleed exploit to obtain the private key.” - [2]

[1] Amy Burnett, et al. “Weaponization of a JavaScriptCore Vulnerability”, RET2 Systems Engineering Blog
[2] Nick Sullivan, “The Results of the CloudFlare Challenge”, CloudFlare Blog

SLIDE 3

INTRODUCTION

EXAMPLE 1 - HEARTBLEED ATTACK

[Figure: a Heartbleed request asks the web server to reply “HELLO” with a length of 1000 bytes. The 1000-byte reply contains “HELLO” followed by adjacent memory, so the leaked data includes the private key.]

SLIDE 4

INTRODUCTION

EXAMPLE 1 : EXISTING SOLUTION TO PROTECT MEMORY

▸ Process separation

[Figure: two separate processes, each with its own memory]

[1] Song, Chengyu, et al. “Exploiting and Protecting Dynamic Code Generation”, NDSS 2015.
[2] Litton, James, et al. “Light-Weight Contexts: An OS Abstraction for Safety and Performance”, OSDI 2016.

SLIDE 5

INTRODUCTION

EXAMPLE 2 - EXISTING SOLUTION TO PROTECT JIT PAGE

[Figure: a process calls mprotect(W), writes code into the code cache, then calls mprotect(RX) so that the code cache can be executed]

▸ JIT page W^X protection
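
A minimal C sketch of the mprotect-based W^X pattern on this slide, assuming a POSIX system. The helper names jit_cache_create and jit_emit are hypothetical and are not taken from any JIT engine. Because mprotect changes the mapping for the whole process, every thread can write to the code cache between the two calls.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one page as the code cache, initially
 * readable and executable but not writable. Returns NULL on failure. */
static void *jit_cache_create(size_t *len_out)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, len, PROT_READ | PROT_EXEC,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    *len_out = len;
    return p;
}

/* Hypothetical helper: copy `len` bytes of generated code into the
 * cache. The mapping is writable only during the copy.
 * Returns 0 on success, -1 on failure. */
static int jit_emit(void *cache, size_t cache_len,
                    const void *code, size_t len)
{
    if (len > cache_len)
        return -1;
    /* mprotect(W): takes effect for every thread in the process. */
    if (mprotect(cache, cache_len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    memcpy(cache, code, len);
    /* mprotect(RX): drop write permission before the code is executed. */
    if (mprotect(cache, cache_len, PROT_READ | PROT_EXEC) != 0)
        return -1;
    return 0;
}
```

Each update costs two system calls, and the cost grows when several separate regions need the same change.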

SLIDE 6

INTRODUCTION

PROBLEMS OF EXISTING SOLUTIONS

▸ Process Separation
  ▸ High overhead to spawn a new process and synchronize data

▸ W^X Protection
  ▸ Race condition due to permission synchronization
  ▸ Repeated cost to change the permissions of multiple pages

This talk: using a hardware mechanism, Intel Memory Protection Keys (MPK), to address these challenges

SLIDE 7

OUTLINE

▸ Introduction
▸ Intel MPK Explained
▸ Challenges
▸ Design
▸ Implementation
▸ Evaluation
▸ Discussion
▸ Related Work
▸ Conclusion

SLIDE 8

INTEL MPK EXPLAINED

OVERVIEW

▸ Supports fast permission changes for page groups with a single instruction
  ▸ Fast single invocation
  ▸ Fast permission change for multiple pages

[Figure: mprotect crosses into the kernel, while Intel MPK operates in userspace. Plot: latency (ms) versus number of pages (1,000 to 36,000) for mprotect (contiguous) and mprotect (sparse).]

SLIDE 9

INTEL MPK EXPLAINED

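UNDERLYING IMPLEMENTATION

[Figure: pkey_mprotect goes through the kernel and tags page 120 with pkey 2 in the page table. WRPKRU and RDPKRU access the 32-bit PKRU register, which holds R/W permission bits for 16 pkeys. With pkey 2 set to R/W, page 120 is R/W; with pkey 2 set to R, page 120 is R.]

Page table:
page #   pkey   ···   perm.
120      2      ···   R/W

▸ Permissions are per CPU
▸ The 32-bit PKRU register contains keys/permissions
▸ WRPKRU: write key/permission
▸ RDPKRU: read key/permission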
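
To make the register layout concrete, the sketch below models the 32-bit PKRU value in plain C: each pkey owns two adjacent bits, an access-disable bit and a write-disable bit. The helper names are hypothetical, and the code only computes values. On real hardware the register is read with RDPKRU and written with WRPKRU, which replaces the whole register at once, so a per-key update such as WRPKRU(pkey, W) in these slides is shorthand for a read-modify-write.

```c
#include <assert.h>
#include <stdint.h>

#define PKRU_AD 0x1u  /* access-disable bit: blocks reads and writes */
#define PKRU_WD 0x2u  /* write-disable bit: blocks writes only */

/* Return a copy of `pkru` in which the two bits owned by `pkey` (0..15)
 * are replaced by `rights` (0, PKRU_WD, or PKRU_AD). */
static uint32_t pkru_set(uint32_t pkru, unsigned pkey, uint32_t rights)
{
    unsigned shift = 2u * pkey;          /* key i uses bits 2i and 2i+1 */
    pkru &= ~(0x3u << shift);            /* clear the old rights */
    return pkru | ((rights & 0x3u) << shift);
}

/* Data reads are allowed when the access-disable bit is clear. */
static int pkru_can_read(uint32_t pkru, unsigned pkey)
{
    return ((pkru >> (2u * pkey)) & PKRU_AD) == 0;
}

/* Data writes are allowed when both disable bits are clear. */
static int pkru_can_write(uint32_t pkru, unsigned pkey)
{
    return ((pkru >> (2u * pkey)) & (PKRU_AD | PKRU_WD)) == 0;
}
```

PKRU has no execute bit: it restricts data reads and writes only, which is why disabling all access for a key leaves an executable page execute-only.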

SLIDE 10

INTEL MPK EXPLAINED

EXAMPLE - JIT PAGE W^X PROTECTION

function init()
    pkey = pkey_alloc()
    pkey_mprotect(code_cache, len, RWX, pkey)

function JIT()
    WRPKRU(pkey, W)    // grant permission
    ... write code cache ...
    WRPKRU(pkey, R)    // revoke permission

function fini()
    pkey_free(pkey)

[Figure: the code cache is mapped RWX and tagged with pkey = 1. The PKRU entry for pkey 1 is switched to W while code is written into the code cache and back to R afterward.]

SLIDE 11

INTEL MPK EXPLAINED

EXAMPLE : EXECUTABLE-ONLY MEMORY

function init()
    pkey = pkey_alloc()
    pkey_mprotect(code_cache, len, RWX, pkey)

function JIT()
    WRPKRU(pkey, W)
    ... write code cache ...
    WRPKRU(pkey, R)

function fini()
    pkey_free(pkey)

[Figure: the code cache, mapped RWX and tagged with pkey]

SLIDE 12

INTEL MPK EXPLAINED

EXAMPLE : EXECUTABLE-ONLY MEMORY

function init()
    pkey = pkey_alloc()
    pkey_mprotect(code_cache, len, RWX, pkey)

function JIT()
    WRPKRU(pkey, W)
    ... write code cache ...
    WRPKRU(pkey, None)

function fini()
    pkey_free(pkey)

[Figure: the code cache, mapped RWX and tagged with pkey]

SLIDE 13

OUTLINE

▸ Introduction
▸ Intel MPK Explained
▸ Challenges
  ▸ Non-scalable Hardware Resource
  ▸ Asynchronous Permission Change
▸ Design
▸ Implementation
▸ Evaluation
▸ Discussion
▸ Related Work
▸ Conclusion

SLIDE 14

CHALLENGES

NON-SCALABLE HARDWARE RESOURCE

▸ Only 16 keys are provided

[Figure: a process writes to code caches 1 through 17, switching each between W and R. Code caches 1 to 16 use pkey 1 to pkey 16; no key is left for code cache 17 (“pkey ?”).]

SLIDE 15

CHALLENGES

ASYNCHRONOUS PERMISSION CHANGE - PROS

▸ Permission changes with MPK are inherently per-thread

[Figure: a process whose threads all hold RX on the code cache; one thread is about to switch between W and R to write the code cache]

SLIDE 16

CHALLENGES

ASYNCHRONOUS PERMISSION CHANGE - PROS

▸ Permission changes with MPK are inherently per-thread

[Figure: one thread switches its pkey permission to W and writes the code cache while the other threads keep RX]

SLIDE 17

CHALLENGES

ASYNCHRONOUS PERMISSION CHANGE - CONS

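▸ Permission synchronization is necessary in some contexts

[Figure: one thread changes its pkey permission on the code cache from W to None, leaving the code cache execute-only (X) for that thread, while the other threads still hold RX]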

SLIDE 18

CHALLENGES

ASYNCHRONOUS PERMISSION CHANGE - CONS

▸ Permission synchronization is necessary in some contexts

[Figure: one thread changes its pkey permission on the code cache from W to None (X), but a thread that still holds RX can read the code cache]

SLIDE 19

DESIGN

REVISIT : CHALLENGES

▸ Non-scalable Hardware Resources
  ▸ Solved by key virtualization through key indirection

▸ Asynchronous Permission Change
  ▸ libmpk provides a permission synchronization API

SLIDE 20

DESIGN

KEY VIRTUALIZATION

▸ Decouples physical keys from the user interface
▸ Key indirection works like a cache

[Figure: the application writes code under virtual keys vkey 1 to vkey 17, while the library maps them onto hardware keys pkey 1 to pkey 16. When vkey 17 needs a hardware key, an existing mapping is evicted.]
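
The cache-like indirection can be sketched in C as a small table that binds virtual keys to 16 hardware key slots. This is an illustration, not libmpk's code: the names are hypothetical, least-recently-used eviction is an assumption (the slide only says a mapping is evicted), and the slot numbering does not correspond to real pkey values. A real implementation would also have to update the pages of the evicted key, which is where the miss cost comes from.

```c
#include <assert.h>

#define NUM_PKEYS 16   /* hardware keys, per the slides */
#define NO_VKEY   (-1)

/* Hypothetical model of the vkey -> hardware key slot mapping. */
struct key_cache {
    int vkey[NUM_PKEYS];               /* vkey bound to each slot */
    unsigned long last_use[NUM_PKEYS]; /* logical time of last use */
    unsigned long clock;
    unsigned long hits, misses;
};

static void key_cache_init(struct key_cache *c)
{
    for (int i = 0; i < NUM_PKEYS; i++) {
        c->vkey[i] = NO_VKEY;
        c->last_use[i] = 0;
    }
    c->clock = c->hits = c->misses = 0;
}

/* Return the slot backing `vkey`. On a miss, bind `vkey` to an empty
 * slot if one exists, otherwise to the least recently used slot
 * (assumed policy), evicting the vkey that was there. */
static int key_cache_get(struct key_cache *c, int vkey)
{
    int victim = 0;
    c->clock++;
    for (int i = 0; i < NUM_PKEYS; i++) {
        if (c->vkey[i] == vkey) {       /* hit: refresh and return */
            c->last_use[i] = c->clock;
            c->hits++;
            return i;
        }
        if (c->last_use[i] < c->last_use[victim])
            victim = i;                 /* oldest or still-empty slot */
    }
    c->vkey[victim] = vkey;             /* miss: rebind the slot */
    c->last_use[victim] = c->clock;
    c->misses++;
    return victim;
}
```

With 17 virtual keys in rotation, the 17th request cannot hit, so the hit rate depends on how often the application revisits recently used keys.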

SLIDE 21

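DESIGN

INTER-THREAD PERMISSION SYNCHRONIZATION

[Figure: thread A and thread B start in state RUNNING in userspace with permission RX. Thread B goes through SLEEP and returns to RUNNING when rescheduled.]

➊ Thread A calls mpk_mprotect(), which enters the kernel (pkey_sync)
➋ The kernel adds hooks (task_work) for thread B
➌ The kernel interrupts thread B
➍ The call returns to thread A
➎ Thread B updates its PKRU (WRPKRU) when it is rescheduled

Both threads end with permission X.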
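
The numbered steps can be modeled in a few lines of C. This is a toy simulation under stated assumptions, not kernel code: the struct and function names are hypothetical, each thread is reduced to a PKRU value plus one queued hook, and the whole register value is replaced rather than a single key's bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-thread state for the simulation. */
struct thread_model {
    uint32_t pkru;          /* this thread's current PKRU value */
    bool     has_work;      /* a hook (task_work in the slides) is queued */
    uint32_t pending_pkru;  /* value the queued hook will write */
};

/* Steps 1 to 4: the caller gets the new value immediately; every other
 * thread only gets a queued hook, so the caller does not wait for them. */
static void sync_pkru(struct thread_model *t, int n, int caller,
                      uint32_t new_pkru)
{
    for (int i = 0; i < n; i++) {
        if (i == caller) {
            t[i].pkru = new_pkru;
        } else {
            t[i].has_work = true;
            t[i].pending_pkru = new_pkru;
        }
    }
}

/* Step 5: runs when a thread is rescheduled, before it continues in
 * userspace, so it never runs user code with the stale permission. */
static void on_reschedule(struct thread_model *th)
{
    if (th->has_work) {
        th->pkru = th->pending_pkru;
        th->has_work = false;
    }
}
```

The point of the lazy update is that a sleeping thread cannot touch the protected pages anyway, so its PKRU only needs to be correct by the time it runs again.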

SLIDE 22

IMPLEMENTATION

▸ libmpk is written in C/C++
  ▸ Userspace library : 663 LoC
  ▸ Kernel support : 1K LoC
    ▸ Permission synchronization
    ▸ Kernel module for managing metadata
      ▸ Userspace cannot fabricate metadata

▸ Open source at https://github.com/sslab-gatech/libmpk

SLIDE 23

IMPLEMENTATION

USE CASE - JIT PAGE W^X PROTECTION

function init()
    vkey = libmpk_mmap(&code_cache, len, RWX)

function JIT()
    libmpk_begin(vkey, W)
    ... write code cache ...
    libmpk_end(vkey)
    libmpk_mprotect(vkey, X)

[Figure: the code cache is mapped RWX and ends with permission X in every thread. Callouts: key virtualization, permission synchronization.]

SLIDE 24

OUTLINE

▸ Introduction
▸ Intel MPK Explained
▸ Challenges
▸ Design
▸ Implementation
▸ Evaluation
  ▸ Usability
  ▸ Overhead introduced by the design
  ▸ Use cases: applying libmpk for memory isolation and protection
▸ Discussion
▸ Related Work
▸ Conclusion

SLIDE 25

EVALUATION

LIBMPK IS EASY TO ADOPT

▸ OpenSSL (83 LoC) : protecting private key
▸ Memcached (117 LoC) : protecting slabs
▸ ChakraCore (10 LoC) : protecting JIT pages

SLIDE 26

EVALUATION

LATENCY - KEY VIRTUALIZATION

▸ A cache miss adds overhead because of eviction

[Plot: time (μs, 0.0 to 3.0) versus hit rate (25 to 100) for Hit, Miss, and mprotect]

Reasonable overhead while providing similar functionality.

SLIDE 27

EVALUATION

LATENCY - INTER-THREAD PERMISSION SYNCHRONIZATION

▸ Performance relative to mprotect
  ▸ 1,000 pages : 3.8x
  ▸ Single page : 1.7x

[Plot: latency (μs, 10 to 40) versus number of threads (1 to 40) for mpk_mprotect, mprotect (4KB), and mprotect (4000KB)]

libmpk outperforms mprotect regardless of the number of pages.

SLIDE 28

EVALUATION

FAST MEMORY ISOLATION - OPENSSL & MEMCACHED

OpenSSL
▸ request/sec: 0.53% slowdown

[Plot: requests/sec (375 to 1500) versus size of each request (1 KB to 1024 KB) for original and libmpk]

Memcached
▸ For 1GB protection:
  ▸ original vs mpk_inthread : 0.01%
  ▸ mpk_synch vs mprotect : 8.1x

[Plot: Kbyte/sec (125 to 500) versus number of connections (250 to 1000) for original, mpk_inthread, mpk_synch, and mprotect]

SLIDE 29

EVALUATION

FAST AND SECURE W⊕X - JIT COMPILATION

▸ ChakraCore
  ▸ mprotect-based protection allows a race-condition attack
  ▸ 4.39% performance improvement with libmpk (31.11% at most)

[Plot: normalized score (0.70 to 1.30) for mprotect and libmpk on RICHARDS, DELTABLUE, CRYPTO, RAYTRACE, EARLEYBOYER, REGEXP, SPLAY, SPLAYLATENCY, NAVIERSTOKES, PDFJS, MANDREEL, MANDREELLATENCY, GAMEBOY, CODELOAD, BOX2D, ZLIB, TYPESCRIPT]

SLIDE 30

DISCUSSION

▸ Rogue data cache load (Meltdown)
  ▸ MPK is also affected by the Meltdown attack
  ▸ Hardware or software-level mitigation

▸ Code reuse attack
  ▸ Arbitrarily executed WRPKRU may break the security
  ▸ Applying sandboxing or control-flow integrity

▸ Protection key use-after-free
  ▸ pkey_free does not perfectly free the protection key
  ▸ Pages are still associated with the pkey after free

SLIDE 31

RELATED WORK

▸ ERIM [1] : Secure wrapper of MPK
▸ Shadow Stack [2] : Shadow stack protected by MPK
▸ XOM-Switch [3] : Code-reuse attack prevention with execute-only memory supported by MPK

[1] Anjo Vahldiek-Oberwagner, et al. “ERIM: Secure, Efficient In-Process Isolation with Memory Protection Keys”, USENIX Security 2019
[2] Nathan Burow, et al. “Shining Light on Shadow Stacks”, Oakland 2019
[3] Mingwei Zhang, et al. “XOM-Switch: Hiding Your Code From Advanced Code Reuse Attacks in One Shot”, Black Hat Asia 2018

SLIDE 32

CONCLUSION

▸ libmpk is a secure, scalable, and synchronizable abstraction of MPK for supporting fast memory protection and isolation with little effort.

THANKS!

https://github.com/sslab-gatech/libmpk