Making C Less Dangerous in the Linux Kernel | Kees Cook | @keescook




slide-1
SLIDE 1

Making C Less Dangerous in the Linux Kernel

LINUX.CONF.AU 21-25 January 2019 Christchurch, NZ The Linux of Things | #LCA2019 | @linuxconfau

Kees Cook | @keescook

slide-2
SLIDE 2

Making C Less Dangerous in the Linux Kernel

Linux Conf AU January 25, 2019 Christchurch, New Zealand Kees (“Case”) Cook keescook@chromium.org @kees_cook https://outflux.net/slides/2019/lca/danger.pdf

slide-3
SLIDE 3

Agenda

  • Background
    – Kernel Self Protection Project
    – C as a fancy assembler
  • Towards less dangerous C
    – Variable Length Arrays are bad and slow
    – Explicit switch case fall-through
    – Always-initialized automatic variables
    – Arithmetic overflow detection
    – Hope for bounds checking
    – Control Flow Integrity: forward edges
    – Control Flow Integrity: backward edges
    – Where are we now?
    – How you can help

@Rob_Russell

slide-4
SLIDE 4

Kernel Self Protection Project

  • https://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project
  • KSPP focuses on the kernel protecting the kernel from attack (e.g. refcount overflow) rather than the kernel protecting userspace from attack (e.g. execve brute force detection), but any area of related development is welcome
  • Currently ~12 organizations and ~10 individuals working on ~20 technologies
  • Slow and steady
slide-5
SLIDE 5

C as a fancy assembler: almost machine code

  • The kernel wants to be as fast and small as possible
  • At the core, kernel wants to do very architecture-specific things for memory management, interrupt handling, scheduling, ...

  • No C API for setting up page tables, switching to 64-bit mode …
slide-6
SLIDE 6

C as a fancy assembler: undefined behavior

  • The C language comes with some operational baggage, and weak “standard” libraries
    – What are the contents of “uninitialized” variables?
        • … whatever was in memory from before now!
    – void pointers have no type yet we can call typed functions through them?
        • … assembly doesn’t care: everything can be an address to call!
    – Why does memcpy() have no “max destination length” argument?
        • … just do what I say; memory areas are all the same!
  • “With undefined behavior, anything is possible!”
    – https://raphlinus.github.io/programming/rust/2018/08/17/undefined-behavior.html

slide-7
SLIDE 7

Variable Length Arrays (and alloca()) are bad

  • Exhaust stack, linear overflow: write to things following it
  • Jump over guard pages and write to things following it
  • Easy to find with compiler flag: -Wvla
  • But if you must (in userspace) please use gcc’s stack probing feature: -fstack-clash-protection

[Figure: two stack layouts.
  Linear overflow: stack 1 and stack 2, with
      size = 8192; ... char buf[size]; … strcpy(buf, src, size);
  Guard page jump: stack 1, stack 2 and a guard page, with
      size = 8192; ... u8 array[size]; … array[big] = foo;]

slide-8
SLIDE 8

Variable Length Arrays are slow

  • This seems conceptually sound: more instructions to change stack size, but it seems like it would be hard to measure.
  • It is quite measurable … 13% speed up measured during lib/bch.c VLA removal: https://git.kernel.org/linus/02361bc77888 (Ivan Djelic)

Buffer allocation | Encoding throughput (Mbit/s)
on-stack, VLA     | 3988
on-stack, fixed   | 4494
kmalloc           | 1967

slide-9
SLIDE 9

Variable Length Arrays: stop it

[Figure: side-by-side comparison labeled “fixed-size array” and “variable length array”]
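A minimal sketch of the usual VLA removal pattern, not taken from the talk: size the buffer by a compile-time upper bound and reject anything larger. The bound MAX_CHUNK and the function sum_bytes_fixed() are hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical upper bound, chosen for illustration only. */
#define MAX_CHUNK 256

/*
 * Copy up to MAX_CHUNK bytes into a fixed-size stack buffer and sum them.
 * Returns -1 when the request exceeds the bound instead of growing the stack.
 */
long sum_bytes_fixed(const unsigned char *src, size_t len)
{
	unsigned char buf[MAX_CHUNK];	/* was: unsigned char buf[len]; */
	long sum = 0;
	size_t i;

	if (len > sizeof(buf))
		return -1;

	memcpy(buf, src, len);
	for (i = 0; i < len; i++)
		sum += buf[i];
	return sum;
}
```

Building with -Wvla should flag the commented-out form and stay quiet on this one.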

slide-10
SLIDE 10

Switch case fall-through: did I mean it?

  • CWE-484 “Omitted Break Statement in Switch”
  • Semantic weakness in C (“switch” is just assembly test/jump...)
  • Commit logs with “missing break statement”: 67

Did they mean to leave out “break;” ??

slide-11
SLIDE 11

Switch case fall-through: new “statement”

  • Use -Wimplicit-fallthrough to add a new switch “statement”
    – Actually a comment, but is parsed by compilers now, following the lead of static checkers
  • Mark all non-breaks with a “fall through” comment, for example https://git.kernel.org/linus/4597b62f7a60 (Gustavo A. R. Silva)
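A small sketch of what the marking looks like in practice. The function cleanup_steps() and its levels are made up for illustration; the point is only that every case that does not end in break carries an explicit comment, which GCC's -Wimplicit-fallthrough recognizes.

```c
/*
 * Hypothetical example: count how many cleanup steps run for a given level.
 * Higher levels intentionally run the steps of all lower levels too.
 */
int cleanup_steps(int level)
{
	int steps = 0;

	switch (level) {
	case 2:
		steps++;
		/* fall through */
	case 1:
		steps++;
		/* fall through */
	case 0:
		steps++;
		break;
	default:
		steps = -1;	/* unknown level */
		break;
	}
	return steps;
}
```

Deleting one of the comments should turn that case into a warning, which is the signal that somebody needs to decide whether the break was omitted on purpose.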

slide-12
SLIDE 12

Always-initialized local variables: just do it

  • CWE-200 “Information Exposure”, CWE-457 “Use of Uninitialized Variable”
  • gcc -finit-local-vars not upstream
  • Clang -fsanitize=init-local not upstream
  • CONFIG_GCC_PLUGIN_...
    – STRUCTLEAK (for structs with __user pointers)
    – STRUCTLEAK_BYREF (when passed into funcs)
    – Soon, plugin to mimic -finit-local-vars too

slide-13
SLIDE 13

Always-initialized local variables: switch gotcha

warning: statement will never be executed [-Wswitch-unreachable]
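The code that triggers this warning is not reproduced above, so the following is a guess at the shape of the problem rather than the slide's own example: a declaration with an initializer placed inside the switch body before the first case label is jumped over, so the initializer never runs. The function scaled() is hypothetical, and the problematic form is kept in a comment so the block compiles cleanly.

```c
/* Hypothetical function, for illustration only. */
int scaled(int kind, int value)
{
	/*
	 * Problematic form:
	 *
	 *	switch (kind) {
	 *		int factor = 0;	// never executed: control jumps past it
	 *	case 1:
	 *		...
	 *
	 * Hoisting the declaration above the switch gives the initializer
	 * a place to run.
	 */
	int factor = 0;

	switch (kind) {
	case 1:
		factor = 2;
		break;
	case 2:
		factor = 10;
		break;
	}
	return value * factor;
}
```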

slide-14
SLIDE 14

Arithmetic overflow detection: gcc?

  • gcc’s -fsanitize=signed-integer-overflow (CONFIG_UBSAN)
    – Only signed. Fast: in the noise. Big: warnings grow kernel image by 6% (aborts grow it by 0.1%)
  • But we can use explicit single-operation helpers. To quote Rasmus Villemoes:
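A minimal userspace sketch of an explicit single-operation helper, written for this transcript and not taken from the talk or from the quotation. It assumes a GCC or Clang toolchain that provides __builtin_mul_overflow(); the names size_mul_overflows() and alloc_array() are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Multiply two sizes, reporting overflow instead of silently wrapping.
 * Returns true on overflow; *result holds the product otherwise.
 */
bool size_mul_overflows(size_t a, size_t b, size_t *result)
{
	return __builtin_mul_overflow(a, b, result);
}

/* Hypothetical allocator wrapper: refuse requests whose size would wrap. */
void *alloc_array(size_t count, size_t elem_size)
{
	size_t bytes;

	if (size_mul_overflows(count, elem_size, &bytes))
		return NULL;
	return malloc(bytes);
}
```

The check costs one operation at the call site that matters, instead of instrumenting every arithmetic expression in the image.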

slide-15
SLIDE 15

Arithmetic overflow detection: Clang :)

  • Clang can do signed and unsigned instrumentation:
    -fsanitize=signed-integer-overflow
    -fsanitize=unsigned-integer-overflow
slide-16
SLIDE 16

Bounds checking: explicit checking is slow :(

  • Explicit checks for linear overflows of SLAB objects, stack, etc
    – copy_{to,from}_user() checking: <~1% performance hit
    – strcpy()-family checking: ~2% performance hit
    – memcpy()-family checking: ~1% performance hit
  • Can we get better APIs?
    – strcpy() is terrible
    – sprintf() is bad
    – memcpy() is weak

slide-17
SLIDE 17

Instead of strcpy(): strscpy()

  • strcpy() no bounds checking on destination nor source!
  • strncpy() doesn’t always NUL terminate (good for non-C-strings, does NUL pad destination)

    char dest[4];
    strncpy(dest, "ohai!", sizeof(dest)); /* unhelpfully returns dest */

    dest: “o”, “h”, “a”, “i” … no trailing NUL byte :(

  • strlcpy() reads source beyond max destination size (returns length of source!)
  • strscpy() safest (returns bytes copied, not including NUL, or -E2BIG)

    ssize_t count = strscpy(dest, "ohai!", sizeof(dest)); /* returns -E2BIG */

    dest: “o”, “h”, “a”, NUL

Does not NUL-pad destination … if desired, add explicit memset() (kernel needs a helper for this...)

    if (count > 0 && count + 1 < sizeof(dest))
        memset(dest + count + 1, 0, sizeof(dest) - count - 1);
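To make the return-value contract concrete, here is a userspace approximation of the strscpy() behavior described on this slide. It is a sketch only: the name my_strscpy() is hypothetical, and this is not the kernel's implementation.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>	/* ssize_t */

/*
 * Copy src into dest, always NUL-terminating when size > 0.
 * Returns the number of bytes copied (excluding the NUL),
 * or -E2BIG if src did not fit.
 */
ssize_t my_strscpy(char *dest, const char *src, size_t size)
{
	size_t i;

	if (size == 0)
		return -E2BIG;

	/* Copy at most size - 1 bytes, never reading src past its NUL. */
	for (i = 0; i < size - 1 && src[i] != '\0'; i++)
		dest[i] = src[i];
	dest[i] = '\0';

	/* Truncated if src still has bytes left. */
	return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}
```

With a 4-byte destination and the source "ohai!", this stores “o”, “h”, “a”, NUL and returns -E2BIG, matching the slide's example.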

slide-18
SLIDE 18

Instead of sprintf(): scnprintf()

  • sprintf() no bounds checking on destination!
  • snprintf() always NUL-terminates, but returns how much it WOULD have written :(

    int count = snprintf(buf, sizeof(buf), fmt…, …);
    for (i = 0; i < something; i++)
        count += snprintf(buf + count, sizeof(buf) - count, fmt…, …);
    copy_to_user(user, buf, count);

  • scnprintf() always NUL-terminates, returns count of bytes copied

Replace in above code!
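A userspace sketch of the difference, assuming only standard vsnprintf(). The wrapper my_scnprintf() and the caller join_numbers() are hypothetical names; the wrapper clamps the return value to what was actually stored, so the accumulated count can never run past the buffer.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Like snprintf(), but returns the number of bytes actually stored in buf
 * (excluding the NUL) rather than the length that would have been written.
 */
int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int ret;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	ret = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (ret < 0)
		return 0;
	if ((size_t)ret >= size)
		return (int)(size - 1);	/* output was truncated */
	return ret;
}

/* Append "N," for each value; count always stays below size. */
int join_numbers(char *buf, size_t size, const int *vals, int n)
{
	int count = 0;
	int i;

	for (i = 0; i < n; i++)
		count += my_scnprintf(buf + count, size - count, "%d,", vals[i]);
	return count;
}
```

With an 8-byte buffer and four two-digit values, the loop stops growing count at 7. The same loop written with plain snprintf() would push count beyond the buffer size, and the later copy would read out of bounds.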

slide-19
SLIDE 19

Instead of memcpy(): uhhh … be … careful?

  • memcpy() has no sense of destination size :(

    uint8_t bytes[128];
    size_t wanted, copied = 0;

    for (i = 0; i < something && copied < sizeof(bytes); i++) {
        wanted = ...;
        if (wanted > sizeof(bytes) - copied)
            wanted = sizeof(bytes) - copied;
        memcpy(bytes + copied, source, wanted);
        copied += wanted;
    }
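The clamping in the loop above can be pulled into one place. This is a sketch of such a helper, not an existing kernel API; bounded_append() and its parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy at most the space left in dest.
 * dest_size is the total size of dest, used is how much is already filled.
 * Returns the number of bytes actually copied.
 */
size_t bounded_append(uint8_t *dest, size_t dest_size, size_t used,
		      const uint8_t *src, size_t wanted)
{
	size_t room;

	if (used >= dest_size)
		return 0;

	room = dest_size - used;
	if (wanted > room)
		wanted = room;

	memcpy(dest + used, src, wanted);
	return wanted;
}
```

A caller then writes copied += bounded_append(bytes, sizeof(bytes), copied, source, wanted); and no longer repeats the size arithmetic by hand.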

slide-20
SLIDE 20

Bounds checking: memory tagging :)

  • Hardware memory tagging/coloring is much faster!
    – SPARC Application Data Integrity (ADI)
    – ARMv8.5 Memory Tagging Extension (MTE)
    – Intel?

    char *buf;
    buf = kmalloc(128, …);   /* 0x...5...beef0000 */
    buf[0x40] = 'A';         /* 0x...5...beef0040 */   Ok: pointer tag matches
    buf[0xa0] = 'A';         /* 0x...5...beef00a0 */   FAIL: pointer tag mismatch

Tag 5:
    @0x…….beef0000: first byte of 128 byte alloc
        … data ...
    @0x…….beef0040: … data ...
    @0x…….beef007f: last byte of tagged region
Tag 3:
    @0x…….beef0080: first byte of next alloc
        … data …
    @0x…….beef00a0: … data …
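The check itself happens in hardware, but its logic can be modeled in software. The toy below is an illustration only: the 16-byte granule, the 256-byte heap, and every name in it are assumptions, and real tagging hardware keeps the tag in unused pointer bits rather than in a struct.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE   16u		/* bytes covered by one tag (illustrative) */
#define HEAP_SIZE 256u

static uint8_t heap[HEAP_SIZE];
static uint8_t granule_tag[HEAP_SIZE / GRANULE];

/* A "tagged pointer" in this toy model: an offset plus the tag it carries. */
struct tagged_ptr {
	size_t  offset;
	uint8_t tag;
};

/*
 * Color [offset, offset + len) with tag and hand back a matching pointer.
 * offset and len are assumed to be multiples of GRANULE.
 */
struct tagged_ptr toy_alloc(size_t offset, size_t len, uint8_t tag)
{
	struct tagged_ptr p = { offset, tag };
	size_t g;

	for (g = offset / GRANULE; g < (offset + len) / GRANULE; g++)
		granule_tag[g] = tag;
	return p;
}

/* Store succeeds only when the pointer's tag matches the memory's tag. */
bool toy_store(struct tagged_ptr p, size_t index, uint8_t value)
{
	size_t addr = p.offset + index;

	if (addr >= HEAP_SIZE || granule_tag[addr / GRANULE] != p.tag)
		return false;
	heap[addr] = value;
	return true;
}
```

Mirroring the slide: a 128-byte allocation tagged 5 followed by a neighbor tagged 3 accepts a store at index 0x40 and rejects one at index 0xa0, because that address carries the neighbor's tag.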

slide-21
SLIDE 21

Control Flow Integrity: indirect calls

  • With memory W^X, gaining execution control needs to change function pointers saved in heap or stack, where all type information was lost!

heap:  saved_actions[] … my_action ...
stack: … return address ...

    int action_launch(int idx)
    {
        int (*action)(struct thing *);
        int rc;
        ...
        action = saved_actions[idx];
        rc = action(info);      /* forward edge */
        …
    }

    int my_action(struct thing *info)
    {
        stuff;
        and;
        things;
        …
        return 0;               /* backward edge */
    }

slide-22
SLIDE 22

CFI, forward edges: just call pointers :(

Ignore function prototype … Normally just a call to a memory address:

slide-23
SLIDE 23

CFI, forward edges: enforce prototype :)

Ignore function prototype … Clang -fsanitize=cfi will check at runtime:
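A runnable sketch of the indirect call from the action_launch() example, with hypothetical details filled in (the struct thing layout, the action_fn typedef, the extra info parameter, and the index check are all assumptions). The marked line is the forward edge: without CFI it is just a jump to whatever address the table holds, while a CFI-instrumented build would also verify that the target is a function with the int (*)(struct thing *) prototype.

```c
#include <stddef.h>

struct thing {
	int value;
};

typedef int (*action_fn)(struct thing *);

int my_action(struct thing *info)
{
	info->value += 1;
	return 0;
}

/* Function pointers saved in writable memory, as on the earlier slide. */
static action_fn saved_actions[] = { my_action };

int action_launch(size_t idx, struct thing *info)
{
	action_fn action;

	if (idx >= sizeof(saved_actions) / sizeof(saved_actions[0]))
		return -1;

	action = saved_actions[idx];
	return action(info);	/* forward edge: indirect call */
}
```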

slide-24
SLIDE 24

CFI, backward edges: two stacks

  • Clang’s Safe Stack
    – Clang: -fsanitize=safe-stack

regular stack: ..., all local variables, register spills, return address, ..., local variables, return address, ...
safe stack:    ..., safe variables, register spills, return address, ...
unsafe stack:  ..., buffers, by-referenced vars, etc, ...

slide-25
SLIDE 25

CFI, backward edges: shadow call stack

  • Clang’s Shadow Call Stack
    – Clang: -fsanitize=shadow-call-stack
    – Results in two stack registers: sp and unspilled x18

regular stack: ..., all local variables, register spills, return address, ..., local variables, return address, ...
x18 stack:     ..., return address, return address, ...
sp stack:      ..., all local variables, register spills, etc, ...

slide-26
SLIDE 26

CFI, backward edges: hardware support

  • Intel CET: hardware-based read-only shadow call stack
    – Implicit use of otherwise read-only shadow stack during call and ret instructions
  • ARM v8.3a Pointer Authentication (“signed return address”)
    – New instructions: paciasp and autiasp
    – Clang and gcc: -msign-return-address

slide-27
SLIDE 27

Where is the Linux kernel now?

  • Variable Length Arrays
    – Finally eradicated from kernel since v4.20 (Dec 2018)!
  • Explicit switch case fall-through
    – Steady progress on full markings (232 of 2311 remain)
  • Always-initialized automatic variables
    – Various partial options, needs more compiler work
  • Arithmetic overflow detection
    – Memory allocations now doing explicit overflow detection
    – Needs better kernel support for Clang and gcc
  • Bounds checking
    – Crying about performance hits
    – Waiting (im)patiently for hardware support
  • Control Flow Integrity: forward edges
    – Need Clang LTO support in kernel, but works on Android
  • Control Flow Integrity: backward edges
    – Shadow Call Stack works on Android
    – Waiting (im)patiently for hardware support

https://www.flickr.com/photos/wonderlane/5270711224

slide-28
SLIDE 28

Challenges in Kernel Security Development

Cultural: Conservatism, Responsibility, Sacrifice, Patience
Technical: Complexity, Innovation, Collaboration
Resource: Dedicated Developers, Reviewers, Testers, Backporters

slide-29
SLIDE 29

Thoughts?

Kees (“Case”) Cook
keescook@chromium.org
keescook@google.com
kees@outflux.net
https://outflux.net/slides/2019/lca/danger.pdf
http://www.openwall.com/lists/kernel-hardening/
http://kernsec.org/wiki/index.php/Kernel_Self_Protection_Project