SLIDE 1

Deterministic Fast User Space Synchronization

Alexander Züpke

alexander.zuepke@hs-rm.de

RheinMain University of Applied Sciences Wiesbaden, Germany

SLIDE 2

OSPERT 2013-07-09

  • A. Züpke

Overview

  • Futex Basics
  • Challenge: Futexes for Partitioning Systems
  • New Approach
      • Mutexes
      • Condition Variables
      • Locking of Wait Queues
      • Robustness
  • Future Work
  • Summary

SLIDE 3

Mutex State Transitions

  • Unlocked ↔ Locked
      • Fast path: use atomic ops
      • No system call involved!
  • Contention: first waiter
      • Atomically indicate pending waiters
      • System call: suspend caller
      • Kernel allocates a wait queue object
  • Contention: multiple waiters
      • Append to existing wait queue
      • Wait queue order depends on the queueing policy; sort if necessary

[Diagram: states unlocked ↔ locked ↔ locked w/ contention (1 waiter) ↔ locked w/ contention (2+ waiters)]

SLIDE 4

Futexes in Linux

  • Futex := 32-bit integer variable in user space
  • Atomic CAS or LL/SC operations in the fast path
  • Glibc provides:
      • Mutexes and Condition Variables
      • Semaphores, Reader-Writer Locks, Barriers, …
  • Linux kernel provides system calls to:
      • suspend the caller
      • wake a given number of waiters
  • First prototype in Linux kernel version 2.5.7
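
A minimal sketch of the fast path named on this slide, assuming C11 atomics. The state values FUTEX_UNLOCKED and FUTEX_LOCKED and both function names are illustrative assumptions, not glibc's actual encoding. When either CAS fails, a real implementation would fall back to the kernel's suspend and wake system calls.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state values; the slides do not fix an encoding here. */
enum { FUTEX_UNLOCKED = 0, FUTEX_LOCKED = 1 };

typedef _Atomic uint32_t futex_word_t;

/* Fast path: move unlocked -> locked with one CAS, no system call. */
bool mutex_trylock_fast(futex_word_t *f)
{
    uint32_t expected = FUTEX_UNLOCKED;
    return atomic_compare_exchange_strong_explicit(
        f, &expected, FUTEX_LOCKED,
        memory_order_acquire, memory_order_relaxed);
}

/* Fast path: move locked -> unlocked. Fails if the word no longer holds
 * FUTEX_LOCKED, e.g. because a waiter marked contention in it. */
bool mutex_unlock_fast(futex_word_t *f)
{
    uint32_t expected = FUTEX_LOCKED;
    return atomic_compare_exchange_strong_explicit(
        f, &expected, FUTEX_UNLOCKED,
        memory_order_release, memory_order_relaxed);
}
```

A second mutex_trylock_fast() on an already locked word fails without changing it, which is the point where the contention path would start.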

SLIDE 5

Futexes in Linux

  • Futex API

#include <linux/futex.h>
int futex(int *uaddr, int op, int val, const struct timespec *timeout,
          int *uaddr2, int val3);

  • Operations

FUTEX_WAIT      Suspend calling thread on futex uaddr
FUTEX_WAKE      Wake val threads waiting on futex uaddr
FUTEX_REQUEUE   Move threads waiting on uaddr to uaddr2

  • … more operations available → see FUTEX(2) man page
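
A hedged sketch of how the two basic operations can be invoked on Linux. glibc does not export a futex() function, so the call goes through syscall(2); the wrapper names sys_futex, futex_wait and futex_wake are assumptions made for this example.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Thin wrapper around the raw system call (hypothetical name). */
long sys_futex(uint32_t *uaddr, int op, uint32_t val,
               const struct timespec *timeout, uint32_t *uaddr2,
               uint32_t val3)
{
    return syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}

/* Suspend the caller only if *uaddr still equals expected. */
long futex_wait(uint32_t *uaddr, uint32_t expected)
{
    return sys_futex(uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* Wake up to n threads waiting on uaddr; returns the number woken. */
long futex_wake(uint32_t *uaddr, int n)
{
    return sys_futex(uaddr, FUTEX_WAKE, (uint32_t)n, NULL, NULL, 0);
}
```

With no waiter present, futex_wake() returns 0. futex_wait() with an expected value that does not match the futex word returns -1 with errno set to EAGAIN instead of sleeping, which closes the race between the user space check and the suspension.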

SLIDE 6

Motivation

  • Linux Implementation
      • Requires system calls only on contention
      • Supports an arbitrary number of futexes
      • No kernel resources required until suspension
      • Also supports PI mutexes & condition variables
  • Futexes are really nice … for Un*x Kernels

SLIDE 7

Motivation

  • Linux Implementation (bullets as on slide 6)
  • But:
      • Can we use futexes in partitioned environments?
      • For highly safety-critical systems?
      • Kernels without a SLAB allocator?

SLIDE 8

Motivation

  • Define "Partitioning"
      • Space and time partitioning
      • Isolated (groups of) processes
      • Kernel resources are partitioned

[Diagram: Partition A and Partition B share a futex in SHM; legend: a thread]

SLIDE 9

Motivation

  • Define "Partitioning" (bullets as on slide 8)

[Diagram: Partition A and Partition B share a futex in SHM; two "lock" operations target the futex]

SLIDE 10

Motivation

  • Define "Partitioning" (bullets as on slide 8)
  • Problem
      • Q: Does the wait queue belong to Partition A or Partition B?
      • Pre-allocated wait queues?
          • Too pessimistic!

[Diagram: Partition A, Partition B, futex in SHM, two "lock" operations; a kernel object ("Kernel Obj") with question marks towards both partitions]

SLIDE 11

Motivation

  • Define "Partitioning" (bullets as on slide 8)
  • Problem (bullets as on slide 10)
  • Idea: get rid of the kernel object!

[Diagram: as on slide 10]

SLIDE 12

Motivation

  • Get rid of the kernel object!
  • The Linux futex implementation uses:
      • Array of futex hash entries
          • lock
          • list head of in-kernel objects
      • In-kernel object
          • list node in futex hash
          • key (futex address)
          • wait queue
          • lock pointer

SLIDE 13

Motivation

  • Get rid of the kernel object! (bullets as on slide 12)

[Annotation on the in-kernel object: put into TCB (thread control block)]

SLIDE 14

Requirements

  • Identify correct wait queue
      • Use thread ID of the first waiter
      • Put thread ID into user space, next to futex
  • Wait queue implementation in linear space
      • A priority-sorted wait queue would be nice
  • Locking of the wait queue
      • Assume a single kernel lock for now → more on that later

[Diagram: futex word with the thread ID of the 1st waiter next to it]

SLIDE 15

Requirements

  • Algorithms need bounded WCET (worst-case execution time)
      • Depends on # of waiters
      • # of waiters probably not known in advance → tricky across partition boundaries
  • Wait Queues
      • Doubly-linked lists are O(1) ... except searching
      • Sorted wait queues with O(log n) are acceptable if the upper bound of O(log n) is known
      • O(n) is only acceptable if n is bounded
      • Pick FIFO-ordered doubly-linked list for now
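
A minimal sketch of such a FIFO-ordered doubly-linked wait queue, assuming the links are embedded in the thread control block so that no separate kernel object has to be allocated. The struct layout and the names wq_append and wq_remove are hypothetical. The list is circular, so the tail is reachable from the head in one step and both operations are O(1).

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical TCB excerpt: the wait queue links live in the thread itself. */
struct tcb {
    uint32_t tid;
    struct tcb *next, *prev;   /* circular; self-linked when alone */
};

/* Append t at the tail in O(1). head is the first waiter (or NULL);
 * its prev pointer is the tail. Returns the head of the queue. */
struct tcb *wq_append(struct tcb *head, struct tcb *t)
{
    if (head == NULL) {
        t->next = t->prev = t;
        return t;
    }
    t->next = head;
    t->prev = head->prev;
    head->prev->next = t;
    head->prev = t;
    return head;
}

/* Unlink t in O(1), no search needed. Returns the new head or NULL. */
struct tcb *wq_remove(struct tcb *head, struct tcb *t)
{
    struct tcb *new_head = (head == t) ? t->next : head;
    if (t->next == t)
        new_head = NULL;       /* t was the only waiter */
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->next = t->prev = NULL;
    return new_head;
}
```

Removing an arbitrary waiter, for example on a timeout, needs no traversal because the caller already holds the TCB pointer.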

SLIDE 16

Mutex Protocol

  • Example
      • 2 processes
      • 3 threads
      • Futex in shared memory
      • Mutex protocol
  • Symbols
      • T: lock holder's thread ID
      • W: bit indicating non-empty wait queue
      • Q: thread ID of first waiting thread

[Diagram: Process A and Process B with threads a, b, c; futex in SHM; legend: a thread]

Futex Encoding: < T | W > Q
(T = lock holder ID, W = waiters bit, Q = wait queue)
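
One possible way to lay out this encoding in C, as a sketch only. The slides name the fields T, W and Q but not their bit positions, so placing W in bit 0 and T in the remaining 31 bits, and using thread ID 0 for "no holder", are assumptions of this example.

```c
#include <stdint.h>

/* Assumed layout: bit 0 = W, bits 31..1 = T. Thread ID 0 means "no holder". */
#define FUTEX_W_BIT 0x1u

struct futex_mutex {
    uint32_t tw;   /* < T | W >: lock holder ID and waiters bit */
    uint32_t q;    /* Q: thread ID of the first waiter, next to the futex */
};

/* Build a < T | W > word from a thread ID and a waiters flag. */
uint32_t tw_pack(uint32_t tid, int waiters)
{
    return (tid << 1) | (waiters ? FUTEX_W_BIT : 0u);
}

/* Extract T, the lock holder's thread ID. */
uint32_t tw_holder(uint32_t tw)
{
    return tw >> 1;
}

/* Extract W, nonzero if the wait queue is non-empty. */
int tw_has_waiters(uint32_t tw)
{
    return (tw & FUTEX_W_BIT) != 0;
}
```

With this layout the unlocked state < 0 | 0 > is the all-zero word, so a zero-initialised futex starts out unlocked.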

SLIDE 17

Mutex Protocol

  • Sequence
      0. Initial state: mutex unlocked
      1. Yellow tries to lock & succeeds
      2. Blue tries & sets W & suspends
      3. Green tries & suspends
      4. Yellow unlocks & wakes
      5. Blue becomes owner
      6. Blue unlocks & wakes
      7. Green becomes owner
      8. Green unlocks → mutex unlocked

[Diagram: threads a, b, c; futex = 0 | 0]
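
The sequence above can be replayed with a small single-threaded model, given here as a sketch under stated assumptions. It is not the author's implementation: the bit layout (W in bit 0), the array-based queue links standing in for TCB fields, and all function names are hypothetical. Real code would use atomic operations on the futex word, and this model hands ownership to the woken thread inside unlock, while the slides show the hand-over as a separate step.

```c
#include <stdint.h>

#define W_BIT   1u     /* assumed: waiters bit in bit 0, T in bits 31..1 */
#define MAX_TID 8u     /* assumed upper bound on thread IDs */

struct futex { uint32_t tw; uint32_t q; };   /* < T | W > and Q */

/* Simulated per-thread queue links (would live in each TCB). */
static uint32_t nxt[MAX_TID], prv[MAX_TID];

/* "Kernel": append tid to the FIFO wait queue named by Q, in O(1). */
void k_enqueue(struct futex *f, uint32_t tid)
{
    if (f->q == 0) {
        nxt[tid] = prv[tid] = tid;
        f->q = tid;
        return;
    }
    uint32_t head = f->q, tail = prv[head];
    nxt[tid] = head;
    prv[tid] = tail;
    nxt[tail] = tid;
    prv[head] = tid;
}

/* "Kernel": remove and return the first waiter, update Q, in O(1). */
uint32_t k_dequeue_first(struct futex *f)
{
    uint32_t first = f->q;
    if (nxt[first] == first) {
        f->q = 0;
    } else {
        f->q = nxt[first];
        nxt[prv[first]] = nxt[first];
        prv[nxt[first]] = prv[first];
    }
    return first;
}

/* Returns 1 if tid acquired the mutex, 0 if it was suspended. */
int mutex_lock(struct futex *f, uint32_t tid)
{
    if (f->tw == 0) {            /* fast path: unlocked -> locked */
        f->tw = tid << 1;
        return 1;
    }
    f->tw |= W_BIT;              /* indicate pending waiters */
    k_enqueue(f, tid);
    return 0;
}

/* Returns the thread ID of the new owner, or 0 if none was waiting. */
uint32_t mutex_unlock(struct futex *f, uint32_t tid)
{
    if (f->tw == (tid << 1)) {   /* fast path: no waiters */
        f->tw = 0;
        return 0;
    }
    uint32_t next = k_dequeue_first(f);
    f->tw = (next << 1) | (f->q ? W_BIT : 0u);
    return next;
}
```

With a = 1, b = 2 and c = 3, the calls lock(a), lock(b), lock(c), unlock(a), unlock(b), unlock(c) take the futex through a | 0, a | W, b | W, c | 0 and back to 0 | 0.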

SLIDE 18

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = 0 | 0; a calls lock]

SLIDE 19

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | 0; a is lock holder]

SLIDE 20

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | 0; a is lock holder; b calls lock]

SLIDE 21

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W; a is lock holder; b calls lock]

SLIDE 22

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W, Q = b; wait queue: b; a is lock holder]

SLIDE 23

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W, Q = b; wait queue: b; a is lock holder; c calls lock]

SLIDE 24

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W, Q = b; wait queue: b, c; a is lock holder]

SLIDE 25

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W, Q = b; wait queue: b, c; a calls unlock]

SLIDE 26

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = a | W, Q = c; wait queue: c; b has been woken]

SLIDE 27

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = b | W, Q = c; wait queue: c; b is lock holder]

SLIDE 28

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = b | W, Q = c; wait queue: c; b calls unlock]

SLIDE 29

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = b | W; no wait queue shown; c has been woken]

SLIDE 30

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = c | 0; c is lock holder]

SLIDE 31

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = c | 0; c calls unlock]

SLIDE 32

Mutex Protocol

  • Sequence (steps as on slide 17)

[Diagram: futex = 0 | 0 after c's unlock]

SLIDE 33

Condition Variables

  • Condition Variables have a supporting Mutex
  • CVs also use futexes
      • Futex value: atomic counter
      • Wait queue: maintains waiting threads
  • cond_wait()
      • Releases mutex (caller of cond_wait() holds mutex)
      • Calls kernel to suspend on futex
      • On return: caller is owner of the mutex again

SLIDE 34

Condition Variables

  • cond_signal()
      • Atomically increment futex value
      • Call kernel to move first waiter from Condition Variable wait queue to Mutex wait queue
      • Can be done in O(1) using doubly-linked lists

[Diagram: Cond and Mutex futexes, each with its wait queue; the first waiter moves from the Cond wait queue to the Mutex wait queue]

SLIDE 35

Condition Variables

  • cond_broadcast()
      • Atomically increment futex value
      • Call kernel to move all waiters from Condition Variable wait queue to Mutex wait queue
      • Can be done in O(1) using doubly-linked lists

[Diagram: Cond and Mutex futexes, each with its wait queue; all waiters move from the Cond wait queue to the Mutex wait queue]
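
Why moving all waiters can be O(1): with circular doubly-linked lists, a whole queue can be spliced onto the tail of another by rewriting four pointers, independent of the number of waiters. The following is a sketch with hypothetical names (struct waiter, wq_splice), not code from the presentation.

```c
#include <stddef.h>

/* Hypothetical queue node; in the presented design it would be part of the TCB. */
struct waiter {
    struct waiter *next, *prev;   /* circular; head->prev is the tail */
};

/* Append the whole queue src behind the tail of dst in O(1).
 * Either argument may be NULL (empty queue). Returns the resulting head. */
struct waiter *wq_splice(struct waiter *dst, struct waiter *src)
{
    if (src == NULL)
        return dst;
    if (dst == NULL)
        return src;

    struct waiter *dst_tail = dst->prev;
    struct waiter *src_tail = src->prev;

    dst_tail->next = src;         /* old mutex tail -> first cond waiter */
    src->prev = dst_tail;
    src_tail->next = dst;         /* last cond waiter -> mutex head */
    dst->prev = src_tail;
    return dst;
}
```

For cond_broadcast(), dst would be the mutex wait queue and src the condition variable wait queue; afterwards the condition variable's first-waiter ID would be reset to "none".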

SLIDE 36

Wait Queue Locking

  • Wait queues are maintained by the kernel → need proper locking in the kernel
  • Futex scope specific approaches:
      • Single address space
          • Possibly use an existing per-address-space lock
      • Single partition
          • Use an existing per-partition lock
      • Across partitions
          • Use a system-wide lock or a global kernel lock
  • Can use existing locks or introduce new ones

SLIDE 37

Wait Queue Locking

  • Hashed address approach
      • futex address → hash() → select lock in an array
          • Single address space → virtual address
          • Multiple address spaces → physical address
      • The lock array needs to be pre-allocated
  • Both approaches should be combined
      • Scope approach ensures proper timing
      • Hashing for scalability
  • Also check partition privileges!
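
A sketch of the hashed address approach, assuming a statically pre-allocated array of 64 locks and a multiplicative hash. The array size, the hash constant, the placeholder lock type struct klock and the function names are all assumptions of this example; the slides only state that the futex address is hashed to select a lock.

```c
#include <stdint.h>

#define FUTEX_LOCK_BITS  6u                        /* assumed array size: 64 */
#define FUTEX_LOCK_COUNT (1u << FUTEX_LOCK_BITS)

/* Placeholder for whatever spinlock type the kernel provides. */
struct klock { int held; };

/* Pre-allocated at build time, so no allocator is needed at run time. */
struct klock futex_locks[FUTEX_LOCK_COUNT];

/* Map a futex address to an index in [0, FUTEX_LOCK_COUNT).
 * addr is a virtual address for futexes private to one address space
 * and a physical address for futexes shared across address spaces. */
uint32_t futex_hash(uintptr_t addr)
{
    uint32_t x = (uint32_t)(addr >> 2);   /* futex words are 4-byte aligned */
    x *= 2654435761u;                     /* multiplicative hashing */
    return x >> (32u - FUTEX_LOCK_BITS);  /* keep the top bits */
}

/* Select the lock that protects the wait queue of this futex. */
struct klock *futex_lock_for(uintptr_t addr)
{
    return &futex_locks[futex_hash(addr)];
}
```

Two futexes may hash to the same lock; that costs some parallelism but not correctness, since the lock only serialises wait queue updates.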

SLIDE 38

Robustness

  • What happens if the user manipulates the thread ID of the first waiter in user space?
      • ID set to zero or an invalid value → no waiters found
          • But the kernel can still remove waiters safely from the wait queue
      • ID of a thread waiting on another futex
          • Sanity checks apply → no waiter woken up
      • …
  • Errors same as a thread never unlocking a mutex → futex users have to trust each other
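
The sanity checks could look roughly like the following sketch. The thread table, its size, the TCB fields and the function name are hypothetical; the slides only state that a zero or invalid ID finds no waiters and that a thread waiting on another futex is not woken.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16u   /* assumed size of the kernel's thread table */

/* Hypothetical TCB excerpt: is the thread waiting, and on which futex? */
struct tcb {
    int       waiting;
    uintptr_t futex_addr;
};

struct tcb threads[MAX_THREADS];   /* index = thread ID, ID 0 unused */

/* Validate the user-supplied first-waiter ID q for the futex at futex_addr.
 * Returns the waiter's TCB, or NULL if any check fails. */
struct tcb *first_waiter_checked(uint32_t q, uintptr_t futex_addr)
{
    if (q == 0 || q >= MAX_THREADS)
        return NULL;                  /* zero or invalid ID: no waiters found */

    struct tcb *t = &threads[q];
    if (!t->waiting || t->futex_addr != futex_addr)
        return NULL;                  /* not waiting, or waiting elsewhere */

    return t;
}
```

A NULL result means the wake request does nothing, so a corrupted ID can only harm the threads that share the futex, in line with the trust statement on the slide.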

SLIDE 39

Future Work

  • Sorted wait queues
      • For priority inheritance (PI) or other priority inversion protocols
      • E.g. sorted by priority, deadline, …
  • Other scheduling algorithms
      • Using dynamic priorities like EDF
      • For mixed-criticality systems

SLIDE 40

Summary

  • Implemented features:
      • Pthread mutexes in different flavours
          • Error Checking
          • Recursive
      • Pthread condition variables
      • Pthread rwlocks, barriers, pthread_once()
      • POSIX semaphores
      • Waiting with relative and absolute timeouts
      • Both "private" and "shared" futexes

SLIDE 41

Summary

  • Mutexes and Condition Variables with FIFO ordering
      • No in-kernel memory allocator required!
      • Linear kernel memory usage
      • Using linked lists, all operations are in O(1) time
      • When adding "wake arbitrary # of waiters" and not just migrating queues, we get the same flexibility as in Linux → unfortunately this needs O(n) time

SLIDE 42

Thank You! Any Questions?