Deterministic Fast User Space Synchronization
Alexander Züpke (alexander.zuepke@hs-rm.de)
RheinMain University of Applied Sciences, Wiesbaden, Germany
OSPERT, 2013-07-09
Overview
Futex Basics
Challenge: Futexes for Partitioning Systems
New Approach
  Mutexes
  Condition Variables
  Locking of Wait Queues
  Robustness
Future Work
Summary
Mutex State Transitions
Unlocked ↔ Locked
  Fast path: use atomic operations
  No system call involved!
Contention: first waiter
  Atomically indicate pending waiters
  System call: suspend the caller
  Kernel allocates a wait queue object
Contention: multiple waiters
  Append to the existing wait queue
  Wait queue order depends on the implementation; sort if necessary
[State diagram: unlocked, locked, locked with contention (1 waiter), locked with contention (2+ waiters)]
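The user-space half of these transitions can be sketched with C11 atomics. This is a minimal sketch and not the implementation from the talk: the state values 0/1/2 and the names `mutex_trylock_fast`, `mutex_mark_contended` and `mutex_unlock_fast` are assumptions made for illustration, and the system calls that would suspend or wake a thread are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical state encoding, chosen for illustration only. */
enum { UNLOCKED = 0, LOCKED = 1, LOCKED_WAITERS = 2 };

/* Fast path: unlocked -> locked with one CAS, no system call. */
static bool mutex_trylock_fast(_Atomic uint32_t *m)
{
    uint32_t expected = UNLOCKED;
    return atomic_compare_exchange_strong(m, &expected, LOCKED);
}

/* Contention: atomically indicate pending waiters.
 * Returns true if the mutex was held, so the caller should now ask the
 * kernel to suspend it. Returns false if the mutex was free, in which case
 * the caller has just acquired it (conservatively marked as contended). */
static bool mutex_mark_contended(_Atomic uint32_t *m)
{
    return atomic_exchange(m, LOCKED_WAITERS) != UNLOCKED;
}

/* Unlock: returns true if waiters may exist, so the kernel must be asked
 * to wake one of them. */
static bool mutex_unlock_fast(_Atomic uint32_t *m)
{
    return atomic_exchange(m, UNLOCKED) == LOCKED_WAITERS;
}
```

In the uncontended case only `mutex_trylock_fast` and `mutex_unlock_fast` run, which is why no system call is needed.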
Futexes in Linux
Futex := 32-bit integer variable in user space
Atomic CAS or LL/SC operations in the fast path
Glibc provides:
  Mutexes and Condition Variables
  Semaphores, Reader-Writer Locks, Barriers, …
Linux kernel provides system calls to:
  suspend the caller
  wake a given number of waiters
First prototype in Linux kernel version 2.5.7
Futex API
  #include <linux/futex.h>
  int futex(int *uaddr, int op, int val, const struct timespec *timeout,
            int *uaddr2, int val3);
Operations
  FUTEX_WAIT: suspend the calling thread on futex uaddr
  FUTEX_WAKE: wake at most val threads waiting on futex uaddr
  FUTEX_REQUEUE: move threads waiting on uaddr to uaddr2
  … more operations available → see the FUTEX(2) man page
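A small single-threaded probe of FUTEX_WAIT and FUTEX_WAKE on Linux might look as follows. It is a sketch under assumptions: glibc does not export a `futex()` function, so the call goes through `syscall(SYS_futex, ...)`, and the helper names `futex_call`, `wait_with_stale_value` and `wake_without_waiters` are made up for this example.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Thin wrapper (hypothetical name): glibc has no futex() function, so the
 * system call is issued directly. Timeout, uaddr2 and val3 are unused here. */
static long futex_call(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* FUTEX_WAIT suspends only if *uaddr still equals val. With a stale
 * expected value it fails immediately. Returns the errno that was set. */
static int wait_with_stale_value(void)
{
    uint32_t word = 1;
    errno = 0;
    long rc = futex_call(&word, FUTEX_WAIT, 0); /* expects 0, word is 1 */
    assert(rc == -1);
    return errno;
}

/* FUTEX_WAKE returns the number of threads that were actually woken. */
static long wake_without_waiters(void)
{
    uint32_t word = 0;
    return futex_call(&word, FUTEX_WAKE, 1);
}
```

The value check in FUTEX_WAIT is what closes the race between the user-space test of the futex word and the suspension in the kernel.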
Motivation
Linux Implementation
  Requires system calls only on contention
  Supports an arbitrary number of futexes
  No kernel resources required until suspension
  Also supports PI mutexes & condition variables
Futexes are really nice … for Un*x kernels
But:
  Can we use futexes in partitioned environments?
  For highly safety-critical systems?
  In kernels without a SLAB allocator?
Define "Partitioning"
  Space and time partitioning
  Isolated (groups of) processes
  Kernel resources are partitioned
Problem
  Q: Does the wait queue belong to Partition A or Partition B?
  Pre-allocated wait queues? Too pessimistic!
  Idea: get rid of the kernel object!
[Figure: threads in Partition A and Partition B both lock a futex in shared memory (SHM); it is unclear which partition owns the kernel object]
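Get rid of the kernel object!
The Linux futex implementation uses:
  An array of futex hash entries, each holding a lock and a list head of in-kernel objects
  In-kernel objects, each holding a list node in the futex hash, a key (the futex address), a wait queue and a lock pointer
Idea: put the in-kernel object into the thread control block (TCB)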
Requirements
Identify the correct wait queue
  Use the thread ID of the first waiter
  Put the thread ID into user space, next to the futex
Wait queue implementation in linear space
  A priority-sorted wait queue would be nice
Locking of the wait queue
  Assume a single kernel lock for now
  → more on that later
[Figure: the futex word in user space, with the thread ID of the 1st waiter stored next to it]
Algorithms need bounded WCET
  Depends on the number of waiters
  The number of waiters is probably not known in advance
  → tricky across partition boundaries
Wait Queues
  Doubly-linked lists are O(1) ... except for searching
  Sorted wait queues with O(log n) are acceptable if the upper bound of O(log n) is known
  O(n) is only acceptable if n is bounded
  Pick a FIFO-ordered doubly-linked list for now
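A FIFO wait queue whose links live inside the TCBs needs no allocator and gives O(1) append and removal. The following is a minimal sketch under assumptions, not the kernel code from the talk: the `struct tcb` layout and the `wq_*` names are hypothetical, and the first waiter doubles as the queue head so that its thread ID is enough to find the queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical TCB: only the fields needed for the wait queue. */
struct tcb {
    unsigned id;
    struct tcb *prev, *next;   /* wait queue links embedded in the TCB */
};

/* A queue with a single waiter: a circular list of length one. */
static void wq_init(struct tcb *first)
{
    first->prev = first->next = first;
}

/* Append t at the tail (FIFO order). O(1). */
static void wq_append(struct tcb *first, struct tcb *t)
{
    t->prev = first->prev;
    t->next = first;
    first->prev->next = t;
    first->prev = t;
}

/* Remove t from the queue headed by first. O(1).
 * Returns the new first waiter, or NULL if the queue became empty. */
static struct tcb *wq_remove(struct tcb *first, struct tcb *t)
{
    struct tcb *new_first = (t == first) ? t->next : first;
    t->prev->next = t->next;
    t->next->prev = t->prev;
    t->prev = t->next = t;
    return (new_first == t) ? NULL : new_first;
}
```

Only searching or sorting such a list would cost O(n), which is why plain FIFO order keeps every operation bounded.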
Mutex Protocol
Example
  2 processes
  3 threads
  Futex in shared memory
  Mutex protocol
Symbols
  T: lock holder's thread ID
  W: bit indicating a non-empty wait queue
  Q: thread ID of the first waiting thread
Futex encoding: < T | W > (lock holder ID and waiters bit), followed by Q (wait queue)
[Figure: threads a, b and c in Process A and Process B share a futex in shared memory (SHM)]
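The encoding can be sketched as two adjacent 32-bit words. The slides do not give the bit layout, so this sketch assumes that W is the top bit of the first word and that thread IDs are non-zero 31-bit values; `struct mutex_futex` and all function names are hypothetical. Only the user-space fast paths are shown; suspending and waking would be done by the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define W_BIT 0x80000000u   /* assumed position of the waiters bit */

struct mutex_futex {
    _Atomic uint32_t tw;    /* <T | W>: lock holder ID and waiters bit */
    uint32_t q;             /* Q: ID of the first waiter, kept by the kernel */
};

static uint32_t holder(uint32_t tw)      { return tw & ~W_BIT; }
static bool     has_waiters(uint32_t tw) { return (tw & W_BIT) != 0; }

/* Fast path lock: 0|0 -> self|0. */
static bool lock_fast(struct mutex_futex *m, uint32_t self)
{
    uint32_t expected = 0;
    assert(self != 0 && (self & W_BIT) == 0);
    return atomic_compare_exchange_strong(&m->tw, &expected, self);
}

/* Contended lock: set W so that the holder must enter the kernel on unlock.
 * Returns true if the caller should now ask the kernel to suspend it,
 * false if the mutex was released meanwhile and the caller should retry. */
static bool set_waiters_bit(struct mutex_futex *m)
{
    uint32_t old = atomic_load(&m->tw);
    while (old != 0) {
        if (atomic_compare_exchange_weak(&m->tw, &old, old | W_BIT))
            return true;
    }
    return false;
}

/* Fast path unlock: self|0 -> 0|0. Fails if W is set, in which case the
 * kernel has to wake the first waiter Q. */
static bool unlock_fast(struct mutex_futex *m, uint32_t self)
{
    uint32_t expected = self;
    return atomic_compare_exchange_strong(&m->tw, &expected, 0);
}
```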
Sequence
  0. initial state: mutex unlocked
  1. yellow tries to lock & succeeds
  2. blue tries & sets W & suspends
  3. green tries & suspends
  4. yellow unlocks & wakes
  5. blue becomes owner
  6. blue unlocks & wakes
  7. green becomes owner
  8. green unlocks → mutex unlocked
State after each step (the figure labels the threads a, b and c; following the sequence, a is yellow, b is blue and c is green):
  Step   < T | W >   Q   Wait queue
  0      0 | 0       -   empty
  1      a | 0       -   empty
  2      a | W       b   b
  3      a | W       b   b, c
  4      a | W       c   c
  5      b | W       c   c
  6      b | W       -   empty
  7      c | 0       -   empty
  8      0 | 0       -   empty
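The sequence can be replayed with a small single-threaded model that plays both the user-space and the kernel role. This is a sketch under assumptions: the W bit position, the array-based queue and the names `struct model`, `model_lock` and `model_unlock` are invented for illustration, and the model merges "unlocks & wakes" with "becomes owner" into one step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define W_BIT 0x80000000u   /* assumed position of the waiters bit */
#define MAX_WAITERS 8

/* Model state: futex word <T | W>, Q, and the FIFO wait queue. A plain
 * array keeps the model short; it is not the linked list of a real kernel. */
struct model {
    uint32_t tw;
    uint32_t q;
    uint32_t queue[MAX_WAITERS];
    int n;
};

static void refresh_q(struct model *m)
{
    m->q = m->n ? m->queue[0] : 0;
}

/* Returns true if tid got the lock, false if it was queued (suspended). */
static bool model_lock(struct model *m, uint32_t tid)
{
    if (m->tw == 0) {               /* fast path: 0|0 -> tid|0 */
        m->tw = tid;
        return true;
    }
    assert(m->n < MAX_WAITERS);
    m->tw |= W_BIT;                 /* indicate pending waiters */
    m->queue[m->n++] = tid;         /* kernel appends in FIFO order */
    refresh_q(m);
    return false;
}

/* Returns the ID of the woken thread (the new owner), or 0 if none. */
static uint32_t model_unlock(struct model *m, uint32_t tid)
{
    assert((m->tw & ~W_BIT) == tid);
    if (!(m->tw & W_BIT)) {         /* fast path: tid|0 -> 0|0 */
        m->tw = 0;
        return 0;
    }
    uint32_t next = m->queue[0];    /* kernel wakes the first waiter */
    for (int i = 1; i < m->n; i++)
        m->queue[i - 1] = m->queue[i];
    m->n--;
    refresh_q(m);
    m->tw = next | (m->n ? W_BIT : 0);
    return next;
}
```

Replaying lock(a), lock(b), lock(c), unlock(a), unlock(b), unlock(c) with a = 1, b = 2, c = 3 walks through the states a|0, a|W, b|W, c|0 and back to 0|0.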
Condition Variables
Condition variables have a supporting mutex
CVs also use futexes
  Futex value: atomic counter
  Wait queue: maintains the waiting threads
cond_wait()
  Releases the mutex (the caller of cond_wait() holds the mutex)
  Calls the kernel to suspend on the futex
  On return: the caller is the owner of the mutex again
cond_signal()
  Atomically increment the futex value
  Call the kernel to move the first waiter from the condition variable wait queue to the mutex wait queue
  Can be done in O(1) using doubly-linked lists
[Figure: the first waiter moves from the condition variable's wait queue to the mutex's wait queue]
cond_broadcast()
  Atomically increment the futex value
  Call the kernel to move all waiters from the condition variable wait queue to the mutex wait queue
  Can be done in O(1) using doubly-linked lists
[Figure: all waiters move from the condition variable's wait queue to the mutex's wait queue]
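Both queue migrations are constant-time pointer updates on circular doubly-linked lists. The sketch below uses sentinel list heads to keep the code short; that is an assumption made for illustration (the talk identifies a queue through the first waiter's thread ID), and `struct node`, `move_first` and `splice_all` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

struct node { struct node *prev, *next; };

static void list_init(struct node *h)   { h->prev = h->next = h; }
static bool list_empty(const struct node *h) { return h->next == h; }

/* Append n at the tail of the list headed by h. O(1). */
static void list_append(struct node *h, struct node *n)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* cond_signal(): move the first waiter of src to the tail of dst. O(1). */
static void move_first(struct node *src, struct node *dst)
{
    if (list_empty(src))
        return;
    struct node *n = src->next;
    n->prev->next = n->next;
    n->next->prev = n->prev;
    list_append(dst, n);
}

/* cond_broadcast(): splice all waiters of src onto the tail of dst. O(1),
 * independent of the number of waiters. */
static void splice_all(struct node *src, struct node *dst)
{
    if (list_empty(src))
        return;
    struct node *first = src->next, *last = src->prev;
    first->prev = dst->prev;
    dst->prev->next = first;
    last->next = dst;
    dst->prev = last;
    list_init(src);
}
```

Because the moved threads stay suspended and are only relinked, no loop over the waiters is needed.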
Wait Queue Locking
Wait queues are maintained by the kernel
  → need proper locking in the kernel
Futex scope specific approaches:
  Single address space: possibly use an existing per-address-space lock
  Single partition: use an existing per-partition lock
  Across partitions: use a system-wide lock or a global kernel lock
Can use existing locks or introduce new ones
Hashed address approach
  futex address → hash() → select lock in an array
  Single address space → virtual address
  Multiple address spaces → physical address
  The lock array needs to be pre-allocated
Both approaches should be combined
  Scope approach ensures proper timing
  Hashing for scalability
Also check partition privileges!
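Selecting a lock by hashed futex address could look like the sketch below. The array size, the multiplicative hash constant, `struct klock` and the function names are assumptions for illustration; the slides only state that the address is hashed into a pre-allocated lock array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LOCK_BITS 6
#define NUM_LOCKS (1u << LOCK_BITS)   /* pre-allocated, power of two */

/* Placeholder for a real kernel lock type. */
struct klock { int locked; };

static struct klock lock_array[NUM_LOCKS];

/* Hash a futex address (virtual or physical, depending on the scope).
 * Futex words are 4-byte aligned, so the two low bits carry no information.
 * The multiplier is a common multiplicative-hash constant. */
static size_t futex_hash(uintptr_t addr)
{
    uint32_t h = (uint32_t)(addr >> 2) * 2654435761u;
    return h >> (32 - LOCK_BITS);
}

/* The same address always maps to the same lock. */
static struct klock *select_lock(uintptr_t addr)
{
    size_t i = futex_hash(addr);
    assert(i < NUM_LOCKS);
    return &lock_array[i];
}
```

Different futexes may share a lock, which costs some parallelism but keeps the memory for locks fixed.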
Robustness
What happens if the user manipulates the thread ID of the first waiter in user space?
  ID set to zero or an invalid value
    → no waiters found
    But the kernel can still remove waiters safely from the wait queue
  ID of a thread waiting on another futex
    Sanity checks apply → no waiter woken up
  …
Errors are the same as a thread never unlocking a mutex
  → Futex users have to trust each other
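The kernel-side sanity checks might be sketched as follows. Everything here is an assumption for illustration: a static TCB table indexed by thread ID, a `waiting` flag and a recorded `futex_addr` per thread, and the function name `first_waiter`. The point is only that a manipulated ID leads to "no waiter found" and never to a wild pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Hypothetical TCB fields used for validation. */
struct tcb {
    bool waiting;           /* thread is suspended on some futex */
    uintptr_t futex_addr;   /* futex the thread is waiting on */
};

/* Static table: thread ID n maps to threads[n - 1]; ID 0 means "none". */
static struct tcb threads[MAX_THREADS];

/* Validate the first-waiter ID read from user space.
 * Returns the waiter's TCB, or NULL if the ID fails a sanity check. */
static struct tcb *first_waiter(uint32_t user_q, uintptr_t futex_addr)
{
    if (user_q == 0 || user_q > MAX_THREADS)
        return NULL;                    /* zero or invalid ID */
    struct tcb *t = &threads[user_q - 1];
    if (!t->waiting)
        return NULL;                    /* thread is not waiting at all */
    if (t->futex_addr != futex_addr)
        return NULL;                    /* waiting on another futex */
    return t;
}
```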
Future Work
Sorted wait queues
  For priority inheritance (PI) or other priority inversion protocols
  E.g. sorted by priority, deadline, …
Other scheduling algorithms
  Using dynamic priorities, like EDF
  For mixed-criticality systems
Summary
Implemented features:
  Pthread mutexes in different flavours
    Error Checking
    Recursive
  Pthread condition variables
  Pthread rwlocks, barriers, pthread_once()
  POSIX semaphores
  Waiting with relative and absolute timeouts
  Both "private" and "shared" futexes
Mutexes and Condition Variables with FIFO ordering
  No in-kernel memory allocator required!
  Linear kernel memory usage
  Using linked lists, all operations are in O(1) time
When adding "wake arbitrary # of waiters" and not just migrating queues, we get the same flexibility as in Linux
  → unfortunately this needs O(n) time