CSE 506: Operating Systems
Native POSIX Threading Library (NPTL)
Don Porter
[Figure: Logical Diagram. User level: Binary Formats, Memory Allocators, Threads. Kernel level, below the System Calls boundary: Scheduling, RCU, File. Today's lecture: threads.]
– Multiple threads of execution in one address space
– x86 hardware:
register contexts otherwise (rip, rsp/stack, etc.)
– Linux:
– Does JOS support threading?
– Design reflects performance concerns
libpthread.so                                  Linux System Call
pthread_create()                               clone(CLONE_FS|CLONE_IO|CLONE_THREAD|…)
pthread_mutex_lock(), pthread_cond_wait(),…    futex()
Thread-local storage                           arch_prctl()
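The library side of this mapping can be sketched with a small counter program. This is an illustrative example only: `worker`, `run_counter_demo`, and the constants are hypothetical names, and the comments about which system call glibc issues restate the table above rather than anything the code itself checks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS 10000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

/* Thread body: every increment is protected by the mutex. Per the table
 * above, the library falls back to futex() when a thread has to block. */
static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Hypothetical helper: start the workers, join them, return the total. */
static long run_counter_demo(void) {
    pthread_t threads[NUM_THREADS];
    shared_counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        /* Per the table above, this call is backed by clone() on Linux. */
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

Calling `run_counter_demo()` from `main` should return `NUM_THREADS * INCREMENTS`, since the mutex serializes every increment. Running the program under `strace -f` is one way to observe the underlying system calls.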
[Figure: two kernel tasks, pid 100 and pid 101, share one mm (shared page tables/virtual address space) and .text. Each task has its own user stack and its own saved rsp and rip.]
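[Figure: one kernel task (pid 100) with one mm (shared page tables/virtual address space) and a single kernel-visible rsp and rip. User space holds two stacks and a scheduler structure (sched) with saved regs for Thr0 and Thr1. When t0 calls read(), the library converts it to an async read, calls the user scheduler, saves t0's regs, and restores t1's regs.]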
– No privileged instructions needed
– Same for saving and restoring PC (rip)
– OS must provide non-blocking equivalents
– Transparent help from libc
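A user-level context switch can be sketched with the POSIX `ucontext` routines, which save and restore the stack pointer and program counter without any privileged instructions. This is a minimal sketch, not how any particular thread library does it: `thread_a`, `thread_b`, `record`, and `run_switch_demo` are hypothetical names. Note that glibc's `swapcontext` also saves the signal mask, which costs a system call, so production libraries usually use hand-written assembly instead.

```c
#include <assert.h>
#include <string.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

static ucontext_t main_ctx, ctx_a, ctx_b;
static char stack_a[STACK_SIZE], stack_b[STACK_SIZE];
static char trace[16];          /* records the order in which threads ran */
static size_t trace_len;

static void record(char c) {
    if (trace_len < sizeof(trace) - 1)
        trace[trace_len++] = c;
}

/* User thread A: run, yield to B, run again, yield to B. */
static void thread_a(void) {
    record('A');
    swapcontext(&ctx_a, &ctx_b);   /* save A's registers, restore B's */
    record('A');
    swapcontext(&ctx_a, &ctx_b);
}

/* User thread B: run, yield to A, run again, then return to uc_link. */
static void thread_b(void) {
    record('B');
    swapcontext(&ctx_b, &ctx_a);   /* save B's registers, restore A's */
    record('B');
}

/* Hypothetical helper: build both contexts, run them, return the trace. */
static const char *run_switch_demo(void) {
    trace_len = 0;
    memset(trace, 0, sizeof(trace));

    getcontext(&ctx_a);
    ctx_a.uc_stack.ss_sp = stack_a;      /* each thread gets its own stack */
    ctx_a.uc_stack.ss_size = sizeof(stack_a);
    ctx_a.uc_link = &main_ctx;
    makecontext(&ctx_a, thread_a, 0);

    getcontext(&ctx_b);
    ctx_b.uc_stack.ss_sp = stack_b;
    ctx_b.uc_stack.ss_size = sizeof(stack_b);
    ctx_b.uc_link = &main_ctx;           /* where to go when B returns */
    makecontext(&ctx_b, thread_b, 0);

    swapcontext(&main_ctx, &ctx_a);      /* start A; resume here when B ends */
    return trace;
}
```

The two threads alternate, so the trace returned by `run_switch_demo()` is `"ABAB"`. If either thread made a blocking call such as `read()`, the whole process would stall, which is why the non-blocking equivalents listed above are needed.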
– N is often the number of CPUs
– Working around “unfriendly” kernel API
– Second scheduler
– Synchronization different
– Certain functions (locks)
– Timer signals from OS
– Signals
– Takes a few hundred cycles to get in/out of kernel
– Time in the scheduler counts against your timeslice
– If I can run the context switching code locally (avoiding trap overheads, etc.), my threads get to run slightly longer!
– Stack switching code works in userspace with few changes
– Thread 1’s quantum expired
– Thread 2 just spinning until its quantum expires
– Wouldn’t it be nice to donate Thread 2’s quantum to Thread 1?
– If A blocks on I/O and B is using the CPU
– B gets half the CPU time
– A’s quantum is “lost” (at least in some schedulers)
– A gets a priority boost
– Maybe application cares more about B’s CPU time…
– Not available on Linux
– Some BSDs support(ed) scheduler activations
– Easier notification of blocking events
– Kernel allocates up to that many scheduler activations
– A kernel stack and a user-mode stack
– Represents the allocation of a CPU time slice
– Does not automatically resume a user thread
– Goes to one of a few well-defined “upcalls”
– User scheduler decides what to run
– Not free!
– User scheduling must do better than kernel by a big enough margin to offset these overheads
– Potential optimization: communicate to kernel a preference for which activation gets preempted to notify
– Higher context switching overhead (lots of register copying and upcalls)
– Difference of opinion between research and kernel communities about how inefficient kernel-level schedulers are
– Way more complicated to maintain the code for an m:n thread library!
– E.g., microkernels, extensible OSes, etc.
– High-performance databases generally get direct control
– Correlated with how efficiently the OS creates and context switches threads
– User-level thread packages were hot
– E.g., Most JVMs abandoned user-threads
– Correctness
– Performance (Synchronization)
1) The behavior of sending a signal to a multi-threaded process was not correct, and could never be implemented correctly with kernel-level tools (pre-2.6).
2) Signals were also used to implement blocking: a signal was sent to the next blocked task to wake it up.
– 2.4 assigned different PID to each thread
– Different TID to distinguish them
– POSIX says I should be able to send a signal to a multi-threaded program and any unmasked thread will get the signal, even if the first thread has exited
– Use an atomic instruction in user space to implement fast path for a lock (more in later lectures)
– If task needs to block, ask the kernel to put you on a given futex wait queue
– Task that releases the lock wakes up next task on the futex wait queue
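The three steps above can be sketched as a small lock built on the raw `futex` system call. This is a simplified illustration, not glibc's implementation: `demo_lock`, `demo_unlock`, `futex_call`, and `run_futex_demo` are hypothetical names, and the three-state encoding (0 unlocked, 1 locked, 2 locked with possible waiters) is one common way to avoid unnecessary wake-ups.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

#define FUTEX_THREADS 4
#define FUTEX_ITERS 10000

/* 0 = unlocked, 1 = locked, 2 = locked and a waiter may be asleep. */
static atomic_int lock_word = 0;
static long futex_counter = 0;

/* Thin wrapper; glibc does not export a futex() function. */
static long futex_call(atomic_int *addr, int op, int val) {
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void demo_lock(void) {
    int expected = 0;
    /* Fast path: one atomic instruction in user space, no system call. */
    if (atomic_compare_exchange_strong(&lock_word, &expected, 1))
        return;
    /* Slow path: mark the lock contended, then sleep on the futex wait
     * queue for as long as the word still reads 2. */
    while (atomic_exchange(&lock_word, 2) != 0)
        futex_call(&lock_word, FUTEX_WAIT_PRIVATE, 2);
}

static void demo_unlock(void) {
    /* Only enter the kernel if someone may be waiting. */
    if (atomic_exchange(&lock_word, 0) == 2)
        futex_call(&lock_word, FUTEX_WAKE_PRIVATE, 1);
}

static void *futex_worker(void *arg) {
    (void)arg;
    for (int i = 0; i < FUTEX_ITERS; i++) {
        demo_lock();
        futex_counter++;
        demo_unlock();
    }
    return NULL;
}

/* Hypothetical helper: hammer the lock from several threads. */
static long run_futex_demo(void) {
    pthread_t threads[FUTEX_THREADS];
    futex_counter = 0;
    for (int i = 0; i < FUTEX_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, futex_worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < FUTEX_THREADS; i++)
        pthread_join(threads[i], NULL);
    return futex_counter;
}
```

An uncontended `demo_lock`/`demo_unlock` pair never enters the kernel. The `FUTEX_WAIT_PRIVATE` call returns immediately if the word no longer holds 2, so a wake-up that races with going to sleep is not lost.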
– E.g., cleaning up stacks of dead threads
– Scalability bottleneck
– The kernel handled several termination edge cases for threads
– Kernel would write to a given memory location to allow lazy cleanup of per-thread data
– Used in many systems
– Idea: Transparently replace key “Foo” with “Foo:0”. Upon deletion, require next creation to rename “Foo” to “Foo:1”. Eliminates accidental use of stale data.
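The renaming idea above can be sketched with a slot table whose handles carry a generation counter. This is an illustrative sketch under assumed names: `struct slot`, `struct handle`, `slot_create`, `slot_delete`, and `slot_lookup` are all hypothetical, and the slot index plays the role of the key “Foo”.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_SLOTS 4

struct slot {
    unsigned generation;   /* bumped on every deletion */
    bool in_use;
    int value;
};

/* A handle is the key plus the generation it was created under ("Foo:0"). */
struct handle {
    size_t index;
    unsigned generation;
};

static struct slot slot_table[NUM_SLOTS];

/* Create: occupy the slot and hand out its current generation. */
static struct handle slot_create(size_t index, int value) {
    assert(index < NUM_SLOTS);
    slot_table[index].in_use = true;
    slot_table[index].value = value;
    return (struct handle){ index, slot_table[index].generation };
}

/* Delete: free the slot and bump the generation, so that the next
 * creation under the same key becomes "Foo:1". */
static void slot_delete(struct handle h) {
    struct slot *s = &slot_table[h.index];
    if (s->in_use && s->generation == h.generation) {
        s->in_use = false;
        s->generation++;
    }
}

/* Lookup: succeeds only if the handle's generation is still current. */
static bool slot_lookup(struct handle h, int *out) {
    const struct slot *s = &slot_table[h.index];
    if (!s->in_use || s->generation != h.generation)
        return false;
    *out = s->value;
    return true;
}
```

A handle kept from before a deletion fails `slot_lookup` even after the slot is reused, because its generation no longer matches. That is the stale-data protection the slide describes.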
– Bits in the segment descriptor. Hardware-level limit
– Essentially, kernel scheduler swaps them out if needed
– Is this the common case?
– No, expect 8k to be enough
– /proc file system able to handle more than 64k processes
– I enjoyed this reading very much
– User vs. kernel-level threading