SE350: Operating Systems
Lecture 5: Multithreaded Kernels
Outline
Use cases for multithreaded programs
Kernel vs. user-mode threads
Concurrency's problems
Recall: Why Processes & Threads?
Goals:
Multiprogramming: Run multiple applications concurrently
Protection: Don’t want bad applications to crash system!
Solution:
Process: unit of execution and allocation
Virtual machine abstraction: give process illusion it owns machine (i.e., CPU, memory, and I/O device multiplexing)
Challenge:
Solution:
Thread: Decouple allocation and execution
and I/O address tables
multiple users/applications
Client Browser Web Server
Top half: accessed in call path from system calls
Bottom half: run as interrupt routine
[Figure: User Program, Kernel I/O Subsystem, Device Driver Top Half, Device Driver Bottom Half, Device Hardware]
[Figure: Kernel and user-level processes. The kernel holds its own code, globals, and heap, a TCB and stack for each of Kernel Threads 1 to 3, and PCB 1 and PCB 2 with a TCB and stack for threads 1.A, 1.B, 2.A, and 2.B. User-level Process 1 and Process 2 each hold code, globals, heap, and stacks for Thread A and Thread B.]
space as possible (e.g., Linux vDSO)
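# of address spaces vs. # of threads per address space (AS):
One address space, one thread per AS: MS/DOS, early Macintosh
One address space, many threads per AS: Embedded systems (Geoworks, VxWorks, JavaOS, Pilot(PC), etc.)
Many address spaces, one thread per AS: Traditional UNIX
Many address spaces, many threads per AS: Mach, OS/2, Linux, Windows 10, Win NT to XP, Solaris, HP-UX, OS X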
[Figure labels: Memory; I/O state (e.g., file, socket contexts); CPU state (PC, SP, registers, ...); Sequential stream]
A(int tmp) {
    if (tmp < 2)
        B();
    printf("%d\n", tmp);
}
B() {
    C();
}
C() {
    A(2);
}
A(1);
…
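The call chain above can be traced with a small simulation of the execution stack. This is only a sketch: the `call`, `ret`, `trace`, and `printed` names are illustrative helpers that do not appear in the slides, and `printed.append` stands in for `printf`.

```python
# Sketch: simulate the execution stack for the A/B/C call chain.
# `stack`, `trace`, and `printed` are illustrative names, not from the slides.
stack = []    # active frames, innermost last
trace = []    # (event, function, argument, stack depth after the event)
printed = []  # values that the original code would print


def call(name, arg=None):
    """Push a frame for `name` onto the simulated stack."""
    stack.append((name, arg))
    trace.append(("push", name, arg, len(stack)))


def ret():
    """Pop the innermost frame when a function returns."""
    name, arg = stack.pop()
    trace.append(("pop", name, arg, len(stack)))


def A(tmp):
    call("A", tmp)
    if tmp < 2:
        B()
    printed.append(tmp)  # stands in for printf
    ret()


def B():
    call("B")
    C()
    ret()


def C():
    call("C")
    A(2)
    ret()


A(1)
max_depth = max(depth for event, _, _, depth in trace if event == "push")
print(printed, max_depth)
```

The deepest point is reached when `A(2)` runs on top of the frames for `C`, `B`, and `A(1)`, and the inner call prints before the outer one.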
(Unix) Process Resources
Stack
Stored in OS
high
low
high
high
yes
yes
high (involves at least one context switch)
[Figure: Process 1, Process 2, …, Process N, each with its own CPU state, I/O state, and memory. The CPU scheduler in the OS runs 1 process at a time on the CPU (1 core).]
CPU scheduler
medium
low
medium
yes
no
low(ish) (thread switch overhead low)
[Figure: Process 1, …, Process N, each with one I/O state and memory shared by its threads and a separate CPU state per thread. The OS runs 1 thread at a time on the CPU (1 core).]
low (only CPU state)
low
yes
no
low (thread switch overhead low, may not need to switch at all!)
[Figure: Process 1, …, Process N, each with one I/O state and memory shared by its threads and a separate CPU state per thread. The CPU scheduler in the OS runs 4 threads at a time on a CPU with Core 1, Core 2, Core 3, and Core 4.]
[Figure: Instructions executed over time (cycles) by Thread 1 and Thread 2 under four designs: superscalar architecture, multi-processor architecture, fine-grained multithreading, and simultaneous multithreading. Colored blocks show executed instructions.]
PCore 1 PCore 2 PCore 3 PCore 4
between hardware-threads: very low (done in hardware)
ALUs/FPUs may hurt performance
[Figure: Process 1, …, Process N, each with one I/O state and memory shared by its threads and a separate CPU state per thread. The CPU scheduler in the OS runs 8 threads at a time on the CPU's hardware-threads (VCores).]
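[Figure: Programmer abstraction vs. physical reality. Abstraction: threads 1 to 5 on processors 1 to 5. Physical reality: processors 1 and 2 only, with threads split into running threads and ready threads.]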
Programmer's View
    . . .
    x = x + 1;
    y = y + x;
    z = x + 5y;
    . . .
Possible Execution #1
    . . .
    x = x + 1;
    y = y + x;
    z = x + 5y;
    . . .
Possible Execution #2
    . . .
    x = x + 1;
    (Thread is suspended. Other thread(s) run. Thread is resumed.)
    y = y + x;
    z = x + 5y;
Possible Execution #3
    . . .
    x = x + 1;
    y = y + x;
    (Thread is suspended. Other thread(s) run. Thread is resumed.)
    z = x + 5y;
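The point of these executions can be sketched with a Python generator that suspends at a chosen statement. This is a minimal illustration under two assumptions that are not stated in the slides: `5y` means `5 * y`, and no other thread touches `x`, `y`, or `z` while this thread is suspended. The names `thread_body` and `run` are hypothetical.

```python
def thread_body(state, suspend_after):
    """Run the three statements, suspending after the given statement indices."""
    steps = [
        lambda s: s.update(x=s["x"] + 1),
        lambda s: s.update(y=s["y"] + s["x"]),
        lambda s: s.update(z=s["x"] + 5 * s["y"]),  # assumes 5y means 5 * y
    ]
    for index, step in enumerate(steps):
        step(state)
        if index in suspend_after:
            yield  # thread is suspended here; other threads would run


def run(suspend_after):
    """Drive the thread to completion, resuming it after every suspension."""
    state = {"x": 0, "y": 0, "z": 0}
    for _ in thread_body(state, suspend_after):
        pass  # a real scheduler would run other threads before resuming
    return state


# No suspension, suspension after statement 1, suspension after statement 2.
results = [run(set()), run({0}), run({1})]
print(results)
```

All three runs end in the same state, which is why the programmer can reason about a single thread as if it ran without interruption, as long as it shares no data with other threads.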
[Figure: Three possible interleavings of Thread 1, Thread 2, and Thread 3, labeled "One Execution", "Another Execution", and "Another Execution".]
Sharing resources
Speedup
Modularity
serverLoop() {
    connection = AcceptCon();
    fork(ServiceWebPage, connection);
}
serverLoop() {
    connection = AcceptCon();
    thread_create(ServiceWebPage, connection);
}
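A thread-per-connection server of this shape might look as follows in Python. This is a sketch under stated assumptions: a finite list of connection names stands in for `AcceptCon()`, and `service_web_page` is a hypothetical handler that records a response instead of performing real network I/O.

```python
import threading


def service_web_page(connection, results, lock):
    """Hypothetical handler: record a response instead of doing real I/O."""
    with lock:  # results is shared by all handler threads
        results.append(f"served {connection}")


def server_loop(connections):
    """Create one thread per incoming connection, then wait for all of them."""
    results = []
    lock = threading.Lock()
    threads = []
    for connection in connections:  # stands in for repeated AcceptCon() calls
        thread = threading.Thread(
            target=service_web_page, args=(connection, results, lock)
        )
        thread.start()
        threads.append(thread)
    for thread in threads:
        thread.join()
    return results


served = server_loop(["c1", "c2", "c3"])
print(sorted(served))
```

Unlike the slide's loop, this version terminates so that it can be run as a script. The order of entries in `served` depends on scheduling, which is why the example sorts before printing.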
multiprogramming
master() {
    allocThreads(worker, queue);
    while (TRUE) {
        con = AcceptCon();
        Enqueue(queue, con);
        wakeUp(queue);
    }
}

worker(queue) {
    while (TRUE) {
        con = Dequeue(queue);
        if (con == null)
            sleepOn(queue);
        else
            ServiceWebPage(con);
    }
}
[Figure: A client sends a request to the master thread, which places it on a queue served by the thread pool; the response goes back to the client.]
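The master/worker structure can be sketched with Python's standard `queue.Queue`, whose blocking `get()` plays the role of the `sleepOn`/`wakeUp` pair. The details here are assumptions for illustration: a finite list of connection names replaces `AcceptCon()`, a `None` sentinel shuts each worker down, and the handler only records a string.

```python
import queue
import threading


def worker(q, results, lock):
    """Serve connections from the queue until a None sentinel arrives."""
    while True:
        con = q.get()  # blocks while the queue is empty
        if con is None:  # sentinel (an assumption of this sketch): stop
            break
        with lock:
            results.append(f"served {con}")  # stands in for ServiceWebPage(con)


def master(connections, n_workers=3):
    """Start a fixed pool of workers, enqueue every connection, then shut down."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=worker, args=(q, results, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    for con in connections:  # stands in for the AcceptCon() loop
        q.put(con)
    for _ in workers:  # one sentinel per worker
        q.put(None)
    for w in workers:
        w.join()
    return results


pool_results = master([f"c{i}" for i in range(10)])
print(len(pool_results))
```

The pool bounds the number of threads regardless of how many requests arrive, which is the main difference from creating one thread per connection.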
BankServer() {
    while (TRUE) {
        ReceiveRequest(&op, &acctId, &amount);
        ProcessRequest(op, acctId, amount);
    }
}

ProcessRequest(op, acctId, amount) {
    if (op == deposit) Deposit(acctId, amount);
    else if …
}

Deposit(acctId, amount) {
    acct = GetAccount(acctId);  /* May use disk I/O */
    acct->balance += amount;
    StoreAccount(acct);         /* Involves disk I/O */
}
code into non-blocking fragments
Deposit(acctId, amount) {
    acct = GetAccount(acctId);  /* May use disk I/O */
    acct->balance += amount;
    StoreAccount(acct);         /* Involves disk I/O */
}
Thread 1                        Thread 2
load r1, acct->balance
                                load r1, acct->balance
                                add r1, amount2
                                store r1, acct->balance
add r1, amount1
store r1, acct->balance
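The effect of this interleaving can be checked with a deterministic simulation in which each thread has its own copy of `r1`. This is a sketch: the starting balance of 100 and the deposit amounts 10 and 20 are made-up values, and `run_schedule` is a hypothetical helper that executes the load/add/store steps in a given thread order.

```python
def run_schedule(balance, amount1, amount2, schedule):
    """Execute load/add/store for two threads in the order given by `schedule`.

    `schedule` lists thread ids (1 or 2); each occurrence runs that thread's
    next instruction. Each thread has a private register r1.
    """
    acct = {"balance": balance}
    regs = {1: 0, 2: 0}  # per-thread r1
    amounts = {1: amount1, 2: amount2}
    next_op = {1: 0, 2: 0}  # per-thread program counter
    ops = ["load", "add", "store"]
    for tid in schedule:
        op = ops[next_op[tid]]
        if op == "load":
            regs[tid] = acct["balance"]
        elif op == "add":
            regs[tid] += amounts[tid]
        else:  # store
            acct["balance"] = regs[tid]
        next_op[tid] += 1
    return acct["balance"]


# The interleaving from the slide: Thread 2 runs entirely between
# Thread 1's load and its add/store.
interleaved = run_schedule(100, 10, 20, [1, 2, 2, 2, 1, 1])
# A serial execution for comparison.
serial = run_schedule(100, 10, 20, [1, 1, 1, 2, 2, 2])
print(interleaved, serial)
```

In the interleaved run, Thread 1 stores a value computed from the stale balance it loaded earlier, so Thread 2's deposit is lost.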
Thread A            Thread B
x = 1;
x = y + 1;
                    y = 2;
                    y = y * 2;
Result: x = 13 (Thread A reads y before Thread B writes it, so this outcome implies y was 12 beforehand)
Thread A            Thread B
                    y = 2;
                    y = y * 2;
x = 1;
x = y + 1;
Result: x = 5
Thread A            Thread B
                    y = 2;
x = 1;
x = y + 1;
                    y = y * 2;
Result: x = 3
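The set of outcomes for these two threads can be enumerated by trying every interleaving that preserves each thread's program order. This is a sketch that assumes `y` starts at 12 (inferred from the `x = 13` outcome, not stated in the extracted text) and that each statement executes atomically; `possible_x` is a hypothetical helper.

```python
from itertools import combinations

# Each thread is a list of statements that update a shared state dict.
THREAD_A = [
    lambda s: s.update(x=1),
    lambda s: s.update(x=s["y"] + 1),
]
THREAD_B = [
    lambda s: s.update(y=2),
    lambda s: s.update(y=s["y"] * 2),
]


def possible_x(initial_y):
    """Return the sorted final values of x over all statement interleavings."""
    total = len(THREAD_A) + len(THREAD_B)
    outcomes = set()
    # Choose which time slots belong to Thread A; the rest go to Thread B.
    for a_slots in combinations(range(total), len(THREAD_A)):
        state = {"x": 0, "y": initial_y}
        next_a = next_b = 0
        for slot in range(total):
            if slot in a_slots:
                THREAD_A[next_a](state)
                next_a += 1
            else:
                THREAD_B[next_b](state)
                next_b += 1
        outcomes.add(state["x"])
    return sorted(outcomes)


xs = possible_x(12)  # assumption: y is 12 before either thread runs
print(xs)
```

There are six interleavings, and the final `x` depends only on which value of `y` Thread A observes when it executes `x = y + 1`: the initial value, 2, or 4.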