Threads / Synchronization 1 (PowerPoint PPT Presentation)



slide-1
SLIDE 1

Threads / Synchronization 1

1

slide-2
SLIDE 2

Changelog

Changes made in this version not seen in first lecture:

22 September 2019: thread resources exercise (what’s wrong with this): be more consistent about thread function name
30 September 2019: passing data structures: include mandatory return NULL on thread function

1

slide-3
SLIDE 3

last time

fair schedulers and proportional share

lottery — random choice
CFS — equalize ‘virtual times’

tradeoff: credit for time while not runnable

real-time schedulers

2

slide-4
SLIDE 4

lottery scheduler assignment

track “ticks” process runs

= number of times scheduled
simplification: don’t care if process uses less than timeslice

new system call: getprocessesinfo

copy info from process table into user space

new system call: settickets

set number of tickets for current process
should be inherited by fork

scheduler: choose pseudorandom weighted by tickets

caution! no floating point

3

slide-5
SLIDE 5

thread versus process state

thread state — kept in thread control block

registers (including program counter)

  • other information?

process state — kept in process control block

address space (memory layout)

  • open files

process id …

4

slide-6
SLIDE 6

Linux idea: task_struct

Linux model: single “task” structure = thread

pointers to address space, open file list, etc.
pointers can be shared — if same process

fork()-like system call “clone”: choose what to share

clone(CLONE_FILES, ...) — new process sharing open files
clone(CLONE_VM, ...) — new process sharing address space

advantage: no special logic for threads (mostly)
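A minimal Linux-only sketch (not from the slides; child_fn and the values are illustrative) of what sharing via clone() means: with CLONE_VM the new task shares the caller's address space, so the child's write to a local variable is visible to the parent after waitpid.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>

static int child_fn(void *arg) {
    *(int *) arg = 42;   // CLONE_VM: writing the parent's memory directly
    return 0;
}

int shared_memory_demo(void) {
    int value = 7;
    const size_t stack_size = 1024 * 1024;
    char *stack = new char[stack_size];   // clone child needs its own stack
    // stack grows down on x86-64, so pass the top of the allocation;
    // SIGCHLD lets the parent waitpid() for the child
    pid_t pid = clone(child_fn, stack + stack_size,
                      CLONE_VM | SIGCHLD, &value);
    waitpid(pid, NULL, 0);
    delete[] stack;
    return value;   // 42: the address space was shared
}
```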

5

slide-8
SLIDE 8

aside: alternate threading models

we’ll talk about kernel threads

OS scheduler deals directly with threads

alternate idea: library code handles threading

kernel doesn’t know about threads w/in process
hierarchy of schedulers: one for processes, one within each process

not currently common model — awkward with multicore

6

slide-9
SLIDE 9

why threads?

concurrency: different things happening at once

  • one thread per user of web server?
  • one thread per page in web browser?
  • one thread to play audio, one to read keyboard, …?

parallelism: do same thing with more resources

multiple processors to speed-up simulation (life assignment)

7

slide-10
SLIDE 10

pthread_create

void *ComputePi(void *argument) { ... }
void *PrintClassList(void *argument) { ... }
int main() {
    pthread_t pi_thread, list_thread;
    pthread_create(&pi_thread, NULL, ComputePi, NULL);
    pthread_create(&list_thread, NULL, PrintClassList, NULL);
    ... /* more code */
}

run ComputePi and PrintClassList at the same time
also run “more code”

thread identifier — used to perform operations on thread later
function to run — thread starts here, terminates if function returns
thread attributes (extra settings) and function argument

8

slide-14
SLIDE 14

a threading race

#include <pthread.h>
#include <stdio.h>
void *print_message(void *ignored_argument) {
    printf("In the thread\n");
    return NULL;
}
int main() {
    printf("About to start thread\n");
    pthread_t the_thread;
    pthread_create(&the_thread, NULL, print_message, NULL);
    printf("Done starting thread\n");
    return 0;
}

My machine: outputs In the thread about 4% of the time. What happened?

9

slide-15
SLIDE 15

a race

returning from main exits the entire process (all threads)
race: main’s return 0 or print_message’s printf first?

time

main: printf/pthread_create/printf/return print_message: printf/return

return from main ends all threads in the process

10

slide-16
SLIDE 16

fixing the race (version 1)

#include <pthread.h>
#include <stdio.h>
void *print_message(void *ignored_argument) {
    printf("In the thread\n");
    return NULL;
}
int main() {
    printf("About to start thread\n");
    pthread_t the_thread;
    pthread_create(&the_thread, NULL, print_message, NULL);
    printf("Done starting thread\n");
    pthread_join(the_thread, NULL); /* WAIT FOR THREAD */
    return 0;
}

11

slide-17
SLIDE 17

fixing the race (version 2; not recommended)

#include <pthread.h>
#include <stdio.h>
void *print_message(void *ignored_argument) {
    printf("In the thread\n");
    return NULL;
}
int main() {
    printf("About to start thread\n");
    pthread_t the_thread;
    pthread_create(&the_thread, NULL, print_message, NULL);
    printf("Done starting thread\n");
    pthread_exit(NULL);
}

12

slide-18
SLIDE 18

pthread_join, pthread_exit

pthread_join: wait for thread, returns its return value

like waitpid, but for a thread
return value is pointer to anything

pthread_exit: exit current thread, returning a value

like exit or returning from main, but for a single thread
same effect as returning from function passed to pthread_create
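A small sketch (names illustrative, not from the slides) of the two calls together: the thread hands back a heap pointer via pthread_exit, and pthread_join's second argument receives it.

```cpp
#include <pthread.h>

static void *compute(void *argument) {
    int *result = new int(*(int *) argument * 2);
    pthread_exit(result);    // same effect as `return result;`
}

int run_and_collect(int input) {
    pthread_t t;
    void *ret;
    pthread_create(&t, NULL, compute, &input);
    pthread_join(t, &ret);   // like waitpid, but for a thread
    int value = *(int *) ret;
    delete (int *) ret;      // joiner takes ownership of the heap value
    return value;
}
```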

13

slide-19
SLIDE 19

passing thread IDs (1)

DataType items[1000];
void *thread_function(void *argument) {
    int thread_id = (int) argument;
    int start = 500 * thread_id;
    int end = start + 500;
    for (int i = start; i < end; ++i) {
        DoSomethingWith(items[i]);
    }
    ...
}
void run_threads() {
    vector<pthread_t> threads(2);
    for (int i = 0; i < 2; ++i) {
        pthread_create(&threads[i], NULL, thread_function, (void*) i);
    }
}

14

slide-21
SLIDE 21

passing thread IDs (2)

DataType items[1000];
int num_threads;
void *thread_function(void *argument) {
    int thread_id = (int) argument;
    int start = thread_id * (1000 / num_threads);
    int end = start + (1000 / num_threads);
    if (thread_id == num_threads - 1) end = 1000;
    for (int i = start; i < end; ++i) {
        DoSomethingWith(items[i]);
    }
    ...
}
void run_threads() {
    vector<pthread_t> threads(num_threads);
    for (int i = 0; i < num_threads; ++i) {
        pthread_create(&threads[i], NULL, thread_function, (void*) i);
    }
    ...
}
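A runnable variant of this pattern (DoSomethingWith is replaced by an illustrative in-place doubling; the intptr_t casts avoid the pointer-size warnings the slide's plain (int)/(void*) casts can trigger on 64-bit machines):

```cpp
#include <pthread.h>
#include <cstdint>
#include <vector>

static int items[1000];
static int num_threads;

static void *thread_function(void *argument) {
    int thread_id = (int)(intptr_t) argument;
    int per_thread = 1000 / num_threads;
    int start = thread_id * per_thread;
    int end = start + per_thread;
    if (thread_id == num_threads - 1)
        end = 1000;                      // last thread takes the remainder
    for (int i = start; i < end; ++i)
        items[i] *= 2;                   // stand-in for DoSomethingWith
    return NULL;
}

int run_threads(int n) {
    num_threads = n;
    for (int i = 0; i < 1000; ++i) items[i] = 1;
    std::vector<pthread_t> threads(num_threads);
    for (int i = 0; i < num_threads; ++i)
        pthread_create(&threads[i], NULL, thread_function,
                       (void *)(intptr_t) i);
    for (int i = 0; i < num_threads; ++i)
        pthread_join(threads[i], NULL);  // wait before reading results
    int sum = 0;
    for (int i = 0; i < 1000; ++i) sum += items[i];
    return sum;                          // 2000 if every item was doubled once
}
```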

15

slide-23
SLIDE 23

passing data structures

class ThreadInfo { public: ... };
void *thread_function(void *argument) {
    ThreadInfo *info = (ThreadInfo *) argument;
    ...
    delete info;
    return NULL;
}
void run_threads(int num_threads) {
    vector<pthread_t> threads(num_threads);
    for (int i = 0; i < num_threads; ++i) {
        pthread_create(&threads[i], NULL, thread_function,
                       (void *) new ThreadInfo(...));
    }
    ...
}
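A self-contained sketch with an illustrative ThreadInfo (the slide elides its fields): each thread owns its heap-allocated argument and frees it itself, and, per the changelog, the thread function ends with return NULL.

```cpp
#include <pthread.h>
#include <vector>

// illustrative fields -- the slide's ThreadInfo contents are elided
struct ThreadInfo {
    int id;
    int *out;    // where this thread writes its result
};

static void *thread_function(void *argument) {
    ThreadInfo *info = static_cast<ThreadInfo *>(argument);
    *info->out = info->id * 10;    // stand-in for real work
    delete info;                   // the thread frees its own argument
    return NULL;
}

int run_info_threads(int num_threads, int *results) {
    std::vector<pthread_t> threads(num_threads);
    for (int i = 0; i < num_threads; ++i)
        pthread_create(&threads[i], NULL, thread_function,
                       new ThreadInfo{i, &results[i]});
    for (int i = 0; i < num_threads; ++i)
        pthread_join(threads[i], NULL);   // join before reading results
    int sum = 0;
    for (int i = 0; i < num_threads; ++i)
        sum += results[i];
    return sum;
}
```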

16

slide-25
SLIDE 25

what’s wrong with this?

/* omitted: headers, using statements */
void *create_string(void *ignored_argument) {
    string result;
    result = ComputeString();
    return &result;
}
int main() {
    pthread_t the_thread;
    pthread_create(&the_thread, NULL, create_string, NULL);
    string *string_ptr;
    pthread_join(the_thread, (void **) &string_ptr);
    cout << "string is " << *string_ptr;
}

17

slide-26
SLIDE 26

program memory

[memory layout, high to low addresses]
0xFFFF FFFF FFFF FFFF … 0xFFFF 8000 0000 0000: used by OS
0x7F…: main thread stack
       second thread stack (string result allocated here; string_ptr pointed here)
       third thread stack
heap / other dynamic
0x0000 0000 0040 0000: code / data

stacks are dynamically allocated
…stacks deallocated when threads exit/are joined

18

slide-28
SLIDE 28

thread resources

to create a thread, allocate:

new stack (how big???)
thread control block

pthreads: by default need to join thread to deallocate everything

thread kept around to allow collecting return value

19

slide-29
SLIDE 29

pthread_detach

void *show_progress(void * ...) { ... }
void spawn_show_progress_thread() {
    pthread_t show_progress_thread;
    pthread_create(&show_progress_thread, NULL,
                   show_progress, NULL);
    pthread_detach(show_progress_thread);
}
int main() {
    spawn_show_progress_thread();
    do_other_stuff();
    ...
}

detach = don’t care about return value, etc.
system will deallocate when thread terminates

20

slide-30
SLIDE 30

starting threads detached

void *show_progress(void * ...) { ... }
void spawn_show_progress_thread() {
    pthread_t show_progress_thread;
    pthread_attr_t attrs;
    pthread_attr_init(&attrs);
    pthread_attr_setdetachstate(&attrs, PTHREAD_CREATE_DETACHED);
    pthread_create(&show_progress_thread, &attrs,
                   show_progress, NULL);
    pthread_attr_destroy(&attrs);
}

21

slide-31
SLIDE 31

setting stack sizes

void *show_progress(void * ...) { ... }
void spawn_show_progress_thread() {
    pthread_t show_progress_thread;
    pthread_attr_t attrs;
    pthread_attr_init(&attrs);
    pthread_attr_setstacksize(&attrs, 32 * 1024 /* bytes */);
    pthread_create(&show_progress_thread, &attrs,
                   show_progress, NULL);
}

22

slide-32
SLIDE 32

sum example (to global)

int values[1024];
int results[2];
void *sum_thread(void *argument) {
    int id = (int) argument;
    int sum = 0;
    for (int i = id * 512; i < (id + 1) * 512; ++i) {
        sum += values[i];
    }
    results[id] = sum;
    return NULL;
}
int sum_all() {
    pthread_t threads[2];
    for (int i = 0; i < 2; ++i) {
        pthread_create(&threads[i], NULL, sum_thread, (void *) i);
    }
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);
    return results[0] + results[1];
}

values, results: global variables — shared

23

slide-34
SLIDE 34

sum example (to main stack, global values)

int values[1024];
struct ThreadInfo { int start, end, result; };
void *sum_thread(void *argument) {
    ThreadInfo *my_info = (ThreadInfo *) argument;
    int sum = 0;
    for (int i = my_info->start; i < my_info->end; ++i) {
        sum += values[i];
    }
    my_info->result = sum;
    return NULL;
}
int sum_all() {
    pthread_t threads[2];
    ThreadInfo info[2];
    for (int i = 0; i < 2; ++i) {
        info[i].start = i * 512;
        info[i].end = (i + 1) * 512;
        pthread_create(&threads[i], NULL, sum_thread, &info[i]);
    }
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);
    return info[0].result + info[1].result;
}

values: global variable — shared
my_info: pointer to sum_all’s stack

  • only okay because sum_all waits!

24

slide-38
SLIDE 38

program memory (to main stack, global values)

[memory layout, high to low addresses]
0xFFFF FFFF FFFF FFFF … 0xFFFF 8000 0000 0000: used by OS
0x7F…: main thread stack (info array here)
       second thread stack (my_info points into main’s stack)
       third thread stack (my_info points into main’s stack)
heap / other dynamic
0x0000 0000 0040 0000: code / data (values: global)

25

slide-39
SLIDE 39

sum example (to main stack)

struct ThreadInfo { int *values; int start; int end; int result; };
void *sum_thread(void *argument) {
    ThreadInfo *my_info = (ThreadInfo *) argument;
    int sum = 0;
    for (int i = my_info->start; i < my_info->end; ++i) {
        sum += my_info->values[i];
    }
    my_info->result = sum;
    return NULL;
}
int sum_all(int *values) {
    ThreadInfo info[2];
    pthread_t threads[2];
    for (int i = 0; i < 2; ++i) {
        info[i].values = values;
        info[i].start = i * 512;
        info[i].end = (i + 1) * 512;
        pthread_create(&threads[i], NULL, sum_thread, (void *) &info[i]);
    }
    for (int i = 0; i < 2; ++i)
        pthread_join(threads[i], NULL);
    return info[0].result + info[1].result;
}

26

slide-43
SLIDE 43

program memory (to main stack)

[memory layout, high to low addresses]
0xFFFF FFFF FFFF FFFF … 0xFFFF 8000 0000 0000: used by OS
0x7F…: main thread stack (info array; values: stack? heap?)
       second thread stack (my_info)
       third thread stack (my_info)
heap / other dynamic
0x0000 0000 0040 0000: code / data

27

slide-44
SLIDE 44

sum example (on heap)

struct ThreadInfo {
    pthread_t thread; int *values; int start; int end; int result;
};
void *sum_thread(void *argument) { ... }
ThreadInfo *start_sum_all(int *values) {
    ThreadInfo *info = new ThreadInfo[2];
    for (int i = 0; i < 2; ++i) {
        info[i].values = values;
        info[i].start = i * 512;
        info[i].end = (i + 1) * 512;
        pthread_create(&info[i].thread, NULL, sum_thread, (void *) &info[i]);
    }
    return info;
}
int finish_sum_all(ThreadInfo *info) {
    for (int i = 0; i < 2; ++i)
        pthread_join(info[i].thread, NULL);
    int result = info[0].result + info[1].result;
    delete[] info;
    return result;
}
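The same code as a runnable sketch, with the sum_thread body (elided above) filled in from the earlier stack-based version:

```cpp
#include <pthread.h>

struct ThreadInfo {
    pthread_t thread;
    int *values;
    int start, end, result;
};

static void *sum_thread(void *argument) {
    ThreadInfo *my_info = static_cast<ThreadInfo *>(argument);
    int sum = 0;
    for (int i = my_info->start; i < my_info->end; ++i)
        sum += my_info->values[i];
    my_info->result = sum;
    return NULL;
}

// start the threads; the caller can do other work before finishing
ThreadInfo *start_sum_all(int *values) {
    ThreadInfo *info = new ThreadInfo[2];
    for (int i = 0; i < 2; ++i) {
        info[i].values = values;
        info[i].start = i * 512;
        info[i].end = (i + 1) * 512;
        pthread_create(&info[i].thread, NULL, sum_thread, &info[i]);
    }
    return info;
}

int finish_sum_all(ThreadInfo *info) {
    for (int i = 0; i < 2; ++i)
        pthread_join(info[i].thread, NULL);
    int result = info[0].result + info[1].result;
    delete[] info;   // safe: both threads have been joined
    return result;
}
```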

28

slide-47
SLIDE 47

program memory (on heap)

[memory layout, high to low addresses]
0xFFFF FFFF FFFF FFFF … 0xFFFF 8000 0000 0000: used by OS
0x7F…: main thread stack
       second thread stack (my_info)
       third thread stack (my_info)
heap / other dynamic (info array; values: stack? heap?)
0x0000 0000 0040 0000: code / data

29

slide-48
SLIDE 48

a note on error checking

from pthread_create manpage: special constants for return value

same pattern for many other pthreads functions

will often omit error checking in slides for brevity

30

slide-49
SLIDE 49

error checking pthread_create

int error = pthread_create(...);
if (error != 0) {
    /* print some error message */
}
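Expanded as a sketch (the worker and names are illustrative, not from the slides): pthreads functions return the error code directly instead of setting errno, so it can be passed to strerror.

```cpp
#include <pthread.h>
#include <cstdio>
#include <cstring>

static void *worker(void *ignored) { return NULL; }

// returns 0 on success; prints a diagnostic on failure
int create_checked(pthread_t *out) {
    int error = pthread_create(out, NULL, worker, NULL);
    if (error != 0) {
        // note: the error code is the return value, not errno
        std::fprintf(stderr, "pthread_create: %s\n", std::strerror(error));
    }
    return error;
}
```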

31

slide-50
SLIDE 50

the correctness problem

schedulers introduce non-determinism

scheduler might run threads in any order
scheduler can switch threads at any time

worse with threads on multiple cores

cores not precisely synchronized (stalling for caches, etc., etc.)
different cores happen in different order each time

makes reliable testing very difficult
solution: correctness by design

32

slide-51
SLIDE 51

example application: ATM server

commands: withdraw, deposit

  • one correctness goal: don’t lose money

33

slide-52
SLIDE 52

ATM server

(pseudocode)
ServerLoop() {
    while (true) {
        ReceiveRequest(&operation, &accountNumber, &amount);
        if (operation == DEPOSIT) {
            Deposit(accountNumber, amount);
        } else ...
    }
}
Deposit(accountNumber, amount) {
    account = GetAccount(accountNumber);
    account->balance += amount;
    StoreAccount(account);
}

34

slide-53
SLIDE 53

a threaded server?

Deposit(accountNumber, amount) {
    account = GetAccount(accountNumber);
    account->balance += amount;
    StoreAccount(account);
}

maybe Get/StoreAccount can be slow?

read/write disk sometimes? contact another server sometimes?

maybe lots of requests to process?

maybe real logic has more checks than Deposit() …

all reasons to handle multiple requests at once → many threads all running the server loop

35

slide-54
SLIDE 54

multiple threads

main() {
    for (int i = 0; i < NumberOfThreads; ++i) {
        pthread_create(&server_loop_threads[i], NULL, ServerLoop, NULL);
    }
    ...
}
ServerLoop() {
    while (true) {
        ReceiveRequest(&operation, &accountNumber, &amount);
        if (operation == DEPOSIT) {
            Deposit(accountNumber, amount);
        } else ...
    }
}

36

slide-55
SLIDE 55

a side note

why am I spending time justifying this?

multiple threads for something like this make things much trickier

we’ll be learning why…

37

slide-56
SLIDE 56

the lost write

account->balance += amount; (in two threads, same account)

compiled to roughly:
    mov account->balance, %rax
    add amount, %rax
    mov %rax, account->balance

Thread A                            Thread B
mov account->balance, %rax
add amount, %rax
        --- context switch -->
                                    mov account->balance, %rax
                                    add amount, %rax
        <-- context switch ---
mov %rax, account->balance
        --- context switch -->
                                    mov %rax, account->balance

lost write to balance: the “winner” of the race lost track of thread A’s money

38

slide-59
SLIDE 59

thinking about race conditions (1)

what are the possible values of x? (initially x = y = 0)

Thread A: x ← 1
Thread B: y ← 2

must be 1. Thread B can’t do anything to x

39

slide-61
SLIDE 61

thinking about race conditions (2)

what are the possible values of x? (initially x = y = 0)

Thread A: x ← y + 1
Thread B: y ← 2
          y ← y × 2

1 or 3 or 5 (non-deterministic)

40

slide-63
SLIDE 63

thinking about race conditions (3)

what are the possible values of x? (initially x = y = 0)

Thread A: x ← 1
Thread B: x ← 2

1 or 2
…but why not 3? maybe each bit of x assigned separately?

41

slide-66
SLIDE 66

atomic operation

atomic operation = operation that runs to completion or not at all

we will use these to let threads work together

most machines: loading/storing words is atomic

so can’t get 3 from x ← 1 and x ← 2 running in parallel

but some instructions are not atomic

  • one example: normal x86 add constant to memory

42

slide-67
SLIDE 67

lost adds (program)

.global update_loop
update_loop:
    addl $1, the_value   // the_value (global variable) += 1
    dec %rdi             // argument 1 -= 1
    jg update_loop       // if argument 1 > 0, repeat
    ret

int the_value;
extern void *update_loop(void *);
int main(void) {
    the_value = 0;
    pthread_t A, B;
    pthread_create(&A, NULL, update_loop, (void*) 1000000);
    pthread_create(&B, NULL, update_loop, (void*) 1000000);
    pthread_join(A, NULL);
    pthread_join(B, NULL);
    // expected result: 1000000 + 1000000 = 2000000
    printf("the_value = %d\n", the_value);
}
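Not on the slides: the same experiment with the increment made an atomic read-modify-write (C++ std::atomic), so no interleaving of the two threads can lose an add.

```cpp
#include <pthread.h>
#include <atomic>

static std::atomic<int> the_value(0);

static void *update_loop(void *argument) {
    long iterations = (long) argument;
    for (long i = 0; i < iterations; ++i)
        the_value.fetch_add(1);   // atomic: load+add+store as one operation
    return NULL;
}

int run_updates(long iterations_per_thread) {
    the_value.store(0);
    pthread_t A, B;
    pthread_create(&A, NULL, update_loop, (void *) iterations_per_thread);
    pthread_create(&B, NULL, update_loop, (void *) iterations_per_thread);
    pthread_join(A, NULL);
    pthread_join(B, NULL);
    return the_value.load();   // always 2 * iterations_per_thread
}
```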

43

slide-68
SLIDE 68

lost adds (results)

[histogram of outcomes across many runs: the_value = ? ranges from roughly 800000 to 2000000, with per-bucket frequencies up to several thousand; most runs fall well below the expected 2000000]

44

slide-69
SLIDE 69

but how?

probably not possible on single core

exceptions can’t occur in the middle of add instruction

…but ‘add to memory’ implemented with multiple steps

still needs to load, add, store internally
can be interleaved with what other cores do

(and actually it’s more complicated than that — we’ll talk later)

45

slide-71
SLIDE 71

so, what is actually atomic

for now we’ll assume: load/stores of ‘words’

(64-bit machine = 64-bit words)

in general: processor designer will tell you
their job to design caches, etc. to work as documented

46

slide-72
SLIDE 72

too much milk

roommates Alice and Bob want to keep fridge stocked with milk:

time   Alice                              Bob
3:00   look in fridge. no milk
3:05   leave for store
3:10   arrive at store                    look in fridge. no milk
3:15   buy milk                           leave for store
3:20   return home, put milk in fridge    arrive at store
3:25                                      buy milk
3:30                                      return home, put milk in fridge

how can Alice and Bob coordinate better?

47

slide-73
SLIDE 73

too much milk “solution” 1 (algorithm)

leave a note: “I am buying milk”

place before buying
remove after buying
don’t try buying if there’s a note

≈ setting/checking a variable (e.g. “note = 1”)

with atomic load/store of variable

if (no milk) {
    if (no note) {
        leave note;
        buy milk;
        remove note;
    }
}

48

slide-74
SLIDE 74

too much milk “solution” 1 (timeline)

Alice                               Bob
if (no milk) {
  if (no note) {
                                    if (no milk) {
                                      if (no note) {
                                        leave note;
                                        buy milk;
                                        remove note;
                                      }
                                    }
    leave note;
    buy milk;
    remove note;
  }
}

oops: both Alice and Bob buy milk

49

slide-75
SLIDE 75

too much milk “solution” 2 (algorithm)

intuition: leave note when buying or checking if need to buy

leave note;
if (no milk) {
    if (no note) {
        buy milk;
    }
}
remove note;

50

slide-76
SLIDE 76

too much milk: “solution” 2 (timeline)

Alice
leave note;
if (no milk) {
  if (no note) {        ← but there’s always a note (Alice’s own)
    buy milk;
  }
}
remove note;

…will never buy milk (not twice, but not even once)

51

slide-79
SLIDE 79

“solution” 3: algorithm

intuition: label notes so Alice knows which is hers (and vice-versa)

computer equivalent: separate noteFromAlice and noteFromBob variables

Alice:
  leave note from Alice;
  if (no milk) {
    if (no note from Bob) {
      buy milk
    }
  }
  remove note from Alice;

Bob:
  leave note from Bob;
  if (no milk) {
    if (no note from Alice) {
      buy milk
    }
  }
  remove note from Bob;

52

slide-80
SLIDE 80

too much milk: “solution” 3 (timeline)

Alice                                Bob
leave note from Alice
if (no milk) {
                                     leave note from Bob
  if (no note from Bob) {
    buy milk
  }
}
                                     if (no milk) {
                                       if (no note from Alice) {
                                         buy milk
                                       }
                                     }
                                     remove note from Bob
remove note from Alice

oops: each sees the other’s note, so neither buys milk

53

slide-81
SLIDE 81

too much milk: is it possible?

is there a solution with writing/reading notes?

≈ loading/storing from shared memory

yes, but it’s not very elegant

54

slide-82
SLIDE 82

too much milk: solution 4 (algorithm)

Alice:
  leave note from Alice
  while (note from Bob) {
    do nothing
  }
  if (no milk) {
    buy milk
  }
  remove note from Alice

Bob:
  leave note from Bob
  if (no note from Alice) {
    if (no milk) {
      buy milk
    }
  }
  remove note from Bob

exercise (hard): prove (in)correctness
exercise (hard): extend to three people

55

slide-86
SLIDE 86

Peterson’s algorithm

general version of solution
see, e.g., Wikipedia
we’ll use special hardware support instead

56
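For reference, Peterson's two-thread algorithm can be sketched in C. This is an illustrative sketch only: it uses C11 sequentially consistent atomics (plain variables would be broken on modern CPUs, which is part of why we use hardware support instead), and the names `interested`, `turn`, and `peterson_demo` are made up here.

```c
// Peterson's algorithm for two threads (ids 0 and 1).
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

static atomic_bool interested[2];
static atomic_int turn;
static long counter;   // shared state protected by the "lock"

static void peterson_lock(int me) {
    int other = 1 - me;
    atomic_store(&interested[me], true);   // announce intent
    atomic_store(&turn, other);            // politely give away the turn
    // wait while the other thread wants in AND it is its turn
    while (atomic_load(&interested[other]) && atomic_load(&turn) == other)
        ;
}

static void peterson_unlock(int me) {
    atomic_store(&interested[me], false);
}

static void *worker(void *arg) {
    int me = (int)(size_t)arg;
    for (int i = 0; i < 100000; i++) {
        peterson_lock(me);
        counter++;                         // critical section
        peterson_unlock(me);
    }
    return NULL;
}

// Returns the final counter; with mutual exclusion it must be 200000.
long peterson_demo(void) {
    pthread_t t0, t1;
    counter = 0;
    pthread_create(&t0, NULL, worker, (void *)0);
    pthread_create(&t1, NULL, worker, (void *)1);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return counter;
}
```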

slide-87
SLIDE 87

some definitions

mutual exclusion: ensuring only one thread does a particular thing at a time

like checking for and, if needed, buying milk

critical section: code that exactly one thread can execute at a time

result of critical section

lock: object only one thread can hold at a time

interface for creating critical sections

57


slide-90
SLIDE 90

the lock primitive

locks: an object with (at least) two operations:

acquire or lock — wait until lock is free, then “grab” it
release or unlock — let others use lock, wake up waiters

typical usage: everyone acquires lock before using shared resource

forget to acquire lock? weird things happen

Lock(MilkLock);
if (no milk) {
    buy milk
}
Unlock(MilkLock);

58

slide-91
SLIDE 91

pthread mutex

#include <pthread.h>

pthread_mutex_t MilkLock;
pthread_mutex_init(&MilkLock, NULL);
...
pthread_mutex_lock(&MilkLock);
if (no milk) {
    buy milk
}
pthread_mutex_unlock(&MilkLock);

59
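The pattern above can be made into a complete, runnable demo. This is a sketch under assumptions: the `milk`/`purchases` variables and the `shopper`/`milk_demo` helpers are invented for illustration; only the `pthread_mutex_*` calls come from the slide.

```c
// Two shoppers race to buy milk; the check-then-buy sequence is a
// critical section protected by MilkLock.
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t MilkLock;
static int milk, purchases;

static void *shopper(void *arg) {
    pthread_mutex_lock(&MilkLock);
    if (milk == 0) {        // if (no milk)
        milk = 1;           //     buy milk
        purchases++;
    }
    pthread_mutex_unlock(&MilkLock);
    return NULL;
}

// Returns how many times milk was bought; with the lock, always 1.
int milk_demo(void) {
    pthread_t a, b;
    pthread_mutex_init(&MilkLock, NULL);
    milk = purchases = 0;
    pthread_create(&a, NULL, shopper, NULL);
    pthread_create(&b, NULL, shopper, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_destroy(&MilkLock);
    return purchases;
}
```

Because both the check and the purchase happen while holding `MilkLock`, no interleaving can buy twice.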

slide-92
SLIDE 92

xv6 spinlocks

#include "spinlock.h"
...
struct spinlock MilkLock;
initlock(&MilkLock, "name for debugging");
...
acquire(&MilkLock);
if (no milk) {
    buy milk
}
release(&MilkLock);

60

slide-93
SLIDE 93

C++ containers and locking

can you use a vector from multiple threads?
…question: how is it implemented?

dynamically allocated array, reallocated on size changes

can access from multiple threads …as long as not being resized?

61


slide-96
SLIDE 96

C++ standard rules for containers

multiple threads can read anything at the same time
can only read an element if no other thread is modifying it
can only add/remove elements if no other threads are accessing the container

some exceptions, read documentation really carefully

62

slide-97
SLIDE 97

implementing locks: single core

intuition: context switch only happens on interrupt

timer expiration, I/O, etc. causes OS to run

solution: disable them

reenable on unlock

x86 instructions:

cli — disable interrupts
sti — enable interrupts

63


slide-99
SLIDE 99

naive interrupt enable/disable (1)

Lock()   { disable interrupts }
Unlock() { enable interrupts }

problem: user can hang the system:

Lock(some_lock);
while (true) {}

problem: can’t do I/O within lock

Lock(some_lock);
read from disk  /* waits forever for (disabled) interrupt
                   from disk I/O finishing */

64


slide-102
SLIDE 102

naive interrupt enable/disable (2)

Lock()   { disable interrupts }
Unlock() { enable interrupts }

problem: nested locks

Lock(milk_lock);
if (no milk) {
    Lock(store_lock);
    buy milk
    Unlock(store_lock);
    /* interrupts enabled here?? */
}
Unlock(milk_lock);

65


slide-106
SLIDE 106

xv6 interrupt disabling (1)

...
acquire(struct spinlock *lk) {
    pushcli(); // disable interrupts to avoid deadlock
    ... /* this part basically just for multicore */
}

release(struct spinlock *lk) {
    ... /* this part basically just for multicore */
    popcli();
}

66

slide-107
SLIDE 107

xv6 push/popcli

pushcli / popcli — need to be in pairs
pushcli — disable interrupts if not already disabled
popcli — enable interrupts if corresponding pushcli disabled them

don’t enable them if they were already disabled

67
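The nesting rule can be simulated in user space. This sketch models xv6's counting scheme, but the interrupt flag here is just a variable (real xv6 uses `cli`/`sti` and reads the saved EFLAGS state); the variable names are assumptions for the demo, not xv6's actual fields.

```c
// Simulated pushcli/popcli: only the outermost popcli re-enables
// interrupts, and only if they were enabled before the first pushcli.
#include <stdbool.h>

static bool ints_enabled = true;  // simulated interrupt-enable flag
static int  ncli;                 // depth of nested pushcli calls
static bool intena;               // were interrupts on before first pushcli?

void pushcli(void) {
    bool was_enabled = ints_enabled;
    ints_enabled = false;         // "cli": disable interrupts
    if (ncli == 0)
        intena = was_enabled;     // remember pre-pushcli state
    ncli++;
}

void popcli(void) {
    ncli--;
    if (ncli == 0 && intena)
        ints_enabled = true;      // "sti": only at outermost popcli
}

bool interrupts_on(void) { return ints_enabled; }
```

This is exactly what fixes the nested-locks problem: releasing the inner `store_lock` decrements the count but leaves interrupts off until the outer `milk_lock` is released.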

slide-108
SLIDE 108

68

slide-109
SLIDE 109

backup slides

69

slide-110
SLIDE 110

thread versus process state

thread state — kept in thread control block

registers (including program counter)

  • other information?

process state — kept in process control block

address space (memory layout)

  • open files

process id …

70

slide-111
SLIDE 111

Linux idea: task_struct

Linux model: single “task” structure = thread
pointers to address space, open file list, etc.
pointers can be shared — if same process
fork()-like system call “clone”: choose what to share

clone(CLONE_FILES, ...) — new process sharing open files
clone(CLONE_VM, ...) — new process sharing address space

advantage: no special logic for threads (mostly)

71


slide-113
SLIDE 113

aside: alternate threading models

we’ll talk about kernel threads
OS scheduler deals directly with threads
alternate idea: library code handles threading
kernel doesn’t know about threads w/in process
hierarchy of schedulers: one for processes, one within each process
not currently common model — awkward with multicore

72

slide-114
SLIDE 114

why threads?

concurrency: different things happening at once

  • one thread per user of web server?
  • one thread per page in web browser?
  • one thread to play audio, one to read keyboard, …?

parallelism: do same thing with more resources

multiple processors to speed up simulation (life assignment)

73