CPSC 213 data structure that could be accessed by multiple threads - PowerPoint PPT Presentation

CPSC 213
Introduction to Computer Systems
Unit 2c
Synchronization

Reading
‣ Companion
  • 6 (Synchronization)
‣ Text
  • Shared Variables in a Threaded Program, Synchronizing Threads with Semaphores, Using Threads for Parallelism, Other Concurrency Issues
  • 2ed: 12.4-12.6, parts of 12.7
  • 1ed: 13.4-13.5, (no equivalent to 12.6), parts of 13.7

Synchronization
[Figure labels: CPUs (Cores), Memory Bus, Memory, disk controller, disk-read thread, some other thread, wait, notify]
‣ We invented Threads to
  • exploit parallelism: do things at the same time on different processors
  • manage asynchrony: do something else while waiting for the I/O controller
‣ But, we now have two problems
  • coordinating access to memory (variables) shared by multiple threads
  • control flow transfers among threads (wait until notified by another thread)
‣ Synchronization is the mechanism threads use to
  • ensure mutual exclusion of critical sections
  • wait for and notify of the occurrence of events

The Importance of Mutual Exclusion
‣ Shared data
  • data structure that could be accessed by multiple threads
  • typically concurrent access to shared data is a bug
‣ Critical Sections
  • sections of code that access shared data
‣ Race Condition
  • simultaneous access to a critical section by multiple threads
  • conflicting operations on the shared data structure are arbitrarily interleaved
  • unpredictable (non-deterministic) program behaviour, usually a bug (a serious bug)
‣ Mutual Exclusion
  • a mechanism implemented in software (with some special hardware support)
  • to ensure critical sections are executed by one thread at a time
  • though reading and writing should be handled differently (more later)
‣ For example
  • consider the implementation of a shared stack by a linked list ...

‣ Stack implementation

    struct SE {
      struct SE* next;
    };
    struct SE *top=0;

    void push_st (struct SE* e) {
      e->next = top;
      top = e;
    }

    struct SE* pop_st () {
      struct SE* e = top;
      top = (top)? top->next: 0;
      return e;
    }

‣ Sequential test works

    void push_driver (long int n) {
      struct SE* e;
      while (n--)
        push ((struct SE*) malloc (...));
    }

    void pop_driver (long int n) {
      struct SE* e;
      while (n--) {
        do {
          e = pop ();
        } while (!e);
        free (e);
      }
    }

    push_driver (n);
    pop_driver (n);
    assert (top==0);

‣ concurrent test doesn’t always work

    et = uthread_create ((void* (*)(void*)) push_driver, (void*) n);
    dt = uthread_create ((void* (*)(void*)) pop_driver, (void*) n);
    uthread_join (et);
    uthread_join (dt);
    assert (top==0);      X

    malloc: *** error for object 0x1022a8fa0: pointer being freed was not allocated

‣ what is wrong?

    void push_st (struct SE* e) {
      e->next = top;
      top = e;
    }

    struct SE* pop_st () {
      struct SE* e = top;
      top = (top)? top->next: 0;
      return e;
    }

‣ The bug
  • push and pop are critical sections on the shared stack
  • they run in parallel so their operations are arbitrarily interleaved
  • sometimes, this interleaving corrupts the data structure
  • one bad interleaving of push_st (left) with pop_st and the free in pop_driver (right):

    push                        pop
    1. e->next = top
                                2. e = top
                                3. top = top->next
                                4. return e
                                5. free e
    6. top = e

  • after step 6, the newly pushed element's next pointer still refers to the element that was popped and freed in steps 2-5, so the stack contains a dangling pointer

Mutual Exclusion using Locks
‣ lock semantics
  • a lock is either held by a thread or available
  • at most one thread can hold a lock at a time
  • a thread attempting to acquire a lock that is already held is forced to wait
‣ lock primitives
  • lock: acquire lock, wait if necessary
  • unlock: release lock, allowing another thread to acquire if waiting
‣ using locks for the shared stack

    void push_cs (struct SE* e) {
      lock (&aLock);
      push_st (e);
      unlock (&aLock);
    }

    struct SE* pop_cs () {
      struct SE* e;
      lock (&aLock);
      e = pop_st ();
      unlock (&aLock);
      return e;
    }

Implementing Simple Locks
‣ Here’s a first cut
  • use a shared global variable for synchronization
  • lock loops until the variable is 0 and then sets it to 1
  • unlock sets the variable to 0

    int lock = 0;

    void lock (int* lock) {
      while (*lock==1) {}
      *lock = 1;
    }

    void unlock (int* lock) {
      *lock = 0;
    }

  • why doesn’t this work?

‣ We now have a race in the lock code
  • both threads run the same code:

    void lock (int* lock) {
      while (*lock==1) {}
      *lock = 1;
    }

    Thread A                          Thread B
    1. read *lock==0, exit loop
                                      2. read *lock==0, exit loop
    3. *lock = 1
    4. return with lock held
                                      5. *lock = 1, return
                                      6. return with lock held

  • Both threads think they hold the lock ...

‣ The race exists even at the machine-code level
  • two instructions acquire the lock: one to read it free, one to set it held
  • but a read by another thread can interpose between these two

          ld  $lock, r1
          ld  $1, r2
    loop: ld  (r1), r0       # lock appears free
          beq r0, free
          br  loop
                             # another thread reads lock here
    free: st  r2, (r1)       # acquire lock

    Thread A                 Thread B
    ld (r1), r0
                             ld (r1), r0
    st r2, (r1)
                             st r2, (r1)

Atomic Memory Exchange Instruction
‣ We need a new instruction
  • to atomically read and write a memory location
  • with no intervening access to that memory location from any other thread allowed
‣ Atomicity
  • is a general property in systems
  • where a group of operations are performed as a single, indivisible unit
‣ The Atomic Memory Exchange
  • one type of atomic memory instruction (there are other types)
  • groups a load and store together atomically
  • exchanging the value of a register and a memory location

    Name              Semantics              Assembly
    atomic exchange   r[v] ← m[r[a]]         xchg (ra), rv
                      m[r[a]] ← r[v]

Implementing Atomic Exchange
[Figure labels: CPUs (Cores), Memory Bus, Memory]
‣ Can not be implemented just by CPU
  • must synchronize across multiple CPUs
  • accessing the same memory location at the same time
‣ Implemented by Memory Bus
  • memory bus synchronizes every CPU's access to memory
  • the two parts of the exchange (read + write) are coupled on the bus
  • bus ensures that no other memory transaction can intervene
  • this instruction is much slower, with higher overhead, than a normal read or write

Spinlock
‣ A Spinlock is
  • a lock where the waiter spins on looping memory reads until the lock is acquired
  • also called a “busy waiting” lock
‣ Simple implementation using Atomic Exchange
  • spin on an atomic memory operation
  • that attempts to acquire the lock while
  • atomically reading its old value

          ld   $lock, r1
          ld   $1, r0
    loop: xchg (r1), r0
          beq  r0, held
          br   loop
    held:

  • but there is a problem: atomic-exchange is an expensive instruction

Implementing Spinlocks Efficiently
‣ Spin first on fast normal read, then try slow atomic exchange
  • use normal read in loop until lock appears free
  • when lock appears free use exchange to try to grab it
  • if exchange fails then go back to normal read

          ld   $lock, r1
    loop: ld   (r1), r0
          beq  r0, try
          br   loop
    try:  ld   $1, r0
          xchg (r1), r0
          beq  r0, held
          br   loop
    held:

‣ Busy-waiting pros and cons
  • Spinlocks are necessary and okay if the spinner only waits a short time
  • But using a spinlock to wait for a long time wastes CPU cycles

Blocking Locks
‣ If a thread may wait a long time
  • it should block so that other threads can run
  • it will then unblock when it becomes runnable (lock available or event notification)
‣ Blocking locks for mutual exclusion
  • if lock is held, locker puts itself on waiter queue and blocks
  • when lock is unlocked, unlocker restarts one thread on waiter queue
‣ Blocking locks for event notification
  • waiting thread puts itself on a waiter queue and blocks
  • notifying thread restarts one thread on waiter queue (or perhaps all)
‣ Implementing blocking locks presents a problem
  • lock data structure includes a waiter queue and a few other things
  • data structure is shared by multiple threads; lock operations are critical sections
  • mutual exclusion can be provided by blocking locks (they aren’t implemented yet)
  • and so, we need to use spinlocks to implement blocking locks (this gets tricky)
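The spin-then-exchange idea and the locked stack can be tried on an ordinary machine. The following is a minimal sketch, not the course's implementation: it assumes C11 `<stdatomic.h>`, whose `atomic_exchange` stands in for the `xchg` instruction, and POSIX threads in place of the `uthread` package. The names `spin_lock`, `spin_unlock`, `a_lock`, and `run_concurrent_test` are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct SE { struct SE *next; };

static struct SE *top = NULL;   /* shared stack, protected by a_lock */
static atomic_int a_lock = 0;   /* hypothetical lock word: 0 = available, 1 = held */

/* Spin on a normal read until the lock appears free, then try the
 * more expensive atomic exchange; if the exchange loses, spin again. */
static void spin_lock(atomic_int *lock) {
    for (;;) {
        while (atomic_load_explicit(lock, memory_order_relaxed) != 0) { }
        if (atomic_exchange_explicit(lock, 1, memory_order_acquire) == 0)
            return;             /* old value was 0, so this thread holds the lock */
    }
}

static void spin_unlock(atomic_int *lock) {
    atomic_store_explicit(lock, 0, memory_order_release);
}

/* push and pop are critical sections, so each runs with the lock held. */
static void push_cs(struct SE *e) {
    spin_lock(&a_lock);
    e->next = top;
    top = e;
    spin_unlock(&a_lock);
}

static struct SE *pop_cs(void) {
    spin_lock(&a_lock);
    struct SE *e = top;
    top = top ? top->next : NULL;
    spin_unlock(&a_lock);
    return e;
}

static void *push_driver(void *arg) {
    long n = (long) arg;
    while (n--) {
        struct SE *e = malloc(sizeof *e);
        assert(e != NULL);
        push_cs(e);
    }
    return NULL;
}

static void *pop_driver(void *arg) {
    long n = (long) arg;
    while (n--) {
        struct SE *e;
        do { e = pop_cs(); } while (e == NULL);   /* retry while the stack is empty */
        free(e);
    }
    return NULL;
}

/* Runs n pushes and n pops in parallel; returns 1 if the stack ends up empty. */
static int run_concurrent_test(long n) {
    pthread_t et, dt;
    int rc = pthread_create(&et, NULL, push_driver, (void *) n);
    assert(rc == 0);
    rc = pthread_create(&dt, NULL, pop_driver, (void *) n);
    assert(rc == 0);
    pthread_join(et, NULL);
    pthread_join(dt, NULL);
    return top == NULL;
}
```

Calling `run_concurrent_test (n)` from a `main` function mirrors the concurrent test in the slides. The acquire ordering on the exchange and the release ordering on the unlocking store are a design choice of this sketch; they are what make the plain accesses to `top` inside the critical section visible to the next lock holder.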
