Multiprocessor/Multicore Systems - PowerPoint PPT Presentation

Multiprocessor/Multicore Systems: Scheduling, Synchronization, cont.

Recall: Multiprocessor Scheduling: a problem
• Problem with communication between two threads
  – both belong to process A
  – both running out of phase
• Scheduling and synchronization are inter-related in multiprocessors

The Priority Inversion Problem
• Uncontrolled use of locks in RT systems can result in unbounded blocking due to priority inversions.
• Possible solution: Limit priority inversions by modifying task priorities.
[Figure: two timelines of the High, Med and Low priority tasks over times t0, t1, t2, one per case above. Legend: lock; priority inversion; computation not involving shared object accesses.]

Scheduling and Synchronization
Priorities + locks may result in:
• priority inversion. To cope with or avoid this:
  – use priority inheritance
  – avoid locks in synchronization (wait-free, lock-free, optimistic synchronization)
• convoy effect: processes need a resource for a short time, but the process holding it may block them for a long time (hence, poor utilization)
  – avoiding locks is good here, too
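The effect of priority inheritance can be illustrated with a small discrete-time simulation. This is only a sketch under assumptions that are not in the slides: the task parameters (arrival times, work units), the names `simulate` and `tasks`, and the simplification that a lock-using task spends all of its work inside the critical section.

```python
def simulate(inherit):
    """Fixed-priority preemptive scheduling of three tasks on one CPU.

    Hypothetical workload: Low and High share one lock, Med uses no lock.
    Returns the finish time of each task.
    """
    tasks = {
        "Low":  {"prio": 1, "arrival": 0, "work": 4, "needs_lock": True},
        "Med":  {"prio": 2, "arrival": 2, "work": 5, "needs_lock": False},
        "High": {"prio": 3, "arrival": 1, "work": 2, "needs_lock": True},
    }
    remaining = {name: t["work"] for name, t in tasks.items()}
    holder = None          # task currently holding the lock
    finish = {}
    now = 0
    while len(finish) < len(tasks):
        ready = [n for n, t in tasks.items()
                 if t["arrival"] <= now and n not in finish]
        # A lock-using task is blocked while another task holds the lock.
        blocked = [n for n in ready
                   if tasks[n]["needs_lock"] and holder not in (None, n)]
        runnable = [n for n in ready if n not in blocked]

        def priority(n):
            p = tasks[n]["prio"]
            if inherit and n == holder:
                # The holder inherits the highest priority among blocked tasks.
                p = max([p] + [tasks[b]["prio"] for b in blocked])
            return p

        if runnable:
            run = max(runnable, key=priority)
            if tasks[run]["needs_lock"]:
                holder = run
            remaining[run] -= 1
            if remaining[run] == 0:
                finish[run] = now + 1
                if holder == run:
                    holder = None
        now += 1
    return finish


without_inheritance = simulate(inherit=False)
with_inheritance = simulate(inherit=True)
```

In this made-up workload, High finishes at time 11 without inheritance (it waits for all of Med's work) and at time 6 with inheritance (it waits only for the rest of Low's critical section).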

Readers-Writers and non-blocking synchronization
(some slides are adapted from J. Anderson’s slides on the same topic)

The Mutual Exclusion Problem (Locking Synchronization)
• N processes, each with this structure:
    while true do
        Noncritical Section;
        Entry Section;
        Critical Section;
        Exit Section
    od
• Basic Requirements:
  – Exclusion: Invariant(# in CS ≤ 1).
  – Starvation-freedom: (process i in Entry) leads-to (process i in CS).
• Can implement by “busy waiting” (spin locks) or using kernel calls.
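A minimal sketch of the process structure above, assuming Python threads as the N processes and `threading.Lock` as the kernel-backed Entry/Exit sections. The names `run_processes` and `state` are illustrative. The sketch checks only the exclusion invariant; Python's `Lock` makes no fairness promise, so starvation-freedom is not demonstrated here.

```python
import threading


def run_processes(n_procs=4, rounds=200):
    lock = threading.Lock()
    state = {"in_cs": 0, "max_in_cs": 0, "visits": 0}

    def process():
        for _ in range(rounds):
            pass                                  # Noncritical Section
            lock.acquire()                        # Entry Section
            state["in_cs"] += 1                   # Critical Section starts
            state["max_in_cs"] = max(state["max_in_cs"], state["in_cs"])
            state["visits"] += 1
            state["in_cs"] -= 1                   # Critical Section ends
            lock.release()                        # Exit Section

    threads = [threading.Thread(target=process) for _ in range(n_procs)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state


result = run_processes()
```

`result["max_in_cs"]` records the largest number of processes ever observed inside the critical section, which the exclusion invariant requires to stay at 1.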

Synchronization without locks
• The problem:
  – Implement a shared object without mutual exclusion (locking).
• Shared Object: A data structure (e.g., queue) shared by concurrent processes.
  – Why?
    • To avoid performance problems that result when a lock-holding task is delayed.
    • To enable more interleaving (enhancing parallelism)
    • To avoid priority inversions

Synchronization without locks
• Two variants:
  – Lock-free:
    • System-wide progress is guaranteed.
    • Usually implemented using “retry loops.”
  – Wait-free:
    • Individual progress is guaranteed.
    • More involved algorithmic methods

Readers/Writers Problem [Courtois, et al. 1971.]
• Similar to mutual exclusion, but several readers can execute “critical section” at the same time.
• If a writer is in its critical section, then no other process can be in its critical section.
• + no starvation, fairness

Solution 1
Readers have “priority”…
w, mutex: boolean semaphore, initially 1
Reader::
    P(mutex);
    rc := rc + 1;
    if rc = 1 then P(w) fi;
    V(mutex);
    CS;
    P(mutex);
    rc := rc − 1;
    if rc = 0 then V(w) fi;
    V(mutex)
Writer::
    P(w);
    CS;
    V(w)
“First” reader executes P(w). “Last” one executes V(w).
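The semaphore solution above can be sketched in Python, mapping P and V onto `Semaphore.acquire` and `Semaphore.release`. The class name `ReadersPriorityLock`, the `demo` harness and its bookkeeping lock `guard` are assumptions added for illustration; `rc` is assumed to start at 0.

```python
import threading


class ReadersPriorityLock:
    def __init__(self):
        self.mutex = threading.Semaphore(1)   # protects rc
        self.w = threading.Semaphore(1)       # held by a writer or by the reader group
        self.rc = 0                           # number of readers in their CS

    def reader_enter(self):
        self.mutex.acquire()                  # P(mutex)
        self.rc += 1
        if self.rc == 1:
            self.w.acquire()                  # "first" reader executes P(w)
        self.mutex.release()                  # V(mutex)

    def reader_exit(self):
        self.mutex.acquire()                  # P(mutex)
        self.rc -= 1
        if self.rc == 0:
            self.w.release()                  # "last" reader executes V(w)
        self.mutex.release()                  # V(mutex)

    def writer_enter(self):
        self.w.acquire()                      # P(w)

    def writer_exit(self):
        self.w.release()                      # V(w)


def demo(n_readers=6, n_writers=3, rounds=100):
    """Run readers and writers and return a list of observed violations."""
    rw = ReadersPriorityLock()
    guard = threading.Lock()                  # protects only the bookkeeping below
    active = {"readers": 0, "writers": 0}
    violations = []

    def reader():
        for _ in range(rounds):
            rw.reader_enter()
            with guard:
                active["readers"] += 1
                if active["writers"] != 0:
                    violations.append("reader overlapped a writer")
            with guard:
                active["readers"] -= 1
            rw.reader_exit()

    def writer():
        for _ in range(rounds):
            rw.writer_enter()
            with guard:
                active["writers"] += 1
                if active["writers"] != 1 or active["readers"] != 0:
                    violations.append("writer was not alone")
            with guard:
                active["writers"] -= 1
            rw.writer_exit()

    threads = [threading.Thread(target=reader) for _ in range(n_readers)]
    threads += [threading.Thread(target=writer) for _ in range(n_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return violations
```

Because readers only release `w` when the last one leaves, a steady stream of readers can keep writers waiting, which is the sense in which readers have priority.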

Concurrent Reading and Writing [Lamport ‘77]
• Previous solutions to the readers/writers problem use some form of mutual exclusion.
• Lamport considers solutions in which readers and writers access a shared object concurrently.
• Motivation:
  – Don’t want writers to wait for readers.
  – Readers/writers solution may be needed to implement mutual exclusion (circularity problem).

Interesting Factoids
• This is the first ever lock-free algorithm: guarantees consistency without locks
• An algorithm very similar to this has been implemented within an embedded controller in Mercedes automobiles

The Problem
• Let v be a data item, consisting of one or more sub-items.
  – For example,
    • v = 256 consists of three digits, “2”, “5”, and “6”.
    • String “I love spring” consists of 3 words (or 13 characters)
    • A book consists of several chapters
    • ….
• Underlying model: sub-items can be read and written atomically.
• Objective: Simulate atomic reads and writes of the data item v.

Preliminaries
• Definition: v[i], where i ≥ 0, denotes the i-th value written to v. (v[0] is v’s initial value.)
• Note: No concurrent writing of v.
• Partitioning of v: v_1 … v_m.
  – To start, focus on v being a number
  – v_i may consist of multiple digits.
• To read v: Read each v_i (in some order).
• To write v: Write each v_i (in some order).

More Preliminaries
[Figure: a read r performs read v_1, read v_2, read v_3, …, read v_{m-1}, read v_m while overlapping the writes k, …, k + i, …, l.]
We say: r reads v[k,l].
Value is consistent if k = l.

Main Theorem
Assume that i ≤ j implies that v[i] ≤ v[j], where v = d_1 … d_m.
(a) If v is always written from right to left, then a read from left to right obtains a value v[k,l] ≤ v[l].
(b) If v is always written from left to right, then a read from right to left obtains a value v[k,l] ≥ v[k].
(discuss why)
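The theorem can be checked by brute force on a small case. The sketch below is an illustration, not a proof: it enumerates every interleaving of one digit-by-digit read with a sequence of digit-by-digit writes, assuming fixed-width, zero-padded decimal numbers and atomic single-digit accesses. The function `check`, the sample `values`, and the choice of k and l as the smallest and largest version among the digits read are assumptions made for this example.

```python
from itertools import combinations_with_replacement


def check(values, width, write_order, read_order, claim):
    """Return True if claim(value_read, v[k], v[l]) holds for every interleaving."""
    digits = [f"{v:0{width}d}" for v in values]
    # states[t] = (digits, per-digit version) after t single-digit writes.
    cur, ver = list(digits[0]), [0] * width
    states = [(cur[:], ver[:])]
    for w in range(1, len(values)):
        for pos in write_order:
            cur[pos] = digits[w][pos]
            ver[pos] = w
            states.append((cur[:], ver[:]))
    # The read samples one digit at each of `width` non-decreasing instants.
    for times in combinations_with_replacement(range(len(states)), width):
        got, seen = [None] * width, []
        for pos, t in zip(read_order, times):
            got[pos] = states[t][0][pos]
            seen.append(states[t][1][pos])
        k, l = min(seen), max(seen)
        if not claim(int("".join(got)), values[k], values[l]):
            return False
    return True


values = [98, 99, 100, 101, 110]          # non-decreasing, as the theorem assumes
left_to_right, right_to_left = [0, 1, 2], [2, 1, 0]

part_a = check(values, 3, right_to_left, left_to_right,
               lambda x, vk, vl: x <= vl)
part_b = check(values, 3, left_to_right, right_to_left,
               lambda x, vk, vl: x >= vk)
# Writing and reading in the same direction breaks the bound of part (a).
same_direction = check(values, 3, left_to_right, left_to_right,
                       lambda x, vk, vl: x <= vl)
```

For the same-direction case, a read that lands just after the first digit of 100 has been written over 099 sees 199, which exceeds 100; this is why the read and write directions must be opposite.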

Readers/Writers Solution
Writer::
    →V1 :> V1;
    write D;
    ←V2 := V1
Reader::
    repeat
        temp := →V2;
        read D
    until ←V1 = temp
:> means assign larger value.
→V1 means “left to right”.
←V2 means “right to left”.
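A rough Python rendering of the writer and reader above, for a single writer. It rests on assumptions that go beyond the slide: a Python integer is read and written as one unit, so the left-to-right and right-to-left digit orders are not modelled, and CPython is assumed to execute the statements in program order. The names `ConcurrentRW` and `demo` are illustrative.

```python
import threading


class ConcurrentRW:
    def __init__(self, size):
        self.v1 = 0
        self.v2 = 0
        self.d = [0] * size                 # the data item D, one slot per sub-item

    def write(self, items):                 # exactly one writer thread may call this
        self.v1 = self.v1 + 1               # V1 :> V1
        for i, item in enumerate(items):    # write D, one sub-item at a time
            self.d[i] = item
        self.v2 = self.v1                   # V2 := V1

    def read(self):
        while True:                         # repeat
            temp = self.v2                  #   temp := V2
            snapshot = [self.d[i] for i in range(len(self.d))]   # read D
            if self.v1 == temp:             # until V1 = temp
                return snapshot


def demo(size=8, writes=2000):
    shared = ConcurrentRW(size)
    torn = []                               # snapshots mixing two different writes
    done = threading.Event()

    def writer():
        for n in range(1, writes + 1):
            shared.write([n] * size)
        done.set()

    def reader():
        while not done.is_set():
            snap = shared.read()
            if len(set(snap)) != 1:
                torn.append(snap)

    threads = [threading.Thread(target=reader), threading.Thread(target=writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return torn, shared.read()
```

The writer never waits for the reader; the reader simply retries whenever a write overlapped its read of D.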

Useful Synchronization Primitives
Usually Necessary in Nonblocking Algorithms
CAS(var, old, new)
    〈 if var ≠ old then return false fi;
      var := new;
      return true 〉
(CAS2 extends this.)
LL(var)
    〈 establish “link” to var;
      return var 〉
SC(var, val)
    〈 if “link” to var still exists then
          break all current links of all processes;
          var := val;
          return true
      else
          return false
      fi 〉
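To make the CAS semantics concrete, here is a sketch of a counter updated with a retry loop. Python exposes no hardware CAS, so the atomic bracket 〈 … 〉 is emulated with a small internal lock; that emulation, and the names `Cell`, `increment` and `demo`, are assumptions for illustration only. On real hardware the `cas` body would be a single atomic instruction.

```python
import threading


class Cell:
    """A shared variable with an emulated compare-and-swap."""

    def __init__(self, value):
        self.value = value
        self._atomic = threading.Lock()     # stands in for hardware atomicity

    def cas(self, old, new):
        with self._atomic:                  # 〈 ... 〉
            if self.value != old:
                return False
            self.value = new
            return True


def increment(cell):
    while True:                             # retry loop
        old = cell.value
        if cell.cas(old, old + 1):          # fails only if another thread got in first
            return


def demo(n_threads=4, per_thread=1000):
    counter = Cell(0)

    def worker():
        for _ in range(per_thread):
            increment(counter)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

A failed `cas` means some other thread's update succeeded in the meantime, which is the system-wide progress that lock-freedom asks for.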

Another Lock-free Example
Shared Queue
type Qtype = record v: valtype; next: pointer to Qtype end
shared var Tail: pointer to Qtype;
local var old, new: pointer to Qtype
procedure Enqueue(input: valtype)
    new := (input, NIL);
    repeat
        old := Tail
    until CAS2(Tail, old->next, old, NIL, new, new)    (retry loop)
[Figure: the Tail, old and new pointers before and after the enqueue.]
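The Enqueue procedure can be sketched in Python as follows. CAS2 is emulated with a lock standing in for the hardware's atomic double-word operation, and the queue starts with a dummy node so that Tail is never NIL; both choices, along with the names `Node`, `Queue` and `demo`, are assumptions rather than part of the slide.

```python
import threading


class Node:
    def __init__(self, v):
        self.v = v
        self.next = None                    # NIL


class Queue:
    def __init__(self):
        self.head = Node(None)              # dummy node (assumption)
        self.tail = self.head
        self._atomic = threading.Lock()     # emulates the atomicity of CAS2

    def _cas2(self, old, new):
        # CAS2(Tail, old->next, old, NIL, new, new):
        # if Tail = old and old->next = NIL, set both to new.
        with self._atomic:
            if self.tail is not old or old.next is not None:
                return False
            self.tail = new
            old.next = new
            return True

    def enqueue(self, value):
        new = Node(value)                   # new := (input, NIL)
        while True:                         # retry loop
            old = self.tail
            if self._cas2(old, new):
                return

    def items(self):
        out, node = [], self.head.next
        while node is not None:
            out.append(node.v)
            node = node.next
        return out


def demo(n_threads=4, per_thread=500):
    q = Queue()

    def worker(base):
        for i in range(per_thread):
            q.enqueue(base + i)

    threads = [threading.Thread(target=worker, args=(k * per_thread,))
               for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return q
```

Updating Tail and old->next in one atomic step is what keeps the list and the tail pointer consistent without a lock around the whole enqueue.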

Cache-coherence
Cache coherency protocols are based on a set of (cache block) states and state transitions. 2 main types of protocols:
• write-update
• write-invalidate
• Does this remind you of readers/writers?
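As one possible illustration of a write-invalidate protocol, the sketch below tracks a single cache block across several caches using three states (invalid, shared, modified), in the style commonly called MSI. The slides do not name a specific protocol, so the state set, the class `WriteInvalidateBlock` and its methods are assumptions; bus traffic and write-back to memory are not modelled.

```python
class WriteInvalidateBlock:
    """State of one cache block in each of n caches: 'I', 'S' or 'M'."""

    def __init__(self, n_caches):
        self.state = ["I"] * n_caches

    def read(self, cpu):
        if self.state[cpu] == "I":
            # A modified copy elsewhere is downgraded to shared first.
            self.state = ["S" if s == "M" else s for s in self.state]
            self.state[cpu] = "S"
        # A read hit in 'S' or 'M' changes nothing.

    def write(self, cpu):
        # Invalidate every other copy, then hold the only (modified) copy.
        self.state = ["I"] * len(self.state)
        self.state[cpu] = "M"

    def invariant_holds(self):
        # Many sharers (readers) or exactly one modifier (writer), never both.
        modified = self.state.count("M")
        shared = self.state.count("S")
        return modified == 0 or (modified == 1 and shared == 0)


block = WriteInvalidateBlock(3)
block.read(0)
block.read(1)       # two caches now share the block
block.write(2)      # the writer invalidates both sharers
```

The invariant is the readers/writers rule in hardware form: any number of caches may hold the block in the shared state, but a modified copy excludes all others.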

Multiprocessor architectures, memory consistency
• Memory access protocols and cache coherence protocols define memory consistency models
• Examples:
  – Sequential consistency: e.g. SGI Origin (more and more seldom found now...)
  – Weak consistency: sequential consistency for special synchronization variables and actions before/after access to such variables. No ordering of other actions. E.g. SPARC architectures
• Memory consistency is also relevant at compiler level
  – i.e. the compiler may reorder for optimization purposes

Distributed OS issues:
IPC: Client/Server, RPC mechanisms
Clusters, load balancing, Middleware

Multicomputers
• Definition: Tightly-coupled CPUs that do not share memory
• Also known as
  – cluster computers
  – clusters of workstations (COWs)
  – illusion is one machine
  – Alternative to symmetric multiprocessing (SMP)

Clusters: Benefits of Clusters
• Scalability
  – Can have dozens of machines, each of which is a multiprocessor
  – Add new systems in small increments
• Availability
  – Failure of one node does not mean loss of service (well, not necessarily at least… why?)
• Superior price/performance
  – Cluster can offer equal or greater computing power than a single large machine at a much lower cost
BUT:
• think about communication!!!
• The above picture is changing with multicore systems

Multicomputer Hardware example
[Figure: Network interface boards in a multicomputer]


