Hardware-Software Trade-offs in Synchronization
CS 252, Spring 05
David E. Culler
Computer Science Division, U.C. Berkeley
3/29/2005

Role of Synchronization
• “A parallel computer is a collection of processing elements that cooperate and communicate to solve large problems fast.”
• Types of Synchronization
  – Mutual Exclusion
  – Event synchronization
    » point-to-point
    » group
    » global (barriers)
• How much hardware support?
  – high-level operations?
  – atomic instructions?
  – specialized interconnect?

Layers of synch support
• Application
• User library
• Operating System Support
• Synchronization Library
• Atomic RMW ops
• HW Support

Mini-Instruction Set debate
• atomic read-modify-write instructions
  – IBM 370: included atomic compare&swap for multiprogramming
  – x86: memory read-modify-write instructions can be prefixed with a lock modifier
  – High-level language advocates want hardware locks/barriers
    » but it goes against the “RISC” flow, and has other problems
  – SPARC: atomic register-memory ops (swap, compare&swap)
  – MIPS, IBM Power: no single atomic operation, but a pair of instructions
    » load-locked, store-conditional
    » later used by PowerPC and DEC Alpha too
• Rich set of tradeoffs

Other forms of hardware support
• Separate lock lines on the bus
• Lock locations in memory
• Lock registers (Cray Xmp)
• Hardware full/empty bits (Tera)
• Bus support for interrupt dispatch

Components of a Synchronization Event
• Acquire method
  – Acquire right to the synch
    » enter critical section, go past event
• Waiting algorithm
  – Wait for synch to become available when it isn’t
  – busy-waiting, blocking, or hybrid
• Release method
  – Enable other processors to acquire right to the synch
• Waiting algorithm is independent of type of synchronization
  – makes no sense to put in hardware
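To make the atomic read-modify-write bullets concrete, here is a minimal sketch in C11 of a fetch&add built from compare&swap. It is an illustration under stated assumptions, not anything from the lecture: the names `fetch_and_add_cas` and `fetch_and_add_demo` are hypothetical, and `<stdatomic.h>` stands in for the machine instructions. On load-locked/store-conditional machines a compiler would typically lower the weak compare-exchange to an LL/SC retry loop, which is why spurious failure is tolerated here.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical helper: fetch&add built from compare&swap.
 * Reads the old value, then tries to install old + delta;
 * retries if another thread changed the location in between. */
static int fetch_and_add_cas(atomic_int *location, int delta)
{
    int old = atomic_load(location);
    /* On failure the call rewrites `old` with the current contents,
     * so each retry works from a fresh value. The "weak" form may
     * also fail spuriously, which is harmless inside a loop. */
    while (!atomic_compare_exchange_weak(location, &old, old + delta)) {
        /* retry */
    }
    return old; /* value seen just before the successful update */
}

/* Single-threaded usage example: returns the final counter value. */
static int fetch_and_add_demo(void)
{
    atomic_int counter;
    atomic_init(&counter, 5);
    int previous = fetch_and_add_cas(&counter, 3);
    assert(previous == 5);
    return atomic_load(&counter);
}
```

The same loop shape works for any fetch&op: only the expression `old + delta` changes.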

Strawman Lock (Busy-Wait)
  lock:   ld  register, location   /* copy location to register */
          cmp register, #0         /* compare with 0 */
          bnz lock                 /* if not 0, try again */
          st  location, #1         /* store 1 to mark it locked */
          ret                      /* return control to caller */
  unlock: st  location, #0         /* write 0 to location */
          ret                      /* return control to caller */
• Why doesn’t the acquire method work?
  – The load and the store are separate instructions, so two processors can both read 0 and both store 1
• Release method?

Atomic Instructions
• Specifies a location, register, & atomic operation
  – Value in location read into a register
  – Another value (function of value read or not) stored into location
• Many variants
  – Varying degrees of flexibility in second part
• Simple example: test&set
  – Value in location read into a specified register
  – Constant 1 stored into location
  – Successful if value loaded into register is 0
  – Other constants could be used instead of 1 and 0

Simple Test&Set Lock
  lock:   t&s register, location
          bnz lock                 /* if not 0, try again */
          ret                      /* return control to caller */
  unlock: st  location, #0         /* write 0 to location */
          ret                      /* return control to caller */
• Other read-modify-write primitives
  – Swap, Exch
  – Fetch&op
  – Compare&swap
    » Three operands: location, register to compare with, register to swap with
    » Not commonly supported by RISC instruction sets
• cacheable or uncacheable

Performance Criteria for Synch. Ops
• Latency (time per op)
  – especially when light contention
• Bandwidth (ops per sec)
  – especially under high contention
• Traffic
  – load on critical resources
  – especially on failures under contention
• Storage
• Fairness
• Under what conditions do you measure synchronization performance? Contention? Scale? Duration?

T&S Lock Microbenchmark: SGI Challenge
[Plot: Time (µs), 0 to 20, versus number of processors, up to 15. Curves: Test&set, c = 0; Test&set, exponential backoff, c = 3.64; Test&set, exponential backoff, c = 0; Ideal.]
  Microbenchmark loop: lock; delay(c); unlock;
• Why does performance degrade?
• Bus Transactions on T&S?
• Hardware support in CC protocol?

Enhancements to Simple Lock
• Reduce frequency of issuing test&sets while waiting
  – Test&set lock with backoff
  – Don’t back off too much or will be backed off when lock becomes free
  – Exponential backoff works quite well empirically: i-th time = k*c^i
• Busy-wait with read operations rather than test&set
  – Test-and-test&set lock
  – Keep testing with ordinary load
    » cached lock variable will be invalidated when release occurs
  – When value changes (to 0), try to obtain lock with test&set
    » only one attemptor will succeed; others will fail and start testing again
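The three lock variants above can be sketched in portable C11, with `atomic_exchange` playing the role of test&set. This is a hedged illustration, not the code measured in the microbenchmark: the type `struct spinlock`, the helper `spin_delay`, and the constants `BACKOFF_K`, `BACKOFF_C`, and `BACKOFF_MAX` are all assumptions chosen for the example, and the cap on the delay is one possible reading of "don't back off too much".

```c
#include <assert.h>
#include <stdatomic.h>

/* Lock word: 0 = free, 1 = held (the constants used on the slides). */
struct spinlock { atomic_int held; };

/* Assumed backoff parameters: delay after the i-th failure is K * C^i. */
enum { BACKOFF_K = 4, BACKOFF_C = 2, BACKOFF_MAX = 1024 };

/* test&set: atomically store 1 and return the previous value. */
static int test_and_set(struct spinlock *l)
{
    return atomic_exchange(&l->held, 1);
}

/* Simple test&set lock: every retry is a read-modify-write. */
static void tas_lock(struct spinlock *l)
{
    while (test_and_set(l) != 0) { /* if not 0, try again */ }
}

/* Hypothetical delay helper: busy-wait for about n loop iterations. */
static void spin_delay(unsigned n)
{
    for (volatile unsigned i = 0; i < n; i++) { }
}

/* Test&set with exponential backoff between failed attempts. */
static void tas_backoff_lock(struct spinlock *l)
{
    unsigned delay = BACKOFF_K;
    while (test_and_set(l) != 0) {
        spin_delay(delay);
        if (delay < BACKOFF_MAX)
            delay *= BACKOFF_C; /* grow geometrically, up to the cap */
    }
}

/* Test-and-test&set: spin with ordinary loads, then try test&set. */
static void ttas_lock(struct spinlock *l)
{
    for (;;) {
        while (atomic_load(&l->held) != 0) { /* read-only spin */ }
        if (test_and_set(l) == 0)
            return; /* this attempt won; losers go back to testing */
    }
}

/* Release: write 0 to the lock word. */
static void spin_unlock(struct spinlock *l)
{
    atomic_store(&l->held, 0);
}

/* Single-threaded usage example: returns the final lock word. */
static int spinlock_demo(void)
{
    struct spinlock l;
    atomic_init(&l.held, 0);
    tas_lock(&l);
    assert(test_and_set(&l) == 1); /* already held, so t&s reports 1 */
    spin_unlock(&l);
    tas_backoff_lock(&l);
    spin_unlock(&l);
    ttas_lock(&l);
    spin_unlock(&l);
    return atomic_load(&l.held);
}
```

The read-only inner loop in `ttas_lock` is the point of the enhancement: waiters hit their cached copy until the release invalidates it, so failed attempts generate far fewer bus transactions than in `tas_lock`.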
