SLIDE 1

Distributed Termination, Global Snapshots and Parallel Scientific Computing
Dr Vladimir Z. Tosic
Term 2 2020

SLIDE 2

Complete your myExperience and shape the future of education at UNSW.

Click the link in Moodle

  • or login to myExperience.unsw.edu.au (use z1234567@ad.unsw.edu.au to login)

The survey is confidential; your identity will never be released. Survey results are not released to teaching staff until after your results are published.

SLIDE 3

MAIN TOPICS IN THE LAST LECTURE… (BEN-ARI TEXTBOOK CHAPTER 12)

  • Fault tolerance and inconsistent information in distributed systems – the problem of consensus
  • Byzantine Generals algorithm explanation
  • Byzantine Generals algorithm examples and demo in DAJ
  • King algorithm explanation and examples

SLIDE 4

MAIN TOPICS IN THIS LECTURE… (BEN-ARI TEXTBOOK CHAPTER 11)

  • Global properties in a distributed system – the problem of consistency
  • Distributed termination using the Dijkstra-Scholten and credit recovery algorithms
  • Global snapshots and the Chandy-Lamport algorithm
  • (Briefly; not in our textbook) Parallel programming in scientific computing and the Gravitational N-Body Problem

SLIDE 5

WEEK 8 HW CLARIFICATIONS (ATOMICITY IN RICART-AGRAWALA)

From Chapter 10 in Ben-Ari’s Textbook

SLIDE 6

RICART-AGRAWALA ALGORITHM – COMPLETE (1/3)

SLIDE 7

RICART-AGRAWALA ALGORITHM – COMPLETE (2/3)

SLIDE 8

RICART-AGRAWALA ALGORITHM – COMPLETE (3/3)

SLIDE 9

RICART-AGRAWALA ALGORITHM – PROMELA FOR Main

SLIDE 10

RICART-AGRAWALA ALGORITHM – PROMELA FOR Receive

SLIDE 11

DISTRIBUTED TERMINATION – INTRODUCTION

From Chapter 11 in Ben-Ari’s Textbook and materials by G.R. Andrews

SLIDE 12

GLOBAL PROPERTIES IN A DISTRIBUTED SYSTEM (DS)

  • DS conundrum 1: determining time and synchronising clocks
  • DS conundrum 2: information in a node changes while “state” information is collected among multiple nodes
  • Therefore: not studying simultaneity in DS, but consistency – unambiguous accounting of the state of the system
  • 1. Distributed termination – determine whether computations in all nodes have terminated
  • 2. (Consistent) snapshot – unambiguously account each message to a particular node/channel

SLIDE 13

TERMINATION – BROADER PERSPECTIVE

  • Termination is an important liveness property of programs that are intended to terminate
  • Sequential programs do not terminate if they diverge (i.e. do not converge) and run forever
  • Concurrent programs can also deadlock (incl. livelock)
  • Thus: termination = convergence + deadlock-freedom
SLIDE 14

THE NEED FOR TERMINATION DETECTION ALGORITHMS

  • Termination is a property of the union of states of all individual processes and all message channels (“global state”)
  • As the “global state” of a distributed system is not visible to a single node, it is not easy to know when all processes have terminated
  • Even when all nodes are idle, there might be messages in transit (sent but not yet received) that will unblock receiving nodes
  • Several approaches are possible; we will study the Dijkstra-Scholten algorithm and mention some others

SLIDE 15

DISTRIBUTED TERMINATION – DIJKSTRA-SCHOLTEN ALGORITHM

From Chapter 11 in Ben-Ari’s Textbook
SLIDE 16

DIJKSTRA-SCHOLTEN ALGORITHM – ASSUMPTIONS

  • Change to previous DS assumptions: not every 2 nodes have to be connected directly; nodes only have to form a directed graph
  • Termination algorithm is additional (to regular computations) statements executed when sending/receiving messages
  • Assume a special environment node – no incoming edges, all other nodes can be accessed from it, initiates DS by sending messages (all other nodes inactive), responsible for reporting termination
  • Node begins computation after receiving 1st message (on any edge), eventually terminates, but can restart on receiving a new message

SLIDE 17

DISTRIBUTED SYSTEM WITH ENVIRONMENT NODE AND BACK EDGES

  • Assume: for every regular edge from i to j there is a back edge from j to i carrying a special type of message called a signal
  • Assume: each node is at all times able to receive, process and send signals

SLIDE 18

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIM. VERS. DATA STRUCTURES

  • Requirement: for every received message, signal back to the source
  • inDeficit_i[E]: difference between number of messages received on incoming edge E of node i and number of signals sent back
  • inDeficit_i: sum of inDeficit_i[E] for ALL edges of node i
  • outDeficit_i: difference between number of messages sent on ALL outgoing edges of node i and number of signals received back
  • When a node terminates it no longer sends messages, but it can continue sending signals as long as inDeficit_i[E] ≠ 0 for any edge E
  • DS termination when, for the environment node, outDeficit_env = 0

SLIDE 19

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIMINARY V. (1/3): SEND/RECEIVE

  • Additions to regular sending and receiving of ALL messages
SLIDE 20

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIM. VERS. (2/3): SIGNALS

  • Additional new processes (blocked except when conditions true)
  • send signal does not send the final signal while the node is active! // note this!

SLIDE 21

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIM. VERS. (3/3): ENVIRONMENT

SLIDE 22

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIM. V. CORRECTNESS / LIVENESS

  • For simplicity of proofs only, assume communication is synchronous
  • Whether synchronous or asynchronous does not impact correctness, as we assumed that all asynchronous messages are received eventually
  • Lemma 11.1: Invariants inDeficit_i ≥ 0 and outDeficit_i ≥ 0 at each node i; Σ_{i∈nodes} inDeficit_i = Σ_{i∈nodes} outDeficit_i
  • Theorem 11.2: If the system terminates, the environment node eventually announces termination
  • Task for you: Try doing this proof yourself, then read the solution from the textbook (page 242)

SLIDE 23

DIJKSTRA-SCHOLTEN ALGORITHM – PRELIM. VERS. IS NOT SAFE

  • node1 sends to node2 and node3, which then send to each other
  • inDeficit2=2, inDeficit2[e2]=1, inDeficit3=2, inDeficit3[e3]=1
  • By p5 and p6, both node2 and node3 signal node1, so it will have outDeficit1=0 and will announce termination before it occurs!
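The premature-announcement scenario above can be replayed with plain counters. This is an illustrative Python sketch, not the textbook's pseudocode; the edge names are hypothetical.

```python
# Replay of the unsafe scenario: node1 sends to node2 and node3, which then
# send to each other. The preliminary version lets a node signal on ANY edge
# with a positive deficit, so node1's outDeficit can hit 0 too early.
out1 = 0                                    # outDeficit of node1
in2 = {"e_from_1": 0, "e_from_3": 0}        # inDeficit2 per incoming edge
in3 = {"e_from_1": 0, "e_from_2": 0}        # inDeficit3 per incoming edge

out1 += 2; in2["e_from_1"] += 1; in3["e_from_1"] += 1  # node1 -> node2, node3
in3["e_from_2"] += 1                                   # node2 -> node3
in2["e_from_3"] += 1                                   # node3 -> node2

# Both node2 and node3 choose to signal node1 first (allowed by p5 and p6):
in2["e_from_1"] -= 1; out1 -= 1             # node2 signals node1
in3["e_from_1"] -= 1; out1 -= 1             # node3 signals node1

assert out1 == 0                            # node1 would announce termination...
assert in2["e_from_3"] + in3["e_from_2"] == 2  # ...while deficits remain!
```

The final version below fixes exactly this: the signal on the parent edge must be sent last.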

SLIDE 24

DIJKSTRA-SCHOLTEN ALGORITHM – VIRTUAL SPANNING TREE

  • Source of 1st message to arrive at a node is this node’s parent
  • Node i waits for: signals from all its children, outDeficit_i = 0, and its own termination; then sends its last signal to its parent
  • Variable parent stores the parent edge (or -1 if it is still unknown)
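The parent rule can be sketched as a tiny hand-driven simulation. This is an illustrative Python sketch under simplifying assumptions (synchronous delivery, computation modelled as explicit calls); the names are hypothetical, not the textbook's code.

```python
from collections import defaultdict

class DSNode:
    def __init__(self, name):
        self.name = name
        self.parent = None                    # edge the 1st message arrived on
        self.in_deficit = defaultdict(int)    # per incoming edge
        self.out_deficit = 0

nodes = {n: DSNode(n) for n in ("env", "n1", "n2")}

def send(src, dst):
    nodes[src].out_deficit += 1
    if nodes[dst].parent is None:
        nodes[dst].parent = src               # first message fixes the parent
    nodes[dst].in_deficit[src] += 1

def signal(node, edge):
    nodes[node].in_deficit[edge] -= 1
    nodes[edge].out_deficit -= 1

def finish(node):
    # node's computation ended: drain non-parent deficits first, then (only
    # once outDeficit is 0) send the last signal to the parent and detach
    n = nodes[node]
    for edge in list(n.in_deficit):
        while edge != n.parent and n.in_deficit[edge] > 0:
            signal(node, edge)
    assert n.out_deficit == 0, "children must signal first"
    while n.in_deficit[n.parent] > 0:
        signal(node, n.parent)                # last signal goes to the parent
    n.parent = None

send("env", "n1"); send("n1", "n2")           # env activates n1, n1 activates n2
finish("n2")                                  # n2 signals its parent n1
finish("n1")                                  # now outDeficit_1 = 0: signal env
assert nodes["env"].out_deficit == 0          # environment announces termination
```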

SLIDE 25

DIJKSTRA-SCHOLTEN ALGORITHM – FINAL VERSION (1/2)

  • Note: no sending of messages before 1st message received
SLIDE 26

DIJKSTRA-SCHOLTEN ALGORITHM – FINAL VERSION (2/2)

// reset parent; new parent possible if re-activated
// note this!
// last signal always to parent!

SLIDE 27

DIJKSTRA-SCHOLTEN ALGORITHM – PARTIAL SCENARIO

SLIDE 28

DIJKSTRA-SCHOLTEN ALGORITHM – DATA STRUCTURES AFTER PARTIAL SCENARIO

  • 1 ⇒ 2 in the table means: node1 sends message to node2
  • (parent, inDeficit[E], outDeficit) at each node (Es in order of nodes)
  • In the figure: outDeficit within node in (), inDeficit on edges
  • Task for you: add signals and decisions to terminate (DTs)

SLIDE 29

DIJKSTRA-SCHOLTEN ALGORITHM – PARTIAL SCENARIO SOLUTION

SLIDE 30

DIJKSTRA-SCHOLTEN ALGORITHM – CORRECTNESS / SAFETY

  • For non-environment node: parent ≠ -1 ⇔ node is active
  • Lemma 11.3: inDeficit_i = 0 ⇒ outDeficit_i = 0 is invariant at each non-environment node i
  • Lemma 11.4: parent variables define a spanning tree of active nodes with the environment node at root; inDeficit_i ≠ 0 for each active node
  • Theorem 11.2: If the environment node announces termination, the system has terminated
  • Task for you: Try doing these proofs yourself, then read the solutions from the textbook (page 246)

SLIDE 31

DIJKSTRA-SCHOLTEN ALGORITHM – PERFORMANCE

  • Problem: the number of additional signals = the number of messages
  • Can be HUGE overhead when a big distributed system shuts down
  • Improvement: sending 1 signal instead of N signals on the same edge
  • Improvement: initialising all parent vars to point to the environment node
  • Task for you: Examine textbook (page 247) pseudocode for these improvements
  • Another problem: when deficit count is more than max integer
  • Solution: credit recovery algorithms
SLIDE 32

DISTRIBUTED TERMINATION – CREDIT RECOVERY ALGORITHMS

From Chapter 11 in Ben-Ari’s Textbook
SLIDE 33

CREDIT RECOVERY ALGORITHMS – MAIN IDEAS AND LIMITATIONS

  • Environment node starts with weight W=1.0, other nodes with W=0.0
  • Every time a message is sent, ½ of the weight stays at the sender and ½ of the weight is “borrowed” to the receiver
  • Active node has W>0.0; when terminated, it sends all its weight back to the environment node (which awaits W=1.0 to announce termination)
  • In Mattern’s algorithm: weight received by an active node is returned immediately to the environment node
  • Problem: W becomes very small in big distributed systems
  • Solutions: storing only negative exponent of 2; various data structures
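The weight-halving rule fits in a few lines. This is an illustrative Python sketch with hypothetical names; exact Fractions are used only so the demo sidesteps the tiny-weight problem noted above (a real system would use one of the listed solutions).

```python
from fractions import Fraction

class CRNode:
    def __init__(self, weight=Fraction(0)):
        self.weight = weight

def send(sender, receiver):
    half = sender.weight / 2      # half stays, half is "borrowed" to receiver
    sender.weight -= half
    receiver.weight += half

def terminate(node, env):
    env.weight += node.weight     # all weight goes back to the environment node
    node.weight = Fraction(0)

env, a, b = CRNode(Fraction(1)), CRNode(), CRNode()
send(env, a)          # env: 1/2, a: 1/2
send(env, b)          # env: 1/4, b: 1/4
send(a, b)            # a: 1/4,  b: 1/2
terminate(a, env)
terminate(b, env)
assert env.weight == 1            # W == 1.0: announce termination
```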
SLIDE 34

CREDIT RECOVERY ALGORITHM FOR ENVIRONMENT NODE

SLIDE 35

CREDIT RECOVERY ALGORITHM FOR NON-ENVIRONMENT NODE (1/2)

SLIDE 36

CREDIT RECOVERY ALGORITHM FOR NON-ENVIRONMENT NODE (2/2)

  • Note: non-environment node never receives signal
SLIDE 37

DISTRIBUTED TERMINATION DETECTION IN RING NETWORKS

Material from G.R. Andrews and K. Engelhardt

SLIDE 38

TERMINATION DETECTION IN A RING (NOT IN TEXTBOOK) – PRELIMINARIES

  • T[1:n] are processes (tasks) forming a ring, ch[1:n] are channels – T[i] receives from ch[i], sends to ch[i%n+1]
  • Assume messages received by neighbour in the order sent
  • Idle process – terminated or waiting at receive statement (but process is active if waiting at I/O)
  • After receiving a message, idle process becomes active
  • Distributed program has terminated if every process is idle AND no messages are in transit

SLIDE 39

TERMINATION DETECTION IN A RING (NOT IN TEXTBOOK) – MAIN IDEAS

  • 1 token is passed in special messages over the comm. channels used in regular distributed system communications
  • Process passes a token when it is idle (this continues even after a process terminated)
  • Process that receives token is also idle (otherwise it would process regular communications)
  • Thus, upon receiving token a process sends it to its neighbour and then waits to receive another message
  • After token passes the whole circle, every process is idle and no messages are in transit (due to their ordering)
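The token's full circle can be condensed into a few lines. A minimal Python sketch, assuming real processes and channels are reduced to a list of idle flags sampled as the token arrives at each process; names are hypothetical.

```python
def token_circle(idle):
    """idle[i] says whether process T[i+1] is idle when the token reaches it.
    Returns True when the token survives a full circle: by the argument above,
    every process is then idle and no messages are in transit."""
    for is_idle in idle:
        if not is_idle:          # an active process does not pass the token on
            return False
    return True

assert token_circle([True, True, True, True])       # token completes the circle
assert not token_circle([True, True, False, True])  # T[3] still active
```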

SLIDE 40

DISTRIBUTED MUTUAL EXCLUSION (REVISITED) – NIELSEN-MIZUNO

From Chapter 10 in Ben-Ari’s Textbook

SLIDE 41

NIELSEN-MIZUNO TOKEN-PASSING ALGORITHM FOR DME

  • Nielsen-Mizuno token-passing algorithm for DME is based on passing a small token in a set of virtual spanning trees implicitly constructed by the algorithm
  • Requires understanding of virtual spanning trees, explained for the Dijkstra-Scholten distributed termination algorithm
  • Optional task for you: Revisit textbook Chapter 10 and independently study the Nielsen-Mizuno token-passing algorithm for distributed mutual exclusion (DME)

SLIDE 42

NIELSEN-MIZUNO ALGORITHM – EXAMPLE DISTRIBUTED SYSTEM

  • Assumption: fully connected DS (direct links between all nodes)

SLIDE 43

NIELSEN-MIZUNO ALGORITHM EXAMPLE – SPANNING TREE

  • An arbitrary spanning tree with directed edges pointing to the root
  • Root node (double lines in fig.) has token and is possibly in its CS
  • Nielsen-Mizuno token-passing algorithm is based on passing a small token in a set of implicitly constructed virtual spanning trees

SLIDE 44

NIELSEN-MIZUNO ALGORITHM EXAMPLE – STEPS (1/3)

  • Message format: (request, senderID, originatorID)
  • Aaron wants to enter CS: sends (request, Aaron, Aaron) to Becky
  • Aaron also sets parent ← 0 to become future root node
  • Becky relays message (request, Becky, Aaron) to Chloe and sets own parent to Aaron (parent always set to senderID)
slide-45
SLIDE 45

NIELSEN-MIZUNO ALGORITHM EXAMPLE – STEPS (2/3)

45

  • Chloe is in CS, so sets deferred to Aaron and parent to Becky
  • Generally: root sets its deferred to originatorID, its parent to senderID
  • Evan wants to enter CS, but Chloe is no longer root and simply relays the message to Becky and sets its parent to Danielle
  • Chain of relays until (request, Becky, Evan) arrives at Aaron
  • Aaron is root node without token

SLIDE 46

NIELSEN-MIZUNO ALGORITHM EXAMPLE – STEPS (3/3)

  • Aaron sets deferred to Evan and parent to Becky
  • deferred vars implicitly define queue of processes waiting to enter CS
  • When Chloe exits CS, sends token to Aaron, which enters its CS
  • When Aaron exits CS, sends token to Evan, which enters its CS
SLIDE 47

NIELSEN-MIZUNO ALGORITHM – PSEUDOCODE VARIABLES

  • Very memory efficient: only 3 variables needed
  • Note: holding ← false when entering CS (true only in root outside CS)
  • Messages are very small: request has 2 parameters, token has 0
  • Relatively efficient: request messages might be relayed through many nodes, but token messages sent directly to originatorID
  • Task for you: Write statements that initialise parent fields
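The three per-node variables (parent, deferred, holding) and the request rule can be exercised on the slides' example. An illustrative Python sketch only, assuming messages are modelled as direct calls rather than channels; it is not the textbook's pseudocode.

```python
# Spanning tree from the example: Aaron -> Becky -> Chloe; Chloe is root
# and currently inside her CS (so holding is False even at the root).

class NMNode:
    def __init__(self, name):
        self.name = name
        self.parent = None          # None marks the (future) root here
        self.deferred = None        # originator queued behind this node
        self.holding = False        # True only in root outside CS

nodes = {n: NMNode(n) for n in ("Aaron", "Becky", "Chloe")}
nodes["Aaron"].parent, nodes["Becky"].parent = "Becky", "Chloe"

def request(node, sender, originator):
    if node.parent is None:                      # reached the current root
        if node.holding:
            node.holding = False                 # token would go to originator
        else:
            node.deferred = originator           # root busy: remember the waiter
    else:
        request(nodes[node.parent], node.name, originator)  # relay upwards
    node.parent = sender                         # sender becomes new parent

def want_cs(node):
    old_parent, node.parent = node.parent, None  # become the future root
    request(nodes[old_parent], node.name, node.name)

want_cs(nodes["Aaron"])                          # (request, Aaron, Aaron) to Becky
assert nodes["Becky"].parent == "Aaron"          # relays flip edges toward Aaron
assert nodes["Chloe"].parent == "Becky"
assert nodes["Chloe"].deferred == "Aaron"        # Aaron queued for the token
```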
SLIDE 48

NIELSEN-MIZUNO ALGORITHM – PSEUDOCODE FOR Main

SLIDE 49

NIELSEN-MIZUNO ALGORITHM – PSEUDOCODE FOR Receive

SLIDE 50

GLOBAL SNAPSHOTS – CHANDY-LAMPORT ALGORITHM

From Chapter 11 in Ben-Ari’s Textbook
SLIDE 51

GLOBAL SNAPSHOT – DEFINITIONS

  • Global snapshot – a consistent recording of states of all nodes and edges (channels) in a distributed system
  • Node state – values of internal variables and sequences of messages that this node sent and received
  • Edge (channel) state – sequence of messages sent on it but not yet delivered
  • For snapshot to be consistent, each message must be in exactly 1 of: sent and in transit OR received
  • It is NOT required that all info is gathered at the same time

SLIDE 52

CHANDY-LAMPORT ALG. – MESSAGES ON A CHANNEL AND SENDING A MARKER

  • Chandy-Lamport algorithm assumption: all messages are delivered in the order sent, i.e. using FIFO channels
  • Marker – additional message (on each edge), boundary between messages sent before and after snapshot
  • Snapshot if node2 records state before marker: node1 state before sending m12, node2 state after receiving m9, edge state m10, m11 (edge state recorded by receiving node)
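How the receiver recovers the edge state from a FIFO stream can be sketched directly from this picture: node2 has processed up to m9 when node1's marker, preceded by m10 and m11 and followed by m12, arrives. Illustrative Python only; the function name is hypothetical.

```python
def channel_state(stream, last_processed):
    """Messages after `last_processed` but before the marker form the edge
    state; everything after the marker belongs to the next epoch."""
    marker_pos = stream.index("MARKER")
    before_marker = stream[:marker_pos]
    start = before_marker.index(last_processed) + 1
    return before_marker[start:]

stream = ["m9", "m10", "m11", "MARKER", "m12"]
assert channel_state(stream, "m9") == ["m10", "m11"]  # recorded edge state
```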

SLIDE 53

CHANDY-LAMPORT ALGORITHM FOR GLOBAL SNAPSHOTS (1/2)

SLIDE 54

CHANDY-LAMPORT ALGORITHM FOR GLOBAL SNAPSHOTS (2/2)

  • Notes: first the environment node sends markers along all its edges; states of all nodes are sent back using an additional algorithm (not shown)

// array assignment
// array assignment

SLIDE 55

CHANDY-LAMPORT ALGORITHM – STATES OF NODES AND EDGES

  • 1st marker received initiates recording of state (union of node states)
  • For outgoing edges state is: the number of last sent message
  • For incoming edge on which 1st marker received, no additional state (all messages received before 1st marker are part of node’s state)
  • For other incoming edges, state is: difference between last message received before recording node’s state (messageAtRecord) and last message received before marker on THIS edge (messageAtMarker)
  • When marker received on ALL edges, node records state, including: stateAtRecord[E], messageAtRecord[E], messageAtRecord[E]+1 to messageAtMarker[E] if messageAtRecord[E] ≠ messageAtMarker[E]
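The messageAtRecord/messageAtMarker bookkeeping for one incoming edge E reduces to a range. A minimal Python sketch whose names mirror the slide's variables; it is not the course's code.

```python
def edge_snapshot(message_at_record, message_at_marker):
    """Edge E's recorded state: message numbers messageAtRecord[E]+1 up to
    messageAtMarker[E] (empty when the two counters are equal)."""
    return list(range(message_at_record + 1, message_at_marker + 1))

assert edge_snapshot(9, 11) == [10, 11]   # marker arrived after m10, m11
assert edge_snapshot(7, 7) == []          # nothing in transit on this edge
```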

SLIDE 56

CHANDY-LAMPORT ALG. EXAMPLE – MESSAGES AND MARKERS

  • First all 3 messages sent from node1 and received at node2
  • Then, 3 messages sent from node1 but not yet received at node3
  • Then, 3 messages sent from node2 but not yet received at node3
  • What happens next (!) is shown in the following tables
SLIDE 57

CHANDY-LAMPORT ALG. EXAMPLE – PART 1/2

ls = lastSent, lr = lastReceived, st = stateAtRecord, rc = messageAtRecord, mk = messageAtMarker (empty or unchanged data structures omitted); ⇒ send, ⇐ receive, M = marker

SLIDE 58

CHANDY-LAMPORT ALG. EXAMPLE – PART 2/2

SLIDE 59

CHANDY-LAMPORT ALGORITHM – CORRECTNESS (1/2)

  • Theorem 11.6: If the environment node initiates a snapshot, eventually the algorithm displays a consistent snapshot
  • Note: the recorded consistent snapshot need not have actually occurred during computation (as different nodes recorded state at different times)
  • Proof:
  • Every node sends marker to its children, so in a connected graph every node gets marker and records state
  • Examine program structure to check whether recording of message m from node i to j is consistent – there are 4 cases (see the next slide)

SLIDE 60

CHANDY-LAMPORT ALGORITHM – CORRECTNESS (2/2)

  • 1. m sent before marker & received before j recorded state – m in lastReceived, so m is only in state of j
  • 2. m sent before marker & received after j recorded state – m in lastReceived but not in messageAtRecord; when marker eventually received, m will be up to messageAtMarker, so m will be only in state of edge from i to j
  • 3. m sent after marker & received before j recorded state – impossible because j recorded state when it received marker on FIFO channel
  • 4. m sent after marker & received after j recorded state – m will not be recorded in state of j nor in state of edge from i to j

SLIDE 61

CHANDY-LAMPORT ALG. – EXAMPLE OF CONSISTENT BUT NOT ACTUAL SNAPSHOT

“There are two nodes, p and q. p records its state before sending any messages and then sends a marker to q followed by message 1. Before receiving the marker, q sends message 2 to p which is received. q now receives the marker and records its state, sending a marker to p. The recorded state is: p has sent no messages; q has sent message 2 and p records that message 2 is in the channel. But this state never occurred, because message 2 was not in the channel when p had yet to send a message. Nevertheless, the snapshot is consistent.”

Source: K.M. Chandy and L. Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems, 3(1):63-75, 1985.

SLIDE 62

PARALLEL SCIENTIFIC COMPUTING

Material from K. Engelhardt and G.R. Andrews

SLIDE 63

PARALLEL SCIENTIFIC COMPUTING

  • Variety of applications: computational physics, weather forecasting, designing airplanes (or cars), …
  • Driving developments in high-performance computing
  • Goal for leveraging parallel systems: speedup on large problems (or solving an even larger problem)
  • Starting with a good algorithm and optimized sequential code
  • However, the problem solution must be parallelisable

SLIDE 64

SPEEDUP AND EFFICIENCY

  • T1 – time for a sequential program on 1 processor
  • TN – time for a parallel program on N processors
  • Speedup: T1 / TN
  • Efficiency: speedup / N
  • Aiming for linear speedup: speedup = N, efficiency = 1
  • Typically achieved sublinear: speedup < N, efficiency < 1
  • In rare cases (due to cache effects) superlinear: speedup > N, efficiency > 1
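The two definitions above can be computed directly (illustrative Python with hypothetical measured times):

```python
def speedup(t1, tn):
    """Speedup = T1 / TN."""
    return t1 / tn

def efficiency(t1, tn, n):
    """Efficiency = speedup / N."""
    return speedup(t1, tn) / n

# Hypothetical measurements: 100 s sequentially, 25 s on 8 processors.
assert speedup(100.0, 25.0) == 4.0          # 4x faster
assert efficiency(100.0, 25.0, 8) == 0.5    # sublinear: efficiency < 1
```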

SLIDE 65

IMPEDIMENTS TO SPEEDUP

  • Inherently sequential parts (e.g. I/O)
  • Load imbalance: some processors busy, some not
  • Synchronisation overhead: process creation, communication, critical sections, delays, fork/join, …
  • Amdahl’s law: For P – fraction that can be parallelised, maximum speedup with ∞ processors is: 1 / (1 − P)
  • Maximum speedup with N processors: 1 / ((1 − P) + P/N)
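Amdahl's law as stated above is a one-line function (illustrative Python; the numbers in the asserts are hypothetical):

```python
def amdahl(p, n):
    """Maximum speedup with parallelisable fraction p and n processors:
    1 / ((1 - p) + p / n); as n -> infinity this tends to 1 / (1 - p)."""
    return 1.0 / ((1.0 - p) + p / n)

# With 90% parallelisable work, 10 processors give only ~5.26x...
assert abs(amdahl(0.9, 10) - 1.0 / 0.19) < 1e-9
# ...and no number of processors can beat 1 / (1 - 0.9) = 10x.
assert abs(amdahl(0.9, 10**9) - 10.0) < 1e-6
```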

SLIDE 66

3 COMMON PROBLEM CLASSES IN PARALLEL SCIENTIFIC COMPUTING

  • Some deterministic vs. some stochastic problems/solutions
  • Grid method applications: weather, fluid/air flow, plasma, …
  • Particle computation applications: gravity, electrical charge, …
  • Matrix computation applications: engineering, economics, …

SLIDE 67

GRID METHOD

  • Numerical solutions to partial differential equations and in other applications (e.g. signal processing)
  • Domain decomposition: divide area into blocks or strips of points; assign a worker process to each
  • Approximating solution for continuous systems using a finite number of points and iterative numerical methods
  • E.g. the finite-difference method in Laplace’s equation
  • Iterative techniques (slow to fast): Jacobi iteration, Gauss-Seidel, successive over-relaxation (SOR, red/black), multigrid
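One Jacobi iteration for Laplace's equation makes the grid method concrete: each interior point becomes the average of its four neighbours. A sequential illustrative Python sketch; the parallel variant would assign strips of points to workers as described above.

```python
def jacobi_step(grid):
    """One Jacobi sweep over a rectangular grid (list of rows).
    Boundary points are held fixed; interior points are averaged."""
    n, m = len(grid), len(grid[0])
    new = [row[:] for row in grid]               # copy keeps boundaries intact
    for i in range(1, n - 1):
        for j in range(1, m - 1):
            new[i][j] = (grid[i-1][j] + grid[i+1][j] +
                         grid[i][j-1] + grid[i][j+1]) / 4.0
    return new

# Hypothetical grid: bottom boundary held at 100, the rest at 0.
g = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [100.0, 100.0, 100.0]]
assert jacobi_step(g)[1][1] == 25.0   # average of 0, 100, 0, 0
```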

SLIDE 68

PARTICLE COMPUTATION – THE GRAVITATIONAL N-BODY PROBLEM

  • Modelling discrete systems consisting of moving particles that interact by exerting forces on each other
  • A body (i.e. particle) is characterised by:
  • Gravitational force causes the bodies to accelerate and to move
  • Gravity’s magnitude is symmetric: F = G · m_i · m_j / r²
  • Gravity’s direction is a vector from one body to another
SLIDE 69

THE GRAVITATIONAL N-BODY PROBLEM – SIMULATION

  • Simulated by stepping through discrete instants of time
  • At each step, calculate total force (sum of forces by all other bodies) on each body and update its acceleration (a_i = F / m_i), velocity (using a differential equation) and position (an integral)

initialise bodies;
for [time = start to finish by DT] {
  calculate forces;
  move bodies;
}

  • Simplifying assumptions: all bodies on one spatial plane (2D), constant acceleration during the time interval, leapfrog scheme (½ change in position due to old velocity, ½ due to new velocity)
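One body of the loop above ("calculate forces; move bodies") can be sketched for a 1D two-body case. An illustrative Python sketch with G = 1 and simple Euler updates, not the course's code; it omits the leapfrog half-steps for brevity.

```python
def step(bodies, dt):
    """bodies is a list of (mass, position, velocity) tuples on a line.
    Phase 1: total force on each body from all others; phase 2: move."""
    forces = []
    for i, (m_i, x_i, _) in enumerate(bodies):
        f = 0.0
        for j, (m_j, x_j, _) in enumerate(bodies):
            if i != j:
                r = x_j - x_i
                f += m_i * m_j / r**2 * (1 if r > 0 else -1)  # attract along r
        forces.append(f)
    # move bodies: a = F / m, then update position (old v) and velocity
    return [(m, x + v * dt, v + (f / m) * dt)
            for (m, x, v), f in zip(bodies, forces)]

bodies = [(1.0, 0.0, 0.0), (1.0, 2.0, 0.0)]   # two unit masses at distance 2
bodies = step(bodies, 0.1)
assert bodies[0][2] > 0 and bodies[1][2] < 0  # they accelerate toward each other
```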

SLIDE 70

THE GRAVITATIONAL N-BODY PROBLEM – PARALLELISATION

  • Sequential algorithm: O(n²) for n bodies
  • Various approaches to parallelisation possible: shared memory vs. message passing (manager/workers, heartbeat, pipeline, …)
  • Divide bodies among workers and use a barrier after each phase
  • Basic algorithm: Compute all pairs of forces – still O(n²)
  • Minimise overheads: Create workers once, avoid critical sections by separating calculate phase and move phase, load balancing
  • Improved algorithms: Barnes-Hut: hierarchical, O(n log n); Fast Multipole Method (FMM): hierarchical, O(n log n) or better

SLIDE 71

THE GRAVITATIONAL N-BODY PROBLEM – COMPARING PARALLEL SOLUTIONS

  • Even when implementing the same mathematical calculations, concurrent/parallel programs can differ in several ways:
  • ease of programming,
  • load balancing,
  • number of messages,
  • size of messages,
  • amount of locally stored data, …
  • G.R. Andrews’s textbook (hyperlink) critically compares 3 message passing programs (manager/workers, heartbeat, pipeline)

SLIDE 72

MATRIX COMPUTATION

  • Solving systems of linear equations
  • Dense vs. sparse matrices
  • Gaussian elimination, lower-upper (LU) decomposition
  • In general: data parallelism (processes do same thing on different parts of data, execute synchronously in lock step) instead of task parallelism (processes run independently, execute asynchronously)
  • Examples of data parallelism: transputer, GPU
SLIDE 73

NEXT TIME… (PREVIEW HIGHLIGHTS)

From additional material NOT in the textbook!

SLIDE 74

MAIN TOPICS IN THE NEXT LECTURE… (BY LIAM O’CONNOR-DAVIS)

  • Course revision
  • Additional topics clarifying some important issues
  • Exam preparation!

SLIDE 75

Complete your myExperience and shape the future of education at UNSW.

Click the link in Moodle

  • or login to myExperience.unsw.edu.au (use z1234567@ad.unsw.edu.au to login)

The survey is confidential; your identity will never be released. Survey results are not released to teaching staff until after your results are published.