Transactional Memory Gokarna Sharma and Costas Busch Louisiana - PowerPoint PPT Presentation

Towards Load Balanced Distributed Transactional Memory. Gokarna Sharma and Costas Busch, Louisiana State University. Euro-Par'12, August 31, 2012

Distributed Transactional Memory (DTM)
• Transactions run on network nodes
• They request shared objects distributed over the network, for either reading or writing
• Reads and writes on shared objects are supported through three operations:
  – Publish
  – Lookup
  – Move

Suppose the object ξ is at the predecessor node and another node is the requesting node.
Data-flow model: transactions are immobile and the objects are mobile.
[Figure: a requesting node and the predecessor node holding ξ]

Lookup operation: replicates the object to the requesting node. The requester receives a read-only copy of ξ while the main copy stays where it is. Several requesting nodes can each hold a read-only copy.

Move operation: relocates the object explicitly to the requesting node. The requester's copy becomes the main copy and the previous main copy is invalidated. Read-only copies, if any, are invalidated as well.
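The effect of the three operations on a single object can be sketched as follows. This is a minimal illustration that ignores the network entirely; the class name `SharedObject`, its fields, and the node names are assumptions made for the example, not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass
class SharedObject:
    """Toy model of one shared object in the data-flow DTM model."""
    value: object
    owner: str                                  # node holding the main copy
    readers: set = field(default_factory=set)   # nodes holding read-only copies

    def lookup(self, node: str) -> object:
        """Replicate: give `node` a read-only copy; the main copy stays put."""
        if node != self.owner:
            self.readers.add(node)
        return self.value

    def move(self, node: str) -> None:
        """Relocate the main copy to `node` and invalidate read-only copies."""
        self.readers.clear()
        self.owner = node


def publish(value: object, creator: str) -> SharedObject:
    """Publish: the creating node announces the object and holds the main copy."""
    return SharedObject(value=value, owner=creator)


obj = publish("xi", creator="a")
obj.lookup("b")
obj.lookup("c")
print(obj.owner, sorted(obj.readers))   # a ['b', 'c']
obj.move("d")
print(obj.owner, sorted(obj.readers))   # d []
```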

General routing: choose paths from sources to destinations.
[Figure: three source-destination pairs (u1, v1), (u2, v2), (u3, v3)]
Routing in DTM: the source node of the predecessor request in the total order is the destination of a successor request.

Edge congestion C_edge: maximum number of paths that use any edge.
Node congestion C_node: maximum number of paths that use any node.

Stretch = (length of chosen path) / (length of shortest path)
Example: between u and v the chosen path has length 12 and the shortest path has length 8, so stretch = 12/8 = 1.5.

Oblivious Routing Each request path choice is independent of other request path choices

Problem Statement
• Given a d-dimensional mesh and a finite set of operations $R = \{r_0, r_1, \ldots, r_l\}$ on an object ξ
• Design a DTM algorithm that:
  – Minimizes congestion $C = \max_e |\{ i : p_i \ni e \}|$ on any edge e
  – Minimizes total communication cost $A(R) = \sum_{i=1}^{l} |p_i|$ for all the operations
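The two objectives can be computed directly for a given set of paths. The sketch below assumes a 2-dimensional mesh with integer coordinates and uses simple one-bend (x first, then y) paths purely as sample input; it is not the MultiBend path selection, and all function names are illustrative.

```python
from collections import Counter


def one_bend_path(src, dst):
    """Node sequence from src to dst on a 2-D mesh: along x first, then along y."""
    (x, y), (tx, ty) = src, dst
    path = [(x, y)]
    while x != tx:
        x += 1 if tx > x else -1
        path.append((x, y))
    while y != ty:
        y += 1 if ty > y else -1
        path.append((x, y))
    return path


def edges(path):
    """Undirected edges of a node path, each as a frozenset of its two endpoints."""
    return [frozenset(e) for e in zip(path, path[1:])]


def congestion(paths):
    """C = max over edges e of the number of paths that use e."""
    load = Counter(e for p in paths for e in set(edges(p)))
    return max(load.values(), default=0)


def total_cost(paths):
    """A(R) = sum of path lengths, measured in edges."""
    return sum(len(p) - 1 for p in paths)


requests = [((0, 0), (3, 2)), ((0, 0), (3, 0)), ((1, 0), (3, 3))]
paths = [one_bend_path(s, t) for s, t in requests]
print(congestion(paths), total_cost(paths))   # 3 13
```

All three sample paths share the edges (1,0)-(2,0) and (2,0)-(3,0), which is why the congestion is 3 even though the total cost is small.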

Related Work
Protocol | Stretch | Network Kind | Runs on
Arrow [DISC'98] | O(S_ST) = O(D) | General | Spanning tree
Relay [OPODIS'09] | O(S_ST) = O(D) | General | Spanning tree
Combine [SSS'10] | O(S_OT) = O(D) | General | Overlay tree
Ballistic [DISC'05] | O(log D) | Constant-doubling dimension | Hierarchical directory with independent sets
Spiral [IPDPS'12] | O(log^2 n log D) | General | Hierarchical directory with sparse covers
➢ D is the diameter of the network kind
➢ S_* is the stretch of the tree used

Limitations and Motivations
• These protocols only minimize stretch; they cannot control congestion
• Congestion can also be a major bottleneck and may affect the overall performance of the algorithm
• A natural question is whether stretch and congestion can be controlled simultaneously
• Congestion and stretch cannot be minimized simultaneously in arbitrary networks

Our Contributions
• MultiBend DTM algorithm for mesh networks
• For a 2-dimensional mesh, MultiBend has both stretch and (edge) congestion O(log n)
• For a d-dimensional mesh, MultiBend has stretch O(d log n) and (edge) congestion O(d^2 log n)
• For fixed d,
  – stretch is within an O(log log n) factor of optimal, and
  – congestion is within an O(1) factor of optimal

In the Remaining…
• Model
• General Approach
• Analogy to a Distributed Queue
• Hierarchical decomposition for MultiBend
• MultiBend Analysis
  – Stretch
  – Congestion
• Discussion

Model
• Mesh network G = (V, E) of n reliable nodes
• One shared object
• Nodes receive-compute-send atomically
• Nodes are uniquely identified
• Node u can send to node v if it knows v
• One node executes one request at a time

General Approach

Hierarchical clustering Network graph

Hierarchical clustering Alternative representation as a hierarchy tree with leader nodes

At the lowest level (level 0), every node is a cluster. Each cluster at each level has a directory, which holds a downward pointer if the object's location is known.

A Publish operation
➢ Assume that the owner node is the creator of ξ and invokes the Publish operation
➢ Nodes know their parent in the hierarchy
1. Send the request to the leader.
2. Continue the up phase, setting a downward pointer at each level while going up.
3. Root node found: stop the up phase.
4. The Publish operation is successful; the owner becomes the predecessor node for the next request.

Supporting a Move operation
➢ Initially, nodes point downward to the object owner (the predecessor node) due to the Publish operation
➢ Nodes know their parent in the hierarchy
1. Send the request to the leader node of the cluster one level up in the hierarchy.
2. Continue the up phase until a downward pointer is found, setting a downward path while going up.
3. Downward pointer found: start the down phase, discarding the old path while going down.
4. Predecessor reached: the object is moved from the predecessor node to the requesting node.
Lookup is similar, except that the directory structure does not change and only a read-only copy of the object is sent.
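The up and down phases of Publish and Move can be sketched on an explicit leader tree. This is a simplified single-object, sequential model; the `Directory` class, the `parent` map, and the node names are assumptions for illustration, and concurrency, Lookup, and message passing are left out.

```python
class Directory:
    """Toy hierarchical directory over a leader tree given as node -> parent."""

    def __init__(self, parent):
        self.parent = parent   # the root maps to None
        self.down = {}         # downward pointers: node -> child toward the owner
        self.owner = None

    def publish(self, leaf):
        """Up phase to the root, setting a downward pointer at every ancestor."""
        child, node = leaf, self.parent[leaf]
        while node is not None:
            self.down[node] = child
            child, node = node, self.parent[node]
        self.owner = leaf

    def move(self, leaf):
        """Return (predecessor, nodes visited) and make `leaf` the new owner."""
        if self.owner is None:
            raise RuntimeError("object has not been published")
        visited = [leaf]
        child, node = leaf, self.parent[leaf]
        while node not in self.down:      # up phase: lay a new downward path
            self.down[node] = child
            visited.append(node)
            child, node = node, self.parent[node]
        visited.append(node)              # first node with a downward pointer
        nxt = self.down[node]             # old pointer, toward the predecessor
        self.down[node] = child           # redirect toward the requester
        while True:                       # down phase: discard the old path
            visited.append(nxt)
            if nxt == self.owner:
                break
            nxt = self.down.pop(nxt)
        predecessor, self.owner = self.owner, leaf
        return predecessor, visited


tree = {"a1": "A", "a2": "A", "b1": "B", "A": "R", "B": "R", "R": None}
d = Directory(tree)
d.publish("a1")
print(d.move("b1"))   # ('a1', ['b1', 'B', 'R', 'A', 'a1'])
print(d.move("a2"))   # ('b1', ['a2', 'A', 'R', 'B', 'b1'])
```

Each Move finds the requester that came immediately before it, which is the behavior the distributed queue analogy relies on.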

Distributed Queue Analogy

[Figure, five frames: a distributed queue with head and tail pointers over the requesting nodes u, v, and w, drawn beneath the hierarchy root]

Results on Mesh Networks

Type-1 Mesh Decomposition 2-dimensional mesh

Type-1 Mesh Decomposition

Type-2 Mesh Decomposition

Decomposition for a $2^3 \times 2^3$ 2-dimensional mesh.
Hierarchy levels, from lower to higher: (i,1), (i,2), (i+1,1), (i+1,2)

MultiBend Hierarchy
• Find a predecessor node via multi-bend paths for each leaf node u
  – by visiting the leaders of all the clusters that contain u, from level 0 to the root level
[Figure: multi-bend paths p(u) and p(v) from leaves u and v up to the root]

MultiBend Hierarchy (2)
• The hierarchy guarantees:
  (1) For any two nodes u, v, their multi-bend paths p(u) and p(v) meet at level $\min\{h, \log(\mathrm{dist}(u,v)) + 2\}$
  (2) $\mathrm{length}(p_i(u))$ is at most $2^{i+3}$

(Canonical) downward paths
[Figure: two views of the hierarchy with the root, leaves u and v, and paths p(u) and p(v); p(v) is marked as a (canonical) downward path]

Load Balancing
• Through a leader election procedure
  – Every time we access the leader of a sub-mesh, we replace it with another leader chosen uniformly at random among its nodes
• The directory is updated appropriately by updating parent and child leaders
  – Locking may be needed in concurrent executions
• The update cost is low in comparison to the cost of serving requests
• This step is necessary to control congestion
  – With a fixed leader, edge congestion can be O(l), where l is the number of requests
• If the congestion requirement can be relaxed by a factor of ρ, the leader change is needed only after every ρ requests
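The leader rotation can be sketched as below. This is a minimal model of a single sub-mesh; the `SubMesh` class and its parameter names are assumptions, the directory update of parent and child leaders is omitted, and the random choice may pick the current leader again, which a real implementation might exclude.

```python
import random
from collections import Counter


class SubMesh:
    """A cluster whose leader is re-elected after every `rho` accesses."""

    def __init__(self, nodes, rho=1, rng=None):
        self.nodes = list(nodes)
        self.rho = rho                      # rho = 1: re-elect on every access
        self.rng = rng or random.Random()
        self.leader = self.rng.choice(self.nodes)
        self.accesses = 0

    def access(self):
        """Return the leader that serves this request, then maybe re-elect."""
        served_by = self.leader
        self.accesses += 1
        if self.accesses % self.rho == 0:
            # New leader chosen uniformly at random among the sub-mesh nodes.
            self.leader = self.rng.choice(self.nodes)
        return served_by


mesh = SubMesh(range(16), rho=1, rng=random.Random(0))
load = Counter(mesh.access() for _ in range(16000))
print(max(load.values()), "requests at the busiest leader, out of", 16000)
```

With a fixed leader the busiest node would serve all 16000 requests; with rotation the load is spread over the 16 nodes, which is the effect the congestion analysis depends on.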

Analysis of MultiBend

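Analysis of (Move) Stretch
Assume a sequential execution R of l+1 Move requests, where r_0 is an initial Publish request.
[Figure: requests r_0, r_1, r_2, …, r_l issued at the leaves (nodes y, w, x, u, v), with the requests that reach each hierarchy level 0, 1, 2, …, k, …, h]
Lower bound on the optimal cost: $A^*(R) \ge \max_{1 \le k \le h} (S_k - 1)\, 2^{k-1}$
Upper bound on the MultiBend cost: $A(R) \le \sum_{k=1}^{h} (S_k - 1)\, 2^{k+3}$
Thus, since $2^{k+3} = 16 \cdot 2^{k-1}$ and a sum of h terms is at most h times its largest term,
$\frac{A(R)}{A^*(R)} \le \frac{\sum_{k=1}^{h} (S_k - 1)\, 2^{k+3}}{\max_{1 \le k \le h} (S_k - 1)\, 2^{k-1}} \le \frac{16\,h \max_{1 \le k \le h} (S_k - 1)\, 2^{k-1}}{\max_{1 \le k \le h} (S_k - 1)\, 2^{k-1}} = 16h = O(\log n)$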
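The last step of the stretch argument, that the sum over levels is at most 16h times the maximum term, can be checked numerically. The sketch below treats the values S_k as arbitrary integers of at least 2, since the slide does not define them further; the function name is illustrative.

```python
import random


def stretch_bounds(S):
    """S[k-1] holds S_k for k = 1..h; return (upper bound on A, lower bound on A*)."""
    h = len(S)
    upper = sum((S[k - 1] - 1) * 2 ** (k + 3) for k in range(1, h + 1))
    lower = max((S[k - 1] - 1) * 2 ** (k - 1) for k in range(1, h + 1))
    return upper, lower


rng = random.Random(0)
for _ in range(1000):
    h = rng.randint(1, 12)
    S = [rng.randint(2, 50) for _ in range(h)]
    upper, lower = stretch_bounds(S)
    assert upper <= 16 * h * lower   # ratio of the two bounds is at most 16h
print("upper/lower never exceeded 16h")
```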

Analysis of (Edge) Congestion
Assume M_1 is a type-1 submesh, and let e be an edge on the way between M_1 and M_2.
• A sub-path uses edge e with probability 2/m_l
• P′: set of paths from M_1 to M_2 or vice versa
• C′(e): congestion caused by P′ on e
• E[C′(e)] ≤ 2|P′|/m_l
• B ≥ |P′|/out(M_1)
• out(M_1) ≤ 4m_l
• C* ≥ B, and |P′| ≤ B · out(M_1) ≤ 4 B m_l, so E[C′(e)] ≤ 8B ≤ 8C*

Analysis of (Edge) Congestion (2)
• M_1 at level (i,2) is always completely contained in M_2 at level (i+1,2)
• There are log n + 2 levels
• E[C(e)] ≤ 8C* (log n + 2)
• Considering type-2 submeshes
  – exactly one type-2 submesh lies between every two type-1 submeshes
  – the type-2 submeshes may not be proper subsets of type-1 submeshes
  – there are 4 possible type-2 submesh choices (a path may bend at most 2 times)
  – this increases the load by a factor of 4
Thus, using a standard Chernoff bound, C = O(C* log n), w.h.p.

d -Dimensional Mesh

Extensions to d-dimensional mesh
• The 2-dimensional decomposition can be directly generalized to a d-dimensional mesh
  – Problem: the congestion becomes O(2^d log n)
• Another decomposition is used to keep congestion at O(C* d^2 log n) and stretch at O(d log n)
• We set an appropriate λ and shift the type-1 submeshes by (j-1)λ nodes in each dimension to get type-j submeshes
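The shifted decomposition can be illustrated by computing which submesh contains a node. This is a sketch under several assumptions not stated on the slide: submeshes are axis-aligned cubes of a given side, the shift moves the grid lines toward the origin, boundary truncation is ignored, and the 2-dimensional example takes λ to be half the side length. The function name is hypothetical.

```python
def submesh_id(node, side, j=1, lam=0):
    """Index of the type-j submesh of the given side that contains `node`.

    Type-1 submeshes tile the mesh starting at the origin; type-j submeshes
    are the same tiling shifted by (j - 1) * lam nodes in every dimension.
    """
    shift = (j - 1) * lam
    return tuple((c + shift) // side for c in node)


u, v = (3, 3), (4, 4)                     # neighbors across a type-1 boundary
print(submesh_id(u, side=4), submesh_id(v, side=4))                 # (0, 0) (1, 1)
print(submesh_id(u, side=4, j=2, lam=2), submesh_id(v, side=4, j=2, lam=2))  # (1, 1) (1, 1)
```

In this example the two nearby nodes fall into different type-1 submeshes but share a type-2 submesh of the same side, which suggests why shifted submeshes are useful for nodes close to a type-1 boundary.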
