
Co-design and Verification of an Available File System - PowerPoint PPT Presentation

Co-design and Verification of an Available File System. Mahsa Najafzadeh, Marc Shapiro, and Patrick Eugster. File system replication: low latency, high availability, fault tolerance.


  1. Co-design and Verification of an Available File System
     Mahsa Najafzadeh, Marc Shapiro, and Patrick Eugster

     File System Replication
     • Low latency
     • High availability
     • Fault tolerance
     [Figure: Pictures and Tools directories replicated across sites]

     POSIX File Systems vs. Distribution
     POSIX:
     • Assumes operations occur in a total order
     • Requires a synchronous, strong consistency model
     • Synchronisation is costly and not available under partition
     • In practice, concurrency conflicts are rare
     Distribution:
     • No synchronisation: processes an update locally, propagates effects to other replicas later
     • Weakens consistency and causes conflicts

     Conflict Example: removing a directory while adding a file into the directory
     • One replica removes the Pictures directory
     • Another replica concurrently adds a photo (IMG_1234.jpg) to Pictures
     • Result: an Update/Remove conflict
     [Figure: Pictures and Tools directories at each replica, before and after the two operations]

  2. Safety
     • Convergent: do replicas that delivered the same updates have the same state?
     • Is the invariant preserved?
       • Sequential: a single operation in isolation maintains the invariant
       • Concurrent execution maintains the invariant

     Tree Invariant
     • Has a fixed root node
     • Root is an ancestor of every node in the tree (reachability)
     • Every node that has a name has exactly one parent, except the root
     • No cycle in the directory structure
     • Unique names within a directory

     Example: sequential move operation fails
     • mvDir(C,A) applied without a precondition: the effect u_eff violates the invariant I (✘)
     [Figure: tree with root, C, A, B before and after mvDir(C,A)]

     Example: do not move directory under self
     • Precondition u_PRE of mvDir(C,A): C is NOT an ancestor of A, i.e. ¬(C ↓* A)
     • C ↓* A: A is reachable from C
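The tree invariant and the mvDir precondition above can be sketched on a toy model. This is only an illustration under stated assumptions: the parent-map representation and the names `is_ancestor`, `holds_invariant`, `mv_dir_pre`, and `mv_dir_eff` are hypothetical and do not come from the slides, and names are treated as globally unique for simplicity.

```python
from typing import Dict, Optional

ROOT = "root"

# Hypothetical representation: each node name maps to its parent's name.
# Using dict keys as names makes names globally unique (a simplification).
Tree = Dict[str, Optional[str]]


def is_ancestor(tree: Tree, a: str, d: str) -> bool:
    """True if d is reachable from a (a ↓* d); a node reaches itself."""
    node: Optional[str] = d
    seen = set()
    while node is not None and node not in seen:
        if node == a:
            return True
        seen.add(node)
        node = tree.get(node)
    return False


def holds_invariant(tree: Tree) -> bool:
    """Root has no parent and every node reaches the root, so there is no cycle."""
    if tree.get(ROOT, "missing") is not None:
        return False
    return all(is_ancestor(tree, ROOT, n) for n in tree)


def mv_dir_pre(tree: Tree, src: str, dst: str) -> bool:
    """Precondition of mvDir(src, dst): src is not an ancestor of dst."""
    return not is_ancestor(tree, src, dst)


def mv_dir_eff(tree: Tree, src: str, dst: str) -> Tree:
    """Effect of mvDir(src, dst): re-parent src under dst in a copy of the tree."""
    new_tree = dict(tree)
    new_tree[src] = dst
    return new_tree


# Assumed example tree: root -> C -> {A, B}
tree: Tree = {ROOT: None, "C": ROOT, "A": "C", "B": "C"}
```

In this sketch, `mv_dir_pre(tree, "C", "A")` is false because C is an ancestor of A, and applying the effect anyway yields a state that fails `holds_invariant`.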

  3. Example: concurrent moves fail
     • Replica r1 executes mvDir(B,A); its precondition mvDir_PRE: ¬(B ↓* A) holds at r1 (B is NOT an ancestor of A)
     • Replica r2 concurrently executes mvDir(A,B)
     • Once both effects are applied, the invariant is violated (I ✘)
     • B ↓* A: A is reachable from B
     [Figure: trees with root, A, B at replicas r1 and r2, before and after the two moves]

     Concurrency Control
     • Tokens ≈ concurrency control abstractions
     • Tokens = { τ, … }
     • Conflict relation ⋈ ⊆ Tokens × Tokens
     • Example, mutual exclusion tokens: Tokens = { τ }; τ ⋈ τ
     • An operation's generator may acquire a set of tokens
     • Operations associated with conflicting tokens cannot be concurrent
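A minimal sketch of the token abstraction described above, assuming tokens are plain strings and the conflict relation ⋈ is stored as a set of unordered pairs. The names `CONFLICTS`, `tokens_conflict`, and `may_run_concurrently` are hypothetical.

```python
from typing import FrozenSet, Set

Token = str

# Conflict relation ⋈ as a set of unordered token pairs. The single
# mutual-exclusion token "tau" conflicts with itself (τ ⋈ τ).
CONFLICTS: Set[FrozenSet[Token]] = {frozenset({"tau"})}


def tokens_conflict(t1: Token, t2: Token) -> bool:
    """True if t1 ⋈ t2 according to CONFLICTS."""
    return frozenset({t1, t2}) in CONFLICTS


def may_run_concurrently(held_u: Set[Token], held_v: Set[Token]) -> bool:
    """Two operations may be concurrent only if no pair of their tokens conflicts."""
    return not any(tokens_conflict(a, b) for a in held_u for b in held_v)
```

With this relation, two operations that both acquire `"tau"` cannot be concurrent, while an operation that acquires no token is never blocked.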

  4. Example: moving a directory while updating its content is safe
     • Replica r1 executes mvDir(B, A)
     • Replica r2 concurrently executes addFile(f, B)
     • After both effects are delivered, both replicas hold the same tree, with f inside B
     [Figure: trees with root, A, B, f at replicas r1 and r2, before and after the two operations]

     When is Synchronization Necessary?
     • CAP theorem: either (Strong) Consistency or Availability, not both, when Partitions occur
     • This is a design trade-off
     Our approach:
     • Synchronize (CP) only operations where strictly necessary for safety
     • Other operations are asynchronous (AP)
     Safety = convergent + invariants
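The claim that mvDir(B, A) and addFile(f, B) are safe to run concurrently can be checked on a toy state by applying the two effects in both orders. This is a sketch, not the authors' model: the parent-map state and the helper names `mv_dir` and `add_file` are assumptions made for illustration.

```python
from typing import Callable, Dict, Optional

# Hypothetical state: node name -> parent name (files and directories alike).
State = Dict[str, Optional[str]]
Effect = Callable[[State], State]


def mv_dir(src: str, dst: str) -> Effect:
    """Effect that re-parents directory src under dst."""
    def eff(state: State) -> State:
        new_state = dict(state)
        new_state[src] = dst
        return new_state
    return eff


def add_file(name: str, parent: str) -> Effect:
    """Effect that creates file `name` inside directory `parent`."""
    def eff(state: State) -> State:
        new_state = dict(state)
        new_state[name] = parent
        return new_state
    return eff


initial: State = {"root": None, "A": "root", "B": "root"}
u = mv_dir("B", "A")      # issued at replica r1
v = add_file("f", "B")    # issued concurrently at replica r2

state_r1 = v(u(initial))  # r1 applies u, then receives v
state_r2 = u(v(initial))  # r2 applies v, then receives u
```

Both orders produce the same state here, which is the commutativity needed for convergence; the file f stays attached to B wherever B ends up.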

  5. Model
     • A client submits operation u to an origin replica (r1); the other replicas (r2, r3) later apply its effect u_eff, alongside the effect v_eff of a concurrent operation v
     • Generator (@origin) reads state from one copy and maps operation u to:
       • Return value: u_val ∈ State ➞ Value
       • Precondition: u_PRE
       • Effects: u_eff ∈ State ➞ (State ➞ State)
     • Deliver (@all replicas): causally dependent messages are delivered in order
     [Figure: client, origin replica r1, and other replicas r2 and r3, with u_PRE checked at the origin and u_eff and v_eff delivered to every replica]

     A Mostly-Available, Convergent and Correct File System Design
     • Allows common file system operations to run without synchronization, except for moves
     • Maintains the tree invariant
     • Guarantees convergence using replicated data types [Shapiro + 2011]
     • Name conflicts:
       • Merge directories
       • Rename files
     • Update/Remove conflicts: add-wins directory

     Add-wins directory: removing a directory while adding a file into the directory
     • One replica removes Pictures while another adds a photo (IMG_1234.jpg) to it: an Update/Remove conflict
     • With add-wins semantics, the add takes precedence: Pictures survives and contains IMG_1234.jpg
     [Figure: Pictures and Tools directories at each replica, before and after the conflict is resolved]
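The generator/effect split in the model can be sketched as follows. The `Generated` record, the function `generate_add_file`, and the concrete precondition chosen for addFile are all assumptions for illustration; the slides only give the types of u_val and u_eff.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical state: node name -> parent name.
State = Dict[str, Optional[str]]


@dataclass(frozen=True)
class Generated:
    value: bool                        # u_val evaluated on the origin state
    effect: Callable[[State], State]   # u_eff, to be applied at every replica


def generate_add_file(origin: State, name: str, parent: str) -> Optional[Generated]:
    """Generator, run at the origin replica only.

    Reads the origin state, checks an assumed precondition (the parent
    exists and the name is free), and returns the value and the effect.
    """
    if parent not in origin or name in origin:
        return None  # precondition fails: nothing is generated

    def effect(state: State) -> State:
        new_state = dict(state)
        new_state[name] = parent
        return new_state

    return Generated(value=True, effect=effect)


r1: State = {"root": None, "B": "root"}   # origin replica
r2: State = dict(r1)                      # another replica

generated = generate_add_file(r1, "f", "B")
assert generated is not None
r1 = generated.effect(r1)   # applied immediately at the origin
r2 = generated.effect(r2)   # delivered and applied later at r2
```

The point of the split is that only the generator reads state and checks the precondition; the effect is a pure state transformer that every replica applies on delivery.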

  6. CISE Analysis: Proves Application is Correct
     • Rely-Guarantee reasoning for a causally-consistent system with only polynomial complexity
     • Consists of three analysis rules:
       • Effector Safety: every effect executed in isolation maintains the invariant I (sequential safety)
       • Commutativity: concurrent operations commute (convergence)
       • Stability: preconditions are stable under concurrency (concurrent safety)
     • If satisfied: the invariant I is guaranteed in every possible execution
     [Gotsman et al. POPL 2016 ’Cause I’m Strong Enough: Reasoning about Consistency Choices in Distributed Systems]

     Effector Safety
     Example: move requires precondition
     • mvDir(C,A): starting from a state that satisfies I and u_PRE, the effect u_eff must yield a state that satisfies I
     • Precondition: do not move a directory under itself
     [Figure: tree with root, C, A, B before and after mvDir(C,A), invariant holding on both sides]

     Stability Rule: precondition is stable under concurrent effect
     1. Effector Safety: u_eff preserves I when executed in any state σ satisfying u_PRE
     [Figure: at r1, the precondition of u holds in state σ and u_eff preserves I; at r2, u_eff is applied after a concurrent v_eff, and whether I still holds is the open question]
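The stability rule can be illustrated by testing, on one concrete state, whether a precondition still holds after a concurrent effect is applied. This is a single-state check under assumed helper names (`is_ancestor`, `mv_dir_pre`, `apply_mv_dir`, `apply_add_file`, `pre_survives`); an actual analysis has to establish the property for all states, which this sketch does not attempt.

```python
from typing import Callable, Dict, Optional

# Hypothetical state: node name -> parent name.
State = Dict[str, Optional[str]]


def is_ancestor(state: State, a: str, d: str) -> bool:
    """True if d is reachable from a (a ↓* d)."""
    node: Optional[str] = d
    seen = set()
    while node is not None and node not in seen:
        if node == a:
            return True
        seen.add(node)
        node = state.get(node)
    return False


def mv_dir_pre(state: State, src: str, dst: str) -> bool:
    """Precondition of mvDir(src, dst): src is not an ancestor of dst."""
    return not is_ancestor(state, src, dst)


def apply_mv_dir(state: State, src: str, dst: str) -> State:
    new_state = dict(state)
    new_state[src] = dst
    return new_state


def apply_add_file(state: State, name: str, parent: str) -> State:
    new_state = dict(state)
    new_state[name] = parent
    return new_state


def pre_survives(state: State,
                 pre: Callable[[State], bool],
                 concurrent_eff: Callable[[State], State]) -> bool:
    """If pre holds in `state`, it must still hold after the concurrent effect."""
    return (not pre(state)) or pre(concurrent_eff(state))


sigma: State = {"root": None, "A": "root", "B": "root"}
u_pre = lambda s: mv_dir_pre(s, "B", "A")          # precondition of mvDir(B, A)

stable_vs_move = pre_survives(sigma, u_pre, lambda s: apply_mv_dir(s, "A", "B"))
stable_vs_add = pre_survives(sigma, u_pre, lambda s: apply_add_file(s, "f", "B"))
```

Here `stable_vs_move` is false, since a concurrent mvDir(A, B) makes B an ancestor of A, while `stable_vs_add` is true, matching the slides' observation that only concurrent moves need coordination.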

  7. Stability Rule: precondition is stable under concurrent effect
     1. Effector Safety: u_eff preserves I when executed in any state σ satisfying u_PRE
     2. Precondition Stability: u_PRE will hold when u_eff is applied at any replica
     • Question to check: is u_PRE preserved after executing a concurrent v?
     [Figure: at r1, u_PRE holds in state σ and u_eff preserves I; at r2, v_eff is applied first, and u_eff preserves I only if u_PRE still holds]

     Example: avoid conflicting moves
     • r1: mvDir(B,A) acquires { τ(B), τ(A) } (✔)
     • r2: mvDir(A,B) would need { τ(A), τ(B) }, which conflict with the tokens held at r1 (✘)
     • τ(A) ⋈ τ(A) and τ(B) ⋈ τ(B)
     • Add tokens, avoid mvDir || mvDir
     • A mutually exclusive token for each directory d ∈ Dir: τ(d) ⋈ τ(d)

     Necessary and Sufficient Concurrency Controls for Move
     • Tokens (T) are placed on directories in the tree relative to LCA(A,B), the least common ancestor of A and B
     [Figure: tree with root, A, B and tokens T marked on the nodes involved in the move]
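One possible reading of the LCA-based token scheme is sketched below: a move acquires tokens for the source, the destination, and their ancestors strictly below the least common ancestor. Whether the LCA itself needs a token is not recoverable from the slides, so excluding it is an assumption, as are the example tree and the names `path_to_root`, `lca`, `move_tokens`, and `moves_conflict`.

```python
from typing import Dict, List, Optional, Set

# Hypothetical tree: directory name -> parent name.
Tree = Dict[str, Optional[str]]


def path_to_root(tree: Tree, node: str) -> List[str]:
    """Nodes from `node` up to the root, inclusive."""
    path: List[str] = []
    cur: Optional[str] = node
    while cur is not None:
        path.append(cur)
        cur = tree[cur]
    return path


def lca(tree: Tree, a: str, b: str) -> str:
    """Least common ancestor of a and b."""
    ancestors_of_a = set(path_to_root(tree, a))
    for node in path_to_root(tree, b):
        if node in ancestors_of_a:
            return node
    raise ValueError("nodes are not in the same tree")


def move_tokens(tree: Tree, src: str, dst: str) -> Set[str]:
    """Directories d whose token τ(d) mvDir(src, dst) acquires (assumed scheme)."""
    stop = lca(tree, src, dst)
    tokens: Set[str] = {src, dst}
    for start in (src, dst):
        for node in path_to_root(tree, start):
            if node == stop:
                break  # assumption: the LCA itself is not locked
            tokens.add(node)
    return tokens


def moves_conflict(tokens_u: Set[str], tokens_v: Set[str]) -> bool:
    """τ(d) ⋈ τ(d): two moves conflict if they need a token on the same directory."""
    return bool(tokens_u & tokens_v)


# Assumed example tree, not taken from the slides' figure.
tree: Tree = {"root": None, "D": "root", "F": "root",
              "A": "D", "C": "D", "B": "C", "H": "F"}
```

Under these assumptions, `move_tokens(tree, "A", "B")` is `{"A", "B", "C"}`, so it conflicts with the opposite move mvDir(B, A) and cannot run concurrently with it.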

  8. Verification Results

     Applications          #OP   #Tokens   #Invariants   Anomaly            Average Time (ms)
     Sequential            7     7         1             NO                 278
     Concurrent            7     0         1             safety violation   1297
     Fully-Asynchronous    7     0         1             duplication        2350
     Mostly-Asynchronous   7     2         1             NO                 1570

     Conclusion
     • A rigorous approach for modeling file system behavior for both centralized/synchronous and replicated asynchronous semantics
     • Common operations, except move, can run without concurrency controls
     • A hierarchical least-common-ancestor concurrency control mechanism is necessary and sufficient for move operations

     Future Work
     • Translate the move concurrency controls into an efficient implementation
     • Integrate hard links, devices, and mounts into the model
     • Reason about the file system behavior in the presence of failures

     Q/A

     Backup Slides

  9. Removing Token Over Source Directory (slide dated 22/04/16)
     • r1: mvDir(A,B) holding { τ(B), τ(C) }
     • r2: mvDir(A,F) holding { τ(F) }
     [Figure: directory tree with nodes root, D, F, A, C, B, H at replicas r1 and r2, before and after the two moves]

     Removing Token Over Destination Directory
     • r1: mvDir(A,B) holding { τ(A), τ(C) }
     [Figure: directory tree with nodes root, D, F, A, C, B, H at replicas r1 and r2]

  10. Removing Token Over Destination Directory (continued)
      • r1: mvDir(A,B) holding { τ(A), τ(C) }
      • r2: mvDir(B,H) holding { τ(B), τ(A) }
      [Figure: directory tree with nodes root, D, F, A, C, B, H at replicas r1 and r2, before and after the two moves]

      Removing Token Over Ancestors up to LCA
      • r1: mvDir(A,B) holding { τ(A), τ(B) }
      • r2: mvDir(C,H) holding { τ(C), τ(H) }
      [Figure: directory tree with nodes root, D, F, A, C, B, H at replicas r1 and r2, before and after the two moves]
