
27/04/2017 Advanced Network Security 6. Self-stabilisation

Jaap-Henk Hoepman, Digital Security (DS), Radboud University Nijmegen, the Netherlands
@xotoxot // jhh@cs.ru.nl // www.cs.ru.nl/~jhh


  1. Self-stabilisation: a different failure model
     ■ Instead of (permanent) processing failures we study transient memory failures
       ● State of a node is stored in RAM, which can be changed arbitrarily
       ● Program code is stored in ROM and is never changed
     ■ The state of the network can also change transiently
       ● So we study shared memory systems to simplify the analysis
       ● But self-stabilisation in message passing systems is also possible
     [Figure: a node in the network, with its system state in RAM (subject to errors) and its program code in ROM]
     Jaap-Henk Hoepman // Radboud University Nijmegen // 7-3-2016 // Fault Tolerance - Self-stabilisation

     System model
     ■ $n$ nodes
       ● Uniform (all with the same state) or non-uniform
       ● With or without known node identifiers (stored in ROM, i.e. cannot change)
     ■ Communicating through shared memory
       ● Modelled through a graph $G = (V, E)$
       ● State reading model: $(v, w) \in E$ means $w$ can read the entire state of $v$
       ● Link register model: $(v, w) \in E$ means $v$ writes a register $R_{v,w}$ read by $w$
     ■ A configuration $C \in \mathcal{C}$ consists of the Cartesian product of the states of all nodes (and the registers on the edges).
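     The state reading model can be made concrete with a small sketch. This is only an illustration under assumptions: the names `readable_neighbours`, `local_view`, `ring_edges` and `config` are hypothetical and do not come from the slides, and node states are taken to be plain integers.

```python
from typing import Dict, List, Tuple

Node = int
Edge = Tuple[Node, Node]  # (v, w): w can read the entire state of v


def readable_neighbours(edges: List[Edge], w: Node) -> List[Node]:
    """Nodes whose state w may read in the state reading model."""
    return sorted(v for (v, dst) in edges if dst == w)


def local_view(config: Dict[Node, int], edges: List[Edge], w: Node) -> Dict[Node, int]:
    """Everything node w can observe in one step: its own state plus readable states."""
    view = {w: config[w]}
    for v in readable_neighbours(edges, w):
        view[v] = config[v]
    return view


# Oriented ring on three nodes: 0 -> 1 -> 2 -> 0 (assumed example topology).
ring_edges = [(0, 1), (1, 2), (2, 0)]

# A configuration assigns one state to every node; a transient fault may
# leave it at any combination of values.
config = {0: 5, 1: 7, 2: 7}
```

     Here `local_view(config, ring_edges, 1)` contains only the states of nodes 1 and 0, which shows why a node may be unable to judge the global configuration from what it can read.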

  2. Synchronous system
     ■ System $\langle \mathcal{C}, F \rangle$ proceeds in rounds
       ● The program of a node $i$ is a function $f_i \in F$ describing the resulting state (plus the registers it writes) given the current state (and the registers it reads)
       ● Uniform: $f_i = f_j$ for all $i, j$
       ● Known identities: $f_i(C)$ may depend on $i$
     ■ Central daemon
       ● Scheduler fairly selects one node $i$ to take a step: $C \xrightarrow{i} C'$ (i.e. $C' = f_i(C)$)
     ■ Distributed daemon
       ● Scheduler fairly selects one or more nodes $I$ to take a step: first all nodes read their own state and the states/registers they can read, then compute the new state, and then store the new state and the registers they write: $C \xrightarrow{I} C'$

     Self-stabilisation
     ■ Consider a system $S = \langle \mathcal{C}, F \rangle$
     ■ Define some set of legitimate configurations $\mathcal{L} \subset \mathcal{C}$
       ● Typically defined using a predicate
       ● These are the good configurations that we want the system to be in
       ● Note that a node may not be able to determine whether its local state is part of a global configuration that is legitimate!
     ■ System $S$ is self-stabilising to $\mathcal{L}$ if
       ● Convergence: when we start the system in an arbitrary initial configuration $C \in \mathcal{C}$ it always reaches a legitimate configuration $C' \in \mathcal{L}$ within a finite number of steps (the convergence time)
       ● Closure: once the system is in a legitimate configuration it stays in a legitimate configuration after each system step

     Some questions
     ■ Why does a self-stabilising system recover from transient memory faults?
     ■ Can a self-stabilising system terminate?
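     The difference between the central and the distributed daemon can be sketched as follows. This is a minimal illustration, not the lecture's own code: the names `central_step`, `distributed_step` and `copy_left` are hypothetical, a configuration is assumed to be a tuple of integers, and every node is assumed to run the same program.

```python
from typing import Callable, Iterable, Tuple

Config = Tuple[int, ...]
# A program maps (node id, current configuration) to the new state of that node.
Program = Callable[[int, Config], int]


def central_step(program: Program, config: Config, i: int) -> Config:
    """Central daemon: exactly one selected node i takes a step."""
    new = list(config)
    new[i] = program(i, config)
    return tuple(new)


def distributed_step(program: Program, config: Config, selected: Iterable[int]) -> Config:
    """Distributed daemon: all selected nodes read the OLD configuration,
    compute their new state, and only then write."""
    new = list(config)
    for i in selected:
        new[i] = program(i, config)  # reads 'config', never 'new'
    return tuple(new)


def copy_left(i: int, config: Config) -> int:
    """Assumed toy program: adopt the state of the previous node on a ring."""
    return config[i - 1]


after_central = central_step(copy_left, (0, 1), 0)               # nodes agree
after_distributed = distributed_step(copy_left, (0, 1), [0, 1])  # nodes swap
```

     On a two-node ring the central daemon brings the nodes into agreement, while the distributed daemon lets both nodes move at once and they merely swap values. This is one way parallel steps can prevent convergence.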

  3. Some toy problems
     ■ Suppose the system consists of a single node
       ● Then you can simply reset the node to a legitimate configuration
     ■ Suppose the system is a complete graph, and the legitimate configurations are those where all nodes have the same state
       ● The transition function simply checks this condition, and if it is false sets the state of the node to a default state
     ■ Let's add a progress condition and suppose the state of a node is a clock that needs to be in sync with all other nodes and increases in every round
       ● How to stabilise in this case… Take the majority? You can only change your own state… What about unwanted parallelism?

     Mutual exclusion on a ring (Dijkstra)
     ■ Refresher: what is mutual exclusion?
       ● Here: a token circulating on a ring
     ■ Impossibility of a symmetric solution for rings of non-prime size
       ● Non-prime: break the ring into equal-size chords of size > 1
       ● Assume symmetric states for all chords
       ● Schedule the nodes at the same location in each chord one by one or all at once: the configuration is still symmetric

     Mutual exclusion on a ring: construction
     ■ $N + 1$ nodes $0, \ldots, N$; node $0$ is special
       ● On an oriented, state reading ring: node $(i + 1) \bmod (N + 1)$ reads the state of node $i$
     ■ Each node has state $x[i] \in \{0, \ldots, K - 1\}$ for $K > N$
     ■ Protocol
       ● Node $0$: if $x[0] = x[N]$ then $x[0] \leftarrow (x[0] + 1) \bmod K$
       ● Node $i \neq 0$: if $x[i] \neq x[i - 1]$ then $x[i] \leftarrow x[i - 1]$
     ■ Privileged (i.e. has the token)
       ● Node $0$ if $x[0] = x[N]$
       ● Node $i \neq 0$ if $x[i] \neq x[i - 1]$
       ● I.e. a node is privileged if it can take a step (which turns it unprivileged)
     ■ Legitimate states $\mathcal{L}$
       ● All states where there is an $i$, $0 \le i \le N$, and a $c \in \{0, \ldots, K - 1\}$ such that for all $j$, $0 \le j \le i$, we have $x[j] = (c + 1) \bmod K$ and for all $j$, $i < j \le N$, we have $x[j] = c$
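     The ring protocol can be simulated in a few lines. This is a sketch under assumptions rather than a reference implementation: the function names `privileged`, `step` and `run` are hypothetical, the state is a Python list with one counter per node, and the central daemon is modelled as a seeded random choice among the privileged nodes.

```python
import random
from typing import List


def privileged(x: List[int]) -> List[int]:
    """Indices of nodes that hold a token, i.e. that can take a step."""
    return [i for i in range(len(x))
            if (i == 0 and x[0] == x[-1]) or (i != 0 and x[i] != x[i - 1])]


def step(x: List[int], i: int, K: int) -> List[int]:
    """Let node i execute its rule once; an unprivileged node does nothing."""
    y = list(x)
    if i == 0:
        if y[0] == y[-1]:
            y[0] = (y[0] + 1) % K   # special node: increment modulo K
    elif y[i] != y[i - 1]:
        y[i] = y[i - 1]             # other nodes: copy the predecessor
    return y


def run(x: List[int], K: int, steps: int, seed: int = 0) -> List[int]:
    """Central daemon (assumed to pick a random privileged node each step)."""
    rng = random.Random(seed)
    for _ in range(steps):
        x = step(x, rng.choice(privileged(x)), K)
    return x


# Five nodes (so the largest index is 4) and K = 6, starting from an
# arbitrary, possibly corrupted, configuration.
final = run([3, 1, 4, 1, 2], K=6, steps=200)
```

     After enough steps `privileged(final)` should contain exactly one index, meaning a single token circulates on the ring. The step count of 200 is a generous assumption for a ring this small, not a bound taken from the slides.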

  4. Proof of correctness (1)
     ■ Note: there is always at least one node enabled, so there is no deadlock
     ■ In a legitimate configuration exactly one node is enabled/privileged
     ■ Assume a central daemon, and $K \ge N$
     ■ Closure
       ● Let the system be in a legitimate configuration for some choice of $i, c$, i.e. $P(i, c)$ = "for all $j$, $0 \le j \le i$, we have $x[j] = (c + 1) \bmod K$ and for all $j$, $i < j \le N$, we have $x[j] = c$" holds.
       ● Then only $k = (i + 1) \bmod (N + 1)$ is enabled
       ● If $k \neq 0$, then after the step $x[k] = (c + 1) \bmod K$ and hence $P(k, c)$ holds
       ● If $k = 0$, then after the step $x[k] = (c + 2) \bmod K$ while for all other nodes $j \neq 0$ still $x[j] = (c + 1) \bmod K$. Hence $P(0, (c + 1) \bmod K)$ holds

     Proof of correctness (2)
     ■ Convergence
       ● Initially colour all nodes white
       ● Colour node $0$ blue the first time it takes a step; after that it stays blue forever
       ● Nodes turn blue when they copy a blue value from their counterclockwise neighbour (and then stay blue forever)
       ● Let $g$ be the number of times node $0$ takes a step while node $N$ is still white
       ● Then $g \le N$: after the first step of $0$, there are at most $N - 1$ white nodes that can provide $N$ with at most $N - 1$ new white states
       ● W.l.o.g. let $x[0]$ initially be $K - 1$; after the first step $x[0]$ becomes $0$; so after the $i$-th step, $x[0] = i - 1$
       ● Starting at $0$, we observe that all blue nodes form a (weakly) decreasing chain of values
       ● Now let $0$ be about to take the $(g + 1)$-th step (i.e. $N$ is blue). Then before that step $x[0] = g - 1$. As $g \le N \le K$ we see that $x[0]$ did not wrap around.
       ● All nodes are blue at this point, and $x[N] = x[0]$.
       ● As all nodes are now blue and have (weakly) decreasing values, we must have $x[i] = g - 1$ for all nodes
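     For a small ring, closure and convergence can also be checked exhaustively, which complements the proof without replacing it. The sketch below is an assumption-laden illustration: the names `legitimate`, `closure_holds` and `worst_case_steps` are hypothetical, and the check covers only the ring size and counter range passed in (here four nodes with four counter values).

```python
from itertools import product
from typing import Dict, Set, Tuple

Config = Tuple[int, ...]


def privileged(x: Config) -> list:
    """Indices of nodes that can take a step."""
    return [i for i in range(len(x))
            if (i == 0 and x[0] == x[-1]) or (i != 0 and x[i] != x[i - 1])]


def step(x: Config, i: int, K: int) -> Config:
    """Successor configuration when privileged node i moves."""
    y = list(x)
    y[i] = (y[0] + 1) % K if i == 0 else y[i - 1]
    return tuple(y)


def legitimate(x: Config, K: int) -> bool:
    """A prefix holds c + 1 (mod K) and the remaining suffix holds c."""
    c = (x[0] - 1) % K  # node 0 always belongs to the prefix
    return any(all(v == (c + 1) % K for v in x[:i + 1]) and
               all(v == c for v in x[i + 1:])
               for i in range(len(x)))


def closure_holds(n_nodes: int, K: int) -> bool:
    """Every legitimate configuration has one token and a legitimate successor."""
    for x in product(range(K), repeat=n_nodes):
        if legitimate(x, K):
            p = privileged(x)
            if len(p) != 1 or not legitimate(step(x, p[0], K), K):
                return False
    return True


def worst_case_steps(n_nodes: int, K: int) -> int:
    """Longest daemon-chosen path to a legitimate configuration.
    Raises ValueError if some schedule can avoid legitimacy forever."""
    memo: Dict[Config, int] = {}
    on_path: Set[Config] = set()

    def depth(x: Config) -> int:
        if legitimate(x, K):
            return 0
        if x in memo:
            return memo[x]
        if x in on_path:
            raise ValueError("cycle of illegitimate configurations")
        on_path.add(x)
        d = 1 + max(depth(step(x, i, K)) for i in privileged(x))
        on_path.discard(x)
        memo[x] = d
        return d

    return max(depth(x) for x in product(range(K), repeat=n_nodes))


closure_ok = closure_holds(4, 4)
bound = worst_case_steps(4, 4)
```

     If `worst_case_steps` returns without raising, every central-daemon schedule reaches a legitimate configuration within `bound` steps for that ring size.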
