Time Optimal Asynchronous Self-stabilizing Spanning Tree
Janna Burman and Shay Kutten
Dept. of Industrial Engineering & Management Technion, Haifa 32000, Israel. bjanna@tx.technion.ac.il, kutten@ie.technion.ac.il
Abstract. This paper presents an improved and time-optimal self-stabilizing algorithm for a major task in distributed computing- a rooted spanning tree construction. Our solution is decentralized (“truly distributed”), uses a bounded memory and is not based on the assumption that either n (the number of nodes), or diam (the actual diameter of the network), or an existence of cycles in the network are known. The algorithm assumes asynchronous and reliable FIFO message passing and unique identi?ers, and works in dynamic networks and for any network topology. One of the previous time-optimal algorithms for this task was designed for a model with coarse-grained atomic operations and can be shown not to work properly for the totally asynchronous model (with just “read” or “receive” atomicity, and “write” or “send” atomicity). We revised the algorithm and proved it for a more realistic model of totally asynchronous networks. The state in the presented algorithm does not stabilize until long after the required output does. For such an algorithm, an increased asynchrony poses much increased hardness in the proof.
1
Introduction
A system that reaches a legal state starting from an arbitrary one is called selfstabilizing [15]. The stabilization time is the time from the moment of the faults till the system reaches a legal state. The task of a directed spanning tree construction requires the marking, in each node, of some of the node’s edges such that the collection of marked edges forms a tree. Moreover, we mark in each node the edge leading to its parent on the tree. Given a spanning tree, most of the major tasks for distributed network algorithms become much easier, including the tasks of reset, broadcast, topology learning and updating, mutual exclusion, voting, committing, querying, scheduling, leader election, and others. In this extended abstract, we directly address only the task of constructing a spanning tree with diam height in O(diam) time, but this, together with the method of [9], also yield an O(diam) time reset algorithm. In the context of self stabilization, it was observed that a self stabilizing reset protocol can translate
a non-self stabilizing protocol into a self stabilizing one [4, 9, 6, 7]. Another application of a reset protocol is to translate protocols that use unbounded event counters (e.g. sequence numbers of messages) to use bounded ones [8]. These applications of the reset simplify protocols’ design. An optimal self stabilizing algorithm for constructing a spanning tree was presented in [7]. As opposed to some previous protocols, it was not based on the assumption that either n (the number of nodes) or diam (the actual diameter of the network) or an existence of cycles in the network were known. It used a bounded memory. For that, it assumed that some bound on the diameter was known. This bound may have been very large, but it did not a?ect the time complexity which was O(diam). The e?ect of the bound on the size of the memory was only polylogarithmic. The algorithm of [7], however, was designed for a model with a coarse-grained atomicity. That is, it assumed that a node could read a value of a variable written by a neighbor, and also perform an operation based on that read value in one atomic step. Only after both actions, could the neighbor change the value of its own variable. We show a scenario where that algorithm does not yield a correct result in a totally asynchronous model, that is, when an atomic operation contains either a read, or a write, but not both. In this paper, we modify the algorithm of [7] so that it functions correctly also in a fully asynchronous environment. We kept the main algorithmic ideas of [7], while adding a few tricks that may prove useful in translating also other such protocols. The main contribution of this paper is in the proof that the algorithm after the changes is correct for the more realistic asynchronous send/receive atomicity model (similar to ”Atomic Read/ Atomic Write” model, but for a message passing model). Note, that combining the algorithm in [7] with some existing e?cient transformer (e.g. [10] or [26]) to re?ne the atomicity would not have yielded optimal stabilization times in our case. When such a transformer is combined with a coarse-grained atomicity algorithm, a resulted ?ne-grained atomicity algorithm su?ers from a reduction in concurrency due to the mutual exclusion procedures used in the implementation of the transformer. This loss of concurrency results in a higher than a constant delay (up to ? (n)) between two successive atomic step executions by a particular process. Numerous self stabilizing reset and spanning tree algorithms that were less e?cient than [7] also appeared. We mention some of them below. Many (starting with [17]) championed the claim that a self stabilizing algorithm should use ?negrained atomic operations. We did not see how to use the methods used in [17] (for making their O(n) time algorithm work under ?ne-grained atomicity) to make the algorithm of [7] also work for ?ne-grained atomicity. We note that moving from a coarse-grained atomicity to ?ne-grain atomicity of operations is even impossible for some tasks in some models, and for other tasks it is tricky. This is especially tricky for the algorithm of [7], since the latter keeps multiple trees that do not stabilize in the required time O(diam), although the output in their algorithm does stabilize in O(diam) time in the coarse-grained operations 2
model. The proof requires us to reason about these not yet stabilized trees (to show that they do not prevent also the output tree from stabilizing in O(diam) time). Proving a property of a not yet stabilized tree is made more di?cult by the asynchronicity. It would have been much easier to prove that some condition holds for a node in a tree if it was certain (as it is in a coarse-grain atomicity model) that this node has not lost its parent or children without yet knowing about the loss. The current paper makes this necessary step from coarse-grained atomicity to asynchronous networks for the major tasks it solves. Other related work. Due to the lack of space, we give here a somewhat limited survey of the related work for the well-studied problem of a spanning tree construction. A thorough survey is deferred to the full paper. In [3], a spanning tree construction algorithm with the stabilization time of O(n2 ) (where n is unknown) is given. In [18], a randomized spanning tree protocol is given implicitly, with the expected stabilization time of O(diam · log n), where diam is unknown. (If n is known in advance then the stabilization time is O(diam)). In [5], the time complexity is O(diam) (where diam is unknown), but the memory space (and the length of a message) is not bounded. In [2] and [19, 13, 14], generic self-stabilizing solutions solving also the task of a spanning tree construction are given. These papers present algorithms for weaker models (with unidirectional communication links and even with unreliable communications in [13, 14]), but the time-optimal stabilization is not achieved in them. In [20], an algorithm for maintaining a spanning tree for a completely connected topology stabilizes in O(log n/loglog n) with high probability. In [24], a completely connected topology is assumed too, but the model is weaker than in [20]. In [25], a self-stabilizing algorithm for a minimum spanning tree construction is presented for an asynchronous message-passing reliable network. The time complexity in [25] depends on n (and hence, cannot be time-optimal). In addition, that algorithm of [25] requires a bound on the time that it takes to travel over a path of n nodes in the network. In [23] and [1], spanning tree algorithms for ad hoc networks with larger than O(diam) stabilization times are given. For an additional survey one can refer to [21]. Finally, we note that the algorithm presented in this paper can be combined with some hierarchial structure (e.g. [22]) to improve the stabilization time in some favorable settings (see [22]), but not in the worst case. 1.1 Notations and Model of Computation
System Model. The system topology is represented by an undirected graph G = (V, E ), where nodes represent processors and edges represent communication links. Each node has a unique identi?er denoted by ID. The number of the nodes is denoted by n = |V |. The actual diameter of the graph is denoted by diam. We assume that there is a known upper bound on the diameter of the network, denoted by D and called the bound. This upper bound serves only for the purpose of having ?nite space protocols, and does not appear in the time complexity. For v ∈ V , we de?ne N(v ) = {u | (v, u) ∈ E }, called the neighbors of v . We assume that the topology is dynamically changing- node/link addition or removal are possible (and modeled as faults). We consider an asynchronous 3
message passing network model. The message delivery time can be arbitrary, but for the purpose of time analysis only, we assume that each message is delivered in at most one time unit. On each link, messages are delivered in the same order as they have been sent. The number of messages that may be in transit on any link in each direction and at the same time is bounded by some parameter B (independent of the network size). It is necessary, as shown in [16], for self stabilization. We adopt the usual de?nitions of the following: a local state of a node (an assignment of values to the local variables and the program counter); a global state of a system (the cross product of the local states of all the nodes, plus the contents of the links); the semantics of protocol actions (possible atomic (computation) steps and their associated state changes); an execution of a protocol P (a possibly in?nite sequence of global states in which each element follows from its predecessor by the execution of a single atomic step of P). Informally, a distributed system allows the processors to execute steps concurrently; however, when processors execute steps concurrently, we assume that there is no in?uence of these concurrent steps on each other. To catch this formally, we assume that at each given time, only a single processor executes a step. Each step consists of an internal computation and a single communication operation: a send or receive. Every state transition of a process is due to a communication step execution (including the following local computations). Let us call this the send/receive atomicity network model. Fault Model. The legality predicate (de?ned on the set of global states) of our protocol becomes true when the collection of internal variables of the nodes de?nes a spanning tree with diam height, rooted at the minimal ID node. A protocol is called self-stabilizing if starting from a state with an arbitrary number of faults (or from an arbitrary state) such that no additional faults hit the system for “long enough”, the protocol reaches a legal global state eventually and remains in legal global states thereafter. When this happens, we say that the stabilization has occurred. The maximum number of time units that takes to reach the stabilization is called the stabilization time of the protocol.
2
The Algorithm
Due to the ?ne-grained communication atomicity, we use the notion of base and image variables. Each node v has a set of internal variables it writes to- the base variables. Consider some base variable of v , varv . Every neighbor u of v maintains an internal copy of varv in its corresponding image variable varu [v ] at u. The copies get their values from InfoMsg messages sent from v to u repeatedly. Node u reads varu [v ] for algorithm computations. By then, this copy can have a di?erent value then varv at v , if v has changed it meanwhile. This is the main di?culty encountered by the current paper, as opposed to [7]. 2.1 Ideas Adopted from [7] The algorithm runs multiple versions of the Bellman-Ford’s algorithm ([11], see Rule 1 bellow). When running a single version alone, a stabilized tree results 4
(this was observed by [17] for their algorithm which is similar to Bellman-Ford’s, and by [7] and others for Bellman-Ford’s algorithm itself). The stabilized tree is rooted at the minimal ID node in the network and the stabilization occurs in O(D) time when D is given. This is similar e.g. to [6, 17]. Therefore, if the bound parameter D is close to the actual diameter diam, Rule 1 is close to optimal. Unfortunately, typically, a hardwired bound will be much larger than the actual diameter, to accommodate for extreme cases, and to have room for scaling up the network size. Nevertheless, coming up with some bound is pretty easy. Hence, log D + 1 versions are run in parallel. Each version i, 0 ≤ i ≤ log D, executes the Bellman-Ford version with bound parameter 2i . Versions with ”large enough“ bound parameters 2i (the “higher versions”) will stabilize to a desired spanning tree in O(2i ) time. However, the other versions (the “lower versions”) can stabilize only in Θ(n) as demonstrated in [7]. The trick there was that one does not need the smaller i versions to stabilize. One only needs them to detect and inform each node that the version has not yet constructed the required spanning tree which contains all the nodes. This is done by the standard technique of a broadcast over a tree. That is, each version i maintains at each node two bits: up cover and down cover. Using the up cover bit, nodes report towards the root that the version tree has not yet spanned all the nodes; this information is propagated up the tree by having each node take a logical and of the up cover bits of its children repeatedly. The purpose of the down cover bit is to disseminate this fact down the tree, by having each node copy the down cover bit of its parent repeatedly. See Rule 2. Then, each node v selects its output by ?nding the minimum i such that down coverv = 1 for version i. The tree edges of that version are the output of the combined protocol. A Counter Example. In the sequel we mention some problems that prevent one from using the solution of [7] in the weaker model we address in this paper. This is demonstrated in [12]. We just mention here that the problems manifest in the trees that have not stabilized yet, when the output was supposed to have stabilized already. In an asynchronous network, such a not yet stabilized tree may “pretend” successfully to have stabilized, and to cover the network. 2.2 Revising the Algorithm
Now, we introduce three mechanisms we incorporate in the new algorithm to make the ideas above also work in the weaker model we use. ? Non-stabilization Detector: The algorithm of [7] propagated up a tree the information that a node in that tree has a neighbor not in the tree. This information turned out to be unstable, causing the counter example mentioned above. Hence, we augment the above with a new notion of local “non-stabilization detector”: if it looks as if any of the neighbors of a node v may still have to change its state, then v observes that the current con?guration is not yet stable, and propagates this observation upwards. Below, we give the formal de?nition of this idea- the consistentv predicate for each node v (see Def. 1). Now, in contrast 5
with the previous solution, the de?nitions of the rules for the up cover and down cover bits use the consistent predicate. They are given in Rule 2. ? Strict Communication Discipline: We adopt this module as is from [4] (see Sec. 2.3 below). The main property of the discipline is that before a node v may change any of its base variables, all its neighbors “know” the value of v ’s base variables (see Lem. 3). The proof of Lem. 4 uses this property heavily. ? Local Reset: We use this mechanism to ensure the following. Whenever a node v changes the values of its tree variables (line 6, Fig. 1), no neighbor u considers v as its parent (that is, paru = v ). Note, that in our model this is not a condition that v can check directly, since by the time v reads u’s variables, u may have changed them. We found this property very useful in several places in the proof. For example, whenever a node joins a tree on some speci?c path called a legal branch (see Def. 4), it joins initially with up cover = 0 (Lem. 7). This helps to prove that even though the lower versions trees do not stabilize (at least, not in O(diam) time), their up cover does stabilize to zero (Lem. 8). Lines 6-11 (Fig. 1) implement the local reset mechanism. Intuitively, the main di?culty the revised algorithm overcomes is the following. Consider a branch of a tree whose nodes have a root value of v , however, they are disconnected from v . (A formal de?nition of this structure we call a “sprig” will follow (Def. 3).). Note, that v does exist. This phenomena is di?erent than branches of a tree of a ghost root- that is, branches with nodes whose root variable contains an ID of a node that does not exist. The latter branches disappear by time Tvalid = O(2i ) for version i (see Def. 2 and Obs. 1), but new sprigs can be created by the lower versions of the algorithm up until time ? (n). We needed the above revisions in the algorithm to show that the sprigs do not confuse themselves to have up cover = 1 (Lem. 10). Nor do they confuse the nodes of the legal tree of v (nodes that are indeed connected (via a chain of parent markings) to v where v is the value of their root variable). See Lem. 8. 2.3 Algorithm Details
Variables. Graph property variables are represented using a boldface font, as in dist(v, u), which denotes the true distance in the graph between nodes v and u. Protocol variables are represented using teletype font. The variables appearing and manipulated in the code of Fig. 1 are local protocol variables of process v . The subscript v of each variable in the ?gure emphasizes that fact. Each node u ∈ V sends a set of values of its internal variables (its base variables) periodically to all its neighbors by an InfoMsgu message. Each neighbor v of u copies these values to its local copies (to simplify the code, we omit this simple operation from the algorithm code, but we assume it is performed for each message receive event). The copy of a variable varu at v (an image variable of u at v ) is denoted by varv [u]. Node u does not send its neighbors any of its image variables. Note, that for each v ∈ V, in addition to N(v ), we use a variable Nlistv which is a local list of node identities such that the incident link from v to each 6
node u in the list is believed to be operational and the processor at each such node u is also believed to be up. A detailed verbal explanation of all the variables appears in the appendix. Communication Discipline. Every node sends the InfoMsg messages repeatedly, using the communication discipline mentioned above. We treat the discipline as being embodied by a lower layer process. Whenever any InfoMsg arrives at some node v , the discipline (layer) copies the content of the message into the appropriate image variables. Then, the communication discipline can decide either to pass the message to the higher level algorithm or to discard it. If it decides to pass the message, we say that InfoMsg is accepted (then, this message is processed according the code in Fig. 1). Otherwise, we say that InfoMsg is received (and no further processing takes place). Note, that in both cases, v copies the message content (receives InfoMsg). For the complexity analysis, we de?ne a time step, which is the maximum time for a message to get accepted (it is equivalent to 3 time units). Lem. 3 states the communication discipline property formally. Algorithm Formal De?nition. The formal code for version i of the algorithm in node v appears in Fig. 1; it applies Rules 1 and 2 below. In each iteration, node v outputs its tree edges according to the lowest version in v for which up cover = 1. Rule 1. (Bellman-Ford with IDs and Bound Parameter D (used in [11, 17, 6] for a single version)) Let v be a node. 1. rv ← min {IDv , rv [u]} where u ∈ Nlistv and dv [u] < D. 1 + min {dv [u] : u ∈ Nlistv and rv = rv [u]} , if rv = IDv , 2. dv ← 0, if rv = IDv . De?nition 1. (Used in Rule 2 below) ? The parent of a node v , denoted parv , is (supposed to be) the smallest ID neighbor u of v for which rv = rv [u] and dv [u] = dv ? 1, if rv = v . Otherwise, if rv = v , the parent of v is null. ? The children of v , denoted childv , are the set {u | parv [u] = v }. ? Given a node v , the predicate consistentv is de?ned by u∈Nlistv (rv [u] = rv ) ∧ (|dv [u] ? dv | ≤ 1) ∧ (u = parv ?? u is the minimal ID node such that dv ? dv [u] = 1) ∧ (dv [u] ? dv = 1 =? parv [u] ≤ v ) ? Node v is a consistent node, if consistentv = true.
7
Rule 2. Calculate parv , childv , and consistentv such that they conform to Def. 1. ? consistentv , if childv = ? ? ? ? ? ? up coverv [u] , if childv = ? ∧ consistentv up coverv ← ? ? u ∈ child v ? ? ? 0, otherwise down coverv ← down coverv [parv ] ∧ consistentv , if rv = v up coverv ∧ consistentv , if rv = v
Procedure Send() (* sending InfoMsgv *) 1 Send InfoMsgv ≡ [ rv , dv , parv , local resetv , up coverv , down coverv ] to Nlistv Procedure LocalReset() (* performing a local reset *) 2 rv ← v, dv ← 0, down coverv ← up coverv ← 0, local resetv ← true 3 Send() Do forever: 4 Send()
(* this line is executed atomically *)
Upon accepting InfoMsgv [u] message from neighbor u ∈ Nlistv (* the following is executed atomically (not including reception) *) 5 Use Rules 1, 2 to calculate temporary variables as follows: a temporary variable t varv for each varv computed in the rules. *) *) *) *)
(* tree changes generate a local reset if [ ?local resetv ∧ (parv = t parv ∨ rv = t rv ∨ dv = t dv ) ] (* reset propagates down the tree 7 [ ?u ∈ Nlistv : local resetv [u] ∧ u = parv ] (* local reset not yet completed 8 [ local resetv ∧ ?u ∈ Nlistv : parv [u] = v ] (* candidate parent reset not yet completed 9 [ local resetv ∧ t parv = null ∧ local resetv [t parv ] ] then 10 LocalReset() 11 return end if (* a local reset exit (if local resetv switches from true to false ) 12 local resetv ← false 6 13 Update each variable varv by the value of the temporary variable t varv . 14 Send() Fig. 1. Algorithm for version i at node v .
*)
8
3
Preliminary Analysis
In the following analysis, we prove stabilization assuming that there are no faults or topological changes after some time t0 = 0 (at least till the time when algorithm reaches a global legal state). From now on, let us consider the execution of the algorithm after time t = 2 (2 time steps after time t0 ). It is easy to see that after 2 time steps, no damaged (by faults) messages exist in the network and at least one authentic InfoMsg message has been accepted at each node from each neighbor. De?nition 2. Consider a node v ∈ V and some time T . ? Let Tvalid be time t0 + 2 + 2i . ? Node v is a root node, if rv = IDv (≡ v ). ? If rv ∈ / {IDu |u ∈ V}, then rv is a ghost root. ? Let u ∈ N(v ). If at time T , varu [v ] = varv , we say that node u knows the value of varv at T . def ? The depth of v is depth(v ) = max {dist(v, u) | u ∈ V }. ? Let us denote by vmin the node with the minimum ID that exists in the network. ? Let us denote by a local reset operation of node v an invocation of the LocalReset() procedure at line 10 of the algorithm code. ? Assume that node v performs a local reset operation at time T . We denote by a local reset exit the ?rst time after time T when v executes line 12 of the algorithm code. ? Let us denote by a local reset mode the state of node v between a local reset operation and the subsequent local reset exit at v . Lem. 1 is a rather known property of Bellman-Ford’s algorithm. Its proof, as well as some of the following proofs, appear in the appendix. Lemma 1. Starting from any initial assignments of the d and r variables, for any node v , after t time steps, dv ≥ min(t, dist(rv , v )). The following observation follows from Lem. 1 and Rule 1-1. (Note, that for every ghost root v , dist(rv , v ) = ∞.) Observation 1. After time Tvalid , for every node v , rv is not a ghost root and dv ≥ dist(v, rv ). De?nition 3. Let 0 ≤ k ≤ n ? 1. Let v, xj ∈ V for each 0 ≤ j ≤ k . ? A parent path of v to x0 is a path of nodes (x0 , x1 , x2 , . . . , xk = v ) such that for each j , rxj = rv and for each 1 ≤ j ≤ k , parxj = xj ?1 . ? We say that v is a descendant of x0 and x0 is an ancestor of v . ? A connection path of v is a parent path of v to rv = x0 such that x0 is a root node. Let us denote a connection path of v (to x0 ) by Cv (x0 ). ? Node v is connected (to node x0 ) if there is a connection path of v (to x0 ). ? Let rv = x0 . If node v has no connection path, node v is disconnected (from node x0 ). ? Let a sprig of v be a maximal set of nodes X ? V that satis?es the following 9
conditions: (1) v ∈ / X ; (2) ?x ∈ X , rx = v and x is disconnected; (3) Every ancestor or descendant x of any x0 ∈ X is in X . ? Let us denote some sprig A of v at time t by Av (t). The following is one of our main lemmas. Informally, we found it harder to ensure properties for sprigs than for legal trees. We use the following lemma to show a property of sprigs that is somewhat similar to the property of the d values in a legal tree (shown in Lem. 1). Informally, the lemma shows that a node’s d is high if it remains in the same sprig for a long time (an “old” sprig). This may not hold for a node who leaves one sprig (having a high d) and joins another (with a lower d). Such a leave and join may happen even very late in the execution since new sprigs may be created very late in the execution. However, we handle such “new” sprigs later. Lemma 2. Consider nodes v, u ∈ V. Let t ≤ 2i and t1 > 2. Let α be the following set of assumptions on u: (1) node u is disconnected from ru = v ; (2) node u does not change its ru value. If α holds for u during the whole time interval [t1 , t1 + 2t], then at time t1 + 2t, du ≥ t. Proof: By induction on the time t. For t = 0, the lemma holds since du ≥ 0 always. Assume, that the lemma holds for t = k for every node. For t = k + 1, assume that α holds for some node u for the time interval [t1 , t1 +2(k +1)]. By Def. 3, node u is disconnected and is not a root. Hence, paru = null. When α holds, node u cannot change its paru . Otherwise, the condition at line 6 must hold beforehand and then, u must perform a local reset and assign ru ← u (becoming a root) in line 2- a contradiction to α. Hence, for some node w ∈ Nlist(u), w = paru throughout [t1 , t1 + 2k + 2]. Let δ be an InfoMsg message sent by w at time tsent and accepted at u at time trcv such that δ is the last such message accepted at u by time t1 + 2k + 2. By our model assumptions on communication links , trcv ≥ t1 + 2k + 1. Hence, δ is sent by w at tsent ≥ t1 + 2k . Clearly, α holds at w during [t1 , tsent ], otherwise it would not have held in u during [t1 , trcv ] since in that case, either u would have been connected (if w would have been connected) or some InfoMsgw and line 6 of the algorithm would have caused u to perform a local reset and become a root, violating α. Hence ,by the induction hypothesis, dw ≥ k at time t1 + 2k . Moreover, if dw is changed at w in [t1 , tsent ], then InfoMsgw stating this fact and line 6 of the algorithm would have caused u to perform a local reset and become a root, violating α. Thus, dw stays unchanged during [t1 , tsent ]. Hence, dw ≥ k during the whole interval [t1 , tsent ] and du [w] ≥ k at u during the whole of [t1 , t1 + 2k + 2]. Now, since paru = w during whole [t1 , t1 + 2k + 2], by Rule 1, du = du [w] + 1 during whole this interval. Thus, at time t1 + 2k + 2, du = du [w] + 1 ≥ k + 1. Hence, the lemma holds for t = k + 1 too. The following lemma is ensured by the communication discipline, which we assume for the algorithm (Sec. 2.3). The lemma is proved in [4]- Lem. 4.2 (we reworded it somewhat).
10
Lemma 3. [4] If at some time T , after time t0 + 2, some node v changes any of its base variables, then, at time T , ?u ∈ Nlist(v ), for every base variable varv of v , varu [v ] = varv (u’s image variable of varv equals varv at time T ; or in other words, node u knows the value of varv at time T ). The following lemma can be proven for our algorithm, but not for the previous one [7]. This proved to be a major reason why our algorithm can function in the atomic send / atomic receive model. Lemma 4. Assume that at some time T , some node u with paru = w assigns paru ← w. Then, at time T , ?v ∈ Nlist(u), parv = u. Proof: (see Fig. 2) To prove the lemma, we assume, by way of contradiction, that some neighbor v of u assigns1 parv ← u before or at time T and parv = u holds till time T inclusively. To assign a new value to paru , node u must execute line 13 at time T (also see line 5 and Rule 2). This requires that just before that, u was at a local reset mode and performed a local reset exit at line 12. Since u changes a base variable at time T , by Lem. 3, every neighbor v of u (and v in particular) knows at time T that u is in a local reset mode (local resetv [u] = true). Let time T ? be the ?rst time when v assigns local resetv [u] ← true such that local resetv [u] stays true until T inclusively. Recall, that we consider the execution of the algorithm after time t=2. Thus, due to line 9, it is guaranteed that v does not change its parent pointer to point at u (v cannot perform parv ← u) throughout [T ? , T ]. (A local reset operation and then its exit must precede any tree structure base variables change at any node; but, a local reset exit is impossible at v in [T ? , T ], since the condition at line 9 holds throughout this time interval.) Let T x be the time when v sends message InfoMsgv x that is the last one to be received by u from v before time T . This message must cause u to set paru [v ] = u, otherwise, by the condition at line 8, u cannot become a descendant of w at time T . Hence, at time T x , parv = u holds at v . Thus, and due to the guarantee of the time interval [T ? , T ] that we have shown above, we must assume that v assigns parv ← u (at line 13) in the time interval (T x , T ? ). When this happens, v sends an InfoMsgv (= y ) (line 14) with this new information (parv = u) to u. Recall, that by Lem. 3, node u learns at some time T ? < T ?? < T that its neighbor v knows that u is in a reset mode (at T ?? , u learns that local resetv [u] = true). Clearly, for u, to learn this, a message of the communication discipline should be sent from v at or after time T ? and should be received at u before time T . Let z be such a message. We proved above that message y is sent before T ? . Hence, and because of the FIFO assumption for each link, message y is received at u before message z , and thus, before time T ?? and before T. Since InfoMsg message y is sent after InfoMsg message x, it should arrive at u after x, but we have assumed above that x is the last InfoMsg to be received from v just before time T . Thus, y = x - a contradiction since these messages bear di?erent information about the value of parv .
1
The algorithm calculates the par pointers by invoking Rule 2 at line 5 and then assigns them at line 13 (Fig.1).
11
Fig. 2. Illustration for the proof of Lem. 4.
4
Analysis of the Lower Version Case (2i < depth(vmin ))
In this section we prove for every node in every lower version that down cover = 0 holds in O(2i ) time steps and remains thereafter. First, we prove this for legal trees and then for sprigs. To prove this for legal trees, we use the notion of a legal branch. There may be several shortest pathes between two nodes, however, only one of them is a legal branch as de?ned in the following de?nition. (This matches the way parent pointers are chosen by the algorithm. See Def. 1.) De?nition 4. Let u, v ∈ V. ? Node u is foreign to node v if after time Tvalid , ru = v always. ? Let u be foreign to v . A legal branch Rv (u) of v via u is a shortest path between v and u (x1 = v, x2 , . . . , xk = u) s.t. 1 ≤ k ≤ dist(v, u)+1 and for each 2 ≤ j ≤ k , xj ?1 is the smallest ID neighbor of xj with dist(v, xj ?1 ) = dist(v, xj )?1. Denote the length of Rv (u) by |Rv (u)|. Lemma 5. If 2i < depth(vmin ), then after time Tvalid , for each node v , there exists a node u such that u is foreign to v and either (1) dist(v, u) = 2i + 1 or (2) dist(v, u) < 2i + 1 and v > u. De?nition 5. Let v, w ∈ V. ? Let fv = u for some u as de?ned in Lem. 5. ? A zero path of node v , denoted by Zv , is a maximal parent path of nodes {x|?Cx (v ) ∧ x ∈ Rv (fv )}. A fringe node of Zv is w ∈ Zv which is the furthest node (in the number of hops) from v . Then, we also denote a zero path of v by Zv (w) and its length by |Zv (w)| (or |Zv |). Note, that every root node has a zero path (possibly containing only a root). A zero path may change dynamically in time. Nodes may join and leave a zero path. A zero path may “disappear” (if a root node stops to be a root) and “reappear” (if a node becomes a root again). The set of a zero paths stabilizes only in ? (n) in some cases. Yet, by the following lemmas 6 to 8, we show that up cover = 0 holds at every node of every zero path that exists after a certain time that is O(2i ) time after the starting time t0 . The proofs of lemmas 6 - 8 establish an induction on the order of the nodes on a zero path, starting from a fringe node neighboring to a foreign node and proceeding to the root. This induction shows that up cover stabilizes fast to zero over each legal tree (one 12
that has a root, as opposed to a sprig that is disconnected from its root). Let us comment that we needed to introduce the notion of a legal branch for these proofs. Moreover, we needed to add the check (in the consistent predicate, in Rule 2) that a branch is indeed legal when up cover is updated. The proofs of lemmas 6 - 8 appear in the appendix. Lemma 6. Let v, u ∈ V and let u be a fringe node of Zv (a zero path of v ), at some time T after Tvalid . Then, if u stays a fringe node of Zv , in at most 1 time step, at time T ≤ T 1 ≤ T + 1, up coveru ← 0. Lemma 7. Let v, u ∈ V and assume that u joins a Zv (a zero path of v ) at some time T after Tvalid (u was not in Zv just before T ). Then, at time T , up coveru ← 0. Lemma 8. Let v, u, x ∈ V and Zv (x) be a zero path of v . Let u ∈ Zv (x) be such that dist(v, u) = dist(v, fv ) ? j (for some 1 ≤ j ≤ dist(v, fv ) ≤ 2i + 1). Then, after Tvalid + j time steps, up coveru = 0. Let Tccover be the time that is Tvalid + 2 · 2i time steps after time t0 . By Lem. 8 (that establishes that after Tvalid + 2i for each root node up cover = 0), by Rule 2 (that establishes a down the tree propagation of down cover bits) and by Lem. 4 (that establishes that a node joins a tree alone), it is easy to show that down cover also stabilizes fast to zero on legal trees. Thus, Lem. 9 also follows. Lemma 9. After Tccover time for any connected node u ∈ V the following holds: down coveru = 0. To conclude the analysis for versions for which 2i ≤ depth(vmin ), we need to show that for every node u that is disconnected from ru (u is a sprig node), down coveru = 0 holds in O(2i ) time steps and remains such thereafter too. First, let us formalize all the possible modi?cations that a sprig can encounter. Note, that there may exist several sprigs of v at the same time. De?nition 6. Consider a sequence of events in the execution, and let tj be the time of the j -th event in the sequence. Consider some sprig Av (tj ). At time tj +1 , we consider sprigs that are non-empty and that have non-empty intersection of nodes with the original sprig Av (tj ). That is, we consider all the possible modi?cations of Av at time tj +1 : (1) Sprig Av (tj +1 ) = ? is one of the following: – Av (tj ). – Av (tj ) ∪ {x} (a join of node) for some x ∈ / Av (tj ). Note, that by Lem. 4, x has no descendants at tj +1 . (Also, w.l.o.g., no two events happen at the same time.) – Av (tj ) \ {y } (a loss of node) for some y ∈ Av (tj ). (2) Non-empty sprigs A1v (tj +1 ), A2v (tj +1 ), · · · such that A1v (tj +1 ) ∪ A2v (tj +1 ) ∪ · · · {z } = Av (tj ) (a split of sprig Av ) for some node z who left sprig Av at tj . Lemma 10. Let u ∈ V be disconnected from ru at some time after Tccover + 2 · 2i + 1. Then, down coveru = 0. 13
Proof: Any disconnected node u belongs to some sprig. Let ru ≡ v . Only two kinds of sprigs exist after time Tccover by Obs. 1: (1) The old sprigs- these that already exist at Tccover . We consider them old even when they get modi?ed later. Moreover, if Av (tj ) is an old sprig at some time tj ≥ Tccover , then any resulting sprig Av (tj +1 ) (see Def. 6) at time tj +1 > tj is an old sprig too. (2) The new sprigs- those that are newly created after Tccover by the following event. Let w ∈ V be a root node and X be a set of the nodes connected to w (X is a tree rooted at w). When some node x ∈ X (not a leaf node) leaves X , a new sprig (or sprigs) of w is (or are) created. When a new sprig gets modi?ed, the resulting sprig (or sprigs) is (or are) considered a new sprig (or sprigs). First, let us consider a set of the old sprigs Φ after Tccover . We show that after at most 2 · 2i +1 time steps Φ = ?. By Lem. 2, in at most 2 · 2i time steps, for any disconnected node x ∈ B ∈ Φ, dx ≥ 2i . Hence, x leaves sprig B in at most additional 1 time step by Rule 1-1, if x has not left sprig B before that. Any node y ∈ / B ∈ Φ that joins sprig B , assigns dy ← dy [z ] + 1 for z ∈ B such that pary ← z . Thus, by this and Lem. 2, for any x ∈ B ∈ Φ, at time Tccover + 2t, dx ≥ t. Hence, after at most Tccover + 2 · 2i + 1, Φ = ?. Finally, let us consider a new sprig A that is newly created at some time T > Tccover . By Lem. 9, at time T for each x ∈ A, down coverx = 0. After time T sprig A can change in time: split or join/loose nodes. Note, that no sprig has a root node and thus, by Rule 2, the down coverx bit is calculated by down coverx ← (down coverx [parx ] ∧ consistentx ). Hence, a split or a loss of nodes cannot switch the down cover bit (from 0 to 1) at the resulting sprig/s. Now, consider the case that some node y ∈ / A joins A. Node y becomes a child of some node x ∈ A. At that time, by Lem. 4, a local reset exit occurs at y (line 12) and then, at line 13, node y assigns pary ← x and by Rule 2, adopts the down coverpary which is 0 as shown above. Thus, any node that joins A adopts 0 in its down cover (recall that by Lem. 4, it joins alone). Hence, after time Tccover + 2 · 2i + 1, only new sprigs exist (beside legal trees) and every disconnected node u in such a sprig has down coveru = 0.
5
Analysis of the Higher Version Case (2i ≥ depth(vmin ))
Lem. 11 is important to prove the stabilization of the higher versions (Lem. 12). Speci?cally, it helps to show that a local reset mode in each node has ?nite duration. Lemma 11. Assume a node v performing a local reset operation at some time T . (1) Then, in at most 2 time steps after time T , unless the condition of line 9 holds, v performs line 12 of the algorithm (a local reset exit occurs at v ). (2) During the local reset mode at v (starting at time T and till the local reset exit at v ), up coverv = 0 and down coverv = 0. Lemma 12. If 2i ≥ depth(vmin ), then version i stabilizes in O(2i ). 14
Lemma 13. If 2i ≥ depth(vmin ), then in O(2i ) time, at every node v , down coverv = 1 for version i. Lem. 12 above establishes that there exists some higher version that stabilizes at time O(diam). Lem. 13 establishes that in this version down cover ← 1. Hence, the algorithm can output the tree this version produces. Recall that Sec. 4 shows that in lower versions down cover stabilizes to zero. All these establish the following theorem. Theorem 1. In O(diam) time units, the algorithm produces a shortest paths tree rooted at the minimal ID node in the network. The proofs (see the appendix) in this section are rather similar to the proofs for the higher versions in [7] except for one important point. The local reset we use here has the potential to destabilize these versions. We show that a local reset always ends. Moreover, since a local reset is transferred to children, not to parents, the reset does not destabilize the tree rooted in the minimal ID node.
References
1. S. Abbas, M. Mosbah, A. Zemmari. Distributed Computation of a Spanning Tree in a Dynamic Graph by Mobile Agents. IEEEIS’06. 2. Y. Afek, A. Bremler-Barr. Self-stabilizing Unidirectional Network Algorithms by Power-Supply. SODA’97. 3. Y. Afek, S. Kutten, M. Yung. Memory-E?cient Self-Stabilizing Protocols for General Networks. WDAG’90. 4. Y. Afek, S. Kutten, M. Yung. The Local Detection Paradigm and its Applications to SelfStabilization. TCS 97, 186(1–2). 5. S. Aggarwal, S. Kutten. Time Optimal Self-stabilizing Spanning Tree Algorithms. FSTTCS’93. 6. A. Arora, M. G. Gouda. Distributed Reset. FSTTCS’90. 7. B. Awerbuch, S. Kutten, Y. Mansour, B. Patt-Shamir, G. Varghese. Time Optimal Self-stabilizing Syncronization. STOC’93. 8. B. Awerbuch, B. Patt-Shamir, G. Varghese. Bounding the Unbounded. INFOCOM’94. 9. B. Awerbuch, B. Patt-Shamir, G. Varghese. Self-stabilization by Local Checking and Correction. FOCS’91. 10. J. Beauquier, A. K. Datta, M. Gradinariu, F. Magniette. Self-stabilization Local Mutual Exclution and Daemon Re?nement. DISC’00. 11. R. Bellman. On routing problem. Qurterly of Applied Mathematics, 16(1):87–90, 1958. 12. A Counter Example. http://tx.technion.ac.il/?bjanna/. 13. S. Dela¨ et, B. Ducourthial, S. Tixeuil. Self-stabilization with r-operators in Unreliable Directed Networks. TR 1361, LRI, 2003. et, B. Ducourthial, S. Tixeuil. Self-stabilization with r-operators revised. JACIC, 2006. 14. S. Dela¨ 15. E. W. Dijkstra. Self-stabilization in spite of Distributed Control. Comm. ACM, 17:643–644, 1974. 16. S. Dolev, A. Israeli, S. Moran. Resource Bounds for Self-stabilizing Message Driven Protocols. PODC’91. 17. S. Dolev, A. Israeli, S. Moran. Self-stabilization of Dynamic Systems Assuming Only Read/Write Atomicity. DC 94, 7:3–16. 18. S. Dolev, A. Israeli, S. Moran. Uniform Dynamic Self-stabilizing Leader Election (extended abstract). WDAG’91. 19. B. Ducourthial, S. Tixeuil. Self-stabilization with r-operators. DC, 14(3):147-162, 2001. 20. V. K. Garg, A. Agarwal. Distributed Maintenance of A Spanning-Tree Using Labeled Tree Encoding. Euro-Par’05. artner. A Survey of Self-Stabilizing Spanning-Tree Construction Algorithms. TR, EPFL, 21. F. C. G¨ Oct 2003. artner, H. Pagnia. Time-E?cient Self-stabilizing Algorithms Through Hierarchical Struc22. F. C. G¨ tures. SSS’03. 23. S. K. S. Gupta, P. K. Srimani. Self-stabilizing Multicast Protocols for Ad Hoc Networks. JPDC, 63:1, 2003.
15
24. T. Herault, P. Lemarinier, O. Peres, L. Pilard, J. Beauquier. Self-stabilizing Spanning Tree Algorithm for Large Scale Systems. TR 1457, LRI, Aug. 2006. 25. L. Higham, Z. Liang. Self-stabilizing Minimum Spanning Tree Construction on Message-Passing Networks. DISC’01. 26. M. Nesterenko, A. Arora. Stabilization-Preserving Atomicity Re?nement. J. Parallel Distrib. Comput., 62(5):766–791, 2002.
16
Appendix
Algorithm Variables. Each node v in the system has the following set of variables (excluding the image variables) for every version i of the protocol: Variables related to the tree structure: 1. rv is a node identity which in legal states is supposed to be the root identity of the tree to which node v belongs (root variable in short); (we omit the word “supposed” in the sequel). 2. dv is a non-negative integer which is the distance, in the tree, from node v to the tree root rv . 3. parv is the identity of the parent of v in the tree (parent variable in short). See Def. 1. 4. childv is a local list of node identities such that each node u in the list is the child of v in the tree (children set in short). See Def. 1. Variables related to the detection and broadcasting the event that version i is covering the whole network (by one spanning tree): 1. consistentv is the value of the local predicate de?ned in Def. 1. The purpose of maintaining such a variable is to be able to detect locally an illegal state of the spanning tree construction. 2. up coverv is a bit that is used to propagate information (about the consistency of the tree and its covering the whole network) up the tree, by taking repeated logical and of the up coverv bits of the children of the node. In addition, the consistency of v (the value of consistentv ) is taken in consideration in each up cover bit calculation. See Rule 2. 3. down coverv is a bit that is used to propagate information down the tree, by copying the down cover bit of the parent. In addition, the consistency of v (the value of consistentv ) is taken in consideration in each down coverv bit calculation. In the root node the down cover bit is calculated only by the consistency predicate. See Rule 2. 4. The temporary variables set. Each variable in the set corresponds to each (non-temporary) variable varv that already given above and is denoted by t varv . The variables in the set are: t rv , t dv , t parv , t childv , t consistentv , t up coverv , t down coverv . 5. resetv is a bit that is used to denote the fact that the node is performing a local reset of the tree structure (by a reset of the tree structure variables above). This bit is used to ensure that when any node v changes its tree structure variables, it have no neighbor thinking that v is its parent. More variables... 1. InfoMsgv is a composite variable the components of which are message ?elds that are sent from node v to all its neighbors periodically (Fig. 1, line 1). InfoMsgv ≡ [rv , dv , parv , local resetv , up coverv , down coverv ]. i
2. Nlistv is a local list of node identities such that the incident link from v to each node u in the list is believed to be operational and the processor at each such node u is also believed to be up. This list is maintained by a lower layer self-stabilizing protocol which is beyond the scope of this paper. The lower layer protocol guaranties that each change in a link or node status is eventually re?ected in Nlist. 3. IDv is the read-only (hardwired) identity of node v .
Preliminary Analysis
Lemma 1. Starting from any initial assignments of the d and r variables, for any node v , after t time steps, dv ≥ min(t, dist(rv , v )). Proof: By induction on the time t. First, note that the lemma holds trivially for any node with rv = v and for t = 0 since dv ≥ 0 always. For the inductive step, let v be any node and consider time (t + 1). By the induction hypothesis, after t time steps, we have that for every neighbor u of v , du ≥ min(t, dist(ru , u)). By the algorithm code, there are two options for node v to update its dv and rv variables: 1. to assign rv ← v by performing a local reset operation or by executing line 13 (by Rule 1); 2. to assign rv ← x = v (for some x) by performing line 13, which assigns to dv and rv values using image variables of a neighbor u (see Rule 1). Thus, if at some time after t + 1, v performs option 1 (becomes a root and assigns rv ← v ), the lemma holds trivially as already stated above. Hence, it remains to prove the lemma considering option 2 only. After time t, in at most 1 time step, node v accepts at least one InfoMsgv [u] from each neighbor u such that the InfoMsgu has been sent by u after time t. By the induction hypothesis, the lemma holds for each neighbor u after time t and thus, it also holds for the image variables of u at v at time t + 1 and after. Assume that at time t + 1, rv = x = v for some x. Now, we consider only the set of neighbors X ? N(v ) such that rv [u ] = x at time t + 1 for each u ∈ X . By Rule 1 and the code, at time t + 1, dv = minu ∈X {dv [u ]} + 1. By the induction hypothesis, dv ≥ minu ∈X {min{t, dist(x, u )}} + 1 = min{t + 1, minu ∈X {dist(x, u )} + 1}, at time t + 1. Note that dist(x, v ) ≤ minu ∈X {dist(x, u )} + 1 and thus, dv ≥ min{t + 1, minu ∈X {dist(x, u )} + 1} ≥ min{t + 1, dist(x, v )} at time t + 1. Hence, the lemma hold for v at time t + 1 and in the same way it holds for v after time t + 1. Observation 1. After time Tvalid , for every node v , rv is not a ghost root and dv ≥ dist(rv , v ). Proof: Note, that any node v with a ghost root is disconnected from its ghost root rv and hence, dist(v, rv ) = ∞. Then, by Lem. 1, after 2i time steps, at any node v with a ghost root, it holds that dv is at least 2i . Recall that in version i, a node v refuses to assign a value larger than 2i to dv (see Rule 1, which is ii
used by the code at line 4). Hence, after 2i time steps, the ghost root at node v is purged from version i of the algorithm. Again, recall that in version i, at any node v , dv ≤ 2i . Hence and by Lem. 1, after 2i time steps, for every node v , dv ≥ dist(v, rv ) holds.
Analysis of the Lower Version Case (2i < depth(vmin ))
Lemma 5. If 2i < depth(vmin ), then after time Tvalid , for each node v , there exists a node u such that u is foreign to v and either (1) dist(v, u) = 2i + 1 or (2) dist(v, u) < 2i + 1 and v > u. Proof: Assume that given a node v there exists a node w such that dist(v, w) = 2i + 1. Then, after time Tvalid , by Obs. 1, if rw = v , then dw ≥ 2i + 1, but this is a contradiction to Rule 1. Thus, rw = v . Thus, w is foreign to v . Let u = w. The lemma holds in this case (case (1)). Otherwise, dist(v, w) < 2i + 1 for every w. In this case, v = vmin , otherwise 2i ≥ depth(vmin ). Then, there is node w such that w < v (at least w = vmin ). Let u = w. The lemma holds in that case (case (2)) too. Lemma 6. Let v, u ∈ V and let u be a fringe node of Zv (a zero path of v ), at some time T after Tvalid . Then, if u stays a fringe node of Zv , in at most 1 time step, at time T ≤ T 1 ≤ T + 1, up coveru ← 0. Proof: Note that the farthest node from v on Rv (fv ) is fv . By the de?nition of a fringe node, u is in Rv (fv ), but is not fv itself (since fv is not in Zv , which is a set of nodes for which the root is v while the root of fv can no longer be v at that time, by Lem. 5). Hence, there exists a node w0 in Rv (fv ) whose distance from v is larger (by one) than dist(v, u). As long as u stays a fringe node of Zv there are two possible cases: – There is some w ∈ Nlist(u) such that rw = v . Then, in at most 1 time step, at time T 1, ru [w] = v and t consistentu ← false and thus, t up coveru ← 0 at line 4 of the code (due to Rule 2) and at line 13, up coveru ← 0 (in all the other lines of the code (lines 5 and 10), where up coveru can be assigned, up coveru ← 0 always). – Otherwise, ?w ∈ Nlist(u) (at least w0 ∈ Nlist(u)), rw = v and in particular rw0 = v . Note, that in this case w0 = fv , otherwise rfv = v and this is a contradiction to Def. 5. If parw0 = u, then by Def. 5, u is not a fringe node of Zv any more. Hence, parw0 = u. Then, in at most 1 time step, at time T 1, paru [w0 ] = u. By Lem. 1, after time Tvalid , dw0 ≥ dist(rw0 = v, w0 ). Thus, by Def. 4, at time T 1, (du [w0 ] > du + 1) ∨ (du [w0 ] = du + 1 ∧ u < paru [w0 ]). This ensures that at time T 1, at line 4 of the code, t consistentu ← false (see Def. 1) and thus, t up coveru ← 0 (due to Rule 2) and at line 13, up coveru ← 0 as is required by the lemma. iii
Lemma 7. Let v, u ∈ V and assume that u joins a Zv (a zero path of v ) at some time T after Tvalid (u was not in Zv just before T ). Then, at time T , up coveru ← 0. Proof Sketch: By Lem. 4, u joins Zv alone (no descendant of u exists at time T ); hence, u becomes a fringe of Zv at time T . Thus, there exists a node w0 ∈ Nlist(u) in Rv (fv ) such that dist(v, w0 ) = dist(v, u) + 1 (see explanation in Lem. 6 above). At time T , the conditions at node u at lines 6-9 do not hold. (Hence, a local reset exit occurs at line 12 and then, at line 13, u joins Zv (the statement u ∈ Zv becomes true).) Since the condition at line 8 does not hold, for each w ∈ Nlist(u), paru [w] = u. There are two possible cases at time T : ?w ∈ Nlist(u) such that ru [w] = v ; or ?w ∈ Nlist(u) (at least w0 ∈ Nlist(u)), ru [w] = v . These cases are similar to those that appear in the proof of Lem. 6 above. The only di?erence here is that the cases hold already at time T , while in the proof of Lem. 6, the cases may hold only at time T 1, when the image variables of each neighbor w are updated at u. Thus, similarly to the proof of Lem. 6, up coveru ← 0 at time T . Lemma 8. Let v, u, x ∈ V and Zv (x) be a zero path of v . Let u ∈ Zv (x) such that dist(v, u) = dist(v, fv ) ? j (for some 1 ≤ j ≤ dist(v, fv ) ≤ 2i + 1). Then, after Tvalid + j time steps, up coveru = 0. Proof: Let T ? be the ?rst time when u has joined Zv and let T ≥ T ? be the time when up coveru ← 0. First, by Lem. 7, if T ? > Tvalid , then up coveru ← 0 at time T = T ? . Now, let us prove by induction on j that if T ≤ Tvalid , then up coveru ← 0 by time Tvalid + j . For j = 1, fv ∈ N(u) and by Def. 5, rfv = v . Thus, u is a fringe node of Zv (x) (u = x) and by Lem. 6 at most at time Tvalid + 1, up coveru ← 0. Assume that the lemma holds for j = k ≥ 1. Let us prove that it also holds for j = k + 1. Consider a node u ∈ Zv (x) such that dist(v, u) = dist(v, fv ) ? (k + 1). There exists a node w ∈ Rv (fv ) such that dist(v, w) = dist(v, fv ) ? k . 1. Let us denote the condition w ∈ / Zv (x) by “condition 1”. If condition 1 holds, u is a fringe node of Zv (x) (u = x). Then, by Lem. 6, in at most 1 time step, at some time Tvalid + k < T ≤ Tvalid + k + 1, up coveru ← 0. 2. Let us denote the condition w ∈ Zv (x) by “condition 2”. If condition 2 holds, then parw = u (and thus, u is not a fringe node of Zv ). In this case, u copies up cover = 0 from w. More formally, by the induction hypothesis, by Rule 2 and the code (Fig. 1), in at most 1 time step, at some time Tvalid + k < T ≤ Tvalid + k + 1, up coveru ← 0. Now, let us show that the value of up coveru at u ∈ Zv stays 0 after time T too in the face of possible changes of Zv . The following cases are possible: a. Assume that at some time T after T , up coveru = 0 holds and Zv changes such that the state of u switches from implying condition 1 to implying condition 2 or vice versa (u ∈ Zv ). In this case the up coveru value stays unchanged (0) (by cases 1 and 2 above). iv
b. Assume that condition 1 (or 2) holds at time T and continues to hold indefinitely, the up coveru value stays unchanged (0) too. c. If u ∈ / Zv (x), the lemma holds trivially. d. Assume that at some time after T , u leaves Zv and then, later, at time T , u joins Zv again and condition 1 holds at T . Then, by Lem. 7, the lemma holds at time T . e. Now, assume that at some time after T , u leaves Zv and then, later, at time T , u joins Zv (u becomes a descendant of v ) again and condition 2 holds at T . This assumption contradicts Lem. 4. Thus, this case is impossible.
Analysis of the Higher Version Case (2i ≥ depth(vmin ))
Lemma 11. Assume that node v performs a local reset operation at some time T. 1. Then, in at most 2 time steps after time T , unless the condition of line 9 holds, v performs line 12 of the algorithm (a local reset exit occurs at v ). 2. During the local reset mode at v (starting at time T and till the local reset exit at v ), up coverv = 0 and down coverv = 0. Proof: 1. Assume that v performs a local reset operation at time T . As a part of that operation, v sets local resetv ← true, resets other base variables and sends an InfoMsgv message with the new values to all the neighbors of v . Note, that due to the local reset operation at v at time T and Def. 1, parv ← null. Hence, after time T (and till the time v performs line 13) the condition at line 7 cannot hold at v and thus cannot prevent it from executing line 12. Then, in at most an additional 1 time step after time T , each neighbor u ∈ N(v ) such that paru = v accepts an InfoMsgu [v ] message informing that v performs a local reset (local resetu [v ] = true). Every such node u invokes LocalReset() at line 10 (a local reset operation too), since the condition at line 7 holds at u. Due to the LocalReset() invocation at u, paru ← null. Now, as long as u thinks that v is still in a local reset mode (see condition at line 9), u cannot switch back its parent pointer to point on v . Thus, in at most an additional 1 time step, at most at time T +2, v accepts an InfoMsgv [u] message from each u ∈ N(v ) informing that now parv [u] = v . At that time, no condition at lines 6-8 at v holds. Then, unless the condition of line 9 holds, node v executes line 12 and thus item 1 of the lemma holds. 2. At time T , node v executes line 2 of the algorithm. At this line, up coverv ← 0 and down coverv ← 0 always. Then, trivially by the code, from time T and till the ?rst time line 12 is executed (local reset exit), up coverv and down coverv are set at line 2 of the algorithm only and thus can be set to 0 only. v
Lemma 12. If 2i ≥ depth(vmin ), then version i stabilizes in O(2i ) (in Tvalid + 2 + 2 · depth(vmin )). Proof: Let us show that in O(2i ) time steps after time Tvalid , a minimal hop tree of the network rooted at vmin is stabilized. Consider a node v ∈ V. We use induction on dist(vmin , v ). By Obs. 1, after Tvalid , there are no ghost roots and thus, by de?nition of vmin , ?x ∈ V : rx ≤ vmin . Thus, by Rule 1 used by the code, after Tvalid , rvmin = vmin , parvmin = null and dvmin = 0 holds at vmin . Now, let us show that local resetvmin is stabilized after time Tvalid too, to the value false. If it so happened that after Tvalid , local resetvmin = false holds already, then the local reset operation is impossible at vmin (the condition at line 6 cannot hold due to Rule 1 (the tree structure parameters are optimal at vmin ); clearly, the conditions at lines 7-9 cannot hold either) and thus, local resetvmin cannot be set to true. Otherwise, if after time Tvalid , local resetvmin = true, by Lem. 11, in at most 2 time steps, vmin performs line 12 (because t parvmin = parvmin = null and thus, the condition at line 9 does not hold) and local resetvmin ← false. Now, by the conclusion for the previous case, local resetvmin cannot switch its value back to true. Thus, after at most Tvalid + 2 time steps the root node v = vmin (dist(vmin , v ) = 0) is stabilized and local resetv = false holds. By this, we proved the induction basis. For the inductive step, consider a node v such that dist(vmin , v ) = k + 1 > 0. By the induction hypothesis, by Tvalid + 2 + 2 · k time steps, the set of nodes X = {u|dist(vmin , u) ≤ k } forms a minimal hop subtree rooted at vmin and for every u ∈ X , local resetu is stabilized to false. In particular, for every node u such that dist(vmin , u) = k the following holds: ru = vmin , du = k , paru points on the appropriate minimal ID node (according to Def. 1) and local resetu = false. Recall, that by Obs. 1, after time Tvalid , for every x ∈ V : dx ≥ dist(x, rx ). If at time Tvalid + 2 + 2 · k , rv = vmin , dv = k + 1, parv is appropriately minimal and local resetv = false, then we done. By Rule 1 and the code, no change is possible in these values, since they are optimal after Tvalid . Otherwise, in at most 1 time step, v receives InfoMsg from every neighbor u and by Rule 1, the temporary variables of v are assigned as follows (using an image variables of the variables of the stabilized (and optimally con?gured) neighbors u placed at distance k from vmin ): t rv ← vmin , t dv = k + 1 (recall, that ?x ∈ V : dx ≥ dist(x, rx )) and t parv is appropriately (by Def. 1) minimal. Then, if local resetv = false, v invokes the LocalReset() procedure due to the condition at line 6; otherwise, local resetv = true and v already performed a local reset operation. By Lem. 11 and by the induction hypothesis (the condition at line 9 does not hold, because the induction hypothesis holds for t parv and thus, local resetv [t parv ] = false), in at most additional 2 time steps, a local reset exit occurs at v and local resetv ← false at line 12. Right after that, by the induction hypothesis, at line 13, v assigns rv = t rv = vmin , dv = t dv = k + 1 and sets parv to the optimum. Then, after vi
Tvalid + 2 + 2 · (k + 1), the values of the local resetv , rv , dvmin and parv variables are stable (unchanged). Thus, the lemma follows for dist(vmin , v ) = k + 1 too. Hence, every version i stabilizes in Tvalid + 2 + 2 · depth(vmin ). Lemma 13. If 2i ≥ depth(vmin ), then in O(2i ) time, at every node v , down coverv = 1 for version i. Proof: By Lem. 12, after O(2i ) time steps, the version i tree is stable, and hence, all the nodes are consistent (see Def. 1). It follows from Rule 2 that in additional depth(vmin ) time steps, all the up coverv bits are set to 1, and ?nally, in additional depth(vmin ) time steps, all down coverv bits will be set to 1 as well.
vii