Self-stabilizing Byzantine Multivalued Consensus: (extended abstract)

Consensus, abstracting a myriad of problems in which processes have to agree on a single value, is one of the most celebrated problems of fault-tolerant distributed computing. Consensus applications include fundamental services for Cloud and Blockchain environments, and in such challenging environments, malicious behaviors are often modeled as adversarial Byzantine faults. At OPODIS 2010, Mostéfaoui and Raynal (in short MR) presented a Byzantine-tolerant solution to consensus in which the decided value cannot be a value proposed only by Byzantine processes. MR has optimal resilience, coping with up to $t < n/3$ Byzantine nodes over $n$ processes. MR provides this multivalued consensus object (which accepts proposals taken from a finite set of values) assuming the availability of a single Binary consensus object (which accepts proposals taken from the set $\{0, 1\}$). This work, which focuses on multivalued consensus, aims at the design of an even more robust solution than MR. Our proposal expands MR's fault model with self-stabilization, a vigorous notion of fault tolerance. In addition to tolerating Byzantine faults, self-stabilizing systems can automatically recover after the occurrence of arbitrary transient faults. These faults represent any violation of the assumptions according to which the system was designed to operate (provided that the algorithm code remains intact). To the best of our knowledge, we propose the first self-stabilizing solution for intrusion-tolerant multivalued consensus for asynchronous message-passing systems prone to Byzantine failures. Our solution has an $O(t)$ stabilization time from arbitrary transient faults.


Introduction
We present in this work a novel self-stabilizing algorithm for multivalued consensus in signature-free asynchronous message-passing systems that can tolerate Byzantine faults. We provide rigorous correctness proofs to demonstrate that our solution is correct and outperforms all previous approaches in terms of its fault tolerance capabilities, and further analyze its recovery time. Compared to existing solutions, our proposed algorithm represents a significant advancement in the state of the art, as it can effectively handle a wider range of faults, including both benign and malicious failures, as well as arbitrary, transient, and possibly unforeseen violations of the assumptions according to which the system was designed to operate. Our proposed solution can hence further facilitate the design of new fault-tolerant building blocks for distributed systems.

Task Requirements and Fault Models
Multivalued Consensus (MVC). The consensus problem is one of the most challenging tasks in fault-tolerant distributed computing. The problem definition is rather simple. It assumes that each non-faulty process advocates for a single value from a given set $V$. The problem of Byzantine-tolerant Consensus (BC) requires BC-completion (R1), i.e., all non-faulty processes decide a value, BC-agreement (R2), i.e., no two non-faulty processes can decide different values, and BC-validity (R3), i.e., if all non-faulty processes propose the same value $v \in V$, only $v$ can be decided. When the set $V$ from which the proposed values are taken is $\{0, 1\}$, the problem is called Binary consensus, and otherwise, MVC. We study MVC solutions that assume access to a single Binary consensus object.
Byzantine fault-tolerance (BFT). Lamport et al. [28] say that a process commits a Byzantine failure if it deviates from the algorithm instructions, say, by deferring or omitting messages that were sent by the algorithm or by sending fake messages, which the algorithm never sent. Such malicious behaviors include crashes and may be the result of hardware or software malfunctions as well as coordinated malware attacks. In order to safeguard against such attacks, Mostéfaoui and Raynal [33], MR from now on, suggested the BC-no-intrusion (R4) validity requirement (aka intrusion-tolerance). Specifically, the decided value cannot be a value that was proposed only by faulty processes. Also, when it is not possible to decide on a value, the error symbol is returned instead. For the sake of deterministic solvability [23,28,35,36], we assume that there are at most $t < n/3$ Byzantine processes in the system, where $n$ is the total number of processes. It is also well-known that no deterministic (multivalued or Binary) consensus solution exists for asynchronous systems in which at least one process may crash (or be Byzantine) [24]. Our self-stabilizing MVC algorithm circumvents this impossibility by assuming that the system is enriched with a Binary consensus object, as in the studied (non-self-stabilizing) solution by MR [33], i.e., reducing MVC to Binary consensus.
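To make requirements R1 to R4 concrete, the following Python sketch evaluates them on the outcome of a single finished run. It is an illustration under stated assumptions, not part of the studied algorithms: the function name `check_outcome`, its parameters, and the constant `ERROR` (a stand-in for the error symbol) are all hypothetical.

```python
from typing import Dict, Hashable, Set

ERROR = "ERR"  # hypothetical stand-in for the paper's error symbol


def check_outcome(proposals: Dict[int, Hashable],
                  decisions: Dict[int, Hashable],
                  correct: Set[int]) -> Dict[str, bool]:
    """Evaluate R1-R4 on a finished run.

    proposals: value proposed by each process (faulty ones included)
    decisions: value decided by each process that decided
    correct:   identifiers of the non-faulty processes
    """
    correct_props = {proposals[i] for i in correct}
    decided = {decisions[i] for i in correct if i in decisions}
    return {
        # R1: every non-faulty process decided.
        "completion": all(i in decisions for i in correct),
        # R2: non-faulty processes decided at most one distinct value.
        "agreement": len(decided) <= 1,
        # R3: if all non-faulty processes proposed v, only v is decided.
        "validity": len(correct_props) != 1 or decided <= correct_props,
        # R4: a decided value is the error symbol or has a non-faulty proposer.
        "no_intrusion": all(v == ERROR or v in correct_props for v in decided),
    }
```

For instance, with n = 4 and one faulty process proposing a value that no non-faulty process proposed, a run in which the non-faulty processes decide that value fails both the validity and the no-intrusion checks.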

Self-stabilization
We study an asynchronous message-passing system that has no guarantees on the communication delay and in which the algorithm cannot explicitly access the local clock. Our fault model includes undetectable Byzantine failures. In addition, we aim to recover from arbitrary transient-faults, i.e., any temporary violation of the assumptions according to which the system was designed to operate. This includes the corruption of control variables, such as the program counter and message payloads, as well as violations of operational assumptions, e.g., the assumption that at most $t$ processes are faulty. We note that non-self-stabilizing BFT systems do not consider recovery after the occurrence of such violations. Since the occurrence of these failures can be arbitrarily combined, we assume these transient-faults can alter the system state in unpredictable ways. In particular, when modeling the system, Dijkstra [15] assumes that these violations bring the system to an arbitrary state from which a self-stabilizing system should recover [3,16]. That is, Dijkstra requires (i) recovery after the last occurrence of a transient-fault and (ii) once the system has recovered, it must never violate the task requirements. Arora and Gouda [5] refer to the former requirement as Convergence and the latter as Closure.

Related work
Ever since the seminal work of Lamport, Shostak, and Pease [28] four decades ago, BFT consensus has been an active research subject, see [13] and references therein. The recent rise of distributed ledger technologies, e.g., [1], brought phenomenal attention to the subject. We aim to provide a degree of dependability that is higher than existing solutions.
Ben-Or, Kelmer, and Rabin [6] were the first to show that BFT MVC can be reduced to Binary consensus. Correia, Neves, and Veríssimo [12,34] later established the connection between intrusion tolerance and Byzantine resistance. These ideas form the basis of the MR algorithm [33]. MR is a leaderless consensus algorithm [4] and as such, it avoids the key weakness of leader-based algorithms [11] when the leader is slow and delays termination. There exist self-stabilizing solutions for MVC but only crash-tolerant ones [9,18,30,29,25], whereas the existing BFT solutions are not self-stabilizing [37]. For example, the recent self-stabilizing crash-tolerant MVC in [30] solves a less challenging problem than the SSBFT problem studied here since it does not account for malicious behaviors. Mostéfaoui, Moumen, and Raynal [32] (MMR in short) presented BFT algorithms for solving Binary consensus using common coins, of which [26,27] recently introduced a self-stabilizing variation that satisfies the safety requirements, i.e., agreement and validity, with an exponentially high probability that depends only on a predefined constant, which safeguards safety. The related work also includes SSBFT state-machine replication by Binun et al. [7,8] for synchronous systems and Dolev et al. [17] for practically-self-stabilizing partially-synchronous systems. Note that both Binun et al. and Dolev et al. study another problem for another kind of system setting. In [19], the problems of SSBFT topology discovery and message delivery were studied. Self-stabilizing atomic memory under a semi-Byzantine adversary is studied in [20].

A brief overview of the MR algorithm
The MR algorithm assumes that all (non-faulty) processes eventually propose a value. Upon the proposal of value $v$, the algorithm utilizes a Validated Byzantine Broadcast protocol, known as VBB, to enable each process to reliably deliver $v$. The VBB-delivered value could be either the message, $v$, which was VBB-broadcast, or ⊥ when $v$ could not be validated. For a value $v$ to be valid, it must be VBB-broadcast by at least one non-faulty process.
Following the VBB-delivery from at least $n - t$ different processes, MR undergoes a local test, which is detailed in Section 3.2. If at least one non-faulty process passes this test, it implies that all non-faulty processes can ultimately agree on a single value proposed by at least one non-faulty process. Therefore, the MR algorithm employs Byzantine-tolerant Binary consensus to reach consensus on the outcome of the local test. If the agreed value indicates the existence of at least one non-faulty process that has passed the test, then each non-faulty process waits until it receives at least $n - 2t$ VBB-arrivals with the same value, $v$, which is the decided value in this instance of multivalued consensus. If no such indication is represented by the agreed value, the MR algorithm reports its inability to decide in this MVC invocation. For further information, please refer to Section 3.

Our SSBFT variation on MR
This work considers transformers that take algorithms as input and output their self-stabilizing variations. For example, Duvignau, Raynal, and Schiller [22] (referred to as DRS) proposed a transformation for converting the Byzantine Reliable Broadcast (referred to as BRB) algorithm, originally introduced by Bracha and Toueg [10], into a Self-Stabilizing BFT (in short, SSBFT) variation. Another transformation, proposed by Georgiou, Marcoullis, Raynal, and Schiller [26,27] (referred to as GMRS), presented the SSBFT variation of the BFT Binary consensus algorithm of MMR.
Our transformation builds upon the works of DRS and GMRS when transforming the (non-stabilizing) BFT MR algorithm into its self-stabilizing variation. The design of SSBFT solutions requires addressing considerations that BFT solutions do not need to handle, as they do not consider transient faults.
For instance, MR uses a (non-self-stabilizing) BFT Binary consensus object, denoted as obj. In MR, obj returns a value that is proposed by at least one non-faulty process, which corresponds to a test result (as mentioned in Section 1.3 and detailed in Section 4.1.1). However, a single transient fault can change obj's value from False, i.e., not passing the test, to True. Such an event would cause MR, which was not designed to tolerate transient faults, to wait indefinitely for messages that are never sent. Our solution addresses this issue by carefully integrating GMRS's SSBFT Binary-values broadcast (in short, BV-broadcast). This subroutine ensures that obj's value is proposed by at least one non-faulty node even in the presence of transient faults.
The vulnerability of consensus objects to corruption by transient faults holds true regardless of whether we consider Binary or multivalued consensus (MVC). Thus, our SSBFT MVC solution is required to decide even when starting from an arbitrary state. To achieve this, our correctness proof demonstrates that our solution always terminates. We borrow from GMRS the concept of consensus object recycling, which refers to reusing the object (space in the local memory of all non-faulty processes) for a later MVC invocation. Even when starting from an arbitrary state, the proposed solution decides on a value that is eventually delivered to all non-faulty processes, albeit potentially violating safety due to the occurrence of transient faults. Then, utilizing GMRS's subroutine for recycling consensus objects, the MVC object is recycled. Starting from a post-recycling state, the MVC object guarantees both safety and liveness for an unbounded number of invocations. This is one of the principal arguments behind our correctness proof.
We clarify that GMRS's recycling subroutine relies on synchrony assumptions. To mitigate the impact of these assumptions, a single recycling action can be performed for a batch of $\delta$ objects, where $\delta$ is a predefined constant determined by the memory available for consensus objects. This approach allows for asynchronous networking in communication-intensive components, such as the consensus objects, while the synchronous recycling actions occur according to the predefined load parameter, $\delta$.
We emphasize that our solution builds upon the previous works of DRS [22] and GMRS [26,27] (which addressed different problems than the one under study), and that we encounter similar challenges in the transformation of code from non-self-stabilizing to self-stabilizing algorithms. Nevertheless, achieving the desired self-stabilizing properties in our construction necessitates a thoughtful combination of SSBFT building blocks and a meticulous analysis of the transformed algorithms. This combination process cannot be derived from the DRS and GMRS transformations. The self-stabilization issues that are inherent to the studied algorithms are further explained in Section 4.1.1 for the VBB algorithm and in Section 4.2.1 for the MVC algorithm.

Our contribution
We present a fundamental module for dependable distributed systems: an SSBFT MVC algorithm for asynchronous message-passing systems. Hence, we advance the state of the art w.r.t. the dependability degree. We obtain this new self-stabilizing algorithm via a transformation of the non-self-stabilizing MR algorithm. MR offers optimal resilience by assuming $t < n/3$, where $t$ is the number of faulty processes and $n$ is the total number of processes. Our solution preserves this optimality.
In the absence of transient faults, our solution achieves consensus within a constant number of communication rounds during arbitrary executions and without fairness assumptions. After the occurrence of any finite number of arbitrary transient faults, the system recovers within a constant number of invocations of the underlying communication abstractions. This implies recovery within a constant time (in terms of asynchronous cycles), assuming execution fairness among the non-faulty processes. We clarify that these fairness assumptions are only needed for a bounded time, i.e., during recovery, and not during the period in which the system is required to satisfy the task requirements (Definition 1.1). It is important to note that when taking into account also the stabilization time of the underlying communication abstractions, the recycling mechanism stabilizes within $O(t)$ synchronous rounds.
The communication costs of the studied algorithm, i.e., MR, and the proposed one are similar in the number of BRB and Binary consensus invocations. The main difference is that our SSBFT solution uses BV-broadcast for making sure that the value decided by the SSBFT Binary consensus object remains consistent until the proposed SSBFT solution completes and is ready to be recycled.
To the best of our knowledge, we propose the first self-stabilizing Byzantine-tolerant algorithm for solving MVC in asynchronous message-passing systems, enriched with required primitives. That is, our solution is built on using an SSBFT Binary consensus object, a BV-broadcast object, and two SSBFT BRB objects as well as a synchronous recycling mechanism. We believe that our solution can stimulate research for the design of algorithms that can recover after the occurrence of transient faults.

System Settings
We consider an asynchronous message-passing system that has no guarantees of communication delay. Also, the algorithms do not access the (local) clock (or use timeout mechanisms). The system consists of a set, $P$, of $n$ nodes (or processes) with unique identifiers. Any (ordered) pair of nodes $p_i, p_j \in P$ has access to a unidirectional communication channel, $\mathit{channel}_{j,i}$, that, at any time, has at most $\mathit{channelCapacity} \in \mathbb{Z}^+$ packets in transit from $p_j$ to $p_i$ (this assumption is due to a known impossibility [16, Chapter 3.2]).
We use the interleaving model [16] for representing the asynchronous execution of the system. The node's program is a sequence of (atomic) steps. Each step starts with an internal computation and finishes with a single communication operation, i.e., a message send or receive. The state, $s_i$, of node $p_i \in P$ includes all of $p_i$'s variables and $\mathit{channel}_{j,i}$. The term system state (or configuration) refers to the tuple $c = (s_1, s_2, \ldots, s_n)$. We define an execution (or run)

The fault model and self-stabilization
We specify the fault model and design criteria.

Arbitrary node failures.
Byzantine faults model any fault in a node, including crashes and arbitrary malicious behaviors.
Here the adversary lets each node receive the arriving messages and calculate its state according to the algorithm. However, once a node (that is captured by the adversary) sends a message, the adversary can modify the message in any way, delay it for an arbitrarily long period or even remove it from the communication channel. The adversary can also send fake messages spontaneously. The adversary has the power to coordinate such actions without any limitation. For the sake of solvability [28,35,38], we limit the number, $t$, of nodes that can be captured by the adversary, i.e., $n \geq 3t + 1$. The set of non-faulty indices is denoted by Correct and called the set of correct nodes.
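The bound $n \geq 3t + 1$ determines the thresholds that the studied algorithms rely on later: waiting for $n - t$ senders never blocks on faulty nodes, and $n - 2t \geq t + 1$ matching messages must include one from a correct node. The short Python sketch below just checks this arithmetic; the function names `max_faulty` and `thresholds` are illustrative and do not appear in the paper.

```python
def max_faulty(n: int) -> int:
    """Largest t that satisfies n >= 3t + 1."""
    return (n - 1) // 3


def thresholds(n: int, t: int) -> dict:
    """Thresholds derived from n and t under the assumption n >= 3t + 1."""
    if n < 3 * t + 1:
        raise ValueError("resilience bound n >= 3t + 1 is violated")
    return {
        "wait_for": n - t,        # senders a node can wait for without blocking
        "attest": n - 2 * t,      # matching values needed to attest validity
        "one_correct": t + 1,     # any t + 1 senders include a correct one
    }
```

With n = 4 and t = 1, a node waits for 3 senders and attests a value once 2 matching copies arrive, which already exceeds the single faulty node.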

Arbitrary transient-faults
We consider any temporary violation of the assumptions according to which the system was designed to operate. We refer to these violations and deviations as arbitrary transient-faults and assume that they can corrupt the system state arbitrarily (while keeping the program code intact). The occurrence of a transient fault is rare. Thus, we assume that the last arbitrary transient fault occurs before the system execution starts [16], and that it leaves the system to start in an arbitrary state. In other words, we assume arbitrary starting states at all correct nodes and at the communication channels that lead to them. Moreover, transient faults do not occur during the system execution.

Dijkstra's self-stabilization
The legal execution (LE) set refers to all executions in which the problem requirements hold. A system is self-stabilizing with respect to LE when every execution $R$ of the algorithm reaches within a finite period a suffix $R_{legal} \in LE$ that is legal. Namely, Dijkstra [15] requires $\forall R : \exists R' : R = R' \circ R_{legal} \wedge R_{legal} \in LE \wedge |R'| \in \mathbb{Z}^+$, where $\circ$ denotes concatenation. The part of the proof that shows the existence of $R'$ is called Convergence (or recovery), and the part that shows $R_{legal} \in LE$ is called Closure.

Complexity measures and execution fairness
We say that execution fairness holds among processes if the scheduler enables any correct process infinitely often, i.e., the scheduler cannot (eventually) halt the execution of non-faulty processes. The time between the invocation of an operation (such as consensus or broadcast) and the occurrence of all required deliveries is called operation latency. As in MR, we show that the latency is finite without assuming execution fairness. The term stabilization time refers to the period in which the system recovers after the occurrence of the last transient fault.
When estimating the stabilization time, our analysis assumes that all correct nodes complete round-trips infinitely often with all other correct nodes. However, once the convergence period is over, no fairness assumption is needed. Then, the stabilization time is measured in terms of asynchronous cycles, which we define next. All self-stabilizing algorithms have a do forever loop since these systems cannot be quiescent due to a well-known impossibility [16, Chapter 2.3]. Also, the studied algorithms allow nodes to communicate with each other via a broadcast operation. Let $\mathit{num}_b$ be the maximum number of (underlying) broadcasts per iteration of the do forever loop. The first asynchronous cycle, $R'$, of execution $R = R' \circ R''$ is the shortest prefix of $R$ in which every correct node is able to start and complete at least a constant number, $\mathit{num}_b$, of round-trips with every correct node. The second asynchronous cycle of $R$ is the first asynchronous cycle of the suffix $R''$, and so forth.

Building Blocks
Following Raynal [37], Fig. 1 illustrates a protocol suite for SSBFT state-machine replication using total order broadcast.This order can be defined by instances of MVC objects, which in turn, invoke SSBFT Binary consensus, BV-broadcast, and SSBFT recycling subroutine for consensus objects (GMRS [26,27]) as well as SSBFT BRB (DRS [22]).

SSBFT Byzantine-tolerant Reliable Broadcast (BRB)
The communication abstraction of (single instance) BRB allows every node to invoke the $\mathit{broadcast}(v) : v \in V$ and $\mathit{deliver}(k) : p_k \in P$ operations.
Definition 2.1 The BRB requirements are as follows.
• BRB-validity. Suppose a correct node BRB-delivers message $m$ from a correct $p_i$. Then, $p_i$ had BRB-broadcast $m$.
• BRB-integrity. No correct node BRB-delivers more than once.
• BRB-no-duplicity. No two correct nodes BRB-deliver different messages from $p_i$ (which might be faulty).
• BRB-completion-1. Suppose $p_i$ is a correct sender. All correct nodes BRB-deliver from $p_i$ eventually.
• BRB-completion-2. Suppose a correct node BRB-delivers a message from $p_i$ (which might be faulty). All correct nodes BRB-deliver $p_i$'s message eventually.
We assume the availability of an SSBFT BRB implementation, such as the one in [22], which stabilizes within $O(1)$ asynchronous cycles. Such an implementation lets $p_i \in P$ use the operation $\mathit{deliver}_i(k)$ for retrieving the current return value, $v$, of the BRB broadcast from $p_k \in P$. Before the completion of the task of the $\mathit{deliver}_i(k)$ operation, $v$'s value is ⊥. This way, whenever $\mathit{deliver}_i(k) \neq$ ⊥, node $p_i$ knows that the task is completed and the returned value can be used. There are several BRB implementations [2,14,31] that satisfy different requirements than the ones in Definition 2.1, which is taken from the textbook [37].
Note that existing non-self-stabilizing BFT BRB implementations, e.g., [37, Ch. 4], consider another kind of interface between BRB and its application. In that interface, BRB notifies the application via the raising of an event whenever a new message is ready to be BRB-delivered. However, in the context of self-stabilization, a single transient fault can corrupt the BRB object to encode in its internal state that the message was already BRB-delivered without ever BRB-delivering the message. The interface proposed in [22] addresses this challenge by allowing the application to repeatedly query the status of the SSBFT BRB object without changing its state.
We also assume that BRB objects have the interface function hasTerminated(), which serves as a predicate indicating when the sender knows that all non-faulty nodes have successfully delivered the application message. The implementation of hasTerminated() is straightforward: it checks the condition in the if-statement on line 49 of Algorithm 4 in [22]. If the condition is met, it returns False, otherwise, it returns True.
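The difference between the event-based and the query-based interface can be illustrated with a small, purely local Python sketch. It is not the SSBFT BRB implementation of [22]: there is no networking or fault tolerance, and the class name `QueryableBrbSlot` and the hook `on_brb_delivery` are hypothetical. It only shows that `deliver(k)` is a side-effect-free query that returns ⊥ (modeled here as `None`) until a value is available, so the application can poll it in every iteration of its do forever loop.

```python
from typing import Dict, Hashable, Optional


class QueryableBrbSlot:
    """Toy local stand-in for the query-style BRB interface (assumption-laden)."""

    def __init__(self) -> None:
        # Maps a sender identifier k to the value BRB-delivered from p_k.
        self._delivered: Dict[int, Hashable] = {}

    def on_brb_delivery(self, k: int, value: Hashable) -> None:
        """Hypothetical hook that the protocol layer calls on completion.

        Only the first value per sender is kept (at most one delivery).
        """
        self._delivered.setdefault(k, value)

    def deliver(self, k: int) -> Optional[Hashable]:
        """Return the value delivered from p_k, or None (i.e., ⊥) if not yet done.

        The query never changes the object's state, so it can be repeated.
        """
        return self._delivered.get(k)
```

Because the application never relies on a one-shot notification, a state in which the event was "already raised" without the application having seen the value cannot cause the value to be missed; the next query simply returns it.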

SSBFT Binary-values Broadcast (BV)
This is an all-to-all broadcast operation of Binary values. This abstraction uses the operation, bvBroadcast(v), which is assumed to be invoked independently (i.e., not necessarily simultaneously) by all the correct nodes, where $v \in V$. For the sake of a simpler presentation of our solutions, we prefer $V = \{False, True\}$ over the traditional $V = \{0, 1\}$ presentation. The set of values that are BV-delivered to node $p_i$ can be retrieved via the function $\mathit{binValues}_i()$, which returns ∅ before the arrival of any bvBroadcast() by a correct node. We specify under which conditions values are added to binValues().
• BV-validity. Suppose $v \in \mathit{binValues}_i()$ and $p_i$ is correct. It holds that $v$ has been BV-broadcast by a correct node.
The above requirements imply that eventually $\exists s \subseteq \{False, True\} : s \neq \emptyset \wedge \forall i \in Correct : \mathit{binValues}_i() = s$ and the set $s$ does not include values that were BV-broadcast only by Byzantine nodes. The SSBFT BV-broadcast solution in [26] stabilizes within $O(1)$ asynchronous cycles. This implementation allows the correct nodes to repeat a BV-broadcast using the same BV-broadcast object. As mentioned in Section 1.4, this allows the proposed solution to overcome challenges related to the corruption of the state of the SSBFT Binary consensus object; see Section 4.2.1 for more details.
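As a rough illustration of how BV-validity can be enforced, the Python sketch below uses the classic threshold rule commonly associated with (non-self-stabilizing) BV-broadcast: a node relays a value once it has received it from $t + 1$ distinct senders, and adds it to its set of binary values once it has received it from $2t + 1$ distinct senders. This rule is an assumption of the sketch; it is not the SSBFT implementation of [26], and the class `BvNode` and its method names are hypothetical.

```python
from typing import Dict, List, Set


class BvNode:
    """Local state of one node under the classic (non-stabilizing) threshold rule."""

    def __init__(self, n: int, t: int) -> None:
        self.n, self.t = n, t
        self.senders: Dict[bool, Set[int]] = {False: set(), True: set()}
        self.relayed: Set[bool] = set()     # values this node already sent
        self.bin_values: Set[bool] = set()  # what binValues() would return

    def bv_broadcast(self, v: bool) -> List[bool]:
        """Return the values to send to all nodes when BV-broadcasting v."""
        if v in self.relayed:
            return []
        self.relayed.add(v)
        return [v]

    def on_receive(self, sender: int, v: bool) -> List[bool]:
        """Process value v from `sender`; return the values to relay to all."""
        self.senders[v].add(sender)
        to_send: List[bool] = []
        # t + 1 distinct senders include a correct one, so relaying is safe.
        if len(self.senders[v]) >= self.t + 1 and v not in self.relayed:
            self.relayed.add(v)
            to_send.append(v)
        # 2t + 1 distinct senders: v becomes a delivered binary value.
        if len(self.senders[v]) >= 2 * self.t + 1:
            self.bin_values.add(v)
        return to_send
```

Under this rule, a value sent only by the $t$ Byzantine nodes never reaches the $t + 1$ relay threshold at a correct node, which is the intuition behind BV-validity.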

SSBFT Binary Consensus
As mentioned, the studied solution reduces MVC to Binary consensus by enriching the system model with a BFT object that solves Binary consensus (Definition 2.2).
Definition 2.2 Every $p_i \in P$ has to propose a value $v_i \in V = \{False, True\}$ via an invocation of $\mathit{propose}_i(v_i)$. Let Alg be a Binary Consensus (BC) algorithm. Alg has to satisfy safety, i.e., BC-validity and BC-agreement, and liveness, i.e., BC-completion.
• BC-validity. The value $v \in \{False, True\}$ decided by a correct node is a value proposed by a correct node.
• BC-agreement. Any two correct nodes that decide, do so with identical decided values.
We assume the availability of SSBFT Binary consensus, such as the one from GMRS [26], which stabilizes within $O(1)$ asynchronous cycles. GMRS's solution might fail to decide with negligible probability. In this case, GMRS's solution returns the error symbol instead of a legitimate value from the set $\{False, True\}$. The proposed SSBFT MVC algorithm returns the error symbol whenever the SSBFT Binary consensus returns it (cf. line 66 of Algorithm 4).

The Recycling Mechanism and Recyclable Objects
Just as MR, we do not focus on the management of consensus invocations since we assume the availability of a mechanism for eventually recycling all MVC objects that have completed their tasks. GMRS [26,27] implement such a subroutine. We review their subroutine and explain how they construct recyclable objects.
GMRS implements consensus objects using a storage of constant size allocated at program compilation time.Since these objects can be instantiated an unbounded number of times, it is necessary to reuse the storage once a consensus is reached.This should occur only after each correct node received the decided value via result().
To facilitate this, GMRS assumes that the object has two meta-statuses: used and unused. The unused status indicates the availability of objects that were either never used or are no longer in current use. GMRS specifies that recyclable objects must implement an interface function called wasDelivered(), which returns 1 after the result delivery. Recycling is triggered by the recycling mechanism, which invokes recycle() at each non-faulty node, setting the meta-status of the corresponding consensus object to unused.
GMRS defines recyclable object construction as a task that requires eventual agreement on the value of wasDelivered().In detail, if a non-faulty node p i reports delivery (i.e., wasDelivered i () = 1), then all non-faulty nodes will eventually report delivery as well.We clarify that during the recycling process, i.e., when at least one non-faulty node invokes recycle(), there is no need to maintain agreement on the values of wasDelivered().
GMRS requires us to implement recycle() by locally setting the algorithm to its predefined post-recycling state, see Sections 4.1.2 and 4.2.2. Also, our solution implements the operation result(), which facilitates the implementation of wasDelivered() using the same construction proposed by GMRS in [26,27]. By implementing GMRS's interfaces, we borrow GMRS's correctness properties since the studied problem and the structure of the proposed algorithms are very similar.
GMRS implements a recycling service using a synchronous SSBFT consensus that allows all non-faulty nodes to reuse the object immediately after the process returns from recycle(). GMRS's recycling facilitates the transformation of the non-self-stabilizing BFT MR algorithm to an SSBFT one. The transformation concentrates on assuring operation completion since once all objects have been recycled, the system reaches its post-recycling state, which has no trace of stale information, i.e., Convergence holds. As mentioned in Section 1.4, the effect of the synchrony assumptions can be mitigated by recycling batches of $\delta$ objects, where $\delta$ is a predefined constant that depends on the available memory. This way, the communication-intensive components are asynchronous and the synchronous recycling actions occur according to a load that is defined by $\delta$.
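The recyclable-object interface described above can be pictured with the following local Python sketch. It is a simplification under stated assumptions: the agreement among non-faulty nodes on wasDelivered() and the synchronous recycling service are not modeled, and the class `RecyclableSlot`, the hook `decide`, and the helper `recycle_batch` are hypothetical names. Only the operation names result(), wasDelivered(), and recycle() come from the text.

```python
from typing import Hashable, List, Optional


class RecyclableSlot:
    """One statically allocated consensus slot with used/unused meta-status."""

    def __init__(self) -> None:
        self.recycle()

    def recycle(self) -> None:
        """Reset to the predefined post-recycling state (meta-status: unused)."""
        self.used = False
        self._decision: Optional[Hashable] = None
        self._delivered = False

    def decide(self, value: Hashable) -> None:
        """Hypothetical hook: the consensus instance stored in this slot decided."""
        self.used = True
        self._decision = value

    def result(self) -> Optional[Hashable]:
        """Return the decided value, or None while undecided."""
        if self._decision is not None:
            self._delivered = True
        return self._decision

    def was_delivered(self) -> int:
        """Return 1 after the result was delivered to the caller, else 0."""
        return 1 if self._delivered else 0


def recycle_batch(slots: List[RecyclableSlot]) -> bool:
    """Recycle a batch of slots only once each of them reports delivery."""
    if all(s.was_delivered() == 1 for s in slots):
        for s in slots:
            s.recycle()
        return True
    return False
```

Here the batch size plays the role of $\delta$: a larger batch means fewer recycling actions, at the price of more memory held by slots that have already completed.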

The Studied Algorithms
As mentioned, MR is based on a reduction of BFT MVC to BFT Binary consensus. MR guarantees that the decided value is not a value proposed only by Byzantine nodes. Also, if there is a value, $v \in V$, that all correct nodes propose, then $v$ is decided. Otherwise, the decided value is either a value proposed by the correct nodes or the error symbol. This way, an adversary that commands its captured nodes to propose the same value, say, $v_{byz} \in V$, cannot lead to the selection of $v_{byz}$ without the support of at least one correct node. MR uses the VBB communication abstraction (Fig. 1), which we present (Section 3.1) before we bring the reduction algorithm (Section 3.2).

Validated Byzantine Broadcast (VBB)
This abstraction sends messages from all nodes to all nodes. It provides the operation vbbBroadcast(v) and raises the event vbbDeliver(d), for VBB-broadcasting and VBB-delivering, respectively.

Specifications
We detail VBB-broadcast requirements.
• VBB-validity. VBB-delivery of messages needs to relate to VBB-broadcast of messages in the following manner.
- VBB-justification. Suppose $p_i : i \in Correct$ VBB-delivers a message $m$ that is not the error symbol from some (faulty or correct) node. There is at least one correct node that VBB-broadcast $m$.
- VBB-obligation. Suppose all correct nodes VBB-broadcast the same $v$. All correct nodes VBB-deliver $v$ from each correct node.
• VBB-uniformity. Let $p_i : i \in Correct$. Suppose $p_i$ VBB-delivers $m'$, which is either a message $m$ or the error symbol, from a (possibly faulty) node $p_j$. All the correct nodes VBB-deliver the same message $m'$ from $p_j$.
• VBB-completion. Suppose a correct node $p_i$ VBB-broadcasts $m$. All the correct nodes VBB-deliver from $p_i$.
We also say that a complete VBB-broadcast instance includes a $\mathit{vbbBroadcast}_i(m_i)$ invocation by every correct $p_i \in P$. It also includes vbbDeliver() of $m'$ from at least $(n - t)$ distinct nodes, where $m'$ is either $p_j$'s message, $m_j$, or the error symbol. The latter value is returned when a message from a given sender cannot be validated. This validation requires $m_j$ to be VBB-broadcast by at least one correct node.

Implementing VBB-broadcast
Algorithm 1 presents the studied VBB-broadcast.Notation: Denote by equal (v, rec) and differ (v, rec) the number of items in multiset rec that are equal to, and resp., different from v. Overview: Algorithm 1 invokes BRB-broadcast twice in the first part of the algorithm (lines 1 to 4) and then VBB-delivers messages from nodes in the second part (lines 5 to 11).
Node p i first BRB-broadcasts INIT(i, v i ) (where v i is the VBB-broadcast message), and suspends until the arrival of INIT() from at least (n−t) different nodes (lines 2 to 3), which p i collects in the multiset rec i .In line 2, node p i tests whether v i was BRB-delivered from at least n−2t ≥ t+1 different nodes.Since this means that v i was BRB-broadcast by at least one correct node, p i attests to the validity of v i (line 4).Recall that each time INIT() arrives at p i , the message is added to rec i .Therefore, the fact that |rec i | ≥ n−t holds (line 3) does not keep rec i from growing.
Algorithm 1's second part (lines 5 to 11) includes n concurrent background tasks.Each task aims at VBB-delivering a message from a different node, say, p j .It starts by waiting until p i BRB-delivered both INIT(j, v j ) and VALID(j, x j ) from p j so that p i has both p j 's VBB's values, v j , and the result of its validation test, x j .
Since p j might be faulty, we cannot be sure that v j was indeed validated.Thus, p i re-attests v j by waiting until equal (v j , rec i ) ≥ n−2t holds.If this happens, p i VBB-delivers v j as a message from p j , because this implies equal (v j , rec i ) ≥ t+1 since n−2t ≥ t+1.

2. The x j = False case (line 10). For similar reasons to the former case, p i waits until rec i has at least t+1 items that are not v j . This implies that at least one correct node cannot attest to v j 's validity. If this ever happens, p i VBB-delivers the error symbol as the received message from p j .
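The two delivery cases above can be summarized in a small decision function. This is a sketch under stated assumptions rather than Algorithm 1 itself: the names vbb_decide and ERROR are hypothetical, rec is modeled as a list, and returning None stands for "keep waiting".

```python
ERROR = object()  # hypothetical stand-in for the paper's error symbol


def vbb_decide(v_j, x_j, rec, n, t):
    """Return v_j, ERROR, or None when p_i must keep waiting.

    v_j: value carried by INIT(j, v_j); x_j: flag carried by VALID(j, x_j);
    rec: multiset (list) of values collected from INIT() messages.
    """
    same = sum(1 for v in rec if v == v_j)   # equal(v_j, rec)
    other = len(rec) - same                  # differ(v_j, rec)
    if x_j and same >= n - 2 * t:
        return v_j        # x_j = True case: v_j is re-attested
    if not x_j and other >= t + 1:
        return ERROR      # x_j = False case: some correct node cannot attest v_j
    return None           # neither threshold reached yet


# Example with n = 4 and t = 1, so n - 2t = t + 1 = 2.
rec = ["a", "a", "b"]
print(vbb_decide("a", True, rec, 4, 1))            # delivers "a"
print(vbb_decide("b", False, rec, 4, 1) is ERROR)  # delivers the error symbol
```

Because rec i keeps growing as further INIT() messages arrive, a caller would re-evaluate the function after each arrival until it returns a non-None result.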

Non-stabilizing BFT Multivalued Consensus
Algorithm 2 reduces the BFT MVC problem to BFT Binary consensus in message-passing systems that have up to t < n/3 Byzantine nodes. Algorithm 2 uses the VBB-broadcast abstraction (Algorithm 1). Note that the line numbers of Algorithm 2 continue the ones of Algorithm 1.

Specifications
Our BFT MVC task (Section 1.1) includes the requirements of BC-validity, BC-agreement, and BC-completion (Section 1.1) as well as the BC-no-Intrusion property (Section 1.1).

Implementation
Node p i waits for EST() messages from (n−t) different nodes after it has VBB-broadcast its own value (lines 15 to 16). It holds all the VBB-delivered values in the multiset rec i (line 13) before testing whether rec i includes (1) non-error replies from at least (n−2t) different nodes, and (2) exactly one non-error value v (line 13). The test result is proposed to the Binary consensus object, bcO (line 17).
Once consensus is reached, p i decides according to the consensus result, bcO i .result(). Specifically, if bcO i .result() = False, p i returns the error symbol, since there is no guarantee that any correct node was able to attest to the validity of the proposed value. Otherwise, p i waits until it received EST(v) messages that have identical values from at least (n−2t) different nodes (line 20) before returning that value v. Note that some of these (n−2t) messages were already VBB-delivered at line 16. The proof in [33] shows that any correct node that invokes bcO i .propose(True) does so if all correct nodes eventually VBB-deliver identical values at least (n−2t) times. Then, any correct node can decide on the returned value for the MVC object once it also VBB-delivers identical values at least (n−2t) times.
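The reduction described above can be illustrated with two small functions: the predicate whose result is proposed to the Binary consensus object, and the decision rule applied once that object returns. This is a minimal sketch, not Algorithm 2: the names same_value, decide, and ERROR are hypothetical, rec is modeled as a list of VBB-delivered values, and None stands for "keep waiting".

```python
from collections import Counter

ERROR = object()  # hypothetical stand-in for the paper's error symbol


def same_value(rec, n, t):
    """True iff rec holds at least n-2t non-error replies that all carry one value."""
    non_error = [v for v in rec if v is not ERROR]
    return len(non_error) >= n - 2 * t and len(set(non_error)) == 1


def decide(bc_result, rec, n, t):
    """Decision rule after Binary consensus returned bc_result."""
    if not bc_result:
        return ERROR  # no guarantee that a correct node attested a value
    counts = Counter(v for v in rec if v is not ERROR)
    for value, count in counts.items():
        if count >= n - 2 * t:
            return value  # identical values from at least n-2t nodes
    return None           # wait for further VBB-deliveries


# Example with n = 4 and t = 1.
rec = ["a", ERROR, "a"]
print(same_value(rec, 4, 1), decide(True, rec, 4, 1))
```

The sketch returns the first value that reaches the n−2t threshold; it relies on the protocol's guarantees, rather than on any local check, to exclude two distinct values reaching that threshold after a True decision.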

SSBFT Multivalued Consensus
Algorithms 3 and 4 present our SSBFT VBB solution and our self-stabilizing Byzantine- and intrusion-tolerant solution for MVC. They are obtained from Algorithms 1 and 2 via code transformation and the addition of necessary consistency tests (Sections 4.1.1 and 4.2.1). Note that the line numbers of Algorithms 3 and 4 continue the ones of Algorithms 2, and resp., 3.

SSBFT VBB-broadcast
The operation vbbBroadcast(v) allows the invocation of a VBB-broadcast instance with the value v. Node p i VBB-delivers messages from p k via vbbDeliver i (k).

Algorithm 1's invariants that transient faults can violate
Transient faults can violate the following invariants, which our SSBFT solution addresses via consistency tests.
1. Node p i 's state must not encode the occurrence of BRB execution of phase valid (line 4) without encoding BRB execution of phase init (line 2). Algorithm 3 addresses this concern by informing that the VBB object has an internal error (line 38). This way, indefinite blocking of the application is avoided.
The if-statement in line 47 considers the case in which x i is corrupted. Thus, there is a need to return the error symbol. This happens when p i VBB-delivered VALID() messages from at least n−t different nodes, but none of the if-statement conditions in lines 38 to 45 hold. This fits the consistency test of item 3 in Section 4.1.1, which requires eventual completion in the presence of transient faults.

SSBFT multivalued consensus
The invocation of propose(v) VBB-broadcasts v. Node p i VBB-delivers messages from p k via the result i () operation.

Algorithm 2's invariants that transient faults can violate
As mentioned in Section 1.4, the occurrence of a transient fault can cause the Binary consensus object to encode a decided value that was never proposed, which violates BC-validity.
Any SSBFT solution needs to address this concern since the MVC object can block indefinitely if bcO decides True when ∀p j : j ∈ Correct : sameValue j () = False holds. As we explain next, our implementation uses BV-broadcast (line 72) to test the consistency of the SSBFT Binary consensus object (line 66). This way, indefinite blocking can be avoided by reporting an internal error state.

Local variables
Algorithm 4's state includes the SSBFT BV-broadcast object, bvO, and the SSBFT Binary consensus object, bcO. Each has the post-recycling value of ⊥, i.e., when bvO = ⊥ (or bcO = ⊥) the object is said to be inactive. They become active upon invocation and complete according to their specifications (Sections 2.2.1 and 2.2.2, resp.).
The logic of lines 16 and 17 in Algorithm 2 is implemented by lines 70 and 71, resp. In detail, recall from Section 4.2.3 that mcWait() (line 70) allows waiting until there are at least n − t different nodes from which p i is ready to VBB-deliver a message. Then, if bcO = ⊥ (i.e., the Binary consensus object was not invoked), line 71 uses bcO to propose the returned value from sameValue(). Recall from Section 4.2.3 that the macro sameValue() (line 55) implements the one in line 13 (Algorithm 2), see Section 3.2.2 for details.
Line 72 facilitates the implementation of the consistency test (Section 4.2.1) by using BV-broadcasting for disseminating the returned value from sameValue(). This way it is possible to detect the case in which all correct nodes BV-broadcast a value that is, due to a transient fault, different than bcO's decided one. This is explained when we discuss line 66.
The operation result() (lines 57 to 68) returns the decided value, which lines 18 and 20 implement in Algorithm 2. It is a query-based operation, just as deliver() (cf. text just after Definition 2.1). Thus, line 59 considers the case in which the decision has yet to occur, i.e., it returns ⊥. Line 61 considers the case that line 17 (Algorithm 2) deals with and returns the error symbol. Line 63 implements line 20 (Algorithm 2), i.e., it returns the decided value. Line 66 performs a consistency test for the case in which the if-statement conditions in lines 59 to 63 do not hold, there are VBB-deliveries from at least n − t different nodes (i.e., mcWait i () holds), and yet no correct node, say p j , reports to p i , via BV-broadcast, that the predicate sameValue j () holds. Lemma 5.8 shows that this test addresses the challenge described in Section 4.2.1. Whenever none of the conditions of the if-statements in lines 59 to 66 hold, line 68 returns ⊥.
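The order of the checks in result() can be sketched as a chain of early returns. This only mirrors the prose description above; the function signature, the parameter names, and the way each condition is passed in as an already-evaluated argument are assumptions made for illustration and are not taken from Algorithm 4.

```python
BOTTOM = None     # stands for the symbol ⊥
ERROR = "ERROR"   # hypothetical stand-in for the paper's error symbol


def result(bc_decision, agreed_value, mc_wait, true_reported):
    """Query-style sketch of result().

    bc_decision:   BOTTOM while Binary consensus is undecided, else True/False.
    agreed_value:  value seen identically from at least n-2t nodes, else BOTTOM.
    mc_wait:       True once VBB-deliveries from at least n-t nodes are ready.
    true_reported: True if BV-broadcast reports that sameValue() held somewhere.
    """
    if bc_decision is BOTTOM:
        return BOTTOM          # cf. line 59: no decision yet
    if bc_decision is False:
        return ERROR           # cf. line 61: decided False
    if agreed_value is not BOTTOM:
        return agreed_value    # cf. line 63: the decided value
    if mc_wait and not true_reported:
        return ERROR           # cf. line 66: consistency test failed
    return BOTTOM              # cf. line 68: keep querying


print(result(True, "v", True, True))
```

Since the operation is query-based, a caller would invoke it repeatedly until a non-⊥ value is returned.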

Correctness
As explained in Section 2.2.4, we demonstrate Convergence (Theorem 5.1) by showing that all operations eventually complete since this implies their recyclability, and thus, the SSBFT object recycler can restart their state (Section 2.2.4). For every layer, i.e., VBB-broadcast and MVC, we prove the properties of completion and Convergence (Theorems 5.1 and resp., 5.6) before demonstrating the Closure property, cf. Theorems 5.2 and 5.9 resp.

VBB-completion and Convergence
The proof demonstrates Convergence by considering executions that start in arbitrary states. Theorem 5.1 shows that all VBB objects are completed within a bounded time. Specifically, assuming fair execution among the correct nodes (Section 2.1.4), Theorem 5.1 shows that, within a bounded time, for any pair of correct nodes, p i , and p j , a non-⊥ value is returned from vbbDeliver j (i). As explained in Section 2.2.4, this means that all VBB objects become recyclable, i.e., wasDelivered i () returns True. Since we assume the availability of the object recycling mechanism, the system reaches a post-recycling state within a bounded time. Specifically, using the mechanism by GMRS [26,27], Convergence is completed within O(t) synchronous rounds.
We introduce the CRWF/ACAF notation since the arguments of Theorem 5.1 can be used for demonstrating different properties under different assumptions. Specifically, Theorem 5.1 demonstrates that VBB-completion occurs within O(1) communication rounds (Section 2.1.4) without assuming execution fairness but assuming that execution R starts in a post-recycling system state. For the sake of brevity, when the proof arguments are used for counting the number of Communication Rounds Without assuming Fairness (CRWF), we write 'within O(1) CRWF'. Theorem 5.1 also demonstrates Convergence within O(1) asynchronous cycles assuming fair execution among the correct nodes (Section 2.1.4). Thus, when the proof arguments can be used for counting the number of Asynchronous Cycles while Assuming Fairness (ACAF), we say, in short, 'within O(1) ACAF'. Moreover, when the same arguments can be used in both cases, we say 'within O(1) CRWF/ACAF'.

Figure 1: We assume the availability of SSBFT protocols (cf.Definition 1.2) for Binary consensus and object recycling.The studied problems appear in boldface fonts.The other layers, BRB, BV-broadcast, and state machine replication, are in plain font.
... as an alternating sequence of system states c[x] and steps a[x], such that each c[x + 1], except for the starting one, c[0], is obtained from c[x] by a[x]'s execution.

Argument 3.
Suppose the condition brb i [valid][i] ≠ ⊥ holds in R's starting state. Within O(1) CRWF/ACAF, either the if-statement condition in line 38 holds or the one in line 40 cannot hold. The proof is implied by Algorithm 4's code, BRB-completion, and arguments 1 and 2.

Argument 4.
Within O(1) CRWF/ACAF, vbbDeliver j (i) ≠ ⊥ holds. Suppose the if-statement conditions in lines 38 to 45 never hold. By vbbWait()'s definition (line 26), BRB properties (Definition 2.1), the presence of at least n−t correct and active nodes, and arguments (1) to (3), the if-statement condition in line 47 holds within O(1) CRWF/ACAF. □ Theorem 5.1

Closure of VBB-broadcast
Theorem 5.10 demonstrates Closure by considering executions that start from a post-recycling state, which Theorem 5.1 implies that the system reaches, see Section 5.1 for details. Theorem 5.2's proof shows that no consistency test causes false error indications. Theorem 5.2 counts communication rounds (without assuming fairness) using the CRWF notation presented in Section 5.1.
2. Define the phase types, vbbMSG := {init, valid} (line 21) for BRB dissemination of INIT(), and resp., VALID() messages in Algorithm 1. For a given phase, phs ∈ vbbMSG, the BRB message format must follow the one of BRB-broadcast of phase phs, as in lines 2 and 4. In order to avoid blocking, the VBB object informs about an internal error (lines 42 and 43).
3. For a given phase, phs ∈ vbbMSG, if at least n − t different nodes BRB-delivered messages of phase phs to node p i , p i 's state must lead to the next phase, i.e., from init to valid, or from valid to operation complete, in which p i VBB-delivers a non-⊥ value. Algorithm 3 addresses this concern by monitoring the conditions in which the nodes should move from phase init to valid (line 33). The case in which the nodes should move from phase valid to operation complete is more challenging since a single transient fault can (undetectably) corrupt the state of the BRB objects. Algorithm 3 makes sure that such inconsistencies are detected eventually. When an inconsistency is discovered, the VBB object informs the application about an internal error (line 47), see Section 4.1.5 for more details.
The array brb[vbbMSG][P] holds BRB objects, which disseminate VBB-broadcast messages, i.e., brb[init][] and brb[valid][] store the INIT(), and resp., VALID() messages in Algorithm 1. The second dimension of the array brb[][] allows us to implement one VBB object per node as this is needed for Algorithm 4.
Thus, after the recycling of these objects (Section 2.2.4) or before they ever become active, each of the 2n BRB objects has the value ⊥. For a given p i ∈ P, brb i [-][i] becomes active via the invocation of brb i [-][i].broadcast(v) (which also leads to brb i [-][i] ≠ ⊥) or the arrival of BRB protocol messages, say, from p j (which leads to brb i [-][j] ≠ ⊥). Once a BRB message is delivered from p ℓ (in the context of phase phs ∈ vbbMSG and VBB broadcast from p k ), a call to brb i [phs][k].deliver(ℓ) retrieves the delivered message. Specifically, given a phase, phs, it tests whether there is a set S that includes at least n−t different nodes from which there is a message that is ready to be BRB-delivered. The by testing vbbWait i (i,init) ∧ brb i [init][i].hasTerminated(), where the second clause indicates that brb i [init][i] has terminated (Section 2.2.1), and thus, Item 1 in Section 4.1.1 is implemented correctly. Also, the macro vbbEq() is a detailed implementation of the function equal () (Algorithm 1). Line 38 performs a consistency test that matches Item 1 in Section 4.1.1, i.e., for a given sender p k ∈ P, if p k had invoked brb[valid][k] before brb[init][k]'s termination, an error is indicated via the return of the error symbol. Line 40 follows line 6's logic by testing whether this VBB object is ready to complete w.r.t. sender p k ∈ P.
It does so by checking the state of the two BRB objects in brb[-][k] since they each need to deliver a non-⊥ value. In case any of them is not ready to complete, the operation returns ⊥. The if-statements in lines 42 and 43 return the error symbol when the delivered BRB message is ill-formatted. By that, they fit the consistency test of item 2 in Section 4.1.1. The if-statements in lines 44 to 45 implement the logic of lines 7 to 10 in Algorithm 1. The logic of these lines is explained in items 1, and resp., 2 in Section 3.1.1. Similar to line 7 in Algorithm 1, x i (line 43) is the value that p i BRB-delivers from p k via the BRB object brb i [valid]. As mentioned, the macro vbbDiff () is a detailed implementation of differ () used by Algorithm 1.
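The two-dimensional array of BRB objects and the n−t readiness test can be illustrated as follows. This is a rough sketch under simplifying assumptions: an active BRB object is modeled as a plain list with one slot per sender (None until that sender's message is ready for delivery), None also stands for ⊥ at the object level, and the function names new_brb_array and vbb_wait are hypothetical.

```python
PHASES = ("init", "valid")  # the phase types in vbbMSG


def new_brb_array(n):
    """Post-recycling state: each of the 2n BRB objects is inactive (None for ⊥)."""
    return {phs: [None] * n for phs in PHASES}


def vbb_wait(brb, k, phs, n, t):
    """True iff messages from at least n-t nodes are ready for delivery
    in phase phs of the VBB-broadcast initiated by node k."""
    obj = brb[phs][k]
    if obj is None:          # inactive object: nothing can be delivered
        return False
    ready = sum(1 for msg in obj if msg is not None)
    return ready >= n - t


# Example with n = 4 and t = 1: three ready INIT() messages suffice.
brb = new_brb_array(4)
brb["init"][0] = ["m", None, "m", "m"]
print(vbb_wait(brb, 0, "init", 4, 1), vbb_wait(brb, 0, "valid", 4, 1))
```

Keeping one pair of BRB objects per sender k reflects the role of the array's second dimension, which allows one VBB object per node.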