Consensus with Partial Synchrony
Recall: Partially Synchronous model.
|t=0|____Asynch phase____|t=GST, unknown|____Sync phase, ∆ msg delay ______→
Goals: (for SMR)
- consistency (always)
- liveness (eventually, after GST)
Lecture 8: if f ≥ n/3, no protocol satisfies these goals (even with PKI).
| | Synchrony | Partial synchrony | Asynchrony |
|---|---|---|---|
| Permissioned | ✅ PKI, any f<n ⇒ BB protocol; ❌ no PKI, f≥n/3 ⇒ no BB protocol | ✅ ?; ❌ f≥n/3 ⇒ no BA protocol | ❌ f=1 ⇒ no BA protocol |
Tendermint: [Buchman-Kwon-Milosevic 2018] In partial synchrony, there exists an SMR protocol that, when f < n/3, satisfies consistency (always) + liveness (eventually).
Tendermint: High-Level Ideas
(Remark: with these consensus protocols the devil is usually in the details. Intuition is frequently misleading in distributed systems, so it is always good to insist on a formal proof.)
Idea #1: iterated single-shot consensus. (output of each = ordered list of txs, “block”)
- each node maintains its own “height” (latest block it knows) [in the asynchronous phase, this might differ from node to node]
- every single message is annotated with “what is the next block that I am trying to figure out”.
- if a node is working on block 9, it ignores all messages about other blocks (with a tiny exception, described later)
Idea #2: for a fixed height, keep proposing + voting till agreement (can do it since GST will come at some point).
- BB style, there will be a proposer and voters. Proposer will be rotating.
Idea #3: two stages of voting. (This is the key innovation)
- Why is one stage not enough? Because different nodes may see different voting outcomes, due to (1) Byzantine nodes; (2) asynchrony.
- Each node will see one of three outcomes:
  - restart outcome (the voting failed from its viewpoint)
  - commit outcome (it is convinced)
  - intermediate “hedging” outcome (it is convinced, but not 100%!)
Quorum Certificates (QCs)
Preliminaries:
- Assume PKI: all messages are signed by the sender, and pub keys are distributed before the protocol starts (not needed in general, but needed in Tendermint)
- Round = interval of 4∆ timesteps (shared clock, ∆ known ⇒ all nodes know which round is now)
- Use rotating leaders (one per round) [easy due to shared clock + permissioned setting]
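With a shared clock and a known ∆, every node can compute the current round, phase, and leader locally. A minimal sketch, assuming round-robin rotation over node ids 0..n-1 (the notes only say "rotating leaders"; the function names are illustrative):

```python
def round_of(t: float, delta: float) -> int:
    """Index of the round containing time t, where one round lasts 4*delta."""
    return int(t // (4 * delta))


def phase_of(t: float, delta: float) -> int:
    """Phase (1..4) of the current round at time t; each phase lasts delta."""
    return int((t % (4 * delta)) // delta) + 1


def leader_of(r: int, n: int) -> int:
    """Leader of round r, ASSUMING round-robin rotation over node ids 0..n-1."""
    return r % n


# With delta = 2, round 1 covers times 8..16, so t = 14 falls in its 4th phase.
print(round_of(14, 2), phase_of(14, 2), leader_of(5, 4))
```

Any deterministic rule that all nodes agree on would serve equally well for the leader schedule.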
Definition: a quorum certificate (QC) is a batch of ≥ 2n/3 signed votes for some block B (at some height, at some round, at some stage (1 or 2) of that round).
Lemma: any two QCs overlap in at least one honest node. [simply because they overlap in ≥ n/3 nodes, and we have the bound f < n/3]
Proof: for two QCs with voter sets Q1 and Q2, the overlap is |Q1 ∩ Q2| ≥ |Q1| + |Q2| − n ≥ 2n/3 + 2n/3 − n = n/3 > f, qed.
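The overlap bound can be sanity-checked by brute force for small n. The sketch below assumes a quorum means at least ⌈2n/3⌉ distinct voters and takes f to be the largest integer below n/3; the function name is illustrative.

```python
from itertools import combinations


def min_quorum_overlap(n: int) -> int:
    """Smallest intersection between any two voter sets of size ceil(2n/3).

    Checking minimum-size quorums is enough: larger ones only overlap more.
    """
    q = (2 * n + 2) // 3  # integer form of ceil(2n/3)
    quorums = [set(c) for c in combinations(range(n), q)]
    return min(len(a & b) for a in quorums for b in quorums)


for n in range(4, 9):
    f = (n - 1) // 3  # largest f with f < n/3
    # The overlap exceeds f, so it must contain at least one honest node.
    assert min_quorum_overlap(n) > f
```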
Corollary: any two (⇒all) QCs for some [block # + round + stage] must agree on the block.
This is because each honest node votes only once in one referendum [block # + round + stage]; see later for pseudo-code.
Plan:
- each node i maintains a pair of local variables (B_i, QC_i), where the second is a QC for the first
- initially QC_i is null, and B_i = all unexecuted txs that i knows about
- i periodically updates (B_i, QC_i) to the most recent (according to rounds/stages) block-QC pair it has heard about
- i also saves for future use any QCs for future blocks
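One way to picture the local state in this plan: a QC carries its round and stage, and "most recent" means lexicographic order on (round, stage), with the null QC older than everything. This is a sketch under those assumptions; the class, field, and function names are illustrative, and signatures are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QC:
    height: int  # block number the votes are about
    round: int   # round in which the votes were cast
    stage: int   # 1 or 2
    block: str   # the block the votes support


def recency(qc: Optional[QC]) -> Tuple[int, int]:
    """Sort key for QCs: later round first, then later stage; null is oldest."""
    return (-1, 0) if qc is None else (qc.round, qc.stage)


def at_least_as_recent(a: Optional[QC], b: Optional[QC]) -> bool:
    return recency(a) >= recency(b)


def update_pair(current, heard):
    """Keep whichever (block, QC) pair has the more recent QC."""
    return heard if recency(heard[1]) > recency(current[1]) else current


older = QC(height=9, round=2, stage=2, block="A")
newer = QC(height=9, round=3, stage=1, block="B")
assert update_pair(("A", older), ("B", newer)) == ("B", newer)
```

Note that a stage-1 QC from a later round counts as more recent than a stage-2 QC from an earlier round.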
The Tendermint Protocol (in pseudo-code)
Time is divided into rounds (4∆ timesteps each): _._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._
Fix a height (e.g. block #9), and a round r with leader ℓ. Round r is divided into 4 phases and starts exactly at time 4∆r (since one round = 4∆).
t=4∆r: (phase 1)
- Leader ℓ updates (B_ℓ, QC_ℓ) to the most recent QC known, and proposes (B_ℓ, QC_ℓ) to all nodes (including itself)
- Message looks like this: (round r, height (block #9), B_ℓ, QC_ℓ)
t=4∆r+∆: (phase 2)
- If node i receives (B_ℓ, QC_ℓ) from the leader ℓ (it might not if the msg is delayed)
and
- If QC_ℓ is not older (in terms of rounds/stages) than QC_i
then
- Broadcast first-stage vote for B_ℓ to all nodes (including itself): (round r, height (block #9), vote “yes”, B_ℓ)
- Update (B_i, QC_i) := (B_ℓ, QC_ℓ)
- Broadcast (B_i, QC_i)
t=4∆r+2∆: (phase 3)
- If node i receives a supermajority, that is ≥ 2n/3 round-r stage-1 votes for some block B (counting its own vote and the leader's)
then
- Update (B_i, QC_i) := (B, the newly received votes) (because it knows that this is the most recent QC)
- Broadcast second-stage vote for B
- Broadcast (B_i, QC_i)
t=4∆r+3∆: (phase 4)
- If node i receives ≥ 2n/3 round-r stage-2 votes for some block B
then
- Update B_i := B, QC_i := this QC
- Commit B_i to local history (because the block survived two stages of voting!)
- Broadcast (B_i, QC_i)
- Increment h_i (the height that node i is working on)
- re-initialize: QC_i is null, and B_i = all unexecuted txs that i knows about.
t=4∆r+4∆: (just before the start of next round r+1)
- If node i received in the background a stage-2 QC for block #h_i (the height i is working on) supporting a block B
then
- commit B to local history, increment h_i
(repeat this procedure if possible)
In the background (at all times):
- Store all QCs received for future blocks h_i+1, h_i+2, …
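The per-node logic of phases 2 to 4 can be sketched as follows. This is a heavily simplified illustration, not the protocol itself: signatures, heights, networking, and the background QC handling are omitted, a QC is represented only by its recency pair (round, stage), each entry in a vote list is assumed to come from a distinct sender, and a supermajority is taken to be ⌈2n/3⌉ votes. All names are illustrative.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

NULL_QC = (-1, 0)  # recency of the null QC: older than any real (round, stage)


@dataclass
class Node:
    n: int                          # total number of nodes
    block: Optional[str] = None     # the node's current block
    qc: Tuple[int, int] = NULL_QC   # recency (round, stage) of its current QC
    committed: List[str] = field(default_factory=list)

    def _quorum_block(self, votes, r, stage):
        """Return a block with a supermajority of matching votes, else None."""
        need = (2 * self.n + 2) // 3  # ceil(2n/3)
        tally = Counter(v[2] for v in votes
                        if v is not None and v[:2] == (r, stage))
        for blk, count in tally.items():
            if count >= need:
                return blk
        return None

    def phase2(self, r, prop_block, prop_qc):
        """Adopt and vote for the proposal unless its QC is older than ours."""
        if prop_qc >= self.qc:
            self.block, self.qc = prop_block, prop_qc
            return (r, 1, prop_block)   # stage-1 vote
        return None                     # proposal ignored: node stays locked

    def phase3(self, r, votes):
        """On a stage-1 supermajority, record the new QC and vote in stage 2."""
        blk = self._quorum_block(votes, r, 1)
        if blk is None:
            return None
        self.block, self.qc = blk, (r, 1)
        return (r, 2, blk)              # stage-2 vote

    def phase4(self, r, votes):
        """On a stage-2 supermajority, commit the block and re-initialize."""
        blk = self._quorum_block(votes, r, 2)
        if blk is None:
            return False
        self.committed.append(blk)
        self.block, self.qc = None, NULL_QC
        return True


# One round with 4 honest nodes and timely delivery: everyone commits "B".
nodes = [Node(n=4) for _ in range(4)]
stage1 = [nd.phase2(0, "B", NULL_QC) for nd in nodes]
stage2 = [nd.phase3(0, stage1) for nd in nodes]
assert all(nd.phase4(0, stage2) for nd in nodes)
```

The recency check in `phase2` is what makes a node refuse a proposal backed by an older QC than the one it already holds.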
From the notes: [figure: Tendermint pseudo-code, not reproduced here]
Summary: [figure not reproduced here]
Tendermint: Proof of Consistency
Theorem: Tendermint satisfies SMR consistency (for a given block #, all honest nodes commit the same block).
Proof: Fix a height h (e.g., block #9).
We need to prove that there cannot be two stage-2 QCs for block #9 (from any rounds) that support different blocks. If the rounds of these two QCs are the same then we are done by the overlap lemma!
Let r = the first round in which [>n/3 honest nodes = set S] cast stage-2 votes for the same block B*. This is of course a prerequisite for the creation of a stage-2 QC (because >2n/3 voted, <n/3 Byzantine ⇒ >n/3 honest voted).
Intuition: We want to argue that stage-2 QCs for block #9 can only be for block B*. These [>n/3 honest nodes = set S] simply “lock in” their vote for B*: they are never going to change it in stage 1 because of line 6 in the pseudo-code pic above (the check that the proposed QC is not older than the node's own), and therefore there will never be >2n/3 votes in stage 2 for a block ≠ B*.
Formally: (by induction)
- At the end of round r:
  - (i) B_i = B* for all i ∈ S [by pseudo-code: in round-r 3rd phase >n/3 honest nodes cast votes for B* ⇒ in round-r 4th phase there cannot be QCs for other blocks. There also cannot be QCs received in the background from the past, because round r is the “first” such round by definition]
  - (ii) QC_i is from round-r stage-1 or later for all i ∈ S [obvious from the pseudo-code]
  - (iii) all QCs for other blocks are from round r-1 or earlier [semi-obvious from the pseudo-code]
- In round r+1 no nodes from S change their mind, and (i) + (ii) + (iii) all still hold. [by (ii) + (iii) at the end of round r, any QC the leader can propose for a different block is older than QC_i for every i ∈ S ⇒ the >n/3 nodes of S don’t update and don’t cast stage-1 votes in round r+1 ⇒ no stage-1 QC, and hence no stage-2 QC, for a different block can be formed in round r+1, qed]
- In future rounds: same.
[…see notes for more details…]
qed.
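The counting step behind the lock-in argument can be written out explicitly. Here B^* is notation for the block that the set S voted for in stage 2, and B is any other block; once the more than n/3 honest nodes in S refuse to vote for B, the remaining nodes are too few to form a supermajority:

```latex
\[
\#\{\text{votes for } B \neq B^*\}
\;\le\; n - |S|
\;<\; n - \frac{n}{3}
\;=\; \frac{2n}{3}.
\]
```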
Tendermint: Proof of Liveness
Claim: Tendermint satisfies SMR liveness (eventually).
(Our SMR liveness property is going to be weaker than the one from Lecture 4. Old, strong liveness: every tx submitted to one honest node gets included. New, weaker liveness: every tx submitted to all honest nodes gets included. This is not a big deal, since honest nodes can use a gossip protocol to share valid txs with each other.)
Proof: Consider a tx T known to all honest nodes.
Fast forward to a pair r and r+1 of consecutive rounds after GST+∆ with honest leaders (this exists, since f<n/3)
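The existence of two consecutive honest leaders can be checked by brute force for small n. The sketch assumes round-robin rotation over node ids 0..n-1, which the notes do not specify. Under that assumption each Byzantine node spoils at most two of the n cyclically adjacent leader pairs, and 2f < n whenever f < n/3, so some pair is fully honest. Function names are illustrative.

```python
from itertools import combinations


def has_consecutive_honest_pair(n: int, byzantine: set) -> bool:
    """True if some two cyclically adjacent leaders are both honest."""
    return any(i not in byzantine and (i + 1) % n not in byzantine
               for i in range(n))


def check_all(n: int) -> bool:
    """Try every Byzantine set of size at most f, for the largest f < n/3."""
    f = (n - 1) // 3
    return all(has_consecutive_honest_pair(n, set(bad))
               for k in range(f + 1)
               for bad in combinations(range(n), k))


assert all(check_all(n) for n in range(4, 11))
# The bound matters: with n = 4 and two Byzantine nodes there may be no such pair.
assert not has_consecutive_honest_pair(4, {0, 2})
```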
Lemma: at the start of round r, every honest node is working on block # h or h+1, for some height h. [Roughly, this is because after committing a block, honest nodes broadcast the stage-2 QC for that block, and since we are post-GST, those broadcasts do arrive at the other honest nodes!]
Proof: […see notes…]
Definition: a round is clean if (i) it is post-GST; (ii) its leader is honest; (iii) all honest nodes are working on the same block #; (iv) after the update in the 1st phase, the leader’s QC is at least as recent as that of any honest node
Lemma: clean round ⇒ all honest nodes commit the block proposed by the leader. [proof is by inspecting pseudo-code + remembering we are in post-GST…]
Case 1: all honest nodes start round r working on block # h+1. ⇒ round r is clean, commits block including T
Case 2: all honest nodes start round r working on block # h. ⇒ round r is clean, commits some block ⇒ round r+1 is clean, commits block including T
Case 3: the leader is behind. […]
Case 4: the leader is ahead. […]
Can we do better?
- Can’t increase # of Byzantine nodes (without compromising elsewhere)
- Can’t relax partial synchrony to asynchrony
- Can’t have both liveness and safety before GST
Alternative trade-offs:
- Longest chain consensus favors liveness over safety! (see next Lecture)
Same guarantees (fault-tolerance, consistency, eventual liveness) but better performance (smaller communication complexity, fewer rounds, faster recovery time post-GST, etc):
- see HotStuff (Facebook Diem)
- Casper FFG (now used by Ethereum)