# Lecture 9. The Tendermint Protocol.

## Consensus with Partial Synchrony

Recall: Partially Synchronous model.

|t=0|____Asynch phase____|t=GST, unknown|____Sync phase, ∆ msg delay ______
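A minimal sketch of the delivery guarantee behind this timeline. It assumes the usual formulation of partial synchrony, which these notes do not spell out: a message sent at time t arrives by max(t, GST) + ∆. The name `latest_delivery` is hypothetical.

```python
def latest_delivery(send_time: int, gst: int, delta: int) -> int:
    """Latest time by which a message sent at send_time must arrive.

    Assumption: before GST delays are arbitrary, but every message is
    delivered at most delta after max(send_time, gst).
    """
    return max(send_time, gst) + delta


# Sent in the asynchronous phase: only guaranteed by GST + delta.
print(latest_delivery(send_time=3, gst=100, delta=5))    # 105
# Sent in the synchronous phase: guaranteed within delta.
print(latest_delivery(send_time=120, gst=100, delta=5))  # 125
```

Nodes cannot evaluate this function themselves, since GST is unknown to them; it only describes what the network promises.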

Goals: (for SMR)

• consistency (always)
• liveness (eventually, after GST)

Lecture 8: if f ≥ n/3, no protocol satisfies these goals (even with PKI).

| | Synchrony | Partial synchrony | Asynchrony |
|---|---|---|---|
| Permissioned | ✅ PKI, any f | | |

Tendermint: [Buchman-Kwon-Milosevic 2018] In partial synchrony, there exists an SMR protocol that, when f < n/3, satisfies consistency (always) + liveness (eventually).

## Tendermint: High-Level Ideas

(Remark: with consensus protocols the devil is usually in the details. Intuition is frequently misleading in distributed systems, so it is always good to insist on a formal proof.)

Idea #1: iterated single-shot consensus. (output of each = ordered list of txs, “block”)

• each node $i$ maintains its own “height” $h_i$ (latest block it knows) [in asynchronous phase, might be different for different nodes]
• every single message is annotated with “what is the next block that I am trying to figure out”.
• if a node is working on block 9, it is ignoring all messages about other blocks (with a tiny exception tbd later)
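The height annotation can be sketched as a simple message filter. The `Message` type and the `relevant` helper are hypothetical names for illustration, and the exception mentioned in the last bullet is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    height: int    # block number the sender is working on
    payload: str   # proposal, vote, or QC (kept opaque in this sketch)


def relevant(msg: Message, my_height: int) -> bool:
    """Keep only messages about the block this node is working on."""
    return msg.height == my_height


inbox = [Message(8, "old vote"), Message(9, "proposal"), Message(10, "future QC")]
current = [m.payload for m in inbox if relevant(m, my_height=9)]
print(current)  # ['proposal']
```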

Idea #2: for a fixed height, keep proposing + voting till agreement (can do it since GST will come at some point).

• BB style, there will be a proposer and voters. Proposer will be rotating.

Idea #3: two stages of voting. (This is the key innovation)

• Why is one stage not enough? Because different nodes may see different voting outcomes, due to (1) Byzantine nodes and (2) asynchrony.
• Each node will see one of three outcomes: restart (the vote failed from its viewpoint), commit (it is convinced), or an intermediate “hedging” outcome (it is convinced, but not 100%).

## Quorum Certificates (QCs)

Preliminaries:

• Assume PKI, all messages signed by sender + pub keys distributed before the protocol (not needed in general, but in Tendermint needed)
• Round = interval of 4∆ timesteps (shared clock, ∆ known ⇒ all nodes know which round is now)
• Use rotating leaders (one per round) [easy due to shared clock + permissioned setting]

Definition: a quorum certificate (QC) is a batch of ≥$\frac 2 3 n$ signed votes for some block B (at some height, at some round, at some stage (1 or 2) of that round).

Lemma: any two QCs overlap in at least one honest node. [simply because they overlap in ≥n/3 nodes, and we have a bound f<n/3]

Proof: overlap is $\geq \frac 2 3 n + \frac 2 3 n - n = \frac 1 3 n >f$, qed.
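The arithmetic in this proof can be checked numerically. The sketch below takes the smallest integer quorum size with at least 2n/3 votes and the largest f with f < n/3; the helper names are hypothetical.

```python
def min_quorum(n: int) -> int:
    """Smallest integer q with q >= 2n/3 (ceiling, in integer arithmetic)."""
    return -(-2 * n // 3)


def min_overlap(n: int) -> int:
    """Two sets of size q among n nodes share at least 2q - n members."""
    return max(0, 2 * min_quorum(n) - n)


def max_faults(n: int) -> int:
    """Largest integer f with f < n/3."""
    return (n - 1) // 3


for n in (4, 7, 10, 100):
    print(n, min_quorum(n), min_overlap(n), max_faults(n))
# 4 3 2 1
# 7 5 3 2
# 10 7 4 3
# 100 67 34 33
```

In every row the overlap exceeds the number of Byzantine nodes, so at least one node in the overlap is honest.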

Corollary: any two (⇒all) QCs for some [block # + round + stage] must agree on the block.

This is because each honest node votes only once in one referendum [block # + round + stage]; see later for pseudo-code.

Plan:

• each node $i$ maintains a pair of local variables $(B_i,QC_i)$, where the second is a QC for the first
• initially $QC_i$ is null, and $B_i$ = all unexecuted txs that $i$ knows about
• periodically, $i$ updates the pair to the most recent (according to rounds/stages) block-QC pair it has heard about
• $i$ also saves for future use any QCs for future blocks
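One way to make "most recent according to rounds/stages" concrete is to compare QCs lexicographically by (round, stage), treating a null QC as older than everything. This ordering, the `QC` type, and the `update` helper are assumptions made for illustration; signatures and heights are left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class QC:
    round: int   # round in which the votes were cast
    stage: int   # 1 or 2
    block: str   # block the votes support


def recency(qc: Optional[QC]) -> Tuple[int, int]:
    """Order QCs by (round, stage); a null QC is older than any real one."""
    return (-1, 0) if qc is None else (qc.round, qc.stage)


def update(mine: Optional[QC], heard: Optional[QC]) -> Optional[QC]:
    """Keep whichever QC is more recent (ties keep the current one)."""
    return heard if recency(heard) > recency(mine) else mine


qc_i = None                                            # initially null
qc_i = update(qc_i, QC(round=3, stage=1, block="B"))
qc_i = update(qc_i, QC(round=2, stage=2, block="A"))   # older round: ignored
print(qc_i.block)  # B
```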

## The Tendermint Protocol (in pseudo-code)

Time is divided into rounds (4∆ timesteps each): _._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._|_._._._

Fix a height (e.g. block #9), and a round r with leader $\ell$. The round r is divided into 4 phases, and starts exactly at time 4∆r (since one round = 4∆).
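With a shared clock and known ∆, each node can compute the current round and phase locally. The sketch below assumes integer timesteps and a round-robin leader schedule; the notes only say that leaders rotate, so `leader` is one hypothetical choice.

```python
from typing import Tuple


def round_and_phase(t: int, delta: int) -> Tuple[int, int]:
    """Map a timestep to (round, phase), with rounds of length 4*delta."""
    round_len = 4 * delta
    r = t // round_len
    phase = (t % round_len) // delta + 1   # phases are numbered 1..4
    return r, phase


def leader(r: int, n: int) -> int:
    """Assumed round-robin rotation over nodes 0..n-1."""
    return r % n


delta = 5
print(round_and_phase(60, delta))   # (3, 1): round 3 starts at 4*5*3 = 60
print(round_and_phase(74, delta))   # (3, 3)
print(leader(3, n=4))               # 3
```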

t=4∆r: (phase 1)

• $\ell$ updates $(B_\ell,QC_\ell)$ to the most recent QC known, proposes to all other nodes (including itself)
• Message looks like this: (round r, height (block #9), $(B_\ell,QC_\ell)$, $sign_\ell$)

t=4∆r+∆: (phase 2)

• If node $i$ receives $(B_\ell,QC_\ell)$ from $\ell$ (it might not if msg is delayed)

and

• If $QC_\ell$ is not older (in terms of rounds/stages) than $QC_i$

then

• Broadcast first-stage vote for $B_\ell$ (to all nodes, including itself): (round r, height (block #9), vote for $B_\ell$ “yes”, $sign_i$)
• Update $(B_i,QC_i):=(B_\ell,QC_\ell)$
• Broadcast $(B_i,QC_i)=(B_\ell,QC_\ell)$
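The two conditions of phase 2 can be sketched as a single predicate. Here a QC is reduced to its (round, stage) pair and `None` stands for a null QC; both are simplifying assumptions, and `casts_stage1_vote` is a hypothetical name.

```python
from typing import Optional, Tuple

Recency = Tuple[int, int]  # (round, stage) of a QC; None stands for a null QC


def key(qc: Optional[Recency]) -> Recency:
    """A null QC is treated as older than any real one."""
    return (-1, 0) if qc is None else qc


def casts_stage1_vote(got_proposal: bool,
                      leader_qc: Optional[Recency],
                      my_qc: Optional[Recency]) -> bool:
    """Vote only if a proposal arrived and its QC is not older than ours."""
    return got_proposal and key(leader_qc) >= key(my_qc)


print(casts_stage1_vote(True, None, None))        # True
print(casts_stage1_vote(True, (3, 2), (5, 1)))    # False: leader's QC is older
print(casts_stage1_vote(False, (7, 1), (5, 1)))   # False: proposal was delayed
```

The second case is the one that matters for consistency: a node holding a more recent QC refuses to vote for a proposal backed by an older one.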

t=4∆r+2∆: (phase 3)

• If node $i$ receives supermajority, that is $\geq \frac 2 3 n$ round-r stage-1 votes for $B$ (counting itself and the leader node $\ell$)

then

• Update $QC_i:=$ the newly received votes, $B_i:=B$ (because it knows that this is the most recent QC)
• Broadcast second-stage vote for $B_i$
• Broadcast $(B_i,QC_i)$
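The supermajority check in phase 3 can be sketched as follows. Votes are modeled as (voter, block) pairs and signature verification is assumed to have happened already; `find_quorum` is a hypothetical helper, and counting distinct voters is a defensive choice against duplicated messages.

```python
from collections import defaultdict
from typing import Iterable, Optional, Tuple


def find_quorum(votes: Iterable[Tuple[int, str]], n: int) -> Optional[str]:
    """Return a block backed by >= 2n/3 distinct voters, or None."""
    voters = defaultdict(set)
    for voter, block in votes:
        voters[block].add(voter)           # a repeated vote counts once
    for block, ids in voters.items():
        if 3 * len(ids) >= 2 * n:          # integer form of len(ids) >= 2n/3
            return block
    return None


print(find_quorum([(0, "B"), (1, "B"), (2, "B")], n=4))   # B
print(find_quorum([(0, "B"), (0, "B"), (1, "B")], n=4))   # None
```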

t=4∆r+3∆: (phase 4)

• If node $i$ receives $\geq \frac 2 3 n$ round-r stage-2 votes for $B$

then

• Update $QC_i:=$ this QC, $B_i:=B$
• Commit $B$ to local history (because the block $B$ survived two stages of voting!)
• Broadcast $(B_i,QC_i)$
• Increment $h_i$
• re-initialize: $QC_i$ is null, and $B_i$ = all unexecuted txs that $i$ knows about.

t=4∆r+4∆: (just before the start of next round r+1)

• If node $i$ has received in the background a stage-2 QC for block # $h_i$ supporting a block $B$

then

• Commit $B$ to local history, increment $h_i$

(repeat this procedure if possible)

In the background (at all times):

• Store all QCs received for future blocks $h_i+1$, $h_i+2$, …
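The end-of-round catch-up, combined with the stored QCs, can be sketched as a loop. Here `stored` is a hypothetical map from block number to the block supported by a stage-2 QC, and the height is taken to be the block number the node is currently working on.

```python
from typing import Dict, List


def catch_up(height: int, stored: Dict[int, str], history: List[str]) -> int:
    """Commit blocks while a stage-2 QC for the current height is on file.

    QC verification is assumed to have happened when the QC was received.
    """
    while height in stored:
        history.append(stored.pop(height))
        height += 1
    return height


history: List[str] = []
stored = {9: "B9", 10: "B10", 12: "B12"}   # no QC for block 11 yet
height = catch_up(9, stored, history)
print(height, history)  # 11 ['B9', 'B10']
```

The QC for block 12 stays in storage until block 11 is committed.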

[Figures not reproduced here: pseudo-code from the notes, and a summary.]

## Tendermint: Proof of Consistency

Theorem: Tendermint satisfies SMR consistency (for a given block #, all honest nodes commit the same block).

Proof: Fix a height h (e.g., block #9).

We need to prove that there cannot be two stage-2 QCs for block #9 (from any rounds) that support different blocks. If the two QCs are from the same round, we are done by the overlap lemma!

Let r = first round in which [>n/3 honest nodes = set S] cast stage-2 votes for the same block $B^*$. This is of course a prerequisite for the creation of a stage-2 QC (because ≥2n/3 voted, <n/3 Byzantine ⇒ >n/3 honest voted).

Intuition: We want to argue that stage-2 QCs for block #9 can only be for block $B^*$. These [>n/3 honest nodes = set S] simply “lock in” their vote for $B^*$: they are never going to change it in stage 1, because of line 6 in the pseudo-code pic above, and therefore there will never be ≥2n/3 votes in stage 2 for a block ≠ $B^*$.
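The counting step behind this intuition is small enough to check directly: if more than n/3 nodes refuse to vote for a block, the remaining nodes cannot supply 2n/3 votes on their own. The helper name below is hypothetical.

```python
def other_block_can_reach_quorum(n: int, locked: int) -> bool:
    """True if the n - locked remaining nodes alone could supply >= 2n/3 votes."""
    return 3 * (n - locked) >= 2 * n


for n in (4, 7, 10, 100):
    locked = n // 3 + 1          # smallest set size strictly above n/3
    print(n, locked, other_block_can_reach_quorum(n, locked))
# 4 2 False
# 7 3 False
# 10 4 False
# 100 34 False
```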

Formally: (by induction)

• At the end of round r: (i) $B_i=B^*$ for all $i\in S$ [by pseudo-code: in round-r phase 3, >n/3 honest nodes cast votes for $B^*$ ⇒ in round-r phase 4 there cannot be QCs for other blocks. Nor can there be such QCs from earlier rounds (received in the background), because round r is the “first” such round by definition] (ii) $QC_i$ is from round-r stage-1 or later [obvious from the pseudo-code] (iii) all QCs for other blocks are from round r-1 or earlier [semi-obvious from the pseudo-code]
• In round r+1 no nodes from S change their mind, so (i) + (ii) + (iii) all still hold! [(iii) at the end of round r ⇒ the leader cannot propose a QC for a different block that is at least as recent as $QC_i$ for $i\in S$ ⇒ the >n/3 nodes of S don’t update in round-(r+1) stage 1 → don’t vote ⇒ no QC for a different block can be formed in round r+1, qed]
• In the future rounds: same.

[…see notes for more details…]

qed.

## Tendermint: Proof of Liveness

Claim: Tendermint satisfies SMR liveness (eventually).

(Our SMR liveness property is going to be weaker than the one from Lecture 4. Old, strong liveness: every tx submitted to one honest node gets included. New, weaker liveness: every tx submitted to all honest nodes gets included. This is not a big deal, since honest nodes can communicate via some gossip protocol and share valid txs with each other.)

Proof: Consider a tx T known to all honest nodes.

Fast forward to a pair $r_1$ and $r_2$ of consecutive rounds after GST+∆ with honest leaders $\ell_1,\ell_2$ (such a pair exists, since f<n/3)
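Why such a pair exists can be seen under one concrete rotation. Assuming round-robin leaders (round r is led by node r mod n, which is an assumption, not something the notes fix), each Byzantine node spoils at most two of the n cyclically adjacent leader pairs, and 2f < n leaves at least one pair untouched. The hypothetical helper below searches for it.

```python
from typing import Optional, Set, Tuple


def first_honest_pair(start: int, n: int,
                      byzantine: Set[int]) -> Optional[Tuple[int, int]]:
    """First rounds (r, r+1) with r >= start whose leaders are both honest."""
    for r in range(start, start + n):      # n rounds cover every adjacent pair
        if r % n not in byzantine and (r + 1) % n not in byzantine:
            return r, r + 1
    return None                            # unreachable when f < n/3


# n = 7 nodes, f = 2 Byzantine leaders; search from the first round after GST + delta.
print(first_honest_pair(start=0, n=7, byzantine={1, 4}))   # (2, 3)
```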

Lemma: at the start of round $r_1$, every honest node is working on block # h or h+1. [Roughly, this is because after committing a block, honest nodes broadcast the stage-2 QC for that block, and since we are post-GST, those broadcasts do reach the other honest nodes!]

Proof: […see notes…]

Definition: a round is clean if (i) it is post-GST; (ii) it has an honest leader; (iii) all honest nodes are working on the same block #; (iv) after the update in phase 1, the leader’s QC is at least as recent as that of every honest node

Lemma: clean round ⇒ all honest nodes commit the block proposed by the leader. [proof is by inspecting pseudo-code + remembering we are in post-GST…]

Case 1: all honest nodes start round $r_1$ working on block # h+1. ⇒ round $r_1$ is clean, commits block including T

Case 2: all honest nodes start round $r_1$ working on block # h. ⇒ round $r_1$ is clean, commits some block ⇒ round $r_2$ is clean, commits block including T

Case 3: the leader is behind. […]

## Can we do better?

• Can’t increase # of Byzantine nodes (without compromising elsewhere)
• Can’t relax partial synchrony to asynchrony
• Can’t have both liveness and safety before GST