Coordination
Coordination at scale is an extremely important ability, and arguably one of the things that most distinguishes humans from other animals.
For example, the concept of private property is a form of coordination, around a common ledger of who-owns-what. Some argue that the most important driver for innovation in the west has been “land ownership”, because settlers were incentivised to develop their piece of land.
There are also other forms of coordination.
Let’s try to break down what coordination is. Coordination requires consensus (we agree on a plan) and trust (you will not deceive me). Think of how a family chooses its future spending, how a small community agrees on a budget, etc.
To achieve coordination (consensus+trust) on a bigger scale, centralized control has proved to be very effective:
- Centralized entities can effectively make decisions on a large scale (government policies on the military, public education, public healthcare, etc.)
- Centralized entities can effectively maintain databases (PayPal, Facebook)
- Centralized entities can effectively stop people from stealing from each other, by promising to punish them (police)
Collusion
Another effective tool for coordination, one that requires no trust or consensus, is the free market. There, economic incentives drive people to specialize and exchange goods.
But there is a flipside to free markets: economic incentives may drive people to coordinate in a bad way. This phenomenon is called collusion. A simple example would be someone selling their vote in an election. A more complicated example would be all sellers of a product in a market colluding to raise their prices at the same time.
In addition to free markets and economic incentives, the second most frequent ingredient of collusion is centralized control. Examples are endless:
- A dictator taking advantage of their power, by using citizens’ taxes to enrich themselves or their friends (by signing government contracts with the dictator’s friends), or by sending the country’s soldiers to a war that is destined to be lost.
- A lobbyist giving a politician a bribe in exchange for that politician adopting the lobbyist's preferred policies.
- Companies excessively raising their prices in a monopoly.
- Discretionary censorship on social media platforms.
- Facebook and other internet giants not sharing profits with their users, whose data and attention they are selling (if you think about it, this is a special case of the monopoly example above).
- Banks not sharing revenues with depositors, justifying this by the low-interest-rate environment (again, a special case of the monopoly example).
Blockchain = collusion resistant coordination tool
From this point on we will focus on the following problem: how to safely coordinate people around ownership (“who owns what”)? In simpler terms, how do we keep track of people’s balances in a shared ledger? An almost trivial solution is to use a central party (e.g. PayPal) which will keep track of all the balances. But there is a key problem: achieving ownership coordination (trust+consensus) via centralization heavily skews economic incentives, leading to collusion (= bad coordination). So how can we achieve coordination without trusting a central authority?
- We cannot all trust each other, because there could be malicious actors. Thus we need our communication to be verifiable, and this leads us to cryptography, namely digital signatures.
- But also, how do we agree on a plan without a central authority? We need to have a consensus protocol.
Let me reiterate: fundamentally, blockchains allow good coordination at scale without invoking collusion (bad coordination).
Remark 1. Blockchains are not “fully trustless” because there are certain trust assumptions in the consensus protocol, like “more than 50% of nodes are honest”.
Remark 2. Decentralization has another nice property (in addition to collusion resistance): lack of a single point of failure. This implies fault tolerance and attack resistance.
Remark 3. Collusions happen in blockchains too, but there are ways to defend: counter-coordination by forking!
Before describing how Bitcoin (the first blockchain) works, we need to cover two cryptographic primitives.
Digital Signatures
A digital signature scheme consists of three functions:
- gen() → (priv_key, pub_key). This is a generating function with no input, which outputs an initial pair of strings: Alice keeps priv_key secret, while pub_key represents an “address” of Alice and is known to everyone. The two keys are connected in a special way, such that the two functions below work as described.
- sign(priv_key, message) → sig. This function allows Alice (since only she knows priv_key) to “sign” any message with her private key; it outputs a signature string sig. Note that, unlike real-life signatures, digital signatures depend on the message.
- verify(pub_key, message, sig) → True if sig was obtained as sign(priv_key, message), and False otherwise. This function allows anyone to verify whether sig was indeed obtained by signing message with priv_key, without knowing priv_key.
The key property of digital signatures is that
- Digital signatures are unforgeable, i.e. no one except Alice (the holder of priv_key) can feasibly generate a signature sig that passes verification. This ensures that if verify returns True, we can be confident that the message was signed by Alice.
It is a great achievement of modern cryptography that digital signatures exist. More on how they actually work later.
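To make the three functions concrete, here is a minimal sketch based on Lamport one-time signatures, a scheme built only from a hash function. This choice is an assumption made for illustration, because it fits in a few lines of standard-library Python. It is not the scheme Bitcoin uses, and each key pair is only safe for signing a single message. The helper names `_h` and `_message_bits` are hypothetical.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    """SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def _message_bits(message: bytes) -> list:
    """The 256 bits of hash(message), most significant bit first."""
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def gen():
    """Return (priv_key, pub_key): 256 pairs of secrets and their hashes."""
    priv_key = [(secrets.token_bytes(32), secrets.token_bytes(32))
                for _ in range(256)]
    pub_key = [(_h(s0), _h(s1)) for s0, s1 in priv_key]
    return priv_key, pub_key


def sign(priv_key, message: bytes) -> list:
    """Reveal one secret per bit of hash(message)."""
    return [priv_key[i][bit] for i, bit in enumerate(_message_bits(message))]


def verify(pub_key, message: bytes, sig) -> bool:
    """Check that every revealed secret hashes to the matching public value."""
    if len(sig) != 256:
        return False
    return all(_h(sig[i]) == pub_key[i][bit]
               for i, bit in enumerate(_message_bits(message)))


priv_key, pub_key = gen()
message = b"Alice sends Bob 2 e-coins"
sig = sign(priv_key, message)
print(verify(pub_key, message, sig))                         # True
print(verify(pub_key, b"Alice sends Bob 200 e-coins", sig))  # False
```

Note how the signature depends on the message: the same sig does not verify for a different message, which is exactly the difference from real-life signatures mentioned above.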
Hash functions
A hash function is a function
- hash(any-length-message) → 256-bit number (fixed length!)
Key properties of hash functions:
- Irreversibility: given y, it is infeasible to find x such that hash(x)=y
- Collision resistance: it is infeasible to find two different inputs x ≠ x’ such that hash(x)=hash(x’). In particular, given x, it is infeasible to find such an x’.
- Uniform randomness: large number of possible outputs, each occurring with roughly the same small probability.
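As a quick illustration of these properties, the sketch below uses SHA-256 from Python's standard library as the hash function. The wrapper name `hash256` is hypothetical; it only converts the digest into a 256-bit number to match the description above.

```python
import hashlib


def hash256(message: bytes) -> int:
    """Map a message of any length to a 256-bit number using SHA-256."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


short = hash256(b"a")
long_input = hash256(b"a" * 1_000_000)
nearby = hash256(b"b")

# The output always fits in 256 bits, whatever the input length.
print(short < 2**256, long_input < 2**256)  # True True

# A one-character change in the input flips a large share of the output bits.
print(bin(short ^ nearby).count("1"))
```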
Next week we will devote 2-3 lectures to the details of hash functions and digital signatures. But now let’s move on and consider how one could design an electronic currency (e-coin).
Introduction to Bitcoin
We will build up a Bitcoin-style cryptocurrency step-by-step, by proposing various versions, observing problems in them, and solving those problems.
- Alice tries to send Bob some digital file as an “e-coin”.
- Alice cryptographically signs a message of type “Alice sends Bob 2 e-coins”
- Alice cryptographically signs messages of type “Alice_tx#174: send Bob 2 e-coins”
- Some trusted centralized entity (bank) is watching the balances.
- Substitute the bank by a set of nodes: the nodes maintain a shared ledger of balances (validating the signatures/balances).
- Introduce fees: Alice signs messages of type “Alice_tx#174: Alice sends Bob 2 e-coins, and pays 0.1 e-coin as a fee to the node validating this tx”.
- Make nodes do some sort of cross-checking, in order for them to achieve consensus (stay in sync). Achieving consensus means that they all agree on the same set of confirmed transactions.
- Longest-chain consensus mechanism (instead of the cross-checking idea);
- Proof-of-work sybil-resistance (made possible by the mechanics of 1.)
- Use longest-chain consensus instead of cross-checking: once a node has a valid block of txs, it simply appends it to the blockchain and broadcasts this new blockchain to others.
- Use proof-of-work as a sybil-resistance mechanism: make it computationally costly to create blocks. In more detail:
  - Alice broadcasts “Alice_tx#174: Alice sends Bob 2 e-coins; fee = 0.1 e-coin”.
  - A node (say David) adds this message to their pending queue of txs.
  - Then David, after checking that the txs add up and are valid, first solves the proof-of-work puzzle, which requires computational power (see 🟢 below), and only then appends the block to the longest chain.
- Reward nodes for mining by inflating the total supply of e-coins.
- Make rewards decrease by 50% every 4 years. This way the total supply of e-coins will be capped.
Problem: copying this kind of money is very easy.
Benefit: nobody can do it, except Alice, and also we can be sure that Alice did it.
Problem: what if Alice sends 3 such messages to Bob? Did she send 2 or 6 e-coins?
Benefit: if Alice sends Bob 3 identical messages “Alice_tx#174: send Bob 2 e-coins”, then she sent 2 e-coins. If Alice sends Bob the 3 messages “Alice_tx#174: send Bob 2 e-coins”, “Alice_tx#175: send Bob 2 e-coins” and “Alice_tx#176: send Bob 2 e-coins”, then she sent 6 e-coins.
Remark: this is the solution used in the Ethereum blockchain. In Bitcoin the coins themselves are numbered, not the transactions; see “the UTXO model” section at the end.
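The effect of numbering transactions can be sketched in a few lines. This is only an illustration: the tuple layout and the function name `total_received` are hypothetical, and signature checking is left out.

```python
def total_received(messages: list) -> int:
    """Sum the amounts, counting each (sender, tx number) pair only once."""
    seen = set()
    total = 0
    for sender, tx_number, recipient, amount in messages:
        if (sender, tx_number) in seen:
            continue  # a replayed copy of a tx that was already counted
        seen.add((sender, tx_number))
        total += amount
    return total


# Three copies of the same numbered tx versus three distinct txs.
replayed = [("Alice", 174, "Bob", 2)] * 3
distinct = [("Alice", n, "Bob", 2) for n in (174, 175, 176)]

print(total_received(replayed))  # 2
print(total_received(distinct))  # 6
```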
The bank does two things: (1) validating txs = checking that the signature is valid and that the balance allows for the spend; (2) confirming txs = accepting the tx and updating the balances of Alice and Bob.
Benefit: no double-spending.
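The bank's two jobs can be sketched as a tiny class. This is a simplified illustration under stated assumptions: the class and method names are hypothetical, amounts are whole e-coins, and the signature check is reduced to a boolean flag that a real system would compute with verify.

```python
class Bank:
    """A central ledger that validates and confirms payments."""

    def __init__(self, balances: dict):
        self.balances = dict(balances)

    def validate(self, sender: str, amount: int, signature_ok: bool) -> bool:
        """Valid means: good signature and enough balance for the spend."""
        return (signature_ok and amount > 0
                and self.balances.get(sender, 0) >= amount)

    def confirm(self, sender: str, recipient: str, amount: int,
                signature_ok: bool = True) -> bool:
        """Accept the tx and update both balances, or reject it."""
        if not self.validate(sender, amount, signature_ok):
            return False
        self.balances[sender] -= amount
        self.balances[recipient] = self.balances.get(recipient, 0) + amount
        return True


bank = Bank({"Alice": 3})
print(bank.confirm("Alice", "Bob", 2))      # True
print(bank.confirm("Alice", "Charlie", 2))  # False, only 1 e-coin is left
print(bank.balances)                        # {'Alice': 1, 'Bob': 2}
```

Because a single party sees every tx in order, the second spend is rejected. That is the sense in which a central ledger prevents double-spending.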
Problem: banks have an enormous power by controlling people’s balances ⇒ their economic incentives are skewed towards collusion.
Remark 1: in reality we have a much bigger problem:
Even though it seems that banks holding our money provide their services “for free”, this is far from the truth: in reality, banks extract quite a lot of value from their clients’ capital by investing it in relatively safe assets like government treasuries, ETFs, etc. The key idea here is that “being able to manage someone’s capital” has value. This is why banks often pay their customers, especially to attract new ones. (Since changing banks is extremely rare in practice, banks do not worry as much about retaining customers.)
Remark 2: even if banks just maintained a ledger of our money (“watch the balances” as opposed to “have the balances”), they would still have too much power and therefore heavily skewed economic incentives. (As an example: we will stop watching your balances unless you deposit your money with us.)
Benefit: no centralized control ⇒ no collusion.
Remark 1: since there are many transactions, it is convenient to confirm (= validate signatures/balances) txs in “batches”, or, as we call them, blocks. As a result, we obtain a data structure called a block chain. Each block also stores the hash of the previous block, which makes the chain tamper-evident.
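Here is a minimal sketch of why linking blocks by hashes makes tampering evident. The block layout (a dict with `prev_hash` and `txs`) and the function names are assumptions for illustration, not Bitcoin's actual block format.

```python
import hashlib
import json


def block_hash(block: dict) -> str:
    """Hash a block by hashing its canonical JSON encoding."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def append_block(chain: list, txs: list) -> None:
    """Add a block that commits to the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev_hash, "txs": txs})


def is_consistent(chain: list) -> bool:
    """Check that every block points at the hash of its predecessor."""
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))


chain = []
append_block(chain, ["Alice_tx#174: send Bob 2 e-coins"])
append_block(chain, ["Bob_tx#12: send Carol 1 e-coin"])
append_block(chain, ["Carol_tx#3: send Alice 1 e-coin"])
print(is_consistent(chain))  # True

# Rewriting an old tx changes that block's hash and breaks the next link.
chain[0]["txs"][0] = "Alice_tx#174: send Bob 200 e-coins"
print(is_consistent(chain))  # False
```

To hide the change, an attacker would have to recompute every later block as well, which is what the later steps make expensive.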
Remark 2: in reality, the division between users and nodes happens by downloading different kinds of open-source software. The fact that it is open-source is extremely important to eliminate security/centralization risks. Another extremely important consequence is that anyone can download and run the software, and therefore anyone can become a node in the network; this makes the system permissionless.
Problem: the free-rider problem. People will happily use such a public-good type of service, but why would nodes actually maintain the ledger?
Benefit: nodes now have an economic incentive to maintain the network.
Problem: double-spending is still possible for Alice — she, having only 2.1 e-coins, can simultaneously (1) send 2 e-coins to Bob and broadcast it to one set of nodes; (2) send 2 e-coins to Charlie and broadcast it to another set of nodes.
Thus we need a way for nodes to stay in sync. We say that nodes need to achieve consensus.
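The double-spend above can be reproduced with two nodes that each keep their own copy of the ledger. This is a simplified sketch: the function name `try_apply` is hypothetical, fees and signatures are ignored, and amounts are in units of 0.1 e-coin to avoid rounding issues.

```python
def try_apply(balances: dict, sender: str, recipient: str, amount: int) -> bool:
    """Apply a payment to one node's local ledger if the balance allows it."""
    if balances.get(sender, 0) < amount:
        return False
    balances[sender] -= amount
    balances[recipient] = balances.get(recipient, 0) + amount
    return True


# Amounts are in units of 0.1 e-coin, so 21 means 2.1 e-coins.
node_a = {"Alice": 21}
node_b = {"Alice": 21}

# Alice broadcasts conflicting payments to two different sets of nodes.
print(try_apply(node_a, "Alice", "Bob", 20))      # True
print(try_apply(node_b, "Alice", "Charlie", 20))  # True

# Each node accepted a valid-looking tx, but the ledgers now disagree.
print(node_a == node_b)                           # False
```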
Benefit: solves double-spending problem.
Problem: since our system is permissionless, Alice can spin up her own malicious nodes in order to hijack the cross-checking part. Broadly speaking, our system is vulnerable to Sybil attacks […], which means that anyone can spin up more than one node and thus increase their voting power.
Important remark: there are ways to tackle this problem in the permissioned setting (where nodes know each other in advance), but nobody before Satoshi Nakamoto figured out how to build electronic money that is both permissionless and sybil-resistant. The ground-breaking Bitcoin paper introduced two novel ideas, the longest-chain consensus mechanism and proof-of-work sybil-resistance:
Of course, there will be competing blocks (forks), simply because nodes may create blocks simultaneously or because there are some malicious blocks. Thus our data structure is no longer a block chain, but rather is an in-tree.
So the question is: which chain of the in-tree is considered “correct”? The rule is to follow the longest chain.
If there is a tie (several longest chains), just wait a bit and one of the chains will start winning. As soon as it wins just a little, all honest nodes start building on it and it starts to win confidently.
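The longest-chain rule on an in-tree can be sketched as follows. The representation (a dict mapping each block to its parent) and the function names are assumptions for illustration; a tie simply shows up as more than one returned chain.

```python
from typing import Dict, List, Optional


def chain_to(block: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Walk parent links from a block back to the genesis block."""
    chain = []
    current: Optional[str] = block
    while current is not None:
        chain.append(current)
        current = parent[current]
    return chain[::-1]


def longest_chains(parent: Dict[str, Optional[str]]) -> List[List[str]]:
    """Return every chain of maximal length (more than one means a tie)."""
    tips = set(parent) - {p for p in parent.values() if p is not None}
    chains = [chain_to(tip, parent) for tip in sorted(tips)]
    best = max(len(chain) for chain in chains)
    return [chain for chain in chains if len(chain) == best]


# genesis <- A <- B <- C, with a competing block B2 also built on A.
tree = {"genesis": None, "A": "genesis", "B": "A", "B2": "A", "C": "B"}
print(longest_chains(tree))  # [['genesis', 'A', 'B', 'C']]

# Before C arrives, B and B2 are tied and nodes have to wait.
tied = {"genesis": None, "A": "genesis", "B": "A", "B2": "A"}
print(len(longest_chains(tied)))  # 2
```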
Problem: this system still suffers from Sybil attacks, because blocks can be “rolled back” or “reorganized” if malicious nodes build up a longer chain.
So we need to somehow make sure that fewer than 50% of the nodes are malicious.
Benefit: the system transforms from “one-node-one-vote” to “one-CPU-one-vote”. So in order to perform the reorg attack, a malicious actor would need more than 50% of the computational power on the network.
This is a great way to mitigate Sybil attacks, since it is much harder to obtain more than 50% of the computational power in the network than to control more than 50% of the nodes.
Terminology: to mine means to solve the proof-of-work puzzle. This is why in Bitcoin nodes are often called miners.
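A common form of proof-of-work puzzle, and the one assumed in this sketch, is to find a number (a nonce) such that the hash of the block data together with the nonce falls below a target. The function names and the `difficulty_bits` parameter are hypothetical, and the sketch hashes a plain byte string with a single SHA-256 rather than a real Bitcoin block header.

```python
import hashlib


def pow_hash(block_data: bytes, nonce: int) -> int:
    """Hash the block data together with a candidate nonce."""
    digest = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big")


def check(block_data: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verifying a solution takes a single hash evaluation."""
    return pow_hash(block_data, nonce) < 2 ** (256 - difficulty_bits)


def mine(block_data: bytes, difficulty_bits: int) -> int:
    """Try nonces one by one until the hash is below the target."""
    nonce = 0
    while not check(block_data, nonce, difficulty_bits):
        nonce += 1
    return nonce


block = b"Alice_tx#174: Alice sends Bob 2 e-coins; fee = 0.1 e-coin"
# With 12 difficulty bits, mining takes about 2**12 attempts on average.
nonce = mine(block, difficulty_bits=12)
print(check(block, nonce, 12))  # True
```

The asymmetry is the point: finding the nonce takes many hash attempts, while checking it takes one. Each extra difficulty bit doubles the expected mining work.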
Problem: why would David waste a lot of his electricity on this mining… And why is it called “mining”? :)
Benefit: miners have the incentive to perform validation / mining.
Problem: we infinitely inflate the total supply of e-coins, and their value depreciates.
In Bitcoin the reward halves every 210,000 blocks (roughly every 4 years). Initially it was 50 BTC; after the third halving, in 2020, it became 6.25 BTC.
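To see why halving caps the supply, we can add up all the rewards. The sketch below uses the Bitcoin numbers quoted above and assumes, as Bitcoin does, that amounts are counted in integer units of 10^-8 BTC (satoshis), so the reward eventually rounds down to zero. The function name `total_supply` is hypothetical.

```python
def total_supply(initial_reward: int, blocks_per_halving: int) -> int:
    """Sum all block rewards, halving (rounding down) until the reward is zero."""
    total = 0
    reward = initial_reward
    while reward > 0:
        total += reward * blocks_per_halving
        reward //= 2
    return total


SATOSHI_PER_BTC = 100_000_000
supply = total_supply(50 * SATOSHI_PER_BTC, 210_000)

# 50 * 210,000 * (1 + 1/2 + 1/4 + ...) approaches but never reaches 21 million.
print(supply / SATOSHI_PER_BTC)
print(supply < 21_000_000 * SATOSHI_PER_BTC)  # True
```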
Remark: whether or not this will work in the future is not clear at all, since currently Bitcoin fees alone are not enough to make Bitcoin mining profitable.
Remark 1: there needs to be a “genesis” block in every blockchain, creating the initial e-coins. In Bitcoin, the genesis block was created with 50 BTC in the wallet of Satoshi Nakamoto.
Remark 2: in longest-chain consensus blockchains (e.g. Bitcoin, Ethereum), a transaction is not considered confirmed until: (1) it is part of a block in the longest fork, and (2) at least 5 blocks follow it in the longest fork. In this case we say that the transaction has “6 confirmations”. This gives the network time to come to an agreement on the ordering of the blocks.
The UTXO model
To give a logically complete picture of how Bitcoin works, we end by describing the UTXO model.
Ethereum uses the more natural account model of keeping track of users’ balances. It is a simple database that keeps track of users’ addresses and their balances.
Bob’s_address: 3 BTC
Alice’s_address: 1.5 BTC
Carol’s_address: 2 BTC
Transactions in the account model logically look very simple: [Alice → Bob, 0.5 BTC]
Bitcoin uses the UTXO model, which is best described as a database that keeps track of “chunks of coins” called UTXOs (unspent transaction outputs), along with their owners.
utxo1(1 BTC): Bob’s_address
utxo2(2 BTC): Bob’s_address
utxo3(1.5 BTC): Alice’s_address
utxo4(2 BTC): Carol’s_address
Transactions in the UTXO model look like [utxo3(1.5 BTC):Alice → utxo5(0.5 BTC):Bob, utxo6(1 BTC):Alice]. Note that utxo6 in this example is the “change”.
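The example transaction can be replayed with a small sketch of a UTXO database. The data layout and the function name `apply_tx` are assumptions for illustration: the `signer` argument stands in for a real signature check, and amounts are floats only to match the numbers above (real systems use integer units).

```python
def apply_tx(utxos: dict, inputs: list, outputs: list, signer: str) -> dict:
    """Spend whole input UTXOs and create new ones; return the new database.

    utxos maps a UTXO id to (owner, amount); outputs are (id, owner, amount).
    """
    if len(set(inputs)) != len(inputs):
        raise ValueError("an input is listed twice")
    for utxo_id in inputs:
        if utxo_id not in utxos:
            raise ValueError(f"{utxo_id} does not exist or is already spent")
        if utxos[utxo_id][0] != signer:
            raise ValueError(f"{utxo_id} is not owned by {signer}")
    if any(new_id in utxos for new_id, _, _ in outputs):
        raise ValueError("an output id is already in use")
    spent = sum(utxos[utxo_id][1] for utxo_id in inputs)
    created = sum(amount for _, _, amount in outputs)
    if created > spent:
        raise ValueError("outputs exceed inputs")
    # Inputs disappear entirely; outputs (including change) become new UTXOs.
    updated = {k: v for k, v in utxos.items() if k not in inputs}
    for new_id, owner, amount in outputs:
        updated[new_id] = (owner, amount)
    return updated


utxos = {
    "utxo1": ("Bob", 1.0),
    "utxo2": ("Bob", 2.0),
    "utxo3": ("Alice", 1.5),
    "utxo4": ("Carol", 2.0),
}
outputs = [("utxo5", "Bob", 0.5), ("utxo6", "Alice", 1.0)]
utxos = apply_tx(utxos, ["utxo3"], outputs, signer="Alice")
print(sorted(utxos))  # ['utxo1', 'utxo2', 'utxo4', 'utxo5', 'utxo6']
```

Since utxo3 no longer exists after the transaction, any attempt to spend it again is rejected. This is how numbering the coins, rather than the transactions, rules out replays.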