# Lecture 4. Encryption and digital signatures.

Resources:

UBasel course: 2, 3, 4

Simon Singh, “The Code Book”

RSA paper

time spent in class: 3 hours (did not cover elliptic cryptography)

Diagram codes
flowchart TB
classDef default fill:#F8ECFB, stroke:#000, stroke-width:2px, text-align:left
classDef green fill:#C5F1AB,stroke:#000,stroke-width:2px
classDef red fill:#FFA69D,stroke:#000,stroke-width:2px
classDef blue fill:#DAF4FF,stroke:#000,stroke-width:2px
classDef white fill:#ffffff,stroke:#000,stroke-width:2px
classDef orange fill:#FDDCBE,stroke:#000,stroke-width:2px

SW("<b>Secret writing</b>")
St("<b>Steganography</b>\n(hiding the fact of communication)\n\n")
Cr("<b>Cryptography</b>\n(hiding the content of the message)\n\n")
Tr("<b>Transposition</b>\n(subparts are permutated)\n\n")
Sub("<b>Substitution</b> \n(subparts are substituted)\n\n")
Code("<b>Code</b> \n(words are subbed; need a dictionary)\n\n")
Cipher("<b>Cipher</b>\n(letters are subbed; really low level)\n\n"):::orange
SW-->St
SW-->Cr
Cr-->Tr
Cr-->Sub
Sub-->Code
Sub-->Cipher
HW:

How does the “signature” part work in the RSA scheme? (We only described encryption.)

# 1. Symmetric cryptography

There are various flavors of secret writing. We focus on ciphers.

### Monoalphabetic substitution (1 to 1)

• Simple case: shift alphabet by $x\in\{1,…,25\}$ positions [Caesar Cipher]
• Easy to crack since there are only 25 different shifted alphabets.

• More advanced: arbitrary letter mapping.
• There are 26! different alphabets, so trying them all is hopeless. It was nevertheless cracked using frequency analysis of symbols!
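The shift cipher and its weakness can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function names (`caesar_encrypt`, `caesar_decrypt`, `brute_force`) and the sample plaintext are made up for this example, and only lowercase ASCII letters are shifted.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26 letters a..z


def caesar_encrypt(plaintext: str, shift: int) -> str:
    """Shift every lowercase letter by `shift` positions; leave other characters alone."""
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)


def caesar_decrypt(ciphertext: str, shift: int) -> str:
    """Decryption is encryption with the opposite shift."""
    return caesar_encrypt(ciphertext, -shift)


def brute_force(ciphertext: str) -> list:
    """Try all 25 non-trivial shifts and return (shift, candidate plaintext) pairs."""
    return [(s, caesar_decrypt(ciphertext, s)) for s in range(1, 26)]


ciphertext = caesar_encrypt("blockchain", 3)
candidates = brute_force(ciphertext)
for shift, guess in candidates:
    print(shift, guess)
```

With only 25 candidates, a human (or a dictionary check) can pick out the readable plaintext immediately, which is why the shift cipher offers no real protection.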

### Improved monoalphabetic substitution (1 to N)

• Use symbols that delete preceding symbol (to mess around with frequencies)
• Homophone encryption: use multiple symbols to encrypt one letter (1 to N)
• Intentionally misspelled words
• Replace single words with one symbol (i.e. mix cipher with code)

### Polyalphabetic substitution (N to N)

• Mapping of symbols depends on the position of that symbol in the text! Hard to break.
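• An example, with the shared secret code word “CIF”, and Vigenère Cipher:

| Code word | C | I | F | C | I | F | C | I | F | C |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Plain text | b | l | o | c | k | c | h | a | i | n |
| Cipher text | D | T | T | E | S | H | J | I | N | P |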
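The example above can be reproduced with a short sketch. Each plain-text letter is shifted by the alphabet position of the code-word letter aligned with it (A=0, B=1, ...). The function names are illustrative, and the code assumes the text consists of letters only.

```python
def vigenere_encrypt(plaintext: str, key: str) -> str:
    """Encrypt letters-only text; output is upper case, as in the example above."""
    out = []
    for i, ch in enumerate(plaintext.lower()):
        # Shift is determined by the key letter at the same position (key repeats).
        shift = ord(key[i % len(key)].upper()) - ord("A")
        out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("A")))
    return "".join(out)


def vigenere_decrypt(ciphertext: str, key: str) -> str:
    """Undo the shifts; output is lower case."""
    out = []
    for i, ch in enumerate(ciphertext.upper()):
        shift = ord(key[i % len(key)].upper()) - ord("A")
        out.append(chr((ord(ch) - ord("A") - shift) % 26 + ord("a")))
    return "".join(out)


cipher = vigenere_encrypt("blockchain", "CIF")
print(cipher)                           # DTTESHJINP
print(vigenere_decrypt(cipher, "CIF"))  # blockchain
```

Note how the two occurrences of the letter "c" in "blockchain" are encrypted to different symbols (E and H), which is what defeats naive frequency counting.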

### Breaking Polyalphabetic Substitution

• Simple frequency analysis is not possible
• In the mid-19th century a vulnerability was discovered: the repetition of the code word can be used as a starting point
• Looking for repeated patterns (like the word “the”), try to guess how long the code word is (n), and then use frequency analysis on every n-th cipher letter

### Unbreakable Cipher

• Weakness of Vigenère Cipher: repetition of the code word (= key)
• Solution
• length of key = length of text
• random key (don’t use words or lists)
• use each key only once
• It’s called the “one-time pad cipher” → theoretically unbreakable! [Used between USA and USSR]
• Keys still need to be exchanged (more precisely, lists of keys, so that the parties can communicate more than once)
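A minimal sketch of a one-time pad follows. It is an assumption of this example that the pad is applied with bytewise XOR (the common computer-era variant) rather than with letter shifts as in the Vigenère cipher; the function name `otp_xor` is illustrative.

```python
import secrets


def otp_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a key of the same length. The same function encrypts and decrypts."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"blockchain"
key = secrets.token_bytes(len(message))  # random key, as long as the text, used once
ciphertext = otp_xor(message, key)
recovered = otp_xor(ciphertext, key)
print(recovered)
```

The three conditions from the list above map directly onto the code: the key length equals the message length, the key comes from a cryptographic random source, and a fresh key must be drawn for every message.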

### Encryption in the age of computers

• Electronics are much faster than mechanical parts
• Cipher machines become possible
• Key difference: letters are translated into numbers (e.g. via ASCII encoding)
• Nowadays people use the Unicode standard (e.g. the UTF-8 encoding), which covers letters of different languages, and even emojis :)
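As a small illustration of "letters become numbers", the snippet below uses Python's built-in `ord` and `str.encode`. The sample characters are chosen for this example only.

```python
# ASCII: one letter maps to one small integer.
ascii_code = ord("X")
print(ascii_code)  # 88

# UTF-8: non-ASCII characters take several bytes.
e_acute = "\u00e9".encode("utf-8")    # the letter e with an acute accent
emoji = "\U0001F642".encode("utf-8")  # a smiley emoji
print(len(e_acute), len(emoji))       # 2 4

# A whole text can be read as one (large) integer, which is what ciphers operate on.
as_integer = int.from_bytes("blockchain".encode("ascii"), "big")
back = as_integer.to_bytes(10, "big").decode("ascii")
print(back)  # blockchain
```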

### DES: Data Encryption Standard (block cipher)

• Companies need a standardized approach
• Encryption method “Lucifer” by Horst Feistel in the early 1970s (this is a block cipher, see the previous lecture for the definition)
• Later becomes “DES”
• Disadvantage: key distribution problem persists, need a secure channel of communication “in the past”

# 2. Asymmetric Cryptography

Two ways to solve the “key distribution problem”:

1. Come up with a secure key exchange algorithm
2. Asymmetric cryptography: public and private keys

### [Diffie-Hellman ‘76] key exchange

Gives a solution to the problem of generating a secure key for symmetric encryption on a potentially compromised channel.

The key exchange

Alice and Bob publicly predefine a one-way function $G^x ~(mod~P)$, where $P$ is prime and both $G$ and $P$ are large numbers.

Next they implement the following strategy, where all operations happen $mod~P$:

Alice

1. Chooses privately a=3
2. Sends Bob $x=G^a$, receives $y$
3. Computes $y^a=G^{ab}$

Bob

1. Chooses privately b=6
2. Sends Alice $y=G^b$, receives $x$
3. Computes $x^b=G^{ab}$

Result: the key $G^{ab}$ is now securely shared!
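The exchange can be traced with toy numbers. The private exponents a=3 and b=6 are the ones used above; the public parameters P=23 and G=5 are assumptions made for this sketch and are far too small for real use.

```python
# Public parameters (toy values assumed for illustration only).
P = 23  # prime modulus
G = 5   # base

# Private choices, as in the steps above.
a = 3   # Alice's secret
b = 6   # Bob's secret

# Values sent over the (possibly compromised) channel.
x = pow(G, a, P)  # Alice -> Bob
y = pow(G, b, P)  # Bob -> Alice

# Each side combines the received value with its own secret.
key_alice = pow(y, a, P)  # (G^b)^a
key_bob = pow(x, b, P)    # (G^a)^b

print(x, y, key_alice, key_bob)  # 10 8 6 6
```

An eavesdropper sees only P, G, x and y; recovering a or b from them is the discrete logarithm problem.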

More details: (from Section 10.4.1)
The key reason why Diffie-Hellman key exchange is secure is that computing discrete logarithms is believed to be computationally infeasible for large, well-chosen parameters.

A drawback of Diffie-Hellman key exchange is that a separate setup is needed for every pair of actors, which is a cumbersome process.

💡
[Diffie ‘75]: theoretical idea of asymmetric encryption. Two separate keys: a public key to encrypt messages, and a private key to decrypt them.

(Inspired by Diffie’s idea)

Properties of RSA

Alice keeps priv_key secret and publishes pub_key to everyone. Two scenarios now:

1. Secrecy application: Bob can encrypt a message with pub_key, and this message can be decrypted only by Alice using priv_key.
   [Bob]———msg’=encrypt(msg, pub_key)———>[Alice]
   [Alice]———decrypt(msg’, priv_key)———>msg
2. Integrity and authenticity application: Alice can sign a message with priv_key, and anyone can check using pub_key that the message was signed by Alice (authenticity) and has not been altered since (integrity).
   [Alice]———msg, sig=sign(msg, priv_key)———>[Bob]
   [Bob]———verify(msg, sig, pub_key)———>True/False

Note that the public key in this scheme is a perfect thing to equate with “identity”, an idea that is used everywhere in blockchains.

Key facts and concepts behind RSA:

• Euclidean algorithm ⇒ [gcd(a,b)=d ⇒ ∃x,y s.t. ax+by=d]
• Euler’s totient function ϕ(N) is the number of integers 0<a<N s.t. gcd(a,N)=1 [E.g. if p,q are distinct primes, then ϕ(pq) = pq-(p-1)-(q-1)-1 = (p-1)(q-1)]
• Euler’s theorem: [gcd(a,n)=1 ⇒ $a^{\phi(n)}$ ≡ 1 (mod n)]

A more detailed crash course in modular arithmetic and Euler’s theorem

(numbers are assumed to be integers below)

1. Modular arithmetic: [a≡b (mod n) ⇒ a+x≡b+x (mod n) AND ax≡bx (mod n)]
2. Euclidean algorithm ⇒ [gcd(a,b)=d ⇒ ∃x,y s.t. ax+by=d]

⇒ [gcd(a,n)=1 ⇒ ∃x s.t. ax ≡ 1 (mod n)]

⇒ (cancellation law) [gcd(a,n)=1 AND ax≡ay (mod n) ⇒ x≡y (mod n)]

⇒ [gcd(a,n)=1 ⇒ the residues coprime to n, $\{x_1,x_2,…,x_{\phi(n)}\}$, are permuted by multiplication by a]

⇒ [gcd(a,n)=1 ⇒ $\prod x_i \equiv \prod a x_i \equiv a^{\phi(n)}\prod x_i$ (mod n)]

(via cancellation of each $x_i$ in $\prod x_i \equiv a^{\phi(n)}\prod x_i$)

⇒ [Euler’s theorem: gcd(a,n)=1 ⇒ 1 ≡ $a^{\phi(n)}$ (mod n)]
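The chain of facts above can be checked numerically. The sketch below is only a sanity check on small numbers, not a proof; the helper names (`extended_gcd`, `mod_inverse`, `totient`) are illustrative, and the totient is computed by brute-force counting, which is fine only for tiny n.

```python
from math import gcd


def extended_gcd(a: int, b: int):
    """Return (d, x, y) with d = gcd(a, b) and a*x + b*y = d."""
    if b == 0:
        return a, 1, 0
    d, x, y = extended_gcd(b, a % b)
    return d, y, x - (a // b) * y


def mod_inverse(a: int, n: int) -> int:
    """Return x with a*x = 1 (mod n); exists only if gcd(a, n) = 1."""
    d, x, _ = extended_gcd(a, n)
    if d != 1:
        raise ValueError("a is not invertible modulo n")
    return x % n


def totient(n: int) -> int:
    """Count 0 < a < n with gcd(a, n) = 1 (brute force)."""
    return sum(1 for a in range(1, n) if gcd(a, n) == 1)


n = 15
units = [x for x in range(1, n) if gcd(x, n) == 1]

# Multiplication by a unit permutes the units ...
permuted = sorted(7 * x % n for x in units)
print(permuted == units)

# ... and Euler's theorem holds for every unit.
print(all(pow(a, totient(n), n) == 1 for a in units))
```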

The RSA setup:

1. Alice chooses two primes, p and q; computes N = pq.
2. She chooses an additional integer e such that gcd[e, ϕ(N)]=1, where ϕ(N)=(p-1)(q-1) is the number of positive integers smaller than N that are coprime to N.
3. She computes the private key k by the (extended) Euclidean algorithm, according to the equation e·k = 1 mod ϕ(N). [Only Alice can do it, since only she knows p and q; the condition gcd[e, ϕ(N)]=1 guarantees that k exists.]

Rmk: k is the inverse of e in $\mathbb Z^*_{\phi(N)}$, and is called the trapdoor for the one-way function $\mathbf{x^e~(mod~N)}$, since it allows one to invert it.

The RSA algorithm

| Alice | Bob |
| --- | --- |
| Public key: (N, e). Private key: p, q, k. Conditions: p & q are prime, N=pq; gcd[e, ϕ(N)] = 1; e·k = 1 mod ϕ(N) [note: ϕ(N)=(p-1)(q-1)] | Wants to securely send an integer m to Alice |
| | Encrypts the message by computing $\mathbf{c=m^e}$ (mod N) |
| $\longleftarrow$ c | |
| Decrypts the message: $\mathbf{c^k=m^{ek}=m^{\ell\cdot \phi(N)+1}=(m^{\phi(N)})^\ell m=1^\ell m = m ~(mod~N)}$ | |

Remark: gcd(m, N)=1 is needed as a condition of Euler’s theorem. For a randomly chosen m it holds with overwhelming probability, since only multiples of p or q violate it.

Example with numbers:
1. Alice chooses two primes, p and q (say p=17, q=11); computes N = pq (N=187 in our case); chooses an additional integer e (say e=7), such that gcd[e, (p-1)(q-1)]=1. (N, e) is Alice’s public key ((187, 7) in our case).
2. Bob encodes message M as an integer (e.g., letter X in ASCII, giving M=88) and computes the encrypted message C using Alice’s public key:

   $\mathbf{C=M^e~(mod~N)}$

   In our example $C=88^7~(mod~187)=11$

3. Alice computes the private key $k$ by the (extended) Euclidean algorithm, according to the equation:

   $\mathbf{e\cdot k = 1 ~(mod~ \phi(N))}$

   where $\phi(N)=(p-1)(q-1)$ is the number of positive integers smaller than N that are coprime to N. Only Alice can do it, since only she knows p and q; the condition gcd[e, (p-1)(q-1)]=1 guarantees that $k$ exists. [$k$ is the inverse of $e$ in $\mathbb Z^*_{\phi(N)}$, and is called the trapdoor for the one-way function “x^e (mod N)”, since it allows one to invert it]

   In our case: $7\cdot k=1 ~(mod~ 16\cdot10) = 1 ~ (mod ~ 160) \implies k=23$

4. Alice decrypts Bob’s message using Euler’s theorem, working modulo $N=pq$:

   $C^{k}=(M^e)^{k}=M^{\phi(N)\ell +1}=(M^{\phi(N)})^\ell\cdot M=1^\ell M= M ~(mod ~N)$

   In our case $C^{k}=11^{23}=88 ~( mod ~ 187)$ ⇒ M=88 ⇒ symbol is X via ASCII.
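The numeric example can be replayed directly. This is a toy sketch with the numbers from the example above, not a usable RSA implementation (real RSA needs large random primes and padding). It assumes Python 3.8 or newer, where the built-in `pow(e, -1, phi)` returns a modular inverse.

```python
# Alice's setup
p, q = 17, 11
N = p * q                # public modulus
phi = (p - 1) * (q - 1)  # known only to Alice, since it requires p and q
e = 7                    # public exponent, coprime to phi
k = pow(e, -1, phi)      # private exponent: e*k = 1 (mod phi)

# Bob encrypts the letter X
M = ord("X")
C = pow(M, e, N)

# Alice decrypts
decrypted = pow(C, k, N)

print(N, phi, k)                    # 187 160 23
print(C, decrypted, chr(decrypted)) # 11 88 X
```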

Crucial assumption for security of RSA: it should be infeasible to recover p and q from N ⇒ N needs to be sufficiently large.

Definition: a Digital Signature Scheme (DSS) consists of 3 algorithms:

1. Gen(): outputs a key pair (priv_key, pub_key)
2. Sign(priv_key, msg): outputs a signature σ
3. Verify(pub_key, msg, σ): outputs True or False

Note: unlike signatures in real life, digital signatures depend on the message.

HW: build a DSS using the RSA scheme.

Definition: (informal) a DSS is secure if an adversary who sees many signatures of Alice (on messages of the adversary’s choice) cannot forge a signature on a new message.

Families of DSS: (none of these are quantum-resistant)

1. RSA signatures (not used in blockchain)
   • long signatures and public keys (≥256 bytes), fast to verify
   • might not be resistant long-term (the Quadratic Sieve and the General Number Field Sieve get more efficient as the numbers get larger)
2. ECDSA and Schnorr (Ethereum before the merge, Bitcoin)
   • short signatures (64 and 48 bytes) and public keys (32 bytes)
   • better resistance
3. BLS signatures (Ethereum 2.0)
   • 48 bytes, aggregatable (gamechanger for blockchains), easy threshold building (3 out of 5, say)

Note: post-quantum signatures are long (≥768 bytes). The signatures form the bulk of tx data on a blockchain.

### Elliptic cryptography (ECDSA)

(screenshot taken from this lecture)

• An elliptic curve is the graph of the Weierstrass equation $y^2=x^3+ax+b$, with the non-singularity condition $4a^3+27b^2\neq0$
• It is symmetric about the x-axis.
• Addition of points on Elliptic curve:
• Point doubling

Q. On an elliptic curve, why does geometric addition of points coincide with algebraic addition? What about point doubling?

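• Bitcoin (before Taproot upgrade) and Ethereum (before The Merge upgrade) use secp256k1: $y^2=x^3+7 ~ (mod~ p)$ where $p=2^{256}-2^{32}-2^9-2^8-2^7-2^6-2^4-1$.
• For illustration we use the same equation over the small field $\mathbb F_{37}$. That toy curve consists of 39 points (including the point at infinity)!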

• ECDSA cyclical subgroup:
• the whole subgroup:

• ECDSA private and public keys
• $k_{prv}=9$ is the private key = number of steps in the cyclic subgroup starting from the generator
• $G=(8,1)$ generates the publicly known subgroup $\langle G\rangle$ of order 13 in the 39-element elliptic curve group $y^2=x^3+7$ over $\mathbb F_{37}$.
• $K_{pub}=9G(=G^9)$ = (23,1) is the public key. It is computed efficiently (by repeated doubling)!
• This asymmetry is the reason why it is very easy to derive the public key from the private key (a logarithmic number of steps), but not the other way around (about n steps by naive search, where n=|<G>|=13 in our case)
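The toy group can be explored with a short sketch of point addition and double-and-add scalar multiplication on $y^2=x^3+7$ over $\mathbb F_{37}$. The function names (`add`, `mul`) and the use of `None` for the point at infinity are choices made for this example; modular inverses rely on `pow(x, -1, p)`, available in Python 3.8+.

```python
P_MOD = 37   # field size of the toy curve
A, B = 0, 7  # y^2 = x^3 + A*x + B
G = (8, 1)   # generator of the subgroup of order 13


def add(P, Q):
    """Add two curve points; None represents the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None  # P + (-P) = point at infinity
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD  # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD       # chord slope
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def mul(k, P):
    """Compute k*P with double-and-add: about log2(k) doublings."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


# Count all points: affine solutions plus the point at infinity.
affine = [(x, y) for x in range(P_MOD) for y in range(P_MOD)
          if (y * y - x ** 3 - A * x - B) % P_MOD == 0]
num_points = len(affine) + 1

print(num_points)  # 39
print(mul(9, G))   # (23, 1)  -> K_pub for k_prv = 9
print(mul(13, G))  # None     -> G has order 13
```

Going from 9 to (23,1) takes a handful of doublings and additions, while going back requires trying the multiples of G one by one (feasible here only because the group is tiny).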

### ECDSA signature scheme explained theoretically

• $\langle g\rangle=G$ is a cyclic subgroup of some elliptic curve over $\mathbb F_p$; its order $q=|G|$ should be prime as well
• gen()
• Choose $\alpha \in \mathbb Z^*_q$, compute $u=g^\alpha$
• Output priv_key=$\alpha$ pub_key=$u$
• sign($\alpha$,m)
• Choose $\alpha_t\in \mathbb Z^*_q$, compute $u_t=g^{\alpha_t}=(x,y)$ on the elliptic curve
• Output two things: $r=[x]_q \in \mathbb Z_q$ and $s = (m + [x]_q \alpha) / \alpha_t \in \mathbb Z_q$, where $[x]_q$ denotes $x$ reduced mod $q$
• Repeat until r≠0 and s≠0.
• verify($u$, m, (r,s) )
• Compute a = m / s $\in \mathbb Z^*_q$, b = r / s $\in \mathbb Z^*_q$
• Compute $\hat u _t = g^a u^b = (\hat x, \hat y)\in G$
• Verify whether $[\hat x]_q= r$ (that is, $[\hat x]_q=[x]_q$) is True or False.
• Why does it work?
• The points $u_t$ and $\hat u_t$ are the same:
• $\hat u_t= g^a u^b = g^{m/s} (g^{\alpha})^{r/s}= g^{\frac{m \alpha_t}{m+[x]_q\cdot\alpha}} g^{\alpha\frac{[x]_q\cdot \alpha_t}{m+[x]_q\cdot\alpha} }=\big(g^{\frac{m+[x]_q\cdot\alpha}{m+[x]_q\cdot\alpha}}\big)^{\alpha_t}=g^{\alpha_t}=u_t$

Example of a signature: (simplified)

• $y^2=x^3+7$
• G=(8,1)
• $k_{prv}=9$
• $K_{pub} = (23,1)$
• $t=4$ (message)
1. Choose a random number, e.g. $i=7$ (chosen differently for every signature)
2. Compute (r and s comprise the signature)
   1. $P=i\cdot G=7G=(18,20)$
   2. $r=x_P ~(mod~ 13)= 18~(mod~ 13)=5$
   3. $s= \{i^{-1}(t+rk_{prv})\}~ (mod ~ n)=\{2\cdot(4+5\cdot 9)\} ~( mod ~ 13)=7$, where $n=13$ and $i^{-1}=7^{-1}=2 ~(mod~13)$
   4. (Inverses are computed easily by mirroring)
3. Send
   1. (r, s) = (5, 7)
   2. t=4 (this is what you are signing)
   3. $K_{pub}= (23,1)$

Example of verification: (simplified)

• $y^2=x^3+7$
• G=(8,1)
• $K_{pub} = (23,1)$
• $t=4$
• (r, s) = (5, 7)
1. Compute
   1. $u_1 = (s^{-1}t) ~ (mod ~ n)=2\cdot4 ~(mod~13)=8$
   2. $u_2 = (s^{-1}r) ~ (mod ~ n)=2\cdot5 ~(mod~13)=10$
   3. $P = u_1 G + u_2 K_{pub}= 8 G+10 (23,1)= (32,20)+(8,36)=(18,20)$
2. Check authenticity: $x_P ~ (mod ~ n)=r$ (in our case $18 ~ (mod ~ 13) = 5$)
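The simplified signing and verification steps can be replayed in code. This is a toy sketch under the same simplifications as the example: the message t is used directly (no hashing), the nonce i is passed in explicitly so the numbers are reproducible, and the retry for r=0 or s=0 is omitted. The helper names (`add`, `mul`, `sign`, `verify`) are illustrative; Python 3.8+ is assumed for `pow(x, -1, n)`.

```python
P_MOD = 37  # field of the toy curve y^2 = x^3 + 7
G = (8, 1)  # generator
N_ORD = 13  # order of the subgroup generated by G


def add(P, Q):
    """Point addition; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = 3 * x1 * x1 * pow(2 * y1 % P_MOD, -1, P_MOD) % P_MOD
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)


def mul(k, P):
    """Scalar multiplication by double-and-add."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result


def sign(t, k_prv, i):
    """Return (r, s) for message t, private key k_prv and per-signature nonce i."""
    x_p, _ = mul(i, G)
    r = x_p % N_ORD
    s = pow(i, -1, N_ORD) * (t + r * k_prv) % N_ORD
    return r, s


def verify(t, signature, K_pub):
    """Recompute the point P = u1*G + u2*K_pub and compare its x-coordinate with r."""
    r, s = signature
    s_inv = pow(s, -1, N_ORD)
    u1 = s_inv * t % N_ORD
    u2 = s_inv * r % N_ORD
    point = add(mul(u1, G), mul(u2, K_pub))
    return point is not None and point[0] % N_ORD == r


K_pub = mul(9, G)
signature = sign(4, 9, 7)
print(K_pub, signature)             # (23, 1) (5, 7)
print(verify(4, signature, K_pub))  # True
print(verify(5, signature, K_pub))  # False: the message was changed
```

In a real deployment the nonce must be unpredictable and never reused, since two signatures with the same nonce reveal the private key.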