# A hitchhiker's guide to cryptography

## An introduction to cryptography

This chapter serves as an introduction to the cryptographic terms and constructs mentioned in the book. The aim is to give you an idea of what they are and how they might be used in a cryptocurrency context. I won’t go into low-level details of how they work, so you don’t need to know any mathematics or programming to follow along. If this interests you, I hope this introduction will be helpful as a starting point when researching the topics on your own.^{1}

^{1}If the history of cryptography interests you I can also recommend the book “The Codebreakers” by David Kahn. You can enjoy it even without much math knowledge.

## Hash functions

Hash functions, or to be more precise *cryptographic hash functions*, are commonly used in the cryptocurrency space. They’re used as the basis of proof-of-work, to verify the integrity of downloaded files and we used them when we created a timestamped message.

Different inputs can occasionally produce the same hash (this is called a *collision*).

Hashes are *one-way functions*. As the name implies we can give data to a function to get a result, but we cannot go the other way to get back the original data if we only have the hashed result.

It’s similar to how we can break an egg, but there’s no easy way to “unbreak” it.

In the digital world we can use the popular SHA-256 hash function as an example:

`hello → 5891b5b522d5df086d0ff0b110fbd9d21bb4fc7163af34d08286a2e846f6be03`

But there’s no function to unwrap the hash directly:

`084c799cd551dd1d8d5c5f9a5d593b2e931f5e36122ee5c793c1d08a19839cc0 → ???`

To find out what’s hidden behind the hash we have to try all possibilities:

```
1 → 4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865
2 → 53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3
...
42 → 084c799cd551dd1d8d5c5f9a5d593b2e931f5e36122ee5c793c1d08a19839cc0
```

Found it! The answer is “42”. We were lucky that we only had to test 42 possibilities; depending on the input, we could have kept going for a **very** long time.^{2}
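The search above can be sketched in a few lines of Python using the standard `hashlib` module. The helper names `sha256_hex` and `guess_number` are made up for this illustration. The trailing newline is my own assumption: the digests printed in this chapter look like they were computed on input followed by a newline (as a shell `echo` would add), so the sketch does the same.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def guess_number(target_hash: str, limit: int = 1_000_000):
    """Try 1, 2, 3, ... until one candidate hashes to `target_hash`."""
    for candidate in range(1, limit + 1):
        # Assumption: the hashed input ends with a newline.
        if sha256_hex(f"{candidate}\n") == target_hash:
            return candidate
    return None  # nothing matched within the limit


# Hash a secret number, then recover it by trying every candidate in order.
target = sha256_hex("42\n")
print(guess_number(target))  # prints 42
```

There is no shortcut in `guess_number`: the only way it learns anything about the input is by hashing one candidate after another.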

Don’t believe me? Then try to guess which message produced this SHA-256 output. Here’s a hint: it contains only spaces and upper- and lowercase letters.^{3}

`b409d7f485033ac9f52a61750fb0c54331bfdd966338015db25efae50984f88f`

To get a sense for how hard it can be to figure out the matching data for a hash, let’s look at the mining in Bitcoin. Because that’s really what miners do—they calculate SHA-256 hashes with different kinds of input again and again until they find a match. And they don’t require an exact match either, they only want to find a hash with a certain number of leading zeroes.
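Here is a toy version of that search, as a rough sketch only. Real Bitcoin mining hashes a binary block header twice and compares the result against a numeric target; this simplified `mine` function (a name invented for the example) just appends a counter to some text and looks for a digest that starts with a chosen number of zero hex digits.

```python
import hashlib


def mine(block_data: str, zero_digits: int):
    """Find a counter ("nonce") so the digest starts with `zero_digits` zeros."""
    prefix = "0" * zero_digits
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1  # no match, so try the next input


# Four zero hex digits take about 65 000 attempts on average.
nonce, digest = mine("toy block", zero_digits=4)
print(nonce, digest)
```

Each extra zero hex digit makes the search about 16 times longer, which is how the difficulty can be tuned.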

The current hashrate for Bitcoin is around 113 exahashes per second (2020-02-18). That’s a staggering 113 × 10^{18}, or 113 000 000 000 000 000 000, hashes per second, yet miners are still only expected to find a single solution every 10 minutes.

Even all of Bitcoin’s hashrate, working for millions of years, is not expected to find the reverse of a single hash. Even though there’s theoretically an infinite number of inputs that produce the same hash, it’s computationally infeasible to ever find one, so we can consider it practically impossible to reverse a hash.^{4}
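A little arithmetic gives a feel for these numbers. This is a back-of-the-envelope estimate, assuming the hashrate of 113 exahashes per second quoted above stays constant and that reversing a hash means trying on the order of 2^256 inputs (one per possible SHA-256 output).

```python
HASHRATE = 113 * 10**18            # hashes per second, from the text
SECONDS_PER_BLOCK = 10 * 60        # one solution expected every 10 minutes
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

# Hashes the whole network computes, on average, to find one block.
hashes_per_block = HASHRATE * SECONDS_PER_BLOCK

# Years needed to compute as many hashes as there are SHA-256 outputs.
possible_hashes = 2**256
years_to_try_all = possible_hashes / (HASHRATE * SECONDS_PER_YEAR)

print(f"hashes per block: {hashes_per_block:.2e}")
print(f"years to try 2^256 inputs: {years_to_try_all:.1e}")
```

The second number comes out on the order of 10^49 years, so “millions of years” is a very generous understatement.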

If you want to give up and see what I encoded in the hash, here it is: “Iron Man is my favorite superhero”.

^{2}I’ve simplified the explanation here a little. There’s not a one-to-one correspondence between an input and a hash, as several inputs can result in the same hash.

^{3}For security it’s important that the data you want to protect is sufficiently large and has enough variation to make it difficult to guess what it is.

It’s the same way you should choose a password; a short one made of only numbers is easy to guess, but a 30 character password is much harder.

^{4}We can say it’s impossible to reverse a hash if we have to brute force the solution like this, but there could be weaknesses in the hash function that could allow us to find it much earlier. The SHA-1 hash function is for example not secure anymore, as weaknesses have been found that can be used to generate collisions.

## Public-key cryptography

If you jump into the mathematical definitions of *public-key cryptography* it might look very complicated. While some details are complicated, the cryptography is conceptually simple; it’s a digital version of a locked mailbox.

Cryptographic schemes commonly use a single large number as their secret key, but public-key cryptography uses two keys: the *public key*, which is like the mailbox, and the *private key*, which is like the key to the mailbox. Anyone can give you mail (just slide it into the mailbox at the top), but you’re the only one who can read it, because you’re the only one with the key to the mailbox.

You *encrypt* a message by placing it in the mailbox; this way nobody but the owner of the mailbox can *decrypt* and read it. The owner can also prove they own the mailbox by placing their name on it, an action that requires opening the mailbox with the key. In digital terms this is how you *sign* a message.^{5}

^{5}This is where our mailbox metaphor breaks down a bit. It may seem that it’s more inconvenient to sign a message than to encrypt one, but digitally they’re both straightforward.

Large parts of the internet depend on public-key cryptography. For example, when you connect to your bank over the internet, this scheme helps ensure that nobody can see how much money you have or who you pay, and that you’re the only one who can transfer your money.

I won’t go into the mathematics behind this scheme, as I can’t do so without making the explanation needlessly complicated. If this interests you, I encourage you to look it up; I personally find it fascinating.^{6}

We will look at public-key cryptography in practice when we look at how Bitcoin addresses work.

^{6}RSA is one of the first public-key cryptography schemes and it was also the first one I studied. It’s fairly simple, so I think it’s a good starting point to understand public-key cryptography.

Bitcoin uses another, arguably more secure, scheme called ECDSA, which uses elliptic-curve cryptography.
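To make the encrypt and sign operations from this section a bit more concrete, here is textbook RSA (the scheme mentioned in footnote 6) with tiny numbers. This is only a sketch: the primes are far too small to be secure, real RSA adds padding, and Bitcoin uses ECDSA rather than RSA. The function names are invented for the example, and it needs Python 3.8 or later for `pow(e, -1, m)`.

```python
# Toy key generation from two small primes (real keys use enormous primes).
p, q = 61, 53
n = p * q                           # 3233, shared by both keys
e = 17                              # public exponent: (n, e) is the public key
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent: (n, d) is the private key


def encrypt(message: int) -> int:
    """Anyone can do this with the public key."""
    return pow(message, e, n)


def decrypt(ciphertext: int) -> int:
    """Only the holder of the private key can do this."""
    return pow(ciphertext, d, n)


def sign(message: int) -> int:
    """Signing uses the private key."""
    return pow(message, d, n)


def verify(message: int, signature: int) -> bool:
    """Checking a signature only needs the public key."""
    return pow(signature, e, n) == message


secret = 65
ciphertext = encrypt(secret)
print(ciphertext, decrypt(ciphertext))   # prints 2790 65
print(verify(secret, sign(secret)))      # prints True
```

Encrypting and signing are the same kind of calculation here, just with the two keys swapped, which is why neither is more inconvenient than the other digitally.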

## Bitcoin addresses

The addresses in Bitcoin (and other cryptocurrencies) use public-key cryptography to protect your funds. An address represents a public key that anyone can send coins to, but to send coins from an address you need the private key.

This is for example a standard **Bitcoin address**:

`19WoNYNXnfNPmLteC8YmZFsTQoN9gBSbCG`

Which corresponds to the **public key**:

`049f6aad24669d180cfe4c974a677407cbf26f03242a09126ebf88621d31f01a218d40fcbcb769b44b014d502a1c9ce8c2ca629bc339fe14b4db56e27e80ac30a7`

An address could be displayed in various ways; Bitcoin just happened to do it this way. Using an address is more convenient as it’s shorter than the public key and includes error-checking codes.^{7}

The **private key** to this address looks like this:

`5298e83a0c0884cdcf34294f663220bc73e3c6689e95b53158a9a89e95fd78bb`

The private key is just a large number and can be displayed in different ways. Here’s the same key in the Wallet Import Format, which is shorter and includes error-checking codes:

`5JSfRE8qNQZTtdwuRx6pxVohC3C3VeAHvzKvLsZWHEGPdW2zF3o`
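The conversion from the raw private key to the Wallet Import Format can be sketched with nothing but hashing and a change of number base. The format prepends a version byte (`0x80`), appends the first four bytes of a double SHA-256 as a checksum, and writes the result in base 58. The function names below are invented for this illustration, and I have left out an expected output: whether it prints exactly the string above depends on the two values in the text being a matching pair.

```python
import hashlib

ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def checksum(payload: bytes) -> bytes:
    """First four bytes of SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def base58check(payload: bytes) -> str:
    """Encode payload plus checksum using the 58-character alphabet."""
    data = payload + checksum(payload)
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = ALPHABET[remainder] + encoded
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded  # each leading zero byte becomes "1"


def to_wif(private_key_hex: str) -> str:
    """Version byte 0x80 followed by the 32-byte key (uncompressed format)."""
    return base58check(b"\x80" + bytes.fromhex(private_key_hex))


def is_valid_base58check(text: str) -> bool:
    """Decode and recompute the checksum; most typos make this fail."""
    if any(char not in ALPHABET for char in text):
        return False
    number = 0
    for char in text:
        number = number * 58 + ALPHABET.index(char)
    leading_ones = len(text) - len(text.lstrip("1"))
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    data = b"\x00" * leading_ones + body
    return checksum(data[:-4]) == data[-4:]


wif = to_wif("5298e83a0c0884cdcf34294f663220bc73e3c6689e95b53158a9a89e95fd78bb")
print(wif, is_valid_base58check(wif))
```

The checksum is what lets a wallet reject a mistyped key instead of silently using the wrong one.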

It’s important to note that you should *never* reveal your private key like this. Don’t take a screenshot of it, email it or post it on social media. Because if someone sees your key, they can steal all the coins from your address. The private key really is the key to the castle, and if you lose it you’ll lose all your funds forever. So please back up your private key somewhere safe (or the more user-friendly *seed*, but we’ll get to that shortly).

There are other types of addresses and other formats for the private and public keys, and other cryptocurrencies may handle them differently, but the concepts are the same.

^{7}Bitcoin Cash is a fork of Bitcoin and they have an additional address format. The same Bitcoin address, with the same public key, could also be displayed as this Bitcoin Cash address:

`qpwk83ew0xwpe87mmm9v43nvzj2y4d783cmv7ayctd`

The reason Bitcoin uses public-key cryptography for the addresses is that you can **sign** messages with it. For example if I sign the message:^{8}

`Jonas sent the money`

With the private key to the address:

`19WoNYNXnfNPmLteC8YmZFsTQoN9gBSbCG`

I’ll get this signature:

`HCZl2+vEZboqXgaVYi1nLNgwoa/INLiEsA2yXe+87j5iFoo/G96m4AoA5dL5T+rTiFKpXHuS5w3rP1IWSPZZv0Q=`

Which lets you **verify** that I know the private key to the address, even if I never showed it to you. This can be useful if we’ve sent money to someone and we want to prove who did it.^{9}

This is also what happens in the background when you authorize a transaction; you sign it with your private key and your signature is verified before the transaction is accepted. If the signature doesn’t verify then the transaction is invalid and gets discarded, ensuring that coins can only be spent by the owner of the address.^{10}

^{8}I used the desktop version of Electron Cash for these examples.

^{9}Payment systems are usually smarter, so this is normally not needed.

^{10}How hard is it to fake a signature? Very hard, as there’s no known attack that can do it. The biggest threat is quantum computers, which *if* they live up to the hype could break public-key cryptography.

Quantum computers wouldn’t actually be able to steal all Bitcoins directly, since they can only discover the private key if there’s a signature. And if you have coins in your address, but you’ve never sent any coins from it before, no signature exists.

If quantum computers can break public-key cryptography we as a society would have much bigger problems than the security of Bitcoin, as it would break the security of the internet itself. (There is quantum secure cryptography we could potentially move to, so everything isn’t lost yet.)

**Encrypting** messages using your Bitcoin keys isn’t that common to my knowledge (people typically use protocols such as PGP), but it’s possible. I’ll include a short example for completeness’ sake.

For example if you want to send me the message:

`I secretly love your book!`

But only want me to be able to read it, you can encrypt it with my public key:

`049f6aad24669d180cfe4c974a677407cbf26f03242a09126ebf88621d31f01a218d40fcbcb769b44b014d502a1c9ce8c2ca629bc339fe14b4db56e27e80ac30a7`

And you’ll get the encrypted message:

`QklFMQJ+CTugTvsEmuB7owU3DvC5taXqC5DhsJ3Wq8EmUMHwgsE54GlY1PI9d1R/OoGfq1mG9dcThW5T9fpUtQTY+ogLLvKsrN6ngeulLMrfoyCxFtLTjH78PGSd8eROQ1yPq1k=`

Which only I can **decrypt** to the original message. (Since I’ve given out the private key, you can decrypt it as well.)

## Seeds

Because private keys aren’t very user-friendly, Bitcoin wallets use seeds. The seed is a sequence of 12, or sometimes 24, words selected from a predetermined list of 2048 possible words.^{11}

This is for example a 12-word seed:

`reward tip because lock general culture below strike frog fox chunk index`

Which corresponds to the private key:

`KyRoQMYWAtfj5cGLThb1fznm5Utjq7Etmn9DLtdxYCiE3Vntcz3E`

Much more user-friendly, right? Memorizing the private key directly is very difficult, but you can see that it would not be too difficult to memorize the seed!

^{11}Variations among cryptocurrencies exist. A Monero seed is for example 25 words long.
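How many different seeds are there? A quick calculation, assuming every word is picked independently from the 2048-word list. Some wallet formats reserve a few of the bits for a checksum or version marker, so treat the result as an upper bound. `seed_bits` is a name invented for the example.

```python
import math

WORDLIST_SIZE = 2048


def seed_bits(word_count: int) -> int:
    """Bits of information in a seed: log2(2048) = 11 bits per word."""
    return word_count * int(math.log2(WORDLIST_SIZE))


for word_count in (12, 24):
    combinations = WORDLIST_SIZE ** word_count
    print(f"{word_count} words: {seed_bits(word_count)} bits, "
          f"about {combinations:.1e} possible seeds")
```

So a 12-word seed carries up to 132 bits and a 24-word seed up to 264 bits, in the same range as the 256-bit private key it stands in for.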

In addition to being easy to use, seeds act as a starting point in deterministic wallets to generate multiple private and public key pairs (giving you multiple addresses).^{12}

Giving out a new address each time you receive money is useful for privacy purposes, as it makes it harder to connect your transactions with your identity. This is why most modern wallets generate a new address each time you press receive.

^{12}See the discussion about pseudo-random generators in the chapter about provably fair gambling for some theory of how it might be possible to generate a set of random-looking outputs from a seed.
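The idea of growing many keys from one seed can be illustrated with a keyed hash. To be clear, this is **not** how real wallets do it: they follow standardized derivation schemes that are considerably more involved, and this toy will not reproduce any real wallet’s keys. The function `toy_private_key` is invented for the example; it only shows that the same seed and index always give the same key-sized number.

```python
import hashlib
import hmac


def toy_private_key(seed: str, index: int) -> str:
    """Derive a 256-bit number (as hex) from a seed and a counter."""
    counter = index.to_bytes(4, "big")
    return hmac.new(seed.encode("utf-8"), counter, hashlib.sha256).hexdigest()


seed = "reward tip because lock general culture below strike frog fox chunk index"
for index in range(3):
    # Same seed and index always give the same output; other indexes differ.
    print(index, toy_private_key(seed, index))
```

Because everything is recomputed from the seed, backing up the seed once is enough to recover every key derived from it.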

Here are for example the first 10 addresses and their private keys of our seed:

| Address | Private key |
| --- | --- |
| `19oN2GWEH1uiPz11WyChkUp2che9Z11Q5A` | `KyRoQMYWAtfj5cGLThb1fznm5Utjq7Etmn9DLtdxYCiE3Vntcz3E` |
| `1LverDkyaWMEyyFHiEWQaJt6UGxRjeBfQR` | `L1NH4wpuKzafbq2PtVXaGCE8hjc7KGzRfyfYik73APu7kZvdJxUp` |
| `1QHQ8uFrEL29WAkMLQgkoDzHimEQNqubM1` | `Kx3c9ZeS2pzYPuLa2NoA14SavnsWpkf1BJDLDu1N52oGoNWgv9KM` |
| `1HiohATeEm6BBeRCgWZ5vY3ZKFrCDsJnt9` | `KyCWZEpJ3AYUmB7MGEVvZfr6eiwgag89jmZtHC1tEVv9XynSqmot` |
| `1KJ5oMUEJTyd3igAYjJGvpdVjGDvF1Brc6` | `L4exrFikcfgSYm1ZZBkJrbwouLjzrrJB6VPyaH4vyK8cAkK2V2nt` |
| `1DzZJ6R1xXiQ3HJ3BsBAcviVdtUEeiu2UG` | `Kx37aUKrHRVdinxzHWTK8ebXWeMtRSbtshzonTMTQBrssQ2ms1JV` |
| `134TjnZ8xiu4wxfyy4xQQtiMiKhhe6AVur` | `KwPqA3XUaWCX2dhRRm4WXArm5DJKXko1ydgwwApJ3BC3dgnQ3Ydg` |
| `12XiJHvYT6TyaWcUhzdcBgqFZc3bNWpYdd` | `L2WakaNFfBehyL17c13iQwJKR8H1hQtsVvR5jsdugFfj9si8DZm2` |
| `12MuxMtJb9jbrzMQrr7zDiLYcn6xwaXMkq` | `L2ScmsyKJYzW2koEPjHmLKzjFMYNfR8UZMifP2yvggrRrJEBU4UJ` |
| `1MqBeJiVW6FqxKbrMq8mVUcukjXWMzuYew` | `KypFcqzaJRHPwxQfGDiYyJMtAdyKNSQuR78yZPTU57baS42dp4tr` |

I reiterate the importance of backing up and protecting your seed. Here are just some ways you could lose your money:

- You have a wallet on your phone but you lose it or it breaks down.
- You’ve written down your seed on paper, but it burns up.
- You forgot where you wrote down your seed.
- Someone finds your seed, and steals your money.

Therefore it’s of utmost importance that you back up and protect your seed. Ideally you should have multiple encrypted copies in different locations, protected from fire and theft.

Does this sound too difficult? It’s true, there are many pitfalls and it’s easy to do a bad job. But in practice, for reasonably small amounts, it’s enough just to write down your seed somewhere.