Chapter 9: Keys and Addresses | Elementary Bitcoin

In Volume I, we developed the mathematical machinery of elliptic curves and digital signatures. Now we apply this knowledge to construct the fundamental objects of the Bitcoin protocol: private keys, public keys, and addresses. These form the cryptographic identity system that allows users to receive and spend bitcoin without relying on any central authority.

The journey from a random number to a Bitcoin address involves several layers of transformation, each serving a specific purpose. Understanding these transformations is essential for anyone who wishes to implement Bitcoin software or verify its correctness.

9.1 Private Keys

A Bitcoin private key is, at its core, simply a number. But not just any number—it must be chosen with extreme care.

Definition 9.1 (Private Key)

A private key is an integer d satisfying 1 ≤ d < n, where n is the order of the secp256k1 curve:

n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141₁₆

In decimal, n ≈ 1.158 × 10⁷⁷.

The requirement that d < n ensures that scalar multiplication d · G produces a valid curve point. The requirement that d ≥ 1 excludes the trivial case where the public key would be the point at infinity.

Remark 9.1 (On Randomness)

The security of Bitcoin depends entirely on private keys being chosen uniformly at random from the valid range. A private key generated with insufficient randomness can be discovered by an attacker, resulting in theft of all associated funds.

Cryptographically secure random number generators (CSPRNGs) must be used. Common sources include /dev/urandom on Unix systems, CryptGenRandom on Windows, or hardware random number generators.

Example 9.1 (A Valid Private Key)

The following 256-bit hexadecimal value is a valid private key:

e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35

This is simply a 32-byte sequence. In binary, it contains exactly 256 bits. The probability that a randomly chosen 256-bit number falls within the valid range is approximately 1 − 2⁻¹²⁸, which is negligibly different from 1.

9.1.1 The Keyspace

The set of valid private keys contains approximately 2²⁵⁶ elements. To appreciate how large this is, consider:

The number of atoms in the observable universe: ≈ 10⁸⁰ ≈ 2²⁶⁶
The number of valid Bitcoin private keys: ≈ 2²⁵⁶ ≈ 10⁷⁷
The age of the universe in nanoseconds: ≈ 10²⁶ ≈ 2⁸⁶

Even if every atom in the universe were a computer generating a billion keys per second since the Big Bang, the probability of generating any specific private key would still be negligible.

Theorem 9.1 (Birthday Bound for Collisions)

If m private keys are generated uniformly at random from a space of size N ≈ 2²⁵⁶, the probability of any collision (two identical keys) is approximately:

P(collision) \approx m² / (2N) = m² / 2²⁵⁷

For this probability to reach 50%, we would need m ≈ 2¹²⁸ key generations—a number beyond any conceivable computation.

9.2 Public Keys

Given a private key d, the corresponding public key is computed via elliptic curve scalar multiplication.

Definition 9.2 (Public Key)

The public key corresponding to private key d is the elliptic curve point:

K = d \cdot G

where G is the generator point of secp256k1.

This computation is the "easy direction" of the trapdoor function we studied in Chapter 1. Given d, computing K requires only O(log d) group operations via the double-and-add algorithm. The reverse—finding d given K—is the elliptic curve discrete logarithm problem (ECDLP), believed to be computationally infeasible.

Figure 9.1: Public key derivation from private key via scalar multiplication.

9.2.1 Public Key Serialization

A public key is a point K = (x, y) on the secp256k1 curve. Both coordinates are 256-bit integers, so the full representation requires 64 bytes (512 bits). However, Bitcoin supports two serialization formats.

Definition 9.3 (Uncompressed Public Key)

The uncompressed serialization of a public key (x, y) is the 65-byte sequence:

04 || x || y

where 04 is a single prefix byte, and x and y are each encoded as 32-byte big-endian integers.

The uncompressed format is wasteful. Since any valid x-coordinate determines at most two possible y-coordinates (the curve equation is quadratic in y), we can compress the representation.

Definition 9.4 (Compressed Public Key)

The compressed serialization of a public key (x, y) is the 33-byte sequence:

prefix || x

where the prefix byte is 02 if y is even, or 03 if y is odd.

To recover the full point from a compressed key, one computes y² = x³ + 7 (mod p), takes the square root, and selects the root with the correct parity.

Example 9.2 (Public Key Formats)

For the private key from Example 9.1, the public key is:

Uncompressed (65 bytes):

04
39a36013301597daef41fbe593a02cc513d0b55527ec2df1050e2e8ff49c85c2
3cbe7ded0e7ce6a594896b8f62888fdbc5c8821305e2ea42bf01e37300116281

Compressed (33 bytes):

0339a36013301597daef41fbe593a02cc513d0b55527ec2df1050e2e8ff49c85c2

The prefix 03 indicates that the y-coordinate is odd.

Remark 9.2 (Compressed Keys are Standard)

Modern Bitcoin software uses compressed public keys exclusively. Uncompressed keys waste block space and are discouraged. Taproot (BIP-340) goes further, using only the x-coordinate with an implicit even y.

9.3 Hash Functions in Address Generation

Bitcoin addresses are not public keys directly, but rather hashes of public keys. This indirection provides several benefits: shorter addresses, quantum resistance (partial), and scripting flexibility.

Definition 9.5 (HASH160)

The HASH160 function is the composition:

HASH160(x) = RIPEMD160(SHA256(x))

This produces a 160-bit (20-byte) digest from an arbitrary input.

The use of two different hash functions (SHA-256 followed by RIPEMD-160) provides defense in depth: if either function is found to have weaknesses, the composition may still be secure.

Figure 9.2: The HASH160 function produces a 20-byte public key hash.

9.4 Base58Check Encoding

Raw binary data is inconvenient for humans to read, copy, or transcribe. Bitcoin's legacy addresses use Base58Check encoding—a variant of base conversion designed for clarity and error detection.

9.4.1 The Base58 Alphabet

Definition 9.6 (Base58 Alphabet)

The Base58 alphabet consists of 58 characters:

123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

Notably absent are: 0 (zero), O (capital O), I (capital I), and l (lowercase L)—characters easily confused in certain fonts.

Base58 encoding treats the input bytes as a big-endian integer and converts it to base 58, mapping each digit to the corresponding alphabet character.

9.4.2 Adding the Checksum

Definition 9.7 (Base58Check)

Base58Check encoding of data d with version byte v is:

Base58(v || d || checksum)

where checksum is the first 4 bytes of SHA256(SHA256(v || d)).

Figure 9.3: Base58Check encoding with version prefix and checksum.

The 4-byte checksum catches transcription errors with high probability. A single changed character will almost certainly produce an invalid checksum, alerting the user before funds are sent to an unreachable address.

Theorem 9.2 (Checksum Error Detection)

For a random single-character error in a Base58Check-encoded string, the probability of the error going undetected is approximately 2⁻³², or about 1 in 4 billion.

9.5 Legacy Addresses (P2PKH)

The original Bitcoin address format encodes a public key hash with a version prefix indicating the network.

Definition 9.8 (P2PKH Address)

A Pay-to-Public-Key-Hash (P2PKH) address is the Base58Check encoding of:

version || HASH160(public_key)

where version = 0x00 for mainnet and version = 0x6F for testnet.

The version byte determines the leading character of the address:

Network	Version Byte	Address Prefix	Example
Mainnet	`0x00`	`1`	`1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2`
Testnet	`0x6F`	`m` or `n`	`mipcBbFg9gMiCh81Kj8tqqdgoZub1ZJRfn`

Example 9.3 (P2PKH Address Generation)

Starting with a compressed public key:

0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798

Step 1: Compute SHA-256:

0f715baf5d4c2ed329785cef29e562f73488c8a2bb9dbc5700b361d54b9b0554

Step 2: Compute RIPEMD-160 of the above:

751e76e8199196d454941c45d1b3a323f1433bd6

Step 3: Prepend version byte (0x00):

00751e76e8199196d454941c45d1b3a323f1433bd6

Step 4: Compute double SHA-256 for checksum:

510d1634d943109b69da527ef5948106f22b655fb5193b4e9ef7e4dcd342d245

Step 5: Take first 4 bytes as checksum:

510d1634

Step 6: Append checksum and Base58 encode:

1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH

9.6 Script Hash Addresses (P2SH)

P2SH addresses, introduced in BIP-16, allow payments to arbitrary scripts rather than simple public key hashes. The script itself is revealed only when spending.

Definition 9.9 (P2SH Address)

A Pay-to-Script-Hash (P2SH) address is the Base58Check encoding of:

version || HASH160(redeem_script)

where version = 0x05 for mainnet (yielding addresses starting with 3).

P2SH enables complex spending conditions—multisignature wallets, time-locked contracts, and more—while presenting a uniform address format to senders.

Network	Version Byte	Address Prefix
Mainnet	`0x05`	`3`
Testnet	`0xC4`	`2`

9.7 Bech32 and Native SegWit (P2WPKH)

Segregated Witness (SegWit), activated in 2017, introduced a new address format called Bech32, specified in BIP-173. These addresses offer improved error detection and are more efficient on-chain.

9.7.1 The Bech32 Alphabet

Definition 9.10 (Bech32 Alphabet)

The Bech32 alphabet consists of 32 lowercase alphanumeric characters:

qpzry9x8gf2tvdw0s3jn54khce6mua7l

Excluded are: 1, b, i, o—characters easily confused with others.

9.7.2 Address Structure

A Bech32 address has three parts:

Human-readable part (HRP): bc for mainnet, tb for testnet
Separator: always 1
Data part: witness version + witness program + checksum

Definition 9.11 (P2WPKH Address)

A Pay-to-Witness-Public-Key-Hash (P2WPKH) address encodes:

Witness version: 0
Witness program: HASH160(compressed_pubkey) (20 bytes)

The result begins with bc1q on mainnet (the q represents witness version 0).

Example 9.4 (P2WPKH Address)

For the same public key as Example 9.3:

bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

This encodes the same 20-byte hash but uses the more efficient SegWit format.

9.7.3 Error Detection

Bech32 uses a BCH code for error detection, offering stronger guarantees than Base58Check.

Theorem 9.3 (Bech32 Error Detection)

Bech32 encoding guarantees detection of:

Any single character error
Any single transposition of adjacent characters
Any error affecting up to 4 characters

For longer error bursts, undetected errors occur with probability at most 1 in 10⁹.

9.8 Taproot Addresses (P2TR)

Taproot, activated in November 2021, introduced a new output type using Schnorr signatures and Bech32m encoding (BIP-350).

Definition 9.12 (P2TR Address)

A Pay-to-Taproot (P2TR) address encodes:

Witness version: 1
Witness program: 32-byte x-only public key

The result begins with bc1p on mainnet (the p represents witness version 1).

Note that Taproot uses the x-coordinate directly (with implicit even y), not a hash. This enables key-path spending without revealing a script.

Example 9.5 (P2TR Address)

bc1p5cyxnuxmeuwuvkwfem96lqzszd02n6xdcjrs20cac6yqjjwudpxqkedrcr

This 62-character address encodes a 32-byte tweaked public key using Bech32m encoding.

9.8.1 Bech32 vs Bech32m

BIP-350 introduced Bech32m, a slight modification of Bech32 that fixes an issue where certain errors in witness version 1+ addresses could go undetected.

Witness Version	Encoding	Address Prefix
0	Bech32	`bc1q...`
1 (Taproot)	Bech32m	`bc1p...`
2–16 (future)	Bech32m	`bc1z...`, etc.

9.9 Wallet Import Format (WIF)

Private keys also need a human-readable encoding for backup and import. The Wallet Import Format (WIF) uses Base58Check.

Definition 9.13 (WIF Encoding)

The Wallet Import Format of a private key d is:

Base58Check(version || d || [suffix])

where:

version = 0x80 for mainnet
suffix = 0x01 if the corresponding public key is compressed

Format	Version	Suffix	Prefix	Length
Mainnet, uncompressed	`0x80`	none	`5`	51 chars
Mainnet, compressed	`0x80`	`0x01`	`K` or `L`	52 chars
Testnet, compressed	`0xEF`	`0x01`	`c`	52 chars

Example 9.6 (WIF Private Key)

The private key from Example 9.1 in WIF format (compressed):

L52XzL2cMkHxqxBXRyEpnPQZGUs3uKiL3R11XbAdHigRzDozKZeW

The L prefix indicates a mainnet compressed key.

9.10 Summary of Address Types

Figure 9.4: Evolution of Bitcoin address formats from 2009 to 2021.

Type	Prefix	Encoding	Data	Length
P2PKH	`1`	Base58Check	20-byte pubkey hash	25–34 chars
P2SH	`3`	Base58Check	20-byte script hash	25–34 chars
P2WPKH	`bc1q`	Bech32	20-byte pubkey hash	42 chars
P2WSH	`bc1q`	Bech32	32-byte script hash	62 chars
P2TR	`bc1p`	Bech32m	32-byte x-only pubkey	62 chars

Exercises

Exercise 9.1

Verify that the private key 0x01 (the integer 1) is valid. Compute the corresponding compressed public key and P2PKH address.

Exercise 9.2

The Base58 alphabet has 58 characters. Explain why a 20-byte hash (160 bits) encodes to approximately 27–28 Base58 characters. (Hint: compute 160 / log₂(58).)

Exercise 9.3

Given the P2PKH address 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2, decode it to recover the 20-byte public key hash. Verify the checksum.

Exercise 9.4

Why does Taproot use the x-coordinate directly rather than a hash? What are the privacy and efficiency implications?

Exercise 9.5

A user claims their address is bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t5 (note the final 5 instead of 4). Explain how the Bech32 checksum detects this error.