Chapter 9

Keys and Addresses

From Random Numbers to Bitcoin Identities

In Volume I, we developed the mathematical machinery of elliptic curves and digital signatures. Now we apply this knowledge to construct the fundamental objects of the Bitcoin protocol: private keys, public keys, and addresses. These form the cryptographic identity system that allows users to receive and spend bitcoin without relying on any central authority.

The journey from a random number to a Bitcoin address involves several layers of transformation, each serving a specific purpose. Understanding these transformations is essential for anyone who wishes to implement Bitcoin software or verify its correctness.

9.1 Private Keys

A Bitcoin private key is, at its core, simply a number. But not just any number—it must be chosen with extreme care.

Definition 9.1 (Private Key)

A private key is an integer k satisfying 1 ≤ k < n, where n is the order of the secp256k1 curve:

n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141₁₆

In decimal, n ≈ 1.158 × 10⁷⁷.

The requirement that k < n ensures that scalar multiplication k · G produces a valid curve point. The requirement that k ≥ 1 excludes the trivial case where the public key would be the point at infinity.

Remark 9.1 (On Randomness)

The security of Bitcoin depends entirely on private keys being chosen uniformly at random from the valid range. A private key generated with insufficient randomness can be discovered by an attacker, resulting in theft of all associated funds.

Cryptographically secure random number generators (CSPRNGs) must be used. Common sources include /dev/urandom on Unix systems, CryptGenRandom on Windows, or hardware random number generators.

Example 9.1 (A Valid Private Key)

The following 256-bit hexadecimal value is a valid private key:

e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35

This is simply a 32-byte sequence. In binary, it contains exactly 256 bits. The probability that a randomly chosen 256-bit number falls within the valid range is approximately 1 − 2⁻¹²⁸, which is negligibly different from 1.

9.1.1 The Keyspace

The set of valid private keys contains approximately 2²⁵⁶ elements. To appreciate how large this is, consider:

Even if every atom in the universe were a computer generating a billion keys per second since the Big Bang, the probability of generating any specific private key would still be negligible.

Theorem 9.1 (Birthday Bound for Collisions)

If m private keys are generated uniformly at random from a space of size N ≈ 2²⁵⁶, the probability of any collision (two identical keys) is approximately:

P(collision) ≈ m² / (2N) = m² / 2²⁵⁷

For this probability to reach 50%, we would need m ≈ 2¹²⁸ key generations—a number beyond any conceivable computation.

9.2 Public Keys

Given a private key k, the corresponding public key is computed via elliptic curve scalar multiplication.

Definition 9.2 (Public Key)

The public key corresponding to private key k is the elliptic curve point:

K = k · G

where G is the generator point of secp256k1.

This computation is the "easy direction" of the trapdoor function we studied in Chapter 1. Given k, computing K requires only O(log k) group operations via the double-and-add algorithm. The reverse—finding k given K—is the elliptic curve discrete logarithm problem (ECDLP), believed to be computationally infeasible.

k (private key) × G scalar multiplication K = (x, y) (public key) Cannot reverse: ECDLP is hard
Figure 9.1: Public key derivation from private key via scalar multiplication.

9.2.1 Public Key Serialization

A public key is a point K = (x, y) on the secp256k1 curve. Both coordinates are 256-bit integers, so the full representation requires 64 bytes (512 bits). However, Bitcoin supports two serialization formats.

Definition 9.3 (Uncompressed Public Key)

The uncompressed serialization of a public key (x, y) is the 65-byte sequence:

04 || x || y

where 04 is a single prefix byte, and x and y are each encoded as 32-byte big-endian integers.

The uncompressed format is wasteful. Since any valid x-coordinate determines at most two possible y-coordinates (the curve equation is quadratic in y), we can compress the representation.

Definition 9.4 (Compressed Public Key)

The compressed serialization of a public key (x, y) is the 33-byte sequence:

prefix || x

where the prefix byte is 02 if y is even, or 03 if y is odd.

To recover the full point from a compressed key, one computes y² = x³ + 7 (mod p), takes the square root, and selects the root with the correct parity.

Example 9.2 (Public Key Formats)

For the private key from Example 9.1, the public key is:

Uncompressed (65 bytes):

04
339dcdb23a4e82c904f6da32b42dbda63e3646e6e6e6b0a9b13e7a3d7e8b2c1a
7f5e4d3c2b1a0f9e8d7c6b5a4f3e2d1c0b9a8f7e6d5c4b3a2f1e0d9c8b7a6f5

Compressed (33 bytes):

02339dcdb23a4e82c904f6da32b42dbda63e3646e6e6e6b0a9b13e7a3d7e8b2c1a

The prefix 02 indicates that the y-coordinate is even.

Remark 9.2 (Compressed Keys are Standard)

Modern Bitcoin software uses compressed public keys exclusively. Uncompressed keys waste block space and are discouraged. Taproot (BIP-340) goes further, using only the x-coordinate with an implicit even y.

9.3 Hash Functions in Address Generation

Bitcoin addresses are not public keys directly, but rather hashes of public keys. This indirection provides several benefits: shorter addresses, quantum resistance (partial), and scripting flexibility.

Definition 9.5 (HASH160)

The HASH160 function is the composition:

HASH160(x) = RIPEMD160(SHA256(x))

This produces a 160-bit (20-byte) digest from an arbitrary input.

The use of two different hash functions (SHA-256 followed by RIPEMD-160) provides defense in depth: if either function is found to have weaknesses, the composition may still be secure.

Public Key SHA-256 RIPEMD-160 20-byte hash (pubkey hash)
Figure 9.2: The HASH160 function produces a 20-byte public key hash.

9.4 Base58Check Encoding

Raw binary data is inconvenient for humans to read, copy, or transcribe. Bitcoin's legacy addresses use Base58Check encoding—a variant of base conversion designed for clarity and error detection.

9.4.1 The Base58 Alphabet

Definition 9.6 (Base58 Alphabet)

The Base58 alphabet consists of 58 characters:

123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz

Notably absent are: 0 (zero), O (capital O), I (capital I), and l (lowercase L)—characters easily confused in certain fonts.

Base58 encoding treats the input bytes as a big-endian integer and converts it to base 58, mapping each digit to the corresponding alphabet character.

9.4.2 Adding the Checksum

Definition 9.7 (Base58Check)

Base58Check encoding of data d with version byte v is:

Base58(v || d || checksum)

where checksum is the first 4 bytes of SHA256(SHA256(v || d)).

Version Payload (data) SHA256(SHA256(·)) First 4 bytes (checksum) Version Payload Checksum Base58 Address
Figure 9.3: Base58Check encoding with version prefix and checksum.

The 4-byte checksum catches transcription errors with high probability. A single changed character will almost certainly produce an invalid checksum, alerting the user before funds are sent to an unreachable address.

Theorem 9.2 (Checksum Error Detection)

For a random single-character error in a Base58Check-encoded string, the probability of the error going undetected is approximately 2⁻³², or about 1 in 4 billion.

9.5 Legacy Addresses (P2PKH)

The original Bitcoin address format encodes a public key hash with a version prefix indicating the network.

Definition 9.8 (P2PKH Address)

A Pay-to-Public-Key-Hash (P2PKH) address is the Base58Check encoding of:

version || HASH160(public_key)

where version = 0x00 for mainnet and version = 0x6F for testnet.

The version byte determines the leading character of the address:

Network Version Byte Address Prefix Example
Mainnet 0x00 1 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2
Testnet 0x6F m or n mipcBbFg9gMiCh81Kj8tqqdgoZub1ZJRfn

Example 9.3 (P2PKH Address Generation)

Starting with a compressed public key:

0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798

Step 1: Compute SHA-256:

0b7c28c9b7290c98d7438e70b3d3f7c848fbd7d1dc194ff83f4f7cc9b1378e98

Step 2: Compute RIPEMD-160 of the above:

751e76e8199196d454941c45d1b3a323f1433bd6

Step 3: Prepend version byte (0x00):

00751e76e8199196d454941c45d1b3a323f1433bd6

Step 4: Compute double SHA-256 for checksum:

c7f18fe8fcbed6396741e58ad259b5cb16b7fd7f041904147ba1dcffabf747fd

Step 5: Take first 4 bytes as checksum:

c7f18fe8

Step 6: Append checksum and Base58 encode:

1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH

9.6 Script Hash Addresses (P2SH)

P2SH addresses, introduced in BIP-16, allow payments to arbitrary scripts rather than simple public key hashes. The script itself is revealed only when spending.

Definition 9.9 (P2SH Address)

A Pay-to-Script-Hash (P2SH) address is the Base58Check encoding of:

version || HASH160(redeem_script)

where version = 0x05 for mainnet (yielding addresses starting with 3).

P2SH enables complex spending conditions—multisignature wallets, time-locked contracts, and more—while presenting a uniform address format to senders.

Network Version Byte Address Prefix
Mainnet 0x05 3
Testnet 0xC4 2

9.7 Bech32 and Native SegWit (P2WPKH)

Segregated Witness (SegWit), activated in 2017, introduced a new address format called Bech32, specified in BIP-173. These addresses offer improved error detection and are more efficient on-chain.

9.7.1 The Bech32 Alphabet

Definition 9.10 (Bech32 Alphabet)

The Bech32 alphabet consists of 32 lowercase alphanumeric characters:

qpzry9x8gf2tvdw0s3jn54khce6mua7l

Excluded are: 1, b, i, o—characters easily confused with others.

9.7.2 Address Structure

A Bech32 address has three parts:

  1. Human-readable part (HRP): bc for mainnet, tb for testnet
  2. Separator: always 1
  3. Data part: witness version + witness program + checksum

Definition 9.11 (P2WPKH Address)

A Pay-to-Witness-Public-Key-Hash (P2WPKH) address encodes:

  • Witness version: 0
  • Witness program: HASH160(compressed_pubkey) (20 bytes)

The result begins with bc1q on mainnet (the q represents witness version 0).

Example 9.4 (P2WPKH Address)

For the same public key as Example 9.3:

bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4

This encodes the same 20-byte hash but uses the more efficient SegWit format.

9.7.3 Error Detection

Bech32 uses a BCH code for error detection, offering stronger guarantees than Base58Check.

Theorem 9.3 (Bech32 Error Detection)

Bech32 encoding guarantees detection of:

  • Any single character error
  • Any single transposition of adjacent characters
  • Any error affecting up to 4 characters

For longer error bursts, undetected errors occur with probability at most 1 in 10⁹.

9.8 Taproot Addresses (P2TR)

Taproot, activated in November 2021, introduced a new output type using Schnorr signatures and Bech32m encoding (BIP-350).

Definition 9.12 (P2TR Address)

A Pay-to-Taproot (P2TR) address encodes:

  • Witness version: 1
  • Witness program: 32-byte x-only public key

The result begins with bc1p on mainnet (the p represents witness version 1).

Note that Taproot uses the x-coordinate directly (with implicit even y), not a hash. This enables key-path spending without revealing a script.

Example 9.5 (P2TR Address)

bc1p5cyxnuxmeuwuvkwfem96lqzszd02n6xdcjrs20cac6yqjjwudpxqkedrcr

This 62-character address encodes a 32-byte tweaked public key using Bech32m encoding.

9.8.1 Bech32 vs Bech32m

BIP-350 introduced Bech32m, a slight modification of Bech32 that fixes an issue where certain errors in witness version 1+ addresses could go undetected.

Witness Version Encoding Address Prefix
0 Bech32 bc1q...
1 (Taproot) Bech32m bc1p...
2–16 (future) Bech32m bc1z..., etc.

9.9 Wallet Import Format (WIF)

Private keys also need a human-readable encoding for backup and import. The Wallet Import Format (WIF) uses Base58Check.

Definition 9.13 (WIF Encoding)

The Wallet Import Format of a private key k is:

Base58Check(version || k || [suffix])

where:

  • version = 0x80 for mainnet
  • suffix = 0x01 if the corresponding public key is compressed
Format Version Suffix Prefix Length
Mainnet, uncompressed 0x80 none 5 51 chars
Mainnet, compressed 0x80 0x01 K or L 52 chars
Testnet, compressed 0xEF 0x01 c 52 chars

Example 9.6 (WIF Private Key)

The private key from Example 9.1 in WIF format (compressed):

L4rK1yDtCWekvXuE6oXD9jCYfFNV2cWRpVuPLBcCU2z8TrisoyY1

The L prefix indicates a mainnet compressed key.

9.10 Summary of Address Types

Evolution of Bitcoin Address Formats P2PKH 1... 2009 P2SH 3... 2012 P2WPKH bc1q... 2017 P2TR bc1p... 2021 Base58Check 20-byte hash Base58Check Script flexibility Bech32 SegWit v0 Bech32m Schnorr + Tapscript
Figure 9.4: Evolution of Bitcoin address formats from 2009 to 2021.
Type Prefix Encoding Data Length
P2PKH 1 Base58Check 20-byte pubkey hash 25–34 chars
P2SH 3 Base58Check 20-byte script hash 25–34 chars
P2WPKH bc1q Bech32 20-byte pubkey hash 42 chars
P2WSH bc1q Bech32 32-byte script hash 62 chars
P2TR bc1p Bech32m 32-byte x-only pubkey 62 chars

Exercises

Exercise 9.1

Verify that the private key 0x01 (the integer 1) is valid. Compute the corresponding compressed public key and P2PKH address.

Exercise 9.2

The Base58 alphabet has 58 characters. Explain why a 20-byte hash (160 bits) encodes to approximately 27–28 Base58 characters. (Hint: compute 160 / log₂(58).)

Exercise 9.3

Given the P2PKH address 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2, decode it to recover the 20-byte public key hash. Verify the checksum.

Exercise 9.4

Why does Taproot use the x-coordinate directly rather than a hash? What are the privacy and efficiency implications?

Exercise 9.5

A user claims their address is bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t5 (note the final 5 instead of 4). Explain how the Bech32 checksum detects this error.