In Volume I, we developed the mathematical machinery of elliptic curves and digital signatures. Now we apply this knowledge to construct the fundamental objects of the Bitcoin protocol: private keys, public keys, and addresses. These form the cryptographic identity system that allows users to receive and spend bitcoin without relying on any central authority.
The journey from a random number to a Bitcoin address involves several layers of transformation, each serving a specific purpose. Understanding these transformations is essential for anyone who wishes to implement Bitcoin software or verify its correctness.
9.1 Private Keys
A Bitcoin private key is, at its core, simply a number. But not just any number—it must be chosen with extreme care.
Definition 9.1 (Private Key)
A private key is an integer k satisfying 1 ≤ k < n, where n is the order of the secp256k1 curve:
n = FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141₁₆
In decimal, n ≈ 1.158 × 10⁷⁷.
The requirement that k < n ensures that scalar multiplication k · G produces a valid curve point. The requirement that k ≥ 1 excludes the trivial case where the public key would be the point at infinity.
Remark 9.1 (On Randomness)
The security of Bitcoin depends entirely on private keys being chosen uniformly at random from the valid range. A private key generated with insufficient randomness can be discovered by an attacker, resulting in theft of all associated funds.
Cryptographically secure random number generators (CSPRNGs) must be used.
Common sources include /dev/urandom on Unix systems,
CryptGenRandom on Windows, or hardware random number generators.
Example 9.1 (A Valid Private Key)
The following 256-bit hexadecimal value is a valid private key:
e8f32e723decf4051aefac8e2c93c9c5b214313817cdb01a1494b917c8436b35
This is simply a 32-byte sequence. In binary, it contains exactly 256 bits. The probability that a randomly chosen 256-bit number falls within the valid range is approximately 1 − 2⁻¹²⁸, which is negligibly different from 1.
9.1.1 The Keyspace
The set of valid private keys contains approximately 2²⁵⁶ elements. To appreciate how large this is, consider:
- The number of atoms in the observable universe: ≈ 10⁸⁰ ≈ 2²⁶⁶
- The number of valid Bitcoin private keys: ≈ 2²⁵⁶ ≈ 10⁷⁷
- The age of the universe in nanoseconds: ≈ 10²⁶ ≈ 2⁸⁶
Even if every atom in the universe were a computer generating a billion keys per second since the Big Bang, the probability of generating any specific private key would still be negligible.
Theorem 9.1 (Birthday Bound for Collisions)
If m private keys are generated uniformly at random from a space of size N ≈ 2²⁵⁶, the probability of any collision (two identical keys) is approximately:
P(collision) ≈ m² / (2N) = m² / 2²⁵⁷
For this probability to reach 50%, we would need m ≈ 2¹²⁸ key generations—a number beyond any conceivable computation.
9.2 Public Keys
Given a private key k, the corresponding public key is computed via elliptic curve scalar multiplication.
Definition 9.2 (Public Key)
The public key corresponding to private key k is the elliptic curve point:
K = k · G
where G is the generator point of secp256k1.
This computation is the "easy direction" of the trapdoor function we studied in Chapter 1. Given k, computing K requires only O(log k) group operations via the double-and-add algorithm. The reverse—finding k given K—is the elliptic curve discrete logarithm problem (ECDLP), believed to be computationally infeasible.
9.2.1 Public Key Serialization
A public key is a point K = (x, y) on the secp256k1 curve. Both coordinates are 256-bit integers, so the full representation requires 64 bytes (512 bits). However, Bitcoin supports two serialization formats.
Definition 9.3 (Uncompressed Public Key)
The uncompressed serialization of a public key (x, y) is the 65-byte sequence:
04 || x || y
where 04 is a single prefix byte, and x and
y are each encoded as 32-byte big-endian integers.
The uncompressed format is wasteful. Since any valid x-coordinate determines at most two possible y-coordinates (the curve equation is quadratic in y), we can compress the representation.
Definition 9.4 (Compressed Public Key)
The compressed serialization of a public key (x, y) is the 33-byte sequence:
prefix || x
where the prefix byte is 02 if y is even, or
03 if y is odd.
To recover the full point from a compressed key, one computes y² = x³ + 7 (mod p), takes the square root, and selects the root with the correct parity.
Example 9.2 (Public Key Formats)
For the private key from Example 9.1, the public key is:
Uncompressed (65 bytes):
04 339dcdb23a4e82c904f6da32b42dbda63e3646e6e6e6b0a9b13e7a3d7e8b2c1a 7f5e4d3c2b1a0f9e8d7c6b5a4f3e2d1c0b9a8f7e6d5c4b3a2f1e0d9c8b7a6f5
Compressed (33 bytes):
02339dcdb23a4e82c904f6da32b42dbda63e3646e6e6e6b0a9b13e7a3d7e8b2c1a
The prefix 02 indicates that the y-coordinate is even.
Remark 9.2 (Compressed Keys are Standard)
Modern Bitcoin software uses compressed public keys exclusively. Uncompressed keys waste block space and are discouraged. Taproot (BIP-340) goes further, using only the x-coordinate with an implicit even y.
9.3 Hash Functions in Address Generation
Bitcoin addresses are not public keys directly, but rather hashes of public keys. This indirection provides several benefits: shorter addresses, quantum resistance (partial), and scripting flexibility.
Definition 9.5 (HASH160)
The HASH160 function is the composition:
HASH160(x) = RIPEMD160(SHA256(x))
This produces a 160-bit (20-byte) digest from an arbitrary input.
The use of two different hash functions (SHA-256 followed by RIPEMD-160) provides defense in depth: if either function is found to have weaknesses, the composition may still be secure.
9.4 Base58Check Encoding
Raw binary data is inconvenient for humans to read, copy, or transcribe. Bitcoin's legacy addresses use Base58Check encoding—a variant of base conversion designed for clarity and error detection.
9.4.1 The Base58 Alphabet
Definition 9.6 (Base58 Alphabet)
The Base58 alphabet consists of 58 characters:
123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz
Notably absent are: 0 (zero), O (capital O),
I (capital I), and l (lowercase L)—characters
easily confused in certain fonts.
Base58 encoding treats the input bytes as a big-endian integer and converts it to base 58, mapping each digit to the corresponding alphabet character.
9.4.2 Adding the Checksum
Definition 9.7 (Base58Check)
Base58Check encoding of data d with version byte v is:
Base58(v || d || checksum)
where checksum is the first 4 bytes of SHA256(SHA256(v || d)).
The 4-byte checksum catches transcription errors with high probability. A single changed character will almost certainly produce an invalid checksum, alerting the user before funds are sent to an unreachable address.
Theorem 9.2 (Checksum Error Detection)
For a random single-character error in a Base58Check-encoded string, the probability of the error going undetected is approximately 2⁻³², or about 1 in 4 billion.
9.5 Legacy Addresses (P2PKH)
The original Bitcoin address format encodes a public key hash with a version prefix indicating the network.
Definition 9.8 (P2PKH Address)
A Pay-to-Public-Key-Hash (P2PKH) address is the Base58Check encoding of:
version || HASH160(public_key)
where version = 0x00 for mainnet and version = 0x6F for testnet.
The version byte determines the leading character of the address:
| Network | Version Byte | Address Prefix | Example |
|---|---|---|---|
| Mainnet | 0x00 |
1 |
1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2 |
| Testnet | 0x6F |
m or n |
mipcBbFg9gMiCh81Kj8tqqdgoZub1ZJRfn |
Example 9.3 (P2PKH Address Generation)
Starting with a compressed public key:
0279be667ef9dcbbac55a06295ce870b07029bfcdb2dce28d959f2815b16f81798
Step 1: Compute SHA-256:
0b7c28c9b7290c98d7438e70b3d3f7c848fbd7d1dc194ff83f4f7cc9b1378e98
Step 2: Compute RIPEMD-160 of the above:
751e76e8199196d454941c45d1b3a323f1433bd6
Step 3: Prepend version byte (0x00):
00751e76e8199196d454941c45d1b3a323f1433bd6
Step 4: Compute double SHA-256 for checksum:
c7f18fe8fcbed6396741e58ad259b5cb16b7fd7f041904147ba1dcffabf747fd
Step 5: Take first 4 bytes as checksum:
c7f18fe8
Step 6: Append checksum and Base58 encode:
1BgGZ9tcN4rm9KBzDn7KprQz87SZ26SAMH
9.6 Script Hash Addresses (P2SH)
P2SH addresses, introduced in BIP-16, allow payments to arbitrary scripts rather than simple public key hashes. The script itself is revealed only when spending.
Definition 9.9 (P2SH Address)
A Pay-to-Script-Hash (P2SH) address is the Base58Check encoding of:
version || HASH160(redeem_script)
where version = 0x05 for mainnet (yielding addresses
starting with 3).
P2SH enables complex spending conditions—multisignature wallets, time-locked contracts, and more—while presenting a uniform address format to senders.
| Network | Version Byte | Address Prefix |
|---|---|---|
| Mainnet | 0x05 |
3 |
| Testnet | 0xC4 |
2 |
9.7 Bech32 and Native SegWit (P2WPKH)
Segregated Witness (SegWit), activated in 2017, introduced a new address format called Bech32, specified in BIP-173. These addresses offer improved error detection and are more efficient on-chain.
9.7.1 The Bech32 Alphabet
Definition 9.10 (Bech32 Alphabet)
The Bech32 alphabet consists of 32 lowercase alphanumeric characters:
qpzry9x8gf2tvdw0s3jn54khce6mua7l
Excluded are: 1, b, i, o—characters
easily confused with others.
9.7.2 Address Structure
A Bech32 address has three parts:
- Human-readable part (HRP):
bcfor mainnet,tbfor testnet - Separator: always
1 - Data part: witness version + witness program + checksum
Definition 9.11 (P2WPKH Address)
A Pay-to-Witness-Public-Key-Hash (P2WPKH) address encodes:
- Witness version:
0 - Witness program: HASH160(compressed_pubkey) (20 bytes)
The result begins with bc1q on mainnet (the q represents
witness version 0).
Example 9.4 (P2WPKH Address)
For the same public key as Example 9.3:
bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4
This encodes the same 20-byte hash but uses the more efficient SegWit format.
9.7.3 Error Detection
Bech32 uses a BCH code for error detection, offering stronger guarantees than Base58Check.
Theorem 9.3 (Bech32 Error Detection)
Bech32 encoding guarantees detection of:
- Any single character error
- Any single transposition of adjacent characters
- Any error affecting up to 4 characters
For longer error bursts, undetected errors occur with probability at most 1 in 10⁹.
9.8 Taproot Addresses (P2TR)
Taproot, activated in November 2021, introduced a new output type using Schnorr signatures and Bech32m encoding (BIP-350).
Definition 9.12 (P2TR Address)
A Pay-to-Taproot (P2TR) address encodes:
- Witness version:
1 - Witness program: 32-byte x-only public key
The result begins with bc1p on mainnet (the p represents
witness version 1).
Note that Taproot uses the x-coordinate directly (with implicit even y), not a hash. This enables key-path spending without revealing a script.
Example 9.5 (P2TR Address)
bc1p5cyxnuxmeuwuvkwfem96lqzszd02n6xdcjrs20cac6yqjjwudpxqkedrcr
This 62-character address encodes a 32-byte tweaked public key using Bech32m encoding.
9.8.1 Bech32 vs Bech32m
BIP-350 introduced Bech32m, a slight modification of Bech32 that fixes an issue where certain errors in witness version 1+ addresses could go undetected.
| Witness Version | Encoding | Address Prefix |
|---|---|---|
| 0 | Bech32 | bc1q... |
| 1 (Taproot) | Bech32m | bc1p... |
| 2–16 (future) | Bech32m | bc1z..., etc. |
9.9 Wallet Import Format (WIF)
Private keys also need a human-readable encoding for backup and import. The Wallet Import Format (WIF) uses Base58Check.
Definition 9.13 (WIF Encoding)
The Wallet Import Format of a private key k is:
Base58Check(version || k || [suffix])
where:
- version = 0x80 for mainnet
- suffix = 0x01 if the corresponding public key is compressed
| Format | Version | Suffix | Prefix | Length |
|---|---|---|---|---|
| Mainnet, uncompressed | 0x80 |
none | 5 |
51 chars |
| Mainnet, compressed | 0x80 |
0x01 |
K or L |
52 chars |
| Testnet, compressed | 0xEF |
0x01 |
c |
52 chars |
Example 9.6 (WIF Private Key)
The private key from Example 9.1 in WIF format (compressed):
L4rK1yDtCWekvXuE6oXD9jCYfFNV2cWRpVuPLBcCU2z8TrisoyY1
The L prefix indicates a mainnet compressed key.
9.10 Summary of Address Types
| Type | Prefix | Encoding | Data | Length |
|---|---|---|---|---|
| P2PKH | 1 |
Base58Check | 20-byte pubkey hash | 25–34 chars |
| P2SH | 3 |
Base58Check | 20-byte script hash | 25–34 chars |
| P2WPKH | bc1q |
Bech32 | 20-byte pubkey hash | 42 chars |
| P2WSH | bc1q |
Bech32 | 32-byte script hash | 62 chars |
| P2TR | bc1p |
Bech32m | 32-byte x-only pubkey | 62 chars |
Exercises
Exercise 9.1
Verify that the private key 0x01 (the integer 1) is valid.
Compute the corresponding compressed public key and P2PKH address.
Exercise 9.2
The Base58 alphabet has 58 characters. Explain why a 20-byte hash (160 bits) encodes to approximately 27–28 Base58 characters. (Hint: compute 160 / log₂(58).)
Exercise 9.3
Given the P2PKH address 1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2,
decode it to recover the 20-byte public key hash. Verify the checksum.
Exercise 9.4
Why does Taproot use the x-coordinate directly rather than a hash? What are the privacy and efficiency implications?
Exercise 9.5
A user claims their address is bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t5
(note the final 5 instead of 4). Explain how the
Bech32 checksum detects this error.