Chapter 20

Light Clients and the Validation Gap

What a Client That Does Not Validate Can Still Verify

Not every user can run a full Bitcoin node, and every client that does not makes a precise trade: it gives up some part of validation in exchange for resources. The central question of this chapter is therefore not which wallet backend to prefer but what, exactly, a client that does not validate everything can still verify. Every light-client design is a point on a validation spectrum, and the difference between two designs is the difference between the consensus rules each one checks.

Sections 20.2 through 20.5 survey the architectures in production use—the Electrum client-server model, block-explorer APIs, and hybrid designs that layer verification onto a server backend—treating each as a data point: a trust model, a privacy profile, and a position on the spectrum. The destination is Section 20.6, which supplies the theory: it classifies every consensus rule by the data required to verify it, establishes what fraud proofs can and cannot recover, and identifies data availability as the structural obstacle. The product architectures will date; the classification will not.

20.1 The Light Client Spectrum

Bitcoin clients occupy a spectrum from "minimal trust, maximal resources" to "maximal trust, minimal resources":

Client type	Trust	Resources	Example
Full Validation	None (trustless)	~550 GB storage, continuous bandwidth	Bitcoin Core
SPV / Light Node	Miners (majority honest)	~65 MB headers + filters	Neutrino, BIP-157
Server-Dependent	Server operator	Minimal	Electrum (public servers)

Definition 20.1 (Light Client)

A light client is any Bitcoin client that does not independently validate all transactions and blocks. Light clients necessarily trust some external party—miners, servers, or peers—for certain guarantees that full nodes verify independently.

Trust Requirements by Architecture

Architecture	Transaction Validity	Inclusion Proof	Privacy
Full Node	Self-verified	Self-verified	Perfect
Pruned Node (Section 21.4)	Self-verified	Self-verified	Perfect
BIP-157/158	Trust miners	Merkle proof	Good
Electrum (personal)	Self-verified (own node)	Own server provides	Good
Electrum (public)	Trust server + miners	Server provides	Poor
Centralized API	Trust service	Trust service	None

20.2 The Electrum Protocol

Electrum, created in 2011, pioneered the client-server model for Bitcoin wallets. The Electrum protocol (also called the ElectrumX protocol) provides a JSON-RPC interface for querying blockchain data.

Architecture Overview

Figure 20.1: Electrum architecture: clients connect to servers that maintain an address-indexed database built from a full node. The server sees all client addresses and queries.

Trust Model and Indexing

The protocol consists of a small set of JSON-RPC methods through which the client retrieves transaction history, balances, and unspent outputs for its scripts, subscribes to changes, and broadcasts transactions. Nearly every method transmits a script identifier to the server, so the query interface itself—not any implementation defect—is what discloses the wallet's contents. The trust model follows directly: the client trusts miners for transaction validity (as in any SPV design) and trusts the server to report history completely and honestly, although Merkle proofs, when requested, let the client verify inclusion of reported transactions against its own header chain.

One detail of the indexing scheme matters for what follows: servers index the blockchain not by address but by a hash of the output script.

Definition 20.2 (Electrum Script Hash)

For any scriptPubKey, the Electrum script hash is:

script_hash = SHA256(scriptPubKey), reversed to little-endian hex

This provides a uniform 32-byte identifier regardless of address type.
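Definition 20.2 can be sketched in a few lines of Python, together with the request a client would send. The helper names are illustrative. The method name blockchain.scripthash.get_history comes from the Electrum protocol; the newline-terminated JSON framing is an assumption about the transport and should be checked against the server in use. The scriptPubKey is an example P2PKH script.

```python
import hashlib
import json


def electrum_script_hash(script_pubkey: bytes) -> str:
    """SHA-256 of the scriptPubKey, byte-reversed, as lowercase hex."""
    return hashlib.sha256(script_pubkey).digest()[::-1].hex()


def history_request(script_pubkey: bytes, request_id: int = 0) -> bytes:
    """Serialize a get_history call (assumes one JSON object per line)."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "blockchain.scripthash.get_history",
        "params": [electrum_script_hash(script_pubkey)],
    }
    return (json.dumps(message) + "\n").encode()


# Example P2PKH script: OP_DUP OP_HASH160 <20-byte hash> OP_EQUALVERIFY OP_CHECKSIG
p2pkh = bytes.fromhex("76a914" + "62e907b15cbf27d5425399ebf6f0fb50ebb88f18" + "88ac")
print(electrum_script_hash(p2pkh))
print(history_request(p2pkh).decode().strip())
```

Because the params field carries the script hash itself, the sketch also makes the disclosure concrete: whoever receives this message learns which script the wallet is watching.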

Several server implementations exist (ElectrumX, electrs, Fulcrum, Electrum Personal Server), differing mainly in whether they maintain a full script-hash index of the chain—tens of gigabytes beyond the full node itself, allowing the server to answer queries for arbitrary wallets—or, as in the personal-server design, track only the wallets registered with them and require essentially no index at all.

20.3 Privacy Analysis of Electrum

What the Server Learns

Remark 20.1 (Electrum Privacy Leakage)

When a client connects to an Electrum server, the server learns:

Every address the wallet has ever generated
The complete transaction history for those addresses
Current balances and unspent outputs
Which addresses are monitored in real-time
When and what transactions the user broadcasts
Client IP address (without Tor)
Wallet software and version (from protocol negotiation)

This leakage is structural, not accidental: the Electrum protocol requires the client to send scriptPubKey hashes to query address history, so the server necessarily learns the queried addresses. Subscription requests explicitly reveal ongoing interest, and transaction broadcasts reveal spending before the transaction propagates through the P2P network.

Timing Attacks and Mitigations

Beyond the addresses themselves, query patterns leak information: the order and timing of script-hash queries reveal when a wallet was created and how its keys are derived (hierarchical-deterministic structure under BIP-32, a wallet standard outside this book's scope), queries issued immediately after a broadcast identify change outputs, and recurring patterns allow a server to link separate sessions to the same wallet.

The structural leak admits one complete mitigation and several partial ones. The complete mitigation is to operate the server oneself, connected to one's own full node: the protocol still discloses everything, but only to infrastructure the user controls. Partial mitigations include connecting over Tor (which hides the client's network identity but not its addresses) and splitting queries across multiple servers (which gives each server a partial view at the cost of disclosing to more parties).

20.4 Esplora and Block Explorer APIs

Many wallets query HTTP APIs provided by block explorers, of which Esplora (an open-source explorer by Blockstream) defines the de facto standard interface. The API is a catalog of REST endpoints keyed by address, txid, and block hash: the client fetches history, balances, and unspent outputs for each of its addresses, naming each address explicitly in the request, and posts raw transactions for broadcast. The disclosure is therefore the same as Electrum's (the server learns every address), but the verification is weaker: history, balance, and UTXO responses carry no Merkle proofs, so a client that does not separately request proofs and validate its own header chain has no means of checking them.
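To see where the address appears in such a request, the following Python sketch builds the URLs a wallet would call without sending anything. The endpoint paths follow the public Esplora HTTP API as the editor understands it and should be checked against the instance in use; the host name and the helper name are hypothetical.

```python
from urllib.parse import quote


def esplora_queries(base_url: str, address: str) -> dict[str, tuple[str, str]]:
    """Map each wallet operation to an (HTTP method, URL) pair."""
    base = base_url.rstrip("/")
    addr = quote(address, safe="")
    return {
        "history": ("GET", f"{base}/address/{addr}/txs"),
        "utxos": ("GET", f"{base}/address/{addr}/utxo"),
        "broadcast": ("POST", f"{base}/tx"),  # raw transaction hex in the body
    }


# Hypothetical server; the address is only an example.
queries = esplora_queries("https://esplora.example/api/",
                          "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa")
for name, (method, url) in queries.items():
    print(name, method, url)
```

Every history and UTXO lookup places the address in the request path, so the operator can associate it with the connecting client regardless of transport encryption.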

Trust Model

Remark 20.2 (API Trust Requirements)

When using a third-party API:

Availability: Service can deny access or go offline
Integrity: Service can return false data (balances, history)
Privacy: Service can log every query together with the client's IP address
Censorship: Service can refuse to broadcast transactions

Without independent verification (Merkle proofs, header validation), the client must trust the API operator completely.

As with Electrum, the trust and privacy problems are problems of public infrastructure: a self-hosted Esplora instance backed by one's own full node restores privacy, availability, and integrity at the cost of roughly 100 GB of additional index storage.

20.5 Hybrid Approaches

Modern wallets often combine multiple approaches to optimize for different requirements.

Multi-Backend Wallets

Figure 20.2: Hybrid architecture: wallets can support multiple backends with configurable priority, optimizing for security when available and falling back to less private options when necessary.

Verification Layers

Even when using a server backend, clients can add verification:

Definition 20.3 (Layered Verification)

Layer	Verification	Protects Against
Header chain	Validate PoW chain	Fake blocks with low work
Merkle proofs	Verify tx inclusion	Fabricated transactions
Filter headers	Multi-peer consensus	Transaction omission
Multiple servers	Cross-reference responses	Single malicious server

Example 20.1 (Verification-Enhanced Electrum)

A wallet using Electrum can add security by:

Requesting Merkle proofs for all transactions
Maintaining a validated header chain locally
Verifying each Merkle proof against the Merkle root in the corresponding local header
Querying multiple Electrum servers for consensus

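This does not fix privacy (the servers still see every address), but it prevents a server from fabricating transactions or confirmations, and cross-checking several servers makes omitted history detectable as long as one queried server is honest.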
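The inclusion check at the heart of this example can be sketched in Python as follows. The function names are illustrative. The sketch assumes every hash is already in Bitcoin's internal byte order (servers typically return display-order hex, which the caller would reverse) and that the 80-byte header comes from the client's own validated chain.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Bitcoin's double SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def root_from_branch(txid: bytes, branch: list[bytes], pos: int) -> bytes:
    """Fold the sibling hashes into the leaf, bottom-up.

    `pos` is the transaction's index in the block; its low bit at each
    level says whether the running hash is the right (1) or left (0) child.
    """
    h = txid
    for sibling in branch:
        h = dsha256(sibling + h) if pos & 1 else dsha256(h + sibling)
        pos >>= 1
    return h


def verify_inclusion(txid: bytes, branch: list[bytes], pos: int, header: bytes) -> bool:
    """Compare the recomputed root with bytes 36..68 of the header."""
    return len(header) == 80 and root_from_branch(txid, branch, pos) == header[36:68]


# Toy block of four "transactions" to exercise the check.
leaves = [dsha256(bytes([i])) for i in range(4)]
left, right = dsha256(leaves[0] + leaves[1]), dsha256(leaves[2] + leaves[3])
toy_header = bytes(36) + dsha256(left + right) + bytes(12)
print(verify_inclusion(leaves[2], [leaves[3], left], 2, toy_header))
```

A successful check shows only that the transaction is committed to by a header the client accepts; it says nothing about whether the server omitted other transactions.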

20.6 The Validation Gap

This section is the theoretical core of the chapter. The architectures surveyed so far differ in who serves the data, but they share one limit: at best they verify that a transaction is included in a block, and inclusion is not the same as validity. We now examine precisely what a light client cannot verify (the validation gap) and consider how the gap might be narrowed incrementally.

What Full Nodes Check

Recall from Chapter 15 that consensus requires every block to satisfy a comprehensive set of rules. A full node verifies all of them; a light client verifies almost none. The definition and table below classify each major consensus check by the data a client would need in order to verify it.

Definition 20.4 (Validation Capability Classes)

We classify consensus checks into four classes based on the data required to verify them:

Class H (Header-only): Verifiable from the 80-byte header chain alone.
Class T (Transaction-level): Requires the offending transaction plus its Merkle inclusion proof and, in some cases, the spent output data.
Class F (Fraud-provable): Not verifiable by default, but a compact fraud proof can demonstrate a violation using a small witness (typically under 2 KB).
Class U (UTXO-dependent): Requires access to the unspent transaction output set, which light clients do not possess.

Consensus Rule	Class	Evidence Required
Proof-of-work meets difficulty target	H	Block header (80 bytes)
Difficulty retarget is correct	H	Previous 2016 headers
Timestamp within median-time bounds	H	Previous 11 headers
Previous block hash links correctly	H	Adjacent headers
Block version meets activation height	H	Header + height
Block weight does not exceed limit	F	SHA-256 midstate proof (BIP 180)
Coinbase reward does not exceed subsidy (fee component is Class U)	F	Coinbase transaction + Merkle proof
Coinbase encodes block height (BIP 34)	F	Coinbase transaction + Merkle proof
Witness commitment present in coinbase	F	Coinbase transaction + Merkle proof
Transaction signatures are valid	T	Transaction + Merkle proof + spent output scripts
Locktime and sequence constraints	T	Transaction + Merkle proof + block context
No duplicate transactions within a block	T	Two transactions + both Merkle proofs
Script execution succeeds	T	Transaction + spent output data + Script interpreter
Inputs reference existing unspent outputs	U	UTXO set or equivalent commitment
No cross-block double-spending	U	UTXO set or equivalent commitment
Total fees calculated correctly	U	All transaction inputs (requires UTXO set for input values)
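Three of the Class H rows can be checked with nothing but header bytes, as the Python sketch below illustrates. The function names are illustrative. The sketch omits the retarget check and the upper timestamp bound, and it does not confirm that the encoded target is the one the chain requires. The genesis block header serves only as a convenient test vector.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bits_to_target(bits: int) -> int:
    """Expand the compact nBits field into the full 256-bit target."""
    if bits & 0x00800000:
        raise ValueError("negative target")
    exponent, mantissa = bits >> 24, bits & 0x007FFFFF
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


def meets_pow(header: bytes) -> bool:
    """Header hash, read as a little-endian integer, must not exceed the target."""
    if len(header) != 80:
        return False
    bits = int.from_bytes(header[72:76], "little")
    return int.from_bytes(dsha256(header), "little") <= bits_to_target(bits)


def links_to(header: bytes, previous: bytes) -> bool:
    """Bytes 4..36 must equal the hash of the previous header."""
    return header[4:36] == dsha256(previous)


def after_median_time(header: bytes, previous_headers: list[bytes]) -> bool:
    """Timestamp must exceed the median of the previous (up to) 11 timestamps."""
    times = sorted(int.from_bytes(h[68:72], "little") for h in previous_headers[-11:])
    return int.from_bytes(header[68:72], "little") > times[len(times) // 2]


GENESIS_HEADER = bytes.fromhex(
    "01000000" + "00" * 32
    + "3ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a"
    + "29ab5f49" + "ffff001d" + "1dac2b7c"
)
print(meets_pow(GENESIS_HEADER))
```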

Fraud Proofs: Compact Evidence of Rule Violations

The Bitcoin whitepaper, in its discussion of simplified payment verification, anticipated that full nodes could alert light clients when they detect an invalid block. This concept is now called a fraud proof: a compact piece of evidence that demonstrates a specific consensus violation.

Definition 20.5 (Fraud Proof)

Refining Definition 17.5: a fraud proof for a consensus rule R applied to block B is a data structure P such that:

Completeness: If B violates R, an honest full node can construct P from block data.
Soundness: If B satisfies R, no adversary can construct a valid P.
Compactness: |P| is significantly smaller than |B|.
Verifiability: A light client holding the header chain can verify P without additional data.

BIP 180 specifies a fraud proof for the block weight rule (Class F above). The proof works by including SHA-256 midstate data that allows reconstruction of the Merkle root from partial transaction size information. If the reconstructed Merkle root matches the header and the computed weight exceeds the consensus limit, the proof is valid. The total evidence is approximately 1–2 KB, compared with a full block, which may be several megabytes.

Similar constructions work for other Class F rules. A coinbase inflation proof, for instance, requires only the coinbase transaction (typically 200–500 bytes) and its Merkle inclusion proof (about 11 hashes for a block of 2000 transactions). The light client verifies the Merkle proof against the header it already holds, parses the coinbase outputs, and checks them against the known subsidy for that block height. Note the limit: verifying the fee component would require the input values of every transaction in the block, so a compact proof on its own shows only that the coinbase outputs exceed the subsidy; it demonstrates a violation only when combined with an independently proven fee total.
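The arithmetic behind this check is small enough to sketch in Python. The function names are illustrative, and proven_fees stands for the hypothetical "independently proven fee total" discussed above. With proven_fees left at zero, a True result means only that the outputs exceed the subsidy alone, which is not by itself a consensus violation.

```python
COIN = 100_000_000          # satoshis per bitcoin
HALVING_INTERVAL = 210_000  # blocks between subsidy halvings


def block_subsidy(height: int) -> int:
    """Subsidy in satoshis: 50 BTC, halved every 210,000 blocks."""
    halvings = height // HALVING_INTERVAL
    return 0 if halvings >= 64 else (50 * COIN) >> halvings


def coinbase_overpays(height: int, output_values: list[int], proven_fees: int = 0) -> bool:
    """True if the coinbase outputs exceed subsidy plus the proven fee total."""
    return sum(output_values) > block_subsidy(height) + proven_fees


def merkle_branch_length(tx_count: int) -> int:
    """Sibling hashes in an inclusion proof: ceil(log2(tx_count))."""
    return max(tx_count - 1, 0).bit_length()


print(block_subsidy(840_000), merkle_branch_length(2000))
```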

Theorem 20.1 (Fraud Proof Coverage)

Given a distribution mechanism for fraud proofs and assuming data availability (that is, block data is accessible to at least one honest full node), the following consensus rules become enforceable by a light client holding only the header chain:

All Class H rules (verified directly from headers)
All Class F rules (verified via compact fraud proofs)
All Class T rules (verified via transaction-level fraud proofs, given a Script evaluation capability in the client)

Class U rules remain unenforceable without either UTXO set commitments in the block header (requiring a consensus change) or a validity proof covering the full state transition.

Proof.

For Class H, the light client performs the check directly from stored headers. For Class F, the fraud proof provides a compact witness linking the violation to the block header via Merkle inclusion; the client verifies the Merkle path and checks the rule. For Class T, the fraud proof includes the offending transaction, its Merkle path, and the referenced outputs together with their own Merkle inclusion proofs (without these, a malicious prover could fabricate input data to "prove" a valid transaction invalid); the client verifies all inclusions and re-executes the relevant check. For Class U, proving that a UTXO does not exist requires enumerating the entire set or referencing a commitment that does not currently exist in Bitcoin's block structure. Without such a commitment, no compact non-existence proof is known. ∎

The Data Availability Problem

Fraud proofs assume that block data is available to honest full nodes. But a malicious miner could publish a valid header—one with sufficient proof of work—while withholding the corresponding block data. In this scenario:

The light client sees a valid header and accepts the block.
Full nodes cannot download the block to verify it.
No fraud proof can be generated because the evidence is hidden.

This is the data availability problem, and its sting is sharper than the withholding scenario alone suggests. Suppose an honest node raises the alarm: "the data for block B is unavailable." That claim is unattributable. Unavailability is not a property of the block but of the network at a moment in time—the withholder can release the data the instant anyone investigates, at which point the alarm looks false and the alarmer looks dishonest. No proof of past unavailability is possible, so no one can be punished: not the miner (the data is now available) and not the alarmer (perhaps it really was unavailable when they checked). The consequence is that unavailability alarms are a free, unpunishable denial-of-service vector: an adversary can cry wolf endlessly, forcing light clients either to download whole blocks (defeating the purpose) or to learn to ignore alarms (defeating the alert system). This dilemma, not the absence of a UTXO commitment, is the structural reason the whitepaper's suggestion that full nodes could "alert" light clients (Section 17.7) has never been made to work.

Solutions proposed in other systems include data availability sampling, where light clients request random fragments of the block and use erasure coding to detect withholding probabilistically (Al-Bassam, Sonnino & Buterin, 2018). Bitcoin does not currently implement such a mechanism.
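The probabilistic guarantee can be illustrated with a simple model in Python. The model is an assumption made for illustration, not the cited construction: each sample independently lands on a withheld fragment with probability f, where f is the fraction the adversary must withhold to make the block unrecoverable. That fraction depends on the erasure code; under a code that reconstructs from any half of the fragments, f would exceed 0.5.

```python
import math


def miss_probability(withheld_fraction: float, samples: int) -> float:
    """Chance that every one of `samples` independent draws hits available data."""
    if not 0.0 < withheld_fraction < 1.0:
        raise ValueError("withheld_fraction must lie strictly between 0 and 1")
    return (1.0 - withheld_fraction) ** samples


def samples_needed(withheld_fraction: float, max_miss: float) -> int:
    """Smallest sample count whose miss probability is at most `max_miss`."""
    if not 0.0 < withheld_fraction < 1.0 or not 0.0 < max_miss < 1.0:
        raise ValueError("both arguments must lie strictly between 0 and 1")
    return math.ceil(math.log(max_miss) / math.log(1.0 - withheld_fraction))


print(miss_probability(0.5, 10))
print(samples_needed(0.5, 1e-9))
```

Under this model the miss probability falls exponentially in the number of samples, which is why a few dozen small requests can stand in for downloading the block.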

Compact Clients: Header Distribution via Relay Networks

A practical obstacle for light clients is obtaining block headers reliably. Traditional SPV clients connect to the Bitcoin peer-to-peer network directly, which exposes them to eclipse attacks where all connected peers are controlled by an adversary.

An alternative approach distributes headers through a separate relay network. Full nodes acting as publishers serialize each new block header and broadcast it through the relay layer. Light clients subscribe to multiple independent publishers and verify the header chain locally.

Definition 20.6 (Compact Client)

A compact client is a light client that:

Receives block headers from one or more publishers via a relay network, rather than from the Bitcoin P2P network directly.
Validates the header chain (proof of work, difficulty, timestamps).
Optionally subscribes to fraud proof events from multiple independent full nodes.
Alerts the user when publishers disagree on the chain tip, indicating a possible chain split or eclipse attack.

This architecture offers improved eclipse resistance compared to traditional SPV. An attacker must simultaneously compromise or isolate every publisher the client subscribes to, rather than surrounding a single node's peer connections in the P2P network. Publisher identity is cryptographically fixed, so the client knows exactly which full nodes it is trusting and can select publishers operated by independent parties.
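The disagreement alert in Definition 20.6 reduces to grouping publishers by the tip they report. A minimal Python sketch follows, with hypothetical publisher names. It assumes every reported header has already passed the client's own header-chain validation, and it leaves open what the client should do once a disagreement is found.

```python
from collections import defaultdict


def group_by_tip(reports: dict[str, bytes]) -> dict[bytes, set[str]]:
    """Group publisher names by the tip hash each one reports at a given height."""
    groups: dict[bytes, set[str]] = defaultdict(set)
    for publisher, tip_hash in reports.items():
        groups[tip_hash].add(publisher)
    return dict(groups)


def tips_disagree(reports: dict[str, bytes]) -> bool:
    """True when the publishers do not all report the same tip."""
    return len(set(reports.values())) > 1


# Hypothetical reports for one height: two publishers agree, one differs.
reports = {
    "publisher-a": bytes.fromhex("aa" * 32),
    "publisher-b": bytes.fromhex("aa" * 32),
    "publisher-c": bytes.fromhex("bb" * 32),
}
if tips_disagree(reports):
    for tip, names in group_by_tip(reports).items():
        print(tip.hex()[:8], sorted(names))
```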

Incremental Verification Tiers

Combining the classification above with the compact client architecture yields a natural progression from minimal to near-full verification:

Tier	What the Client Verifies	Trust Assumption
0: Header chain	PoW, difficulty, timestamps, chain linkage (Class H)	The most-work header chain contains only valid blocks (honest hash-rate majority)
1: Block-level fraud proofs	Block weight, coinbase inflation, BIP 34/141 commitments (Class F)	At least one honest publisher relays fraud proofs
2: Transaction-level fraud proofs	Signatures, timelocks, script execution (Class T)	Same as Tier 1, plus client has Script evaluation
3: UTXO verification	Input existence, cross-block double-spends, fee totals (Class U)	Requires UTXO commitments (consensus change) or validity proofs

Each tier strictly reduces the trust surface. A Tier 0 client trusts that the majority of mining hash rate produces valid blocks. A Tier 1 client still relies on that majority for Class T and Class U rules, but for Class F rules it is deceived only if miners violate a rule and no honest publisher detects the violation and relays a proof, which is a strictly weaker assumption. The progression continues until Tier 3, which approaches full node security but depends on infrastructure that does not yet exist in Bitcoin's consensus layer.
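The relationship between rule classes and tiers can be written down as a small lookup in Python. The names are illustrative. Tier 3 appears only for completeness, since it depends on commitments or validity proofs that Bitcoin does not currently provide.

```python
from enum import Enum


class RuleClass(Enum):
    H = "header-only"
    F = "fraud-provable"
    T = "transaction-level"
    U = "UTXO-dependent"


# Lowest tier at which a client can enforce each class, per the tier table.
MIN_TIER = {RuleClass.H: 0, RuleClass.F: 1, RuleClass.T: 2, RuleClass.U: 3}


def enforceable(rule: RuleClass, tier: int) -> bool:
    """True if a client operating at `tier` can enforce rules of this class."""
    return tier >= MIN_TIER[rule]


def left_to_miners(tier: int) -> list[RuleClass]:
    """Rule classes a client at `tier` must still take on trust."""
    return [rule for rule in RuleClass if not enforceable(rule, tier)]


for tier in range(4):
    print(tier, [rule.name for rule in left_to_miners(tier)])
```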

Honest Assessment

Even at Tier 2, a light client with fraud proofs is not equivalent to a full node. The fundamental difference is epistemic: a full node has positive knowledge that every rule was satisfied, while a light client with fraud proofs has only the absence of evidence that any rule was violated. This distinction, emphasized by Bitcoin Core developers, is real and irreducible. The value of the tiered approach is not that it replaces full validation, but that it honestly quantifies what can and cannot be verified at each resource level.

20.7 Wallet Architectures in Practice

Desktop Wallets

Wallet	Backend	Privacy	Verification
Bitcoin Core	Built-in full node	Perfect	Full
Electrum	Electrum servers	Configurable	Merkle proofs
Sparrow	Core/Electrum/Public	Configurable	Merkle proofs
Wasabi	BIP-157/158	Good	Filters + headers

Mobile Wallets

Wallet	Backend	Privacy	Notes
BlueWallet	Electrum/own server	Configurable	Can use own server
Blockstream Green	Blockstream servers	Trust Blockstream	Multisig option
Phoenix (LN)	ACINQ servers	Trust ACINQ	Lightning-focused
Breez (LN)	Neutrino	Good	BIP-157/158

Hardware Wallet Integration

Hardware wallets (Ledger, Trezor, Coldcard) are signing devices, not full wallets. They require a software companion for blockchain queries:

Vendor software: Ledger Live, Trezor Suite (use vendor servers)
Third-party wallets: Electrum, Sparrow (configurable backend)
Best practice: Use hardware wallet with personal server backend

20.8 Choosing an Architecture

Decision Framework

Definition 20.7 (Architecture Selection Criteria)

Consider these factors when choosing a light client architecture:

Privacy requirements: Who can see your addresses?
Trust requirements: What can a malicious server do?
Resource constraints: Storage, bandwidth, always-online?
Use case: Savings, spending, Lightning, business?
Technical ability: Can you run your own infrastructure?

Recommendations by Use Case

Use Case	Recommended	Rationale
Long-term savings	Full node + hardware wallet	Maximum security for large amounts
Privacy-focused	Full node or personal Electrum	No third party sees addresses
Mobile spending	Wallet with own server	Balance convenience and privacy
Lightning node	Full node or Neutrino	Need to monitor channels
Casual/learning	Any reputable wallet	Convenience acceptable for small amounts
Business/exchange	Full node required	Cannot trust third parties

The Path to Self-Sovereignty

Progressive Decentralization

Users often progress through trust levels as they gain experience:

Start with convenient public server wallet
Add Tor for IP privacy
Run personal Electrum server
Eventually run full Bitcoin Core node

This progression reflects increasing value stored and deepening understanding.

Exercises

Exercise 20.1

Set up electrs (Rust Electrum server) connected to Bitcoin Core in regtest mode. Query it using the Electrum protocol and observe the raw JSON-RPC messages.

Exercise 20.2

Compare the bandwidth requirements for syncing a 100-address wallet from genesis using: (a) full node, (b) BIP-157/158, (c) Electrum server queries. Assume the average block has 2500 transactions.

Exercise 20.3

Design an attack where a malicious Electrum server profits by lying about transaction confirmations. What verification would detect this?

Exercise 20.4

Explain why using an Electrum server over Tor still leaks privacy to the server operator, while BIP-157/158 over Tor provides meaningful privacy even against the peer.

Exercise 20.5

A wallet queries three Electrum servers and receives conflicting balance information. Design a protocol to identify which server(s) are lying using Merkle proofs.

Exercise 20.6

Calculate the storage requirements to run: (a) Bitcoin Core, (b) pruned Bitcoin Core, (c) Bitcoin Core + ElectrumX, (d) Bitcoin Core + Fulcrum. Which setups can run on a Raspberry Pi with 1 TB of storage?

Exercise 20.7

Construct a coinbase inflation fraud proof for a block at height 840,000 (post-halving, subsidy = 3.125 BTC). Specify exactly what data the proof must contain, how the light client verifies the Merkle inclusion, and what check it performs on the coinbase outputs. What information is the light client still missing to verify that the claimed fee total is correct?

Exercise 20.8

A compact client subscribes to three independent header publishers. Two publishers report block hash A at height n, while the third reports a different hash B at the same height. Describe the possible causes (chain split, eclipse attack on one publisher, stale block) and design a decision procedure for the client.

Chapter Summary

Light clients exist on a spectrum from trustless (full node) to fully trusted (centralized API), with various trade-offs in between.
Electrum's client-server model provides efficient address queries but reveals all wallet addresses to the server.
Operating one's own Electrum server eliminates third-party privacy leakage but requires additional infrastructure.
Block explorer APIs are convenient, but a client that does not verify responses independently must trust the operator completely, and every query discloses the wallet's addresses to that operator.
Hybrid approaches combine multiple backends with verification layers to optimize for different requirements.
Consensus rules can be classified by what data is needed to verify them: header-only (Class H), transaction-level (Class T), fraud-provable (Class F), and UTXO-dependent (Class U). Light clients can incrementally close the validation gap by progressing through verification tiers.
Fraud proofs enable compact evidence of consensus violations, but depend on data availability and provide negative assurance (absence of alerts) rather than the positive assurance of full validation.
The appropriate architecture depends on the user's privacy, security, and resource requirements: there is no one-size-fits-all solution.