This article is the ninth (and the second part) in a guide to the Bitcoin network, accessible even to those who aren’t experts in coding. It continues a series designed to lead the reader gradually down what many call the “rabbit hole”.
As far as the bibliography is concerned, it is necessary to mention the book “Mastering Bitcoin” by Andreas M. Antonopoulos, from which the images have been taken.
Many bitcoin clients are designed to work on devices with limited space and power (smartphones, tablets, etc.). A Simplified Payment Verification (SPV) method is used to allow such devices to operate without storing the complete blockchain. These types of clients are called SPV clients or light clients. As bitcoin adoption increases, the SPV node is becoming the most common form of bitcoin node, especially for bitcoin wallets.
SPV nodes download only the block headers and not the transactions included in each block. The resulting chain of headers is roughly 1,000 times smaller than the complete blockchain. SPV nodes cannot build a complete picture of all UTXOs available for spending because they are not aware of all transactions on the network. Instead, they verify transactions using a slightly different method that relies on peers to provide partial views of the relevant parts of the blockchain on demand.
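The size difference can be illustrated with some rough arithmetic. A serialized block header is a fixed 80 bytes; the block count and the average block size below are hypothetical figures chosen only to show the calculation, not measurements of the real network.

```python
import struct

HEADER_SIZE = 80  # bytes in a serialized Bitcoin block header


def pack_header(version, prev_hash, merkle_root, timestamp, bits, nonce):
    """Serialize the six header fields into the fixed little-endian layout."""
    return struct.pack("<i32s32sIII", version, prev_hash, merkle_root,
                       timestamp, bits, nonce)


def header_chain_bytes(block_count):
    """Storage needed when only the headers are kept."""
    return block_count * HEADER_SIZE


# Hypothetical inputs, chosen only to illustrate the arithmetic.
ASSUMED_BLOCK_COUNT = 800_000
ASSUMED_AVG_BLOCK_BYTES = 80_000

placeholder = pack_header(1, bytes(32), bytes(32), 0, 0, 0)
print(len(placeholder))                          # 80
print(header_chain_bytes(ASSUMED_BLOCK_COUNT))   # 64000000
print(ASSUMED_AVG_BLOCK_BYTES // HEADER_SIZE)    # 1000
```

Under these assumed numbers the headers take about 64 MB, and the ratio of an average block to its header is 1,000; the real ratio depends on how full the blocks actually are.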
Simplified Payment Verification verifies transactions by reference to their depth in the blockchain rather than their height. While a full node builds a fully verified chain of thousands of blocks and transactions, reaching down the blockchain (back in time) all the way to the genesis block, an SPV node verifies the chain of all blocks (but not all transactions) and links that chain to the transaction of interest.
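The idea of verifying a chain of blocks without their transactions can be sketched as follows. This is a simplified illustration, not the code of any real client: the helper names are hypothetical, the headers are toy values, and the proof-of-work check that a real SPV node also performs on every header is deliberately left out.

```python
import hashlib
import struct


def double_sha256(data):
    """Bitcoin hashes headers with two rounds of SHA-256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def make_toy_header(prev_hash, merkle_root, nonce=0):
    """Build an 80-byte header; version, time and bits are placeholders."""
    return struct.pack("<i32s32sIII", 1, prev_hash, merkle_root, 0, 0, nonce)


def prev_hash_of(header):
    """Bytes 4..36 of a header hold the hash of the previous header."""
    return header[4:36]


def chain_is_linked(headers):
    """True if every header commits to the hash of the one before it."""
    return all(prev_hash_of(child) == double_sha256(parent)
               for parent, child in zip(headers, headers[1:]))


genesis = make_toy_header(bytes(32), double_sha256(b"toy txs 0"))
block_1 = make_toy_header(double_sha256(genesis), double_sha256(b"toy txs 1"))
block_2 = make_toy_header(double_sha256(block_1), double_sha256(b"toy txs 2"))
print(chain_is_linked([genesis, block_1, block_2]))  # True

# Swapping in a different block 1 breaks the link from block 2.
forged = make_toy_header(double_sha256(genesis), double_sha256(b"forged"))
print(chain_is_linked([genesis, forged, block_2]))   # False
```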
For example, when examining a transaction in block 300,000, a full node links all 300,000 blocks down to the genesis block and builds a complete UTXO database, establishing the validity of the transaction by confirming that the UTXO remains unspent. An SPV node cannot validate whether the UTXO is unspent. Instead, it establishes a link between the transaction and the block that contains it, using a Merkle path.
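A Merkle path check might look roughly like the sketch below. The function names are hypothetical, and the leaves are toy hashes rather than real transaction IDs; real clients also have to deal with Bitcoin's byte-order conventions, which this sketch ignores. The tree is built with double SHA-256 and duplicates the last hash when a level has an odd number of entries, as Bitcoin does.

```python
import hashlib


def double_sha256(data):
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level):
    """Hash adjacent pairs; the caller guarantees an even number of entries."""
    return [double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)]


def merkle_root(leaves):
    """Root of a non-empty list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last hash on odd levels
        level = _next_level(level)
    return level[0]


def merkle_path(leaves, index):
    """What a full node would send: (sibling_hash, sibling_is_left) pairs."""
    path, level = [], list(leaves)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return path


def verify_merkle_path(tx_hash, path, expected_root):
    """What the SPV node does: fold the path up and compare with the header."""
    current = tx_hash
    for sibling, sibling_is_left in path:
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == expected_root


toy_leaves = [double_sha256(bytes([i])) for i in range(5)]
root = merkle_root(toy_leaves)
proof = merkle_path(toy_leaves, 3)
print(verify_merkle_path(toy_leaves[3], proof, root))         # True
print(verify_merkle_path(double_sha256(b"fake"), proof, root))  # False
```

The SPV node only needs the transaction hash, the short path and the Merkle root from the block header; it never sees the other transactions in the block.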
The SPV node then waits until it sees the six blocks 300,001 through 300,006 piled on top of the block containing the transaction, and verifies the transaction by establishing its depth under those six blocks. The fact that other nodes on the network accepted block 300,000, and then did the work required to produce six more blocks on top of it, is proof by proxy that the transaction was not a double spend.
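The depth test itself is simple arithmetic. A minimal sketch, in which the function names and the configurable threshold are illustrative assumptions and the default of six blocks follows the example above:

```python
REQUIRED_BLOCKS_ON_TOP = 6  # threshold used in the example above


def blocks_on_top(tx_block_height, tip_height):
    """Number of blocks built after the block that holds the transaction."""
    return max(0, tip_height - tx_block_height)


def is_buried_deep_enough(tx_block_height, tip_height,
                          required=REQUIRED_BLOCKS_ON_TOP):
    return blocks_on_top(tx_block_height, tip_height) >= required


print(is_buried_deep_enough(300_000, 300_005))  # False: only five blocks on top
print(is_buried_deep_enough(300_000, 300_006))  # True: six blocks on top
```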
An SPV node cannot be persuaded that a transaction exists in a block when it does not, because it establishes existence through a Merkle path and the proof of work in the chain of blocks. However, the existence of a transaction can be hidden from an SPV node. The node can prove that a transaction exists, but it cannot verify that a conflicting transaction, such as a double spend of the same UTXO, does not exist, because it has no record of all transactions. This vulnerability can be used in denial-of-service or double-spending attacks against SPV nodes.
To counter this, an SPV node needs to connect randomly to several nodes, increasing the probability that it is in contact with at least one honest node. This need to connect randomly, in turn, makes SPV nodes vulnerable to network partitioning attacks or Sybil attacks, in which they are connected to fake nodes or fake networks and have no access to honest nodes or the real Bitcoin network. For most practical purposes, well-connected SPV nodes are secure enough, striking a balance between resource usage, security and convenience.
Because SPV nodes retrieve specific transactions in order to verify them selectively, they also create a privacy risk: by requesting specific data they may inadvertently reveal the addresses in their wallets. A third party monitoring the network could keep track of all the transactions requested by a wallet on an SPV node and use them to associate Bitcoin addresses with that wallet’s user, undermining the user’s privacy.
Shortly after the introduction of SPV nodes, bitcoin developers added a feature called bloom filters to address their privacy risks. Bloom filters allow SPV nodes to receive a subset of transactions without revealing precisely which addresses they are interested in, through a filtering mechanism that uses probabilities rather than fixed patterns.
Bloom filters are used to filter the transactions (and the blocks containing them) that an SPV node receives from its peers, selecting only the transactions of interest to the node without revealing which addresses or keys it cares about. An SPV node initializes a bloom filter as empty (in this state the filter matches no pattern). It then builds a list of all the addresses, keys and hashes it is interested in, extracting the public key hash, the script hash and the transaction ID from each UTXO controlled by its wallet. Finally, it adds each of these to the filter, so that the filter will match any transaction in which these patterns are present, without revealing the patterns themselves.
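A toy bloom filter shows how the empty state and the added patterns behave. This is only a sketch under stated assumptions: the bit positions are derived with SHA-256 as a stand-in for the MurmurHash3 function that BIP-37 specifies, and the filter size, the number of hash functions and the wallet patterns are hypothetical values.

```python
import hashlib


class ToyBloomFilter:
    """A bit array plus several hash functions; not the BIP-37 wire format."""

    def __init__(self, size_bits=256, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)  # all zero: matches nothing

    def _positions(self, item):
        # Stand-in hashing: SHA-256 prefixed with the hash function index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:4], "little") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


# Hypothetical patterns a wallet might care about.
wallet_patterns = [b"pubkey-hash-1", b"script-hash-1", b"txid-1"]

bloom = ToyBloomFilter()
print(bloom.might_contain(b"pubkey-hash-1"))  # False: an empty filter matches nothing

for pattern in wallet_patterns:
    bloom.add(pattern)
print(all(bloom.might_contain(p) for p in wallet_patterns))  # True
```

Unrelated items can also test positive once bits are set. Those false positives are what keep a peer from knowing exactly which patterns the wallet added; a smaller filter gives more of them and therefore more privacy, at the cost of more irrelevant transactions being sent.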
The SPV node sends a filterload message to peers, containing the bloom filter to be used in the connection. On the peers, the filters are checked against each incoming transaction. The full node checks different parts of the transaction against the filter, looking for a match that includes:
- the transaction ID;
- the data components from the locking script of each transaction output (each key and hash in the script);
- each of the transaction inputs;
- each of the data components from the input signatures (or witness scripts).
By checking all these components, bloom filters can be used to match public key hashes, scripts, OP_RETURN values, public keys in signatures, or any future component of a smart contract or complex script. Once a filter is established, peers test each transaction against it and forward only those that match. The network protocol and the bloom filter mechanism for SPV nodes are defined in BIP-37.
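The peer-side check over the four components listed above could be sketched like this. The `ToyTx` structure and its field names are hypothetical, and a plain Python set stands in for the bloom filter so the example stays self-contained; any function that answers "might the filter contain this item?" could be passed instead.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTx:
    """Hypothetical, simplified view of a transaction's matchable data."""
    txid: bytes
    output_script_data: list = field(default_factory=list)  # one list per output
    input_outpoints: list = field(default_factory=list)     # one entry per input
    input_script_data: list = field(default_factory=list)   # one list per input


def tx_matches(tx, might_contain):
    """Test the components of a transaction in the order listed above."""
    if might_contain(tx.txid):
        return True
    if any(might_contain(d) for script in tx.output_script_data for d in script):
        return True
    if any(might_contain(outpoint) for outpoint in tx.input_outpoints):
        return True
    return any(might_contain(d) for script in tx.input_script_data for d in script)


def transactions_to_forward(incoming, might_contain):
    return [tx for tx in incoming if tx_matches(tx, might_contain)]


# A set used as a stand-in filter; its membership test plays might_contain.
stand_in_filter = {b"pubkey-hash-1"}

pays_wallet = ToyTx(b"tx-a", output_script_data=[[b"pubkey-hash-1"]])
unrelated = ToyTx(b"tx-b", output_script_data=[[b"pubkey-hash-9"]],
                  input_outpoints=[b"tx-z:0"])

forwarded = transactions_to_forward([pays_wallet, unrelated],
                                    stand_in_filter.__contains__)
print([tx.txid for tx in forwarded])  # [b'tx-a']
```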
Encrypted and Authenticated Connections
Most newcomers assume that communications between nodes on the Bitcoin network are encrypted; in the original implementation, they are entirely unencrypted. This is not a major privacy concern for full nodes but, as noted above, it is a big problem for SPV nodes. To increase privacy, two solutions have been proposed to encrypt communications:
- Tor Transport: Tor (The Onion Routing network) is a project that offers encryption and encapsulation of data through randomised network paths, providing anonymity, untraceability and privacy. Bitcoin Core offers several configuration options that allow running a Bitcoin node with its traffic transported over the Tor network, and it can also offer a Tor hidden service that allows other Tor nodes to connect to the node directly over Tor.
- P2P Authentication and Encryption (BIP-150/151): the two BIPs define optional services that may be offered by compatible nodes. BIP-151 enables negotiated encryption for all communications between two nodes that support BIP-151; BIP-150 offers optional peer authentication that allows nodes to authenticate each other’s identity using ECDSA and private keys (as of January 2017 they are not implemented in Bitcoin Core, but they have been implemented in an alternative client called bcoin). The two BIPs allow an SPV node to connect to a trusted full node using encryption and authentication in order to protect the privacy of the SPV client. Additionally, authentication can be used to create networks of trusted bitcoin nodes and prevent man-in-the-middle attacks. Finally, P2P encryption, if widely deployed, would strengthen Bitcoin’s resistance to traffic analysis and surveillance, especially in countries where the Internet is monitored and controlled.
Almost all nodes in the bitcoin network maintain a temporary list of unconfirmed transactions called the memory pool (also known as the mempool or transaction pool). Nodes use this pool to keep track of transactions that are known to the network but not yet included in the blockchain. As transactions are received and verified, they are added to the mempool and relayed to neighbouring nodes so that they propagate across the network.
Some node implementations also maintain a separate pool of orphaned transactions. If a transaction’s inputs refer to a transaction that is not yet known, that is, a missing parent, the orphan transaction is stored temporarily in the orphan pool until the parent transaction arrives. When a transaction is added to the mempool, the orphan pool is checked for any orphans that reference this transaction’s outputs (its children), and the matching orphans are then validated.
If valid, they are removed from the orphan pool and added to the mempool, completing the chain that started with the parent transaction. In light of the newly added transaction, which is no longer an orphan, the process is repeated recursively, looking for further descendants until none are found. Through this process, the arrival of a parent transaction triggers a cascading reconstruction of an entire chain of interdependent transactions, reuniting orphans with their parents all the way down the chain.
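The cascade can be sketched with two dictionaries. This is a simplified model, not the logic of any particular node implementation: the class and method names are hypothetical, a transaction is reduced to an ID plus the IDs of its parents, full validation is replaced by a check that all parents are known, and the recursion is written as a work queue.

```python
from collections import deque


class ToyPools:
    def __init__(self, confirmed_txids=()):
        self.confirmed = set(confirmed_txids)  # parents already in the blockchain
        self.mempool = {}                      # txid -> list of parent txids
        self.orphans = {}                      # txid -> list of parent txids

    def _parents_known(self, parents):
        return all(p in self.confirmed or p in self.mempool for p in parents)

    def receive(self, txid, parents):
        """Handle a transaction arriving from the network."""
        if not self._parents_known(parents):
            self.orphans[txid] = parents       # missing parent: park it
            return
        self.mempool[txid] = parents
        queue = deque([txid])
        while queue:                           # cascade through the descendants
            parent = queue.popleft()
            ready = [child for child, ps in self.orphans.items()
                     if parent in ps and self._parents_known(ps)]
            for child in ready:
                self.mempool[child] = self.orphans.pop(child)
                queue.append(child)


pools = ToyPools(confirmed_txids={"coinbase"})
pools.receive("C", ["B"])           # grandchild arrives first: orphan
pools.receive("B", ["A"])           # child arrives next: orphan
print(sorted(pools.orphans))        # ['B', 'C']

pools.receive("A", ["coinbase"])    # the parent arrives and unlocks the chain
print(sorted(pools.mempool))        # ['A', 'B', 'C']
print(sorted(pools.orphans))        # []
```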
Both the mempool and the orphan pool (where implemented) are held in local memory rather than in persistent storage; they are populated dynamically by incoming network messages. When a node starts, both pools are empty and are gradually filled with new transactions received from the network. The exception is that some versions of Bitcoin Core save the mempool to disk when the node shuts down and reload it at startup.