This article is the ninth in a guide to the Bitcoin network, written to be accessible even to readers who aren’t expert programmers. The series is designed to lead the reader gradually into what many call the “rabbit hole”.
For the bibliography, credit goes to the book “Mastering Bitcoin” by Andreas M. Antonopoulos, from which the images have been taken.
The Bitcoin network
Bitcoin is structured as a peer-to-peer (P2P) network architecture built on top of the internet. Peer-to-peer means that no node has different powers from the others: all nodes are equal, and all share the responsibility of providing network services as well as the benefit of using them. The nodes connect in a mesh network with a flat topology, with no servers, centralised services or hierarchy of any kind. The term “Bitcoin network” refers to the set of nodes that run the Bitcoin P2P protocol. Other protocols also exist, such as Stratum, which is used for mining and by lightweight or mobile wallets. These protocols are served by gateway routing servers that access the Bitcoin network through the Bitcoin P2P protocol and then extend the network to nodes running the other protocols.
Although nodes are equal, they can take on different roles depending on the functionality they provide. A Bitcoin node is a set of functions:
- routing;
- blockchain (database);
- mining;
- wallet (services).
All nodes include the routing function to participate in the network: they validate and propagate transactions and blocks, and they discover and maintain connections to peers. Beyond that, nodes differ in the functions they provide:
- some nodes (full nodes) keep a complete and up-to-date copy of the blockchain and can independently and authoritatively verify any transaction without external references;
- some nodes (SPV nodes or lightweight nodes) keep only a subset of the blockchain and verify transactions using a method called SPV (simplified payment verification);
- mining nodes compete with each other to create new blocks by running specialised hardware that searches for solutions to the Proof-of-Work algorithm (some are full nodes, others are lightweight participants in mining pools);
- user wallets can be part of a full node or, as in the case of a mobile wallet, part of an SPV node;
- in addition to the main types of nodes inherent in the Bitcoin protocol, there are servers and nodes that run other protocols, such as specialised mining pool protocols and lightweight client-access protocols.
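The way these functions combine into node types can be pictured with a small data structure. The following is only an illustrative sketch: the names NodeFunction, FULL_NODE, SPV_WALLET, SOLO_MINER and REFERENCE_CLIENT are hypothetical labels chosen for this example and do not come from any Bitcoin implementation.

```python
from enum import Flag, auto


class NodeFunction(Flag):
    """The four functions a node may combine (names are illustrative)."""
    ROUTING = auto()     # validate and relay transactions and blocks, manage peers
    BLOCKCHAIN = auto()  # complete, up-to-date copy of the blockchain
    MINING = auto()      # compete to create new blocks
    WALLET = auto()      # user wallet services


# Hypothetical presets for the node types described in the list above.
FULL_NODE = NodeFunction.ROUTING | NodeFunction.BLOCKCHAIN
SPV_WALLET = NodeFunction.ROUTING | NodeFunction.WALLET
SOLO_MINER = FULL_NODE | NodeFunction.MINING
REFERENCE_CLIENT = SOLO_MINER | NodeFunction.WALLET


def can_verify_independently(functions: NodeFunction) -> bool:
    """Only a node holding the full blockchain can verify without outside help."""
    return NodeFunction.BLOCKCHAIN in functions


if __name__ == "__main__":
    for name, node in [("full node", FULL_NODE), ("SPV wallet", SPV_WALLET)]:
        print(name, "verifies independently:", can_verify_independently(node))
```

The flag representation makes the point of the list explicit: every preset includes ROUTING, and the node types differ only in which of the other functions they add.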
The expanded Bitcoin network
The Bitcoin main network, which runs the Bitcoin P2P protocol, now has almost 10,000 nodes running various versions of the Bitcoin reference client (Bitcoin Core) and a few hundred nodes running other implementations of the Bitcoin P2P protocol, such as BitcoinJ, Libbitcoin, and btcd. A small percentage of the nodes on the Bitcoin P2P network are also mining nodes. Various businesses interface with the Bitcoin network by running full-node clients based on Bitcoin Core, which gives them a network node with a full copy of the blockchain but without mining or wallet functionality. These nodes act as network edge routers, allowing various other services (exchanges, wallets, block explorers, merchant payment processing) to be built on top of the network.
As a result, the extended network includes both nodes running the Bitcoin protocol and nodes running specialised protocols. Attached to the main network are a number of pool servers and protocol gateways that connect nodes running other protocols (mostly mining pool nodes and “light” clients).
Bitcoin Relay Network
While the P2P network meets the general needs of a wide variety of node types, its latency is too high for mining nodes, which need to minimise the time between the propagation of a winning block and the start of the next round of competition. A Bitcoin Relay Network is a network that attempts to minimise this latency. The original Bitcoin Relay Network, created by Matt Corallo in 2015, consisted of dozens of specialised nodes hosted on Amazon Web Services infrastructure and connected the majority of miners and mining pools. It was replaced in 2016 by the Fast Internet Bitcoin Relay Engine (FIBRE), also created by Matt Corallo, a UDP-based relay network that forwards blocks within a network of nodes.
When a new node starts up, it must discover other nodes in order to participate in the network, and it must connect to at least one of them. The geographic location of the other nodes is irrelevant, so any existing node can be chosen at random. To connect to a known peer, a node establishes a TCP connection, normally on port 8333. Once the connection is established, the node starts a “handshake” by sending a version message containing basic identifying information, such as:
- nVersion – the version of the client’s Bitcoin P2P protocol (e.g. 70002)
- nLocalServices – a list of local services supported by the node, currently only NODE_NETWORK
- nTime – the current time
- addrYou – the IP address of the remote node as viewed by this node
- addrMe – the IP address of the local node, as discovered by the local node
- subver – shows the type of software that runs on the node (e.g.: /Satoshi:0.9.2.1/)
- BestHeight – the block height of this node’s blockchain
The version message is always the first message sent by one peer to another. The local peer that receives it examines the reported nVersion to decide whether the remote peer is compatible. If it is, the local peer acknowledges the version message and establishes the connection by sending a verack.
To find peers, a node can first query DNS using a number of “DNS seeds”: DNS servers that return a list of IP addresses of Bitcoin nodes. Some seeds provide a static list of IP addresses of stable, listening Bitcoin nodes. Others are custom implementations of BIND (Berkeley Internet Name Daemon) that return a random subset from a list of Bitcoin node addresses collected by a crawler or a long-running Bitcoin node. Bitcoin Core contains the names of five different seeds. The diversity of ownership and implementation among the DNS seeds offers a good level of reliability for the initial start-up process. Alternatively, a starting node that knows nothing about the network must be given the IP address of at least one Bitcoin node, after which it can establish further connections through introductions.
Once one or more connections are established, the new node sends an addr message containing its own IP address to its neighbours, who in turn forward it to their neighbours, ensuring that the new node becomes known and well connected. In addition, the new node can send getaddr to its neighbours, asking them to return a list of IP addresses of other peers. In this way, a node can find peers to connect to and advertise its existence on the network for other nodes to find it.
A node must connect to several different peers to establish diverse paths into the network. Paths are not reliable, because nodes come and go, so a node must keep discovering new nodes as it loses old connections, and it must also help other nodes when they start up. Only one connection is needed to start, because the first peer can offer introductions to its own peers, and connecting to more than a handful of nodes is unnecessary and wasteful of network resources. After starting up, a node remembers its most recent successful peer connections, so that if it is restarted it can quickly re-establish connections with its former peers. If none of the former peers respond to its connection request, the node can use the DNS seeds to bootstrap again.
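The DNS-seed step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Bitcoin Core implements it: the function names are hypothetical, the two hostnames are examples of seeds that have appeared in Bitcoin Core's list and may no longer be current, and the default of eight candidates is simply one reading of "a handful".

```python
import random
import socket

# Example seed hostnames (assumption: these may be out of date).
DNS_SEEDS = ["seed.bitcoin.sipa.be", "dnsseed.bluematt.me"]


def resolve_seed(hostname: str, port: int = 8333) -> set[str]:
    """Ask DNS for the addresses a seed currently advertises."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def bootstrap_peers(seeds, resolve=resolve_seed, wanted: int = 8, rng=random) -> list[str]:
    """Collect candidate peers from several seeds and pick a handful at random."""
    candidates: set[str] = set()
    for seed in seeds:
        try:
            candidates |= set(resolve(seed))
        except OSError:
            # One seed being unreachable is fine: the others provide redundancy.
            continue
    pool = sorted(candidates)
    return rng.sample(pool, min(wanted, len(pool)))


if __name__ == "__main__":
    for ip in bootstrap_peers(DNS_SEEDS):
        print("candidate peer:", ip)
```

The resolver is passed in as a parameter so that the selection logic can be exercised without network access. A node would then open TCP connections to the chosen candidates and continue discovery with addr and getaddr.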
In the early years of Bitcoin, all nodes were full nodes, and today the Bitcoin Core client is still a full blockchain node. Full nodes maintain a complete and up-to-date copy of the blockchain with all transactions, which they independently build and verify, starting with the first block (the genesis block) and building up to the latest known block in the network. A full node can independently and authoritatively verify any transaction without recourse to or reliance on any other node or source of information. It relies on the network only to receive updates about new blocks of transactions, which it then verifies and incorporates into its local copy of the blockchain.
The first thing a full node does when it connects to peers is try to construct a complete blockchain. If it is a brand-new node with no blockchain at all, it knows only one block, the genesis block, which is statically embedded in the client software. Starting with block no. 0, the new node has to download hundreds of thousands of blocks to synchronise with the network and rebuild the entire blockchain. The synchronisation process starts with the version message, because it contains BestHeight, the node’s current blockchain height. Peers then exchange getblocks messages containing the hash of the top block on their local blockchain, so each side can tell whether the other is behind. The peer with the longer blockchain has more blocks than the other node and can identify which blocks the other node needs in order to “catch up”: it identifies the first 500 blocks to share and transmits their hashes in an inv (inventory) message. The node that is missing these blocks then retrieves them by sending a series of getdata messages, which request the full block data and identify the requested blocks by the hashes from the inv message. This process of comparing the local blockchain with peers and retrieving any missing blocks happens whenever a node has been offline for any period of time.
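The getblocks / inv / getdata exchange can be simulated in memory to show how a node catches up in batches of 500. This is a deliberately simplified sketch: chains are plain lists of hash strings, the function names are hypothetical, block download and validation are reduced to copying a hash, and the requester announces only its top block hash, whereas the real getblocks message carries a list of hashes (a block locator) so that peers on different forks can still find a common ancestor.

```python
MAX_INV_BLOCKS = 500  # a peer announces at most this many block hashes per inv


def handle_getblocks(top_hash: str, chain: list[str]) -> list[str]:
    """Peer side: answer getblocks with an inv of up to 500 hashes after top_hash.

    Raises ValueError if top_hash is not on this peer's chain (the sketch
    assumes both nodes share a common prefix).
    """
    start = chain.index(top_hash) + 1
    return chain[start:start + MAX_INV_BLOCKS]


def sync(local_chain: list[str], peer_chain: list[str]) -> int:
    """Node side: repeat getblocks -> inv -> getdata until nothing is missing.

    Returns the number of inv batches that were needed.
    """
    batches = 0
    while True:
        inv = handle_getblocks(local_chain[-1], peer_chain)  # getblocks, answered by inv
        if not inv:
            return batches                                   # already caught up
        for block_hash in inv:
            # Stand-in for getdata plus validation of the full block.
            local_chain.append(block_hash)
        batches += 1


if __name__ == "__main__":
    peer = [f"block-{height}" for height in range(1201)]  # heights 0..1200
    node = peer[:1]                                       # a new node knows only the genesis block
    print("batches needed:", sync(node, peer))
    print("in sync:", node == peer)
```

With 1,200 blocks missing, the node needs three batches (500, 500 and 200 hashes), which illustrates why an initial synchronisation of hundreds of thousands of blocks takes many round trips.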