DHT (Distributed Hash Table) is a category of distributed computing systems and a type of distributed system that provides a lookup service similar to a hash table. Key-value pairs are stored in the DHT distributed hash table, and any participating node can efficiently retrieve the key-value associated with a given key. In the article comparing the principles of BitTorrent and Magnet, we have introduced their connections and differences. Now we will explain in detail what the DHT protocol specifically is.
The main advantage of DHT is that nodes can be added or removed with minimal key redistribution work. Key values are unique identifiers mapped to specific values, which can be anything from addresses to documents to arbitrary data. The responsibility for maintaining the mapping from key values to numerical values is distributed among the nodes, minimizing the impact of any changes to the involved participants on the overall functionality of the system or process. This allows DHT to scale to a very large number of nodes and handle continuous node arrivals, departures, and failures.
BitTorrent uses a "Distributed Loose Hash Table" (DHT) to store contact information for seed nodes based on a Trackerless protocol. In practice, each node becomes a Tracker. The protocol is based on Kademlia and is implemented via UDP.
Please note the terminology used in this document to avoid confusion. Peer nodes are clients/servers that listen on TCP ports implementing the BitTorrent protocol. Node nodes are Tracker clients/servers that listen on UDP ports implementing the distributed hash table protocol. DHT consists of Node nodes and location nodes that store Peers. The BitTorrent client includes a DHT node that is used to contact other Node nodes in the DHT to obtain the location of Peer nodes downloaded using the BitTorrent protocol.
Peer generally refers to the user end participating in file sharing, that is, the client program downloading or uploading specific files, which are usually at the same level. Each peer can download and upload files, making the entire network more decentralized, as no single node holds all the data. Node typically refers to the Tracker server, which manages and coordinates connections and data transfers between various Peers through the Tracker protocol. The Tracker server maintains a list of active Peers and sends it to other Peers to help them establish connections and find other Peers that can share files. Therefore, in BitTorrent, Peer refers to the client program that downloads and uploads files, while Node refers to the Tracker server used to coordinate connections and file sharing between Peers.
Each Node node has a globally unique identifier called Node ID. The Node ID is randomly selected from the same 160-bit space as the BitTorrent info hash. A distance metric is used to compare two Node IDs or a Node ID with proximity comparison info hashes. Node nodes must maintain a routing table containing contact information for a small number of other nodes. As the ID gets closer to the node's own ID, the routing table becomes more detailed. Nodes know many other nodes in the DHT whose IDs are close to their own ID, but only a few downloaders' IDs are far from their own ID.
In Kademlia, the distance metric is XOR, and the result is interpreted as an unsigned integer. distance(A,B) = |A xor B|; the smaller the value, the closer the nodes are.
When a Node node wants to find the Peer node of a seed, it uses the distance metric to compare the seed's info hash with the IDs of nodes in its routing table. It then contacts the nodes it knows that have IDs closest to the info hash and requests them to provide the download information for the current downloading seed's Peer nodes. If the contacted node knows the seed's peer nodes, the peer contact information will be returned with the response. Otherwise, the contacted node must respond using the contact information of the node in its routing table that is closest to the seed's info hash. The original node iteratively queries nodes that are closer to the target info hash until no closer nodes can be found. After the search is complete, the client inserts its own peer downloader information into the response node with the ID closest to the seed's info hash.
infohash is a unique identifier represented by 40 hexadecimal characters, used to identify a torrent file. This identifier is generated by hashing the metadata of the torrent file, which includes information such as file name, file size, and number of files. When a client wants to download a torrent file, it sends a request to the tracker containing the infohash of that file. The Tracker will reply with a list of available seeds and peers, which can help the client connect to other clients and start downloading the file. Because the infohash is generated based on the content of the file, even if two torrent files have different names, as long as they contain the same data, their infohash will be the same.
The return value of a Peer node query includes an opaque value called a token. To allow the Node node to announce that its controlled Peer node is downloading the seed, it must provide the Token received from the same querying node in the most recent Peer node query. When a node attempts to announce a seed, the queried node checks the Token against the querying node's IP address. Since the Token is only returned by the querying node to the same node from which it received the Token, the Token must be accepted within a reasonable time after distribution. BitTorrent implementations use the SHA1 hash of the IP address connected to a Token, which changes every five minutes and accepts Tokens for up to ten minutes.
In BitTorrent, "Token" typically refers to a method authorized by the Tracker server to help clients obtain a one-time key or token for acquiring other peers and downloading required file chunks. When a BitTorrent client wants to join a specific Torrent download, it first sends a request to the Tracker to obtain a list of available peers. If the Tracker requires authentication, it may return a "Token" to the client as a one-time token that can only be used for this specific Torrent download. The client can use this Token to send additional requests to the Tracker, including updating the peer list and downloading file chunks. This Token can prevent unauthorized clients or peers from attempting to join the download and enhances the overall security of the BitTorrent network.
Routing Table
Each node maintains a routing table of known good nodes. Nodes in the routing table serve as starting points for queries in the DHT. Nodes from the routing table are returned in response to queries from other nodes.
Not all nodes we know are equal. Some are good, and some are not. Many nodes using DHT can send queries and receive responses but cannot respond to queries from other nodes. It is important that each node's routing table only contains known good nodes. A good node is one that has responded to one of our queries in the past 15 minutes. If a node has ever responded to one of our queries and has sent us a query in the past 15 minutes, it is also considered good. After being inactive for 15 minutes, a node becomes problematic. When nodes fail to respond to multiple queries in succession, they become bad. We prioritize good nodes over those with unknown status.
The routing table covers the entire node ID space from 0 to 2160. The routing table is divided into buckets, each covering a portion of the space. An empty table has one bucket, with an ID space range of min=0, max=2160. When a node with ID "N" is inserted into the table, it will be placed in the bucket where the minimum value is <= N < maximum value. An empty table only has one bucket, so any node must fit within it. Each bucket can only hold K nodes, currently set to 8, before it is considered "full." When a bucket is filled with known good nodes, no additional nodes can be added unless our own node ID falls within the range of the bucket. In this case, the bucket will be replaced with two new buckets, each covering half the range of the old bucket, and the nodes from the old bucket will be distributed between the two new buckets. For a new table with only one bucket, the entire bucket is always split into two new buckets, with ranges of 0..2159 and 2159..2160.
In the BitTorrent protocol, a Bucket refers to the downloader dividing the download task into multiple sub-tasks and assigning each sub-task to different remote uploaders (peers) to download data blocks. These data blocks are organized into Buckets, which can be seen as a collection of data blocks. Typically, a Bucket contains hundreds of data blocks, each ranging from 16KB to 64KB in size. The downloader attempts to obtain different Buckets from multiple uploaders to speed up the download. This method is also known as multi-source downloading, as the downloader downloads data from multiple sources simultaneously.
When a bucket is filled with good nodes, new nodes will simply be discarded. If any known nodes in the bucket are bad, one node will be replaced with a new node. If any suspicious nodes in the bucket have not been queried in the past 15 minutes, a ping operation will be performed on the least recently queried node. If the pinged node responds, the next least recently queried suspicious node will be pinged until a node fails to respond or all nodes in the bucket are known good. If nodes in the bucket fail to respond to pings, it is recommended to try again before discarding that node and replacing it with a new good node. This way, the table will be filled with stable, long-running nodes.
Each bucket should maintain a "last changed" attribute to indicate the "freshness" of its contents. When nodes in the bucket are pinged and respond, or when nodes are added to the bucket, or when nodes in the bucket are replaced with other nodes, the last changed attribute of the bucket should be updated. Buckets that have not changed in 15 minutes should be "refreshed." This is done by selecting a random ID within the bucket range and performing a find_nodes search on it. Nodes that can receive queries from other nodes typically do not need to refresh their buckets frequently. Nodes that cannot receive queries from other nodes usually need to refresh all buckets regularly to ensure that their tables contain good nodes when DHT is needed.
After inserting the first node into its routing table and upon subsequent startups, nodes should attempt to find the closest nodes to themselves in the DHT. This is achieved by sending find_node messages to increasingly closer nodes until no closer nodes can be found. The routing table should be preserved between calls of the client software.
BitTorrent Protocol Extensions#
The BitTorrent protocol has been extended to exchange node UDP port numbers between Peer nodes introduced by the Tracker. In this way, clients can automatically seed their routing tables by downloading regular seeds. A newly installed client attempting to download trackerless seeds will have no nodes in its routing table, and the seed file will also require node information.
Peer nodes supporting DHT set the last bit of the 8-byte reserved flag exchanged in the BitTorrent protocol handshake. When a Peer node receives a handshake indicating that the remote Peer node supports DHT, it should send a PORT message. It starts with byte 0x09 and has a two-byte payload containing the UDP port of the DHT node in network byte order. The Peer node receiving this message should attempt to ping the node at the remote Peer node's receiving port and IP address. If a response to the ping is received, the node should attempt to insert the new contact information into its routing table according to the usual rules.
Torrent File Extensions#
A Trackerless seed dictionary does not have a "release" key. Instead, Trackerless seeds have a "nodes" key. This key should be set to K closest nodes in the seed-generating client's routing table. Alternatively, the key can be set to known good nodes, such as those operated by the person generating the seed. Please do not automatically add "Router.Bittorrent.Com" to the seed file or automatically add this node to the client routing table.
nodes = [["<host>", <port>], ["<host>", <port>], ...]
nodes = [["127.0.0.1", 6881], ["your.router.node", 4804], ["2001:db8:100:0:d5c8:db3f:995e:c0f7", 1941]]
KRPC Protocol#
The KRPC protocol is a simple RPC mechanism consisting of encoded dictionaries sent via UDP. A single query packet is sent, and a single packet is sent as a response, with no retries. There are three types of messages: query, response, and error. For the DHT protocol, there are four queries: ping, find_node, get_peers, and announce_peer.
Each KRPC message is a single hash table, with three common keys and additional key-value pairs provided based on the message type. Each message has a key "t," whose string value represents the transaction ID. This transaction ID is generated by the querying node and echoed in the response, so the response may be associated with multiple queries to the same node. The transaction ID should be encoded as a small string of binary digits, usually 2 characters are sufficient, as they cover 2^16 outstanding queries. Each message also has a key "y," which contains a single character value describing the message type. The value of the "y" key is "q" for queries, "r" for responses, and "e" for errors. Each message with a client version string should also include a key "v." Not all implementations include the "v" key, so clients should not assume its presence.
Contact Encoding#
A compact representation of Peer node contact information, also known as "compressed IP address/port information." In this encoding, a Peer node's contact information is represented as a 6-byte string. The first 4 bytes represent the Peer node's IP address, encoded in network byte order (i.e., big-endian); the last two bytes represent the Peer node's port number, also encoded in network byte order. Specifically, the arrangement of these 6 bytes is to first represent the IP address and then the port number, resulting in a string in the format of IP address + port number.
For example, if a Peer has an IP address of 192.168.1.100 and a port number of 6881, its contact information can be represented as a 6-byte string:
c0 a8 01 64 1a e1
where the first 4 bytes c0 a8 01 64 represent the IP address 192.168.1.100, and the last 2 bytes 1a e1 represent the port number 6881. The two parts are concatenated in the order of IP address first and port number second, resulting in this 6-byte string.
Node information is encoded differently. The contact information of a node is encoded as a 26-byte string, which contains 20 bytes of node ID and 6 bytes of IP address and port number information. This encoding is also known as "compact node information." Nodes can use this information to connect to other peers and share files.
Queries#
When sending a request, the "y" key in the message dictionary is set to "q," indicating that this is a query request. Additionally, the message dictionary must contain the "q" and "a" keys. The "q" key specifies the specific query type, while the "a" key contains the parameters required for the query.
Responses#
According to the KRPC protocol, the message dictionary contains a "y" key indicating the message type. If its value is "r," it indicates that this is a response message. In this case, the message dictionary must also contain an additional "r" key, whose value is a dictionary containing the named return values.
When a node receives a request message and successfully processes it, it will send a response message to reply to the requester. This response message follows the same format, but the value of the "y" key is "r," and it contains an additional "r" key, whose value is a dictionary containing the response return values. In this way, nodes can perform queries and replies, enabling basic communication and interaction.
Errors#
In the KRPC protocol, when the value of the "y" key in the message dictionary is "e," it indicates that this is an error message. In this case, the message dictionary will contain an additional "e" key, whose value is a list. The first element of this list is an integer representing the error code, and the second element is a string containing the error message. This situation typically occurs when a node receives a request message but cannot process it, or when the request cannot be completed for some reason. By sending an error message, the requester can understand the specific reason for the error and take appropriate action.
The following table describes possible error codes:
Error Code | Error Description |
---|---|
201 | General error |
202 | Server-side error |
203 | Protocol error, such as malformed packets, invalid parameters, or incorrect tokens |
204 | Unknown method error |
Example of an error data packet:
generic error = {"t":"aa", "y":"e", "e":[201, "A Generic Error Occurred"]}
bencoded = d1:eli201e23:A Generic Error Occurrede1:t2:aa1:y1:ee
DHT Queries#
All query and response messages contain an "ID" key, and the value of this key contains the ID of the node that issued the query or response. For query messages, the requester appends its own node ID as the value of the "ID" key in the message dictionary. When the receiving node receives the query message, it appends its own node ID as the value of the "ID" key in the response message so that the requester can determine which node the response came from upon receiving it.
By using the "ID" key, basic communication and interaction can be established between nodes, allowing each node to recognize and track queries and response messages from other nodes.
Ping#
The most basic query is a ping. “q” = “ping” A ping query has a single argument, “id,” whose value is a 20-byte string containing the sender's node ID in network byte order. The appropriate response to a ping has a single key “id” containing the node ID of the responding node. The most basic query is ping. “q” = “ping” ping queries have a single parameter, “id,” whose value is a 20-byte string containing the sender's node ID in network byte order. The response to the ping contains a single key “id” with the node ID of the responding node.
In the KRPC protocol, “ping” is one of the most basic query types used to check whether the target node is online and responding normally. The “ping” query contains a parameter “ID,” whose value is a 20-byte string representing the sender's node ID in network byte order. When a node receives a “ping” query message, it simply replies with a response message containing the node ID as confirmation. This response message contains only one key “ID,” whose value is the ID of the responding node.
By sending a “ping” query message and receiving a response message, nodes can determine whether the target node is online and ready to establish a connection. This is one of the most basic communication methods in establishing a P2P network.
arguments: {"id" : "<querying nodes id>"}
response: {"id" : "<queried nodes id>"}
Example data packet:
ping Query = {"t":"aa", "y":"q", "q":"ping", "a":{"id":"abcdefghij0123456789"}}
bencoded = d1:ad2:id20:abcdefghij0123456789e1:q4:ping1:t2:aa1:y1:qe
Response = {"t":"aa", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}}
bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t2:aa1:y1:re
Find_node#
In the KRPC protocol, “find_node” is used to find contact information for a given node ID. The “find_node” query contains two parameters: “ID” representing the requester node's ID and “target” representing the ID of the target node being queried.
When a node receives a “find_node” query message, it determines the response content based on its routing table and the ID of the target node being queried. If the node has contact information for the target node being queried, it will append it as compact node information in string format to the “nodes” key in the response message. If the node does not have contact information for the target node being queried, it will return the K (usually 8) closest nodes' compact node information from its own routing table to the queried target node ID.
By using the “find_node” query message and receiving a response, nodes can learn about the contact information of other nodes and build and maintain the network topology.
arguments: {"id" : "<querying nodes id>", "target" : "<id of target node>"}
response: {"id" : "<queried nodes id>", "nodes" : "<compact node info>"}
Example data packet:
find_node Query = {"t":"aa", "y":"q", "q":"find_node", "a": {"id":"abcdefghij0123456789", "target":"mnopqrstuvwxyz123456"}}
bencoded = d1:ad2:id20:abcdefghij01234567896:target20:mnopqrstuvwxyz123456e1:q9:find_node1:t2:aa1:y1:qe
Response = {"t":"aa", "y":"r", "r": {"id":"0123456789abcdefghij", "nodes": "def456..."}}
bencoded = d1:rd2:id20:0123456789abcdefghij5:nodes9:def456...e1:t2:aa1:y1:re
Get_peers#
In the KRPC protocol, “get_peers” is used to obtain a list of nodes associated with a specific Torrent's infohash. The “get_peers” query contains two parameters: “ID” representing the requester node's ID and “info_hash” representing the infohash of the Torrent file being queried.
When a node receives a “get_peers” query message, it determines the response content based on the information it stores. If the node itself has peers associated with the queried Torrent's infohash, it will append the compact format information of these peers as strings to the “values” key in the response message. Each string represents the compact format information of a peer.
If the node does not have peers associated with the queried Torrent's infohash, it will return the K (usually 8) closest nodes' compact node information from its own routing table, appended as string format contact information to the “nodes” key in the response message. In either case, the response message contains a “token” key, whose value is a short binary string. This TOKEN key can be used as a required parameter for future “announce_peer” queries.
By using the “get_peers” query message and receiving a response, nodes can learn about the list of nodes associated with a specific Torrent's infohash and perform corresponding connections and interactions.
arguments: {"id" : "<querying nodes id>", "info_hash" : "<20-byte infohash of target torrent>"}
response: {"id" : "<queried nodes id>", "token" :"<opaque write token>", "values" : ["<peer 1 info string>", "<peer 2 info string>"]}
or: {"id" : "<queried nodes id>", "token" :"<opaque write token>", "nodes" : "<compact node info>"}
Example data packets:
get_peers Query = {"t":"aa", "y":"q", "q":"get_peers", "a": {"id":"abcdefghij0123456789", "info_hash":"mnopqrstuvwxyz123456"}}
bencoded = d1:ad2:id20:abcdefghij01234567899:info_hash20:mnopqrstuvwxyz123456e1:q9:get_peers1:t2:aa1:y1:qe
Response with peers = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "token":"aoeusnth", "values": ["axje.u", "idhtnm"]}}
bencoded = d1:rd2:id20:abcdefghij01234567895:token8:aoeusnth6:valuesl6:axje.u6:idhtnmee1:t2:aa1:y1:re
Response with closest nodes = {"t":"aa", "y":"r", "r": {"id":"abcdefghij0123456789", "token":"aoeusnth", "nodes": "def456..."}}
bencoded = d1:rd2:id20:abcdefghij01234567895:nodes9:def456...5:token8:aoeusnthe1:t2:aa1:y1:re
Announce_peer#
Announce that the peer controlling the querying node is downloading the seed on a port. announce_peer has four parameters: “id” contains the querying node's ID, “info_hash” contains the seed's info hash, “port” contains the port as an integer, and the “token” received from the previous get_peers query.
A node can query other nodes for the peer list of a specific seed file and inform other nodes that it is downloading that specific seed file through the announce_peer request, providing its port number and the previously received token to verify its identity. The queried node needs to verify that the token comes from the same IP address and store the querying node's IP address and port number under the corresponding infohash in its peer contact information. This way, other nodes can connect to that node to download the specific seed file.
An optional parameter implied_port, which can be 0 or 1. If this parameter exists and is non-zero, the port parameter should be ignored, and the source port of the UDP packet should be used as the peer's port. This is useful for peers behind NAT, as they may not know their external port, while supporting the uTP protocol, which accepts incoming connections on the same port as the DHT port.
arguments: {"id" : "<querying nodes id>",
"implied_port": <0 or 1>,
"info_hash" : "<20-byte infohash of target torrent>",
"port" : <port number>,
"token" : "<opaque token>"}
response: {"id" : "<queried nodes id>"}
Example data packet:
announce_peers Query = {"t":"aa", "y":"q", "q":"announce_peer", "a": {"id":"abcdefghij0123456789", "implied_port": 1, "info_hash":"mnopqrstuvwxyz123456", "port": 6881, "token": "aoeusnth"}}
bencoded = d1:ad2:id20:abcdefghij012345678912:implied_porti1e9:info_hash20:mnopqrstuvwxyz1234564:porti6881e5:token8:aoeusnthe1:q13:announce_peer1:t2:aa1:y1:qe
Response = {"t":"aa", "y":"r", "r": {"id":"mnopqrstuvwxyz123456"}}
bencoded = d1:rd2:id20:mnopqrstuvwxyz123456e1:t2:aa1:y1:re
Reference Links#
http://bittorrent.org/beps/bep_0005.html
https://hazelcast.com/glossary/distributed-hash-table/
https://en.wikipedia.org/wiki/Distributed_hash_table