IPFS DHT #


One of the core components of IPFS is the Distributed Hash Table (DHT), which enables nodes to locate and retrieve content from other nodes on the network. The IPFS DHT is a distributed key-value store that maps content addresses (also known as CIDs) to the nodes that are currently storing that content. It works by dividing the content address space into small, hash-based “buckets” that are distributed across the network. When a node wants to find a piece of content, it queries the DHT with the content’s CID, and the DHT returns the nodes that are currently storing that content, in the form of Provider Records. This allows content to be located and retrieved in a decentralized, efficient, and fault-tolerant manner. The IPFS DHT is a key component of the IPFS network and is used by many IPFS implementations and tools, as well as a variety of decentralized applications and systems built on top of IPFS.
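
The mapping at the heart of this is a key-value table from CIDs to provider records. The snippet below is a deliberately simplified, illustrative sketch in plain Go (not the libp2p or Kubo API); in the real DHT the table is sharded across peers by Kademlia XOR distance rather than held in a single map, and all identifiers shown are hypothetical.

    package main

    import "fmt"

    // providerRecord identifies a peer currently storing a piece of content.
    type providerRecord struct {
        peerID string   // peer currently storing the content
        addrs  []string // multiaddresses where the peer can be reached
    }

    // dht is a toy stand-in for the network-wide provider-record table.
    type dht struct {
        providers map[string][]providerRecord // CID -> provider records
    }

    // provide announces that peerID can serve the content behind cid.
    func (d *dht) provide(cid, peerID string, addrs []string) {
        d.providers[cid] = append(d.providers[cid], providerRecord{peerID, addrs})
    }

    // findProviders answers the lookup: which peers currently hold cid?
    func (d *dht) findProviders(cid string) []providerRecord {
        return d.providers[cid]
    }

    func main() {
        d := &dht{providers: map[string][]providerRecord{}}
        d.provide("bafy...exampleCID", "12D3KooW...peerA", []string{"/ip4/203.0.113.7/tcp/4001"})
        fmt.Println(d.findProviders("bafy...exampleCID"))
    }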

Health #

As a distributed system, IPFS relies on the coordinated participation of multiple nodes and servers to function correctly. Monitoring the availability and long-term stability of DHT servers over time can give insight into the health of the network. High churn in the network can make content harder to locate and lead to longer retrieval times. Measuring DHT server availability and expected lifetimes can help assess the health and overall efficiency of the network.

Availability #

The Nebula crawler periodically attempts to connect to DHT Server peers in the IPFS DHT. When a new DHT Server peer is discovered, the crawler records the start of a session of availability and extends the session length with every subsequent successful connection attempt. A failed connection attempt terminates the session, and a later successful attempt starts a new session. Peers can therefore have multiple sessions of availability within each measurement period.
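
The following sketch shows this session bookkeeping in Go. Type and field names are illustrative stand-ins, not Nebula's actual implementation: a successful dial extends the current session, a failed dial terminates it, and a later successful dial opens a new one.

    package main

    import (
        "fmt"
        "time"
    )

    // session covers a span of continuous availability of one peer.
    type session struct {
        start, lastSeen time.Time
    }

    // peerSessions tracks all availability sessions of a single peer.
    type peerSessions struct {
        closed  []session // terminated sessions within the measurement period
        current *session  // open session, if the peer is currently reachable
    }

    func (p *peerSessions) recordDial(ok bool, at time.Time) {
        switch {
        case ok && p.current == nil: // newly discovered or back online: start a session
            p.current = &session{start: at, lastSeen: at}
        case ok: // still reachable: extend the running session
            p.current.lastSeen = at
        case p.current != nil: // failed connection attempt: terminate the session
            p.closed = append(p.closed, *p.current)
            p.current = nil
        }
    }

    func main() {
        var p peerSessions
        now := time.Now()
        p.recordDial(true, now)                   // session starts
        p.recordDial(true, now.Add(time.Hour))    // session extended
        p.recordDial(false, now.Add(2*time.Hour)) // session terminated
        fmt.Println(len(p.closed), "closed session(s)")
    }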

In the following, a peer is classified as “online” if it was available for at least 80% of the measurement period. If a peer was available between 40% and 80% of the period, it is considered “mostly online,” while “mostly offline” indicates availability between 10% and 40% of the time. Any peer that was available for less than 10% of the period is classified as “offline.”
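
Expressed as code, the classification above reduces to a few thresholds on the fraction of the measurement period during which the peer was reachable (a minimal sketch, using exactly the boundaries stated in the text):

    package main

    import "fmt"

    // classify maps a peer's online fraction over the measurement
    // period to the availability classes used on this page.
    func classify(onlineFraction float64) string {
        switch {
        case onlineFraction >= 0.8:
            return "online"
        case onlineFraction >= 0.4:
            return "mostly online"
        case onlineFraction >= 0.1:
            return "mostly offline"
        default:
            return "offline"
        }
    }

    func main() {
        fmt.Println(classify(0.93)) // online
        fmt.Println(classify(0.25)) // mostly offline
    }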

DHT Server Availability #

Online: 27.62k (▼ 1.27%) · Mostly Online: 1,457 (▲ 24.21%) · Mostly Offline: 1,960 (▲ 4.53%) · Offline: 7.28k (▲ 25.63%). Data: week ending 11 Jun 2023. Source: Nebula.

DHT Server Availability, classified over time #

[Stacked area chart: weekly counts of DHT Server peers classified as Online, Mostly online, Mostly offline, and Offline. Data: 26 Feb 2023 to 11 Jun 2023. Source: Nebula.]

DHT Server Availability, classified by region #

[Bar chart: DHT Server availability classes (Online, Mostly online, Mostly offline, Offline) by region: Africa, Asia, Europe, North America, Oceania, South America, Uncategorised. Data: 4 Jun 2023 to 11 Jun 2023. Source: Nebula.]

Churn #

The following chart displays the count of unique Peer IDs that joined and left the network during the measurement period. The term “entered” refers to a peer that was offline at the start of the measurement period but came online during it and remained online through the end of the period. The term “left” refers to a DHT Server peer that was online at the start of the measurement period but went offline and did not come back online before the end of the measurement period.
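
A minimal sketch of these two counts, assuming each peer's online state is known at the start and end of the measurement period (illustrative only, not Nebula's code):

    package main

    import "fmt"

    // window records a peer's state at the boundaries of the period.
    type window struct {
        onlineAtStart, onlineAtEnd bool
    }

    // countChurn tallies peers that entered (offline at start, online at
    // end) and peers that left (online at start, offline at end).
    func countChurn(peers map[string]window) (entered, left int) {
        for _, w := range peers {
            switch {
            case !w.onlineAtStart && w.onlineAtEnd:
                entered++
            case w.onlineAtStart && !w.onlineAtEnd:
                left++
            }
        }
        return
    }

    func main() {
        peers := map[string]window{
            "peerA": {onlineAtStart: false, onlineAtEnd: true}, // entered
            "peerB": {onlineAtStart: true, onlineAtEnd: false}, // left
            "peerC": {onlineAtStart: true, onlineAtEnd: true},  // stayed
        }
        entered, left := countChurn(peers)
        fmt.Println("entered:", entered, "left:", left)
    }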

[Chart: unique peers that entered and left the network over time. Data: 15 May 2023 to 14 Jun 2023. Source: Nebula. y-axis: unique peers.]

The cumulative distribution of session lengths for peers found in the network is shown below. For each uptime value on the x-axis, the curve shows the fraction of sessions that lasted longer than that time.
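
A point on such a curve can be computed as follows (a minimal sketch; the plotting pipeline may differ in how it bins the data):

    package main

    import (
        "fmt"
        "time"
    )

    // fractionLongerThan returns the fraction of sessions whose length
    // exceeds the threshold t, i.e. the y value for x = t on the curve.
    func fractionLongerThan(sessions []time.Duration, t time.Duration) float64 {
        if len(sessions) == 0 {
            return 0
        }
        longer := 0
        for _, s := range sessions {
            if s > t {
                longer++
            }
        }
        return float64(longer) / float64(len(sessions))
    }

    func main() {
        sessions := []time.Duration{30 * time.Minute, 3 * time.Hour, 26 * time.Hour}
        fmt.Println(fractionLongerThan(sessions, 1*time.Hour)) // 2 of 3 sessions
    }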

[Chart: cumulative fraction of online peers vs. uptime (0h to 24h). Data: 15 May 2023 to 14 Jun 2023. Source: Nebula.]

Performance #

Measuring the time it takes to publish and retrieve information from the IPFS DHT is crucial for understanding the network’s performance and identifying areas for improvement. It allows us to assess the network’s efficiency in different regions, which is essential for global-scale applications that rely on the IPFS DHT. Measuring performance across different regions helps identify potential bottlenecks and optimize content delivery.

We have set up the following experiment to capture DHT lookup performance over time and from several different geographic locations. We run IPFS DHT Server nodes in seven different geographic locations. Each node periodically publishes a unique CID and makes it known to the rest of the nodes, which subsequently request it through the DHT (acting as clients). This way we capture the entire performance spectrum of the DHT, i.e., both the publish and the lookup performance from each location. In this section we present the average performance over all regions, as well as per region, for both the DHT Lookup Performance and the DHT Publish Performance.
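
One round of this experiment can be sketched as follows. The function and type names are hypothetical stand-ins (not the Parsec codebase): one node publishes a fresh CID and times the publish, then every other node times a DHT lookup for the same CID.

    package main

    import (
        "fmt"
        "time"
    )

    // node is a hypothetical handle to a measurement node in one region.
    type node struct {
        region  string
        provide func(cid string) // hypothetical: announce the provider record to the DHT
        lookup  func(cid string) // hypothetical: look the provider record up in the DHT
    }

    // measureRound times one publish on the publisher and one lookup per client.
    func measureRound(publisher node, clients []node, cid string) (publish time.Duration, lookups map[string]time.Duration) {
        start := time.Now()
        publisher.provide(cid)
        publish = time.Since(start)

        lookups = make(map[string]time.Duration)
        for _, c := range clients {
            t0 := time.Now()
            c.lookup(cid)
            lookups[c.region] = time.Since(t0)
        }
        return publish, lookups
    }

    func main() {
        noop := func(string) {} // stand-in for real DHT operations
        pub := node{region: "eu-central-1", provide: noop, lookup: noop}
        clients := []node{{region: "us-east-2", provide: noop, lookup: noop}}
        p, l := measureRound(pub, clients, "bafy...exampleCID")
        fmt.Println(p, l)
    }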

Lookup Performance #

Median: 0.554s (▼ 1.35%) · P90: 1.128s (▲ 0.63%) · P99: 1.551s (▲ 2.43%). Data: 11 Jun 2023. Source: Parsec.

The following plots show the distribution of timings for looking up provider records from various points across the world.
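
The summary figures above (median, P90, P99) are percentiles of this lookup-time distribution. A minimal sketch of reading a percentile off a sorted sample using the nearest-rank method (the measurement pipeline may use a different interpolation):

    package main

    import (
        "fmt"
        "math"
        "sort"
    )

    // percentile returns the nearest-rank percentile of the sample,
    // e.g. p = 0.5 for the median, 0.9 for P90, 0.99 for P99.
    func percentile(sample []float64, p float64) float64 {
        if len(sample) == 0 {
            return 0
        }
        s := append([]float64(nil), sample...)
        sort.Float64s(s)
        rank := int(math.Ceil(float64(len(s)) * p)) // nearest-rank method
        if rank < 1 {
            rank = 1
        }
        if rank > len(s) {
            rank = len(s)
        }
        return s[rank-1]
    }

    func main() {
        lookups := []float64{0.41, 0.52, 0.55, 0.61, 0.74, 0.98, 1.10, 1.32, 1.49, 1.60} // seconds
        fmt.Println(percentile(lookups, 0.5), percentile(lookups, 0.9))
    }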

Median DHT Lookup Performance over time #

[Chart: median lookup time (seconds) over time. Data: 16 Mar 2023 to 14 Jun 2023. Source: Parsec.]

DHT Lookup Performance Distribution #

[Chart: cumulative fraction vs. lookup time (seconds). Data: 15 May 2023 to 14 Jun 2023. Source: Parsec.]

DHT Lookup Performance Distribution, by region #

[Chart: cumulative fraction vs. lookup time (seconds) per region (af-south-1, ap-south-1, ap-southeast-2, eu-central-1, sa-east-1, us-east-2, us-west-1). Data: 15 May 2023 to 14 Jun 2023. Source: Parsec.]

Publish Performance #

Median: 7.01s (▲ 0.59%) · P90: 17.98s (▲ 0.80%) · P99: 75.8s (▲ 6.48%). DHT publish performance (week ending 11 Jun 2023 compared with previous). Source: Parsec.

The following plots show the distribution of timings for publishing provider records from various points across the world.

Median DHT Publish Performance over time #

[Chart: median publish time (seconds) over time. Data: 16 Mar 2023 to 14 Jun 2023. Source: Parsec.]

DHT Publish Performance Distribution #

[Chart: cumulative fraction vs. publish time (seconds). Data: 15 May 2023 to 14 Jun 2023. Source: Parsec.]

DHT Publish Performance Distribution, by region #

[Chart: cumulative fraction vs. publish time (seconds) per region (af-south-1, ap-south-1, ap-southeast-2, eu-central-1, sa-east-1, us-east-2, us-west-1). Data: 15 May 2023 to 14 Jun 2023. Source: Parsec.]

Participation in the DHT #

Measuring participation in the IPFS DHT is crucial to understanding the health and effectiveness of the network. Diverse and wide participation of software agents and peers helps ensure a robust and resilient network. Such diversity helps prevent centralization, provides greater redundancy, and increases the chances of content availability. Moreover, wide participation allows for a more efficient distribution of content, improves load balancing, and can lead to faster content retrieval.

Client vs Server Node Estimate #

The plot presented below illustrates an estimate of the number of peers that exclusively function as clients. This estimate is derived by deducting the number of unique Peer IDs visited by the Nebula crawler (i.e., DHT servers) from the total number of unique Peer IDs observed by the bootstrap nodes operated by Protocol Labs over the same period. The plot also shows the number of unique IP addresses observed by the Nebula crawler.
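
The estimate itself is a simple difference, sketched below. The intuition is that the bootstrap nodes see both clients and servers, while the Nebula crawler can only reach peers acting as DHT servers; the example numbers are illustrative only, not actual measurements.

    package main

    import "fmt"

    // estimateClients approximates the client-only population as the
    // difference between all peers seen at the bootstrap nodes and the
    // DHT servers reachable by the crawler.
    func estimateClients(uniquePeersAtBootstrap, uniqueServersFromNebula int) int {
        return uniquePeersAtBootstrap - uniqueServersFromNebula
    }

    func main() {
        // Illustrative numbers only.
        fmt.Println(estimateClients(400_000, 30_000)) // ~370,000 client-only peers
    }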

[Chart: number of unique peers over time for DHT Server IP Addresses, DHT Clients, and DHT Servers. Data: 5 Mar 2023 to 11 Jun 2023. Sources: Bootstrap+preload server logs; Nebula.]

DHT Server Software #

The Nebula crawler records the software agents announced by peers registered in the IPFS DHT. These peers act as DHT servers and store provider records pointing to content available from other peers in the network.

Most Frequent DHT Server Agents #

[Bar chart: number of unique peers seen in the DHT per announced software agent, on a log scale. A long tail of agents (e.g. ethtweet, ipfs-nucleus, estuary, hydra-booster, parsec, edgevpn, go-libp2p, storm, among others) is headed by go-ipfs and kubo. Data: 11 Jun 2023. Source: Nebula.]

Note that the x-axis in the above plot uses a log scale, which emphasizes the relatively small populations of other software agents compared with the much larger population of Kubo (previously known as go-ipfs) nodes in the DHT.

Active Kubo Versions #

Kubo is the most prevalent software used by peers participating in the DHT. It adheres to a regular release cycle that introduces new features and improvements in performance, stability, and security. Measuring the distribution of Kubo versions provides insight into the adoption rate of new features and improvements, as well as potential issues related to backward compatibility during protocol upgrades.
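
Kubo typically announces an agent string of the form "kubo/&lt;version&gt;" (older releases used "go-ipfs/&lt;version&gt;"), sometimes followed by a build suffix. The sketch below shows one way such strings can be collapsed into the release labels used in the chart that follows; it is illustrative, not Nebula's actual parsing code, and assumes that agent format.

    package main

    import (
        "fmt"
        "strings"
    )

    // releaseBucket collapses an announced agent string such as
    // "kubo/0.20.0/desktop" or "go-ipfs/0.9.1" into a release label
    // like "kubo 0.20.0". Agents without a version keep their raw name.
    func releaseBucket(agent string) string {
        parts := strings.SplitN(agent, "/", 3)
        if len(parts) < 2 || parts[1] == "" {
            return agent
        }
        return parts[0] + " " + parts[1]
    }

    func main() {
        fmt.Println(releaseBucket("kubo/0.20.0/desktop")) // kubo 0.20.0
        fmt.Println(releaseBucket("go-ipfs/0.9.1"))       // go-ipfs 0.9.1
    }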

Kubo Version Distribution #

[Bar chart: number of unique peers per release, from go-ipfs 0.4.21 through kubo 0.21.0. Data: 11 Jun 2023. Source: Nebula.]

Recent Kubo Versions Over Time #

In the following we show the change in the distribution of the nine most recent Kubo releases each week, grouping all prior releases into the “all others” category. When a new version is released, the oldest of the nine is moved into the “all others” category.

[Stacked area chart: unique peers per Kubo release over time (kubo 0.18.0 through kubo 0.21.0, plus “all others”). Data: 16 Apr 2023 to 11 Jun 2023. Source: Nebula.]
Last published on 15 Jun, 2023 at 3:02pm