Millions of people use BitTorrent every day, but only a few know all the ins and outs of how it works.
Meanwhile, an even smaller group is actively involved in shaping the future of the file-sharing protocol.
BitTorrent was first made public by inventor Bram Cohen nearly two decades ago. While it was swiftly embraced by the masses, the protocol itself was far from perfect. Over the years many new features were added, including DHT, UDP trackers, peer-exchange, and support for streaming.
As developer of one of the leading BitTorrent libraries, Libtorrent, Arvid Norberg has been closely involved in the protocol’s development. It’s his code that makes a wide variety of torrent clients function properly. This includes uTorrent Web, Deluge, and qBittorrent.
Libtorrent 2.0 and BitTorrent v2
This week, Norberg announced the latest release of Libtorrent, version 2.0. This new version comes with many changes that will eventually make their way to torrent clients. The most crucial one is the implementation of the BitTorrent v2 protocol specification.
BitTorrent v2 is an improved version of the early BitTorrent standards and includes several technical changes. It was first proposed by Bram Cohen in 2008 and updated and improved along the way. Since most changes take place under the hood, the public at large won’t immediately recognize them, except for one.
Tech-savvy readers can get the complete lowdown from the Libtorrent site, but for the sake of simplicity, we will focus on how the changes will affect users.
V2 Torrents and Separate Swarms
BitTorrent v2 changes the way torrents are ‘compiled’ and the newer version is not backward compatible. Older torrents use SHA-1 hashes, while v2 torrents use SHA-256. This means that going forward, there will be different torrent versions.
These different (v1 and v2) torrents will also create separate torrent swarms. People who download a v1 torrent can’t share anything with people who download a v2 torrent and vice versa. While that sounds like a step back, the reality isn’t that bad.
There is an option to create so-called “hybrid” torrents that can connect to both swarms. These are basically two torrents in one. As a result, a hybrid torrent does not shrink the pool of people sharing: a v2-enabled client can still reach everyone.
“A v2-enabled client would still be able to talk to all peers, and peer-exchange would still work across v1 and v2-enabled peers. The main impact, I think, is that a v2-enabled peer would announce twice for a hybrid torrent, once for each info-hash. Both to trackers and the DHT,” Norberg tells us.
For now, publishers, including torrent sites, are best off using hybrid torrents. After all, torrents that only use the v2 specification will have access to a limited number of peers. Norberg agrees.
“I think it would make sense for publishers to generate hybrid torrents. At least experiment with it to ensure it works well. v2-only torrents would only make sense for closed ecosystems right now, where the publisher also controls all clients.”
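To make the “announce twice” remark concrete, here is a minimal Python sketch of how one hybrid torrent can end up with two info-hashes. It assumes, per our reading of the v2 specification (BEP 52), that both hashes are taken over the same bencoded info dictionary and that the SHA-256 hash is truncated to 20 bytes for tracker and DHT announces. The `bencode` helper, the `announce_hashes` function, and the toy info dictionary are illustrative only, not Libtorrent's API, and the dictionary is not a valid torrent.

```python
import hashlib


def bencode(value):
    """Minimal bencoder for ints, byte strings, text, lists and dicts."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Dictionary keys are byte strings, emitted in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError("cannot bencode %r" % type(value))


def announce_hashes(info):
    """Return the v1 and v2 info-hashes for one (hybrid) info dictionary."""
    encoded = bencode(info)
    v1 = hashlib.sha1(encoded).digest()    # 20 bytes, used by v1 peers
    v2 = hashlib.sha256(encoded).digest()  # 32 bytes, used by v2 peers
    # Assumption: announces carry the first 20 bytes of the v2 hash.
    return {"v1": v1, "v2": v2, "v2_truncated": v2[:20]}


# Toy info dictionary for illustration only; a real hybrid torrent
# carries both the v1 piece list and the v2 file tree.
toy_info = {"name": "example.bin", "piece length": 16384, "meta version": 2}
hashes = announce_hashes(toy_info)
for label, digest in hashes.items():
    print(label, digest.hex())
```

A v2-enabled client would announce both the `v1` and the `v2_truncated` values, which is why it can find peers in either swarm.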
Important Changes Under The Hood
While new torrents are the most visible change for outsiders, they are merely a byproduct of important changes under the hood. For example, the switch from SHA-1 to SHA-256 hashing guards against hash collisions, which can be used for attacks and exploits.
Norberg tells us that the risk of these attacks is mostly theoretical, but this may change over time. So changing to SHA-256 is certainly wise. An even more exciting change, according to the developer, is the use of ‘per-file merkle hash trees’ for the piece hashes.
In simple terms, this means that all files in a torrent will have their own unique identifier (hash). So, a collection of 100 photos will have a unique hash for each photo. This comes with several advantages.
For example, it will allow torrent clients to quickly check if they are receiving the right file. This prevents pollution attacks that can be used by outsiders to slow down torrent transfers.
“With the v2 hash trees, corrupt data will be detected immediately and the peer responsible for it can be disconnected. Currently, there’s a more complex heuristic involved in attributing corrupt data to a peer, which means a malicious peer can do slightly more damage before being disconnected,” Norberg says.
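For readers who want to see why a hash tree lets a client catch a bad block right away, the following Python sketch builds a small SHA-256 merkle tree over one file and checks a single block against the root. It is a simplified illustration, not Libtorrent's implementation: the 16 KiB block size and the zero-hash padding follow our reading of BEP 52, and all function names are made up for this example.

```python
import hashlib

BLOCK_SIZE = 16 * 1024  # assumed leaf size (16 KiB)


def sha256(data):
    return hashlib.sha256(data).digest()


def leaf_hashes(data):
    """Hash each block, then pad with all-zero hashes to a power of two."""
    leaves = [sha256(data[i:i + BLOCK_SIZE])
              for i in range(0, len(data), BLOCK_SIZE)]
    size = 1
    while size < len(leaves):
        size *= 2
    return leaves + [bytes(32)] * (size - len(leaves))


def build_tree(leaves):
    """Return every layer, from the leaves (first) up to the root (last)."""
    layers = [leaves]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        layers.append([sha256(prev[i] + prev[i + 1])
                       for i in range(0, len(prev), 2)])
    return layers


def proof_for(layers, index):
    """Collect the sibling hash at each layer for the leaf at `index`."""
    proof = []
    for layer in layers[:-1]:
        proof.append(layer[index ^ 1])  # the neighbour in the same pair
        index //= 2
    return proof


def verify_block(block, index, proof, root):
    """Recompute the path from one block up to the root and compare."""
    node = sha256(block)
    for sibling in proof:
        if index % 2 == 0:
            node = sha256(node + sibling)
        else:
            node = sha256(sibling + node)
        index //= 2
    return node == root


data = bytes(range(256)) * 160            # 40,960 bytes, so three blocks
layers = build_tree(leaf_hashes(data))
root = layers[-1][0]
proof = proof_for(layers, 1)
good = data[BLOCK_SIZE:2 * BLOCK_SIZE]
bad = b"x" + good[1:]                     # one corrupted byte
print(verify_block(good, 1, proof, root))  # True
print(verify_block(bad, 1, proof, root))   # False
```

Because each block can be checked on its own against the file's root, a client knows exactly which block, and therefore which peer, delivered corrupt data.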
Mutable Torrents and Merging Swarms
In addition, per-file hashes open the door for peers to get the same file from multiple torrents. This is already technically possible today, as BiglyBT’s ‘swarm merging’ feature shows, but with unique file hashes, it’s easier and more reliable.
“Doing that is technically possible today, but to make it work generally for arbitrary torrents is very complicated. Having ‘per-file merkle trees’ greatly simplifies implementing this,” Norberg notes.
The same is true for so-called ‘mutable torrents’ where publishers can update torrents to add or remove files. That’s much easier with BitTorrent v2.
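The file-matching idea behind swarm merging can be sketched in a few lines of Python. This is a hypothetical illustration, not how BiglyBT or Libtorrent implement it: each torrent is modelled as a plain mapping from file path to per-file hash, and a plain SHA-256 of the content stands in for the real merkle root.

```python
import hashlib
from collections import defaultdict


def file_root(content):
    # Stand-in for a v2 per-file merkle root: a plain SHA-256 of the content.
    return hashlib.sha256(content).hexdigest()


def shared_files(torrents):
    """Find per-file hashes that appear in more than one torrent.

    `torrents` maps a torrent name to a {path: per-file hash} mapping.
    Returns {hash: [(torrent name, path), ...]} for files that could be
    fetched from several swarms.
    """
    seen = defaultdict(list)
    for name, files in torrents.items():
        for path, root in files.items():
            seen[root].append((name, path))
    return {root: places for root, places in seen.items()
            if len({torrent for torrent, _ in places}) > 1}


beach = file_root(b"beach photo bytes")
torrents = {
    "holiday-2019": {"photos/beach.jpg": beach,
                     "photos/city.jpg": file_root(b"city photo bytes")},
    "best-of": {"beach_final.jpg": beach},
}
for root, places in shared_files(torrents).items():
    print(root[:12], places)
```

The same file is recognized in both torrents even though its path and name differ, which is the property that makes merging swarms and updating torrents simpler.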
Finally, we should mention that it’s not just the .torrent files that will change. The v2 and hybrid magnet links are different too. And they will likely start downloading more quickly, as the initial transfer of all piece hashes will be smaller. That is most noticeable when streaming or downloading large archives.
Just how soon the v2 torrents will work depends on when clients update to the latest Libtorrent version. That can take days, but it can also take more than a year. When large publishers and torrent sites will embrace the changes is uncertain as well, but eventually, it’s the way forward for all.
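As a rough illustration of how the links differ, the Python sketch below assembles v1, v2 and hybrid magnet links. It assumes, based on our reading of BEP 52, that a v2 link carries the full SHA-256 info-hash as a hex-encoded multihash under `urn:btmh:` (prefix `1220`), while v1 keeps the familiar `urn:btih:` form. The `magnet_link` helper and the placeholder hashes are invented for this example.

```python
import hashlib
from urllib.parse import quote

# Multihash prefix: 0x12 means sha2-256, 0x20 means a 32-byte digest.
SHA256_MULTIHASH_PREFIX = "1220"


def magnet_link(v1_hash=None, v2_hash=None, name=None):
    """Build a magnet link from raw info-hash bytes (hypothetical helper)."""
    parts = []
    if v1_hash is not None:
        parts.append("xt=urn:btih:" + v1_hash.hex())
    if v2_hash is not None:
        parts.append("xt=urn:btmh:" + SHA256_MULTIHASH_PREFIX + v2_hash.hex())
    if not parts:
        raise ValueError("at least one info-hash is required")
    if name:
        parts.append("dn=" + quote(name))
    return "magnet:?" + "&".join(parts)


# Placeholder hashes for illustration; real ones come from the info dictionary.
v1 = hashlib.sha1(b"example").digest()
v2 = hashlib.sha256(b"example").digest()

print(magnet_link(v1_hash=v1, name="example"))               # v1 only
print(magnet_link(v2_hash=v2, name="example"))               # v2 only
print(magnet_link(v1_hash=v1, v2_hash=v2, name="example"))   # hybrid
```

A hybrid link simply lists both identifiers, so older clients can pick up the one they understand.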