Knowing that most BitTorrent-based sharing is conducted in public, the site’s operators harvested torrents and captured the IP addresses connected to them.
When we discovered the project, YouHaveDownloaded had 103,200 torrents in its database and IP address data on 51.2 million users. That platform eventually shut down, but a similarly named site, IKnowWhatYouDownload, later emerged with similar functionality.
The tracking service has been entertaining and sometimes scaring BitTorrent users for years, matching IP addresses to infringing downloads and even providing lists of IP addresses relating to specific content. It can show the countries where a torrent proved most popular this month or reveal content becoming popular everywhere today.
Users with dynamic IP addresses researching themselves may be presented with false alarms, but as a broad research tool operating in an underserved niche, the service works as advertised.
Anyone who has spent more than five minutes on the site – pirates especially – will understand what it is for. It’s a service that harvests and then publishes data related to the BitTorrent ecosystem (specifically DHT), so if that’s your thing, you won’t be disappointed.
Those seeking pirate downloads will find absolutely nothing of interest. No torrents. No downloads. Not even a magnet link. Anti-piracy groups and leading entertainment companies arrived at a different conclusion five years ago and still haven’t changed their minds.
Anti-Piracy Experts Unite in Disagreement
After we first reported on IKnowWhatYouDownload in December 2016, anti-piracy companies started reporting the site to Google, claiming it infringed their clients’ rights.
DMCA notices spiked in February 2017 and began to level off a few months later. In late 2019, complaints to Google started to rise again, and in January 2021 they suddenly took off once more.
At the time of writing, more than 9,472 individual complaints targeting in excess of 18,800 URLs have been submitted to Google, alleging copyright violations that simply did not happen.
Making matters worse, close to 50% of all complaints filed with Google contain URLs that weren’t even present in Google’s indexes when the takedown notices were sent. The search engine usually indexes all pages quickly, but in this case the URLs couldn’t be indexed because they never existed in the first place.
The anti-piracy companies may have attempted to predict where infringing links would appear in the future, fabricated the URLs, and sent them to Google in advance, hoping that Google would bin them before they appeared in search results. That can work against pirate sites, but this is not a pirate site – it’s a database of piracy activity.
Other things make the continuous targeting of IKnowWhatYouDownload even more baffling.
Demo Project to Showcase Data Availability
While the service is a fully functioning BitTorrent data portal in its own right, it’s actually a live demo of what can be achieved using data collected by tech outfit PeerTrace. Due to the way the data is collected, it is not suitable for prosecuting BitTorrent users, but copyright holders who want to access the available data can do so.
PeerTrace data is also available to law enforcement agencies and, as we already know, is useful for people generally interested in how content is spread using BitTorrent, by whom, and where.
It’s the type of data that could prove useful to anti-piracy and entertainment companies, but beyond that, the site also drives legitimate consumption. Every page on the site referencing data for a specific movie carries links to legal streaming portal Kinopoisk.
Checking For Actual Infringement?
As things stand, there’s no sign that the copyright complaints will end anytime soon. French anti-piracy group ALPA, anime company Toei, Disney, Sky, Canal+, Columbia, Irdeto, Fox, Lionsgate, Sony, and Netflix have all filed infringement complaints – and that’s just a tiny sample of the 42-page list of rightsholders published by Google.
IKnowWhatYouDownload owner Andrey Rogov believes that the companies scan for filenames matching their content and consider a match good enough to file a complaint.
“I think that a lot of companies (copyright holders) implement automatic systems that search pages with torrents with their content (movie, series and other),” Rogov says.
“Usually, they write to us with automatic email and we answer that we don’t distribute content. But probably some just write reports to Google and that’s it. We don’t like it, of course, but I think we can do nothing with it.”
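To illustrate how a scan of the kind Rogov describes could misfire, here is a minimal Python sketch. It is not based on any anti-piracy company's actual system: the title list, the function names, and the sample page are all hypothetical, and real scanners are presumably more elaborate. The point is only that a filename match on its own says nothing about whether a page offers a download.

```python
import re

# Hypothetical catalogue of titles a rightsholder might scan for.
PROTECTED_TITLES = ["Example Movie", "Sample Series"]


def normalize(text: str) -> str:
    """Lowercase and collapse every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def naive_title_match(page_text: str, titles=PROTECTED_TITLES) -> list[str]:
    """Return the titles whose normalized form appears in the page text."""
    haystack = f" {normalize(page_text)} "
    return [t for t in titles if f" {normalize(t)} " in haystack]


def has_download_link(page_html: str) -> bool:
    """Rough check for a magnet link or a .torrent file reference."""
    pattern = r"magnet:\?xt=urn:btih:|\.torrent\b"
    return bool(re.search(pattern, page_html, re.IGNORECASE))


# Hypothetical statistics page: it names a release but links to nothing.
stats_page = "<h1>Example.Movie.2021.1080p.WEB-DL</h1><p>Peers seen today: 120</p>"

print(naive_title_match(stats_page))  # the title matches
print(has_download_link(stats_page))  # no magnet link or .torrent reference
```

A scanner that stops at the first check would flag this page, while the second check finds nothing to download.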
One thing we considered early on is that copyright holders might be scanning not only for filenames but also for BitTorrent hash values. In itself, publishing hashes is not an infringement of copyright, but if a filename referencing pirated content appears on the same page as an ‘infringing’ hash value, that page is more likely than not to belong to a pirate site.
Unfortunately, that doesn’t provide a credible explanation either. Rather than displaying the hash values of potentially infringing content, the hash values shown on Rogov’s site (including in URLs) are internally generated and definitely not BitTorrent hashes.
Since Google is required to remove content following complaints, around 46% of the URLs submitted in DMCA notices so far have indeed been removed from Google. That raises the question of when IKnowWhatYouDownload’s search ranking will suffer after being incorrectly labeled a pirate site.