For many years, Google has been bombarded with requests from copyright holders to remove allegedly infringing content from its indexes.
As reported here on TF last week, those requests have now reached astronomical levels: four billion links reported by 168,180 copyright holders against 2,283,811 separate domains.
Google honors most of the requests but rejects a fair few too, often due to the reported activity not actually being copyright infringement. However, when links are removed, users are informed of the fact via a note at the bottom of Google’s search results.
As the image above shows, when results are removed, the DMCA notice that caused the removal can be found on the Lumen Database, the online repository where some Internet companies file complaints for transparency purposes.
Anyone can click through and view the notices for themselves but this can be time-consuming, especially when researching a large number of links. It’s a problem the folks at ibit tried to solve this week with the release of a new browser extension.
Compatible with Chrome and Opera, Google Unlocked is open source and available via its GitHub repository. Its developer offers this simple introduction.
“The extension scans hidden links that were censored on Google search results due to complaints. The tool scans those complaints and extracts the links from them, puts the links back into Google results, all in matter of seconds,” he writes.
TF tested the extension (which isn’t available on the Chrome store) with a clean Opera install and found that it only asks for minimal permission to access Google domains, something confirmed by its developer.
“It only needs permission to access www.google.* domains so that it can inject the missing links back in the page. Under the hood, the extension checks the Google results for the word “complaint” and fetches the URL behind it with a simple XMLHttpRequest. It then parses those URLs and puts them back on the same page.”
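The two text-processing steps the developer describes, spotting the “complaint” link in the results and pulling URLs out of the fetched notice, can be sketched roughly as follows. This is a minimal illustration, not the extension's actual code: the function names `findComplaintLinks` and `extractUrls` are hypothetical, the regular expressions are assumptions about what the markup and notice text look like, and the network request and the injection of links back into the page are left out.

```typescript
// Hypothetical sketch; none of these names come from the Google Unlocked source.

/** Return the href of every anchor whose visible text mentions "complaint". */
function findComplaintLinks(resultsHtml: string): string[] {
  // Assumes double-quoted href attributes; real markup may differ.
  const anchor = /<a\b[^>]*\bhref="([^"]+)"[^>]*>([\s\S]*?)<\/a>/gi;
  const links: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = anchor.exec(resultsHtml)) !== null) {
    const href = match[1] ?? "";
    const text = match[2] ?? "";
    if (href !== "" && /complaint/i.test(text)) {
      links.push(href);
    }
  }
  return links;
}

/** Collect every distinct http(s) URL in the notice text, in order of first appearance. */
function extractUrls(noticeText: string): string[] {
  const found = noticeText.match(/https?:\/\/[^\s"'<>]+/g) ?? [];
  const unique: string[] = [];
  for (const url of found) {
    if (unique.indexOf(url) === -1) {
      unique.push(url);
    }
  }
  return unique;
}
```

In a real extension, the string passed to `extractUrls` would be the response body of the request made to each address returned by `findComplaintLinks`. A pattern match of this kind collects every URL that appears in the notice text, whatever role that URL plays in the complaint.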
Since by its very nature the tool searches for allegedly infringing links, we aren’t going to demonstrate those here. Suffice it to say, the tool does scan the Lumen Database as advertised, and all the removed links do get embedded in the search result page itself, in very large numbers in some instances.
However, we also discovered that Google Unlocked is helpful when researching invalid DMCA notices. That use, along with its ability to concisely display URLs from legitimate takedown complaints, uncovers a flaw in the system, one that cannot be solved easily, if at all.
Readers will perhaps recall that a poet by the name of Shaun Shane issued a heap of false DMCA notices against sites (this one included) that legitimately reported on his efforts to stop people writing about his poem. So, for fun, we typed the phrase “If only our tongues were made of glass” into Google, which informed us that a single result had been removed.
However, after pressing the Google Unlocked button, we were confronted with eight URLs injected by the extension, as shown below.
While these are indeed all of the URLs present in the notice that Google points to via the “read the DMCA complaint” link, most of them were either rejected by Google or are actually legitimate links provided by Shaun Shane himself.
Most DMCA notices filed with the company also include locations where the original source material can be found, so these are also parsed by Google Unlocked and presented as removed content, as the image below illustrates.
So, while Google Unlocked is very capable when it comes to ‘reinstating’ links removed by Google following a copyright complaint, it has some of the same issues suffered by many anti-piracy crawlers – it simply cannot differentiate between infringing and non-infringing content.
Given the simplicity of the extension and the complexity of the situation, this is not a problem Google Unlocked will ever be able to completely solve. So, while it does work as advertised in many scenarios, the reinstated URLs will nearly always include links pointing to legitimate sources or links that Google has thrown out because they are non-infringing.
That being said, Google Unlocked’s developer is inviting others to contribute to this interesting project, which may improve its performance over time.
“I put the source on Github and I hope to get more programmers to do pull requests to keep the extension up to date, since I know a lot of geeks will love this extension,” he concludes.
Update: The extension is now on the Chrome store and is also available for Firefox.