With over 200 million code repositories, GitHub takes pride in being the largest and most advanced development platform in the world.
As with other platforms that host user-generated content, this massive code library occasionally runs into copyright infringement troubles.
In some cases, people use code without permission from the creators, while others use GitHub to store pirated books or even music. And there are also developers whose projects are seen as pirate tools or apps, which often leads to copyright holder complaints.
A few high-profile cases have popped up over the years, including the RIAA’s takedown of YouTube-DL, which was later reversed. Other rightsholders were more successful, with GitHub taking down the Unblockit proxy service repositories, as well as the reverse-engineered GTA games “Re3” and “reVC”.
1,828 DMCA Notices Last Year
These examples are just the tip of the iceberg. GitHub’s latest transparency report reveals that the platform received a total of 1,828 valid DMCA takedown notices last year. Just a small number of these, 46, were retracted or reversed.
GitHub doesn’t just remove content in response to takedown notices. It can also reach out to developers before taking action, which means developers can sometimes make modifications to prevent entire repositories from going offline.
“That way, if the user removes or remediates the specific content identified in the notice, we avoid having to disable any content at all. This is an important element of our DMCA policy, given how much users rely on each other’s code for their projects,” GitHub notes.
GitHub already posts copies of all DMCA notices on its own website and, starting this year, it also sends copies to Lumen. This central database, managed by the Berkman Klein Center for Internet & Society at Harvard University, also archives copies of notices sent to Google, Twitter, and other platforms.
19,276 Projects Taken Down
These detailed notices also show how many GitHub projects were targeted last year. That figure is substantially higher than the number of notices, since a single notice can list dozens of projects.
In 2021, DMCA notices took down 19,276 GitHub projects. This can refer to complete repositories and subsets of code or individual files. In response to counternotices or reversals, 85 projects were reinstated, which means that 19,191 stayed down.
GitHub notes that this number is significantly lower than in 2020, when 36,173 projects were pulled offline. The company stresses that the number is also relatively small compared to the total number of repositories hosted on the site.
“The number 19,191 may sound like a lot of projects, but it’s less than .01% of the more than 200 million repositories on GitHub in 2021,” GitHub writes.
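For readers who want to check the math, the figures quoted above can be reproduced with a few lines of Python. This is only a rough sketch: the constant and function names are made up for illustration, and it assumes exactly 200 million repositories, whereas GitHub says "more than" 200 million, so the real share is somewhat lower.

```python
# Figures as quoted in this article from GitHub's 2021 transparency report.
NOTICES_2021 = 1_828
PROJECTS_TAKEN_DOWN_2021 = 19_276
PROJECTS_REINSTATED_2021 = 85
PROJECTS_TAKEN_DOWN_2020 = 36_173

# Assumption: "more than 200 million" is treated as exactly 200 million,
# which makes the computed share an upper bound.
TOTAL_REPOSITORIES = 200_000_000


def summarize():
    """Derive the headline numbers from the raw counts."""
    still_down = PROJECTS_TAKEN_DOWN_2021 - PROJECTS_REINSTATED_2021
    share_pct = still_down / TOTAL_REPOSITORIES * 100
    drop_pct = (
        (PROJECTS_TAKEN_DOWN_2020 - PROJECTS_TAKEN_DOWN_2021)
        / PROJECTS_TAKEN_DOWN_2020
        * 100
    )
    projects_per_notice = PROJECTS_TAKEN_DOWN_2021 / NOTICES_2021
    return {
        "still_down": still_down,
        "share_pct": share_pct,
        "drop_pct": drop_pct,
        "projects_per_notice": projects_per_notice,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:,.4f}")
```

Under that assumption, the 19,191 projects that stayed down work out to just under 0.01% of all repositories, in line with GitHub's statement. The takedown count is down by roughly 47% compared to 2020, and each notice targeted between ten and eleven projects on average.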
Supporting Developers’ Rights
There is no clear explanation for the drop in takedowns. However, GitHub believes that its response to the YouTube-DL debacle, after which it committed strongly to supporting developers’ rights, may have played a role.
“We are not able to determine the exact cause of the downtick, however, we suspect that a contributing factor is GitHub’s continued focus on standing up for developers’ rights, including the update to our DMCA review process in late 2020,” GitHub writes.
It’s also worth noting that, for the first time, GitHub provides details on its automated scanning filters. The platform doesn’t use these for copyrighted content but it does scan images for child abuse, extremist and terrorist content. This resulted in one hit last year.
“In 2021, out of millions of images scanned, we confirmed automated detection of one account with CSEAI, which was reported to the National Center for Missing & Exploited Children (NCMEC),” GitHub writes, noting that no terrorist or extremist content was found.
GitHub’s full transparency report is available here. In addition to what’s discussed above, it also includes more details on requests for user data, national security letters, takedown requests from governments, and more.