Thursday, December 25, 2025

Who has the legal burden of filtering copyrighted materials before large language models are trained on gigantic datasets?

This legal question is becoming more and more pressing! In the name of human progress, the courts in Western countries need to clarify it fast.

As a classical liberal, I argue that copyright terms for authors are far too long in Western countries. I bet many people do not know how long they are! For example, according to Google: "In Germany, copyright protection generally lasts for the author's entire life plus 70 years after their death, aligning with EU standards". I am all for shortening these outrageous copyright terms!

It appears that various authors are picking the lowest-hanging fruit and the most lucrative targets for their legal claims. How dishonest is that? Are the authors greedy?

If, for example, illegal (pirated?) copies of copyrighted works have in many cases been freely available to anyone on the Internet for years, why did these authors not prevent it?

"A new lawsuit filed by authors including John Carreyrou targets Anthropic, Google, OpenAI, Meta, xAI, and Perplexity, alleging the companies trained large language models on pirated copies of their books. The complaint argues that prior relief—namely the Anthropic $1.5 billion settlement offering roughly $3,000 per eligible writer—fails to address the “massive willful infringement” of using stolen books to build models that generate billions in revenue. This follows a judge’s earlier ruling that while training on pirated copies may be legal under current interpretations, the underlying act of book piracy is not, leaving a gap authors want courts to address directly. ...

In a parallel suit, Adobe faces a proposed class action alleging it trained its SlimLM small language model on SlimPajama-627B, which the complaint says is a derivative of RedPajama that includes the Books3 corpus of 191,000 books. ..."

Last Week in AI #330 - Groq->Nvidia , ChatGPT Apps, US AI Genesis Mission
