Thursday, November 21, 2024

Data Collection + Data Reproduction == Copyright Infringement!

 

It is utterly amusing that some AI companies are now issuing such bizarre statements: that Data available in the Public domain is presumed to be open to everyone. In one case, a billionaire founder even said that if your Data is not making money in the public domain, it is better for it to be used by the AI companies to train their models than for the owner to object to it! And such bizarre statements have been echoed by many in the mainstream too. I won’t publish the specific names.

Now, as I’ve already explained in several blogs, every article of the BERNE convention has basically been thwarted. For elucidation purposes, I will try to explain once again, briefly, some aspects of copyright below.

Indeed, Ideas are not copyrightable until they have been put down on paper and shown to work, at least on a theoretical basis! Similarly, Themes aren’t copyrightable, unless a certain degree of similarity has been proven to amount to infringement.

Similarly, Facts aren’t copyrightable: for example, events that occurred in the public domain, such as political news. That holds unless the entire draft, video footage, or story has been ripped from the reporting of those Facts.

But there’s another term that comes into the picture, and that is -> Expressions. They are absolutely Copyrightable, even if they are in the Public Domain for FREE FREE FREE and are not being used by the owner for monetary purposes.

Further, I’ve also tried to explain in my previous blogs why many videos that report current events, or offer mere opinions on them, may not be subject to Copyright: in a video, all the essential features need to be Copyrighted individually. Otherwise, they merely come under the purview of Fair Dealing / Fair Use, no more no less, as long as they are not used for monetary or trading purposes, and as long as they are neither transformative in nature nor derivative works made with permission already taken.

That’s why I raised this concern with YouTube as well, when they were monetizing almost anything, despite not all of the features being subject to Copyright.

The use of DATA from the WEB to train AI models, LLMs, was restricted to Scientific Purposes and R&D activities only. No More No Less! YET, somehow, they all ended up becoming Billionaires by manipulating the entire DATA. That’s why I call them DATA Pirates who became Millionaires, Billionaires through piracy.

Now, to justify their use, they are hiring legal eagles and spreading statements that, since the entire content is available on the Web or is open to the public, it isn’t subject to copyright, which is absolutely WRONG! Shakespeare’s works are available in the public domain, BUT can anyone copy-paste Romeo and Juliet and publish it in his/her own name, giving the same argument? WRONG!

My only request is: if you’ve violated the BERNE convention, no issues; no one will ask you for money! But please, at least don’t come out and make such statements that DATA in the Public domain is open to the Public.

Even to publish ORPHAN works, or to publish under Compulsory Licenses etc., a request needs to be filed and permission needs to be taken from the Registrar of Copyrights. And if DATA is being collected and used for Wrong reasons, then that comes under the IT Act. And it is not Global AI Companies alone; people have criticized Bollywood too, in the past, for copying Music/Scripts/Themes.

FYI! 😊

© Pranav Chaturvedi
