Nvidia is finding NeMo in hot water as the GPU giant and its LLM toolkit are being sued for copyright infringement in the latest in a long line of AI lawsuits
Just because it's on the Internet doesn't mean what's mine is yours.
If you've ever written a book and had it professionally published, you'll know that your work is protected by copyright laws. There are exemptions and limitations, such as fair use, but the rules are otherwise clear and strict. Now three American authors have filed a lawsuit claiming that Nvidia breached those laws by using their work, without permission, to train its LLM toolkit, NeMo.
Generative AI models, such as GPT-3, Llama, and DALL-E, require huge amounts of training data before they can be used in tools like ChatGPT and Copilot. NeMo is technically a framework for AI developers, making it easier for them to create, tweak, and distribute their own large language models (LLMs).
Even so, models built with it still have to be trained, and Nvidia also offers a range of pre-trained models in its cloud service. The Reuters report on the lawsuit (via Seeking Alpha) is a touch light on details, but it's early days for the case as it was only filed last week. The three authors in question (Brian Keene, Abdi Nazemian, and Stewart O'Nan) claim that one of the very large datasets Nvidia used for its training contains copies of their published works, and that those works were used without permission.
Normally in such legal cases, the defence focuses on it being an example of 'fair use' and Meta has even gone as far as to say that it's essentially no different to how a child learns by being exposed to speech and text around it.
On the other hand, those who have filed lawsuits in the past, such as the New York Times, have said that this is simply about the AI world not being willing to pay the due fees for works that are not only protected by copyright laws but have also been correctly registered with the appropriate authorities.
Defenders of generative AI typically have a different view: if you've read a multitude of books and then go on to write your own bestseller, is your work in breach of copyright? LLMs don't automatically reproduce exact copies of the material they were trained on, and if you've ever used something like Stable Diffusion and told it to draw you a famous painting, you'll get one like it, but not a picture that's a direct copy.
It's a complex situation, no doubt, but if this class action case is successful, it will almost certainly be followed by countless more, as the dataset in question contains nearly 200,000 novels, short stories, textbooks, and so on. All of that material is copyrighted, though not necessarily all of it has been registered.
Either way, the AI lawsuit train is showing no signs of slowing down and I should imagine a great number of writers, artists, musicians, and designers will be paying close attention to the outcome of this particular case. Choo, choo!
Nick, gaming, and computers all first met in 1981, with the love affair starting on a Sinclair ZX81 in kit form and a book on ZX Basic. He ended up becoming a physics and IT teacher, but by the late 1990s decided it was time to cut his teeth writing for a long-defunct UK tech site. He went on to do the same at MadOnion, helping to write the help files for 3DMark and PCMark. After a short stint working at Beyond3D.com, Nick joined Futuremark (MadOnion rebranded) full-time, as editor-in-chief for its gaming and hardware section, YouGamers. After the site shut down, he became an engineering and computing lecturer for many years, but missed the writing bug. Cue four years at TechSpot.com and over 100 long articles on anything and everything. He freely admits to being far too obsessed with GPUs and open-world grindy RPGs, but who isn't these days?