The complaint was filed by three authors, Abdi Nazemian, Brian Keene, and Stewart O'Nan, who claim that books they wrote were among the material used to train the Megatron LLMs.
The lawsuit refers specifically to models that Nvidia released in September 2022, namely NeMo Megatron-GPT 1.3B, NeMo Megatron-GPT 5B, NeMo Megatron-GPT 20B, and NeMo Megatron-T5 3B, along with information about each model, including its training dataset. In this case, the information states that the models were trained on "The Pile" dataset prepared by EleutherAI.
According to the court filing, the Books3 dataset was available separately on Hugging Face until October 2023, when it was removed because it "is defunct and no longer accessible due to reported copyright infringement."