Add more copyright lawsuits levied against OpenAI to the growing pile.
On Wednesday, news organizations the Intercept, Raw Story, and AlterNet filed lawsuits against OpenAI for copyright infringement in the Southern District of New York. The Intercept also includes Microsoft in the lawsuit, whose tool Copilot uses OpenAI’s model GPT-4. The lawsuits allege OpenAI (and Microsoft in the case of The Intercept) violated the Digital Millennium Copyright Act, which prohibits online service providers from removing copyright information from digital content.
ChatGPT’s ability to provide informed, conversational responses was built on the backs of human-created content scraped from the web through datasets like Common Crawl and OpenAI’s WebText and WebText 2. A December lawsuit by the New York Times against OpenAI and Microsoft claimed ChatGPT plagiarized verbatim texts from its stories, without credit or compensation. Similarly, in August 2023, class-action lawsuits were filed against Google and OpenAI for using individuals’ personal data to train the model.
The complaints accuse OpenAI of removing copyright information like authorship and titles, and avoiding paying licensing fees for work created by journalists. The Raw Story and AlterNet suit also claims OpenAI knowingly used copyrighted works because OpenAI created tools for publishers to block their works from being scraped for training data.
“When they populated their training sets with works of journalism, Defendants had a choice: they could train ChatGPT using works of journalism with the copyright management information protected by the DMCA intact, or they could strip it away,” said the lawsuits. “Defendants chose the latter, and in the process, trained ChatGPT not to acknowledge or respect copyright, not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights, and not to provide attribution when using the works of human journalists.”
This is likely not the last copyright infringement lawsuit case against OpenAI or other makers of generative AI tools. Soon after ChatGPT’s release, questions emerged about the training data that was used. And the proliferation of AI models and new tools like OpenAI’s video generator Sora.
Other news organizations are taking a different approach by negotiating licensing deals with OpenAI. The Associated Press and German media company Axel Springer both have deals with the ChatGPT maker.
However it all shakes out, the great AI copyright battle is in full swing.