ChatGPT – do you copy? A case analysis of Raw Story Media Inc. v. OpenAI Inc. et al.
25 November 2024
AUTHOR: CELINE BAKKER
Background:
A digital news outlet, Raw Story Media, brought an infringement claim in the United States District Court for the Southern District of New York in July 2024, alleging that OpenAI scraped its copyrighted articles from the web in order to train ChatGPT, an AI language model that generates text responses after being trained on large datasets. Raw Story claims that OpenAI did not obtain permission to use its content and that AI-generated outputs which paraphrase or summarize its news articles amount to derivative works that infringe Raw Story’s copyright.
The case primarily hinges on whether OpenAI’s use of copyrighted materials for training purposes falls under fair use, a legal doctrine in U.S. copyright law that allows limited use of copyrighted material without permission for purposes such as research, commentary, education, or news reporting, or whether such use amounts to a violation of copyright law.
Although judgment has been handed down dismissing Raw Story Media’s claim in favour of OpenAI, the case is still in its preliminary stages and may have many years of legal debate ahead of it. Nevertheless, it raises important legal questions regarding:
- Fair Use:
OpenAI argues that its use of publicly available data (including Raw Story’s articles) to train its AI models falls within “fair use” because the model does not reproduce the articles directly; the articles serve only as training input, which OpenAI characterises as a transformative use and not a copy of the original content.
OpenAI further contends that training AI models on publicly available data serves educational and research purposes, which are among the purposes the doctrine of fair use is intended to protect.
- Derivative Works:
Raw Story, on the other hand, asserts that the content generated by ChatGPT (whether paraphrased or rephrased) constitutes derivative works, as opposed to sufficiently distinct original works, because it closely resembles Raw Story’s articles and merely recasts them in a new form. On this view, the outputs infringe Raw Story’s exclusive rights as copyright holder.
- Potential Consequences for AI Companies:
The court’s ruling will have far-reaching implications for the AI industry, as it could establish important legal precedents regarding how AI companies use copyrighted content for training (i.e. either with or without obtaining explicit licenses from the content creators) and the limits of fair use.
Opinion and Commentary on the Digital Environment and Copyright in South African Law:
In South Africa, copyright law is governed by the Copyright Act 98 of 1978, which protects original literary, musical, and artistic works, and prescribes copyright principles similar to those found in international copyright regimes. However, the Act was drafted long before technologies like AI existed, and applying it to them raises challenges the legislature did not foresee.
Issues arising from the Copyright Act, 1978:
- Fair Dealing:
Unlike U.S. copyright law, which explicitly includes a fair use doctrine, South Africa’s Copyright Act provides a fair dealing exception. Fair dealing allows the use of copyrighted works for certain listed purposes such as research, teaching, and private study, but is narrower than fair use. A defence of transformative use in the AI context therefore cannot be raised as easily before the South African courts, since the Act makes no provision for a broad, open-ended exception covering transformative works.
- AI and Copyright:
The question of AI-generated content and ownership is less clear under South African law. If an AI model generates content based on copyrighted materials, it raises questions about whether the AI is creating a derivative work or whether the output is sufficiently original to be considered a new work in its own right. South African law will need to evolve to address these concerns, particularly as AI becomes more integrated into sectors like journalism, entertainment, and education.
- The Balance Between Innovation and Protection:
South African content creators and businesses could benefit from stronger protection of their works against unauthorized use by AI companies. On the other hand, AI developers need access to large datasets to improve the performance of their models. A legal framework that balances these interests will be necessary to foster innovation while respecting the rights of content creators.
Conclusion:
The Raw Story v. OpenAI case raises fundamental questions about copyright law in the digital age, especially as it pertains to emerging AI technologies and the use of large-scale datasets that include publicly available content.
As AI technologies evolve, copyright holders (such as news outlets and content creators) are increasingly concerned that their intellectual property is being exploited without permission, recognition or compensation, especially when the AI outputs are monetized or lead to products that directly compete with the content creators’ own work.
On the other hand, AI companies argue that training AI models on publicly available data is analogous to traditional research uses, such as using books or articles to develop academic work or software.
In South Africa, the principles of fair dealing and copyright protection for digital works are similarly underdeveloped in relation to new technologies. The South African legislature may need to update the Copyright Act to better address the challenges posed by AI, ensuring that there is a balance between encouraging technological innovation and protecting the economic interests of content creators.
The case underscores the need for a modernized copyright framework that accounts for the realities of the digital economy and AI development, ensuring that both creators and innovators can benefit from the growing role of technology in creative industries.
Celine Bakker, Author.
BA Law, LLB
Associate
Find out more about Celine Bakker, an Associate at SL Law.
Celine is an associate specialising in Tech and Artificial Intelligence Law. She graduated from Stellenbosch University with a BA (Law), majoring in German, and completed her LLB degree at the Vrije Universiteit Amsterdam, focussing on Tech and Artificial Intelligence Law.