
HAI Policy Briefs

November 2023

Foundation Models and Copyright Questions

Foundation models are often trained on large volumes of copyrighted material. In the United States, AI researchers have long relied on fair use doctrine to avoid copyright issues with training data. However, our U.S. case law analysis in this brief highlights that fair use is not guaranteed for foundation models and that the risk of copyright infringement is real, though the exact extent remains uncertain. We argue that the United States needs a two-pronged approach to addressing these copyright issues—a mix of legal and technical mitigations that will allow us to harness the positive impact of foundation models while reducing intellectual property harms to creators.

Key Takeaways

➜ Foundation models—AI models trained on broad data at scale for a wide range of tasks—are often trained on large volumes of copyrighted material. Deploying these models can pose legal and ethical risks related to copyright.

➜ Our review of U.S. fair use doctrine concludes that fair use is not guaranteed for foundation models, as they can generate content that is not sufficiently “transformative” relative to the copyrighted material. However, amid still-evolving case law, the extent of copyright infringement risk and the strength of a fair use defense remain uncertain.

➜ To mitigate copyright risks, policymakers should consider clarifying how fair use doctrine applies to AI training data, while also encouraging good-faith technical mitigation strategies that align foundation models with fair use standards. Together, these strategies can maximize the benefits of foundation models while minimizing the moral, ethical, and legal harms of copyright violations.

➜ In parallel, policymakers should investigate other policy mechanisms to ensure that artists, authors, and creators receive fair compensation and credit, both those who work with the assistance of AI tools and those who do not use AI.
