

Voice of Experience: April 2024

Is Copyrighted Material Used by AI?

Ashley Hallene and Jeffrey M Allen


  • The absence of universal guidelines for using copyrighted materials to train AI raises serious legal questions.
  • The debate over training AI systems with copyrighted works places copyright holders and AI development proponents in opposition.
  • The advancement of AI opens new markets and opportunities for economic growth.


The use of copyrighted works to train artificial intelligence (AI) systems raises complex legal and ethical questions, particularly concerning the rights of copyright holders. The landscape around this issue is evolving rapidly, but a clear picture has not yet emerged.

Legal uncertainty and variations across jurisdictions

Currently, there are no universal guidelines, and no consensus, on the use of copyrighted material for training AI. The legality of using copyrighted works without permission for this purpose varies by jurisdiction and is subject to ongoing legal debate and interpretation. In some jurisdictions, such as the United States, the doctrine of fair use may allow the use of copyrighted works without permission under certain conditions. The factors courts weigh include the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for or value of the work.

For example, consider a university research team that is developing AI to analyze literary works and identify themes and patterns that could help in teaching literature. The team uses excerpts from copyrighted novels, poems, and essays to train the AI, aiming to enhance educational tools and methodologies. Now consider how the four fair use factors might apply:

  1. Purpose and Character of the Use: The AI is being developed for a non-commercial, educational purpose, which is a factor often favoring fair use. Additionally, if the AI's analysis provides new insights or is transformative—meaning it adds new expression or meaning to the original works—this also supports a fair use argument.
  2. Nature of the Copyrighted Work: Using works that are highly creative (like novels and poems) could weigh against fair use since copyright protection is stronger for creative works than for factual ones. However, the educational context and the potential for transformative use could mitigate this factor.
  3. Amount and Substantiality of the Portion Used: If the research team uses only a small portion of each work to train the AI, this factor might favor fair use. However, if large portions or the "heart" of the works are used, this could weigh against fair use.
  4. Effect on the Market or Value of the Work: If the AI's training and resultant educational tool do not substitute for the original works and do not affect the market for those works, this factor may favor fair use. The argument is stronger if the AI's use could potentially increase interest in the original works.

Microsoft, GitHub, and OpenAI are currently being sued in a class action lawsuit alleging that they violated copyright law by allowing GitHub Copilot, an AI coding assistant, to train on billions of lines of public code and to regurgitate licensed code snippets without providing credit. Popular AI art generation tools from Midjourney and Stability AI are the focus of a lawsuit alleging that they infringed on millions of artists' rights by training their tools on web-scraped images. Getty Images has joined in as well, suing Stability AI for reportedly using millions of images from its site without permission to train its AI art generator, Stable Diffusion.

Arguments supporting copyright holders

The debate over training AI systems with copyrighted works places copyright holders and AI development proponents in opposition, each bringing their own arguments and concerns to the table. Designed to foster creativity, innovation, and information sharing, copyright protections allow creators to control the use of their work and to profit financially from their creativity. Copyright offers a way for creators to earn from their works, thereby encouraging the production of new content, ideas, and innovations. This incentive structure aims to spur artistic, literary, and scientific creation for society's benefit.

  1. Protection of Intellectual Property: Copyright holders maintain that the law protects their works as personal or corporate assets. They view AI companies' unauthorized use as a form of theft or infringement, which undermines creators' and owners' legal rights.
  2. Financial Compensation: Many creators rely on the royalties and licensing fees from their works to make a living. If AI replicates their style or content without offering compensation, this could drastically affect their income.
  3. Moral Rights: Beyond financial issues, authors often claim moral rights, including the right to attribution and the right to integrity, which protects their works from derogatory treatment. They contend that AI's use of their works could infringe upon these rights, particularly if AI-generated content distorts or misrepresents their original intentions.
  4. Devaluation of Human Creativity: The ease and speed with which AI can churn out art raises fears of flooding the market, which might devalue the uniqueness and worth of human-made art and possibly reduce the cultural value we attribute to human creativity. Art embodies more than just aesthetics; it reflects society, culture, and the times. Human artists grasp and convey the essence of their era, capturing their struggles, joys, and the socio-political context of their time—something AI fails to replicate. Take Picasso's Guernica, for instance, which powerfully captures the horrors of war and comments on the Spanish Civil War's impact. Should AI begin producing works that only superficially resemble such masterpieces, lacking context, meaning, or emotional depth, it could erode the cultural value and narrative power of new artworks. Consequently, the public might start viewing art as merely decorative, rather than a profound expression of human experience and history.

Arguments supporting AI and its development

Developing AI pushes technological advancement forward, benefiting society. By introducing new techniques and possibilities for creative expression, AI drives progress in various fields, including art.

  1. Access to Cultural Materials: When we train AI with existing artworks, we're opening culture and knowledge to everyone. It's like throwing open the doors to a vast art gallery, inviting people from all walks of life to come in, explore, and see art in exciting new ways. It's not just about making art more accessible; it's about sparking fresh appreciation and engagement across society.
  2. Educational Benefits: Just as students draw lessons from existing works, we can view AI as "learning" from human creativity to produce something new. This learning process can give rise to educational tools, accessibility solutions, and new forms of artistic expression, enriching the cultural landscape.
  3. Legal Precedents for Transformative Use: Some argue that training AI with copyrighted works qualifies as "fair use" or "fair dealing," especially when the AI's output is transformative enough that it does not compete with the original works in the market or serve as a substitute.
  4. Economic Opportunities: The advancement of AI opens new markets and opportunities for economic growth. This includes the creation of new jobs in tech, new types of art and entertainment, and potential collaborations between human artists and AI systems that could lead to innovative works.

As you can see, the debate on using copyrighted works to train AI systems is tangled in a web of legal and ethical complexities. The landscape is swiftly changing, and it is too early to tell where the chips will fall on this issue.