Who’s the Owner?: AI and Copyright

The rapid growth of generative artificial intelligence in both popularity and capability continues to pose difficult questions for courts and governmental bodies. Copyright law has already seen a string of AI-related lawsuits this year alone, prompting the U.S. Copyright Office to study the law and policy issues these systems raise. On August 30, 2023, the Office published a notice of inquiry and request for comments in the Federal Register:

“The United States Copyright Office is studying the copyright law and policy issues raised by artificial intelligence (“AI”) systems. To inform the Office’s study and help assess whether legislative or regulatory steps in this area are warranted, the Office seeks comment on these issues, including those involved in the use of copyrighted works to train AI models, the appropriate levels of transparency and disclosure with respect to the use of copyrighted works, and the legal status of AI-generated outputs.”

This summer, lower courts across the country have dealt with these controversial questions. Two recent cases regarding the intersection of copyright law and AI are summarized below:

The use of copyrighted works to train AI models

Silverman, et al. v. OpenAI, Inc. 

Have writers “unwittingly built the foundation for Silicon Valley’s red-hot AI boom”?

Joining a string of lawsuits filed over the material used to train AI systems, a group of authors, including Sarah Silverman, an American comedian and writer, has sued OpenAI, Inc. (the creator of ChatGPT) for copyright infringement, alleging that their copyrighted books were used to train ChatGPT without permission.

Generative AI systems like ChatGPT “create content using large amounts of data scraped from the internet,” including books, which offer the “best examples of high-quality longform writing.” In 2018, OpenAI researchers acknowledged that “long stretches of contiguous text” allow “the generative model to learn to condition on long-range information.” The complaint claims that OpenAI obtained the plaintiffs’ copyrighted books for use as training data, without permission, through illegal “shadow libraries.”

At the heart of the case is the source of the data OpenAI used to train its generative AI system. The lawsuit is still pending, but it raises yet another set of questions at the intersection of copyright law and generative AI: Are the creators of systems like ChatGPT really using illegally accessed copyrighted works without permission? Is it possible for companies like OpenAI to train generative AI products without turning to “shadow libraries”? If not, should the use of these materials for training purposes be considered copyright infringement? And as AI systems continue to grow, how will the U.S. Copyright Office police this new use of copyrighted works?

The legal status of AI-generated outputs

Stephen Thaler v. Shira Perlmutter, et al.

Stephen Thaler, a 73-year-old inventor, created an artificial intelligence system called the “Creativity Machine,” which can generate original pieces of art. After the system generated a work titled “A Recent Entrance to Paradise,” Thaler applied for copyright registration through the Copyright Office. The Office denied the application because the work “lack[ed] the human authorship necessary to support a copyright claim.” 

It is a well-settled principle of copyright law that originality and human authorship are non-negotiable prerequisites to federal copyright protection, but should copyright law expand to recognize non-traditional forms of authorship like generative AI? Thaler argued that AI should be “acknowledge[d] . . . as an author where it otherwise meets authorship criteria, with any copyright ownership vesting in the AI’s owner.” He pointed out that copyright law has a history of embracing works created with newly developed technologies; after all, the Copyright Act states that copyright is available for “original works of authorship fixed in any tangible medium of expression, now known or later developed.”

The District Court for the District of Columbia ultimately reiterated the importance of human creativity, explaining that this requirement has remained consistent throughout the development of new technologies: works created with new tools or media are copyrightable only insofar as human creativity is channeled through those tools or media. However, “copyright has never stretched so far . . . as to protect works generated by new forms of technology operating absent any guiding human hand, as plaintiff urges here.” Under current copyright law, “authorship” is synonymous with human creativity.

Despite seeing this case as relatively clear-cut, the D.C. court noted the following:

“Undoubtedly, we are approaching new frontiers in copyright as artists put AI in their toolbox to be used in the generation of new visual and other artistic works. The increased attenuation of human creativity from the actual generation of the final work will prompt challenging questions regarding how much human input is necessary to qualify the user of an AI system as an ‘author’ of a generated work, the scope of the protection obtained over the resultant image, how to assess the originality of AI-generated works where the systems may have been trained on unknown pre-existing works, how copyright might best be used to incentivize creative works involving AI, and more.”

Caroline Kloster

Caroline Kloster is a 2L at the University of North Carolina School of Law.