A bill introduced in the US Congress on Tuesday aims to force artificial intelligence companies to reveal the copyrighted material they use to make their generative AI models. The legislation adds to a growing number of attempts by lawmakers, news outlets and artists to establish how AI firms use creative works like songs, visual art, books and movies to train their software – and whether those companies are illegally building their tools off copyrighted content.
The California Democratic congressman Adam Schiff introduced the bill, the Generative AI Copyright Disclosure Act, which would require that AI companies submit any copyrighted works in their training datasets to the Register of Copyrights before releasing new generative AI systems, which create text, images, music or video in response to users’ prompts. The bill would require companies to file such documents at least 30 days before publicly debuting their AI tools, or face a financial penalty. Such datasets encompass billions of lines of text and images or millions of hours of music and movies.
“AI has the disruptive potential of changing our economy, our political system, and our day-to-day lives. We must balance the immense potential of AI with the crucial need for ethical guidelines and protections,” Schiff said in a statement.
Whether major artificial intelligence companies worth billions have made illegal use of copyrighted works is increasingly the source of litigation and government investigation. Schiff’s bill would not ban AI from training on copyrighted material, but would put a sizable onus on companies to list the massive swath of works that they use to build tools like ChatGPT – data that is usually kept private.
Schiff’s bill, which was first reported by Billboard, has received the support of numerous entertainment industry organizations and unions, including the Recording Industry Association of America, Professional Photographers of America, Directors Guild of America and the Screen Actors Guild-American Federation of Television and Radio Artists.
“Everything generated by AI ultimately originates from a human creative source. That’s why human creative content – intellectual property – must be protected,” said Duncan Crabtree-Ireland, SAG-AFTRA’s national executive director and chief negotiator.
Prominent artificial intelligence companies such as OpenAI are facing lawsuits over their alleged use of copyrighted works to build tools like ChatGPT. Both Sarah Silverman and the New York Times have filed copyright infringement claims against the startup. OpenAI has hired a raft of top lawyers in the past year, according to the Washington Post, as it gears up to face over a dozen major lawsuits.
OpenAI and other artificial intelligence companies have denied wrongdoing and claimed that their use of copyrighted material falls under fair use, a legal doctrine that allows for some unlicensed use of copyrighted materials under certain conditions. The legal strategy poses a major test for copyright law, and the outcome could seriously damage either artists’ livelihoods or OpenAI’s bottom line. In a submission to a UK government committee earlier this year, lawyers for OpenAI contended that “legally, copyright law does not forbid training.” OpenAI also stated in that submission that, without access to copyrighted works, its tools would cease to function.
As generative AI companies have expanded their tools’ capabilities, entertainment industry workers have also pushed back against the technology and its potential threat to artists’ rights. Last week, a group of over 200 high-profile musical artists released an open letter calling for increased protections against AI and demanding that companies not develop tools that could undermine or replace musicians and songwriters.