Hiltzik: Who’s winning in Sarah Silverman’s copyright suit against OpenAI?


If you’ve been following the war between authors and the purveyors of AI chatbots over whether the latter are infringing the copyrights of the former, you might have concluded that comedian and author Sarah Silverman and several fellow authors suffered a crushing blow in their lawsuit against OpenAI, the leading bot maker.

In her ruling Feb. 12, federal Judge Araceli Martínez-Olguín of San Francisco indeed tossed most of the copyright claims Silverman et al. had brought against OpenAI in lawsuits filed last year.

That’s the way much of the press portrayed the outcome: “Judge dismisses most of Sarah Silverman’s copyright infringement lawsuit” (VentureBeat). And “OpenAI Scores Court Victory” (Forbes). And “Sarah Silverman, Authors See Most Claims Against OpenAI Dismissed by Judge” (Hollywood Reporter).

Well, not really. Of the six counts in the authors’ lawsuit, one — whether OpenAI directly copied or distributed the plaintiffs’ works — wasn’t even before the judge because OpenAI hadn’t asked her to dismiss it. It’s a key allegation, and it’s still alive.

Of the other five, the judge cleared one to proceed; that’s a claim that OpenAI engaged in an “unfair” business practice under California law. She dismissed four others but gave the plaintiffs permission to amend their complaint and try again. The amended complaint is due before her by March 13.

At best, this is a mixed victory for both sides. But this lawsuit and a couple of other similar cases provide a road map for how the copyright issue may play out, in and out of court: with settlements that outline how much the artificial intelligence industry should pay copyright holders for using their works, and how those payments should be made.

Any such settlements would have to recognize that AI chatbots are here to stay, but also that they can’t mine published material for free.

“It’s hard to imagine that you could put the genie back in the bottle — that courts would decide that generative AI may not be used under any circumstances at any time,” says Robin Feldman, an expert in intellectual property law at UC College of the Law. “At the same time, it’s hard to imagine that generative AI could end up free to do whatever it wants at any time with copyrighted material.”

It’s fair to imagine, as well, that the issue is going to pose a headache for judges right up to the point that it lands before the Supreme Court, as Feldman believes is likely. That’s because of two aspects that are anything but cut-and-dried: copyright law and a new technology. U.S. copyright law is extremely complicated, and the technology bears features that don’t resemble anything seen in earlier technology transitions. Put them together, and the complexities are magnified exponentially.

Before going further, let’s define the landscape.

OpenAI is a high-tech firm with an investment from Microsoft that has been reported to be as much as $13 billion. Its best-known product is ChatGPT, a chatbot that spits out human-sounding answers to questions posed in plain language, though sometimes the “humans” it strives to emulate come off like idiots or plagiarists.

As I’ve reported, the chatbot business, like artificial intelligence research throughout its history, has been infected with hype. But it’s currently the target of a high-tech gold rush based on expectations that it will dramatically remake industries such as manufacturing, medicine, law — almost anything you can name. We’ll see.

As I’ve also reported, it’s a misnomer to call chatbots “artificial intelligence.” They’re not intelligent by any common definition of the word; they’re just good at seeming intelligent to an outsider unaware of the electronic processing inside them — a simulacrum of human thought, not the product of cogitation.

Chatbots don’t create content, as such. They have to be “trained” by pumping their databases full of human-produced content — books, newspaper articles, junk scraped from the web, etc. All this material allows the bots to generate superficially coherent answers to questions by generating prose patterns and sometimes repeating facts they dredge up from their databases.

That brings us back to the copyright issue. Silverman and other plaintiffs, including the writers Michael Chabon and Ta-Nehisi Coates, who filed a complaint similar to hers last year, contend that in using their works to train its chatbots, OpenAI is copying their works without permission, compensation or credit. Having “ingested” their works, the bots are “able to emit convincingly naturalistic text outputs.”

Indeed, Silverman’s lawsuit states that when asked to do so, ChatGPT is able to generate accurate summaries of the copyrighted works — “something only possible if ChatGPT was trained” on those works.

Among OpenAI’s defenses is that its use of copyrighted material falls within the exemption known as “fair use.” That’s a concept that allows snippets of published works to be quoted in reviews, summaries, news reports, research papers and the like, or to be parodied or repurposed in a “transformative” way.

OpenAI argues that previous court rulings say that creating copies of a copyrighted work as a preliminary step in developing a new, non-infringing product falls safely under the fair use protection, and that’s all it’s doing.

But it’s not at all clear that OpenAI’s interpretation will stand. In copyright law, fair use is a moving target, interpreted by judges on a case-by-case basis. “There are no hard-and-fast rules, only general guidelines and varied court decisions,” according to a digest by Stanford University librarians.

As chatbot developers snarf up more content to “train” their products, the potential copyright claims are only going to multiply. A disclosure: At least three of my books are in a database used to train some chatbots. I’m not a plaintiff in any of these lawsuits, but since they’re all fashioned as class actions in which I might qualify as a class member, it’s conceivable that if any go to trial and end with a class settlement, I might get a (probably vanishingly tiny) payout.

The lawsuits by individual writers are only one category. As I reported earlier, Getty Images has sued an AI company for copying millions of historical and contemporary photographs to which it holds licensing rights, allegedly to build a competing business. Dozens of music publishers have sued another AI firm for its “mass copying and ingesting” of copyrighted song lyrics to enable its bot to regurgitate them to its users by generating “identical or nearly identical copies of those lyrics” on request.

A lawsuit brought by New York Times Co. against Microsoft and OpenAI has attracted heavy attention not only because of the prominence of the plaintiff but because the newspaper produced evidence that OpenAI’s chatbot actually spits out lengthy verbatim passages from Times articles. This allows the Times to assert that the chatbot is cutting into the market for its work, a factor that judges have sometimes considered to reject a fair-use defense.

That’s a claim that the Silverman and Chabon lawsuits weren’t able to back up with evidence, which is what prompted Judge Martínez-Olguín to put some of their copyright claims on hold. She invited the plaintiffs to come back with allegations “that any particular output is substantially similar — or similar at all — to their books,” at which point she might reconsider.

Feldman observes that this entire legal issue is in the early “posturing” stage. The AI industry bases its defense on the principle that it’s doing nothing wrong and doesn’t owe creators anything. The creators say the principle is that what the chatbot developers are up to produces “an irreparable injury that cannot fully be compensated or measured in money,” to quote the Silverman lawsuit.

But money has settled previous donnybrooks over new technologies. Most notably, the recording industry and broadcasters solved their dispute over radio and television broadcasting of music with a licensing arrangement that was initially reached more than 80 years ago and has survived in its essence to cover not only radio and television stations but also “streaming services, concert venues, bars, restaurants, and retail establishments.” (That’s not to say that artists are necessarily fairly compensated for these uses.)

That’s the best bet for how the chatbot issue will unfold, in time: with a financial arrangement sufficiently fair to both sides to be blessed by a judge. Feldman advises not to buy into the assertions on both sides that with principles at stake, no financial arrangement is possible. The New York Times, indeed, says that it filed its lawsuit only after negotiations to place a financial value on the use of its content failed to produce a “resolution.”

Feldman cites an adage (often attributed to the turn-of-the-century humorist Kin Hubbard) that holds: “If someone tells you it’s not about the money but the principle, they’re really talking about the money.”


