Emerging Litigation Podcast

Artificial Intelligence Meets Copyright Law with Ryan Phelan and Tiffany Gehrke

Tom Hagy Season 1 Episode 104

What are the implications of recent court decisions for artificial intelligence systems trained on copyrighted materials?

In this episode I get to speak with two veteran repeat guests of the podcast about two important cases dealing with fair use analysis in the context of large language model training.

Here are a couple of highlights: 

• Courts found AI training to be "transformative use" because the process changes the works significantly through tokenization and processing.
• Judges distinguished between legally obtained training data (dismissed claims) and pirated training data (allowed claims to proceed).
• Both judges signaled that if plaintiffs had focused on AI outputs reproducing substantial portions of their works, outcomes might have been different.
• The transformative nature of AI training was deemed significant enough to qualify as fair use even for commercial enterprises.
• One judge noted that if copyrighted works are essential for training models worth "billions, if not trillions," developers must find ways to compensate copyright holders.

About Our Guests

Ryan Phelan and Tiffany Gehrke are recognized thought leaders in emerging technology law and artificial intelligence. Both are partners at Marshall, Gerstein & Borun LLP and returning guests on the Emerging Litigation Podcast.

Ryan has written extensively on digital innovation, including intellectual property issues related to cutting-edge AI systems. He is the moderator of PatentNext, a blog focused on patent and IP law for next-generation technologies. We based our discussion on his excellent article about copyright law meeting AI, titled U.S. District Court Issues First Decisions on AI Model Development and Copyright Fair Use. Ryan holds a J.D. from Northwestern Pritzker School of Law and an MBA from Northwestern’s Kellogg School of Management.

Tiffany is known for her expertise in intellectual property and technology policy, and for advocating balanced, ethical approaches to AI regulation. She chairs Marshall Gerstein’s Trademarks and Copyrights Practice. Before entering law, she worked as a software engineer. Tiffany earned her J.D. from Loyola University Chicago School of Law.

Together, they bring deep practical insight and academic rigor to the evolving legal landscape surrounding artificial intelligence. 

I appreciate them returning to the podcast and sharing what they know!

Tom Hagy
Host
Emerging Litigation Podcast

Speaker 1:

Hello and welcome to the Emerging Litigation Podcast. I'm your host, Tom Hagy. Today, I'm joined by the authors of an article titled Copyright Law Meets AI. These are key takeaways from cases called Kadrey versus Meta and Bartz versus Anthropic. In this piece, our guests describe recent US court decisions and their effects on artificial intelligence systems trained on copyrighted materials. The article covers the four-factor fair use analysis as applied to large language model training. We're going to talk about how courts address digitizing legally purchased books for internal AI training, in contrast to building collections from illegal sources. The article I refer to also covers questions related to market effects, licensing, and revenue dilution caused by AI-generated works. Our guests present information for AI developers, copyright owners and model users, including the importance of lawful sourcing, dataset review and output controls. Legal precedent is developing in this area, of course. That's why we're talking about it here on the Emerging Litigation Podcast.

Speaker 1:

It's my pleasure, then, to introduce my guests, Ryan Phelan and Tiffany Gehrke. They're both recognized thought leaders in the fields of emerging technology law and artificial intelligence, they're both partners at Marshall, Gerstein & Borun, and they have appeared on the Emerging Litigation Podcast before. Can you imagine? They're very willing to share their insights and expertise with a wider audience, and that's what I appreciate about them.

Speaker 1:

Ryan Phelan, partner at Marshall Gerstein, I'm going to say it right, Marshall, Gerstein & Borun. He's contributed and written extensively on digital innovation, including intellectual property such as patent and copyright issues regarding cutting-edge AI systems. He's a graduate of the Northwestern Pritzker School of Law. Tiffany Gehrke, also coming back, a glutton for punishment. She's known for her expertise in intellectual property and technology policy, and her advocacy for balanced and ethical approaches to AI regulation. She earned her law degree from Loyola University Chicago School of Law. So together they bring a lot of insight and academic rigor to conversations on the evolving legal landscape. Here they are, Ryan Phelan and Tiffany Gehrke. Hope you enjoy it. Ryan and Tiffany, thank you very much for coming back to the Emerging Litigation Podcast.

Speaker 2:

Great to be here. Thank you for having us.

Speaker 3:

Great to be back.

Speaker 1:

Okay. So, ryan, so the first one's to you. We're going to talk about some landmark decisions. As I said in the introduction, this is based on an article that we will link to that in the show notes and in the summary. So can you summarize the main findings You're talking about BARTs versus Anthropic and I think it's CADRE versus META and explain how these cases influence the current understanding of what is fair use in AI training and copyrighted materials? Sure Thanks.

Speaker 2:

Sure, thanks, Tom. Yeah, so these cases are first of their kind and pretty impactful, in the copyright world at least. What people have been wondering for quite some time is, how can these large language models that we've seen crop up in recent years, such as ChatGPT and Claude and all of these things, how can they take information from authors without their permission and use that to train their models? And if they do so, is there any recourse for the authors through copyright law, or do these model trainers in fact have some kind of defense in copyright law, called fair use, which is a common defense used in traditional copyright cases? But how does that apply in our new digital world, especially with AI? The interesting thing is that these cases involve two different sets of plaintiffs and defendants, and the cases are both in the Northern District of California, in the district court there, and the cases were decided by two different judges of that court. So the judges had the opportunity to explore similar issues, you know, in the copyright realm, and come to their own conclusions, but they in fact came to very similar conclusions and rulings, though one did have some criticism of the other. We can get into that later.

Speaker 2:

I'll refer to the two cases by the defendants, just because they're more widely known: Anthropic being the first case and Meta the second. What happened in each of these cases is that each of these defendants, Anthropic and Meta, has a model. Anthropic has its Claude model, which is very famous, and Meta has its Llama model. Both of them did similar things for training that the authors did not like or approve of: they took book data, book information, and used that to train their respective models. In one instance, they took books that they had purchased and scanned them in. They stripped off the covers and chopped out the headings and the page numbers and things like that, but they took the text of the books themselves and used it to train. And then they also got some shadow library, or pirated, books too, and also used those in their data. And the authors' theory of liability under copyright was that both of those two things constituted copyright infringement.

Speaker 2:

The defendants, in each case, raised the legal defense of fair use in order to defend themselves, to state that even though they did take and use that material, it was fair use and therefore they were not liable under copyright law.

Speaker 1:

Right. I'm sorry, I'm laughing. It's just, they were literally copying materials, and it's right in the word copyright. You know what I mean? But that's just me playing with words. It also interests me because, as a writer, I'm relying increasingly on AI, and I always want to attribute everything, and so it always makes me a little nervous. Am I not attributing something by accident, something that maybe somebody copied in for training? They're not publishing it, but they did use it to train. So this is interesting to me personally too. So, Tiffany, over to you. Ryan's mentioned fair use, so let's talk about transformative use and fair use. How did the courts define transformative use in these cases, and why is this critical for fair use assessments when it comes to AI training?

Speaker 3:

So I thought the Anthropic case was particularly interesting here, because the judge went through like painstaking efforts to describe what they view as the transformative use.

Speaker 3:

And so they talked about, first, that each work was selected and copied from a central library to create a working copy for a training set. The second step, they said, was that each work was cleaned to remove some of the information, like page numbers and, you know, headers, footers, that kind of thing. Third, they took each cleaned copy and translated it into a tokenized copy, where they messed with the words, right? They might have changed the stemming of a word, or grouped different characters from letters and words together. And the fourth thing was that each fully trained LLM then itself retained a compressed copy of the works it had trained on.
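For readers who want to see what the cleaning and tokenization steps look like in practice, here is a minimal sketch in Python. It is purely illustrative, assuming a toy page-cleaning rule and a naive word-level tokenizer; real LLM pipelines use learned subword tokenizers, and nothing here is drawn from the actual systems in these cases.

```python
# Toy version of the "clean, then tokenize" steps the court described.
# Real pipelines use learned subword tokenizers (e.g., BPE); this uses
# plain string handling just to show the idea.
import re

def clean(page_text: str) -> str:
    """Strip page numbers and short all-caps running headers (toy heuristics)."""
    kept = []
    for line in page_text.splitlines():
        if re.fullmatch(r"\s*\d+\s*", line):  # bare page number
            continue
        stripped = line.strip()
        if stripped and stripped == stripped.upper() and len(stripped.split()) <= 4:
            continue  # short all-caps line, treated as a running header
        kept.append(line)
    return " ".join(kept)

def tokenize(text: str) -> list[str]:
    """Rough stand-in for subword tokenization: lowercase words and punctuation."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

page = "CHAPTER ONE\nIt was a bright cold day in April.\n42"
print(tokenize(clean(page)))
# ['it', 'was', 'a', 'bright', 'cold', 'day', 'in', 'april', '.']
```

The point of the sketch is simply that what the model ultimately trains on looks quite different from the page a reader would see, which is the intuition behind the court's transformative-use discussion.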

Speaker 3:

And I thought it was interesting that the judge in this case wrote out each of those steps to help inform, you know, all of us readers: here's what I'm viewing as transformative. You took this thing, you did one, two, three, four, and here's what the end result was. And so that's what that judge said; those are the steps that made this a transformative use. In the Meta case, I felt the judge was just like, yes, this is transformative. You know, it's not a book. We're taking it and we're using it for training of LLMs, and therefore it is transformative. And in each case, they did discuss that transformative use is only one factor of the four-factor fair use analysis, but each court found in favor of the large language models on it: the use was transformative, in favor of fair use and against infringement.

Speaker 1:

Okay, gotcha. Interesting distinction. So, on to data legitimacy, Ryan. Why did the courts distinguish between lawfully obtained and pirated training data? What's the practical impact this might have on AI data sourcing in the future?

Speaker 2:

You know, copyright infringement can be found based on when and where you copy. So first of all, when you copy information for model training, you have to store it somewhere, like in the memory of a computer, and then, once you have the books in memory, you can train your model, and then perhaps the model can output something that's the same or substantially the same, which would trigger, you know, copyright infringement. With respect to pirated versus non-pirated data, the courts were focused on the first of those, where you're just copying the information originally and sticking it in the computer's memory for training. And so the courts seem to be saying, we don't care how much transformative use you may have done after the fact, after you've copied the pirated works; copying pirated works is never okay.

Speaker 2:

They didn't get into that in their rulings because these were motions for summary judgment. They talked about dismissing the case, and each court did dismiss the infringement claims under fair use on summary judgment for the legitimately copied works, from the books that they had purchased, presumably under a first sale doctrine defense, because they had obtained the books lawfully. But for the pirated works, when those were taken and stored in memory, they did not dismiss those counts on summary judgment. That's to be decided later at trial, and, you know, presumably Meta and Anthropic are not looking forward to that.

Speaker 1:

Okay, all right. So how did the courts evaluate whether the AI inputs were substantially similar to the copyrighted works, and what evidence is most important for determining infringement risk going forward, Ryan?

Speaker 2:

So, along the lines of what I just mentioned, there's two ways that you can be found infringing copyright for AI training. The first is when you initially store that information in the memory, and the second is the output. So you're kind of focusing on the inputs and the outputs and looking at infringement in that manner. But as to the second of these, it was interesting that the authors did not allege, or at least did not fight in the summary judgment motion, that the outputs were infringing.

Speaker 2:

Presumably they could not create, you know, output that was similar to a book. Let's say you had a book that was in the lawsuit and it had a paragraph in it. Presumably the authors could not get the AI to output the same paragraph, or at least a substantially similar paragraph. In fact, in one of the cases, I forget which, I think it was Meta, they had an expert try to get the model to output the same text from one of the books, and that expert was only able to get about 50 words of matching output at most, and the defendant basically argued that that was not sufficient. Part of the fair use analysis asks whether you took the heart of the work, or substantially all of it, what portion, how much, the quantity aspect of it, and the expert could not substantiate that. And so both of the courts were eager to address this issue: can your model output the same sentences or lines and phrases of the actual underlying works? The courts indicated that they would have been very eager to address that if it had been raised, but it was not, so that second kind of copying was just not at play. And I guess it's very hard, in the textual realm, to do that.
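As an aside for the technically curious, here is one way an analyst might start quantifying that kind of verbatim overlap. This is a minimal sketch, assuming word-level matching with Python's standard library and an invented model output; it is not a description of what the expert in the Meta case actually did.

```python
# Measure the longest verbatim run of words shared between a source text
# and a model's output, the kind of "how much was taken" question the
# quantity factor asks about.
from difflib import SequenceMatcher

def longest_shared_run(source: str, output: str) -> list[str]:
    """Longest run of consecutive words appearing verbatim in both texts."""
    a, b = source.lower().split(), output.lower().split()
    match = SequenceMatcher(None, a, b).find_longest_match(0, len(a), 0, len(b))
    return a[match.a : match.a + match.size]

# Public-domain opening line used as the "book"; the output is invented.
book = ("It is a truth universally acknowledged that a single man "
        "in possession of a good fortune must be in want of a wife.")
model_output = "As the saying goes, a single man in possession of a good fortune is lucky."
run = longest_shared_run(book, model_output)
print(len(run), "words:", " ".join(run))
# 9 words: a single man in possession of a good fortune
```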

Speaker 2:

In other cases that are still ongoing, like the Getty Images case, it's easier to show a similar output. In fact, in the complaint in the Getty Images case, you have two pictures side by side of soccer players: the real image has a Getty watermark on it, and the output AI image seems to have a Getty watermark on it too. And, you know, the question is, why would your generated AI image have a Getty watermark on it if it wasn't trained on, you know, real Getty images? So it's kind of a giveaway, yeah. And here there's not that telltale giveaway, and so presumably the authors didn't do that.

Speaker 2:

But going back to the original question, the first of those was focused on when that information was stored, and, as Tiffany had mentioned, they had transformed that stored data in so many ways that it was not deemed to be a copy, because the information had been transformed down to the token level. If you're thinking about the English language, tokens are like syllables and words, and the tokenized version of the text is much different from the original text. That tokenization for model training was considered to be significantly transformative, and so was the output, by the way, and that was one of the reasons that turned the courts' decisions toward fair use. They both said that it was highly transformative in nature.

Speaker 3:

Okay, I have two thoughts on what Ryan just said that I'd like to share. The first was, when you were just talking about the Getty case, that made me think about the difference in visual versus written, you know, copyright protection in expressive works, and I'll be curious to see how this goes over time. When we're talking about things that are text, it can be jumbled up a little bit more, so you might not catch it on the output side, whereas your example with Getty, where they could see the watermark on there, I think that's really powerful, and it's a little bit more of an indicator of where it probably came from, right? Whereas with words, you don't always have that indicator on the output. So I think that's a really interesting case to be watching as well, and thanks for sharing thoughts on that, Ryan.

Speaker 3:

The second thought was, in the case with Judge Alsup, he repeatedly said throughout the opinion, basically, we're only focused on inputs here; the plaintiffs haven't really alleged anything similar on the outputs. And it was mentioned enough times throughout the opinion that it made me think, is he trying to signal he might have found differently if they had talked about the outputs more? I'm not sure on that, because, you know, that opinion also came down pretty pro-LLM, but I thought it was interesting that it was repeatedly mentioned throughout the case.

Speaker 2:

I took note that both judges kind of mentioned it throughout their opinions as kind of like a hint, hint, nudge, nudge for future plaintiffs. Yeah, please do this so we can, you know, discuss the outputs.

Speaker 3:

It just wasn't explored, you know, but both judges seemed eager to talk about it, and the second judge even went further, I think maybe somewhat bolder than even a hint, hint, and basically said that if they had put this evidence in, they might have had a winning argument.

Speaker 1:

Oh, wow. Wow, okay. In your careers, have you seen judges do that, the nods and the winks and the need-I-say-more kind of thing? I mean, I've only seen it recently, you know, like Supreme Court opinions and dissents saying, you know, what you really ought to be doing.

Speaker 1:

So I'm wondering, have you seen that in your careers, judges doing that?

Speaker 2:

Yeah, all the time. It's called dicta. Usually they'll say stuff like that, and, you know, you could have done this.

Speaker 2:

I mean, they're not supposed to do that. They're supposed to, you know, decide cases and controversies.

Speaker 3:

But it happens, that's right. And I think in these types of cases right now, too, where they know there are a lot of eyes on them and they're probably going to get appealed up, I think we are seeing it more. It's starting to trend that direction.

Speaker 1:

Yes, okay, yeah, thank you, that's interesting. So, Tiffany, we're going to come back to you. How did the courts reconcile the strong protection for expressive works, like novels and plays, with their fair use findings in AI training contexts?

Speaker 3:

Yes. So the courts really seemed to focus on the fact that the books were, I'm sorry, the large language models were using the books for input, so they were, you know, saying these weren't going to be creating another book to compete, at least initially, right? That's where we get to the output discussion that we had earlier. The courts were instead focusing on: although we want to protect books, we want to protect authors, we don't see the training of the model as something that's harmful to authors. At least that's what I saw from the judge in the Anthropic case. I found the judge in the other case more focused on, you know, the original human authors.

Speaker 1:

So, back to, let's say, commercial use and copyright. Why did the courts allow even commercial AI training uses to qualify as fair use, and how might this reasoning affect future IP policies and litigation, Tiffany?

Speaker 3:

I think they did it here because both courts acknowledged just how transformative the training use of the model was, and they thought, well, for that purpose it's okay, even if it is commercial in nature. And, you know, in the Meta case the judge also went on to describe this. The plaintiffs in the Meta case acknowledged that the large language models have end uses including serving as tutors, assisting with creative ideation and helping users generate business reports, and several of the plaintiffs testified that using the large language models for those various purposes was distinct from creating or reading an expressive work like a novel or biography. And because those functions were different from the functions the books were going to be used for, you know, reading and enjoyment or learning, the courts thought copying the books as a whole to develop this tool was in fact a different use, even though it was a commercial use.

Speaker 1:

So, on to developer practices. How do you think these rulings are going to shape how AI developers manage training data, and what steps should organizations take to minimize legal risks and comply with copyright law?

Speaker 2:

Yeah, I think that this sends a strong message to AI model developers that they need to use, you know, purchased, licensed, legitimate data and not, you know, pirated data.

Speaker 2:

In fact, you know, we've seen in the news recently that Amazon agreed to pay, I think, $10 million to the New York Times to license data from, you know, the paper, presumably for use for training purposes. And so model developers are going to be very careful about where they get their data from: hopefully from licensed sources, hopefully from sources like books that they've already purchased, where they can rely on the first sale doctrine, where, you know, if you own it, you can use it. Similar to what we all know from the days of buying CDs and things like that: you owned it, you got a first sale right in the music on that CD, and you couldn't get sued under copyright. So they're going to want to do something similar to that. I think that was the lesson learned for them. I think the big surprise was that, you know, the courts did consider it highly transformative, and they just need to make sure that their data is clean.

Speaker 1:

Yep, okay, industry challenges. So what challenges or responsibilities do these rulings create for companies using AI and copyright owners? Are there collaborative solutions or industry standards that you would recommend or look to?

Speaker 2:

Yeah, I would say that, you know, for copyright owners that have proprietary data, you know, secret data, they certainly want to protect that with, like, an NDA before they share it and license it. You know, the problem in these two cases was that the data was publicly available, right? The authors wanted to sell these books. They wanted them to be public so, you know, they could sell their books and information to readers, and so they're highly public, and they simply cannot treat that as proprietary data in that case. So I'm sure they want to seek a license with the model trainers in order to get some type of revenue stream.

Speaker 2:

I know that there's a bill currently floating in Congress that would give authors protection with respect to their works, so that they get some kind of licensing revenue, you know, if their data is used, and perhaps some control over whether their data is used for training.

Speaker 2:

Although I think that's only in one of the two houses; I don't think it's, you know, advanced very far, and whether or not, you know, the president signs it into law is also another thing. You know, the president was on a podcast recently with the AI developers, including, you know, the hardware manufacturers, Nvidia, and then also some of the AI model developers, including OpenAI, and he seems pretty AI-model friendly, so to speak. The current administration believes that training AI models and developing them is an important strategic objective for competing with other countries, including China. So, you know, whether or not he would sign a bill that would limit AI use and protect authors is questionable, because that would supposedly hamper their desire, or their ability, or speed, to develop these AI models, or to be competitive as a country compared to other countries.

Speaker 1:

You know, this is evolving. The litigation is evolving, and obviously the technology is evolving rapidly. People are using it with various skill levels; people are bumbling through it. I'm right in the middle of it as a writer. In the early days, it did make up four cases that I was looking for, you know. But fortunately, I'm used to checking and rechecking things, and I even worked with a paralegal out in California, a very good paralegal, a master researcher, you know, and he's like, well, no, I don't think so. Maybe they were in some little court in some county in Texas that you're looking at. Well, this is an antitrust case, so they're not going to be in a, you know, small-town municipal court or something. And I do know at least one, I don't know if it was the attorney or the firm, that was fined in Florida because, infamously, they put in some cases that were completely made up. I mean, who doesn't check those things? But anyway, it happened.

Speaker 2:

Apparently, a lot of people don't check those things. Almost every week I see some court somewhere angry at or sanctioning an attorney for filing a brief that, yeah, had made-up citations. In fact, I saw an article this week about attorneys proposing a fix where there would be an automated tool to scan filed briefs to see if there were any made-up citations, which I thought was amazing, that courts would even consider or do that.

Speaker 3:

Have AI check the AI.
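For a sense of what such a brief-screening tool could look like, here is a minimal sketch in Python. The citation pattern and the verified index are hypothetical placeholders; a real tool would need a genuine citation database and far more robust parsing.

```python
# Toy citation screener: pull citation-shaped strings out of a brief and
# flag any that don't appear in a verified index.
import re

# Placeholder index of citations already verified against a real reporter
# database; a production tool would query an actual case-law service.
VERIFIED_CITATIONS = {
    "598 F. Supp. 3d 895",
}

# Toy pattern covering only a couple of federal-reporter-like formats.
CITE_PATTERN = re.compile(r"\b\d{1,4} F\.(?: Supp\.)?(?: \dd)? \d{1,4}\b")

def flag_suspect_citations(brief_text: str) -> list[str]:
    """Return citation-shaped strings not found in the verified index."""
    found = [m.group(0) for m in CITE_PATTERN.finditer(brief_text)]
    return [cite for cite in found if cite not in VERIFIED_CITATIONS]

brief = "See 598 F. Supp. 3d 895; but compare 123 F. 3d 456."
print(flag_suspect_citations(brief))  # ['123 F. 3d 456']
```

Flagged citations would still need a human, or a real database lookup, to confirm they are actually fabricated rather than just missing from the index.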

Speaker 1:

Yes, I think that's smart. I do, I think that's very smart. I've started doing that a little bit myself. Having AI, yeah, search for itself, I think that's brilliant. I mean, that's a good use of it.

Speaker 2:

It also kind of flies in the face of the ethical rules that attorneys should be subject to. So I think there's a debate there, but I don't have a solution.

Speaker 1:

We can come up with all the opinions we want. Well, you can't.

Speaker 1:

I can, because you represent people and you have clients. Nobody cares what I think. But it is interesting. It's fascinating as a writer, especially in law, where I spent my 20s and a good part of my 30s as a legal reporter, and I was reading a lot of court opinions, I mean thousands of them. And the time it took to make sense of a case, you know, you guys know it takes time. And I'm looking at Adobe and Copilot, which, I guess, is Microsoft; they've got a partnership with ChatGPT and all this.

Speaker 1:

You give them a document, and it does a really good job of hitting the highlights for you if you need to understand it quickly. And the cool thing I like about that is there's no making anything up, because it's got the document and it's got where it got the information. You know what I mean? So you can treat it like a junior writer or something. So there are, I mean, just super pros and cons, but I'm enjoying learning about it. But enough about me. So what do you expect in future litigation or legislation? Ryan, you mentioned something that's in Congress now. I feel like they're focused on other things. I don't know; I read the news once in a while, and it looks like they're busy with other stuff. What do you expect to see in the future?

Speaker 2:

There's the federal bill that I mentioned, where certain congressmen and women are concerned about authors and their works being used to train these models, and want to make sure that they're compensated in some way. So whether or not that bill is successful, or gets changed, or is ultimately signed is, you know, anybody's guess. The states are also coming up individually with their own frameworks for, you know, AI, and it's kind of the Wild West right now with respect to what protections states are offering. You know, California is considering some; other states are also considering some. There's also a big push, and this is tangential, but there's also kind of this data privacy slash digital, what's the word, a digital twin of people, like making sure that your appearance is kept free of usage. A lot of these cases come from the use of AI to...

Speaker 1:

Oh, your image and likeness.

Speaker 2:

Yeah, thank you. Image and likeness, where people, you know, use AI to put somebody's face on somebody else's body, with or without clothes. And so, you know, states are putting laws into place to protect individuals from that.

Speaker 1:

Mm-hmm. Yeah, there are some pretty sophisticated tools out there. I mean, some of them are hysterically funny, some of them are just very funny, so it's amusing, but on the other hand it's dangerous. I see things, and you've got world leaders making declarations, you know, and it's very convincing, and it's not them at all. It's just very scary. Tiffany, what about you? Any outlook for the future?

Speaker 3:

Yeah, I think, on the litigation side.

Speaker 3:

So I think both of the cases that we primarily talked about today are probably going to be appealed, right?

Speaker 3:

But broader than that, I think the message is that future plaintiffs in these cases need to focus broader than just training without permission, or seeing if the output can generate the exact text. They need to focus on what both judges here signaled is important, and that is: are the outputs of the large language models going to create materials that compete with the authors and the underlying protected works, or are they going to dilute the marketplace? You know, what's going to happen on that output? Plaintiffs need to focus on that. And then my last thought: the judge in the Meta case had a nice little succinct soundbite where they basically said, you know, these products are going to generate billions, if not trillions, of dollars, and if using copyrighted works to train those models is essential, then they need to figure out a way to compensate the copyright holders for it.

Speaker 1:

Mm-hmm, yeah. As I'm listening, as I'm thinking about it, I feel like, and maybe you guys have an opinion, but I feel like the people that are going to be most threatened by this, or squeezed by this, are the creative folks, because, you know, writing is, like...

Speaker 1:

You know, I can say, write something in the style of Kurt Vonnegut, and it'll be pretty good, and, you know, 10 years from now it's going to be amazing. You're not going to know the difference. Or make a painting that looks like a Van Gogh, and it's going to look amazing. And, you know, with printer technology, what do you call it, 3D printing? I'm sure there's technology that does painting. I've never seen it, but I'm positive it exists. So I think the creative people are going to get squeezed on one side, and then what they do create is going to get potentially diluted, or used, or just transformed. And so I feel like the creative people are going to be under pressure here. I don't know what you guys think about that, or what to do about it, obviously.

Speaker 3:

Yeah, I agree. I think both sides need to figure out a working solution going forward, because I think the advances in technology that can be gained through this are incredible and very powerful. But I agree that you want to continue to incentivize the next great authors to create novels, to feel comfortable doing that, knowing they would be protected in the future. And so I think there's got to be a balance that is struck here, with compensation or other limitations. I don't have a great idea on what that will be yet, but I think it needs to get there.

Speaker 1:

Well, let's come up with it, Ryan. You have a solution, don't you?

Speaker 2:

The closest solution I've heard, and, you know, I don't know if it is the solution, is something akin to what happens in the music world now, where authors don't receive a license directly. It, you know, kind of goes into this collective system where, maybe per play of their song, they get a fraction of that. So there's kind of a Spotify model, so to speak, is what I've heard. How that plays out with model training could be different, because once you train the model, it's trained with the information. It's not like people are listening to the information again, right? The model is not listening, unlike, you know, people listening to songs over and over. So there are similarities, but there are differences. But there is this aspect of authors creating material, that material being an input into some, you know, larger model or system, and the author getting compensated for that, which seems to be the way to go.
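For what it's worth, the collective "Spotify model" Ryan describes is, at bottom, a pro-rata pool. Here is a minimal sketch with invented numbers and a hypothetical split rule, just to show the arithmetic; nothing about actual licensing rates is implied.

```python
# Hypothetical pro-rata pool: each author's share of a licensing fund is
# proportional to how often their works were used (plays, or perhaps
# training-data contributions). All figures are invented for illustration.
def pro_rata_shares(pool: float, usage_counts: dict[str, int]) -> dict[str, float]:
    total = sum(usage_counts.values())
    return {author: pool * count / total for author, count in usage_counts.items()}

fund = 1_000_000.00  # total licensing pool in dollars
usage = {"Author A": 120, "Author B": 60, "Author C": 20}
for author, share in pro_rata_shares(fund, usage).items():
    print(f"{author}: ${share:,.2f}")
# Author A: $600,000.00
# Author B: $300,000.00
# Author C: $100,000.00
```

As Ryan notes, the open question is what the usage count even means for a model that is trained once rather than "played" repeatedly.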

Speaker 2:

Otherwise, you know, you get to a world, like Tiffany suggested, where authors are just not incentivized to, you know, create new things, and the AI is just learning on its own output, and therefore everything starts to look the same and sound the same. There's nothing new in the world. I guess I'll be old and dating myself if I say this, but I can remember the music from two decades ago. It's quite different from what we have now, and I don't know if the Spotify model has impacted that, or if I'm just old and I like the music from that decade. But maybe we enter a world like that if we don't incentivize enough creative talent to create new, good art.

Speaker 1:

It just speeds things up, but in terms of music, it speeds up what humans would do too. You know, they would hear the Beatles, and the Beatles would influence another band, and they would influence another band, and pretty soon it goes in different directions, kind of very superficial, derivative. You know, people do it, but AI will do it faster. So you get this great music that, you know, turns into elevator music. So it's going to be interesting to watch.

Speaker 1:

I've got a nephew who teaches writing in Wisconsin, all kinds of writing, creative writing and other kinds of writing, expository, and it's a real challenge. And, you know, he's very good at spotting what is AI language, as am I. In fact, somebody who specializes in this sent me a list of words that Google algorithms look for to see whether something is AI-written, and there are some, like, so I'm writing about law, and you guys will appreciate this. I ask it, give me a summary of a case, just so I can know what it was, and then it'll spit out, so you want me to do a blog? I said, sure, write a blog, see how it does. In this landmark case... You know, every case is a landmark case.

Speaker 1:

Right. And even in some of the notes I did for this, it uses the word landscape, the legal landscape. It's always a landscape and it's a landmark. So I've got a whole list now of words that I have to avoid, which is awkward, because some of them are words you would normally use. Tiffany and Ryan, thank you guys very much.

Speaker 3:

Yeah, thanks for having us. Thank you, Tom.

Speaker 1:

The Emerging Litigation Podcast is a production of Critical Legal Content, which owns the awesome brand HB Litigation. Critical Legal Content is a company I founded in 2012. What we do is simple: we create content that's critical on legal topics for law firms and legal service providers. That kind of content can be blogs, papers, podcasts, webinars, and we have a good time doing it. And as for HB Litigation, well, that's the name under which we publish interesting, at least interesting to me, legal news items, webinars, articles, guest articles, all on emerging litigation topics. Once again, I'm Tom Hagy, with Critical Legal Content and HB Litigation. If you like what you hear and you want to participate, give me a shout. My contact information is in the show notes. Thanks for listening.