Floss Weekly Episode 744 Transcript
Please be advised this transcript is AI-generated and may not be word for word. Time codes refer to the approximate times in the ad-supported version of the show.
Doc Searls (00:00:00):
This is Floss Weekly. I'm Doc Searls. This week Katherine Druckman and I talk with Damien Riehl. Damien Riehl is the most out-there, constructive, useful, open source-friendly, subversive, pirate kind of lawyer. I mean, he's doing amazing stuff that is from the future. We cover a lot of topics on this one, and I highly invite you to tune in. This is a really good one.
... (00:00:31):
Podcasts you love.
(00:00:33):
From people you trust.
(00:00:35):
This is TWiT.
Doc Searls (00:00:39):
This is Floss Weekly, episode 744, recorded Wednesday, August 9th, 2023. A Chill Pirate Lawyer. This episode of Floss Weekly is brought to you by Kolide. That's Kolide with a k. Kolide is a device trust solution for companies with Okta. And they ensure that if a device isn't trusted and secure, it can't log into your cloud apps. Visit kolide.com/floss to book an on-demand demo today.
Leo Laporte (00:01:12):
Listeners of this program get an ad-free version if they're members of Club TWiT. $7 a month gives you ad-free versions of all of our shows, plus membership in the Club TWiT Discord, a great clubhouse for TWiT listeners. And finally, the TWiT+ feed with shows like Stacey's Book Club, the Untitled Linux Show, the Giz Fiz and more. Go to twit.tv/clubtwit and thanks for your support.
Doc Searls (00:01:38):
Hello again, everybody everywhere. I am Doc Searls, and this is Floss Weekly, joined this week by Katherine Druckman, herself.
Katherine Druckman (00:01:47):
Hello,
Doc Searls (00:01:48):
Down in Houston, Texas. Yep. Where there's <laugh>, where it's That
Katherine Druckman (00:01:52):
Is correct. Where it is. Blazingly hot.
Doc Searls (00:01:55):
I'd just say, you know, you don't say hot as hell, just say hot. It's hot.
Katherine Druckman (00:01:57):
It's not Okay. We are not Okay.
Doc Searls (00:01:59):
It's hot as Houston now, you know, that's, yeah. Yeah. It's a feature of Texas, isn't it? That <laugh>? It's not a bug. No. I'm
Katherine Druckman (00:02:07):
Gonna call that a, I'm gonna call that a bug. Yeah, no, it's,
Doc Searls (00:02:09):
Yeah, you would call it, call it a
Katherine Druckman (00:02:10):
Bug. Not my favorite. <Laugh> not my favorite part of living here. Yeah.
Doc Searls (00:02:14):
Well, here in Indiana, they've, you know, gonna be really bad air today, and it's gonna be super hot, and then it gets to like 89 or something, and it's like, it's almost disappointing <laugh>. But anyway, our pool in the back, which was in my care, turned as green as a leaf a couple days ago and I had to throw some stuff in it. And I still haven't been in that pool. I have to wait for the poison to finish killing off the algae <laugh>. I'm being a poor Californian, my other home, doing that. So our guest today is Damien Riehl. And you and I talked to him before, like ourselves. We did. It's been a while, on Reality 2.0, and I thought it was a few months ago, and it turns out it was over three years ago. And we mostly
Katherine Druckman (00:03:04):
Talked about, tell me how that works now
Doc Searls (00:03:05):
His All the Music thing. So I'm really interested in having the update on that. Yeah, I,
Katherine Druckman (00:03:11):
Yeah, me too. All the Music is one of the more interesting projects we ever talked about on the other podcast, honestly. So I'm pretty excited about hearing an update. I won't spoil it too much. No.
Doc Searls (00:03:25):
You know, we,
Katherine Druckman (00:03:25):
But it's very
Doc Searls (00:03:26):
Interesting. Yeah. So I'm gonna introduce Damien. Damien is our guest. He's an attorney. He's based in Minnesota. He is one of the most interesting characters I know on the list that we are both on. And I think he got on that list, I think I even pushed him onto that list, I'm not sure, way back then, just because he's too interesting to ignore, and he does really creative things around the law, including All the Music. And he's got another thing called SALI, which we can talk about. So I just wanna welcome to the show, Damien. How
Damien Riehl (00:04:04):
You doing? Thank you so much. It couldn't be better. I'm thrilled to be here. Thank you for having me on.
Doc Searls (00:04:09):
Thanks a lot. So let's start with All the Music, which I won't describe because I'll get it partly wrong and you'll get it all right. But I think our audience would be really interested in it, so.
Damien Riehl (00:04:21):
Sure. So I've created 471 billion melodies and copyrighted them all. And the way that I've done that is, much like you can brute force a password by going AAAA, AAAB, AAAC, I did that with music. So I went do, re, mi till I mathematically exhausted every melody that's ever been and every melody that ever can be, to the tune of 471 billion melodies. I wrote them all to disk. Once they're written to disk, they're copyrighted automatically. You don't need to do anything else beyond that. And so then I put everything in the public domain under Creative Commons Zero, as you can see on your screen right now. Yeah. Real public
Doc Searls (00:04:58):
Domain.
Damien Riehl (00:04:58):
Yeah, that's right. Real public domain, to be able to protect "you stole my melody" lawsuit defendants. And the idea is that if I created a thing at 300,000 melodies per second, perhaps someone should not get a monopoly, that is a copyright for life of the author plus 70 years, for something that my machine cranked out in a millisecond.
Doc Searls (00:05:18):
Did, did
Katherine Druckman (00:05:19):
You stick? How many notes are required? Oh,
Doc Searls (00:05:21):
Sorry. I was, I was gonna ask a similar question, but do you stick to the chromatic scale?
Damien Riehl (00:05:25):
So my initial, you showed my TED Talk just a bit ago, my initial dataset of 68 billion was just the diatonic scale, that is do re mi. And then we moved on to the chromatic scale, that is all the white notes and the black notes on the keyboard. And then to Katherine's question about how long it is: we've been talking about range, with do re mi that's eight up, and with chromatic that's 13 up. And then across, with the diatonic scale we went to 12 notes, and with the chromatic scale we went to 10 notes across.
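For the technically minded, here is a minimal Python sketch of the brute-force enumeration Damien describes, using scale degrees one through eight and a plain text file standing in for whatever format the real project wrote to disk; the encoding and parameters are illustrative, mirroring the numbers quoted in the conversation rather than the actual All the Music code.

# Brute-force melody enumeration in the "odometer" style described above:
# 1,1,...,1 then 1,1,...,2 and so on. The scale-degree encoding and the
# text-file output are illustrative stand-ins, not the real project's format.
from itertools import product

DIATONIC_DEGREES = range(1, 9)   # do re mi fa so la ti do: 8 choices per note
MELODY_LENGTH = 12               # "12 notes across," per the conversation

def melodies(limit):
    """Yield the first `limit` melodies of the full 8**12 (~68.7 billion) set."""
    for i, melody in enumerate(product(DIATONIC_DEGREES, repeat=MELODY_LENGTH)):
        if i >= limit:
            return
        yield melody

if __name__ == "__main__":
    with open("melodies.txt", "w") as out:    # "writing to disk" fixes the work
        for m in melodies(limit=10):
            out.write(" ".join(map(str, m)) + "\n")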
Katherine Druckman (00:06:00):
Okay. So every possible 10 or 12 note melody is protected, which is fascinating. Right, 'cause I mean, we could go on and on about how nothing's, you know, it's all been done before, right? Nothing is truly original. What is, you know, human innovation and art and all of these things? And I love the idea of protecting that and saving the world from unnecessary lawsuits, which, you know, we do see from time to time in music. And there are, that's
Damien Riehl (00:06:30):
Right. And really, the question, yes, is anything really original, is the right question to ask. And also, in mathematics, is everything just already a grid that we are plucking out of? That is, I've mathematically exhausted all the things. So really, is a melody just mathematics that maybe is unoriginal, therefore uncopyrightable? And before my talk, every defendant in these kind of "you stole my melody" lawsuits, every defendant had lost. But after my TED Talk, which has been seen now 2 million times, and I always hoped it would be seen by judges and lawyers that are arguing these cases, after my talk, every defendant that's been sued has used my arguments and largely has won. So Led Zeppelin has used my arguments: unoriginal, therefore uncopyrightable. Katy Perry used my arguments: not original, therefore uncopyrightable. And more recently, Ed Sheeran in the UK case made those same arguments. Correlation is not causation, but there's pretty good correlation.
Katherine Druckman (00:07:31):
Fascinating. Yeah, that's quite an update. <laugh> That's impressive. Saving music one lawyer at a time, <laugh>. I love this.
Damien Riehl (00:07:42):
Well, it's really, I think, opening up a question, and what I hope we're gonna be talking about soon is generative AI, you know, mm-hmm <affirmative>. My brute force is a very crude method of being able to exhaust all the melodies. But the crude melodies are the ones that are being sued over. So, for example, Katy Perry was sued over a melody that was literally dun, dun, dun, dun, dun dun dun dun. And if you're a music major, even not a music major, you say, that is so trite, you're just going from the tonic down two steps. Mm-hmm <affirmative>. But she got dinged by a jury $2.8 million for that very trite melody, which showed up in my dataset 8,128 times. So she got dinged $2.8 million over this very trite melody that the jury said was worth 2.8 million. But that was on a Tuesday, that jury verdict. My TED Talk was on a Saturday just a few days later. And then after my talk got released to the public in January of 2020, the judge actually reversed the jury verdict that March, saying that that melody is unoriginal, therefore uncopyrightable. So again, correlation is not causation, but there's some good correlation there. I'm, I'm wondering if you could have,
Doc Searls (00:08:55):
Okay. You could get an AI to generate a choir of robots, okay. And the choir of robots, these are images, you know, animated images, who would then sing not only the particular song in question, those 8,000 songs, but do it in 8,000 different ways, where you have a Latin beat on this one, and you have, you know, a five-bar thing here and something else there. I mean, it seems to be part of your case that all of these things are silly because music is inherently a purely human thing. Anybody could do it. And so therefore none of it should be copyrightable.
Damien Riehl (00:09:39):
That's right. And I think that my project really focuses on the difference between an aspect of music and the song. So for example, what is a song but a combination of the melody, which I addressed, and the harmonies and the rhythmic beats and the timbre of the thing, right? And the lyrics, if you have lyrics, right? So these are all components of a song. And we as a society have said that one aspect of the song, the drum beats, those are uncopyrightable, 'cause there are so many drum beats, right? Mm-hmm. Similarly, chord structures, uncopyrightable, 'cause there are only so many chords. So any of those, if you try to sue in isolation on the drum beat or just the chords, you're gonna lose, we have decided that. So really all I'm doing is just marching through to say, maybe melodies are part of that. Maybe, you know, chord structures are uncopyrightable 'cause there are only so many chords; maybe melodies are uncopyrightable 'cause there are only so many melodies, and everyone will inevitably step on another melody that's already been,
Doc Searls (00:10:38):
And a couple, go ahead, Katherine.
Katherine Druckman (00:10:41):
I was gonna ask, how does this kind of apply to other art forms? Like, are there media, I don't know, visual art, painting, printmaking, photography, film, anything else? Do these fundamental concepts apply in other areas?
Damien Riehl (00:10:56):
Yeah. I think the difference is the mathematical permutations in those other genres versus music. So in music I talked about the diatonic scale, do re mi fa so la ti, that is eight. And then if we go to the 12 notes across, the mathematics on that is eight to the 12th power, which is about 68 billion. Okay. So if you were to ask how many three-note melodies would there be, that's eight to the third power, and that's only, what, eight times eight times eight, which is 512. And so that's only 512 three-note melodies, versus the mathematics on words, for example, for literature: there are, I think, 127,000 English words. So that's 127,000 to the third power, which is a lot, right? So much bigger than the finite space of music, that's exactly right, what you're showing on your screen. So the idea is that it's easier to accidentally step on someone else's melody, where it's much harder to accidentally step on a sentence, for example,
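As a quick check of the arithmetic being quoted here (the 127,000-word figure is Damien's rough estimate of English vocabulary, not an exact count):

# Checking the combinatorics quoted above. The 127,000-word figure is a
# rough estimate from the conversation, not an exact count of English words.
three_note_melodies = 8 ** 3            # 512
twelve_note_melodies = 8 ** 12          # 68,719,476,736, roughly 68.7 billion
three_word_strings = 127_000 ** 3       # 2,048,383,000,000,000, roughly 2 quadrillion
print(three_note_melodies, twelve_note_melodies, three_word_strings)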
Katherine Druckman (00:12:00):
Or, I guess where my mind was going is, you know, when I think about the lawsuits around music that you mentioned, my mind goes straight to Anish Kapoor and Vantablack, the blackest black, and having the sole right to use the blackest black. And then somebody else comes up with a slightly blacker black, and then, okay, and then we're all fine. And then somebody else comes up with the pinkest pink and says, well, everybody can use it except for that one guy. And it has the same tone of silliness, in my opinion, as a lot of the arguments about music, you know, a tiny little melody being copyrightable. And I just wondered if there's any, you know, possible correlation there. But I guess it is a bit of a different argument. Yeah.
Damien Riehl (00:12:44):
I think that, you know, much of any art, whether it's the blackest black or the melody that I have, much of that is just the story that we attribute to it. Andy Warhol, you know, his prints, he made thousands of them, but they were valuable because of the story that he put to them. So the fact that it's the blackest black is not necessarily valuable because it truly is the blackest black, but because of the story the artist put behind it. So really, when people look at my project, I assumed that there was gonna be a lot of pushback from artists. But surprisingly, most musicians are very happy with the work that I'm doing. They say, I write music, and I feel like I have a target on my back all the time, that if it becomes popular enough, I'm gonna get sued.
(00:13:25):
And it's over a song I maybe have never heard before. And so I say to people, with my All the Music project, that the value is not in the melody. The value is in the story that you provide as a singer, as an artist, et cetera. And in the same way with generative art, generative AI, the value is not in the gen AI output, the choir of robots singing that Doc just mentioned a bit ago. No one's gonna pay a dollar for that, because it doesn't have the human story behind it. So we should not be fighting over the particular melody, or the particular chord structure. We should essentially make money on the humanity, which is what actually differentiates us from the AI-generated art.
Doc Searls (00:14:11):
And somebody just put Pantone and Pantone colors in our back chat. Didn't Pantone have some lawsuits somewhere about something or other, or many of them? I'm guessing, because they're in the business of owning colors.
Damien Riehl (00:14:26):
I think that's right. And if I'm remembering correctly, and others might correct me on this, I think the lawsuits have largely not been on the color itself, but on the number that's attributed to that particular color. Huh. So you could have maybe a right, not on the color, but maybe on the collection of those colors, and that's maybe trademark, maybe trade secret, hard to say. But either way, I think it was on the attribution of the Pantone number, not on the color itself.
Doc Searls (00:14:53):
So I wanna get into what it means to be human versus the kind of thing you could sue over. And I kind of have a long, winding question about that, and I wanna bring that up after this break. Okay. So, an interesting thing about human beings, and you're alluding toward this, is that we're all different. We look different. We sound different. It's how we tell each other apart. We know people by their voice. We know people by how they look. Dogs go further. They can tell us apart by how we smell. And apparently we all do smell different. And that's a unique thing. And that changes over time. And that's interesting as well. And it shows up in our art. It shows up in our interests. It shows up in the way that we learn.
(00:15:51):
Which is also interesting. I mean, we have a bird with a nest on our front porch right now, and it's one of the sloppiest excuses for a nest I've ever seen. But apparently mourning doves build sloppy nests. And that bird didn't learn how to do that, like humans would. It just already knew. But learning and changing over our lifetime seems to be a particularly human trait. So I wanna reconcile that with open source, which depends on lots and lots of people contributing what only they can contribute, in some ways. So none of them are the same. And, where am I going with that? But I mean, that's part of it. Oh, and also, let me put it this way. I actually want production to chop out my mumbling at this point.
(00:16:53):
But it's what Walt Whitman said, that humans were unique in that they were demented with the mania of owning things. And that speaks to exactly where your practice is with All the Music and some of the other stuff you're doing, that people who are demented with the mania of owning something are trying to own something that really is not inherently ownable. You know, Thomas Jefferson said, if you light your candle with mine, who owns the flame? That's compressing about 700 words in one letter to a guy, but that's pretty much it. How do you reconcile all that, other than taking it to court and hoping for a good result?
Damien Riehl (00:17:34):
I think you're asking the very right questions. And, you know, you and I and everyone listening to this can remember back in the early days of open source, at least my early days, in the late 1980s and early 1990s, and people were saying, open source will never work because people want to own the thing. They would never freely give away value. Because if it has value, monetary value, why would anyone give it away? And I think what the open source community has demonstrated is that there is value in a collective ownership, that is, non-ownership, of the thing that the open source community creates. That is, I'm contributing to a thing that is greater than myself; therefore, I'm getting value in feeling good about creating a thing that is now being used all over the world, like Linux or something else like that.
(00:18:19):
And so I think that we as musicians, and I would say musicians and then artists more broadly, have been kicked so many times by not getting paid that we tend to latch onto something that we see as getting paid, which is often copyright. That is, you know, I copyright my song, therefore I put the song on the radio, and the radio pays me money. The downloads pay me money. And I attribute the copyright to the thing that is making me money. But I think that's a misattribution. You don't own the music, necessarily. You own the thing that has value, which is the people listening to the music. So all that's to say that we should think more of a commons, that the art is a collective commons from which each artist draws.
(00:19:10):
And we put our spin on that commons, on that particular idea, on that particular melody, on that particular chord structure. And if it's truly unique, that is, nobody's ever done it before, nobody's gonna listen to it <laugh>, because it's gonna be so weird that nobody's gonna want it. So we all have to, you know, pull from the same chords, because the same chord structure generally is more relatable, therefore more popular, therefore it makes more money. And so if we recognize that we're all pulling from the same chord structures, we're all pulling from the same diatonic melodies, maybe we see it less as us owning these things and more as us putting our work in the world. And the audience that pays us money is the audience that really cares about me as a human, not about that particular melody that I pulled out of the commons.
Doc Searls (00:19:58):
So, let me probe a bit deeper into this, because I think if we were to compress what you just said: we originally came up with this idea of copyright applied to music, which probably goes back to everybody lamenting that Stephen Foster died with three dimes in his pocket. And then it went to player pianos, like you could put the tune in the player piano, and I think it was Victor Herbert, you know, setting up a regime where composers got the money because he heard his compositions being played in a bar, and then sheet music. But it was always attached to a thing. And most recently, containers: the containers were vinyl, and then they were plastic and little spinning discs. And we imagine that it's in a thumb drive versus in some other container, and maybe even what's on Spotify and Apple Music.
(00:20:50):
And Amazon Music is itself a kind of a container, 'cause you go to this kind of place that's an app or something on the web, when in fact what really matters is this relationship you're talking about. And there's something about that that's uncontained; you can't bottle that, but it has value. And a thesis that I have is that people do wanna pay for music. They'd be glad to pay for music on their own terms, and in ways that are much more friendly to the artists and the people who contributed to a particular production of a song, and would love a role in how that gets played out. The composer gets this, the band gets this, the lead artist gets this, you know, and we'd have less of the need for the movies like 20 Feet from Stardom, about the backup singers, and The Wrecking Crew, about all those uncredited people who were behind every hit from Los Angeles, and Muscle Shoals, and the Funk Brothers in Detroit.
(00:21:56):
That kind of got ripped off in a way by Motown. And I'm, I'm wondering if you see a new regime emerging out of this that starts with that relationship, that starts with what it is that we as human beings like about music and value coming from artists.
Damien Riehl (00:22:15):
Yeah. I think that you hit it right on the head: what are we as audience members, and we as paying members of the music audience, what are we paying for? And I think it's largely us wanting to be able to listen to that artist that gets us so excited. We're not paying for the melody, even though the melody is nice, right? We're not paying for that chord structure, even though the chord structure is nice. We're paying for the whole package as made by a human. And we really want to compensate that human for that great thing that the human has put in the world. That is the whole, not the parts; we want the gestalt of the song. And so really, I think you're right that we are paying for the human toil that goes into the thing and not the components of each of those things: not just the melody, not just the chords, not just the drum beat.
Katherine Druckman (00:23:05):
I wonder, could we pivot a little bit to talk about how this relates to current events, the actor strike, the writer strike and their concerns about AI? So we're talking about, you know, human creation, but what about human likeness and things like that? How do we address those issues?
Damien Riehl (00:23:25):
I love that you asked that. And that's something I've thought about a lot over the last nine, ten months. And, you know, a lot of people are complaining, and rightfully so, saying that I've written books and you've ingested all of my books; I've written music, you've ingested all my music; I've created art, you've ingested all of my art. Each of those things is copyrightable. How dare you ingest them, mm-hmm <affirmative>, to be able to make other things like mine. And in copyright law, there's a concept called the idea-expression dichotomy. Mm-hmm <affirmative>. That is, ideas are uncopyrightable. Only the human expression of those ideas is copyrightable. So, for example, two teenagers falling in love and tragically dying is Romeo and Juliet. That idea is uncopyrightable.
(00:24:08):
But West Side Story, the expression of that idea, is copyrightable, even though Romeo and Juliet is in the public domain. So ideas are uncopyrightable; only the expressions of those ideas are copyrightable. So, when you think about this as applied to large language models, or applied to any generative AI ingestion: for those more technical people who are listening to this podcast, think about what they are doing to the inputs, that is, all the books that are being ingested, all the internet text. It's putting all of those into vector space. That is, it's extracting the Bob Dylan-ness of the thing. It's extracting the Ernest Hemingway-ness of the thing. These are all ideas. And once it takes the Bob Dylan-ness and puts it into vector space, it jettisons the expression of the idea, that is, gets rid of the particular poems or the particular words.
(00:25:02):
And it only keeps the vector weights, the vector representations. And then on the output, it doesn't recreate Bob Dylan songs. It recreates the ideas of Bob Dylan songs. It recreates the ideas of Ernest Hemingway. So on the input side, it's really not ingesting expressions, just ideas. And on the output side, it's just creating new expressions of ideas, which are uncopyrightable. So I think that we as a society have to think about how we compensate, if at all, these inputs. Is there, you know, something that should be paid to them? And if so, how do you attribute it to Ernest Hemingway when GPT-4 is built on Ernest Hemingway and millions of other authors? How do I say that this particular output of GPT-4 is attributed to this vector weight of this particular Ernest Hemingway thing? So this idea of attribution all the way through, that's next to impossible to do, because Ernest Hemingway expressed the same ideas as a lot of his contemporaries, Faulkner and otherwise.
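To make the vector-space point concrete, here is a rough Python sketch of "keep the weights, jettison the expression." The embed() function is a hypothetical toy stand-in for whatever encoder a real model uses; the only point is that what gets stored is numbers, not the original text.

# Rough sketch of "put the expression into vector space and keep only the
# weights." embed() is a hypothetical toy encoder, not any real model's API;
# the point is that the stored artifact is numbers, not the original text.
import numpy as np

def embed(text: str, dim: int = 8) -> np.ndarray:
    """Toy encoder: map text to a fixed-length vector."""
    rng = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    return rng.normal(size=dim)

corpus = ["the collected poems of author A", "the collected novels of author B"]
vectors = np.stack([embed(t) for t in corpus])  # what the model keeps
del corpus                                      # the expressions themselves are dropped
print(vectors.shape)                            # only the numeric "ideas" remain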
Katherine Druckman (00:26:04):
Going back to the likeness idea, if you're an actor, let's say maybe you're very famous, or maybe not, but I wonder if I'm, let's say a video game creator and, and I have used generative AI to create something that looks, I don't know, kind of like Tom Cruise, but as maybe the essence of Tom Cruise, maybe a little bit of the mannerism and expression that we've come to associate with Tom Cruise, but it doesn't quite look like him. Like, how does that work?
Damien Riehl (00:26:29):
Yeah. So I love that you brought that up, because it demonstrates the difference between copyright law, which I've been talking about, and likeness, that's misappropriation of likeness, which is what the Tom Cruise example would be. So copyright law doesn't cover things like, you know, Tom Cruise's face. Yeah, you can't copyright a face. Yeah, that's right. But Tom Cruise can say, this is my likeness, and you can't use this for commercial purposes without my permission. So likeness is more like trademark. And the idea is that this is unfair competition. And this goes back to, there have been lawsuits related to this: Tom Waits sued because somebody asked Tom Waits, hey, we want you to do a commercial.
(00:27:11):
Tom Waits gave his price, and they said, no, that's too much. So they created a Tom Waits lookalike, or soundalike, mm-hmm, to sound like Tom Waits. Oh, okay. Yeah. And so they misappropriated his likeness. And Tom Waits won that lawsuit, saying, you can't rip off my likeness. So in the same way, there have been lots of these kinds of things. And I would say generative AI is the same idea: if you have something that looks like Tom Cruise or something that, you know, sounds like Drake, then that falls not on the copyright side, but instead on misappropriation of likeness,
Doc Searls (00:27:41):
I'd like to get into patents, which is another thing you've written and talked about, after I bring up the right tab on my computer, <laugh>, I'm sorry about that, but it's here somewhere. Oh, geez. There we go. Okay. <laugh> After I let everybody know that this episode of Floss Weekly is brought to you by Kolide. Kolide is a device trust solution for companies with Okta. And they ensure that if a device isn't trusted and secure, it can't log into your cloud apps. If you work in security or IT, and your company has Okta, this message is for you. Have you noticed that for the past few years, the majority of data breaches and hacks you read about have something in common: employees? Sometimes an employee's device gets hacked because of unpatched software. Sometimes an employee leaves sensitive data in an unsecured place, and it seems like every day a hacker breaks in using credentials they've phished from an employee.
(00:28:44):
The problem here isn't your end users, it's the solutions that are supposed to prevent those breaches. But it doesn't have to be this way. Imagine a world where only secure devices can access your cloud apps. In this world, phished credentials are useless to hackers, and you can manage every OS, including Linux, all from a single dashboard. Best of all, you can get employees to fix their own device security issues without creating more work for your IT team. The good news is, you don't have to imagine this world, you can just start using Kolide. Visit kolide.com/floss to book an on-demand demo today and see how it works for yourself. That's K-O-L-I-D-E.com/floss.
(00:29:30):
Okay. Well, I, I brought up patents before the break, and so let's touch on that. 'cause You already brought up copyright and trademark and patents are yet another thing. And it's one of those things where I don't suppose there is a finite limit to the variety of things that could be patented, unlike, say, music and maybe even colors, which you talked about earlier. So there's a lot of room for invention in there, and protection of invention, and also for ambiguity. Like, like Jefferson was against patents and ended up, you know, being the, the first patent, <laugh> the first patent chief in the country. So
Damien Riehl (00:30:12):
Yeah, it depends on which ox is being gored, as the case may be. Yeah. If you can make money from it, then Thomas Jefferson would want to have the patent. To the point, though: after my All the Music project became popular, a friend of mine, Mike Bommarito, said, let's try to apply this to patents. He said, wouldn't it be great if we could take every patent that's ever been filed, take every claim from each of those patents, and then recombine each of those claims into new prior art, and publish hundreds of thousands or millions or billions of new inventions that recombine the old claims from each of those patents? And so that's the project that we're doing right now. The old project was all the music with copyright; we're now doing all the patents to recombine all of those <laugh>.
(00:31:01):
And so really, the question is, if you are trying to recombine claims that have previously been made, maybe a patent examiner will say, no, Bommarito and Riehl and Noah Reuben did that in 2023, that exact same claim combination, therefore you can't patent it. And in that way, patent is different than copyright, where in copyright you have to have had access to the actual copyrighted work to be able to say that you copied it. In patents, access is not necessary. You just put it on a public website, say on archive.org, and merely putting it on the website is publication, and therefore prior art.
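A toy Python sketch of that recombination idea follows; the patent numbers and claims here are invented placeholders, and a real run would of course pull claims from an actual patent database.

# Toy sketch of the claim-recombination idea: take one claim from each
# existing patent and emit every cross-combination as a new published
# "invention." The patents and claims below are invented placeholders.
from itertools import product

patents = {
    "US-0000001": ["a wireless sensor", "a battery-backed memory"],
    "US-0000002": ["a machine-learning classifier", "an encrypted channel"],
}

def recombined_claims(patent_claims):
    """Yield one claim drawn from each patent, in every combination."""
    for combo in product(*patent_claims.values()):
        yield "A system comprising " + ", ".join(combo) + "."

for claim in recombined_claims(patents):
    print(claim)  # publishing these on a public site is what creates prior art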
Doc Searls (00:31:39):
So, in a way, what it seems to be doing is basically making every possible recombination published prior art. Effectively, you've got infinite submarines, but you've also prevented any submarine from coming up under existing patents as well. Is that right?
Damien Riehl (00:31:58):
That's exactly right. Yeah. And if you are going to patent something that has never been claimed before, that is, if you're introducing new things, more power to you, right? Our project would not cover that. It's merely recombining all of the old patents' claims, trying to do a new thing with old wineskins, to mix a metaphor. That's where our project would come in. So if you're gonna innovate, more power to you. If you're just gonna recombine, maybe look out.
Katherine Druckman (00:32:25):
Oh, we have an interesting question from Jonathan Bennett, who is another Floss Weekly co-host, but he's in the back channel, and he has a question in particular about GitHub Copilot, which is a great topic to bring up a question about. Copilot is ingesting all the code, you know, of GitHub, basically everything. Maybe it's only sticking to open source licenses, but, you know, who knows, right? Is it sufficiently stripping the expression from the dataset to make it no longer copyrighted from the original?
Damien Riehl (00:33:03):
I love that that person brought up Copilot, because it's a great case. Yeah. And it's currently being litigated. So of course, the law is whatever a judge and jury decides the law is. So going forward on this topic, I'm just gonna be prognosticating like everybody's prognosticating, mm-hmm <affirmative>. But I would say that, much like we were talking about earlier with the books, where it's ingesting all of the ideas and jettisoning the expressions, I would say the same for the code, regardless of whether it's an open source license or a proprietary license that's being ingested within the GitHub repo. Really what Copilot is doing is taking the statistically likely next thing that happens. So essentially the output of GitHub Copilot is saying: if common, then output.
(00:33:52):
And so if common, then how copyrightable is that expression? That is, if it's a particular string of code that happens 10,000 times or a hundred thousand times, that is the thing that is most likely to be output by Copilot. So if it's used a hundred thousand times, even if they put it under a proprietary license, is that string of code really proprietary to any one person? And this goes back to the commons: or is it instead that we're all drawing from the same wellspring, and we happen to be using the same string that Copilot is outputting? And of course, there are gonna be little things here and there, where this string of code that GitHub Copilot outputs is maybe only used in 10 other things; of course, there are gonna be those types of scenarios. But for the most part, it's largely just outputting things that are common, therefore uncopyrightable,
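Here is a toy illustration of that "if common, then output" intuition, in Python. Real Copilot is a large neural model rather than a frequency table; this only demonstrates the statistical point that the most common continuation in a corpus is the one that gets suggested.

# Toy illustration of "if common, then output": suggest whatever most often
# follows a given prefix in a corpus. Real Copilot is a neural model, not a
# lookup table; this only demonstrates the statistical intuition.
from collections import Counter, defaultdict

corpus_lines = [
    "for i in range(len(items)):",
    "for i in range(len(rows)):",
    "for i in range(len(items)):",
    "for i in range(10):",
]

continuations = defaultdict(Counter)
for line in corpus_lines:
    head, sep, tail = line.partition("range(")
    if sep:
        continuations[head + sep][tail] += 1

prefix = "for i in range("
print(continuations[prefix].most_common(1))  # the most common continuation wins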
Katherine Druckman (00:34:48):
I guess every, every developer thinks their, their their own code is brilliantly unique. But I, I've seen, I've seen a few instances where people have claimed that GitHub copilot spit out their exact code block, you know, that, that they feel is unique enough that it should not have done that. And I think that's the kind of, I guess, you know, I guess as you say, we'll see, we'll see how these, these cases shake
Damien Riehl (00:35:11):
Out. And I would ask that person to do a Control-F, a find search, across GitHub, right.
Katherine Druckman (00:35:18):
All of
Damien Riehl (00:35:18):
GitHub. Yeah, that's right. How many? If that shows up 10,000 times, maybe you thought it was unique to you, maybe it's not.
Katherine Druckman (00:35:25):
Yeah.
Doc Searls (00:35:28):
So I'm curious, is that the case being litigated right now? Is that California, Doe versus GitHub, or is it something
Damien Riehl (00:35:35):
Else? It's John Doe. It's Doe v. GitHub is the, oh,
Doc Searls (00:35:38):
Yeah, it's Cal, yeah, California, Doe, I guess that's what it means.
Damien Riehl (00:35:40):
Yeah. Yep. Yeah, that's right. It is because the, the coders that were suing originally wanted to be wanted to be anonymous. So they they didn't want retribution based on their their lawsuit.
Doc Searls (00:35:52):
Yeah. And Jonathan just put in the back channel: GitHub did just introduce a new feature where you can get a warning when Copilot is suggesting something that's clearly taken from another project. So they're getting ahead of that, I guess, in some kind of way. That's true.
Damien Riehl (00:36:08):
And particularly for this case, for that GitHub Copilot case, Microsoft, OpenAI and others are relying on previous precedent, which is the Google Books case. And you might remember that Google scanned pretty much every book that ever existed, and the publishers and the authors said, hey, you can't scan that book, that book is clearly copyrighted. But Google argued, and the court agreed, that the use of scanning is transformative. That is, they weren't scanning it to be able to reproduce it and let people read it, like on a Kindle. What Google was instead doing was scanning it to create an index, to make the entire book corpus searchable. So the court said, because it's an index, that is a transformative purpose.
(00:36:55):
That is fair use; that is not copyright infringement. So anyway, the GitHub case is referring to that Google Books case, saying, well, you know, you can go to Google Books today and reproduce three or four pages verbatim, right? So that is fair use according to the Second Circuit. So why wouldn't GitHub's use, where it's taking not the expressions of three or four pages, but merely the ideas, the common ideas that may be used in 10,000 different lines of code, why, if Google Books is permissible, wouldn't GitHub reproducing those uncopyrightable ideas be similarly permissible?
Doc Searls (00:37:32):
So I remember back, it was in the aughts, I think, I don't think it was earlier, when Google was doing that. They were doing it with the Harvard Library, which is the largest academic library on earth, I think, plus Michigan. They also hit Michigan up, and they had plans to go farther, but they kind of halted it. I think it's kind of stuck at a certain point in history, and I remember it became a lot harder to use at some point. I guess maybe it's because of the transformative nature of it; they wanna keep it in a transformed form rather than in a useful form <laugh>. I mean, I wanna look through my own books that I've contributed to and try to find what I've said, and it's not as easy as it was when they first started doing it. But maybe, I dunno if that's a good thing or not.
Damien Riehl (00:38:18):
I think that's right. And I think that one of the reasons it stopped is, what is the commercial value for Google of making all the books searchable? Because of course you can make maybe some money on ads, maybe some money by pushing people to Amazon or someplace where they could buy the books. But I think it stopped not because of the usefulness, but just because of the commercial question: can you make money on the thing?
Doc Searls (00:38:41):
Yeah. So I think I wanna take one more break before we go and talk about SALI, which is another one of your projects. Okay. So let's talk about sali.org, which I think is similar to these other really subversive projects that you've come up with. <Laugh> mm-hmm <laugh> <laugh>. Tell us about that one. It's s-a-l-i.org.
Damien Riehl (00:39:06):
Sure. SALI is the Standards Advancement for the Legal Industry, s-a-l-i.org. And that is also free and open source. It's under the MIT license on GitHub. And what it is, is taking everything that matters in the law and putting it into an ontology, a taxonomy of all the things that matter. So, for example, the GitHub case is a copyright case in the Northern District of California, and it is arguing breach of contract. Each of those is a SALI tag. We are taking those three SALI tags and literally 13,000 other SALI tags and creating an ontology in RDF and OWL, which is represented in XML. And what that does is you can now ingest all 13,000 things: breach of contract, Northern District of California, also, you know, every cause of action, like breach of privacy, criminal matters, you know, manslaughter, murder. Each of these things that matters in the law is cataloged and is being used by large language models for retrieval augmented generation.
(00:40:12):
So the idea is that these are all tags, and I can tag up all the breach of contract motions to dismiss in the Northern District of California. And once you've retrieved the 20 or 50 documents that have those tags, you can then run the large language model across those and say, now give me the arguments and the cases statistically most likely to win for breaches of contract in the Northern District of California before this judge. So really, we've now made this all free and open source to help the legal industry not only quantify the things that matter, but also be interoperable amongst the clients and the law firms and the legal tech vendors, like the one I work for. And the only reason that we've made it free and open source is because that's the right thing to do <laugh>. And, for our open source audience: if we were to make it proprietary, and there are legal tech companies that I'm not going to name that have made their taxonomies proprietary, no one would adopt it, and therefore we would not permit the interoperability between law firms and clients and vendors.
(00:41:21):
But it's only because it is open source, and because it's really good, if I say so myself, that the entire legal community, not just in the United States but around the world, is now rallying around this open source tool that has 13,000 ontological things they can use freely. And that's the reason that it's being adopted so widely.
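For a sense of how tags like these get used in the retrieval-augmented workflow Damien just described, here is a minimal Python sketch; the tag strings and the summarize_with_llm() call are illustrative stand-ins, not actual SALI identifiers or any vendor's API.

# Minimal sketch of tag-driven retrieval-augmented generation: filter a
# document store by tags, then hand only those documents to a model. The tag
# strings and summarize_with_llm() are stand-ins, not real SALI identifiers.
documents = [
    {"id": 1, "tags": {"Breach of Contract", "N.D. Cal.", "Motion to Dismiss"}, "text": "..."},
    {"id": 2, "tags": {"Patent Infringement", "D. Del."}, "text": "..."},
]

def retrieve(docs, required_tags):
    """Keep only the documents carrying every required tag."""
    return [d for d in docs if required_tags <= d["tags"]]

def summarize_with_llm(docs, question):
    """Placeholder for a real LLM call constrained to the retrieved documents."""
    return f"Summary of documents {[d['id'] for d in docs]} for: {question}"

hits = retrieve(documents, {"Breach of Contract", "N.D. Cal."})
print(summarize_with_llm(hits, "Which arguments tend to win motions to dismiss here?"))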
Katherine Druckman (00:41:45):
And how, how long have you been, how long has this project been in existence?
Damien Riehl (00:41:50):
I started in January of 2020. So 2020 was a good year: the All the Music project started then <laugh>, and then SALI, I started with them. When I joined, they had 1,000 taxonomical items. I built it from 1,000 to 10,000, and now we're at almost 13,000.
Katherine Druckman (00:42:06):
Wow. And, and what, what is the adoption on that? Like,
Damien Riehl (00:42:09):
It's everyone; it's a who's who of the legal industry that's adopting it. So the largest law firms, the largest legal tech vendors like Thomson Reuters, Lexis, us at vLex, and then the largest corporations like Microsoft and others, they're all rallying around it. You know, my friend Jason Barnwell at Microsoft says that every time he has a business question, he's tagging it up using SALI, and then telling his law firms that are doing work for him: why don't you as a law firm tag it up using SALI, so that when you give me your work product, I can put it into my knowledge base, so the next time the business has that legal question, they can use those tags to find the things that matter.
Katherine Druckman (00:42:48):
Have you, are you hearing, like, you know, that the use of SALI has, I don't know, I <laugh>, not having any expertise in this field, solved something? Like, with the All the Music thing, you have at least some, you know, correlation, probably, with some high-profile cases. Do you have anything like that here?
Damien Riehl (00:43:10):
Well, I will say that because, in the legal industry, so many people have made so much money selling their taxonomies, which we at SALI are now giving away as a free and open source taxonomy, I've been told by major people in the industry that I have a target on my back <laugh>, saying, hey, we've had this gravy train of making a lot of money on this taxonomy, and now you're giving it away for free and making it much cheaper to do a thing that we've charged a lot of money for. And I said to that person, and I'll say to everyone here: if you're making money on a taxonomy that should be free, like Wikipedia is free, then you shouldn't be making money on that. And I'm not going to apologize for making free and open source something that should be free and open, or shouldn't be made money on. There are bigger problems we should be solving than taxonomies.
Katherine Druckman (00:44:02):
Do you, this is also from Jonathan in the back channel, but is this only US law, US cases?
Damien Riehl (00:44:10):
No, it's all of it. So my vLex is a worldwide company; we have 110 countries around the world. And so I'm running SALI, our open source tagging set, across our billion documents. And it's remarkably good. Everyone has murder, everyone has breaches of contract <laugh>, everyone has, you know, patent infringement, right? And so there are little variations here and there. For example, we have an Indian working group, about 120 lawyers that are working to implement SALI and see if there are any variations on the theme. And one example is that the Indian lawyers said, you don't have a thing called strict liability. I said, no, no, no, we have strict liability, it's strict product liability. And they said, no, that's not what we're talking about. We're talking about strict liability: if a car hits another car, you don't have to decide who's at fault as to whether the insurance company just pays. I said, oh, we call that a no-fault claim. And so really, what the United States calls no fault, India calls strict liability. So there are little wrinkles here and there, mm-hmm <affirmative>, that is, variation with every country. But SALI is universal, really, and we just accommodate those wrinkles as we come across them.
Doc Searls (00:45:21):
So you just mentioned vLex. I just wanna make clear that you're talking about your employer in this case, which is, when you speak in the first person plural, it's vlex.com. That's right. It's a worldwide company that specializes in AI, which is also interesting: AI-powered legal research. And so let's talk some AI for a while. 'Cause my own feeling about it is that we are in the very earliest days of this thing. But it seems like yesterday's news is old hat already, you know, and you mentioned GPT-4; we're gonna be at 5, 6, 7 in the next few months, probably. And they're not alone. If Facebook's on Llama, Apple has to be on something or they're doomed. You know, Google, of course, has its thing.
(00:46:18):
I'm finding more and more people I know are employed to do AI. Suddenly their jobs have changed. Where do you see this going? I mean, to me, it seems like it's a stage of the world becoming digital, and everything you've done with SALI, with All the Music and the rest of it, is kind of a massive adaptation in an industry to a digital world, reframing it on digital terms. But now we have machines that can emulate and use human expression in novel and very useful ways. It's very easy to imagine bad uses, but there is for everything. So that's old hat in that sense.
Damien Riehl (00:47:03):
Yeah, I would agree with you that we are in the very earliest days, and I vacillate between whether we are going to reach the utopia or the hellscape. I am not sure which we're gonna actually go to, but, or both,
Doc Searls (00:47:16):
You know, maybe,
Damien Riehl (00:47:17):
Maybe both, right? Simultaneously, depending on the class: maybe the highest class will get utopia and the lower class will get hellscape. But I would say that really, specifically in the legal domain versus the larger macro picture: in the legal domain, everything that we do is words. That is, we don't create widgets. All we create is words. We ingest words, we synthesize words, and then we output words. And it turns out that large language models do words really well. And much like the musicians in the All the Music project, who say, my melody is unique to me, until you point out to them, hey, that same melody was used, you know, by Bach or by Mozart, right? <Laugh> So something that we think is unique to us maybe is not necessarily unique to us.
(00:48:00):
So in the same way with lawyers: what we're finding is, through large language models, we at vLex have a billion legal documents that we're running through a vector database. And if you ask a legal question, we're now creating a tool to output a legal memorandum, much like an associate would output a legal memorandum. And the lawyers that I've shown this to said, you know, my God, you've done in one minute what my associate would've taken two days to output. That associate would've said, I'm a snowflake, you can't replicate what I've done. But this output is replicating what they have done, and doing it faster, better, and stronger. And so really, as I vacillate between, you know, our utopian future and our hellscape future, we could go two ways.
(00:48:44):
In the hellscape future, this is where all of a sudden, you know, we're gonna fire all the associate lawyers and just have the partners on top, and you're not gonna have worker bees anymore because you have this machine. That's a scarcity mindset. But on the other side, there's an abundance mindset, where you could say, now, because we've shrunk the cost of legal services, because we now have this tool to do in one minute what previously took two days, maybe legal services are gonna be cheaper, therefore the demand for legal services is gonna be higher. And then you'll actually have more work, that is, clients, because it's cheaper, will ask you more questions, and then you'll be able to generate more work in that way.
(00:49:28):
That's the abundance mindset. That is the utopian one. I hope that we have the abundance mindset. And that abundance mindset is much like the accountants in 1980, who had this amazing AI, it was called spreadsheets. And they said, my God, all I do each day is add and subtract numbers, and all of a sudden the spreadsheet can do in a second what took me a week. Where are our jobs gonna go? It's gonna eat our jobs. But the clients realized, wait, now it's only gonna take me a minute to get this thing back from the accountant rather than a week. I'm gonna ask question number two, run this scenario; question number three, run this scenario. And now there are more accountants than there ever have been, certainly more than in 1980. So that's the abundance mindset that I hope we as a society are gonna go to: as we shrink legal costs to a level that more people like you and me and everybody else can pay for, we'll actually have better access to justice in a way that is better for lawyers too.
(00:50:25):
In the back channel, we have,
Doc Searls (00:50:26):
Again, from Jonathan; we should probably just put him on the show, frankly. But anyway, there was a lawyer that got in quite a bit of trouble for using an LLM to create documents that hallucinated rulings that didn't exist. How do we avoid hallucinations in important output?
Damien Riehl (00:50:42):
And that is the right question to ask. And that lawyer, his name is Schwartz, sadly there's now a colloquialism within the legal community: boy, you really Schwartzed that up. <Laugh>
Doc Searls (00:50:53):
No. A poor guy.
Damien Riehl (00:50:54):
A poor guy.
Doc Searls (00:50:55):
It's like how the name Karen got destroyed, ruined basically completely.
Damien Riehl (00:50:58):
Exactly. Wow. So sadly, what that person did was ask ChatGPT a legal question and then say, give me cases for that. And it hallucinated not only the case names, but then, when the lawyer asked ChatGPT to give him the case itself, it hallucinated the entire case. So the lawyer had not actually read the case; he just read the ChatGPT hallucination of that case. And so rule number one, for any profession and especially the law, is use the right tool for the job, and ChatGPT is not the right tool for legal research. To answer the person's question, how do you avoid the hallucinations: option number one, the way to Schwartz it up, is to ask ChatGPT.
(00:51:38):
Option number two is to go to a legal database, like I have with my billion legal documents, and then use retrieval augmented generation to say, for breach of contract in the Northern District of California, on a case that relates to, you know, coders suing over licenses: you put that into vector space and retrieve the 12 cases that relate to that question. And then from those 12 cases, you say, go ahead and summarize those 12 cases, not the world that GPT has ingested, but limit your universe to just those 12 cases. And when you do that, large language models like GPT, like Llama and others, almost never hallucinate, because you've constrained the universe. So I think that's the solution: not to ask it, you know, for creative things, don't bullshit me, but instead to constrain it to a retrieval augmented generation that will not hallucinate.
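A rough Python sketch of that retrieval-augmented recipe might look like the following; the embed() and ask_llm() functions are hypothetical stand-ins for a real embedding model and a real LLM call, and the case texts are placeholders.

# Sketch of the retrieval-augmented recipe described above: embed the
# question, pull the closest cases from a small vector index, and constrain
# the model to only those cases. embed() and ask_llm() are hypothetical
# stand-ins, and the case texts are placeholders.
import numpy as np

def embed(text: str, dim: int = 16) -> np.ndarray:
    """Toy encoder standing in for a real embedding model."""
    rng = np.random.default_rng(abs(hash(text)) % (2 ** 32))
    return rng.normal(size=dim)

cases = [
    "Case A: breach of contract, N.D. Cal. ...",
    "Case B: software license dispute between developers ...",
    "Case C: unrelated trademark matter ...",
]
index = np.stack([embed(c) for c in cases])

def retrieve(question, k=2):
    """Return the k cases closest to the question in vector space."""
    q = embed(question)
    sims = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    return [cases[i] for i in np.argsort(-sims)[:k]]

def ask_llm(prompt):
    """Placeholder for a real LLM call."""
    return "(model answer constrained to the supplied cases)"

top = retrieve("Coders suing over licenses in the Northern District of California")
prompt = "Answer using ONLY these cases:\n" + "\n".join(top)
print(ask_llm(prompt))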
Doc Searls (00:52:33):
And
Damien Riehl (00:52:36):
I'm not sure if I should've said that. And that BS word is something that's been in lots of academic papers, so I hope that's okay for your audience,
Doc Searls (00:52:45):
<Laugh>. So did we cover Getty Images versus Stability AI? I don't think
Damien Riehl (00:52:53):
We have or
Doc Searls (00:52:55):
Your own argument applying AI to derivative works, which is also pretty freaking interesting.
Damien Riehl (00:53:01):
Yeah, I would say, yeah. So with Getty Images, much like the coding, you have a question as to whether the ingestion of all the images is taking the ideas of the thing or the expression of those ideas. And so really, you know, the Getty Images versus Stability AI case has a famous picture of soccer players, you know, doing a soccer player thing, and then the Getty Images logo at the bottom of it. So I would say that a soccer player doing soccer player things is merely an idea, not the expression of the idea. But then the Getty Images logo, when that's at the bottom of many of these, why is the generative AI outputting the Getty Images logo? It's because it's in the hundreds of thousands, if not millions.
(00:53:44):
That logo is in millions of things. So the model has said, oh, if you look at a soccer player, the Getty Images logo is an attribute of a soccer player. So it's essentially thinking that the Getty Images logo is an idea that it then reproduces in the output. So really, if I were the lawyers representing Stability AI, I would say, you know, if common, uncopyrightable: the Getty Images logo is pretty common. Similarly, if it is just the idea of a soccer player kicking a ball, that is not copyrightable, therefore you can't sue about that either, but
Katherine Druckman (00:54:22):
The original image, though. I suppose, yeah, I mean, the output is generated, but the original images are expressions of ideas. I mean, they're created by a photographer, are they not?
Damien Riehl (00:54:34):
They are. That's right. And so that gets back to: is a book ingested by Google Books copyrightable? Of course it is, right <affirmative>. But what is Google Books extracting out of it? And with Stability AI, is the output the expression of that particular photograph, or the idea of that photograph and 10,000 others that are of a soccer player kicking a ball?
Katherine Druckman (00:54:56):
Right. It is trained on expressions, from which it extracts ideas.
Damien Riehl (00:55:01):
That's exactly right. And those ideas, in geek speak, and I know that we can speak geek here, they are in the transformer weights. That's the vector space. The idea of soccer-player-ness lives somewhere in vector space. Right. And
Katherine Druckman (00:55:16):
Vector weights. I'm far more comfortable with that speak than the legal speak <laugh>.
Damien Riehl (00:55:20):
Indeed.
Katherine Druckman (00:55:21):
That's a better way to understand it.
Damien Riehl (00:55:22):
But yeah, that's one of the things that geek lawyers like me do: you have to translate the geek speak for the lawyers, and for the judges and the juries, who certainly don't know the geek speak. That's what makes tech lawyers' jobs hard, because you have to take this arcane technological thing, connect it to an arcane legal thing, and try to explain it to normal humans who don't care about either of those.
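To make the point about soccer-player-ness living somewhere in vector space concrete, here is a small sketch that reuses an off-the-shelf sentence-embedding model as a stand-in. The model name and the example sentences are assumptions, chosen only to show that two different expressions of the same idea land near each other in vector space, while an unrelated sentence lands farther away.

```python
# Sketch: two different "expressions" of the same idea sit close together in vector space.
# Requires: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model (an assumption)

sentences = [
    "A footballer strikes the ball toward the goal.",  # expression one of the idea
    "A soccer player kicks a ball during a match.",    # expression two of the same idea
    "A chef chops onions in a restaurant kitchen.",    # an unrelated idea
]
vectors = model.encode(sentences)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print("same idea, different expressions:", cosine(vectors[0], vectors[1]))
print("different ideas:", cosine(vectors[0], vectors[2]))
# The first similarity is typically much higher: what the vectors capture is the
# shared soccer-player-ness, not the exact wording of either sentence.
```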
Katherine Druckman (00:55:45):
And they're doing the Lord's work, as they say, <laugh>. That's right. That's right. Somebody's got to do that. Right.
Doc Searls (00:55:51):
So we're getting down toward the end of the show here, and I'm wondering if there are any topics we have left uncovered that you'd like to cover before we get off. And to put that into context, because it might make a stronger question: it seems like you're living in the future more than a lot of other people we know <laugh>, sort of normalizing where we're gonna have to go, putting the stones in the path that we're gonna step on to make our way across the raging waters. What will we be talking about two years, three years from now, or even a year?
Damien Riehl (00:56:30):
I think of large language models and generative AI largely as being a tidal wave that, mm-hmm <affirmative>, we can run in front of for a while, but maybe eventually the wave is just gonna crash over us <laugh>. That's my pessimistic view, right? That's my scarcity view. But I think that if we think about what we as humans provide as value to the world, the question we should be asking is, what can I do that the machine cannot yet do? And whatever that is, chase that until the machine can do that thing, and then move on to the next thing. So I think that what generative AI has done for lawyers, and maybe doctors and writers, is make them say, gosh, you know, if this gives a pretty good first draft, and I'm doing first drafts, what good am I?
(00:57:14):
And my answer to that is, you know, do better than GPT, which today isn't that hard. People don't buy your books because you've output generalized text. They buy your books because of your voice, the human voice that they can relate to. And so if you can do that thing that the machine can't yet do, then you're doing well. Take Bob Dylan, right? Will somebody buy a machine's output where someone said, create a song in the style of Bob Dylan with a melody that sounds like Bob Dylan? Is anybody gonna pay money for that? No. They care about Bob Dylan <laugh>. They don't care about a machine's, mm-hmm, synthesis of Bob Dylan. So anyway, if we are not gonna have this tidal wave crash over us, and if we're not gonna be in this hellscape, it will be because humanity realizes that there are human things that the machines can't yet do, things we should keep chasing for as long as we can before the wave crashes over us.
Doc Searls (00:58:10):
And there seems to be no limit to the number of human beings who imitate Bob Dylan as well, <laugh> without being Bob.
Damien Riehl (00:58:17):
That's right. And think about it: if Bob were to sue those humans, would he win? No, because you're just stealing the idea of Bob Dylan. You're not stealing copyrightable things.
Doc Searls (00:58:28):
And people are walking around with ideas about everything at all times anyway; that's the kind of profiling we use to understand the world, and AI is used that way as well. They just ingest all of it and cough back something similar. My own thought about this: I'm an optimist. Two things about that. One was, I was sitting in a class by Clay Shirky at NYU where he tasked students to come up with ideas for things. And one student said, I can't give you this idea, 'cause I can think of so many bad uses for it. And he said, it must be a great idea, because only great ideas have bad uses <laugh>. You know, so like email. What a great idea. 99 point x percent of it is spam, but, you know, we found ways to deal with that.
(00:59:12):
But what I'm looking for, what I want, and this is sort of a wish I just keep throwing out there, is my own damned AI. It's like we're in 1974 and we haven't invented the personal computer yet, and personal computer is an oxymoron; you know, only mainframes are real computers. Well, it's kind of like that: we don't need just ChatGPT. I want AI for my own life here, you know. Scan the books behind me here, scan the spines of those, tell me what those are, and let's match 'em up with the receipts I have over in a box here. I'd love to shove those through a reader that can make sense of all of 'em and say, this one came from CVS, this one's just... I'm sorry, beep. This one's just... I'm from New Jersey.
(00:59:56):
It can't help it. But, you know what I'm saying? Know our own lives. Because I think once we know our own lives better, and not because they want us to go buy the next thing, which is kind of where the advertising world wants to go: you are buying something at all times, we're gonna show ads at all times, and we're gonna understand you as well as possible so we can make you buy more stuff. That's not how the world works. The world works in a much more, you know, we're owning things all the time, we're not buying things all the time, and a lot of the things we own, we don't really own either. We're just renting them, or we just have them because they're in our possession, or what we think is our possession at the moment. And that sphere of possession, I think, is something you're blurring very well along with the whole open source community <laugh>, and why we're still in business with this show.
Damien Riehl (01:00:42):
I think that's right. And I love your optimistic take on open source and the idea of large language models. Those who have been following the space know that, you know, GPT of course was solid, followed soon after by Dolly, D-O-L-L-Y, version two, which is free and open source, MIT licensed, which was followed by MPT, free and open source, and then followed by, of course, Llama, which is free and open source up to a certain point. So all of those models, at least for legal tasks, from what I've heard and what people who know better than I do say, are at about GPT-3.5 level. So it's not quite GPT-4, but it's about 3.5, and it's all free.
(01:01:19):
It's all open source. So I think that the open source community has really done a fantastic job of chasing the proprietary models and giving you, Doc, and you, Katherine, and anyone listening to this podcast who wants to do it, the ability to download large language models to their own machines and do with them what they want. Mike Bommarito, who is the guy who beat the bar exam (you might've heard that GPT-4 beat 90% of humans on the bar exam; that's my friend Mike Bommarito along with Dan Katz and others), Mike, when I was in a room at Stanford with him, downloaded one of those open source models onto his laptop, then ran the bar exam through that open source model, and it got the first five questions 100% correct. Free and open source beat the bar exam as well. So really, I think that the future, Doc, that you're looking for is probably not too far behind. It's not gonna be as advanced as the proprietary models, probably, but it's gonna be following close behind, and I think that open source world is good for all of us.
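The download-it-yourself workflow Damien describes looks roughly like the sketch below, using the Hugging Face transformers library. The particular checkpoint, prompt, and generation settings are placeholder assumptions, not whatever setup was actually used at Stanford.

```python
# Sketch: running an openly released language model on your own machine.
# Requires: pip install transformers torch  (plus enough RAM for the model you pick)
from transformers import pipeline

# The model id is just one example of an openly released checkpoint; Dolly v2,
# MPT, and the Llama family are all published under their respective licenses.
generator = pipeline("text-generation", model="databricks/dolly-v2-3b")

prompt = (
    "A contract requires delivery by June 1 and delivery happens on June 15. "
    "In one short paragraph, is this a breach of contract?"
)
result = generator(prompt, max_new_tokens=150, do_sample=False)
print(result[0]["generated_text"])  # everything runs locally, no API key required
```

As Damien notes, answer quality will trail the proprietary models, but nothing leaves your own machine.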
Doc Searls (01:02:22):
I think we actually are out of time at this point, or very close. So here's something Jonathan also likes to ask: what's the weirdest use you've seen so far of any of these incredible goods that you're creating?
Damien Riehl (01:02:39):
I've talked about my friend Mike Bommarito a lot. The weirdest use that I've seen thus far is also from Mike, where he took a very dry topic, the Federal Register. Anyone who's read the Federal Register knows it's just, you know, federal regulations put out by obscure federal agencies in the driest lawyer speak you've ever heard. So he used a large language model to take the Federal Register and said, summarize this like a chill pirate lawyer. And the output says things like, you know, sorry to cruise your morning tide, but here are the updates, and it created this California pirate speak out of dry legal language that was really amazing for people to read. It's weird, but it's also indicative of how weird our large language model future is going to be.
(01:03:29):
Because it can take, essentially, we were talking about ideas. There's a great cartoon where this guy says, hey, I took my one bullet point and turned it into an email that nobody reads <laugh>, right? And then the person on the receiving end says, look, I took this email and I turned it into one bullet point that I could pretend that I read. And so the fact that it started with a bullet point, we put something in the middle, and then it ended in a bullet point is indicative of where the world is going. Because really, why did we even deal with the thing in the middle? Why don't we start with the idea and then explain it like a third grader, or like a chill pirate lawyer, or explain it to me like I'm a, you know, a kindergarten teacher, right? If all that matters is the ideas, maybe the ways that we communicate are gonna be vector embeddings <laugh>, right? Merely the ideas themselves. And then you can interpret my vector embedding however you want. You can interpret it like a chill pirate lawyer or a kindergarten teacher. I think that's the most weird and profound thing about where we are today: ideas matter, and the expressions in the middle really don't.
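That round trip from idea to expression and back is easy to see in a prompt template. The snippet below only builds the prompts; the bullet point, the voices, and the restyle helper are hypothetical, and the resulting strings would be handed to whichever model you prefer, proprietary or local.

```python
# Sketch: the same idea rendered in different "voices" via a prompt template.
BULLET = "New reporting rule: facilities over 500 tons per year must file by March 31."

def restyle(idea: str, voice: str) -> str:
    """Build a prompt that keeps the idea fixed and swaps only the expression."""
    return (
        f"Rewrite the following point in the voice of {voice}. "
        f"Do not add or remove any facts.\n\nPoint: {idea}"
    )

print(restyle(BULLET, "a chill pirate lawyer"))
print(restyle(BULLET, "a kindergarten teacher"))
# Round trip: ask the model to compress either styled output back into one bullet
# point and you should land near the original idea; the middle layer was disposable.
```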
Doc Searls (01:04:37):
Well, this has been a fabulous hour. And before we go, a question we ask everybody: what are your favorite text editor and scripting language? <Laugh> It sounds like you do at least some programming.
Damien Riehl (01:04:50):
I do, and I've been coding; I started in BASIC in 1985 at age 10. But anyone who works with me will say that I'm the worst coder they've ever met, and they're probably right. So I would say that I dabble with Python and I use VS Code for an IDE, but I'm not good. Copilot makes me better, but I'm awful.
Doc Searls (01:05:10):
<Laugh>. Excellent. Well, thanks so much for being on the show. We will have to have you back, and I suspect it may be by popular demand as well as a simple need to catch up on what you're doing ahead of everybody else. <Laugh>
Damien Riehl (01:05:26):
I'm grateful you had me on. Thank you so much.
Doc Searls (01:05:28):
Yeah, thank you. So, Katherine, how was that for you? Different for you than for me?
Katherine Druckman (01:05:36):
That was great. Yeah, you know, a lot has changed in the last three years. We keep saying time has no meaning, but many, many things have changed. The whole conversation around AI, well, we weren't really having it in 2020. I mean, some of us were. It was all pandemic; we were identifying objects in our surveillance cameras, and we were talking about things like, you know, bias and privacy issues, and whatever that thing was that was using all of our Flickr photos and then identifying people, mm-hmm. So we were kind of having that conversation, but generative AI has completely changed it; that conversation has completely changed. And with those changes come big legal questions. So it's really nice to have somebody with that kind of expertise weigh in on things that many of us are just completely clueless about. Or even if we know something, it's just the tip of the iceberg.
Doc Searls (01:06:34):
There are sort of two metaphors, you know: the genie that's come out of the bottle, except it actually has more bottles and more genies, and it doesn't stop; and the sorcerer's apprentice, which is sort of the same thing <laugh>, you know? Yeah. Only the sorcerer's not gonna come back and, you know, turn all those shards into one thing again. I think we're in new territory here. Yeah. So the interesting thing to me... yeah, go ahead. Sorry.
Katherine Druckman (01:07:01):
No, no. It's just so new. Really, my mind's blown every day.
Doc Searls (01:07:06):
Okay, so we actually went a bit long, so gimme your plug. And, oh,
Katherine Druckman (01:07:11):
Right, sure. Other podcasts, that's the only thing I have to plug. Please listen to Open at Intel and Reality 2.0 if you enjoy the sound of my voice.
Doc Searls (01:07:21):
<Laugh>, which is your new microphone. I have...
Katherine Druckman (01:07:25):
My new microphone is
Doc Searls (01:07:26):
Better than ever. So, and I'm actually prepared this time: the guest next week is back also by kind of popular demand. Somebody told me that she not only met me in North Carolina, but <laugh> actually loves to listen when Dave Täht is on. So Dave Täht's gonna be on next week. We're talking space, fun hacks, projects. We're gonna geek out with Dave next week. So that's coming up. Dan Lynch is gonna be the co-host, so come back for that. Until then, I'm Doc Searls and we'll see you then.
Scott Wilkinson (01:08:04):
Hey there. Scott Wilkinson here. In case you hadn't heard, Home Theater Geeks is back. Each week I bring you the latest audio/video news, tips and tricks to get the most out of your AV system, product reviews and more. You can enjoy Home Theater Geeks only if you're a member of Club Twit, which costs seven bucks a month. Or you can subscribe to Home Theater Geeks by itself for only $2.99 a month. I hope you'll join me for a weekly dose of home theater geekitude.