In theory, could you then just register as an AI company and pirate anything?
Well no, just the largest ones who can pay some fine or have nearly endless legal funds to discourage challenges to their practice. This brings a form of pretend business moat. The average company won’t be able to and will get shredded.
What fine? I thought this new law allows it. Or is it one of those instances where training your AI on copyrighted material and distributing it is fine but actually sourcing it isn’t, so you can’t legally create a model but also nobody can do anything if you have and use it? That sounds legally very messy.
You’re assuming most of the commenters here are familiar with the legal technicalities instead of just spouting whatever uninformed opinion they have.
You can already just pirate anything. In fact, downloading copyrighted content is not illegal in most countries; just distributing is.
That would be hilarious if someone made a website showing how they are using pirated Nintendo games (complete with screenshots of the games, etc) to show how they are “training” their AI just to watch Nintendo freak out.
No, because training an AI is not “pirating.”
If they are training the AI with copyrighted data that they aren’t paying for, then yes, they are doing the same thing as traditional media piracy. While I think piracy laws have been grossly blown out of proportion by entities such as the RIAA and MPAA, these AI companies shouldn’t get a pass for doing what Joe Schmoe would get fined thousands of dollars for on a smaller scale.
In fact, when you think about the way organizations like RIAA and MPAA like to calculate damages based on lost potential sales they pull out of thin air, training an AI that might make up entire songs that compete with their existing set of songs should be even worse. (Not that I want to encourage more of that kind of bullshit potential sales argument.)
The act of copying the data without paying for it (assuming it’s something you need to pay for to get a copy of) is piracy, yes. But the training of an AI is not piracy because no copying takes place.
A lot of people have a very vague, nebulous concept of what copyright is all about. It isn’t a generalized “you should be able to get money whenever anyone does anything with something you thought of” law. It’s all about making and distributing copies of the data.
Where does the training data come from seems like the main issue, rather than the training itself. Copying has to take place somewhere for that data to exist. I’m no fan of the current IP regime but it seems like an obvious problem if you get caught making money with terabytes of content you don’t have a license for.
the slippery slope here is that you as an artist hear music on the radio, in movies and TV, commercials. All this hearing music is training your brain. If an AI company just plugged in an FM radio and learned from that music I’m sure that a lawsuit could start to make it that no one could listen to anyone’s music without being tainted.
That feels categorically different unless AI has legal standing as a person. We’re talking about training LLMs, there’s not anything more than people using computers going on here.
So then anyone who uses a computer to make music would be in violation?
Or is it some amount of computer generated content? How many notes? If it’s not a sample of a song, how does one know how many of those notes are attributed to which artist being stolen from?
What if I have someone else listen to a song and they generate a few bars of a song for me? Is it different that a computer listened and then generated output?
To me it sounds like artists were open to some types of violations but not others. If an AI model listened to the radio most of these issues go away unless we are saying that humans who listen to music and write similar songs are OK but people who write music using computers who calculate the statistically most common song are breaking the law.
A lot of the griping about AI training involves data that’s been freely published. Stable Diffusion, for example, trained on public images available on the internet for anyone to view, but led to all manner of ill-informed public outrage. LLMs train on public forums and news sites. But people have this notion that copyright gives them some kind of absolute control over the stuff they “own” and they suddenly see a way to demand a pound of flesh for what they previously posted in public. It’s just not so.
I have the right to analyze what I see. I strongly oppose any move to restrict that right.
It’s also pretty clear they used a lot of books and other material they didn’t pay for, and obtained via illegal downloads. The practice of which I’m fine with, I just want it legalised for everyone.
I’m wondering when i go to the library and read a book, does this mean i can never become an author as I’m tainted? Or am I only tainted if I stole the book?
To me this is only a theft case.
And what of the massive amount of content paywalled that ai still used to train?
If it’s paywalled how did they access it?
the training of an AI is not piracy because no copying takes place.
One of the first steps of training is to copy the data into the training data set.
This isn’t quite correct either.
The reality is that there’s a bunch of court cases and laws still up in the air about what AI training counts as, and until those are resolved the most we can make is conjecture and vague moral posturing.
Closest we have is likely the court decisions on music sampling and so far those haven’t been consistent, and have mostly hinged on “intent” and “effect on original copy sales”. So based on that logic whether or not AI training counts as copyright infringement is likely going to come down to whether or not shit like “ghibli filters” actually provably (at least as far as a judge is concerned) fuck with Ghibli’s sales.
court decisions on music sampling and so far those haven’t been consistent,
Grand Upright Music, Ltd. v. Warner Bros. Records Inc. (1991) - Rapper Biz Markie sampled Gilbert O’Sullivan’s “Alone Again (Naturally)” without permission
Bridgeport Music, Inc. v. Dimension Films (2005) - any unauthorized sampling, no matter how minimal, is infringement.
VMG Salsoul v. Ciccone (2016) - to determine whether use was de minimis it must be considered whether an average audience would recognize appropriation from the original work as present in the accused work.
Campbell v. Acuff-Rose Music, Inc. (1994) - This case established that the fact that money is made by a work does not make it impossible for fair use to apply; it is merely one of the components of a fair use analysis
That case is about fair use for parody.
Case law suggests using AI for parody is legal.
So streaming is fine but copying not
Streaming involves distributing copies so I don’t see why it would be. The law has been well tested in this area.
Well how does the AI company consume the content?
Which company is “the AI company?”
Copyrighted material can be used or reproduced only with a license that allows for it. If the license forbids you from using the copyrighted material for business purposes, and you do it anyway, then it’s pirating.
Well I agree in principle (I disagree that AI training is necessarily “stealing”), but downloading copyrighted material for which you do not own a license is textbook piracy, regardless of intent
hello yes I’m an ai company. let me torrent all the things pls thank you
That’s exactly what Meta did, they torrented the full libgen database of books.
If they can do it, anybody should be able to do it.
I like how their whole excuse to that was “WE DIDN’T SEED ANY OF IT BACK THOUGH” which arguably makes it even worse lol.
It doesn’t. You can download anything you want, distribution is what is illegal and criminal.
Downloading is still infringement. Distribution is worse, but I don’t think it’s a criminal matter, still just civil.
Maybe in some weird countries.
Torrent means you download and also upload to others when you have some parts.
No, not really. First of all, you can disable uploads. Second, you can use a seed box hosted in a country which doesn’t prosecute uploaders. So, you can be clean for all legal intents and purposes.
Technically it was never illegal in the US to download copyrighted content. It was illegal to distribute it. That was literally Meta’s defence in court: they didn’t seed any downloads.
they didn’t seed any downloads
So Meta, 100% leeching.
Yeah no, only a select few special Ai companies, of course
My mind is AI and I need this content to train it.
I’m not sure if my brain counts as artificial, but with all the microplastics, it sure ain’t organic.
It’s like the goal is to bleed culture from humanity. Corporate is so keen on the $$$ they’re willing to sacrifice culture to it.
I’ll bet corporate gets to keep their copyrights.
Absolute fastest way to kill this shit? Feed the entire Disney catalog in and start producing knockoff Disney movies. Disney would kill this so fast.
With a mercenary death squad, probably.
That’s exactly what i was just thinking.
Where’s Disney in all of this? Probably getting in on it tbh
Good point. I wouldn’t be surprised if they have deals with all the ai companies.
https://futurism.com/the-byte/disney-mocked-fake-cgi-actors-crowd-scene
Using AI to cook up some fake actors as of a couple years ago.
The record companies already have all the data and all the rights. Petitions like these are meant to rig the game in their favor, so we get the official Warner Music AI at a high price point with licensing fees, and anything open source is deemed illegal and can’t be used in products.
If you’re on the side that stands with Disney, you are probably on the wrong one.
Can the rest of us please use copyrighted material without permission?
As long as you use AI to generate it
The AI just gives you a 1:1 copy of its training data, which is the material. Voilà.
Yes.
God I hope so.
On the other hand copyright laws have been extended to insane time lengths. Sorry but your grandkids shouldn’t profit off of you.
It’s never the grandkids. The Beatles sold the rights to their songs.
I mean they were trained on copyrighted material and nothing has been done about that so…
It only seems to make a difference when the rich ones complain.
What is the actual justification for this? Everyone has to pay for this except for AI companies, so AI can continue to develop into a universally regarded negative?
why do you say AI is a universally regarded negative?
Edit: if you’re going to downvote me, can you explain why? I am not saying AI is a good thing here. I’m just asking for evidence that it’s universally disliked, i.e. there aren’t a lot of fans. It seems there are lots of people coming to the defense of AI in this thread, so it clearly isn’t universally disliked.
Because overall people don’t like it, particularly when it comes to creating “art.”
I am aware of a lot of people who are very gung-ho about AI. I don’t know if anybody has actually tried to make a comprehensive survey about people’s disposition toward AI. I wouldn’t expect Lemmy to be representative.
Because pretty much nobody wants it or likes it.
That’s just not true, chatgpt & co are hugely popular, which is a big part of the issue.
Nazism was hugely popular in Germany in the early 20th century, but was it a good thing?
You do realize the root of this thread was this question, right?
why do you say AI is a universally regarded negative?
In the early 20th century, Nazism was not a universally regarded negative.
Hugely popular, mostly with a bunch of dorks nobody likes that much.
People are getting the message now, but when it first came out, there were so many posts about what ChatGPT had to say about the topic, and the posters never seemed to understand why nobody cared.
I want it and I like it. I’ve been using llms for years now with great benefit to myself.
Like any tool one just needs to know how to use them. Apparently you don’t.
I think you’re mistaken – there are a large number of people who vehemently dislike it, which is probably why you think that.
I don’t know the rest but I hate the spending of resources to feed the AI datacenters. It’s not normal building a nuclear powerplant to feed ONE data center.
You’ve explained your personal opinion, and while I think it’s a sensible opinion, I was asking about the universal opinion on AI. And I don’t think there is a consensus that it’s bad. Like I don’t even understand how that’s controversial – everywhere you look, people are talking about AI in broadly mixed terms.
AI doesn’t copy things any more than a person copies them by attending a concert or museum.
This is such a bizarre rejection of reality
This is 100% correct. You can downvote this person all you want but they’re not wrong!
A painter doesn’t owe anything to the estate of Rembrandt because they took inspiration from his paintings.
So if you take away all the copyrighted training data then it makes the same images?
No. And that’s the point…
So if it can’t function properly without other people’s work deciding what the art will look like that’s called copying.
If human beings get shit for copying famous art or tracing we need to hold AI to the same standard.
Copy
- noun a. an imitation, transcript, or reproduction of an original work (such as a letter, a painting, a table, or a dress) b. one of a series of especially mechanical reproductions of an original impression c. matter to be set especially for printing; also: something considered printable (such as an advertisement or news story)
- verb a. to make a copy or copies of b. to model oneself on c. to transfer (data, text, etc.) from one location to another, especially in computing
I can’t believe I just had to provide you with a definition of the word copy.
Are you freaking serious!!!
Being inspired by and creating an original production is not the same as copying if that original work is inspired by other artists!!!
By your definition of copying because Elvis Presley was inspired by Muddy Waters they made the exact same music!
LLMs don’t produce copyrighted material they take inspiration from the training data so to speak. They create original productions.
In the same way that you can envision the Mona Lisa in your head but you couldn’t paint it by hand.
You know copying literal brushstrokes and traces identifiable from real artists is different than being inspired, it’s amazing the level of denial you cultists will self induce to keep it making sense.
Your god is not valuable enough to give more rights than human beings. Sorry
I don’t care what techbro conmen told you.
AI will never be a replacement for actual creativity, and is already being legislated against properly in civilized countries.
You need to learn how your god functions.
If it needs training data then it is effectively copying the training data.
How funny this is gonna get when AI copyrights Nintendo stuff. Ah man I got my popcorn ready.
They’re not gonna do anything about it for the same reason any other litigious company hasn’t done anything thus far. They’re looking to benefit from AI by cutting costs. If the tech wasn’t beneficial to these big tech conglomerates they would’ve already sued their asses to oblivion, but since they do care they’ll let AI train on their copyrighted material.
“Generate a movie in the style of star wars”
But you, casual BitTorrent, eDonkey (I like good old things) and such user, can’t.
It’s literally a law allowing people doing some business to violate a right of others, or, looking at that from another side, making only people not working for some companies subject to a law …
What I mean - at some point in my stupid life I thought only individuals should ever be subjects of law. Where now the sides are the government and some individual, a representative (or a chain of people making decisions) of the government should be a side, not its entirety.
For everything happening a specific person, easy to determine, should be legally responsible. Or a group of people (say, a chain from top to this specific one in a hierarchy).
Because otherwise this happens, the differentiation between a person and a business and so on allows other differentiation kinds, and also a person having fewer rights than a business or some other organization. And it will always drift in that direction, because a group is stronger than an individual.
And in this specific case somebody would be able to sue the prime minister.
OK, it’s a utopia, similar to anarcho-capitalism, just in a different dimension, in that of responsibility.
should start up our own ai company anyone is free to join
I identify as an AI company ☠️
no no, i mean people should actually start utilizing this bullshit. Anyone can start a company and with some technical know-how you can add some kind of ai crap to it. companies don’t have to make profit or anything useful so there is no pressure to do anything with it.
But if it comes to copyright law not applying to ai companies, why should some rich assholes be the only ones exploiting that? It might lead to some additional legal bullshit that excludes this hypothetical kind of ai company, but that would also highlight better that the law benefits only the rich.
Thought experiment: What if AI companies were allowed to use copyrighted material for free as long as they release their models to the public? Want to keep your model private? Pay up. Similar to the GPL.
Fun fact: Copyright is also the basis on which you enforce copyleft provisions such as those in the GPL. In a world without copyright, there are no software licenses, let alone copyleft.
I know it’s very challenging for “this community” (FOSS users & developers let’s say) because a significant number of them also support shadow libraries such as Sci-Hub and Library Genesis and Anna’s Archive so how do we reconcile “copyleft (therefore copyright) good” with “copyright bad”?
I don’t have a clear answer yet but maybe the difference is as simple as violating copyright for personal purposes vs business purposes? Anyway…
The GPL uses copyright because it’s the legal mechanism available to enforce the principles that the GPL wants to enforce. It’s entirely consistent to believe that copyright shouldn’t exist while also believing that a law should exist to allow/enforce the principles of the GPL.
That’s fair! Though I find it (new laws that enforce the principles of copyleft) pretty unlikely so I’d much prefer a world with copyright + copyleft (GPL) than a world without either where mega corporations can exploit the commons without being obliged to share back.
It’s literally called copyright because it’s about the rights to copy something. The new law would still be a form of copyright.
Without copyright there would be no need for copyleft. It’s right there in the name.
Without copyright there would be no need for copyleft. It’s right there in the name.
It sounds plausible but it’s wrong. Without copyright, you are allowed to copy, use, and distribute all digital works regardless but being legally allowed doesn’t mean (a) that you are able to (e.g. copying might be ~impossible due to DRM and other security measures) and (b) that you are entitled to the source code of such work so someone can take your FOSS code, put it in their proprietary software, and then distribute only the binaries.
Copyleft licenses, through copyright, enforce sharing.
The whole point for many, me included, is for everyone to be able to use any works in any way we want. Including putting “open source” code into “proprietary” binaries. Because there are no proprietary binaries without IP protections - everyone can just decompile the code and reuse it.
I don’t think it’s accurate to say that everyone can just decompile the code and reuse it. Decompiling and reverse engineering a binary is incredibly hard. Even if you do that there are some aspects of the original code which get optimised out in the compiler and can’t be reproduced from just the binary.
As someone who has extensive experience with decompiling, I can say that working with binaries is usually a lot easier than with source code.
“Yeah, well, you know, that’s just, like, your opinion, man.”
The copyright industry would never accept that. Where’s the money for them?
It still devalues the work of individual creators.
Oh good I see Labour are dealing with the real issues in society.
Modern Labour and not giving a fuck about workers, name a more iconic duo.
Please, save the copyright industry! If using these for AI isn’t made ridiculously expensive, we will never be able to build a proper monopoly on top of this tech!
They get popular artists to sign these things but it’s the record companies (all three of them) that are really behind this.