Do you feel the same about spreadsheet software reducing the need for accountants? What about textile workers creating fabric for clothing by hand? Or do we only romanticize artists such that they're entitled to an income?
Of course not - you are confusing competition with theft.
It's the difference between you writing a piece of competing software and taking my market share, versus you stealing my software and selling it as your own.
Or you attempting to write a competing best seller versus copying mine wholesale and selling under your own name on Amazon.
One activity is ultimately creative destruction that pushes society forward - the other is simply destructive free riding.
ie theft involves no creation of value.
Now the situation with tech companies is that there is theft, but also value add ( like somebody stealing your book and putting a better cover on it online ).
Just because there is some value add doesn't forgive the theft.
Did you edit your post above? Even slightly? Because I was replying specifically to the idea that artists have a right to a livelihood, and I think you changed a few words to add the copying bit. I should be more consistent about pasting in the text I'm replying to.
Regardless, you really should learn the difference between "theft", "copyright violation", and "training", because they are different things.
If I read a bunch of books, learn from the authors' use of word phrasing, then pick one of Vonnegut's 8 Story Shapes, and make a book in some author's style, it's not illegal. I don't see why I can't have a computer do that for me.
None of this really matters. If you feel strongly about it, you should go bribe congress to make a law. Because the existing laws about theft and copyright don't cover "learning from billions of examples and interpolating or extrapolating from them".
I edited it to make it more readable - not changed its meaning.
In terms of the substance - bottom line they have taken something without permission and sold it on. Sure - they have added value in the process - but if I steal your car and mash it up with another one before selling it on - it's still theft.
The original post was implying there was no harm as a result because it's just copying - the original owner was not deprived of anything.
My point was they are potentially being deprived of a living - and that's through stuff being taken without permission - not through fair competition.
What matters here is not the semantics of theft or copyright or whatever - what matters is fairness - and I accept that's a judgement.
I don't see a problem with these companies having to pay to incorporate material into their models and/or the authors having the right to refuse to license.
Note - that's not to stand in the way of the development of these tools, but to ensure that the effort that went into creating them ( which includes the generation of the source material ) is properly rewarded.
If OpenAI et al. think creation is a trivial part and it doesn't need rewarding - they are free to bootstrap their models by creating all the inputs from scratch.
Perhaps you think it's fine if I took a copy of the ChatGPT model without permission and started a competing service - which was cheaper because I didn't have to pay for the training costs?
They haven't lost anything - just took a copy.....
Are they going to stand in the way of me making the output of ChatGPT more widely available through my cheaper pricing?????
And note I'm selling access to the output - which is different every time ( I use a different random number seed from them ) - so I'm not selling a copy of the model per se...... perfectly fair use....
> Perhaps you think it's fine if I took a copy of the ChatGPT model without permission [...]
There are laws about copyright and trade secret.
> They haven't lost anything - just took a copy.....
Correct. This is why it's a copyright violation and not a theft.
> [...] fair use....
Fair use is also a legal term, and it has some (reasonably) specific meanings. It's noteworthy that the large copyright protection industries don't respect those terms and have automated DMCA takedowns to abuse people for things which are legal:
"the fair use of a copyrighted work, including such use by reproduction in copies or phonorecords or by any other means specified by that section, for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright."
> This is why it's a copyright violation and not a theft.
But is it? Copying the OpenAI model is only potentially a copyright violation - as you have to prove it's not exempt via fair use etc. Note I'm not selling it on - I'm just selling the output - which isn't solely determined by the model - it's determined by the model plus random numbers plus context - what I'm selling is only partly determined by the source model I copied.
Now if I copied it and used it to undercut your original business - then clearly that's not fair use - but that's rather my point no?
These companies have clearly copied source material without permission on a huge scale - but because it's copying and the people haven't lost the original - there is in effect another test - do the original people lose out as a result etc.
It's quite clear - say in the news industry which might be supported by advertising - that copying content and then presenting a summary version so that people never visit the source material is clearly damaging the underlying copyright holders.
> I edited it to make it more readable - not changed its meaning.
I think you might have added the copying bit, and that changed its meaning if not the whole topic. Then you claim I'm confusing competition with theft instead of addressing the "right to a living" part. That's kind of insincere and dishonest on your part, but fine, this other topic is interesting to me too.
I have no idea whether OpenAI, Google, Meta, Anthropic, or any other company got valid licenses for all of the books they trained on. If they didn't, they likely broke some specific laws. Go after them for that if you want. This is copyright violation, not theft.
But if I legally obtain a hundred books and pay a really smart kid to read them all to learn their style, then I pay that kid to write a new book using the style he's learned, that all seems fair and legal to me. It's the way things have been for a very long time.
For any argument you're going to make about this, please imagine a really smart kid doing it instead of a computer. And if you think there should be different laws for computers vs really smart kids, go get it into legislature.
> That's kind of insincere and dishonest on your part,
Ad hominem attack - great.
> This is copyright violation, not theft.
I'm arguing it's copyright violation because it's theft of revenue. If it didn't result in any loss of revenue then it would be hard to argue it wasn't fair use.
Note I'm not using theft in any special legal sense - just in the common sense English sense.
If 'learning their style' included incorporating large recognisable chunks - that smart kid would fail his English degree on plagiarism grounds.....
The point about LLMs is they can do everything - from an unrecognisable 'original' mashup - to what are quite clearly regurgitations of the input. Note also that the kid didn't steal the books he learnt from.....
The question is whether what's happening is fair and good for society, not what is convenient for some very well funded companies in a hurry who see existing laws as annoying things getting in their way, rather than something to respect.
No, it was not. You moved the goal posts from artists deserving a livelihood to copyright issues, and I called you out for that.
> Note also that the kid didn't steal the books he learnt from.....
I should note this? I explicitly stated it as part of the hypothetical.
As I said above, there are already laws about copying. If you're sure they broke those laws, maybe you should criticize the powers that be for not enforcing them.
> The question is whether what's happening is fair and good for society
I think this is a good question.
Personally, I think the benefit from having automated tutors that are attentive and patient and can answer questions about almost any topic known to man dwarfs the benefit from defending intellectual property. I hope they get cheap enough to be accessible to every person who can't afford a traditional education and accurate enough that we trust them more than typical teachers (not a high bar, unfortunately).
I donated to Wikipedia for years specifically because of its educational value while being freely available. Watching people I know learn from LLMs, and do useful interesting things with what they learned, I think the potential is much higher.
>Personally, I think the benefit from having automated tutors that are attentive and patient and can answer questions about almost any topic known to man dwarfs the benefit from defending intellectual property.
Why is it one or the other? Your argument is like saying we shouldn't pay nurses a fair wage because it gets in the way of great care for everyone.
It's not an either/or situation - it's how you allocate rewards for the different contributions to the new tech. Currently tech companies are saying there is zero value in that training data - that's clearly not the case.
I think I finally understand your misunderstanding - I'm not arguing AI should be banned as it destroys a musician's job because in the future all music will be AI generated. That's not my concern - I'm not saying anyone deserves a job in perpetuity.
My point is simply that in building the models they have to respect the current laws - and that means respecting the content owners' rights and either paying what they ask or not using it.
> Your argument is like saying we shouldn't pay nurses a fair wage because it gets in the way of great care for everyone.
This argument strategy, where you make a strained analogy/metaphor, and then apply it back to the original topic - it's fragile and depends on how comparable the two ideas are. If you're just interested in winning discussions, it's a bad tactic because it opens up a whole new avenue for your opponent to attack.
Can I COPY nurses into equally valuable robots? Because if I can, then yeah - the world would be a MUCH better place with abundant and affordable nurse robots, and the human nurses can go find other jobs. I have some friends who are nurses, and after watching them fight with the medical system for their own health issues, I'm pretty sure they'd agree.
Picking at tangential points while avoiding the main argument hmm....
Admit it - you misunderstood my original point and accused me of then changing the argument.
Bottom line - the original poster was implying there was no harm because simple copying doesn't create a loss. I was pointing out that a key test ( in considering copyright issues ) is whether such an action causes harm - and in this case there are many very good cases to be made about resulting loss of revenue.
Let's be clear, I think LLMs etc are a huge technical advance - I just think it's wrong to try and ignore the law because it gets in the way of large companies' attempts to make money.
> Picking at tangential points while avoiding the main argument hmm....
I've tried (and occasionally failed) to avoid the parts of what you wrote which were just the typical flame war bait. And of course I'm guilty of trying to antagonize you in a few places. The topic is interesting, but our conversation about it was not.
I appreciate the link to the UK law, but the rest of this comment thread is mostly two people talking past each other.
Because that's how it works in reality. Once the copyright holders get their teeth in something, it gets paywalled. For instance, poor people don't have (free/legal/easy) access to lots of research papers/articles which were paid for with government grants. And copyright industry associations (MPAA, RIAA, CCC, AAP, ...) lobby to extend the laws so that creative works take lifetimes to enter the public domain.
You think you're arguing in favor of the little guy who made a series of blog posts or digital art? That's naive.
> My point is simply that in building the models they have to respect the current laws
So go enforce those laws. The rent seekers will thank you.
> So go enforce those laws. The rent seekers will thank you.
Seems you have bought into the idea that companies like Google, Facebook and Microsoft are the poor little guys. Wow.
What we are talking about here is certain companies trying to gain a de facto monopoly on the sum of human knowledge - without paying any of those people who built it in the first place.
This is the real story.
Now it may well be their moat isn't as big as they thought it was and the greedy investors trying to do this heist will fail - but that's what they are attempting - and you are cheerleading for it.
> No, it was not. You moved the goal posts from artists deserving a livelihood to copyright issues, and I called you out for that.
Nope. I never said artists deserve a living - I said that people deserve protection from their living being stolen via copyright violation. You are confusing what I said with what you mistakenly understood. I don't understand why your original misunderstanding is somehow a character flaw of mine.
Note how there is an exception for automatic processing - but only for non-commercial use and note the researchers still have to pay for access.
There is also a 'fair dealing' clause.
The key clause here is:
"does using the work affect the market for the original work?"
Content creators are strongly asserting that what these tech companies are doing does.
Also
"Is the amount of the work taken reasonable and appropriate? Was it necessary to use the amount that was taken? Usually only part of a work may be used"
Obviously the entire works are being consumed.
> Because the existing laws about theft and copyright don't cover "learning from billions of examples and interpolating or extrapolating from them".
Laws are written in a way where they attempt to predict the future - they are written from a first principles approach - such as the fair dealing clause above - so your claim that it doesn't explicitly ban it is a red herring.
I assumed you were in the US, where companies like OpenAI, Google, and Anthropic are located. If they're breaking UK laws, you should appeal to your government to enforce those laws in your territory, or ban access to them, or whatever you think is relevant.
> [...] so your claim that it doesn't explicitly ban it is a red herring.
Why should I care about your laws any more than any other country I don't live in?
The point I'm trying to make is the way the laws are structured leaves them open to interpretation - and that's deliberate because trying to nail down everything up front is bound to fail and will allow people to evade the spirit of the law.
ie what matters - both in interpreting today's laws and, if they are insufficiently clear, drafting future amendments - is why the law was created in the first place.
The UK law is well drafted and makes it very clear that the aim of copyright is not to stop you copying something per se - but to stop you copying something for commercial gain in a way that simultaneously damages the copyright owner. I suspect those principles are the same the world over - whatever the exact drafting.
So the question you have to address is do the actions of the AI companies fit these criteria.
Are they copying copyright material without permission - tick.
Are they making money as a result ( it's a commercial operation ) - tick.
Are they damaging the original copyright holders in the process - this is the only one remotely in doubt - but I'd argue it's pretty clearly a tick in many areas - a bit less clear in others.
In terms of lobbying - it's the big tech companies that are currently trying to get the law changed in the UK - to make what they have done legal ( while still arguing they haven't done anything wrong..... as laws are not typically changed retrospectively ).
From what I read, I agree. However, UK law isn't very relevant to the matter. It's a small market among many that doesn't create the models or much of the content to train them.
Perhaps - though the UK punches above its weight in English-language cultural output - I'm sure you have come across some of it.
By the way, the UK once ruled a large part of the world - including the US - but that empire wasn't sustainable for a small country as the other countries caught up in terms of development.
The US is currently facing that issue, and I have to say, not dealing with it very well. The US is going to need friends on the way down and right now all it's doing is making enemies.
Yeah, the US is doing some terribly stupid stuff. No disagreement there.
However, the UK might be more relevant if you hadn't withdrawn from the EU. From the outside it sure looks like you guys decided it was more important to keep out the poor people (or other ethnic backgrounds) than to be part of something with actual collective bargaining power.
The debate about leaving the EU was multi-faceted [1] - but a significant part was a kind of nostalgia for when Britain was indeed Great, and the idea that a UK free from the shackles of the EU could be great again. Take back control was the slogan - a complete misunderstanding of the difference between lost sovereignty and pooled sovereignty.
I see the same forces driving the US now as it undermines international organisations.
[1] and sure xenophobia played a depressingly large part.
I hadn't thought about it before, but I think the nostalgia angle can explain a bunch of the US attitudes. It seems like a lot of people across the spectrum think the 1950s were a better time: The bigots because of race issues, the incels because of sexual expectations, the young generations envying (and resenting) how the boomers "had it easy", and the hippies thinking new technology is ending the world.
I think they're all wrong, but there's no fixing it. Assuming there are no civil or world wars in the near future that radically change the trajectory, China will rise and the US will end up lower on the ladder.
> nostalgia for when Britain was indeed Great
I'm sure you didn't intend it, but that capital G sure sounds a lot like the slogan for a US political party.
Does spreadsheet software have to be trained directly on the copyright-protected work of the accountants it's replacing without their consent (and often against their loud protesting)?