this post was submitted on 24 Jun 2025

Technology

[–] [email protected] 39 points 4 days ago* (last edited 4 days ago) (4 children)

Gist:

What’s new: The Northern District of California has granted a summary judgment for Anthropic that the training use of the copyrighted books and the print-to-digital format change were both “fair use” (full order below box). However, the court also found that the pirated library copies that Anthropic collected could not be deemed as training copies, and therefore, the use of this material was not “fair”. The court also announced that it will have a trial on the pirated copies and any resulting damages, adding:

“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages.”

[–] [email protected] 6 points 4 days ago (22 children)

So I can't use any of these works because it's plagiarism but AI can?

[–] [email protected] 19 points 4 days ago (6 children)

My interpretation was that AI companies can train on material they are licensed to use, but the courts have deemed that Anthropic pirated this material as they were not licensed to use it.

In other words, if Anthropic bought the physical or digital books, it would be fine so long as their AI couldn't spit it out verbatim, but they didn't even do that, i.e. the AI crawler pirated the book.

[–] [email protected] 22 points 4 days ago* (last edited 4 days ago) (2 children)

Ok, so you can buy books and scan them, or buy ebooks, and use them for AI training, but you can't just download pirated books from the internet to train AI. Did I understand that correctly?

[–] [email protected] 5 points 4 days ago (4 children)

Make an AI that is trained on the books.

Tell it to tell you a story for one of the books.

Read the story without paying for it.

The law says this is ok now, right?

[–] [email protected] 7 points 4 days ago (6 children)

As long as they don't use exactly the same words in the book, yeah, as I understand it.

[–] [email protected] 16 points 4 days ago (3 children)

Makes sense. AI can “learn” from and “read” a book in the same way a person can and does, as long as it is acquired legally. AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?

Some people just see “AI” and want everything about it outlawed basically. If you put some information out into the public, you don’t get to decide who does and doesn’t consume and learn from it. If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.

[–] [email protected] 5 points 4 days ago (19 children)

AI can “learn” from and “read” a book in the same way a person can and does

This statement is the basis for your argument and it is simply not correct.

Training LLMs and similar AI models is much closer to a sophisticated lossy compression algorithm than it is to human learning. The processes are not at all similar given our current understanding of human learning.

AI doesn’t reproduce a work that it “learns” from, so why would it be illegal?

The current Disney lawsuit against Midjourney is illustrative - literally, it includes numerous side-by-side comparisons - of how AI models are capable of recreating iconic copyrighted work that is indistinguishable from the original.

If a machine can replicate your writing style because it could identify certain patterns, words, sentence structure, etc then as long as it’s not pretending to create things attributed to you, there’s no issue.

An AI doesn't create works on its own. A human instructs AI to do so. Attribution is also irrelevant. If a human uses AI to recreate the exact tone, structure and other nuances of say, some best selling author, they harm the marketability of the original works which fails fair use tests (at least in the US).

[–] [email protected] 9 points 4 days ago (1 children)

Ask a human to draw an orc. How do they know what an orc looks like? They read Tolkien's books and were "inspired" by Peter Jackson's LOTR.

Unpopular opinion, but that's how our brains work.

[–] [email protected] 6 points 4 days ago (1 children)

What a bad judge.

This is another indication of how Copyright laws are bad. The whole premise of copyright has been obsolete since the proliferation of the internet.

[–] [email protected] 7 points 4 days ago (3 children)

What a bad judge.

Why? Basically he simply stated that you can use whatever material you want to train your model as long as you ask the author (or copyright holder) for permission to use it (and presumably pay for it).

[–] [email protected] 2 points 4 days ago* (last edited 4 days ago) (13 children)

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

They may be trying to put safeguards so it isn't directly happening, but here is an example that the text is there word for word:

[–] [email protected] 3 points 4 days ago (6 children)

If I understand correctly they are ruling you can buy a book once, and redistribute the information to as many people as you want without consequences. Aka 1 student should be able to buy a textbook and redistribute it to all other students for free. (Yet the rules only work for companies apparently, as the students would still be committing a crime)

Well, it would be interesting if this case were used as precedent in a case involving a single student who does the same thing. But you are right.

[–] [email protected] 1 points 4 days ago (1 children)

Huh? Didn’t Meta skip getting any permission and pirate a lot of books to train their model?

[–] [email protected] 3 points 4 days ago

True. And I will be happy if someone sues them and the judge says the same thing.

[–] [email protected] 31 points 4 days ago (5 children)

It's extremely frustrating to read this comment thread because it's obvious that so many of you didn't actually read the article, or even half-skim it, or even attempt to comprehend the title of the article for more than a second.

For shame.

[–] [email protected] 1 points 4 days ago

"While the copies used to convert purchased print library copies into digital library copies were slightly disfavored by the second factor (nature of the work), the court still found “on balance” that it was a fair use because the purchased print copy was destroyed and its digital replacement was not redistributed."

So you find this to be valid? To me it is absolutely being redistributed

[–] [email protected] 6 points 4 days ago

It seems the subject of AI causes lemmites to lose all their braincells.

[–] [email protected] 7 points 4 days ago (1 children)

I joined lemmy specifically to avoid this reddit mindset of jumping to conclusions after reading a headline

Guess some things never change...

[–] [email protected] 7 points 4 days ago

Nobody ever reads articles, everybody likes to get angry at headlines, which they wrongly interpret the way it best tickles their rage.

Regarding the ruling, I agree with you that it's a good thing. In my opinion, it makes a lot of sense to allow fair use in this case.

[–] [email protected] 23 points 4 days ago

was gonna say, this seems like the best outcome for this particular trial. there was potential for fair use to be compromised, and for piracy to be legal if you're a large corporation. instead, they upheld that you can do what you want with things you have paid for.

[–] [email protected] -2 points 4 days ago (1 children)

This 240TB JBOD full of books? Oh heavens forbid, we didn’t pirate it. It uhh… fell off a truck, yes, fell off a truck.
