Meta staff torrented nearly 82TB of pirated books for AI training — court records reveal copyright violations

MicroWave@lemmy.world · 1 year ago

aramova@infosec.pub · 1 year ago

The NYPL has around 10 million books and an additional 10 million manuscripts in its collection, with over 54 million total items for lending.

Not the largest by far, but still mind-boggling in size.

To torrent and ingest something of that size is crazy.

NotMyOldRedditName@lemmy.world · edit-2 1 year ago

Damn, that’s huge.

Never seen a library that big before. The university here has about 1.5 million and that’s a big library.

stringere@sh.itjust.works · 1 year ago

I had to look it up, but the Library of Congress has over 30 million books. If I weren’t busy working on an exit from this country, I would have liked to take my kids to see it.