Summary
Court records in an ongoing lawsuit reveal that Meta staff allegedly downloaded 81.7TB of pirated books from shadow libraries like Z-Library and LibGen to train the company's AI models.
Internal messages show employees raising ethical concerns, with one saying, “Torrenting from a corporate laptop doesn’t feel right.”
Meta reportedly took steps to hide the activity.
The case is part of a broader debate on AI data sourcing, with similar lawsuits against OpenAI and Nvidia.
NYPL has around 10 million books and an additional 10 million manuscripts in its collection, with over 54 million items in total for lending.
Not the largest by far, but still mind-boggling in size.
To torrent and ingest something of that size is crazy.
Damn, that’s huge.
Never seen a library that big before. The university here has about 1.5 million and that’s a big library.