Aer@lemmy.world · 3 years ago

Maddening. Fuck you Reddit!

RarePepeCollector@lemmy.world · 3 years ago

It’s too easy to scrape that data. The replies don’t update much after 2 days, and even then it’s pretty easy to re-scrape and check. And the data is not owned by Reddit; it’s actually owned by its users. So if MS wants to scrape it, they need copyright permission from the users, not Reddit.

PrimalAnimist@lemmy.world · edit-2 3 years ago

True, users do retain copyright in anything they write, but they also grant Reddit a license to use it however it wants, including sub-licensing it to others. That means the corps absolutely DO NOT need the permission of users to train their AI. They just buy the rights to use the data from Reddit.

When Your Content is created with or submitted to the Services, you grant us a worldwide, royalty-free, perpetual, irrevocable, non-exclusive, transferable, and sublicensable license to use, copy, modify, adapt, prepare derivative works of, distribute, store, perform, and display Your Content and any name, username, voice, or likeness provided in connection with Your Content in all media formats and channels now known or later developed anywhere in the world. This license includes the right for us to make Your Content available for syndication, broadcast, distribution, or publication by other companies, organizations, or individuals who partner with Reddit. You also agree that we may remove metadata associated with Your Content, and you irrevocably waive any claims and assertions of moral rights or attribution with respect to Your Content.

See https://www.redditinc.com/policies/user-agreement Section 5. “Your Content”

This includes images and videos that are uploaded to the reddit servers directly.

Reddit has the right to use the data and sell that data to others. Also, some data you can scrape, but there’s additional data that is available only through the API. Web scraping is not reliable, especially if reddit actively flags your spider and blocks it. They are not the idiots we want to believe they are. No mega corp is going to risk not having competitive access to data to feed their AIs when the cost for them to just pay is insignificant.

PhilxBefore@lemmy.world · 3 years ago

This shit was definitely not in the user agreement when I signed up in 2007.