This video shows that Reddit refused to delete all comments and posts of its users when they close their account via a CCPA / GDPR request.

  • crowsby@kbin.social
    arrow-up
    39
    ·
    1 year ago

    The creator of tildes.net is a former Reddit backend developer, and believes this behavior is likely due to how Reddit caching works (or doesn’t work), rather than an intentional subversion of user intent:

    Yes, this is almost certainly a technical issue. The way reddit caches things probably isn’t the standard way you’re thinking of, like a short-term cache that expires and refreshes itself. There are multiple layers of “cached” listings and items for almost everything, and a lot of these caches are actually data that’s stored permanently and kept up to date individually.

    For example, when you view your comments page, Reddit uses a cached (permanent) list of which comments are in that page. There is a separate list stored for each sorting method. For example, maybe you’d have something like this with some made-up comment IDs:

    Deimos’s comments by new: 948, 238, 153
    Deimos’s comments by hot: 238, 153, 948
    Deimos’s comments by controversial: 153, 238, 948
    If I post a new comment, it will go through each list and add the new ID in the right spot (for example, in the “new” list it always just goes at the start). If I delete a comment, it goes through every list, and removes the ID if it can find it in there.
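The per-sort listings described above can be modelled with a small sketch. This is only an illustration of the idea in the quote, not Reddit's actual code: the class name `CommentListings`, its methods, and the numeric scores are all hypothetical, chosen so that the output matches the made-up IDs in the example.

```python
import bisect


class CommentListings:
    """Hypothetical model: one stored ID list per sort order, updated on every write."""

    def __init__(self, sort_names):
        # Each listing holds (negated_score, comment_id) tuples, kept sorted
        # so that the highest score for that sort order comes first.
        self._listings = {name: [] for name in sort_names}

    def add(self, comment_id, scores):
        # `scores` maps each sort name to this comment's score for that sort.
        for name, entries in self._listings.items():
            bisect.insort(entries, (-scores[name], comment_id))

    def delete(self, comment_id):
        # Walk every listing and drop the ID wherever it is found.
        for entries in self._listings.values():
            entries[:] = [e for e in entries if e[1] != comment_id]

    def ids(self, name):
        return [comment_id for _, comment_id in self._listings[name]]


listings = CommentListings(["new", "hot", "controversial"])
listings.add(153, {"new": 1, "hot": 2, "controversial": 3})
listings.add(238, {"new": 2, "hot": 3, "controversial": 2})
listings.add(948, {"new": 3, "hot": 1, "controversial": 1})

for name in ("new", "hot", "controversial"):
    print(name, listings.ids(name))

listings.delete(238)  # removed from all three listings
print(listings.ids("new"))
```

The point of the sketch is that every write touches every listing, so a missed update in any one of them leaves a stale entry behind.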

    One of the problems with this system (which is probably what’s causing @phedre’s issues, and affecting many other people trying to delete their whole history) is that all of these listings are capped at 1000 items. If you already have more than 1000 comments and you post a new one, the 1000th comment currently in the new list gets “pushed off the end”. The comment still exists, but you won’t be able to see it by looking through your comments page, because it’s no longer in that listing.

    Deleting comments also doesn’t cause previously “pushed off” ones to get re-added. If you have 5000 comments, your listing will only include 1000 of them. If you delete 50 of the ones in the listing, your listing now has 950 comments in it. If you delete all 1000 from the listing, your comments page will appear empty, but you actually still have 4000 comments that will be visible in the comments pages they were posted in.
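A minimal sketch of the capped listing, assuming the behaviour described in the quote (a 1000-item cap, and no backfill after deletion). The class `CappedListing` and its fields are hypothetical names used only for illustration.

```python
LISTING_CAP = 1000  # the cap quoted in the comment above


class CappedListing:
    """Hypothetical model of a 'new' listing that is capped and never backfilled."""

    def __init__(self, cap=LISTING_CAP):
        self.cap = cap
        self.ids = []              # newest first; what the profile page can show
        self.all_comments = set()  # everything that still exists in storage

    def post(self, comment_id):
        self.all_comments.add(comment_id)
        self.ids.insert(0, comment_id)
        del self.ids[self.cap:]    # anything past the cap is pushed off the end

    def delete(self, comment_id):
        self.all_comments.discard(comment_id)
        if comment_id in self.ids:
            self.ids.remove(comment_id)
        # Nothing is re-added here: older comments stay outside the listing.


listing = CappedListing()
for comment_id in range(5000):
    listing.post(comment_id)

for comment_id in list(listing.ids):  # delete everything the profile page shows
    listing.delete(comment_id)

print(len(listing.ids))           # the profile page now looks empty
print(len(listing.all_comments))  # but these comments still exist
```

With 5000 comments posted, deleting everything visible removes only the 1000 newest; the other 4000 remain in storage, matching the numbers in the quote.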

    And this is only one aspect of it. There are also multiple other places and ways that comments are cached—comment trees are cached (order and nesting of comments on a comments page, for all the different sorting methods), rendered HTML versions of comments are cached, API data is probably cached, and so on.

    All of these issues are probably just some combination of all of your posts being difficult to find and access due to the listing limits or certain cached representations of posts not being cleared or updated properly.

    • eleitl@lemmy.world
      arrow-up
      51
      ·
      1 year ago

      Luckily GDPR deletion requests don’t care about how they are implemented. And failures to comply en masse tend to get really expensive.

      • JohnEdwa@kbin.social
        arrow-up
        10
        ·
        edit-2
        1 year ago

        Yup. I’m waiting for Reddit to come back with my GDPR data request (which has a time limit of 30 days, after which I believe they can give their excuses to extend it by another 30 days), and assuming they have not reversed the API decision, I’m ordering them to delete it all afterwards. And they even now have a handy list, the one they just gave me, of everything they have to purge. If they didn’t have it, it wouldn’t be on that list in the first place :)

        • ja534@kbin.social
          arrow-up
          8
          ·
          1 year ago

          Still waiting for the GDPR request I made at the start of this shitshow. It will be funny to witness the mass GDPR deletion requests of accounts at the start of July.

        • dan@upvote.au
          arrow-up
          4
          ·
          1 year ago

          It’s been 3-4 weeks since I submitted my CCPA request, and I still haven’t gotten my data yet. CCPA has a time limit of 45 days.

          • abff08f4813c@kbin.social
            arrow-up
            3
            ·
            1 year ago

            That’s what’s so awful about this. Prices were announced May 31st, so for a CCPA request that was done that very instant, they can delay until mid July, when the API changes will make it much more difficult to delete your data, and there’s no recourse.

            Even for GDPR, with the shorter 30 day limit, maybe you’d get it the day before. But a delay of a day or even a few hours could easily mean you’ve gone past the cutoff, and then the API is a problem for you too.

            This is some messed up timing, mates.
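The timing argument can be checked with a few lines of date arithmetic. This is a sketch using the day counts quoted in the thread (30 days for GDPR, 45 for CCPA) rather than the statutory wording, and it assumes the year is 2023, which the thread does not state. The function name `response_deadline` is hypothetical.

```python
from datetime import date, timedelta

# Dates from the thread; the year 2023 is an assumption.
PRICING_ANNOUNCED = date(2023, 5, 31)
API_CUTOFF = date(2023, 7, 1)

# Day counts as quoted by the commenters, not taken from the legal texts.
GDPR_LIMIT_DAYS = 30
CCPA_LIMIT_DAYS = 45


def response_deadline(requested_on, limit_days):
    """Latest response date for a request filed on `requested_on`."""
    return requested_on + timedelta(days=limit_days)


gdpr_due = response_deadline(PRICING_ANNOUNCED, GDPR_LIMIT_DAYS)
ccpa_due = response_deadline(PRICING_ANNOUNCED, CCPA_LIMIT_DAYS)

print(gdpr_due, (API_CUTOFF - gdpr_due).days)  # 2023-06-30, 1 day before the cutoff
print(ccpa_due, (ccpa_due - API_CUTOFF).days)  # 2023-07-15, 14 days after the cutoff
```

Even a request filed the moment prices were announced leaves only one day of margin under the 30 day figure, and none under the 45 day figure.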

            • DrNeurohax@kbin.social
              arrow-up
              2
              ·
              1 year ago

              I would hope that someone reaching out to press from ModCoord would pass these concerns on to journalists. A persistent journalist can uncover the extent of compliance with the GDPR and CCPA through proper questions. “Have you seen an increase in GDPR/CCPA requests since the controversy started? What percent of those have you completed? What about reports that users are unable to delete their data?” etc. (only better because I’m not a journalist and probably oversimplifying).

              • BrikoX@vlemmy.net
                arrow-up
                4
                ·
                1 year ago

                Reddit stopped answering requests for comment from objective journalists.

                People just need to start filing complaints with their Data Protection Authority. Then the mainstream media will be forced to cover the stories to get the clicks.

    • PositiveNoise@kbin.social
      arrow-up
      24
      ·
      1 year ago

      Based on this, I’d say that Reddit fully deserves to be banned in Europe and California, and fined into potential bankruptcy. Having deeply flawed technology that prevents them from ever being in compliance with a very serious law is no excuse.

      • lemmyvore@feddit.nl
        arrow-up
        1
        ·
        1 year ago

        Not necessarily, although Reddit can definitely choose to play it that way.

        A lot of systems made in the pre-GDPR era (which is most of them) were not designed with the capability to decouple and erase content at a moment’s notice.

        Btw incompetence won’t hold up as a valid defence for violating GDPR. At most it can give them some stalling room.

    • Economizer@lemmy.world
      arrow-up
      2
      ·
      1 year ago

      Oh God. Somewhat unrelated, but I felt like I knew the name “Deimos” from somewhere. Couldn’t put my finger on it. Finally realized who he was.

  • hiyaaaaa23@kbin.social
    arrow-up
    29
    ·
    1 year ago

    This is one of the many legal issues Reddit now has.
    Reddit is very clearly eyeing an IPO, but who really wants to invest in this dumpster fire?

    I’m not an investor, but I personally wouldn’t invest in a website as shortsighted as Reddit.

    In an industry as cutthroat as social media, having a site as active as Reddit for 18 years should be celebrated.

    In this world, where platforms live and die in the span of single years, why would Reddit throw away a formula that has worked for nearly 20 years?

    • admiralteal@kbin.social
      arrow-up
      14
      ·
      1 year ago

      Easy question to answer: they aren’t profitable and the free money of years of near 0 or 0% interest rates is over. The constant VC dried up and the website is insolvent. They have a massively bloated staff roster. They’re going to die if they don’t make a major change.

      And at the same time, all “traditional” monetization strategies for websites like these just… don’t work with the way Reddit works. Making the changes they need to make will kill the site.

      They never cooked up a monetization strategy that would work for them. They procrastinated. They felt free money would continue forever and underestimated how reliant their site was on volunteer labor. They got distracted by stupid side projects instead of refining the core product.

      Reddit will absolutely survive all this. I expect it to still exist, at the end of the day. But it’ll be smaller, and what remains will be a soulless shit hole. And it’ll still be borderline insolvent.

    • Froyn@kbin.social
      arrow-up
      13
      ·
      1 year ago

      As an investor, I can say with near certainty that the objective is extremely “short” sided.

  • EvilMonkeySlayer@kbin.social
    arrow-up
    26
    ·
    edit-2
    1 year ago

    I wonder at what point people start taking them to court. It seems like the usual idiot tech bro excuse of thinking terms of service/use somehow override the law which is hilariously naive.

    You cannot override the law in a TOS.

    Like if they wrote into their TOS that they were allowed to murder you and proceeded to murder you, they’d still go to jail for murder.

    • PositiveNoise@kbin.social
      arrow-up
      7
      ·
      1 year ago

      Username’s research checks out.

      (Sorry, I know people are kind of sick of funny tropes that were common on Reddit, but I couldn’t resist. I’ll see myself out now…)

  • What I noticed is that when restoring your comments they prioritize the ones with the most upvotes. Some I even deleted manually before the blackout reappeared too.

    I find this shit to be likely illegal. I understand that we gave Reddit permission to use our content by agreeing to their terms of service, but if my comment was “A” and I edit it so that it displays “B”, it is wrong for Reddit to still display “A” below my username without my authorization. They can exploit the content “A” however they want, but to show it under my username as if it were what I consented to display under my name feels like a breach to me.

  • MentallyExhausted@reddthat.com
    arrow-up
    13
    ·
    1 year ago

    Wow, their legal department shot themselves in the foot putting that in writing. Idiots.

    I submitted a CCPA request weeks ago and have yet to hear from them. They also restored tons of content I deleted.

    Time for a class action yet?

      • JohnEdwa@kbin.social
        arrow-up
        3
        ·
        edit-2
        1 year ago

        Sadly probably not. The GDPR fine can be “up to €20 million, or up to 4% of the annual worldwide turnover of the preceding financial year, whichever is greater” which would be around 26 million based on their 2022 revenue. The company has gathered over $1.3 billion in funding and was “valued” at around $10 billion quite recently.

        And that’s only around what a year of API calls would have cost for Apollo so clearly by discontinuing the API they are going to save that amount back in no time!
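The "whichever is greater" rule quoted above is easy to work through. In this sketch the turnover of 650 million is not a sourced figure: it is simply the value implied by the commenter's "around 26 million" estimate (26 million divided by 4%), and currency conversion is ignored. The function name `gdpr_max_fine` is hypothetical.

```python
FLAT_CAP = 20_000_000  # the fixed upper bound quoted above
TURNOVER_PERCENT = 4   # the percentage of annual worldwide turnover quoted above


def gdpr_max_fine(annual_turnover):
    """Upper bound on the fine: the greater of the flat cap and 4% of turnover."""
    # Integer arithmetic avoids floating-point rounding on large amounts.
    return max(FLAT_CAP, annual_turnover * TURNOVER_PERCENT // 100)


# Assumption: turnover back-calculated from the comment's 26 million estimate.
implied_turnover = 650_000_000

print(gdpr_max_fine(implied_turnover))  # 26000000
print(gdpr_max_fine(100_000_000))       # 20000000, the flat cap applies
```

The flat cap only matters below 500 million in turnover; above that, the percentage term takes over.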

        • maynarkh@feddit.nl
          arrow-up
          2
          ·
          1 year ago

          Yes, but a fine does not exempt you from compliance. If they are unable or unwilling to comply, the EU can ban them.

          And I’m not talking about cutting off EU user access, it’s cutting off money dealings with EU customers, advertisers, etc.

    • TWeaK@lemm.ee
      arrow-up
      5
      ·
      1 year ago

      Their explanation for restored content will likely be something about the nature of how their CDN works.

      Granted, this excuse won’t hold up much, but it’s probably true and will limit their liability in the sense that it isn’t intentional.

      I’ve deleted my comments multiple times with PowerDeleteSuite and had things come back, a couple times over. However now I’m going through with shreddit (github version) using my GDPR files. It’s taking a long time because things panic every so many comments (I’m backing up everything, on file 75 so far but still 46,000 lines left from a 75,000 line file, however it’s panicking less now that the comments are more recent) and I haven’t had it restore any of the links I’ve checked from that process.


      Reddit changed the way they display comments in the profile a few months back. Now, you only see a limited number of comments under New, Hot, Top & Controversial. These are the lists that most deletion services access. So, if you use PowerDeleteSuite or any other service it will likely miss things. In particular, I opened up links to my older Top comments, ran the script, then found it had completely ignored replies underneath my comment that had low but positive karma - these wouldn’t have appeared on the lists. My new list only went back about 3 months (although I think it’s about number of comments rather than time).

      You really need to use the GDPR files to get everything. These contain CSV files with links to every single post and comment you have. However, it seems that reddit are delaying following through with most requests until after 1 July, when API requests (such as those that shreddit uses) will be blocked.


      Also PSA don’t use the shreddit website, they charge you $15. The github version is free and will take CSV files with the appropriate tag. But, again, in my experience it panics and hangs fairly often, so it will take a lot of work to use. I’ve had to run it, back up the terminal output, use the last link and delete everything in posts.csv and comments.csv before the one it stuck on, then resume with amended files.
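The manual "delete everything before the one it stuck on" step could be scripted. This is a rough sketch, not part of shreddit: the function `trim_before` is hypothetical, and the column name `permalink` is an assumption about the export, so check the header row of your own posts.csv and comments.csv first.

```python
import csv
import io


def trim_before(src, dst, stuck_link, link_column="permalink"):
    """Copy rows from src to dst, starting at the row whose link is stuck_link.

    Returns the number of rows kept. If stuck_link is never found, only the
    header is written and 0 is returned.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    resumed = False
    kept = 0
    for row in reader:
        if row[link_column] == stuck_link:
            resumed = True  # keep this row and everything after it
        if resumed:
            writer.writerow(row)
            kept += 1
    return kept


# Made-up sample data; the real export has more columns.
src = io.StringIO(
    "id,permalink\n"
    "c1,https://example.invalid/c1\n"
    "c2,https://example.invalid/c2\n"
    "c3,https://example.invalid/c3\n"
)
dst = io.StringIO()
kept = trim_before(src, dst, "https://example.invalid/c2")
print(kept)
print(dst.getvalue())
```

For real files, open the source and destination with `open(path, newline="")` and keep a backup of the original CSV before resuming.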

      Reddit really isn’t making it easy to follow through with your rights. Make records of this, then this can be used to convince local Data Protection Authorities to collectively throw down a bigger hammer than Huffman ever wielded, or even imagined.

      Also another PSA, reddit’s terms do not deny you ownership of your content. So even if they try to claim ownership themselves (as Steve Huffman has frequently publicly stated) they cannot deny you the right to edit your content and restrict what they do with it. It’s your information, and reddit hasn’t even paid for it.

      You can’t sell a microwave without paying for the nuts and bolts.

      • DrNeurohax@kbin.social
        arrow-up
        1
        ·
        1 year ago

        If you’re using the main repo for PDS then you probably have the one that doesn’t pause for 5 secs between API calls (Reddit’s limit). The first fork version has the pause and works correctly, though slowly. Just be aware that there’s a bug in PDS that stops adding to the exported file if it hits an error. (If you have 100 comments and get an error on comment #15, it will continue to edit/delete, but the exported file will only have 14 comments.)
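Both points, the pause between calls and an export that survives errors, can be illustrated with a short loop. This is a sketch of the general pattern only, not PDS code: `process_all`, its parameters, and the item names are hypothetical, and the 5 second pause is the figure from the comment.

```python
import time


def process_all(items, action, pause_seconds=5.0, sleep=time.sleep):
    """Run `action` on each item with a pause between calls.

    Every item is recorded in the export before the action runs, so an error
    on one item cannot silently truncate the backup.
    """
    exported, failed = [], []
    for index, item in enumerate(items):
        if index:
            sleep(pause_seconds)  # stay under the rate limit between calls
        exported.append(item)     # back up first, then edit/delete
        try:
            action(item)
        except Exception as exc:  # keep going; report failures at the end
            failed.append((item, exc))
    return exported, failed


def flaky_delete(comment_id):
    # Stand-in for a real edit/delete call; fails on one item on purpose.
    if comment_id == "c15":
        raise RuntimeError("simulated API error")


# A no-op sleep keeps this demo instant; omit it for real use.
exported, failed = process_all(
    ["c14", "c15", "c16"], flaky_delete, sleep=lambda seconds: None
)
print(exported)
print([item for item, _ in failed])
```

Returning the failures separately makes it easy to retry just those items on a second pass.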

      • May@kbin.social
        arrow-up
        1
        ·
        1 year ago

        it seems that reddit are delaying following through with most requests until after 1 July when API requests (such as those that shreddit uses) will be blocked.

        I was sooo worried about this and thinking that something like that would be done, back when i saw someone warn in the save 3rd party apps sub that u should request your data. Still i tried making a request bc i thought maybe reddit did not catch on yet or maybe bc it was before the blackout there can still be a chance, but till now i never got the data. :(

        probably i’ll just leave the comments and posts. I did not post a lot.

          • TWeaK@lemm.ee
            arrow-up
            2
            ·
            1 year ago

            TL;DR get the 1.6GB Pushshift torrent, then edit a script to extract your data, then edit another script to use that data to overwrite your comments.
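The extraction step might look roughly like this. It is a sketch under stated assumptions: that the dump, once decompressed, has one JSON object per line, and that each object has an `author` field. Neither is confirmed in this thread, and the function name `comments_by` and the sample records are made up. Decompression and the overwrite step are left out.

```python
import json


def comments_by(lines, username):
    """Yield records whose `author` field matches `username`.

    Assumes one JSON object per line; blank lines are skipped.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("author") == username:
            yield record


# Made-up sample lines standing in for a decompressed dump file.
sample = [
    '{"id": "a1", "author": "alice", "body": "first"}',
    '{"id": "b1", "author": "bob", "body": "second"}',
    "",
    '{"id": "a2", "author": "alice", "body": "third"}',
]

mine = list(comments_by(sample, "alice"))
print([record["id"] for record in mine])
```

Because the function is a generator, it can be fed an open file object and will process a large dump line by line without loading it all into memory.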

          • DrNeurohax@kbin.social
            arrow-up
            2
            ·
            1 year ago

            This did not get the traction it should have. It’s probably the best of the dozen-ish methods I’ve seen.

        • TWeaK@lemm.ee
          arrow-up
          1
          ·
          1 year ago

          You can still use the GDPR files to get at all your comments, you just won’t be able to use existing API methods to automate it. However, perhaps it would be possible to use the links to automate via a scraping method or something - maybe the PowerDeleteSuite method could be expanded upon.

          • abff08f4813c@kbin.social
            arrow-up
            1
            ·
            edit-2
            1 year ago

            Yeah, you are right. It’d be tough to directly modify PDS as that’s javascript in a browser and there are strict restrictions on what JS can do on a filesystem in that case.

            But maybe someone can create a browser extension that does the same job. Extensions have fewer restrictions so maybe it could be fueled by a file.

            Or maybe someone will come up with some kind of shell script that can read the archive and copy & paste the URLs for each of your posts and comments, one by one, into the javascript console of your browser, allowing PDS to take care of the rest (visiting each one and simulating hitting the edit and delete buttons).

            The other issue is that PDS depends on old dot reddit dot com currently from what I understand. If that ever gets dropped, PDS will break until it’s updated to work with new reddit.

        • DrNeurohax@kbin.social
          arrow-up
          1
          ·
          1 year ago

          There’s also a semi-auto delete user script that doesn’t use the API called so-long-reddit-thanks-for-all-the-fish.

          You go to your comments page, click a button, and it performs the actions within the browser. Without any further interaction, you’ll see the screen scroll to the bottom, click edit on the last comment, enter the text in the script (default is a link to the script, but you can change that to anything), click save, and move on to the next comment (pretty sure it can delete, too). For best results, use a neverending Reddit script and keep scrolling until there are no more pages loaded. Also, re-sort the comments by each option (top, newest, etc.) to check for any stragglers.

          You can still use your browser, though I recommend keeping the task in its own window (in case your browser or an addon unloads pages you haven’t accessed in x minutes). If you do something that makes the browser lag a little, it can cause the script to miss a comment, so you might need to run it twice. I used this on one account and it worked flawlessly for several thousand comments and skipped ~10, or so.

    • Doll_Tow_Jet-ski@kbin.social
      arrow-up
      2
      ·
      1 year ago

      This is the comment I was looking for. A class action from European citizens, for example, under the European privacy law, would really be bad news for Reddit (and good news for the Internet)

    • Eddie@l.lucitt.com
      arrow-up
      1
      ·
      1 year ago

      Would love to see the FOSS community take down reddit, especially if there’s legitimate merit to it.

  • PabloDiscobar@kbin.social
    arrow-up
    12
    ·
    1 year ago

    I love it, it makes their intentions so obvious. Milking our content for AI training. Nobody will read our old conversations, except for AIs.

  • Brisolo32@lemmy.world
    arrow-up
    7
    ·
    1 year ago

    Lovely thing is that there isn’t even an option to delete data via an LGPD (Lei Geral de Proteção de Dados) request. Well, from what I know at least.

  • luki@kbin.social
    arrow-up
    6
    ·
    1 year ago

    I don’t think they are actually restoring posts/comments. This whole thing is based on confusion about the blackout and many subreddits going private. Most people would think you can see all of your own posts and comments if you are logged in and go to your profile page, but if a subreddit goes private you cannot even see your own submissions in that sub.

    So after the blackout ended and most subreddits went public again, people who nuked their account history are now discovering that there’s still posts remaining. They think these posts were restored, but they weren’t even deleted in the first place.

    This is obviously a huge oversight in how Reddit handles your data and your profile page, but don’t attribute to malice that which is adequately explained by stupidity.

    • soft_frog@kbin.social
      arrow-up
      18
      ·
      edit-2
      1 year ago

      That still makes it impossible for a user to ever delete all their comments, which is the CCPA complaint

      • abff08f4813c@kbin.social
        arrow-up
        1
        ·
        1 year ago

        The caching issue would clear up eventually, just give it some time. The CCPA process is slower, so probably the caching issue would be resolved by the time the courts heard it.

        Private is different. What if I posted in a sub like r/BasicIncomeUSA that went permanently private during the blackout and never came back? 30 days, 45 days, still private. Worse, what if it’s a sub where the mods all delete their accounts - or they are unresponsive (because they quit using reddit without deleting their accounts).

        So yeah, private means that reddit has to be the one doing the deletion, as a regular user may not even have the tools to delete otherwise.

    • Eddie@l.lucitt.com
      arrow-up
      11
      ·
      1 year ago

      If you watch the video, you would understand that this individual is deleting specific comments, then saw the exact same comments that he deleted return some time later.

      • luki@kbin.social
        arrow-up
        3
        ·
        edit-2
        1 year ago

        Yes, but if you look closely all of those submissions were made on the javascript subreddit. It’s entirely possible that this sub was still private on the 24th, and went public on 25th. I don’t know for sure but that seems to be the most likely scenario.

        Edit: Looking at the blackout tracker, javascript was still private on June 24th, which is the day that the OP of the video was manually deleting his submissions.

        • May@kbin.social
          arrow-up
          9
          ·
          1 year ago

          If the sub were private at that time, wouldn’t it have prevented him from being able to delete the comments in there in the first place (bc he wouldn’t be able to see them when it’s private)? In this case I guess he was able to see them, because he was able to delete those specific ones. But I’m not sure.

          • abff08f4813c@kbin.social
            arrow-up
            3
            ·
            edit-2
            1 year ago

            Yup, I couldn’t see my own comments or posts on subs that were private. When I tried to delete them via API/script it got me an error too.

            However, there’s an exception. If you are a mod or approved user for a sub, then you can see and edit/delete as normal. I have never tried this scenario, but maybe in this case when it goes public again, any deletions are undone (because of the caching issue).

    • OrangeCorvus@lemmy.world
      arrow-up
      9
      ·
      1 year ago

      The law doesn’t care how they handle the data or if a subreddit is private. If someone requests their data to be deleted, everything must be deleted.

    • leftzero@kbin.social
      arrow-up
      1
      ·
      1 year ago

      It often happens to be both, though.
      Which seems particularly likely in this case, given Spez & co.'s track record of being both malicious and stupid, more often than not at the same time…

    • Jon-H558@kbin.social
      arrow-up
      1
      ·
      1 year ago

      I agree, blacked out subs is why comments are coming back on the profile page… but there is another issue about the 1000 post limit on the profile page. That means you can Google your comments but never see them on your profile.

  • t0fr@lemmy.ca
    arrow-up
    5
    ·
    1 year ago

    I’m honestly not surprised at all. The content you created for them is valuable and they’re expecting individual users not to fight back or even notice. They have the power and thicker wallets on their side.

  • JackLSauce@lemmy.world
    arrow-up
    2
    ·
    1 year ago

    I made the script here to overwrite AND delete comments because this move was about as unpredictable as gravity.

    It also has options to remove submissions, up/downvotes and subscriptions

  • DarkThoughts@kbin.social
    arrow-up
    1
    ·
    1 year ago

    I hope he just manually deleted it for the recording and then switched to a tool to do it automatically at least. lol

  • letsgo@lemm.ee
    arrow-up
    0
    ·
    1 year ago

    Reddit’s a US company and GDPR is EU law. Why would an American company be expected to follow EU laws?

    (Not a shill, just genuinely interested. It wouldn’t occur to me as a Brit to demand Reddit comply with GDPR.)

    • wholemilk@lemm.ee
      arrow-up
      2
      ·
      1 year ago

      It seems that foreign companies still have to comply if they are offering goods or services to or monitoring data of people in the EU. I’m not sure if this applies to Reddit in this case but it can be necessary for American companies to comply with the GDPR.