Sensationalist title yes, but this is something that is partially true.
TL;DR: I am not spreading FUD. This space can be safer than many for the privacy aspect it was actually designed to maintain, which is the complete opposite of the privacy principle most new people are coming from. A monolith platform provides a measure of control over how public your engagement is while leaving you open to being tracked; an open federated platform protects you from being tracked at the cost of less control over how public your engagement is (and will remain). Some people do not understand this and will change the way they engage once they do.
There is a lot of misinformation I am seeing (or at least glossed over information) that could lead the less informed into peril. I am hoping to provide clarity and maybe shift the attitude of some of the more technical among the community. Not everyone is educated in the same domains, and not everyone will grasp some of these concepts easily.
Every thread started along the lines of “Discovered X in Lemmy is not private” is followed up with a comment “Eh, not really an issue. And I reviewed the code myself, an account deletion removes everything from the db”. I push my glasses up: “Ackchyually, that isn’t really true in practice. If defederation happens, or instances are otherwise disconnected (which will always happen in some capacity), a copy will remain in the Lemmiverse, forever”. This is followed up with “well duh, that is how federation works, and everything you post on the internet is copied and there forever. It is no different than a scrape or a screenshot”.
There are nuanced but very important distinctions between a scrape or screenshot and a federated, distributed, indexed copy. Those distinctions will change the way many engage with the platform.
Most people are not having screenshots taken of every post they make, when they make them. Most don’t have to be concerned with wildly compromising material tanking their run for office. It takes a high degree of intent and effort for someone to go to external, unauthorized sources of duplication. Such a source may not hold a complete profile history. Most archives are not going to be indexed and easily searchable on mainstream search engines. Unauthorized archives can get sued into oblivion or otherwise disappear.
Not everyone is able to grasp a platform that acts kind of like a single entity but is not a single entity, especially if they are a refugee from a monolith platform. Many just see it as a single entity initially and when they see “removed from the db” they will assume any such action means platform wide.
A federated copy is automatic and effectively instant by design. A federated copy will be a complete profile. A federated copy will show up in federated searches. A federated copy could end up readily showing up in external indexes. A federated copy may have engagement the user isn’t notified of. A user on an instance where defederation has happened may easily come across an entire profile history in a frozen state. Attention can be brought to content that the user desires censored because it will say “edited” or “deleted by user X” and a SnoopyJerkison could just switch to an instance account that has a copy with two clicks in the official app.
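To make the mechanism concrete, here is a minimal toy sketch of the behaviour described above. It is not Lemmy's code and does not use ActivityPub; the `Instance` class, its methods, and the `*.example` names are all hypothetical, made up for illustration. It assumes only what the post states: a copy is pushed to every connected instance on publish, and a delete can only reach instances that are still connected.

```python
class Instance:
    """Hypothetical toy server: a local post store plus connected peers."""

    def __init__(self, name):
        self.name = name
        self.posts = {}     # post_id -> body
        self.peers = set()  # instances we currently federate with

    def federate_with(self, other):
        self.peers.add(other)
        other.peers.add(self)

    def defederate_from(self, other):
        self.peers.discard(other)
        other.peers.discard(self)

    def publish(self, post_id, body):
        # The copy goes out to every connected peer at publish time.
        self.posts[post_id] = body
        for peer in self.peers:
            peer.posts[post_id] = body

    def delete(self, post_id):
        # The delete only reaches peers that are still connected.
        self.posts.pop(post_id, None)
        for peer in self.peers:
            peer.posts.pop(post_id, None)


def holders(post_id, instances):
    """Names of the instances that still hold a copy of the post."""
    return sorted(i.name for i in instances if post_id in i.posts)


home = Instance("home.example")
friendly = Instance("friendly.example")
estranged = Instance("estranged.example")
everyone = [home, friendly, estranged]

home.federate_with(friendly)
home.federate_with(estranged)
home.publish("post-1", "a comment written in haste")
before = holders("post-1", everyone)

estranged.defederate_from(home)  # the link breaks after the copy was made
home.delete("post-1")            # "removed from the db" on the home instance
after = holders("post-1", everyone)

print(before)  # all three instances hold the post
print(after)   # only the disconnected instance still holds it
```

In this sketch the home instance and its still-connected peer honour the delete, while the disconnected instance keeps its frozen copy, which is the situation the paragraph above warns about.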
I have made an informed decision on how I will engage by recognizing this. I’ve accepted the folks on my local are always going to see my spelling as impecab… impeccibahh… very good, while some other local may see me as the philistine that I am before an edit. I will inevitably doxx myself in some way but it might be nice to have a stalker. It’s just me and the damn dog on our private fiberglass island here and she isn’t much of a conversationalist. I am in a place in life where I’m pretty comfortable with myself and have no problem walking around here with no pants on. Not sure why I recently got onto using pant idioms at every opportunity, but I have accepted that if it follows me around with folks replying, “I know you, you’re that guy with no pants!”, I won’t be able to go back and remove the sources of the reference platform wide.
I’ve made comments I cringe a little at. Entirely benign and nothing I’m losing sleep over, but in haste they were not expressed in my usual voice nor really contributed to the discussion. If I had hesitated longer I would not have responded. Point being: I’m the one ringing alarm bells about this and I am still having to remind myself of the nature of federation.
Some people may not be comfortable with this, or could become less comfortable later. They should not be led to believe that it is a simple matter of “the internet doesn’t forget, but you can delete it from the platform”. They need to understand that they must be very cognizant and thoughtful in how they engage, because federation is very unforgiving and really doesn’t forget. This is a feature, not a bug. At its core, federation is balancing many goals, from censorship resistance to community safety to privacy. It can actually provide an extreme level of privacy. But people will make mistakes, and those mistakes will remain here, right in their face, if they aren’t extra careful. It won’t be in some dark archive. It won’t be in a screenshot never taken and never posted. The reminder of an accidental slip up will be here to perpetually haunt them. They will leave (likely traumatized by it for years to come).
A federated copy will have the perception of being more legitimate, true or not. The average non-technical person won’t understand if they find something you posted hosted on a site you are ideologically opposed to, which it will be. Imagine my embarrassment at the next Pantless-Meeting-Pantless event when I get stopped at the door and shown the posts they believe I have actively made on “never-nude.social”. “But… but… federation!”. “Ok Captain Kirk. Here’s your pants. Now scram!”
Some want to have assurance they can remove content platform wide for other reasons. Revoking support for a platform is one that seems to be in vogue right now. I’ve seen posts like “that site we hate is restoring our retracted posts!”. But I’ve seen cases right here on Lemmy where a user has censored all their content, only to come across that same content on other widely used instances completely intact.
This loss of edit access happens fast. Every user at this local will be aware of the high profile cases of defederation. This is a feature by design, and one you can expect more of I suspect. There are also simply errors in federation at times. I’ve lost access to copies on a popular instance the second I posted them.
Maybe this will change. It will be a monumental challenge. And it isn’t the case now. Users have to fully understand this.
“So what, screw the normies. Let them find out the hard way. It’s getting too crowded here anyway. Like you pantless sinnerdotbin! Git outta here if you don’t like it here in the wwwild-wild-west”.
Yet another aspect some are failing to recognize: many of the instances exist in places where they do take privacy very seriously. There are laws about disclosing collection, use and retention of data. One day you may visit your trusty local and you may find a blank page with a single statement: “I keep having very expensive embodied suits appear on my doorstep holding crisp manilla envelopes. I may be breaking the law. I am shuttering immediately”. Hope I didn’t want a reputation of wearing buttless-chaps instead of no pants ‘cause I ain’t got access to modify any of it now.
I’ve seen admins advising others to block EU in their firewall because they are aware of this liability and the lack of a privacy policy. That is a big part of the world that will have limited contribution to this movement.
Policies go a long way to establish user trust. I have gained a high level of confidence in some admins. They are competent, capable, and thoughtful about their users. People have been investigating hardening beyond what I would expect from any admin. They could showcase this level of care and intent by explaining it in their policies.
Privacy policy frameworks can also help new admins navigate responsibilities that keep their users, and the wider platform, safe.
Don’t hand wave this aspect away with “don’t post anything you don’t want public on the internet”. This is a totally different beast. Educate those not as fortunate as you to understand how this actually works. It is designed for your actual traceable information to be kept safe by the gatekeepers, the admins. Users must be highly aware: everything else you do here is public in a way you may never have experienced before.
Don’t hand wave the concern about post/profile/vote/message privacy, explain how the privacy goal is different here and how one might mitigate the aspects they are not comfortable with.
I have started a project where I intend to provide basic policy frameworks that one might use as a point of reference and I would very much like further input on it.
https://github.com/BanzooIO/federated_policies_and_tos/
These policies are going to be terrifying for the uninitiated. I have drafted an optional privacy policy preface that may help admins express the clear distinctions between their responsibility, their users’ responsibility, and the actual real privacy goals in this emerging space.
https://github.com/BanzooIO/federated_policies_and_tos/blob/main/optional-privacy-policy-intro.md
- End transmission, engage pantalon. Zip


The difference is that rather than just having no expectation of privacy against recording (Reddit model), in federated space you are guaranteed that an official subtitled hologram with sound is recorded by design, shipped to other town squares all over the world, and shown. And you have no expectation that you’ll be able to convince those town squares to delete theirs once they have it, and basically no chance of it if your own town square is bulldozed or your town has gotten into a feud with theirs since you did your town-square shouting.
Sure but practically this has always been true. Internet archive, ceddit, removedit, reveddit, whatever and they have been archiving information. Sure it’ll get trickier without api access, but it’s never been something you should reasonably expect anyway. I mean, what ever happened to the advice “once it’s on the internet it’s there forever.” It’s not strictly true, even for federated systems, but it’s good advice.
Ultimately people just have this fantasy notion of privacy on the internet. A false idea of control over their data. I’m pretty privacy minded, but you can sure as hell bet that anything I willingly post on the internet I’m expecting it to stay there forever out of my control. But data harvesting? Manipulative posting and amplifying? These are genuine privacy problems not borne of simple impracticality.
You’re not doing it, but given the number of people I see who complain about privacy on Lemmy and then turn around and use every data-harvesting service known to man, from TikTok to Google to Reddit itself, I’m not sure I can take their complaints seriously. And ultimately this comes down to different conceptions of privacy, sure, but one of these conceptions is suspiciously impossible to fix yet simultaneously deflective of the other, that other being directly beneficial to companies and any seeking to control mass populations.
ceddit and the others you have noted have historically broken for a variety of different reasons, and the rest are currently not functioning as the API they used was banned May 1st. Pushshift, which these services often used, had a mechanism to remove sensitive data you accidentally posted or otherwise wanted removed.
Archive.org is not searchable and not indexed in mainstream search engines. It would also be responsive to legal requests. It is hard to get a complete profile history on someone from it.
All of these external sources require a great deal of extra effort from someone to pry.
The concern to be aware of here isn’t that it could be scraped, which yes it can. The concern is that it is duplicated by design, wide and broad, on a platform that somewhat functions as a single entity, the instant you hit submit.
People make mistakes. The Unabomber got caught by doxxing himself with a single phrasing of an idiom. Not complaining, simply saying “be very, very, very, very, very, very, very, very, very careful here”
Exactly. The privacy goal on federation is different. If people are educated, they can be safer.
You can’t eat your cake and have it too.
The point I’m making is that there’s no reasonable way to expect that level of privacy in the first place. Those public facing services do it, and anyone with a small server could do it (and they are, check out datahoarder). My explicit point here is that federated services just make this more obvious. In practice, federated servers will likely respond to and willingly comply with delete requests. A server that intentionally doesn’t can be easily defederated. Even more so, servers which refuse delete requests on principle would undoubtedly run into legal trouble, especially with GDPR. This parallels the fact that, in practice, nobody is likely to have much interest in backing up most random comments on other social media at all.
The idea that this nonprivacy is unique to lemmy, and not the base assumption people should have been making the whole time they’ve been using the internet, is the absurd part.
Indeed it could, but the level of scraping is very very different. Other social media scraping isn’t just your public facing content submissions, but everything else about your usage of the media too. Dwell time, what links you click on, what posts you look at and read longer, private message information, who knows what data ads can scrape just existing, etc. etc. On the other hand, Lemmy, like other social media, has three vulnerabilities for scraping/privacy that I can see:
1. Public facing information, which anyone can scrape any time they want.
2. Private facing information, which servers could scrape from their own users, but that could be noisy to other servers. Things like direct messages, clicks, etc.
3. Unlimited signups and federated servers allow for the potential of bot manipulations. Different servers will have different approaches to dealing with this, but it likely won’t be encouraged by most servers. (Could be by some, though, and people need to watch out for that.)
Even so, none of these approaches the level of data scraping that other social media perform constantly. Now that said, I would like to see changes to number 2 to make sure that attempts to do so are noisy, but whether that’s even possible is another question, given the nature of servers. Nonetheless that’s the point of having to entrust your information to a given server: you have to trust them. So you should only provide information you would trust any server admin to have.
Focusing on the other conception of privacy, the one about public facing content, is detracting from discussion of, and effort on, these other two vulnerabilities.
I feel you didn’t read the original post. It isn’t about expecting privacy, it isn’t a criticism of the fundamentals of Lemmy as many seem to be taking it (there are many ways I explain how it is more private from being tracked and profiled).
It is about understanding how privacy is maintained on a federated platform.
Many users coming from other platforms do not understand the mechanisms here and how they are different. Take a look at the comment here about vote privacy: many assumed votes were private because they came from a platform where they were.