Federated services have always had privacy issues but I expected Lemmy would have the fewest, but it’s visibly worse for privacy than even Reddit.
- Deleted comments remain on the server but hidden to non-admins, the username remains visible
- Deleted account usernames remain visible too
- Anything remains visible on federated servers!
- When you delete your account, media does not get deleted on any server
First - we’re all using alpha/beta software (Lemmy is 0.17.4, Kbin is 0.10.). None of these services are “production quality” software yet, so let’s keep that in our minds - we’re all early adopters.
The points mentioned in the OP are a bad look. Naturally. User should have expectation of their data being deleted on request - especially since this request might be regulatory privacy request (GDPR related). It’s a clear failure from the software and should be improved and iterated upon.
The expectation shouldn’t be “oh well it’s on the Internet, live with it”. While Facebook might keep mining your data after deletion request, our software shouldn’t behave like that, we should strive to be better with this stuff.
And finally, ensuring privacy in federated system is hard. Mastodon suffers from same problems. We shouldn’t give up on the idea though.
It is an early stage software and such things can be worked out, you’re right. But on the other hand, such basic elements should be based on a thorough concept before a single line is coded, and implementing something like a delete button with “Let’s just make it delete the most visible stuff for now, we can always improve that later when there is time” is recipe for disaster.
Agree, it’s a little late to change core architecture. But this is the philosophy the devs ran with, and it has the advantage of longevity when an instance goes offline, then it’s still visible to everyone else.
The more important part for privacy: Mail address is optional, and IP addresses are not stored in the database. A correctly configured instance (at least for EU legislation) also will not log IP addresses in the web server - with that you can have profiles that can’t be tied to an actual human, and you don’t have location and movement data.
The data deletion is pretty much a nice to have - it’s on the level of the Exchange feature to recall Emails: Sure, you can ask nicely, but outside of your own server pretty much nobody will care. Lemmy is federated over multiple jurisdictions, so even with full deletion implemented there’ll almost certainly be instances which will ignore the deletion request - and it will be completely legal for them to do so. More important is education about what you publish, and a basic understanding of the technical and legal realities you’ll have to deal with if you later decide you want that information gone.
I already had that discussion with my 6 year old when she wanted to publish some videos - and she understood the problems quite well.
Lemmy also seems to federate your matrix_user_id, that is clear personal data. It does not matter how the data gets to the federated server, this is still user data within the scope of the GDPR. It does not matter that that server does not have an agreement with the user, the instance that would ignore a GPDR related deletion request would be in direct violation of the GDPR. Maybe it can do that without consequences, though.
I completely understand that making Lemmy fully GPDR compliant will probably be impossible, however I don’t like the approach of “we will not succeed, so we don’t make any attempt”. Instances should actually delete data when that is requested, or instance hosts can get fined. For now, Lemmy has bigger issues to solve, but eventually they should do at least a best effort attempt to respect user data.
I had a look into the wording of the gdpr (more specifically the Data protection act as it is implemented in the UK) it seems to refer to organisations. I think most, if not all, instances are not hosted by organisations. (Just some group or individual hosting it on personal or rented hardware). Laws such as this are designed with centralization in mind, and kind of don’t make sense in the context of decentralisation.
Just like specifying an email address when signing up adding a matrix identifier is your personal choice. Lemmy is perfectly usable without either.
Not a lawyer, but I’d say the instance outside of EU, not targetting EU users would not be in violation - though EU instances transmitting data there might.
With that part I agree - but it should be made clear when deleting something that this is a local deletion, which may or may not propagate to other instances, and will almost certainly not remove the data from the internet.
This is an interesting thought, as data transfer between the US and EU has been an issue with other social networks. Federation between an EU instance and a US instance could be seen as the same thing - data for EU users is being transferred to non-EU servers.
It’s very possible that an EU instance that comes under regulatory scrutiny for whatever reason will have to start requiring Data Processing Agreements (DPAs) from every instance it federates with.
Ultimately that would likely result in a few paid, professionally run instances, which only federate with each other and maybe a few similar instances in other regions with the capacity to provide DPAs.
And next to that, a forest of independent, non-conforming instances flying under the regulatory radar; an entirely separate fediverse from the centralized one where instances disappearing is a regular occurence.
But is it solvable at all in principle? The only enforcement policy available is defederation, but that just means future posts won’t go to that instance, the older posts will still be there. Plus an instance could just lie when confirming delete requests and you’d never know unless the non-deleted posts leaked.
Not really, same as email. Once you send it out and it’s on somebody else’s server, you can request they delete it but that’s about it. They have a copy of your message and can do whatever they want with that.
This is not a principle that needs solving imo, it’s the nature of Internet. If you post it online then you should know that there’s a chance it’ll be there permanently.
Hmm, it’s an interesting problem. I’m afraid you are right and there’s really nothing left but defederation - on the other hand, then it’s the same as with stuff like the parsers that could show deleted reddit messages, or things like waybackmachine, which basically do the same, so the core logic of base lemmy source should be as privacy-respecting as possible.
I remember few years ago when I was reading about Signal that there is some way how you can verify that their server is running on the same code as the one published (and audited heavily), so you can be 100% sure that there were no modifications. Wouldn’t something like that be a solution? That would prevent servers from modifying the code that deletes data. I don’t know how it works, and I couldn’t find it when I tried looking for it again, but assuming such a thing is possible, each Lemmy instance could just have a verify widget on their VCS and you could be sure that this instance really does delete your data, since they didn’t modify the deletion code.
But this is just a theorycrafting, I wouldn’t really have enough experience to create something like that and I can imagine that it’s not an easy thing. But if anyone knows more details about the way Signal verification works, assuming I’m just didn’t misunderstood something (since it’s literally a memory I have of a single sentence from one random article when I was researching best private messages app), I would love to read more about the way it works!
But yeah, outside of that, I’m afraid that the following set of features is mutually exclusive:
Another option would be to create some kind of reputation system, where self-hosted bots could check for servers that still provide posts and comments that should be deleted, and flag offenders. But that’s overengineering anyway, and as I’ve already said - there’s still no way how to stop scraper or anyone from simply copying your data when they see it.