Hello, fellow lemmings!

I have a few quick updates about lemm.ee. If you don’t want to read a wall of text, then the key points are summarized here for you:

  • There is a Lemmy upgrade (0.18) on the horizon; executing this upgrade will require downtime for lemm.ee
  • I have made some improvements to our infrastructure in order to reduce those pesky 404 errors that some users have been seeing
  • It’s already looking like ~15% of our infrastructure bill for this month is going to be covered by community funding. A huge thanks to all financial supporters of lemm.ee! It’s extremely heartwarming to see that people believe in this platform and are willing to share the costs with me.

Upcoming 0.18 upgrade

With the next version of Lemmy nearing completion, I am starting to plan the upgrade for lemm.ee.

With the 0.17.3 -> 0.17.4 upgrade, I was able to keep lemm.ee online during the upgrade with no downtime. That’s how I would prefer to do all upgrades in the future as well, but unfortunately, there are some fundamentally incompatible changes in 0.18. This means that running a mix of 0.17.4 and 0.18 servers in our infrastructure at the same time will not work - effectively meaning that we can only execute this upgrade with some downtime.

In order to keep surprises to a minimum, I am planning to create a post with a title like “When this post is 1h old, the server will go down for an upgrade”. Once 1h has passed from that post, you will be unable to access lemm.ee until the upgrade completes. If everything goes smoothly, then total expected downtime will be around 15 minutes, but in case of any issues, it could be slightly longer!

It’s not clear yet when 0.18 will be fully ready, but if everything goes well, then this could already happen as early as next week. I will keep you all posted!

Why do we even want 0.18?

There are some very important optimizations landing in 0.18, which should help make the Lemmy UI feel considerably snappier and at the same time give the backend servers some much-needed breathing room. This should help take a lot of pressure off the federated network as a whole, and is a good first step towards scaling further.

Additionally, there are some key fixes that AFAIK will all land in 0.18, such as:

  • Additional posts will no longer automatically appear in your feeds while you’re scrolling
  • You should stop getting redirected onto a completely different post when opening other posts in other tabs
  • The front page will stop showing stale posts for all instances (lemm.ee users will have been enjoying this patch since yesterday already, as I am the author of the patch and decided to apply it early here 😃)

All in all, 0.18 is looking like a great upgrade, so I’m personally looking forward to it.

Random 404 errors

Several users have been experiencing errors on lemm.ee (and similarly on other instances) where some page loads will fail with a white page and a 404 error.

I have spent some time debugging and attempting to mitigate this issue today. I have identified the root cause (spikes in database load related to the number of new posts in the federated network in every 5-minute interval), and after some database tuning, I have managed to significantly mitigate this issue. Previously, this issue was appearing for about 6000 page loads every hour. In the hour following my changes, this error only appeared for roughly 596 page loads! It’s still not 0, so I will continue to try and improve this, but we are starting to brush up against the limits of what our current database infra can manage.
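
As a rough check on the figures above, going from about 6000 to roughly 596 failing page loads per hour is a drop of about 90%. A minimal sketch of that arithmetic (the function name is purely illustrative and not part of any Lemmy tooling):

```python
def reduction_percent(before: int, after: int) -> float:
    """Return the percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("`before` must be positive")
    return (before - after) / before * 100


# Figures quoted in the post: failing page loads per hour.
errors_before = 6000
errors_after = 596

print(f"{reduction_percent(errors_before, errors_after):.1f}% fewer failing page loads")
```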

In the longer term, we will seriously benefit from any Lemmy optimizations - I am hopeful that even 0.18 will start bringing down the load on our servers. Additionally, we have a lot of room to upgrade our database infrastructure, but of course this would mean increasing the budget, which I’m not in a position to do for now. This segues us nicely into the third and final topic I wanted to cover:

Server costs

As of today, our infrastructure has scaled up to the limit of what my own budget will allow. To be more specific, I am able to keep the servers running as is indefinitely, but I am not able to make any further upgrades to our servers out of my own pocket.

Thankfully, we have some extremely kind members in our community, who have already decided to begin supporting lemm.ee and thus ensuring that every single one of us can enjoy a well functioning platform and potential further upgrades down the line! As of today, we have 4 supporters who have signed up for monthly (!!) contributions on my GitHub sponsors as well as one supporter who has donated money through my Ko-Fi page. I want to seriously thank each of you! I am personally super excited about Lemmy as a network, and specifically lemm.ee as an instance, so I’m truly happy to see that others share this excitement and are willing to join me in funding all this.

Pinning updates on the front page

Finally, I am looking for some feedback on how you feel about update posts such as this being pinned to the top of your lemm.ee front page.

My current plan is to pin this post on the front page for the next ~24 hours. After that, I will unpin it, but you will still be able to find it in !meta@lemm.ee.

I have seen some comments complaining about too many pinned posts, so alternatively, I could start pinning the latest site update post to the top of the !meta community, and avoid pinning it to the front page altogether.

If you have thoughts about this (or anything else I have mentioned), please comment below!

  • Jenga@lemm.ee

    I’m new to this instance, but loving the communication so far! Definitely happy with pinned posts to share updates like this.

    • InEnduringGrowStrong@lemm.ee

      Choosing the “right” home instance isn’t too obvious as a noob.
      This transparency, not only on funding, but also at the technical level is very nice.

    • kneelknee 🐖@lemm.ee

      I also don’t mind the pinned posts. I like to know if anything new is going on, and it’s pretty easy to just scroll past them when there isn’t. If these pinned posts do get unpinned after a certain amount of time, I’d prefer a little longer (maybe 48 hours?)

  • TWeaK@lemm.ee

    As of today, we have 4 supporters who have signed up for monthly (!!) contributions on my GitHub sponsors as well as one supporter who has donated money through my Ko-Fi page.

    What’s your preferred method for donations?

    • sunaurus@lemm.eeOPM

      Thanks for asking! The difference is not huge, but Ko-Fi does take slightly more in fees than GitHub sponsors (for recurring payments in particular, Ko-Fi takes 5%), so GitHub would be the preferred option for now.

      • OneDimensionPrinter@lemm.ee

        Just signed up for a monthly thing on GitHub. Did a ko-fi donation before I saw this comment though. But here’s hoping we can get you more than enough to fund the instance!!

  • Notorious@lemm.ee

    Hey @sunaurus, I really appreciate all of your hard work. I already sponsored you on GitHub, but money only helps so much - don’t burn yourself out. If you need anyone to lend a hand, please don’t hesitate to ask. Infra is one of my specialties, but I’m happy to mod or do whatever else you need.

    • Prefix@lemm.ee

      +1! I am not an infra guy, but a SWE by trade. Happy to lend a hand however I can.

  • holgersson@lemm.ee

    Is it possible to support you by means other than donations as well? I’m a DevOps engineer and could possibly help with cloud infra, observability or automation.

    Thank you for your work so far! I actually like the pinned updates, they show that you care about the server and transparency.

    • sunaurus@lemm.eeOPM

      Thanks a lot for the offer! The operational work is totally manageable for me so far, but if that changes, then I’ll definitely reach out to the community to try and find some help.

  • coliseum@lemm.ee

    Thank you for having a working instance. I tried to sign up on so many others, and they just plain did not function. It got so bad that I considered running my own…

  • Brad@lemm.ee

    I like the pinned posts personally. I like to keep informed about what’s happening & what your plans are. Also thank you to the folks that have already donated, I hope I’ll have the money in the coming months to also be a regular contributor to the network.

  • borstis@lemm.ee

    Additional posts will no longer automatically appear in your feeds while you’re scrolling

    Oh yeah!

  • two_wheel2@lemm.ee

    Looks great! Thanks for doing this! I don’t see anywhere here the approximate monthly costs… only what the money is being spent on. Do you have a figure for how much goes into running this instance?

    • sunaurus@lemm.eeOPM

      The current projected bill for our whole infra in the month of June is $147. This covers the load balancer, 3 servers + database server, object storage for image uploads and our e-mail service. This may increase a little bit if we go higher than expected on bandwidth, object storage or outgoing e-mails.

      • two_wheel2@lemm.ee

        Alright, I’m tossing my tiny hat into the sponsor ring. Thanks so much for putting this community together! I’m excited to see it grow. Just out of curiosity, what does the incremental cost look like? Does it scale well with users? Or does it explode a little bit?

        • sunaurus@lemm.eeOPM

          That’s greatly appreciated!

          In terms of costs of scaling, I would say we’re positioned a bit better than many other Lemmy instances at the moment, thanks to the fact that we employ horizontal scaling as much as possible for the Lemmy software itself.

          By the way, AFAIK, lemm.ee is the only non-experimental Lemmy instance that has chosen to go with horizontal scaling so far. If anybody knows of any other instance that is doing it, I would be super interested to know about it! All the admins I’ve spoken to so far myself have confirmed that they are only doing vertical scaling.

          More technical details below for anybody who is interested:

          There are two approaches you can generally take for scaling - horizontal, where you add more load balanced nodes of more or less the same power, or vertical, where you increase the power of an individual node (of course a mixture of both is also possible).

          One of the benefits of horizontal scaling is that in most cases, it’s significantly more flexible compared to vertical scaling. For example, at my current cloud provider, the only upgrade path for vertically scaling a server would be 8 CPU -> 16 CPU -> 32 CPU -> 40 CPU. So if you’re on a 16 CPU server, and you need just a little bit more headroom, then your only option is to upgrade to the 32 CPU server, which is straight up double the power (and cost!). Meanwhile, with horizontal scaling, you can just keep adding smaller servers (say 2 CPU each) one at a time, thus growing costs more gradually and appropriately for your actual needs.

          So for lemm.ee, this horizontal scaling means that when our backend servers start getting overloaded, I can just add one or two more servers without exponentially increasing costs.
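
The difference in step size described above can be illustrated with a small sketch. The tier list and the 2 CPU node size are taken from the comment; everything else (function names, the idea that capacity is measured purely in CPU count) is a simplifying assumption, and real pricing or per-node overhead is not modelled:

```python
import math

# Vertical tiers quoted in the comment above (CPUs per single server).
VERTICAL_TIERS = [8, 16, 32, 40]

# Size of each horizontally added node; 2 CPUs is the example size from the comment.
NODE_CPUS = 2


def vertical_capacity(needed_cpus: int) -> int:
    """Smallest single-server tier that covers the required CPU count."""
    for tier in VERTICAL_TIERS:
        if tier >= needed_cpus:
            return tier
    raise ValueError("requirement exceeds the largest available tier")


def horizontal_capacity(needed_cpus: int) -> int:
    """Total CPUs provisioned when covering the need with identical small nodes."""
    return math.ceil(needed_cpus / NODE_CPUS) * NODE_CPUS


# Compare how much capacity each approach has to provision for a given need.
for needed in (16, 17, 20, 33):
    print(
        f"need {needed:>2} CPUs -> vertical: {vertical_capacity(needed):>2}, "
        f"horizontal: {horizontal_capacity(needed):>2}"
    )
```

Under these assumptions, needing just one CPU more than a 16 CPU server forces a jump to 32 CPUs vertically, while the horizontal setup only grows by one 2 CPU node.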

          • OneDimensionPrinter@lemm.ee

            As someone who has “been there and done that” at a much larger scale than many devs may ever get a chance to (not a brag, it can suck royally) this really seems like the smart choice.

            This is effectively a basic web server scenario, and horizontal scaling tends to work really well up to a point. And frankly, it’ll be a long while before that becomes the bottleneck.

            Smart choices you’re making. All the best and I’m happy to help out monetarily where I can!

          • electromage@lemm.ee

            Are your servers in one geographic region? Could you scale across regions for better performance?

            • Notorious@lemm.ee

              My personal opinion is that this is outside the scope of a single instance. The whole idea behind Lemmy is to have multiple instances to accommodate different geos and different languages.

            • sunaurus@lemm.eeOPM

              I am already leveraging Cloudflare’s globally distributed cache, which helps improve performance even if you’re far away from the backend server. But this only helps partially, not with all types of requests.

              lemm.ee is hosted in central Europe, and based on monitoring, it does seem that most users are having a pretty decent experience on lemm.ee regardless of their geographic location so far. One key exception to this is the short windows of database load spikes, which last for roughly 10 seconds every 5 minutes. During these spikes, everybody suffers equally, regardless of where they are in the world 😅.

              But in general I agree with the sibling comment by @Notorious - rather than scaling one instance to be some massive globally distributed powerhouse, it makes sense to spread out the load amongst a lot of different instances.

              • electromage@lemm.ee

                Thank you for your work and communication! I agree it doesn’t make sense to invest in global infrastructure unless everyone does it, and the return wouldn’t be worth it. We’ll just have to get used to some performance issues as the fediverse takes off!

              • OneDimensionPrinter@lemm.ee

                Are the DB spikes ACTUALLY every 5 minutes or is that just kind of a guess? I ask because if it’s consistent, it’s gotta be some sidecar process somewhere in the stack that can be fiddled with.

                That said, it really sounds like you know what you’re doing already so I’ll just go play with my new communities.

                • sunaurus@lemm.eeOPM

                  The spikes are caused by a specific recurring process which runs every 5 minutes. I have already significantly optimized it with a patch on lemm.ee, and I’m working on getting it merged upstream as well!

          • xavier666@lemm.ee

            For storage, I can understand how horizontal scaling works (add more storage nodes to, say glusterfs). But how does it work for CPU? Since adding a 2CPU VM can be physically on another server, it would need lemmy to work in a highly distributed manner, i.e., CPU instructions need to cross the network.

            Is this distributed feature a part of lemmy or is there another abstraction layer?

            • sunaurus@lemm.eeOPM

              This is where our load balancer comes in. All requests go through the load balancer, and this load balancer will try to evenly distribute the requests to all of our backend servers.

              Is this distributed feature a part of lemmy … ?

              In fact, it’s the opposite - Lemmy has so far had some assumptions built into the code which make it quite hard to run on multiple servers. I have made some modifications in order to improve this (and contributed those modifications back to the main repo as well). It’s one of the things I want to keep improving as we grow.
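
To make the load balancer's role more concrete, here is a minimal sketch of one common way to spread requests evenly: round robin. The thread does not say which strategy the lemm.ee load balancer actually uses, so the algorithm, the class name and the backend names below are all assumptions for illustration only:

```python
from collections import Counter
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation (hypothetical illustration)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly in the same order.
        self._rotation = cycle(list(backends))

    def pick(self) -> str:
        """Return the backend that should receive the next request."""
        return next(self._rotation)


# Hypothetical backend names; each request is handled entirely by one server,
# so no individual CPU instruction ever has to cross the network.
balancer = RoundRobinBalancer(["backend-1", "backend-2", "backend-3"])
handled = Counter(balancer.pick() for _ in range(9))
print(handled)
```

The point of the sketch is that the unit of distribution is a whole HTTP request, not CPU work inside a request, which is why adding another small server increases total capacity.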

              • xavier666@lemm.ee

                Here is my oversimplified understanding of the backend of lemm.ee: [linked diagram]

                Am I correct? Or is there another loadbalancer in front of the DB?

                Sorry for asking so many questions, but I’m new to system design and trying to learn about practical deployments.

                • sunaurus@lemm.eeOPM

                  That’s pretty close, but there are some nuances.

                  1. One of the servers is currently exclusively dedicated to handling images (processing, indexing, resizing, uploading to object storage)
                  2. One of the servers is only handling Lemmy HTTP requests
                  3. One of the servers is handling Lemmy HTTP requests + at the same time also handling Lemmy background tasks (different cleanups, updating the front page rankings, etc)

                  Additionally, we are not using Docker at all for lemm.ee. Not that I have anything against Docker - I use it regularly in other projects - it just wouldn’t provide any advantages for lemm.ee at the moment.

        • bric@lemm.ee

          Same. I’m not putting in a ton, but monthly donations go a long way to help with monthly server costs. We just need 150 people to put in $1 a month and we’ll be covered indefinitely
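
A quick sketch of the arithmetic behind that estimate, using the projected $147 monthly bill mentioned earlier in the thread. The function name is illustrative, and the fee handling is a simplification: the 5% rate is the Ko-Fi recurring fee quoted above, while payment processor fees are not stated in the thread and are ignored here:

```python
import math


def supporters_needed(monthly_bill: float, pledge: float, fee_rate: float = 0.0) -> int:
    """Number of supporters pledging `pledge` per month needed to cover the bill.

    `fee_rate` is the fraction of each pledge kept by the platform (assumed flat).
    """
    if not 0 <= fee_rate < 1:
        raise ValueError("fee_rate must be in [0, 1)")
    net_per_supporter = pledge * (1 - fee_rate)
    return math.ceil(monthly_bill / net_per_supporter)


PROJECTED_BILL = 147  # projected June bill in USD, as quoted in the thread

print(supporters_needed(PROJECTED_BILL, pledge=1))                 # $1/month, no fees
print(supporters_needed(PROJECTED_BILL, pledge=1, fee_rate=0.05))  # $1/month, 5% platform fee
print(supporters_needed(PROJECTED_BILL, pledge=5))                 # $5/month, no fees
```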

          • two_wheel2@lemm.ee

            Exactly. I’ve tossed in $5/mo and I literally just realized that with Reddit never in my WILDEST DREAMS would I have imagined kicking in some money for something like gold or trophies or even Apollo (RIP), but $5 a month to contribute to supporting a distributed community of people beyond myself feels like nothing to me. I think that speaks to the potential federation + good will can offer the world

      • dan@upvote.au

        If they’re all powerful servers, $147 is pretty good for that many of them! Out of curiosity, are you using Hetzner? VPSes or physical servers?

      • OneDimensionPrinter@lemm.ee

        Shit. That’s a bunch of hardware/services. I hope the donations keep coming in. I’ll gladly drop a few bucks a month for quality updates and a relatively stable instance.

        Thank you for running this so I don’t have to deal with it myself.

  • Hedup@lemm.ee

    15 minutes is nothing. But as usual, those 15 minutes might easily become hours if something breaks. If you want to minimize the downtime inconvenience as much as possible, you could do it in the middle of the night in the instance’s timezone. But I imagine you’d like to sleep too. Second best time would be early morning.

    Do you plan to use 0.18 right as it comes out? A more cautious approach would be to monitor some other instance that pilots it for a day or two and then, if it works smoothly, adopt.

    • sunaurus@lemm.eeOPM

      0.18 is already being piloted on https://voyager.lemmy.ml, but this is not a federated instance and it has very few people actively testing things, so for sure some issues could come out once 0.18 starts being rolled out on proper instances.

      In general, lemmy.ml seems to have always been the first to roll out new versions in order to verify them. My plan for now is to give it some time on lemmy.ml and then follow with the upgrade. I expect any major issues to become apparent quite quickly (probably within hours rather than days).

      • baconeater@lemm.ee

        Just wondering if there is an update on when the 0.18 rollout will be? I updated Jerboa to 0.0.35 and so can’t use it to access Lemmy instances below 0.18 (just to clarify, I’m totally fine with this - just using a web browser for now). Thanks for all the work you are doing to build and maintain this!

        Oh and I like the idea of having pinned posts (as long as they are relevant) to convey important information (such as when lemm.ee will be down during the server upgrades). That way more people will see the message even if they don’t necessarily subscribe to the meta (lemm.ee) community

        • sunaurus@lemm.eeOPM

          For the moment, there are several key issues with 0.18, so I’m holding off on the upgrade until they can be addressed - most likely it will happen some time next week. I’ll let you guys know ahead of time before I upgrade!

  • sleepyducky@lemm.ee

    Thank you for everything you have been doing! I am loving it here so far! I enjoy the pinned posts. I might be missing a regular update post when I am working but with the pinned posts I don’t feel like I am missing anything. I love the transparency of it all

    • FerretOnLemmy@lemm.ee

      I’m guessing you’re a Reddit refugee like me, and yeah Lemmy seems pretty chill so far. I hope only the good people of Reddit migrate here tho, Reddit was mostly porn and we don’t need that crap here