I’ve recently been wondering if Lemmy should switch out NGINX for Caddy. I haven’t had any experience with Caddy, but it looks like a great, fast alternative. What do you all think?

EDIT: I meant Beehaw, not Lemmy as a whole.

  • Cinnamon@beehaw.org (OP) · 2 upvotes · 1 year ago

    Would a good solution be to just defer changes to data with something like Apache Kafka? Or changing to something that can be scaled, like CockroachDB or Neon? I also heard ScyllaDB could be a great alternative, mostly from reading the Discord technical blog.
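
A rough sketch of what "deferring" writes could look like, purely for illustration. It assumes an in-process queue standing in for a durable log like Kafka and `sqlite3` standing in for the real database; the `WriteBuffer` class and the `vote` table are hypothetical and not taken from Lemmy's code.

```python
import sqlite3
from collections import deque


class WriteBuffer:
    """Collects pending votes in memory and flushes them in one transaction.

    The deque stands in for a durable log such as Kafka, and sqlite3 stands
    in for the real database. Both are assumptions made only to keep the
    sketch self-contained.
    """

    def __init__(self, conn, batch_size=100):
        self.conn = conn
        self.batch_size = batch_size
        self.pending = deque()

    def submit(self, post_id, user_id, score):
        # Accept the write immediately; the database is touched only on flush.
        self.pending.append((post_id, user_id, score))
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Write all pending rows in a single transaction; return the row count."""
        if not self.pending:
            return 0
        rows = list(self.pending)
        self.pending.clear()
        with self.conn:  # commits on success, rolls back on error
            self.conn.executemany(
                "INSERT INTO vote (post_id, user_id, score) VALUES (?, ?, ?)",
                rows,
            )
        return len(rows)


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE vote (post_id INTEGER, user_id INTEGER, score INTEGER)"
    )
    buf = WriteBuffer(conn, batch_size=3)
    for user_id in range(7):
        buf.submit(post_id=1, user_id=user_id, score=1)
    stored_before = conn.execute("SELECT COUNT(*) FROM vote").fetchone()[0]
    buf.flush()  # push the one remaining buffered row
    stored_after = conn.execute("SELECT COUNT(*) FROM vote").fetchone()[0]
    return stored_before, stored_after
```

The trade-off in a buffer like this is that anything still sitting in memory is lost if the process dies before a flush, which is the gap a durable log is meant to close.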

    • veaviticus · 5 upvotes · 1 year ago

      It’s not the tech here. Postgres can scale both vertically and horizontally (yes, there are other databases that scale more easily or make different CAP trade-offs).

      The problem is how the data is being stored and accessed. Lemmy is doing some really inefficient data access and it’s causing bottlenecks under load.

      Lemmy (unfortunately) just wasn’t ready for this level of primetime yet… It has a number of issues that are going to be quite tricky to fix now that it’s seen such wide adoption (database migrations are tricky on their own, doing them on a production site is even harder, and doing them on 8k+ independent production sites… sounds like a nightmare).

      • Cinnamon@beehaw.org (OP) · 1 upvote · 1 year ago

        Sorry, I assumed it was just an issue with the tech not scaling well. That really shows how little I know about architecture haha.

      • argv_minus_one@beehaw.org · 1 upvote · 1 year ago (edited)

        Can you elaborate on what Lemmy is doing that’s inefficient? I’m working on a database application myself, so the more I know about optimizing database queries, the better.

    • BitOneZero@beehaw.org · 4 upvotes · 1 year ago

      something like Apache Kafka

      Not that I see. A database like PostgreSQL can work, but you have to be really careful about how new data flows into the database, because writing to the database involves record locking and invalidates the cache for output.
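
To make the point about writes invalidating cached output concrete, here is a minimal sketch. It assumes a plain dictionary as the cache and another dictionary standing in for the database; `CachedThreadStore` is a made-up name for illustration and does not reflect how Lemmy actually caches anything.

```python
class CachedThreadStore:
    """Read-through cache of rendered threads, invalidated on every write."""

    def __init__(self):
        self.comments = {}  # post_id -> list of comment bodies (stands in for the database)
        self.cache = {}     # post_id -> rendered output
        self.renders = 0    # how many times the output had to be rebuilt

    def add_comment(self, post_id, body):
        self.comments.setdefault(post_id, []).append(body)
        # Any write makes the cached output stale, so drop it.
        self.cache.pop(post_id, None)

    def render(self, post_id):
        if post_id not in self.cache:
            # Cache miss: rebuild the output from the stored comments.
            self.renders += 1
            self.cache[post_id] = "\n".join(self.comments.get(post_id, []))
        return self.cache[post_id]


def demo():
    store = CachedThreadStore()
    store.add_comment(1, "first")
    for _ in range(3):
        store.render(1)          # one rebuild, then two cache hits
    renders_before_write = store.renders
    store.add_comment(1, "second")
    output = store.render(1)     # the write forced another rebuild
    return renders_before_write, store.renders, output
```

On a busy thread where comments arrive constantly, almost every read lands right after a write, so a cache like this spends most of its time being rebuilt.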

      Or changing to something that can be scaled, like CockroachDB or Neon?

      Taking the bulk data, comments and postings, outside PostgreSQL would help. Especially since what most people are reading on a Reddit-like website is content from the last 48 hours… and your caching potential drops way down as people move on to the newer content.

      The comments alone are the primary problem: there are a lot of them on each posting and they are bulky data. Also, comments are unique data.

      • Cinnamon@beehaw.org (OP) · 1 upvote · 1 year ago

        Hmmm, a good approach might be to split comments into some kind of database regions and load them as they’re needed instead of loading them all at once.
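
One way the "load as needed" part could look is keyset pagination: fetch a fixed-size page of comments and remember the last id as a cursor for the next page. This is only a sketch under assumptions: the `comment` table, its columns, and `fetch_comment_page` are hypothetical, and `sqlite3` is used in place of PostgreSQL so the example runs on its own.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with five comments on post 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE comment (id INTEGER PRIMARY KEY, post_id INTEGER, body TEXT)"
    )
    # Index matching the page query, so a page is a short range scan.
    conn.execute("CREATE INDEX comment_post_id ON comment (post_id, id)")
    with conn:
        conn.executemany(
            "INSERT INTO comment (post_id, body) VALUES (?, ?)",
            [(1, f"comment {n}") for n in range(1, 6)],
        )
    return conn


def fetch_comment_page(conn, post_id, after_id=0, limit=50):
    """Return up to `limit` comments with id greater than `after_id`.

    The second return value is the cursor for the next page, or None when
    this page came back short and there is nothing more to load.
    """
    rows = conn.execute(
        "SELECT id, body FROM comment "
        "WHERE post_id = ? AND id > ? ORDER BY id LIMIT ?",
        (post_id, after_id, limit),
    ).fetchall()
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


def load_all_in_pages(conn, post_id, limit):
    """Walk every page in order, returning the list of page sizes."""
    sizes, cursor = [], 0
    while cursor is not None:
        rows, cursor = fetch_comment_page(conn, post_id, cursor, limit)
        sizes.append(len(rows))
    return sizes
```

Calling `fetch_comment_page(conn, 1, limit=2)` on the demo database returns the first two comments plus a cursor, and a client would only request the next page when the reader scrolls that far.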