• @sugar_in_your_tea@sh.itjust.works
    English
    1430 days ago

    Even then, some places will reboot on a schedule when nobody should be using it.

    I have some entry-level “enterprise” hardware (a Mikrotik router and a Ubiquiti access point), and I auto-reboot both weekly. In addition to maintaining performance and minor security wins, it also helps ensure everything can survive a reboot (e.g. all configurations have persisted to disk).

    It’s good practice. Some people brag about continuous uptime, I see it as a liability.
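One way to check the "survives a reboot" part is to diff a config export taken before the reboot against one taken after. This is only a rough sketch: `config_drift` is a hypothetical helper, and how you export the config text from a given router or access point is device-specific and not shown here.

```python
import difflib


def config_drift(before: str, after: str) -> list[str]:
    """Return the lines that differ between two config exports.

    An empty list means the configuration came back unchanged after the reboot.
    Both arguments are plain text exports (assumed to be obtained elsewhere).
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    # ndiff prefixes removed lines with "- " and added lines with "+ ";
    # unchanged lines ("  ") and hint lines ("? ") are dropped.
    return [line for line in diff if line.startswith(("- ", "+ "))]
```

If the list is not empty after a scheduled reboot, some setting was probably only ever applied to the running state and never saved.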

    • yeehaw
      English
      630 days ago

      It’s good practice for patching purposes. You should always be maintaining stable OS versions, and a memory leak or the like is fairly uncommon. I think I’ve seen one once in my career, on a particular Check Point OS version.

      • @sugar_in_your_tea@sh.itjust.works
        English
        330 days ago

        Yeah, I’m more worried about keeping up on patches and ensuring things will start back up properly than memory leaks. But minor security and performance wins are nice too.

    • @locuester@lemmy.zip
      English
      630 days ago

      Absolutely. Nothing scarier than rebooting the computer or router that’s been running for 10 years.

      I also enjoy exercising software blue/green rotation weekly. Even if there are no code changes, have it roll to the alternate infra on an automated schedule. It’s a great habit to get into and helps any engineer sleep better. It also gives you very accurate downtime recovery numbers: measurements, not estimates.
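A scheduled blue/green rotation like the one described above could look roughly like this. It is a minimal sketch, not anyone's actual tooling: `rotate`, `health_check`, and `switch_traffic` are hypothetical names, and the two callables stand in for whatever your load balancer or deploy system really provides.

```python
import time
from typing import Callable


def rotate(
    active: str,
    health_check: Callable[[str], bool],
    switch_traffic: Callable[[str], None],
) -> tuple[str, float]:
    """Roll from the active environment to the idle one.

    Returns the environment serving traffic afterwards and the seconds the
    attempt took, so the recovery number is measured rather than estimated.
    """
    target = "green" if active == "blue" else "blue"
    start = time.monotonic()
    if not health_check(target):
        # Idle side is not ready: stay where we are and report the time spent.
        return active, time.monotonic() - start
    switch_traffic(target)
    return target, time.monotonic() - start
```

Run it from a weekly scheduler and log the elapsed time each run; over a few months that log is your real recovery figure.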

    • @dustyData@lemmy.world
      English
      430 days ago

      That’s why all master systems have a backup. At least, that’s how we did it in datacenters 10 years ago. We could run a patch, system update, data backup, system restart, or whatever else was required on almost any piece of kit on the racks without losing continuity of service. Just do the operation on the backup first, then the same operation on the master. If either of them fails, the whole architecture is designed to pick up the tasks and continue as if nothing were wrong.

      It was expensive, but this was mission-critical banking infrastructure. The thing only went offline for account balancing, and that happened at 3am, when it was likely that no one would need it. Even then, for the user there was no loss of service. Transactions still went through, just with a couple of hours of delay while the whole thing synced up.
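The backup-first ordering described above can be sketched in a few lines. This is an illustration under assumptions, not how that datacenter actually did it: `patch_pair`, `operation`, and `healthy` are hypothetical names, and the node labels are just strings.

```python
from typing import Callable


def patch_pair(
    operation: Callable[[str], bool],
    healthy: Callable[[str], bool],
) -> str:
    """Apply a maintenance operation to the backup first, then the master.

    Stops at the first node that fails the operation or its health check,
    so the untouched (or already verified) peer keeps serving.
    """
    for node in ("backup", "master"):
        if not (operation(node) and healthy(node)):
            return f"stopped at {node}; peer keeps serving"
    return "both patched"
```

The point of the ordering is that a bad patch is discovered on the node nobody depends on yet, and the master is only touched once the backup has proven it can take over.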