• Delta Air Lines CEO Ed Bastian said the massive IT outage earlier this month that stranded thousands of customers will cost it $500 million.
  • The airline canceled more than 4,000 flights in the wake of the outage, which was caused by a botched CrowdStrike software update and took thousands of Microsoft systems around the world offline.
  • Bastian, speaking from Paris, told CNBC’s “Squawk Box” on Wednesday that the carrier would seek damages from the disruptions, adding, “We have no choice.”
  • Riskable@programming.dev
    link
    fedilink
    English
    arrow-up
    5
    ·
    6 months ago

    Adding another reply since I went on a bit of a rant in my other one… You’re actually missing the point I was trying to make: No matter what solution you choose it’s still your fault for choosing it. There are a zillion mitigations and “back up plans” that can be used when you feel like you have no choice but to use a dangerous 3rd party tool (e.g. one that installs kernel modules). Delta obviously didn’t do any of that due diligence.

    • ricecake@sh.itjust.works
      link
      fedilink
      arrow-up
      4
      ·
      6 months ago

      Kernel module is basically the only way to implement this type of security software. That’s the only thing that has system wide access to realtime filesystem and network events.

      Yes, they’re ultimately liable to their customers because that’s how liability works, but it’s really hard to argue that they’re at fault for picking a standard piece of software from a leading vendor that functions roughly the same as every piece of software in this space for every platform functions, which then bypassed all configurations they could make to control updates, grabbed a corrupted update and crashed the computer.
      It’s like saying it’s the drivers fault the brakes on their Toyota failed and they crashed into someone. Yes, they crashed and so their insurance is going to have to cover it, but you don’t get angry at the driver for purchasing a common car in good condition and having it break in a way they can’t control.

      What mitigations should they have had? All computer systems are mostly third party tools. Your OS is a third party tool. Your programming language is a third party tool. Webserver, database, loadbalancer, caching server: all third party tools. Hardware drivers? Usually third party, but USB has made a lot of things more generic.

      If your package manager decides to ignore your configuration and update your kernel to something mangled and reboot, your computer is going to crash and it’ll stay down until you can get in there to tell it to stop booting the mangled kernel.

      • Riskable@programming.dev
        link
        fedilink
        English
        arrow-up
        2
        ·
        6 months ago

        It is absolutely not the only way to implement EDR. Linux has eBPF which is what Crowdstrike and other tools use on Linux instead of a kernel module. A kernel module is only necessary on Windows because Windows doesn’t provide the necessary functionality.

        Mitigating factors: Use (and take) regular snapshots and test them. My company had all our virtual desktops restored within half an hour on that day. If you don’t think Windows Volume Shadow Copy is capable or actually useful for that in the real world then you’re making my argument for me! LOL

        Another option is to use systems (like Linux) that let you monitor these sorts of EDR things while remaining super locked down. You can run EDR tools on immutable Linux systems! You can’t do that on Windows because (of backwards compatibility!) that OS can’t run properly in an immutable share.

        Windows was not made to be secure like that. It’s security contexts are just hacks upon hacks. Far too many things need admin rights (or more privileges!) just to function on a basic level.

        OSes like Linux were built to deal with these sorts of things. Linux, specifically, has gone though so many stages of evolution it makes Windows look like a dinosaur that barely survived the asteroid impact somehow.

        • ricecake@sh.itjust.works
          link
          fedilink
          arrow-up
          3
          ·
          6 months ago

          eBPF, the kernel level tool? Because you need to be in the kernel to have that level of access, which is what I was saying? The one with a bug that crowd strike hit that caused Linux servers to KP?
          Yes, I said “kernel module” when I should have said “software executing in a kernel context”. That’s on me.

          By the way, eBPF? Third party software by most metrics. Developed and maintained by Facebook, Cisco, Microsoft, Google and friends. Also available on windows, albeit not as deeply integrated due to the layers of cruft you mention.

          I’m glad you were able to recover your VMs quickly. How quickly were you able to recover your non-virtualized devices, like laptops, desktops or that poor AD server that no one likes?
          Airlines need more than just servers to operate. They also need laptops for various ground crew, terminals for the gate crew and ticketing agents, desktops for the people in offices outside the airport who manage “stuff” needed to keep an airline running.

          You seem to be much more interested in talking about Linux being better than windows, which is a statement I agree with, but it’s quite different from your original point that “Delta is at fault because they used third party tools”.

          My point was that it’s unreasonable to say that Delta should have known better than to use a third party tool, while recommending Linux (not written by Delta), whose ecosystem is almost entirely composed of different third parties that you need to trust, either via system software (webserver), holding your critical data (database), kernel code (network card makers usually add support by making a kernel patch), or entire architectural subsystems (eBPF was written by a company that sells services that use it, and a good chunk of the security system was the NSA).

          None of that bothers me. I just don’t get how it doesn’t bother you if you don’t trust well regarded vendors in kernel space to have those same vendors making kernel patches.

    • catloaf@lemm.ee
      link
      fedilink
      English
      arrow-up
      3
      ·
      6 months ago

      Sounds like they executed their plans just fine.

      And due diligence is “the investigation or exercise of care that a reasonable business or person is normally expected to take before entering into an agreement or contract with another party or an act with a certain standard of care”. Having BC/DR plans isn’t part of due diligence.