  • Adding a line: ✅
  • Removing a line: ✅
  • Modifying a line: ✅
  • Moving a codeblock: ❌ I see you’ve rewritten everything, let me just highlight it all.
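For anyone who wants to see the last case in action, here is a minimal sketch using Python's standard `difflib`. The file contents and the `changed_ops` helper are made up for illustration, and real review tools may use a different diff algorithm, so treat this as a rough demonstration only. A line-based matcher has no "move" operation, so a relocated block can only be reported as removed lines in one place plus added lines in another.

```python
import difflib

# Hypothetical file contents: the helper() block moves from the top to the bottom.
before = [
    "def helper():",
    "    return 1",
    "",
    "def main():",
    "    return helper()",
]
after = [
    "def main():",
    "    return helper()",
    "",
    "def helper():",
    "    return 1",
]


def changed_ops(a, b):
    """Return the non-'equal' opcodes that a line-based matcher reports."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


ops = changed_ops(before, after)
for tag, i1, i2, j1, j2 in ops:
    # Each opcode is an insert, delete, or replace; none of them means "moved".
    print(tag, before[i1:i2], "->", after[j1:j2])
```

Nothing in the content changed here, yet the diff is non-empty, which is the reviewer experience the meme is poking at.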

RIP reviewers on my PR.

(Meme created by my coworker)

  • @sim642@lemm.ee
    8 months ago

    Diffing algorithms on trees might not be as efficient, especially if they have to find arbitrary node moves.

      • @sim642@lemm.ee
        8 months ago

        It’s not necessarily about the load, it’s about the algorithmic complexity. Going from lists (lines in a file, characters in a line) to trees introduces a potentially exponential increase in complexity due to the number of ways the same list of elements can be organized into a tree.
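To put a rough number on that growth, assume purely for illustration that the trees are binary and that the list elements are the leaves in a fixed order. The number of distinct shapes is then the Catalan number C(n-1). This is a sketch under that assumption, not a statement about how any particular tree-diff tool counts its search space, and the function name `binary_tree_shapes` is made up.

```python
from math import comb


def binary_tree_shapes(n_leaves: int) -> int:
    """Count full binary trees whose leaves are n_leaves items in a fixed order.

    This equals the Catalan number C(k) = comb(2k, k) // (k + 1) with k = n_leaves - 1.
    """
    if n_leaves < 1:
        raise ValueError("need at least one leaf")
    k = n_leaves - 1
    return comb(2 * k, k) // (k + 1)


for n in (1, 2, 3, 4, 5, 10, 20):
    print(n, binary_tree_shapes(n))
```

A flat list of the same elements has exactly one shape, so the extra candidates all come from the tree structure itself.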

        Also, you’re underestimating the amount of processing. It’s not about pure CPU computations but RAM access or even I/O. Even existing non-semantic diff implementations are unexpectedly inadequate in terms of performance. You clearly haven’t tried diffing multi-GB log files.

        • @killeronthecorner@lemmy.world
          8 months ago

          Log files wouldn’t fall under the banner of compiled languages or ASTs, so I’m not sure how that example applies.

          And I’m aware that it can lead to O(n²) complexity but, as others have pointed out, there are already tools that do this, so it is within the capabilities of modern processors.

          Yes, there will be cases where the size of the search space makes it prohibitive to run in a reasonable time, but judging by the existing tools and the fact that they seem to work quite well, this is an edge case.

          • @sim642@lemm.ee
            8 months ago

            Log files themselves don’t, but I’m just comparing against files with a simpler structure, which allow simpler algorithms with better complexity.