cross-posted from: https://lemmy.ml/post/5400607

This is a classic case of tragedy of the commons, where a common resource is harmed by the profit interests of individuals. The traditional example of this is a public field that cattle can graze upon. Without any limits, individual cattle owners have an incentive to overgraze the land, destroying its value to everybody.

We have commons on the internet, too. Despite all of its toxic corners, it is still full of vibrant portions that serve the public good — places like Wikipedia and Reddit forums, where volunteers often share knowledge in good faith and work hard to keep bad actors at bay.

But these commons are now being overgrazed by rapacious tech companies that seek to feed all of the human wisdom, expertise, humor, anecdotes and advice they find in these places into their for-profit A.I. systems.

  • kibiz0r@midwest.social
    English · 13 upvotes · 1 year ago

    As an open source contributor, I believe information (facts and techniques) should be free.

    As an open source contributor, I also know that two-way collaboration only happens when users understand where the software came from and how they can communicate back to the original author(s).

    The layer of obfuscation that LLMs add, where the code is really from XYZ open-source project, but appears to be manifesting from thin air… worries me, because it’s going to alienate would-be collaborators from the original authors.

    “AI” companies are not freeing information. They are colonizing it.

    • FaceDeer@kbin.social
      5 upvotes · 1 year ago

      The code that AI produces isn’t “copied” from those original authors, though. The AI learned how to code from them; it isn’t literally copying and pasting from them.

      If you think a bit of code is “really from” XYZ open-source project, that’s a copyright violation and you can pursue that legally. But you’ll need to actually show that the code is a copy.

      • NeoNachtwaechter@lemmy.world
        English · 1 upvote · 1 year ago

        The copyright violation happened when the code got fed into that AI’s greedy gullet, not when it came out of its rear end.

        • FaceDeer@kbin.social
          6 upvotes · 1 year ago

          That remains to be tested, legally speaking, and I don’t think it’s likely to pass muster. If it was trained correctly (i.e., no overfitting), the resulting AI model does not contain a copy of the training inputs in any identifiable sense.

          • NeoNachtwaechter@lemmy.world
            English · 1 upvote · 1 year ago

            Yes, the laws are probably muddy in the USA as usual, but rather clear here in the EU. But legal proceedings are slow, and Big Tech is making haste with its feeding.

    • Meowoem@sh.itjust.works
      English · 5 upvotes · 1 year ago

      My open source project benefits hugely from the free-to-access LLM coding tools available. That’s a far bigger positive than the abstract fear that someone might feel alienated because the guy copy-pasting their code doesn’t know who he’s copying from.

      And yes, obviously the LLM isn’t copying code; it’s learning from a huge range of sources and combining them to make exactly what you ask for (well, not exactly, but with some needling it gets there eventually). But even if it were, that still wouldn’t disrupt collaboration, because that’s not how collaboration works. No one says, ‘Instead of coding all the boring elif statements required for my function determining if something is a prime, I’ll search code snippets and collaborate with them.’ Every worthwhile collaborator on my project has been an active user of the software who wanted to help improve it or add functions. AI won’t change that, and if it does, it’ll only be because it makes coding so easy that I don’t need collaborators.