• Kokesh@lemmy.world · 16 points · 8 hours ago

    As it should. All these idiots calling themselves programmers because they tell a crappy chatbot what to write, based on stolen knowledge. What warms my heart a little is the fact that I poisoned everything I ever wrote on StackOverflow just enough to screw with AI slopbots. I hope I contributed my grain of sand to making this shit a little worse.

    • DeathsEmbrace@lemmy.world · 2 points · 8 hours ago

      Do it in a way that a human can understand but AI fails on. I remember my early days, and you guys are my MVPs, helping me figure shit out.

      • Chakravanti@monero.town · 1 point · 7 hours ago

        Most “humans” don’t understand reality. So your postulated challenge isn’t going to find the break you seek to divine. Few exist. I have yet to find many who can even recognize the notion that this language isn’t made to mean what you think you’re attempting to finagle it into.

        Evil. Money. Right. Wrong. Need…

        Yeah… I could go on and on, but those are five words whose publicly consented meaning humans do not cognate; it Will Never be real. The closest you’ll find to any such thing is imagination, and the only purpose there is to help the delirious learn to cognate the difference and see reality for what it may be.

        Good fucking luck. Half the meat zappers here think I am an AI because I break the notion of consent to any notion of a cohesive language. I won’t iterate on that further because I’ve already spelt out why.

  • ricecake@sh.itjust.works · 16 points · 9 hours ago

    That’s not what that research document says. Pretty early on, it talks about rote mechanical processes with no human input. By the logic they employ, there’s no difference between LLM code and a photographer using Photoshop.

  • iglou@programming.dev · 24 points · 10 hours ago (edited)

    That sounds like complete bullshit to me. Even if the logic is sound, which I seriously doubt, if you use someone’s code and claim their license isn’t valid because some part of the codebase is AI generated, I’m pretty sure you’ll have to prove that. Good luck.

    • GalacticSushi@lemmy.blahaj.zone · 7 points · 9 hours ago

      I do not give Facebook or any entities associated with Facebook permission to use my pictures, information, messages, or posts, both past and future.

    • brianary@lemmy.zip · 13 points · 11 hours ago

      The Windows FOSS part, sure, but unenforceable copyright seems quite possible, though probably not court-tested. I mean, AI basically ignored copyright to train in the first place, and there is precedent for animals not getting copyright for taking pictures.

      • CanadaPlus@lemmy.sdf.org · 8 points · 9 hours ago (edited)

        If it’s not court-tested, I’m guessing we can assume a legal theory that breaks all software licensing will not hold up.

        Like, maybe the AI-made code snippets themselves can be stolen, but not the other parts of the project.

  • Evil_Shrubbery@thelemmy.club · 16 points · 10 hours ago

    By that same logic LLMs themselves (by now some AI bro had to vibe code something there) & their trained datapoints (which were trained on stolen data anyway) should be public domain.

    What revolutionary force can legislate and enforce this?? Pls!?

    • CanadaPlus@lemmy.sdf.org · 4 points · 5 hours ago (edited)

      By that same logic LLMs themselves (by now some AI bro had to vibe code something there)

      I’m guessing LLMs are still really really bad at that kind of programming. The packaging of the LLM, sure.

      & their trained datapoints

      For legal purposes, it seems like the weights would be generated by the human-made training algorithm. I have no idea if that’s copyrightable under US law. The standard approach seems to be to keep them a trade secret and pretend there’s no espionage, though.

      • Evil_Shrubbery@thelemmy.club · 1 point · 8 hours ago (edited)

        The packaging of the LLM, sure.

        Yes, totally, but OP says a small bit affects “possibly the whole project”, so I wanted to point out that that probably includes AIs, Windows, etc. too.

  • meekah@discuss.tchncs.de · 33 points · 13 hours ago

    Aren’t you all forgetting the core meaning of open source? The source code is not openly accessible, thus it can’t be FOSS or even OSS.

    This just means microslop can’t enforce their licenses, making it legal to pirate that shit.

    • the_artic_one@programming.dev · 3 points · 8 hours ago

      It’s just the code that’s not under copyright, so if someone leaked it you could legally copy and distribute any parts which are AI generated but it wouldn’t invalidate copyright on the official binaries.

      If all the code were AI generated (or enough of it to be able to fill in the blanks), you might be able to make a case that it’s legal to build and distribute binaries, but why would you bother distributing that slop?

      • m0stlyharmless@lemmy.zip · 2 points · 8 hours ago

        Even if it were leaked, it would still likely be very difficult to prove that any one component was machine generated from a system trained on publicly accessible code.

  • cmhe@lemmy.world · 6 points · 9 hours ago (edited)

    I had a similar thought. If LLMs and image models do not violate copyright, they could be used to copyright-wash everything.

    Just train a model on source code of the company you work for or the copyright protected material you have access to, release that model publicly and then let a friend use it to reproduce the secret, copyright protected work.

    • pkjqpg1h@lemmy.zip · 6 points · 9 hours ago

      Btw, this is actually happening: AI trained on copyrighted material is repeating similar or sometimes verbatim copies, but license-free :D

      • definitemaybe@lemmy.ca · 1 point · 53 minutes ago

        This is giving me illegal number vibes. Like, if an arbitrary calculation returns an illegal number that you store, are you holding illegal information?

        (The parallel to this case is that if a statistical word prediction machine generates copyrighted text, does that make distribution of that text copyright violation?)

        I don’t know the answer to either question, btw, but I thought it was interesting.
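
        To make the equivalence concrete, here’s a toy Python sketch (no actual illegal numbers involved, promise): the bytes you store and the integer they encode are the same thing wearing two hats.

        ```python
        # Toy illustration: stored data and a (possibly "illegal") number
        # are interchangeable representations.
        data = b"some hypothetical forbidden bytes"

        # Interpret the bytes as one big integer...
        n = int.from_bytes(data, "big")

        # ...and recover the identical bytes from that integer.
        assert n.to_bytes((n.bit_length() + 7) // 8, "big") == data
        ```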

  • Kazumara@discuss.tchncs.de · 15 points · 13 hours ago

    How the hell did he arrive at the conclusion that there was some sort of one-drop rule for non-protected works?

    Just because registration is blocked if you don’t specify which part is the result of human creativity doesn’t mean the copyright on the human-created part is forfeit.

    Copyright exists even before registration; registration just makes it easier to enforce. And nobody says you can’t just properly refile for registration of the part that is the result of human creativity.

    • JackbyDev@programming.dev · 7 points · 11 hours ago (edited)

      Yeah, a lot of copyright law in the US is extremely forgiving towards creators making mistakes. For example, you can only file for damages after you register the copyright, but you can register after the damages. So if I wrote a book and someone stole it and started selling copies, I could register the copyright afterwards. Which honestly is for the best. Everything you make inherently has copyright. This comment, once I click send, will be copyrighted. It would just senselessly create extra work for the government and small creators if everything needed to be registered to get the protections.

      Edit: As an example of this, this is why many websites have something in their terms of use like “you give us the right to display your work”: in some sense, they don’t have the right to do that unless you grant it, because you have a copyright on it. Displaying work over the web is a form of distribution.

      • definitemaybe@lemmy.ca · 1 point · 43 minutes ago

        That edit has confused so many users over the years. They think they’re signing away rights to their copyrighted work by agreeing to the platform’s EULA, but the terms granting the platform a license to freely store and distribute your work? That’s literally what you want the service to do, because you’re posting it with the intention of the platform showing it to others!

        Granted, companies are using user data for other purposes too, so that’s a problem, but I’ve seen so, so many posts over the last couple of decades of people complaining about EULAs that describe core site functions…

  • Phoenixz@lemmy.ca · 17 points · 14 hours ago

    So by that reasoning all Microsoft software is open source

    Not that we’d want it, it’s horrendously bad, but still

  • Michal@programming.dev · 32 points · 16 hours ago

    Counterpoint: how do you even prove that any part of the code was AI generated?

    Also, I made a script years ago that algorithmically generates Python code from user input. Is it now considered AI-generated too?
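
    For reference, a toy sketch of the kind of thing I mean (made-up names, not my actual script):

    ```python
    # A deterministic, non-AI code generator: emits Python source
    # from plain user input via a fixed template.
    def generate_getter(field_name: str) -> str:
        """Return Python source for a getter method, based on user input."""
        return (
            f"def get_{field_name}(self):\n"
            f"    return self._{field_name}\n"
        )

    print(generate_getter("price"))  # same input, same output, no model involved
    ```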

    • Wiz@midwest.social · 18 points · 15 hours ago

      I made a script years ago that algorithmically generates Python code from user input. Is it now considered AI-generated too?

      No, because you created the generation algorithm. Any code it generates is yours.

      • skami@sh.itjust.works · 8 points · 14 hours ago

        Not how I understand it, but I’m not a lawyer. The user who uses the script to generate the code can copyright the output, and OP can copyright their script (and the output they themselves generate). If it worked like you said, it would be trivial to write a script that generates all possible code by enumerating possible programs; then, because the script would eventually generate your code, your code would already be copyrighted. This appears absurd to me.
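
        To be clear about the absurdity, a toy sketch of that enumeration (it would outlive the universe, which is the point):

        ```python
        from itertools import count, product

        # Enumerate every possible source file, shortest first.
        # Sooner or later this emits any given program, including yours.
        ALPHABET = [chr(c) for c in range(32, 127)] + ["\n"]

        def all_programs():
            for length in count(1):
                for chars in product(ALPHABET, repeat=length):
                    yield "".join(chars)

        # e.g. next(p for p in all_programs() if p == "x = 1\n") terminates... eventually.
        ```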

        Relevant: https://www.vice.com/en/article/musicians-algorithmically-generate-every-possible-melody-release-them-to-public-domain/

        When the script copies chunks of code under the copyright of the original script writer, what I typically see is that the original owner keeps the copyright of those chunks and usually licenses them in some way to the user. But the code derived from the user input is still copyrightable by the user. And it’s that last part that is most interesting for the copyright of AI works. I’m curious how the law will settle on that.

        I’m open to counterarguments.

    • JackbyDev@programming.dev · 5 points · 12 hours ago

      Computer output cannot be copyrighted; don’t focus on it being “AI”. It’s not quite so simple, though, there’s some nuance about how much human input is required. We’ll likely see something about that in court at some point. The frustrating thing is that a lot of this boils down to speculation until it goes to court.

    • sunbeam60@feddit.uk · 2 points · 11 hours ago

      OP is obviously ignorant of how much tooling has already helped write boilerplate code.

      Besides, AI code is actually one of the harder things to detect, compared to prose.

      And all that said, AI is doing an amazing job writing a lot of the boilerplate, TDD tests, etc. To pretend otherwise is to ignore facts.

      AI can actually write great code, but it needs an incredible amount of tests wrapped around it and a strict architecture that it’s forced to stick to. Yes, it’s far too happy sprinkling magic constants and repeating code, so it needs a considerable amount of support to clean that up… but it’s still vastly faster to write good code with an AI held on a short leash than it is to write good code by hand.
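
      To illustrate the leash, a minimal sketch assuming a hypothetical AI-written slugify function in a made-up module mymodule: the tests are written first, and whatever the model produces has to pass them.

      ```python
      import pytest

      from mymodule import slugify  # hypothetical AI-written function under test

      # Behavior pinned down before the AI touches the implementation.
      @pytest.mark.parametrize("raw,expected", [
          ("Hello World", "hello-world"),
          ("  Trim  me  ", "trim-me"),
      ])
      def test_slugify_known_inputs(raw, expected):
          assert slugify(raw) == expected

      def test_slugify_is_idempotent():
          # A structural property that magic constants can't fake.
          assert slugify(slugify("Some Title")) == slugify("Some Title")
      ```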

    • Dyskolos@lemmy.zip · 3 points · 15 hours ago

      Guess you can’t really prove that, unless you leave comments like “generated by Claude” in it with a timestamp and whatnot 😁 Or one can prove that you are unable to get to that result yourself.

      So nonsense, yes.

      • VeryVito@lemmy.ml · 8 points · 15 hours ago

        Or one can prove that you are unable to get to that result yourself.

        Oh shit… I’ve got terabytes of code I’ve written over the years that I’d be hard-pressed to even begin to understand today. The other day I discovered a folder full of old C++ libraries I wrote 20+ years ago, and I honestly don’t remember ever coding in C++.

          • VeryVito@lemmy.ml · 1 point · 12 hours ago

            True enough, and I expected to get checked on that.

            Regardless… along with the archives, assets and versioned duplicates, my old projects dating back to the 90s somehow now fill multiple TB of old hard drives that I continue to pack-rat away in my office. Useless and pointless to keep, but every piece was once a priority for someone.

      • mattvanlaw@lemmy.world · 3 points · 14 hours ago (edited)

        Cursor, an AI/agentic-first IDE, is doing this with a blame-style method. Each line, as it’s modified or added, DOES show a history of AI versus each human contributor.

        So, not nonsense in principle, but in practice there’s no real enforcement to turn the feature on.
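
        Conceptually it’s just blame data with one extra field; here’s my guess at the shape as a rough Python sketch (an assumption, not Cursor’s actual format):

        ```python
        from dataclasses import dataclass

        # Hypothetical per-line provenance record for an AI-aware blame view:
        # ordinary blame fields plus an "author kind".
        @dataclass
        class LineProvenance:
            line_no: int
            commit: str
            author: str       # human login, or the model/agent name
            author_kind: str  # "human" or "ai"

        blame = [
            LineProvenance(1, "a1b2c3d", "alice", "human"),
            LineProvenance(2, "a1b2c3d", "cursor-agent", "ai"),
        ]

        ai_lines = sum(1 for entry in blame if entry.author_kind == "ai")
        print(f"{ai_lines}/{len(blame)} lines AI-generated")
        ```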

        • Rooster326@programming.dev · 2 points · 14 hours ago (edited)

          Why would you ever want this?

          If you pushed the bug that took down production, they aren’t gonna accept “the AI generated it” as a whataboutism. They’re still going to fire you.

          • mattvanlaw@lemmy.world · 1 point · 8 hours ago

            Sorry, but as another reply: pushing bugs to production doesn’t immediately equate to getting fired. Bug tickets are common, and they often address issues already in production.

              • mattvanlaw@lemmy.world · 1 point · 1 hour ago

                I guess you mean like a full outage for all users? My bad, there are just a lot of ways for me to take the verb “down”. Still, though, what a crappy company, to fire someone rather than learn from that experience!

          • sunbeam60@feddit.uk · 2 points · 10 hours ago

            It makes little difference IMHO. If you crash the car, you can’t escape liability by blaming the self-driving.

            Likewise, if you commit it, you own it, however it’s generated.

            • mattvanlaw@lemmy.world · 2 points · 9 hours ago

              It’s mainly for developers to follow decisions made over many iterations of files in a code base. A CTO might crawl the git blame… but it’s usually us crunchy devs in the trenches getting by.

    • WFH@lemmy.zip · 7 points · 11 hours ago

      Agentic IDEs like Cursor track usage and how much of the code is LLM vs human generated.

      Which probably means it tracks every single keystroke inside it. Which rightfully looks like a privacy and/or corporate code ownership nightmare.

      But hey, at least our corporate overlords are happy to see the trend go up. The fact that we tech people were all very unsubtly threatened into forced agentic IDE usage, despite vocal concerns about code quality drops, productivity losses, and increasing dependence on US tech (especially openly nazi tech), says it all.

      • herseycokguzelolacak@lemmy.ml · 1 point · 7 hours ago

        Agentic IDEs like Cursor track usage and how much of the code is LLM vs human generated.

        For your code, sure. How do you know someone else’s code is LLM generated?

        • WFH@lemmy.zip · 1 point · 7 hours ago

          Because it’s a surveillance state, baby. Everything is uploaded to a central server so our corporate overlords can monitor our usage.