One prominent author responds to the revelation that his writing is being used to coach artificial intelligence.

By Stephen King

Non-paywalled link: https://archive.li/8QMmu

  • FaceDeer
    14
    10 months ago

    > These are software companies illegally using artists’ works

    There is nothing illegal about what they’re doing. You may want it to be illegal, but it’s not illegal until laws are actually passed to make it illegal. Things are not illegal by default.

    Copyright only prevents copying works, not analyzing them. The results of the analysis are not the same as the original work.

    • RyanHeffronPhoto
      15
      edit-2
      10 months ago

      It is illegal. As an artist, if another individual or company wants to use my work for their own commercial purposes in any way, even if just to ‘analyze’ (since the analysis is part of their private commercial product), they still need to pay for a license to do so. Otherwise it’s an unauthorized use and theft. Copyright doesn’t even play into it at that point, and would be a separate issue.

      • FaceDeer
        22
        10 months ago

        > As an artist, if another individual or company wants to use my work for their own commercial purposes in any way, even if just to ‘analyze’, they still need to pay for a license to do so.

        I think you need to review the relevant laws; that’s not true.

        For example, your comment that I’m responding to is copyrighted and you own the copyright. I just quoted part of it in my response without your permission, and that’s an entirely legal fair use. I also pasted your comment into Notepad++ and did a word count: there are 64 words in it. That didn’t break any laws either.

        A lot of people have very expansive and incorrect ideas about how intellectual property works.

        • Kaldo
          5
          10 months ago

          First of all, a random online comment is not protected by copyright law afaik.

          Secondly, if you did take something protected by copyright and then used it for commercial purposes (to make money off it), like these LLMs do, then you would be breaking the law.

          In short, I’d say you are using a flawed analogy from the start.

          Also, copyright is not just about copying but about distributing as well. Playing (radio) songs in your coffee shop for clients is treated differently than listening to them at home. You generally can’t just profit off someone else’s work without them allowing it.

          • FaceDeer
            9
            10 months ago

            > First of all, a random online comment is not protected by copyright law afaik.

            You got a fundamental aspect of copyright law wrong right in the first line.

            Your comments are indeed protected by copyright.

            > Secondly, if you did take something protected by copyright and then used it for commercial purposes

            That’s wrong too. Whether or not someone’s making money off of a copyright violation will affect the damages you can sue them for, but it’s a copyright violation either way.

            > Also, copyright is not just about copying but about distributing as well. Playing (radio) songs in your coffee shop for clients is treated differently than listening to them at home.

            Technically true, but what does it have to do with these circumstances?

            > You generally can’t just profit off someone else’s work without them allowing it.

            Generally speaking, sure you can. Why couldn’t you? People do work that other people profit off of all the time. If a carpenter builds a desk and then I go sit at it while doing my job and earning millions of dollars, I don’t need to ask the carpenter’s permission.

            Copyright has a few extra limitations, but those limitations are on copying stuff without permission.

        • RyanHeffronPhoto
          4
          10 months ago

          > that’s an entirely legal fair use

          Yet what these companies are doing does not constitute ‘fair use’, period, no matter how much you want to argue otherwise.

      • @Drewelite@lemmynsfw.com
        English
        4
        edit-2
        10 months ago

        I keep rereading this comment and, as someone in R&D… I’m so astonished that people think that companies just spontaneously come up with everything they produce without looking around. Companies start off almost every venture by analyzing any work in the field that’s been done and reverse engineering it. It’s how basically anyone you’ve heard of works. It goes double for art. Inspiration is key for art. Composers will break down the sheet music of great compositions, graphic designers will have walls full of competitors’ designs, cinematographers will study movies frame by frame.

      • FaceDeer
        7
        10 months ago

        No, it’s not. Something that is merely in the style of something else is not a derivative work. If that were the case there’d be lawsuits everywhere.

        • @anachronist@midwest.social
          English
          5
          edit-2
          10 months ago

          LLMs regurgitate their training set. This has been proven many times. In fact, from what I’ve seen, LLMs are either regurgitating or hallucinating.

          • @sunbeam60
            8
            10 months ago

            With great respect, I believe that to be a gross simplification of what an LLM does. There is no training set stored in the LLM, only statistics about what word set is likely to follow what word set. There is no regurgitation of the data; if that were the case, the temperature parameter wouldn’t matter, when it very much does.

            • admiralteal
              4
              edit-2
              10 months ago

              A slightly compressed JPG of an oil painting is still, at least for purposes of intellectual property rights, not distinct from the original work on canvas. Sufficiently complex and advanced statistics on a work are not substantially different from the work itself. It’s just a different way of storing a meaningful representation.

              These LLMs are all more or less black boxes. We really cannot conclusively say one way or another whether they are storing and using the full original work in some form or another. We do know that they can be coaxed into spitting out the original work, though, which sure implies it is in there.

              And if the work of a human that needs to be fed is being used by one of these bots – which is pretty much by definition a commercial purpose given that all the relevant bots are operated as such – then that human should be getting paid.

              • FaceDeer
                3
                10 months ago

                > We do know that they can be coaxed into spitting out the original work, though, which sure implies it is in there.

                Only very rarely, under extreme cases of overfitting. Overfitting is a failure state that LLM trainers want to avoid anyway, for reasons unrelated to copyright.

                There simply isn’t enough space in an LLM’s neural network to be storing actual copies of the training data. It’s impossible, from a data compression perspective, to fit it in there.