• onlinepersona@programming.dev
    11 months ago

    That’s a good question that I don’t have an answer to, as I have no legal training. I’m assuming that if you can sign a contract online where the legal text is behind a link and the main offer is what you see… maybe? Technically, it wouldn’t be too difficult to simply erase any mention of a license in a pre-cleaning phase of the data, but I don’t know if the act itself would be an even bigger indication of guilt. There would be no excuse like “oops, I just copied this data into my training set, teehee”. But as I said, not a legal expert.

    If there are copyright experts that want to weigh in, I’d be interested to hear their opinion. Given that there are ongoing, unresolved cases (most notably against Microsoft’s Copilot), and that Japan is on the verge of drafting into law that AI training data can ignore copyright, it’s possible even legal experts would have a hard time answering the question.

    I’m putting them here just in case. Only costs me a line carriage and a Ctrl+V.

    CC BY-NC-SA 4.0

    • Cosmic Cleric@lemmy.world
      11 months ago

      “If there are copyright experts that want to weigh in, I’d be interested to hear their opinion.”

      Myself as well. It’s a new frontier, legally.

      “I’m putting them here just in case. Only costs me a line carriage and a Ctrl+V.”

      Seeing that you have done that made me start to think about doing it myself, as I definitely feel there are days when I’m being shadowed by AI training mechanisms.

      But if it doesn’t make any difference legally as a deterrent, then I wouldn’t bother.

      • ArmokGoB@lemmy.dbzer0.com
        11 months ago

        Even if it’s ruled illegal in the US, there’s nothing stopping AI companies from moving their operations to Japan where copyright doesn’t apply to training data.