• Voroxpete@sh.itjust.works
    link
    fedilink
    arrow-up
    28
    ·
    2 days ago

    As with all things LLM, triggering or evading the censorship depends on the questions asked and how they’re phrased, but the censorship most definitely is there.

    • Buddahriffic@lemmy.world
      link
      fedilink
      arrow-up
      2
      ·
      edit-2
      2 days ago

      That could just come down to the nature of the debate. The freedom of Israelis isn’t really a question in the debate. People who see a difference between Palestinians and Hamas also see a difference between Israel’s administration and military and the general Israeli population.

      My guess is that it’s set up to see contexts with conflicting positions associated as controversial but it will just go with responses that don’t have controversy associated with them.

      A bias in the training data will result in a bias in the results and it doesn’t have morals to help it choose between conflicting data in its training. It’s possible that this bias was deliberately introduced, though it’s also possible that it was negligently introduced as it just sucked up data from the internet.

      I’m curious though how it would respond if the second response is used to challenge the first one with a clarification that Palestinians are indeed people.

      Edit: not saying that there isn’t any censorship going on with LLMs outside of China (I believe there absolutely is, depending on the model), just that that example doesn’t look like the other cases of censorship I’ve seen.

      • Voroxpete@sh.itjust.works
        link
        fedilink
        arrow-up
        3
        ·
        2 days ago

        My guess is that it’s set up to see contexts with conflicting positions associated as controversial but it will just go with responses that don’t have controversy associated with them.

        This is significantly more reasoning and analysis than LLMs are capable of.

        When they give those “I can’t respond to that” replies, it’s because a specific programmed keyword filter was tripped, forcing the model to insert a pre-programmed response instead. The rest of the time, they’re just regurgitating a melange of the most statically present text on the subject from their training data.

        • Buddahriffic@lemmy.world
          link
          fedilink
          arrow-up
          1
          ·
          2 days ago

          Yeah, that’s what censorship usually looks like but look at the image in the comment I originally replied to. It didn’t say “I can’t answer that”, it said it didn’t have an opinion and then talked about the controversial nature of it.

          It’s not really reasoning or analysis I’m talking about but the way it ended up setting up its weights in the NN. If it had training data with wildly different responses to questions like that and had training data that commented on wildly different opinions as controversial, then that could make it believe (metaphor) that “it’s a controversial subject” is the most statistically present text.