• Anony Moose@lemmy.ca
    link
    fedilink
    English
    arrow-up
    69
    ·
    1 year ago

    This is probably because of the autoregressive nature of LLMs, and is why “step by step” and “chain of thought” prompting work so well. GPT4 can only “see” up to the next token, and doesn’t know how its own entire answer upfront.

    If my guess is correct, GPT4 knew the probabilities of “Yes” or “No” were highest amongst possible tokens as it started generating the answer, but, it didn’t really know the right answer until it got to the arithmetic calculation tokens (the 0.9 * 500 part). In this case it probably had a lot of training data to confirm the right value for 0.9 * 500.

    I’m actually impressed it managed to correct course instead of confabulating!

    • Steeve@lemmy.ca
      link
      fedilink
      English
      arrow-up
      39
      ·
      1 year ago

      “Sometimes I’ll start a sentence, and I don’t even know where it’s going. I just hope I find it along the way.” -GPT

  • Rentlar@lemmy.ca
    link
    fedilink
    arrow-up
    17
    ·
    1 year ago

    I guess ChatGPT just likes to talk back with “No” a lot as an immediate reaction. (Sounds like some people I know…)

  • DavidGarcia@feddit.nl
    link
    fedilink
    arrow-up
    13
    ·
    1 year ago

    Words are generated word by word. It’s reading the entire prompt and what it replied so far to generate new words. So yeah it can recognize its own mistakes while writing. It just wasn’t trained for that so it usually doesn’t do it, but you can encourage it to do it by giving custom instructions telling it to second guess itself.

  • Jordan Lund
    link
    fedilink
    English
    arrow-up
    8
    ·
    1 year ago

    Guess we found where all those Pentium processors ended up…

      • canihasaccount@lemmy.world
        link
        fedilink
        arrow-up
        2
        ·
        1 year ago

        You can try this yourself with GPT-4. I have, and it fails every time. Earlier GPT-4 versions, via the API, also fail every time. Claude reasons before it answers, but if you ask it to say yes or no only, it fails. Bard is the only one that gets it right, right off the bat