I love to show that kind of shit to AI boosters. (In case you’re wondering, the numbers were chosen randomly and the answer is incorrect).

They go waaa waaa it's not a calculator, and then I can point out that it got the leading 6 digits and the last digit correct, which is a lot better than it did on the "softer" parts of the test.

    • mountainriver@awful.systems · 22 hours ago

      I find it a bit interesting that it isn't more wrong. Has it ingested large tables and picked up a statistical relationship between certain large factors and certain answers? Or is there something else going on?

      • CodexArcanum@lemmy.dbzer0.com · 17 hours ago

        I posted a top-level comment about this too, but Anthropic has done some research on it; the section on reasoning models discusses math, I believe. The short version is that there's a lot of math in its corpus, so it can approximate math (kind of, seemingly, the way you'd do a back-of-the-envelope calculation in your head to get the order of magnitude right), but it can't actually do calculations, which is why it often gets the specifics wrong.
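
        The "back of the envelope" point can be sketched in a few lines. These factors are made up for illustration (the thread's actual numbers weren't posted): a log-based estimate nails the order of magnitude and the leading digits but drifts in the middle, while the last digit falls out of the factors' last digits alone, which is roughly the pattern the screenshot showed.

        ```python
        import math

        # Hypothetical large factors, chosen only for illustration.
        a = 387_654_321
        b = 912_345_679

        exact = a * b

        # Rough estimate via logarithms: order of magnitude plus leading
        # digits, like a mental back-of-the-envelope calculation.
        log10 = math.log10(a) + math.log10(b)
        magnitude = int(log10)                 # digit count minus one
        mantissa = 10 ** (log10 - magnitude)   # leading-digit part
        approx = round(mantissa * 10 ** magnitude)

        # The last digit of a product depends only on the factors' last digits.
        last_digit = (a % 10) * (b % 10) % 10

        print(exact)
        print(approx)  # leading digits match; the middle drifts
        print(last_digit, exact % 10)
        ```

        So getting the leading digits and the final digit right while flubbing the middle is exactly what you'd expect from pattern-matching rather than calculating.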