OpenAI releases o1, its first model with ‘reasoning’ abilities

@nave@lemmy.ca · edit-2 7 days ago

@Voroxpete@sh.itjust.works · 6 days ago

This example doesn’t prove what you think it does. It shows pattern detection - something computers are inherently very well suited for - but it doesn’t demonstrate “reasoning” in any meaningful way.

@kromem@lemmy.world · 6 days ago

You should really look at the full CoT traces on the demos.

I think you think you know more than you actually know.

@Voroxpete@sh.itjust.works · 5 days ago

You mean like this chain of thought?

@kromem@lemmy.world · 5 days ago

Actually, they are hiding the full CoT sequence outside of the demos.

What you are seeing there is a summary, but because the actual process is hidden it’s not possible to see what actually transpired.

People are not at all happy about this aspect of the situation.

It also means that model context (which in research has been shown to be much more influential than previously thought) is now in part hidden with exclusive access and control by OAI.

There are a lot of things to focus on in that image, and “hur dur the stochastic model can’t count letters in this cherry picked example” is the least among them.

@drspod@lemmy.ml · 6 days ago

Got a link to that?

@kromem@lemmy.world · edit-2 6 days ago

Yep:

https://openai.com/index/learning-to-reason-with-llms/

First interactive section. Make sure to click “show chain of thought.”

The cipher one is particularly interesting, as it’s intentionally difficult for the model.

The tokenizer is famously bad at two letter counts, which is why previous models can’t count the number of rs in strawberry.
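To make the tokenization point concrete, here is a minimal sketch in Python. The three-way split of the word and the integer ids are made up for illustration; they are not the output of any real tokenizer.

```python
word = "strawberry"

# Character-level view: counting a letter is a one-liner.
char_count = word.count("r")

# Hypothetical token split, invented for this example only.
tokens = ["str", "aw", "berry"]

# Hypothetical ids: a model works with opaque ids like these, not with letters.
token_ids = {tok: i for i, tok in enumerate(tokens)}
model_view = [token_ids[tok] for tok in tokens]

# The letter count is spread across chunks the model never sees spelled out.
per_token = [tok.count("r") for tok in tokens]

print(char_count)   # 3
print(model_view)   # [0, 1, 2]
print(per_token)    # [1, 0, 2]
```

The only point of the sketch is that the letters are visible at the character level, while the ids carry no spelling information on their own.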

So the cipher depends on two-letter pairs, and you can see how it screws up the tokenization around the xx at the end of the last word, then gradually corrects course.

Will help clarify how it’s going about solving something like the example I posted earlier behind the scenes.
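For anyone who wants to check the cipher by hand, here is a rough sketch of a pair-based decoder. It assumes the rule is the one I recall from the linked demo: each pair of letters maps to the letter at the average of their alphabet positions. The function name `decode_pairs` is made up, and the sample ciphertext is quoted from memory of that page, so treat both as assumptions.

```python
def decode_pairs(ciphertext: str) -> str:
    """Decode text where every two letters average to one plaintext letter.

    Assumed rule: a=1 ... z=26, plaintext = (first + second) // 2.
    """
    decoded_words = []
    for word in ciphertext.lower().split():
        if len(word) % 2 != 0:
            raise ValueError(f"odd-length word cannot be split into pairs: {word!r}")
        letters = []
        for i in range(0, len(word), 2):
            first = ord(word[i]) - ord("a") + 1
            second = ord(word[i + 1]) - ord("a") + 1
            letters.append(chr((first + second) // 2 + ord("a") - 1))
        decoded_words.append("".join(letters))
    return " ".join(decoded_words)


print(decode_pairs("oyfjdnisdr rtqwainr acxz mynzbhhx"))  # think step by step
```

Every step works on pairs of individual letters, which is exactly the granularity a tokenizer does not guarantee.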

@FatCrab · 5 days ago

I think if you can actually define reasoning, your comments (and those like yours) would be much more convincing. I’m just calling yours out because I’ve seen you up and down in this thread repeating it, but it’s a general observation about the vocal critics of the technology overall. Neither intelligence nor reasoning (likewise understanding and knowing, for that matter) is easily defined in a way that is more useful than invoking spirits and ghosts. In this case, detecting patterns certainly seems a critical component of what we would consider to be reasoning. I don’t think it’s sufficient, but it is absolutely necessary.

@Voroxpete@sh.itjust.works · 5 days ago

While truly defining pretty much any aspect of human intelligence is functionally impossible with our current understanding of the mind, we can create some very usable “good enough” working definitions for these purposes.

At a basic level, “reasoning” would be the act of drawing logical conclusions from available data. And that’s not what these models do. They mimic reasoning, by mimicking human communication. Humans communicate (and developed a lot of specialized language with which to communicate) the process by which we reason, and so LLMs can basically replicate the appearance of reasoning by replicating the language around it.

The way you can tell that they’re not actually reasoning is simple: their conclusions often bear no actual connection to the facts. There’s an example I linked elsewhere where the new model is asked to list states with W in their name. It does a bunch of preamble where it spells out very clearly what the requirements and process are: assemble a list of all states, then check each name for the presence of the letter W.

And then it includes North Dakota, South Dakota, North Carolina and South Carolina in the list.
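The process the model describes is mechanical enough to write down. A minimal sketch in Python, with the helper name `states_containing` made up for this example:

```python
US_STATES = [
    "Alabama", "Alaska", "Arizona", "Arkansas", "California", "Colorado",
    "Connecticut", "Delaware", "Florida", "Georgia", "Hawaii", "Idaho",
    "Illinois", "Indiana", "Iowa", "Kansas", "Kentucky", "Louisiana",
    "Maine", "Maryland", "Massachusetts", "Michigan", "Minnesota",
    "Mississippi", "Missouri", "Montana", "Nebraska", "Nevada",
    "New Hampshire", "New Jersey", "New Mexico", "New York",
    "North Carolina", "North Dakota", "Ohio", "Oklahoma", "Oregon",
    "Pennsylvania", "Rhode Island", "South Carolina", "South Dakota",
    "Tennessee", "Texas", "Utah", "Vermont", "Virginia", "Washington",
    "West Virginia", "Wisconsin", "Wyoming",
]


def states_containing(letter: str, names=US_STATES) -> list[str]:
    """Return every name that contains the letter, ignoring case."""
    needle = letter.lower()
    return [name for name in names if needle in name.lower()]


with_w = states_containing("w")
print(with_w)

# The four names from the model's answer, checked against the real result.
claimed = ["North Dakota", "South Dakota", "North Carolina", "South Carolina"]
wrong = [name for name in claimed if name not in with_w]
print(wrong)
```

Run as written, none of the four Dakota and Carolina names pass the check the model itself laid out.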

Any human being capable of reasoning would absolutely understand that that was wrong, if they were taking the time to carefully and systematically work through the problem in that way. The AI does not, because all this apparent “thinking” is smoke and mirrors. They’re machines built to give the appearance of intelligence, nothing more.

When real AGI, or even something approaching it, actually becomes a thing, I will be extremely excited. But this is just snake oil being sold as medicine. You’re not required to buy into their bullshit just to prove you’re not a technophobe.