Cutting-edge AI models from OpenAI and DeepSeek undergo 'complete collapse' when problems get too difficult, study reveals
Almost every company I talk to, and basically every solution vendor, has been pushing for people to use AI to “talk-to-your-data,” an approach that allows the AI to retrieve content from a company’s proprietary databases and then work with the documents and data it retrieves. The problem is that AIs hallucinate, or make up plausible information, all the...
Ethan Mollick • Almost an Agent: What GPTs can do
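The “talk-to-your-data” approach Mollick describes is, at heart, retrieve-then-generate: pull matching passages from the company’s own documents, then let the model answer only from those passages. Below is a minimal, illustrative sketch of that flow, assuming an in-memory document store and crude keyword scoring; the helper names and the prompt wording are hypothetical stand-ins, not any vendor’s API, and a real system would use vector search plus an actual LLM client.

```python
from dataclasses import dataclass


@dataclass
class Passage:
    source: str
    text: str


def retrieve(query: str, store: list[Passage], k: int = 3) -> list[Passage]:
    """Rank passages by word overlap with the query (a stand-in for vector search)."""
    q_words = set(query.lower().split())
    scored = sorted(
        store,
        key=lambda p: len(q_words & set(p.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def build_prompt(query: str, passages: list[Passage]) -> str:
    """Ground the model in retrieved text and ask it not to invent answers."""
    context = "\n".join(f"[{p.source}] {p.text}" for p in passages)
    return (
        "Answer using ONLY the context below; say 'not found' if the answer is absent.\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    store = [
        Passage("hr_policy.pdf", "Employees accrue 20 vacation days per year."),
        Passage("q3_report.docx", "Q3 revenue grew 12% year over year."),
    ]
    query = "How many vacation days do employees get?"
    prompt = build_prompt(query, retrieve(query, store))
    # In a real system this prompt would be sent to an LLM client (hypothetical call);
    # hallucination is reduced, not eliminated, because the model can still misread context.
    print(prompt)
```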
Apple just dropped a bombshell on the AI world.
Their new study, titled “The Illusion of Thinking,” found that today’s top “reasoning” models like Claude, DeepSeek-R1, Gemini Thinking, and OpenAI’s o3-mini don’t actually reason. They just memorize patterns really well.
Instead of using...
instagram.com
📢 More evidence is piling up showing that AI systems cannot “think” the way humans do.
While LLMs may seem articulate, new research from Apple reveals that these AI reasoning models fail when faced with true complexity, offering “eloquent emptiness” devoid of substance.
@forbes reporter Cornelia...
instagram.com
The community remains puzzled about whether these models genuinely generalize to unseen tasks, or seemingly succeed by memorizing the training data. This paper makes important strides in addressing this question. It constructs a suite of carefully designed counterfactual evaluations, providing fresh insights into the capabilities of...
Zhaofeng Wu • Reasoning skills of large language models are often overestimated
So far, there’s no evidence that large language models possess world models, even though some researchers and engineers believe they might naturally emerge over time. And it is this absence of grounded, rules-based modeling, Marcus recently argued, that explains why L.L.M.s often “hallucinate” in strange and unexpected ways: “What L.L.M.s do is to...