Cutting-edge AI models from OpenAI and DeepSeek undergo 'complete collapse' when problems get too difficult, study reveals
"Apple did more for AI than anyone else: they proved through peer-reviewed publications that LLMs are just neural networks and, as such, have all the limitations of other neural networks trained in a supervised way, which I and a few other voices tried to convey, but the noise from a bunch of AGI-feelers and their sycophants was too loud," Andriy... See more
The findings point to the models relying more heavily on pattern recognition, and less on emergent logic, than those who herald imminent machine intelligence claim. But the researchers do highlight key limitations of their study, including that the problems represent only a "narrow slice" of the potential reasoning tasks the models could be assigned.
There's so much we are missing. I think it's too early to deploy these tools in more vulnerable sectors like schooling, at least until there is enforcement of some standard AI literacy course that all students of a certain age must take, and that evolves as more information is acquired.
Upon passing a critical complexity threshold, reasoning models reduced the number of tokens (the fundamental units into which models break data down) they allocated to more complex tasks, suggesting that they were reasoning less and faced fundamental limits on maintaining chains of thought. The models continued to hit these snags even when given the solutions.
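To make that measurement concrete, here is a minimal Python sketch of how one might probe for this kind of collapse on a controllable puzzle such as Tower of Hanoi, where difficulty scales predictably with the number of disks. This is an illustration only, not the study's actual harness: the `client` object, its `generate` method, and the `reasoning_token_count` field are hypothetical stand-ins for whatever API exposes a model's chain-of-thought token usage.

```python
def hanoi_prompt(n_disks: int) -> str:
    """Build a Tower of Hanoi prompt whose difficulty scales with n_disks."""
    return (
        f"Solve the Tower of Hanoi puzzle with {n_disks} disks on peg A, "
        "moving all disks to peg C. List every move as (disk, from, to)."
    )

def measure_reasoning_tokens(client, model: str, max_disks: int = 12) -> dict:
    """Record reasoning-token usage at each complexity level."""
    usage = {}
    for n in range(1, max_disks + 1):
        # Hypothetical API call and response field, for illustration only.
        response = client.generate(model=model, prompt=hanoi_prompt(n))
        usage[n] = response.reasoning_token_count
    return usage

def find_collapse_point(usage: dict) -> int | None:
    """Return the first complexity level where token spend drops sharply,
    instead of growing with the (exponentially harder) task."""
    levels = sorted(usage)
    for prev, cur in zip(levels, levels[1:]):
        if usage[cur] < 0.5 * usage[prev]:  # arbitrary 50% drop threshold
            return cur
    return None
```

The key design choice is a puzzle family whose difficulty can be dialed up smoothly: because each added disk makes the task strictly harder, a sharp drop in token spend stands out as a collapse rather than noise.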
"Wh... See more
"Wh... See more
Because this next-word-prediction process is rooted in statistical guesswork instead of any real understanding, chatbots have a marked tendency to "hallucinate": throwing out erroneous responses, fabricating answers when their training data doesn't contain them, and dispensing bizarre and occasionally harmful advice to users.
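As a rough illustration of that statistical guesswork, the toy Python snippet below samples a "next word" from an invented probability distribution; every token and probability here is made up for demonstration. Nothing in the sampling step checks truth, only plausibility, which is why fluent but wrong output can emerge.

```python
import random

# Invented next-token distribution for the prompt
# "The capital of France is ..." -- for demonstration only.
next_token_probs = {
    "Paris": 0.55,      # correct continuation
    "Lyon": 0.25,       # plausible but wrong
    "Marseille": 0.15,  # plausible but wrong
    "Berlin": 0.05,     # wrong country entirely
}

def sample_next_token(probs: dict[str, float]) -> str:
    """Sample one token in proportion to its probability."""
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

# In this toy distribution, roughly 45% of samples are fluent but
# factually wrong: the statistical root of what the article calls
# hallucination.
print(sample_next_token(next_token_probs))
```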