Reasoning skills of large language models are often overestimated

Zhaofeng Wu
