AI Models Persist on Wrong Answers, Unlike Humans Who Disengage

Han-yu Wang · June 26, 2026


Summary

This research identifies an opposite pattern in how humans and large reasoning models (LRMs) deliberate. Both spend longer on harder problems, but LRMs spend more tokens on problems they get wrong, whereas humans spend less time on problems they fail. This suggests that LRM deliberation is driven by uncertainty, while human deliberation reflects a choice between engaging with a problem and abandoning it.

New research highlights a key divergence in how humans and large reasoning models (LRMs) allocate effort. Both deliberate longer on more difficult problems, but they behave in opposite ways when they fail. An LRM tends to spend more tokens when it gets a problem wrong than when it solves the same problem correctly. Humans show the reverse pattern, spending less time on problems they ultimately fail to solve. The pattern suggests that an LRM's extended processing on wrong answers is driven by uncertainty: the model keeps searching for a solution it has not found. Humans, by contrast, appear to follow an engagement-versus-abandonment strategy, investing more time in problems they expect to solve and disengaging from those they expect to fail. This distinction matters for understanding and improving AI reasoning.
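The within-item comparison described above can be made concrete with a small sketch. This is not the paper's analysis code; the function name `within_item_gap`, the record layout, and the idea of averaging per-item gaps are assumptions made for illustration. The "cost" field stands for tokens (for an LRM) or seconds (for a person).

```python
from collections import defaultdict
from statistics import mean


def within_item_gap(records):
    """Average, over items, of (mean cost when wrong - mean cost when correct).

    records: iterable of (item_id, is_correct, cost) tuples. This layout is a
    hypothetical one chosen for the sketch, not a format from the source.
    Only items with at least one correct and one wrong attempt are used, so
    the comparison holds the problem fixed.
    """
    by_item = defaultdict(lambda: {True: [], False: []})
    for item_id, is_correct, cost in records:
        by_item[item_id][bool(is_correct)].append(cost)

    gaps = [
        mean(costs[False]) - mean(costs[True])
        for costs in by_item.values()
        if costs[True] and costs[False]
    ]
    if not gaps:
        raise ValueError("no item has both correct and wrong attempts")
    return mean(gaps)
```

Under this convention, a positive gap means more effort went into failed attempts than into successful attempts on the same problem (the pattern reported for LRMs), and a negative gap means less effort went into failures (the pattern reported for humans).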

Why it matters

These differences matter for building AI reasoning systems that are more efficient and more human-like, particularly where decisions are nuanced or compute is constrained. They also inform how we interpret AI failures and how we design AI agents.

How to implement this in your domain

  1. Design AI models with explicit mechanisms to detect and manage uncertainty, potentially leading to earlier disengagement from unsolvable problems.
  2. Implement meta-reasoning components that can learn to "give up" or re-evaluate strategies when initial attempts are unproductive.
  3. Develop training methodologies that reward efficient problem-solving, including the ability to identify and abandon dead ends.
  4. Incorporate human-like cognitive biases or heuristics into AI models to explore alternative deliberation allocation strategies.
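One way to picture the "give up" idea in the list above is a simple stopping rule that watches a reasoning trace and abandons it when progress stalls or a token budget runs out. The sketch below is only an illustration under stated assumptions: the source does not describe any specific mechanism, and the function `stop_point`, its parameters, and the per-checkpoint confidence signal are all hypothetical.

```python
def stop_point(checkpoints, token_budget, patience=2, min_gain=0.05):
    """Decide where a reasoning attempt should be abandoned.

    checkpoints: ordered list of (tokens_used, confidence) pairs. The
    confidence value is a hypothetical self-estimate in [0, 1]; how to
    obtain one is outside the scope of this sketch.
    Returns (index, reason) for the first checkpoint at which to stop,
    or None if the attempt should be allowed to run to the end.
    """
    best = float("-inf")
    stalled = 0
    for index, (tokens_used, confidence) in enumerate(checkpoints):
        # Hard cap: stop once the attempt has exceeded its token budget.
        if tokens_used > token_budget:
            return index, "budget"
        # Count consecutive checkpoints without a meaningful improvement.
        if confidence > best + min_gain:
            best = confidence
            stalled = 0
        else:
            stalled += 1
        if stalled >= patience:
            return index, "stalled"
    return None


# Illustrative trace: confidence rises, then plateaus around 0.4.
trace = [(100, 0.20), (200, 0.40), (300, 0.41), (400, 0.40), (500, 0.41)]
decision = stop_point(trace, token_budget=1000)
```

With the default settings, the trace above is abandoned at the fourth checkpoint because confidence has not improved by at least `min_gain` for two checks in a row. The thresholds are arbitrary placeholders and would need tuning for any real system.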

Who benefits

AI Development · Cognitive Science · Robotics · Education · Software Engineering

Key takeaways

  • LRMs spend more tokens on wrong answers, while humans spend less time on their failures.
  • LRM deliberation appears to be driven by uncertainty, human deliberation by an engagement-versus-abandonment strategy.
  • This difference impacts AI efficiency and problem-solving strategies.
  • Future AI design should consider more sophisticated meta-reasoning for resource allocation.

Original post by Han-yu Wang

"arXiv:2606.26502v1 Announce Type: new Abstract: Large reasoning models (LRMs) take longer on harder problems, just as humans do. This surface similarity hides an opposite pattern within items. When an LRM gets a problem wrong, it spends more tokens than when it gets the same prob…"


Originally posted by Han-yu Wang on X
