Hybrid Open-Ended Tri-Evolution Improves AI for Deep Research Tasks

Hongming Piao, Chi Liu, Mengzhuo Chen, Yan Shu, Derek Li, Ying Wei, Bryan Dai · June 15, 2026

Summary

The Hybrid Open-Ended Tri-Evolution (HOTE) framework is proposed to enable autonomously evolving agents for open-ended research tasks. It uses hybrid-mode reinforcement learning to collaboratively evolve a proposer, a solver, and a judge based on web-scale knowledge. On deep research benchmarks, it outperforms both stronger static open models and models trained with other state-of-the-art deep research methods.

HOTE addresses two agent capabilities that the authors frame as steps toward artificial general intelligence: deep research and autonomous evolution. Traditional deep research methods are often limited by static parametric capabilities, while existing agent evolution techniques are validated mainly on verifiable tasks with clear answers, which leaves a gap for open-ended research. HOTE bridges the two with hybrid-mode reinforcement learning that collaboratively evolves three distinct modules: a proposer, a solver, and a judge. The modules learn and adapt based on web-scale knowledge, so agents can evolve autonomously in open-ended tasks and environments. In experiments on three long-form deep research benchmarks, an 8B model trained with HOTE surpassed stronger static open models (8B to 32B parameters) and outperformed models trained with other state-of-the-art deep research methods, while requiring less training time. The study also found that evolving all three modules is essential to this performance.

Why it matters

This framework represents a significant step towards creating more autonomous and capable AI agents that can conduct complex, open-ended research. Professionals can leverage such evolving AI systems to accelerate discovery, synthesize information from vast datasets, and tackle ill-defined problems in various domains.

How to implement this in your domain

  1. Explore the HOTE framework for developing AI agents capable of open-ended research and problem-solving.
  2. Design multi-agent systems with distinct roles (proposer, solver, judge) that can collaboratively evolve.
  3. Integrate hybrid-mode reinforcement learning into AI training pipelines for continuous capability improvement.
  4. Apply evolving AI agents to complex, ill-defined research tasks requiring autonomous information retrieval and synthesis.
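
To make the proposer, solver, and judge roles from the steps above concrete, here is a minimal Python sketch of one way the three could be wired into a shared loop. It is a toy under stated assumptions, not HOTE's actual method: this summary does not describe the paper's update rules, so the class fields (`difficulty`, `skill`, `strictness`), the `evolve` function, and every update rule below are hypothetical stand-ins in which a scalar nudge replaces a real reinforcement learning step.

```python
from dataclasses import dataclass
from typing import List, Tuple


def clamp(x: float) -> float:
    """Keep a toy score inside [0, 1]."""
    return max(0.0, min(1.0, x))


@dataclass
class Proposer:
    difficulty: float = 0.2  # hypothetical scalar standing in for task hardness

    def propose(self) -> float:
        # A real proposer would emit a research question; here it is a number.
        return self.difficulty

    def update(self, reward: float) -> None:
        # Toy rule (assumption): harder tasks after a good report, easier after a poor one.
        self.difficulty = clamp(self.difficulty + (0.1 if reward >= 0.5 else -0.1))


@dataclass
class Solver:
    skill: float = 0.3  # hypothetical scalar standing in for model weights

    def solve(self, difficulty: float) -> float:
        # Report quality drops as the task outpaces the solver's skill.
        return clamp(1.0 - max(0.0, difficulty - self.skill))

    def update(self, reward: float) -> None:
        # Toy rule (assumption): learn more from low-reward rounds.
        self.skill = clamp(self.skill + 0.05 * (1.0 - reward))


@dataclass
class Judge:
    strictness: float = 0.0  # hypothetical penalty subtracted from every score

    def score(self, quality: float) -> float:
        return clamp(quality - self.strictness)

    def update(self, reward: float) -> None:
        # Toy rule (assumption): grade more strictly while rewards stay high.
        self.strictness = clamp(self.strictness + 0.02 * reward)


def evolve(rounds: int) -> Tuple[Proposer, Solver, Judge, List[float]]:
    """Run propose -> solve -> judge, then update all three roles each round."""
    proposer, solver, judge = Proposer(), Solver(), Judge()
    rewards: List[float] = []
    for _ in range(rounds):
        task = proposer.propose()
        quality = solver.solve(task)
        reward = judge.score(quality)
        for role in (proposer, solver, judge):
            role.update(reward)
        rewards.append(reward)
    return proposer, solver, judge, rewards
```

Calling `evolve(10)` returns the three roles and the per-round rewards. The point of the sketch is the structure: all three roles are updated from the same round, so none of them stays static while the others change.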

Who benefits

AI Research, Scientific Discovery, Market Research, Consulting, Data Analytics

Key takeaways

  • HOTE enables autonomously evolving AI agents for open-ended research tasks.
  • It uses hybrid-mode reinforcement learning to evolve a proposer, solver, and judge.
  • An 8B model trained with HOTE outperforms stronger static open models (8B to 32B parameters) and models trained with other state-of-the-art deep research methods, with less training time.
  • The collaborative evolution of all three modules is critical for its effectiveness.

Original post by Hongming Piao, Chi Liu, Mengzhuo Chen, Yan Shu, Derek Li, Ying Wei, Bryan Dai

"arXiv:2606.13710v1 Announce Type: new Abstract: Deep research and agent evolution serve as de-facto tasks for AI agents in real-world applications toward artificial general intelligence. The former enables autonomous retrieval and integration of information in open-ended environm…"
