New Framework Boosts GUI Agent Performance with Skill-Guided Distillation

Zhimin Fan, Hongwei Yu, Yeqing Shen, Haolong Yan, Guozhen Peng, Tianhao Peng, Yudong Zhang, Xiaowen Zhang, Kaijun Tan, Zheng Ge, Xiangyu Zhang, Daxin Jiang · June 18, 2026

Summary

Researchers propose Skill-Guided Continuation Distillation (SGCD), an iterative self-improvement framework for GUI agents that addresses the challenge of off-trajectory states. SGCD uses a skill-guided policy to complete tasks from these unseen states, generating successful continuations that provide crucial supervision and significantly improve agent success rates.

A new research paper introduces Skill-Guided Continuation Distillation (SGCD), an iterative framework designed to enhance the performance of GUI (Graphical User Interface) agents. Current methods often rely on imitating expert demonstrations, but agents frequently encounter situations not covered by these pre-recorded paths, known as "off-trajectory states." SGCD tackles this supervision gap by first allowing the agent to explore and reach these realistic off-trajectory states. Then, a skill-guided policy takes over to successfully complete the task from these novel states, generating new, successful continuation trajectories. These newly generated continuations are then combined with the original expert data, providing the agent with crucial supervision for previously unseen scenarios. By extracting skills like "Continuation Plans" and "Failure Traps" from both successful and failed attempts, SGCD significantly improved the success rates of various base models on the OSWorld-Verified benchmark.
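The loop described above can be illustrated with a toy sketch. This is not the paper's implementation: the number-line environment, the lookup-table "behavior cloning", the hand-injected policy error, and every name (`rollout`, `fit_table`, `skill_guided_policy`, `sgcd_iteration`) are assumptions made purely for illustration. It shows the three moves the summary names: let the current policy wander into states the expert data never covered, let a skill-guided policy finish the task from there, and merge the successful continuations back into the training set.

```python
from typing import Callable, Dict, List, Tuple

State = int
Action = int
Step = Tuple[State, Action]
Policy = Callable[[State], Action]

GOAL = 3  # toy task: walk along a number line from 0 to 3


def rollout(policy: Policy, start: State, max_steps: int = 8) -> Tuple[List[Step], State]:
    """Run a policy in the toy environment; return the steps taken and the final state."""
    state = start
    steps: List[Step] = []
    for _ in range(max_steps):
        if state == GOAL:
            break
        action = policy(state)
        steps.append((state, action))
        state += action
    return steps, state


def fit_table(dataset: List[Step], default: Action = 0) -> Policy:
    """Stand-in for behavior cloning: memorize the last action seen per state."""
    table: Dict[State, Action] = dict(dataset)
    return lambda s: table.get(s, default)


def skill_guided_policy(state: State) -> Action:
    """Hypothetical skill reduced to a single rule: always step toward the goal."""
    return 1 if state < GOAL else -1


def sgcd_iteration(current: Policy, expert_data: List[Step]) -> List[Step]:
    """One illustrative round: explore, collect continuations, merge with expert data."""
    seen = {s for s, _ in expert_data}
    explored, _ = rollout(current, start=0)
    # States the current policy reached that the expert data never covered.
    off_trajectory = [s for s, _ in explored if s not in seen]
    continuations: List[Step] = []
    for s in dict.fromkeys(off_trajectory):  # de-duplicate while keeping order
        steps, final = rollout(skill_guided_policy, start=s)
        if final == GOAL:  # keep only successful continuations
            continuations.extend(steps)
    return expert_data + continuations


expert_data: List[Step] = [(0, 1), (1, 1), (2, 1)]


def flawed_policy(state: State) -> Action:
    """Imperfect clone: a hand-injected error at state 1 pushes the agent off the expert path."""
    return {0: 1, 1: -2, 2: 1}.get(state, 0)


merged = sgcd_iteration(flawed_policy, expert_data)
improved = fit_table(merged)
```

In this toy, the flawed policy gets stuck at state -1 because nothing in the expert data says what to do there. After one round, the merged dataset contains a continuation starting at -1, so the refit policy can recover from that state. A real GUI agent would replace the lookup table with model fine-tuning and the single rule with whatever skill guidance the framework extracts.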

Why it matters

This framework is highly relevant for professionals developing AI agents for automation, customer service, or software testing, as it enables agents to handle unexpected situations more robustly and reduces the need for exhaustive expert demonstrations.

How to implement this in your domain

  1. Integrate the SGCD framework into the training pipeline for GUI automation agents.
  2. Develop mechanisms to identify and record "off-trajectory states" during agent execution.
  3. Extract and define "skills" (e.g., continuation plans, failure traps) from agent rollouts to guide policy improvement.
  4. Utilize generated successful continuations to augment expert datasets for more comprehensive training.
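As a rough illustration of step 3, the sketch below sorts rollouts into the two skill types the summary mentions and renders them as guidance text. The source does not describe how skills are actually extracted or represented, so the `Rollout` and `SkillBook` types, the "last action is the trap" heuristic, and the prompt layout are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Rollout:
    """One recorded attempt at a task (hypothetical record format)."""
    task: str
    actions: List[str]
    success: bool


@dataclass
class SkillBook:
    """Collects guidance from successful and failed rollouts."""
    continuation_plans: List[str] = field(default_factory=list)
    failure_traps: List[str] = field(default_factory=list)

    def add(self, rollout: Rollout) -> None:
        if not rollout.actions:
            return  # nothing to learn from an empty attempt
        if rollout.success:
            # Keep the whole successful action sequence as a plan.
            self.continuation_plans.append(" -> ".join(rollout.actions))
        else:
            # Crude heuristic (an assumption): blame the final action of a failed attempt.
            self.failure_traps.append(rollout.actions[-1])

    def to_prompt(self) -> str:
        """Render the collected skills as plain guidance text."""
        lines = ["Continuation Plans:"]
        lines += [f"- {plan}" for plan in self.continuation_plans]
        lines.append("Failure Traps:")
        lines += [f"- avoid: {trap}" for trap in self.failure_traps]
        return "\n".join(lines)


book = SkillBook()
book.add(Rollout("export report", ["open File menu", "click Export", "confirm"], True))
book.add(Rollout("export report", ["open File menu", "click Print"], False))
prompt = book.to_prompt()
```

The resulting `prompt` string could be handed to a guiding policy as extra context. Keeping plans and traps in separate lists mirrors the summary's point that both successful and failed attempts carry useful signal.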

Who benefits

Software Testing, Business Process Automation, Customer Service, Robotics, UI/UX Development

Key takeaways

  • SGCD is an iterative self-improvement framework for GUI agents.
  • It addresses the problem of agents encountering off-trajectory states.
  • Skill-guided policies generate successful continuations for unseen scenarios.
  • The method significantly improves GUI agent success rates.

Original post by Zhimin Fan, Hongwei Yu, Yeqing Shen, Haolong Yan, Guozhen Peng, Tianhao Peng, Yudong Zhang, Xiaowen Zhang, Kaijun Tan, Zheng Ge, Xiangyu Zhang, Daxin Jiang

"arXiv:2606.18890v1 Announce Type: new Abstract: Improving GUI agents typically relies on behavior cloning on expert trajectories. However, as the current policy deviates from the expert policy, it inevitably encounters policy-induced off-trajectory states during closed-loop execu…"
