Ghost Attractor Networks Offer Efficient, Stable Robotic Action Decoding

Tianyu Wang, Ying Wang, Zhihao Liu, Xi Vincent Wang, Lihui Wang · June 18, 2026

Summary

Researchers propose Ghost Attractor Networks, a dynamical decoder that generates basin-structured latent representations for efficient and stable closed-loop sequential generation. In the reported robotic action-decoding experiments, a 2.3-million-parameter Ghost matched the offline accuracy of a 1.07-billion-parameter Diffusion Transformer with 462 times fewer parameters and 32 times lower latency.

Generating sequential outputs with large-scale Transformer and diffusion decoders often incurs high memory costs that scale with sequence length, alongside iterative per-step computation. While smaller feed-forward decoders offer efficiency, they typically produce unstructured latent representations, hindering closed-loop control features like phase-conditioned action generation and cross-step latent carry-over, which require stable latent basins.

This paper introduces Ghost Attractor Networks, a theoretically derived dynamical decoder designed to address these limitations. Its latent state evolves under a learned potential with drift, inherently creating a basin-attractor structure. This design is motivated by the need for multi-modality, single-pass switching at the decoder level, and constant memory usage. Mode transitions in Ghost arise from saddle-node bifurcations with ghost-attractor escape, and a hierarchical phase-space decomposition separates initial basin convergence from subsequent proprioceptive refinement.

Empirical evaluations show that a Ghost network, trained end-to-end, exhibits the predicted gradient-flow contraction. As a robotic action decoder, a 2.3-million-parameter Ghost matched the offline accuracy of a 1.07-billion-parameter Diffusion Transformer, achieving this with 462 times fewer parameters and 32 times lower latency. It also outperformed five other 2M-parameter decoders in offline mean squared error. In closed-loop benchmarks, phase conditioning on Ghost's basin-structured latent improved success rates significantly, with persistent-latent ensembling reaching a 95.7% success rate.
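
To make the potential-with-drift description concrete, here is a minimal toy sketch in Python. It is not the paper's model: the one-dimensional double-well potential V(z) = z^4/4 - z^2/2, the constant drift term, and the names velocity, simulate, and D_CRIT are all assumptions chosen for illustration. It shows two behaviours the summary refers to: convergence to one of two basins, and slow "ghost" passage just past a saddle-node bifurcation, where a well has vanished but the state still lingers near its former location.

```python
import math

# Assumed toy potential: V(z) = z**4 / 4 - z**2 / 2, with wells at z = -1 and z = +1.
# A constant drift tilts the landscape. At drift = D_CRIT the left well merges
# with the barrier and disappears (a saddle-node bifurcation).
D_CRIT = 2.0 / (3.0 * math.sqrt(3.0))


def velocity(z, drift):
    """dz/dt = -V'(z) + drift for the assumed double-well potential."""
    return z - z**3 + drift


def simulate(z0, drift, dt=0.01, t_max=200.0):
    """Forward-Euler integration of the toy latent dynamics.

    Returns (final_z, cross_time), where cross_time is the time at which
    z first exceeds 0, or None if it never does.
    """
    z = z0
    cross_time = None
    for step in range(int(t_max / dt)):
        z += dt * velocity(z, drift)
        if cross_time is None and z > 0.0:
            cross_time = (step + 1) * dt
    return z, cross_time


if __name__ == "__main__":
    # Without drift, the starting side decides which basin the state settles in.
    for z0 in (-0.2, 0.2):
        z_final, _ = simulate(z0, drift=0.0)
        print(f"z0={z0:+.1f} -> z_final={z_final:+.3f}")

    # Just past the bifurcation the left well is gone, but the state still
    # lingers near its "ghost" before escaping to the right basin.
    for eps in (0.01, 0.0025):
        _, t_escape = simulate(-1.0, drift=D_CRIT + eps)
        print(f"eps={eps}: escape time ~ {t_escape:.1f}")
```

In this toy system, moving the drift closer to the critical value lengthens the lingering time, following the standard inverse-square-root scaling of saddle-node bottlenecks. How the paper's learned potential realizes this is not specified in the summary.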

Why it matters

This research points to a route toward efficient and stable sequential generation, particularly for robotics and other real-time control systems. A decoder with a few million parameters and constant memory usage could be deployed in resource-constrained environments, reducing computational overhead while supporting reliable closed-loop control.

How to implement this in your domain

  1. Investigate Ghost Attractor Networks as an alternative to large Transformer or diffusion decoders for sequential generation tasks in robotics or control systems.
  2. Apply basin-structured dynamical decoders to improve closed-loop control and phase-conditioned action generation in autonomous agents.
  3. Develop and deploy more memory-efficient and lower-latency AI models for edge computing or embedded systems using this approach.
  4. Explore the use of learned potential functions and drift for creating stable and interpretable latent representations in generative models.
  5. Benchmark existing sequential generation models against Ghost Attractor Networks for efficiency and performance in specific applications.
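
For the benchmarking step in the list above, a minimal harness sketch in Python might look like the following. The function names (median_latency_ms, count_params), the warm-up and repeat counts, and the stand-in decoder are hypothetical and not taken from the paper; a real comparison would time the actual decoders on the target hardware.

```python
import math
import statistics
import time


def median_latency_ms(fn, warmup=3, repeats=20):
    """Median wall-clock time of calling fn(), in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.median(samples)


def count_params(shapes):
    """Total number of scalar parameters given weight-tensor shapes."""
    return sum(math.prod(shape) for shape in shapes)


if __name__ == "__main__":
    # Stand-in workload; replace with a real decoder's forward pass.
    def dummy_decoder():
        return sum(i * i for i in range(10_000))

    print(f"median latency: {median_latency_ms(dummy_decoder):.3f} ms")

    # Rounded parameter counts quoted in the summary: 1.07B vs 2.3M.
    print(f"parameter ratio: {1.07e9 / 2.3e6:.0f}x")
```

With the rounded figures quoted above, the ratio comes out near 465, which is consistent with the reported 462 once rounding of the parameter counts is taken into account.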

Who benefits

Robotics, Autonomous Systems, Edge AI, Manufacturing, Gaming

Key takeaways

  • Ghost Attractor Networks offer efficient, stable sequential generation with basin-structured latents.
  • They significantly reduce parameters and latency compared to large Transformers and diffusion models.
  • The design enables robust closed-loop control and phase-conditioned action generation.
  • This approach is highly effective for robotic action decoding in resource-constrained environments.

Original post by Tianyu Wang, Ying Wang, Zhihao Liu, Xi Vincent Wang, Lihui Wang

"arXiv:2606.18315v1 Announce Type: cross Abstract: Sequential output generation with large-scale Transformer and diffusion decoders pays a memory cost that grows with sequence length, plus iterative per-step computation. Replacing them with small feed-forward decoders restores eff…"

