Weave Router Optimizes LLM Costs for Coding Agents

adchurch · June 26, 2026

Summary

Weave has built a model router that sends each coding agent request to the cheapest large language model suited to the task. The company reports internal token-cost savings of up to 40%.

Weave has introduced a new model router designed to optimize the use of large language models (LLMs) within coding agents like Claude Code, Codex, and Cursor. This router acts as an intermediary, analyzing each inference request and dynamically selecting the best LLM to handle it, prioritizing both cost-efficiency and performance. The system employs a reinforcement learning model, trained on extensive agent traces, to make intelligent routing decisions. For instance, complex planning tasks might be sent to a powerful model like Opus 4.8, while simpler code exploration or implementation steps could be routed to faster and cheaper alternatives. Weave reports internal savings of 40% on token costs without compromising quality or development velocity. The router is available as a source-available tool under the Elastic License 2.0 for self-hosting, or via a hosted service.
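To make the routing idea concrete, here is a minimal sketch in Python. It is not Weave's implementation: the source describes a reinforcement learning model trained on agent traces, whereas this example swaps in a simple keyword heuristic as a stand-in. The tier names, prices, hint words, and the `route` function are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    """A hypothetical model tier; names and prices are placeholders."""
    name: str
    usd_per_million_tokens: float


STRONG = ModelTier("strong-model", 10.0)  # assumed price, for illustration only
CHEAP = ModelTier("cheap-model", 2.0)     # assumed price, for illustration only

# Words this sketch treats as signs of a planning-heavy request (an assumption).
PLANNING_HINTS = {"plan", "design", "architecture"}


def route(prompt: str) -> ModelTier:
    """Pick a tier with a keyword heuristic, standing in for a learned policy."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if words & PLANNING_HINTS:
        return STRONG
    return CHEAP


if __name__ == "__main__":
    for prompt in ("Plan the migration of the auth module",
                   "List the files in src and read utils.py"):
        print(prompt, "->", route(prompt).name)
```

A real router would replace the keyword check with a trained classifier or policy, but the interface stays the same: a request goes in and a model choice comes out.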

Why it matters

The router gives development teams a practical way to cut the operating cost of advanced LLMs in their workflows. According to Weave, it does so without reducing quality or development velocity.

How to implement this in your domain

1. Integrate the Weave Router into existing coding agent workflows to optimize LLM usage and reduce costs.
2. Evaluate the cost savings and performance improvements by A/B testing the router against direct LLM calls.
3. Customize routing logic based on specific project requirements and preferred LLM capabilities.
4. Deploy the router either by self-hosting the source-available version or utilizing the hosted service.
5. Train internal teams on leveraging intelligent model routing for more efficient AI-assisted software development.
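The evaluation step in the list above comes down to simple arithmetic on token counts and prices. The sketch below shows one way to compute the fractional saving of a routed run against a baseline that sends everything to one model. The helper names, token counts, and prices are invented for illustration and are not Weave's figures.

```python
def token_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a token count at a given per-million-token price."""
    return tokens / 1_000_000 * usd_per_million


def savings_fraction(baseline_usd: float, routed_usd: float) -> float:
    """Fraction of the baseline cost saved by the routed run."""
    if baseline_usd <= 0:
        raise ValueError("baseline cost must be positive")
    return (baseline_usd - routed_usd) / baseline_usd


if __name__ == "__main__":
    # Hypothetical workload: 10M tokens, with assumed prices of $10 and $2
    # per million tokens for the strong and cheap models.
    baseline = token_cost(10_000_000, 10.0)
    routed = token_cost(4_000_000, 10.0) + token_cost(6_000_000, 2.0)
    print(f"baseline ${baseline:.2f}, routed ${routed:.2f}, "
          f"saved {savings_fraction(baseline, routed):.0%}")
```

Cost is only half of an A/B comparison. The same runs should also be scored on task outcomes, such as tests passing or review acceptance, to check that the cheaper routing does not lower quality.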

Who benefits

Software Development, AI Engineering, IT Services, DevOps

Key takeaways

The Weave Router intelligently routes coding agent requests to optimal LLMs.
Weave reports savings of up to 40% on token costs.
The router maintains performance and quality by selecting models based on task complexity.
It is available for self-hosting or as a hosted service.

Original post by adchurch

"We built a model router that plugs into coding agents (e.g. Claude Code, Codex, Cursor, etc.) and intelligently sends requests to the best model to serve them. Here's a quick demo of running it locally: https://www.youtube.com/watch?v=isKhAyivtfM . At Weave, w…"

Originally posted by adchurch on X
