DR-DCI Scales Agentic Search Over Large Corpora with Dynamic Workspace.

Yi Lu, Zhuofeng Li, Ping Nie, Haoxiang Zhang, Yuyu Zhang, Kai Zou, Wenhu Chen, Jimmy Lin, Dongfu Jiang, Yu Zhang · June 16, 2026

Summary

DR-DCI is a new retriever-steered framework that enhances Direct Corpus Interaction (DCI) for AI agents by dynamically pulling relevant documents into a local workspace. This approach allows agents to perform flexible search and verification operations efficiently across massive datasets, overcoming the scalability limitations of traditional DCI.

Agentic search over large corpora usually relies on retriever-mediated interfaces such as BM25 or ColBERT. These scale well for candidate discovery, but they expose evidence to the agent only as ranked results. Direct Corpus Interaction (DCI) gives the agent more flexible and precise operations over documents, yet running those operations against an entire corpus does not scale. This research introduces DR-DCI, a framework that extends DCI with retriever-steered dynamic workspace expansion.

Instead of operating on the entire corpus, a DR-DCI agent selectively retrieves relevant documents and pulls them into a local workspace, where DCI operations run efficiently. The design combines the broad recall of retrievers with the precise, flexible interaction of DCI. The agent expands its workspace as needed, so it can filter, compare, and verify constraints across documents without issuing full-corpus commands.

According to the authors, experiments show DR-DCI to be more effective and more efficient than raw DCI and other retrieval-based baselines at corpus scales from 100K to 10M documents, as well as in a 20M-scale Wiki-18 QA setting: accuracy improves while tool usage, wall time, and estimated cost all fall.
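To make the workspace idea concrete, here is a minimal Python sketch of retriever-steered expansion followed by a local operation. It is an illustration under stated assumptions, not the authors' implementation: the `retrieve` function is a toy term-overlap ranker standing in for BM25 or ColBERT, and the `Workspace` class with its `expand` and `grep` methods is hypothetical naming chosen for this example.

```python
import re
from dataclasses import dataclass, field

# Toy corpus for illustration; the paper reports corpora of 100K to 20M documents.
CORPUS = {
    "d1": "Marie Curie won the Nobel Prize in Physics in 1903.",
    "d2": "Marie Curie won the Nobel Prize in Chemistry in 1911.",
    "d3": "Albert Einstein won the Nobel Prize in Physics in 1921.",
    "d4": "The Eiffel Tower was completed in 1889.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, k):
    """Hypothetical stand-in retriever: rank by term overlap (not BM25)."""
    q = tokenize(query)
    scored = [(len(q & tokenize(text)), doc_id) for doc_id, text in CORPUS.items()]
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [doc_id for score, doc_id in ranked if score > 0][:k]


@dataclass
class Workspace:
    """Local document set that the agent grows on demand (assumed design)."""

    docs: dict = field(default_factory=dict)

    def expand(self, query, k=2):
        """Retrieve the top-k documents and add those not already present."""
        added = []
        for doc_id in retrieve(query, k):
            if doc_id not in self.docs:
                self.docs[doc_id] = CORPUS[doc_id]
                added.append(doc_id)
        return added

    def grep(self, pattern):
        """Local operation: regex scan over workspace documents only."""
        rx = re.compile(pattern, re.IGNORECASE)
        return sorted(d for d, text in self.docs.items() if rx.search(text))


ws = Workspace()
print(ws.expand("Curie Nobel Prize"))   # first expansion
print(ws.grep(r"physics"))              # verification inside the workspace
print(ws.expand("Einstein Physics"))    # second expansion adds only new documents
print(ws.grep(r"physics"))
```

The point of the sketch is the division of labor: the retriever decides which documents enter the workspace, and every later scan touches only the workspace, never the full corpus.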

Why it matters

For professionals building or deploying AI agents that need to interact with extensive knowledge bases, DR-DCI offers a more efficient, scalable, and accurate method for information retrieval and synthesis. This can lead to more capable and cost-effective AI assistants and automated research tools.

How to implement this in your domain

  1. Evaluate DR-DCI for enhancing the performance of your AI agents in document-heavy tasks.
  2. Integrate the dynamic workspace concept into existing agentic search architectures to improve scalability.
  3. Develop agent workflows that leverage retriever-steered actions for efficient evidence gathering and verification.
  4. Benchmark DR-DCI against current retrieval methods to identify potential gains in accuracy and cost-efficiency.
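For the benchmarking step, a small harness along the following lines may be a reasonable starting point. It is a sketch under assumptions: the agent interface (a callable returning an answer and a tool-call count), the exact-match scoring, and the two toy agents are all hypothetical and are not taken from the paper.

```python
import time


def benchmark(agent, dataset):
    """Run `agent` over (question, gold_answer) pairs and aggregate metrics.

    `agent` is assumed to be a callable returning (answer, tool_calls).
    Accuracy uses case-insensitive exact match, a deliberately simple choice.
    """
    if not dataset:
        raise ValueError("dataset must contain at least one example")
    correct = 0
    tool_calls_total = 0
    start = time.perf_counter()
    for question, gold in dataset:
        answer, tool_calls = agent(question)
        correct += int(answer.strip().lower() == gold.strip().lower())
        tool_calls_total += tool_calls
    elapsed = time.perf_counter() - start
    n = len(dataset)
    return {
        "accuracy": correct / n,
        "mean_tool_calls": tool_calls_total / n,
        "wall_time_s": elapsed,
    }


# Toy agents with fixed behavior, used only to exercise the harness.
def baseline_agent(question):
    return "unknown", 5


def candidate_agent(question):
    return "Paris", 2


dataset = [("What is the capital of France?", "paris")]
for name, agent in [("baseline", baseline_agent), ("candidate", candidate_agent)]:
    print(name, benchmark(agent, dataset))
```

Running every system through the same harness and the same question set keeps the accuracy, tool-usage, and wall-time figures comparable. Estimated cost could be added as a further metric if per-call pricing is known.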

Who benefits

AI Development, LegalTech, Research & Academia, Customer Support, Data Analytics

Key takeaways

  • DR-DCI improves agentic search by combining retriever recall with flexible DCI operations.
  • It dynamically expands a local workspace, enabling efficient interaction with large corpora.
  • Reported results show higher accuracy with lower tool usage, wall time, and estimated cost than raw DCI and other retrieval-based baselines.
  • The evaluation covers corpora from 100K to 10M documents, plus a 20M-scale Wiki-18 QA setting.

Original post by Yi Lu, Zhuofeng Li, Ping Nie, Haoxiang Zhang, Yuyu Zhang, Kai Zou, Wenhu Chen, Jimmy Lin, Dongfu Jiang, Yu Zhang

"arXiv:2606.14885v1 Announce Type: new Abstract: Agentic search over large corpora relies on retriever-mediated interfaces (e.g., BM25 or ColBERT) for scalable candidate discovery. While effective at ranking relevant documents, these interfaces expose evidence only as ranked resul…"

