SchemaRAG Boosts LLM Structured Information Extraction Efficiency

Sin Yu Bonnie Ho, Arlie Coles, Erik Larsson, Eric Marshall, Nathan Bodenstab, Paul Vozila · July 2, 2026

Summary

Researchers propose SchemaRAG, a retrieval-augmented generation (RAG) framework that dynamically prunes large and complex output schemas for LLM-driven structured information extraction. SchemaRAG significantly improves micro-F1 scores, reduces latency, and lowers token costs by leveraging schema metadata and few-shot examples.

Extracting structured data from unstructured text with large language models (LLMs) becomes harder as the target schema grows larger and more complex. Including the entire schema in the prompt raises cost and latency, can degrade performance through "lost-in-the-middle" effects, and may even exceed the model's context length. To address this, the authors introduce SchemaRAG, a retrieval-augmented generation (RAG) framework that dynamically prunes the output schema space for schema-conditioned information extraction, using available schema metadata and few-shot examples. In evaluations on real-world healthcare and e-commerce datasets, the framework achieved an 8.8% increase in micro-F1, a 47% reduction in latency, and a 48% reduction in token costs, which supports its practical value for large-schema extraction.
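
To make the idea of schema pruning concrete, here is a minimal sketch in Python. It is not the authors' implementation: this summary does not describe the retriever SchemaRAG actually uses, so the sketch substitutes a simple bag-of-words cosine similarity. The function name prune_schema, the schema layout (a "description" and optional "examples" per field), and the sample fields are all assumptions made for illustration.

```python
import re
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def prune_schema(document, schema, top_k=2):
    """Keep the top_k schema fields whose metadata best matches the document.

    Assumption: lexical overlap stands in for whatever retriever the
    paper uses. Each field is represented by its name, its description,
    and its few-shot example snippets.
    """
    doc_vec = Counter(tokenize(document))
    scored = []
    for name, meta in schema.items():
        field_text = " ".join(
            [name.replace("_", " "), meta["description"], *meta.get("examples", [])]
        )
        scored.append((cosine(doc_vec, Counter(tokenize(field_text))), name))
    # Highest score first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    keep = [name for score, name in scored[:top_k] if score > 0]
    return {name: schema[name] for name in keep}


# Hypothetical schema mixing healthcare and e-commerce fields.
SCHEMA = {
    "medication_name": {
        "description": "name of a prescribed drug or medication",
        "examples": ["patient takes metformin daily"],
    },
    "dosage": {
        "description": "dose amount and unit of a medication",
        "examples": ["500 mg twice daily"],
    },
    "shipping_address": {
        "description": "postal address where an order is delivered",
        "examples": ["ship to 12 Main Street"],
    },
    "order_total": {
        "description": "total price charged for an order",
        "examples": ["total came to 40 dollars"],
    },
}

document = "The patient was prescribed metformin 500 mg daily."
pruned = prune_schema(document, SCHEMA, top_k=2)
print(list(pruned))  # only the pruned fields would be placed in the LLM prompt
```

In a real pipeline the scoring step would more plausibly use dense embeddings, and top_k would be tuned so that fields needed for the extraction are rarely dropped, since a field missing from the prompt cannot be extracted.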

Why it matters

Professionals in data engineering, AI development, and business intelligence can use SchemaRAG to make LLM-driven structured information extraction more efficient, cost-effective, and accurate, especially when dealing with complex enterprise data.

How to implement this in your domain

  1. Implement SchemaRAG to optimize LLM-driven structured information extraction from unstructured text.
  2. Leverage schema metadata and few-shot examples to dynamically prune large schemas in prompts.
  3. Integrate SchemaRAG into data processing pipelines for healthcare, e-commerce, or other data-intensive domains.
  4. Benchmark the cost, latency, and accuracy improvements against current full-schema prompting methods.
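
For the benchmarking step, the sketch below shows one way to compute micro-F1 over extracted field/value pairs and to express latency or token savings as a relative reduction. The function names, the exact-match scoring rule, and the toy data are assumptions for illustration. The paper's evaluation protocol is not described in this summary and may differ.

```python
def micro_f1(predictions, references):
    """Micro-averaged F1 over (field, value) pairs across all documents.

    Assumption: a predicted pair counts as correct only if both the
    field name and the value match the reference exactly. Values must
    be hashable (for example strings or numbers).
    """
    tp = fp = fn = 0
    for pred, gold in zip(predictions, references):
        pred_items, gold_items = set(pred.items()), set(gold.items())
        tp += len(pred_items & gold_items)   # correct extractions
        fp += len(pred_items - gold_items)   # spurious or wrong values
        fn += len(gold_items - pred_items)   # missed reference pairs
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def relative_reduction(baseline, candidate):
    """Fractional reduction of candidate relative to baseline (0.25 means 25% lower)."""
    return (baseline - candidate) / baseline


# Toy data, not results from the paper.
gold = [{"medication_name": "metformin", "dosage": "500 mg"}, {"order_total": "40"}]
pred = [{"medication_name": "metformin", "dosage": "50 mg"}, {"order_total": "40"}]

score = micro_f1(pred, gold)
saving = relative_reduction(baseline=2000, candidate=1500)  # e.g. prompt tokens
print(f"micro-F1: {score:.3f}, token reduction: {saving:.0%}")
```

Running the same documents through a full-schema prompt and a pruned-schema prompt, then comparing the two micro-F1 scores alongside measured latency and token counts, gives a like-for-like comparison with the existing method.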

Who benefits

Healthcare, E-commerce, Financial Services, LegalTech, Data Analytics

Key takeaways

  • SchemaRAG dynamically prunes large schemas for LLM-driven information extraction.
  • It significantly improves extraction accuracy (micro-F1), reduces latency, and lowers token costs.
  • The framework leverages schema metadata and few-shot examples for efficient schema reduction.
  • SchemaRAG is practical for complex, large-schema extraction tasks in real-world applications.

Original post by Sin Yu Bonnie Ho, Arlie Coles, Erik Larsson, Eric Marshall, Nathan Bodenstab, Paul Vozila

"arXiv:2607.00008v1 Announce Type: cross Abstract: Extracting structured data from unstructured text using large language models (LLMs) becomes challenging when target schemas are large and complex. In such cases, including the full schema in the prompt increases cost and latency,…"
