Predicting LLM Compositional Failures Using Feature Geometry

Jennifer Meng Lu, Ruochen Zhang, Isabelle Lee, David Alvarez-Melis, Ellie Pavlick, Naomi Saphra · June 15, 2026

Summary

This research demonstrates that an LLM's representational geometry can predict its compositional failures. When concepts are encoded near-orthogonally, the model composes them reliably; when encodings are close, causing interference, the model fails. This method anticipates failure modes without evaluating specific inputs.

Identifying the scenarios that challenge Large Language Models (LLMs) typically relies on human intuition or extensive benchmark curation. This paper takes a different approach: predicting an LLM's compositional errors from its internal representational geometry. The core finding is that compositional failure can be attributed to interference between salient features in the model's internal representations. If a pair of concepts is encoded with near-orthogonal linear representations, the LLM composes them reliably. If their linear encodings are too close, the resulting interference makes the model struggle to compose them. The method reliably anticipates failure modes across several compositional tasks, including programmatic settings, multi-hop reasoning, and multilingual factual recall, without requiring evaluation of specific inputs. These results offer a scalable basis for identifying high-risk examples, constructing targeted stress tests, and improving active learning strategies in real-world LLM deployments.
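To make the geometric idea concrete, here is a minimal sketch of how the closeness of two linear concept encodings could be scored. It is not the authors' method: the difference-of-means estimator, the function names `concept_direction` and `interference`, and the toy activations are all assumptions chosen for illustration. The score is the absolute cosine similarity between two concept directions, so 0 means orthogonal and 1 means fully aligned.

```python
import numpy as np


def concept_direction(pos_acts, neg_acts):
    """Unit-norm difference-of-means direction for one concept.

    pos_acts / neg_acts: arrays of shape (n_examples, hidden_dim) holding
    hidden activations for inputs with and without the concept.
    This estimator is an assumption for illustration, not the paper's.
    """
    pos = np.asarray(pos_acts, dtype=float)
    neg = np.asarray(neg_acts, dtype=float)
    direction = pos.mean(axis=0) - neg.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0.0:
        raise ValueError("concept direction is zero; activations do not differ")
    return direction / norm


def interference(dir_a, dir_b):
    """Absolute cosine similarity: 0 = orthogonal, 1 = fully aligned."""
    a = np.asarray(dir_a, dtype=float)
    b = np.asarray(dir_b, dtype=float)
    return float(abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Toy activations (hypothetical, hidden_dim = 3): concept A varies along
# axis 0 and concept B along axis 1, so their directions are orthogonal.
a_pos = np.array([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
a_neg = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
b_pos = np.array([[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]])
b_neg = np.array([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]])

dir_a = concept_direction(a_pos, a_neg)
dir_b = concept_direction(b_pos, b_neg)
score_orthogonal = interference(dir_a, dir_b)  # orthogonal pair
score_aligned = interference(dir_a, dir_a)     # identical direction
```

Under the summary's claim, a pair scoring near 0 would be expected to compose reliably, while a pair scoring near 1 would be a candidate for interference. Where the boundary lies is not stated in this summary.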

Why it matters

For professionals involved in LLM development, testing, and deployment, this research offers a diagnostic tool. It flags likely failure points in compositional tasks in advance, which supports more targeted model improvement, more robust stress testing, and safer deployment of AI systems.

How to implement this in your domain

  1. Analyze the representational geometry of your LLMs to predict potential compositional failure modes.
  2. Develop tools to visualize and measure the orthogonality of concept encodings within your models.
  3. Use predicted failure modes to construct targeted adversarial examples and stress tests for LLM evaluation.
  4. Integrate geometric analysis into active learning pipelines to prioritize data collection for challenging concept combinations.
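
The steps above could be prototyped roughly as follows. This is a sketch under stated assumptions: it presumes you already have one direction vector per concept (how to extract them is outside this summary), and the helper name `rank_concept_pairs` and the example concept names and vectors are hypothetical. It scores every concept pair by absolute cosine similarity and sorts the pairs so that the most closely aligned, and therefore presumably highest-risk, combinations come first.

```python
from itertools import combinations

import numpy as np


def rank_concept_pairs(directions):
    """Rank concept pairs from most to least aligned.

    directions: dict mapping a concept name to its direction vector.
    Returns a list of (name_a, name_b, score) tuples, where score is the
    absolute cosine similarity between the two directions.
    """
    unit = {}
    for name, vec in directions.items():
        v = np.asarray(vec, dtype=float)
        unit[name] = v / np.linalg.norm(v)

    scored = [
        (a, b, float(abs(unit[a] @ unit[b])))
        for a, b in combinations(unit, 2)
    ]
    # Highest score first: these pairs are the stress-test candidates.
    return sorted(scored, key=lambda item: item[2], reverse=True)


# Hypothetical concept directions in a 3-dimensional hidden space.
directions = {
    "color": [1.0, 0.0, 0.0],
    "shape": [0.0, 1.0, 0.0],
    "hue": [0.9, 0.1, 0.0],
}

ranked = rank_concept_pairs(directions)
for name_a, name_b, score in ranked:
    print(f"{name_a} + {name_b}: {score:.3f}")
```

In this toy data the "color" and "hue" directions are nearly parallel, so that pair ranks first and would be the first candidate for targeted stress tests or extra data collection.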

Who benefits

AI Development · Software Testing · Cybersecurity · Quality Assurance · Research & Development

Key takeaways

  • LLM compositional failures can be predicted from their internal feature geometry.
  • Near-orthogonal concept encodings lead to reliable composition.
  • Close concept encodings cause interference and compositional failure.
  • This method allows for proactive identification of failure modes without input evaluation.

Original post by Jennifer Meng Lu, Ruochen Zhang, Isabelle Lee, David Alvarez-Melis, Ellie Pavlick, Naomi Saphra

"arXiv:2606.13934v1 Abstract: Humans cannot always intuit what scenarios are most challenging to LLMs. Hoping to capture challenging edge cases, developers either design problems to be difficult for humans or curate extensive benchmarks. What if we could instead…"
