FoundCause Discovers Causal Graphs with Latent Confounders.

Patrick Blöbaum, Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan · June 17, 2026

Summary

This paper introduces FoundCause, an amortized causal discovery model trained on synthetic data that maps datasets directly to causal graphs while explicitly modeling latent confounding. The authors report that it significantly outperforms 11 classical and 4 other amortized methods on 15 real-world datasets in F1 score, AUROC, and structural Hamming distance.

Discovering causal relationships from observational data is hard because a method must recover directed structure and account for unobserved (latent) confounders without access to interventions, and traditional methods often struggle on both counts. FoundCause is an amortized causal discovery model designed to address these limitations. It is trained entirely on large collections of synthetic structural causal models, which lets it learn transferable statistical patterns and map a dataset directly to a causal graph in a single forward pass. Latent confounding is modeled explicitly through learnable latent tokens.

The model incorporates several inductive biases: a permutation-invariant transformer encoder that alternates attention over samples and over variables, injected pairwise statistical features, a factorized decoder that separates edge existence from edge direction, and a triangular refinement module that reasons over higher-order causal motifs. FoundCause also handles missing data.

According to the paper, FoundCause significantly outperforms 11 classical and 4 other amortized causal discovery methods on 15 real-world datasets, with substantial improvements in F1 score, AUROC, and structural Hamming distance, while offering rapid inference.
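To make the idea of a factorized decoder concrete, here is a minimal NumPy sketch of one way edge existence and edge direction could be scored separately and then combined. This is an illustration under stated assumptions, not the paper's implementation: the function name `factorized_edge_probs`, the use of sigmoids, and the multiplicative combination are all hypothetical choices, and the sketch ignores latent-confounder edges entirely.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def factorized_edge_probs(exist_logits, dir_logits):
    """Combine per-pair existence and direction scores (hypothetical sketch).

    exist_logits: (d, d) array; only the upper triangle (i < j) is read,
        giving one existence score per unordered pair {i, j}.
    dir_logits: (d, d) array; upper-triangle entry (i, j) scores the
        orientation i -> j against j -> i.
    Returns a (d, d) array P where
        P[i, j] = P(edge between i and j) * P(i -> j | edge exists).
    """
    d = exist_logits.shape[0]
    rows, cols = np.triu_indices(d, k=1)  # all pairs with i < j

    p_exist = sigmoid(exist_logits[rows, cols])
    p_forward = sigmoid(dir_logits[rows, cols])

    probs = np.zeros((d, d))
    probs[rows, cols] = p_exist * p_forward          # i -> j
    probs[cols, rows] = p_exist * (1.0 - p_forward)  # j -> i
    return probs


# Toy scores for three variables (made-up numbers, for illustration only).
exist = np.array([[0.0, 2.0, -3.0],
                  [0.0, 0.0, 0.5],
                  [0.0, 0.0, 0.0]])
direction = np.array([[0.0, 1.5, 0.0],
                      [0.0, 0.0, -2.0],
                      [0.0, 0.0, 0.0]])
P = factorized_edge_probs(exist, direction)
```

One consequence of this design is that the two directed probabilities for a pair always sum to that pair's existence probability, so uncertainty about orientation does not inflate the chance that an edge exists at all.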

Why it matters

An efficient method for recovering causal structure from observational data supports evidence-based decision-making, policy formulation, and scientific discovery across many domains. Practitioners can use the resulting causal graphs to gain deeper insight into their data and to design more effective strategies and interventions.

How to implement this in your domain

  1. Apply FoundCause to observational datasets in your domain to uncover underlying causal structures.
  2. Integrate causal discovery methods into data analysis pipelines for more robust insights and decision-making.
  3. Explore how explicitly modeling latent confounders can improve the reliability of causal inferences in your work.
  4. Use the amortized nature of FoundCause for rapid causal graph discovery in large-scale data environments.
  5. Consider the implications of causal graphs for designing more effective interventions or policies based on data.
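Before relying on a discovered graph in a pipeline, it can help to score it against a reference graph where one is known. The sketch below computes two of the metrics named in the summary, structural Hamming distance (SHD) and F1, for directed adjacency matrices. The function names are illustrative, and the conventions are assumptions: a reversed edge counts as one SHD error here, and F1 is computed over directed edges; the paper's exact conventions may differ.

```python
import numpy as np


def structural_hamming_distance(pred, true):
    """Count variable pairs whose edge status differs between two graphs.

    pred, true: (d, d) 0/1 adjacency matrices, entry [i, j] = 1 meaning i -> j.
    A missing, extra, or reversed edge each counts as one error
    (assumed convention; some papers count a reversal as two).
    """
    pred = np.asarray(pred, dtype=bool)
    true = np.asarray(true, dtype=bool)
    diff = pred != true
    pair_mismatch = diff | diff.T  # a pair is wrong if either direction differs
    return int(np.triu(pair_mismatch, k=1).sum())


def directed_f1(pred, true):
    """F1 score over directed edges, returning 0.0 when undefined."""
    pred = np.asarray(pred, dtype=bool)
    true = np.asarray(true, dtype=bool)
    tp = int((pred & true).sum())
    fp = int((pred & ~true).sum())
    fn = int((~pred & true).sum())
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Reference graph: 0 -> 1 -> 2. Prediction: 0 -> 1, 2 -> 1 (reversed), 0 -> 2 (extra).
true_graph = np.array([[0, 1, 0],
                       [0, 0, 1],
                       [0, 0, 0]])
pred_graph = np.array([[0, 1, 1],
                       [0, 0, 0],
                       [0, 1, 0]])

shd_value = structural_hamming_distance(pred_graph, true_graph)  # 2
f1_value = directed_f1(pred_graph, true_graph)                   # 0.4
```

In the toy example, one edge is correct, one is reversed, and one is spurious, giving an SHD of 2 and an F1 of 0.4. AUROC is omitted because it needs edge probabilities rather than a thresholded graph.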

Who benefits

Healthcare, Social Sciences, Economics, Marketing, Public Policy

Key takeaways

  • FoundCause is a novel amortized model for causal discovery from observational data.
  • It explicitly models latent confounders, a significant challenge in causal inference.
  • The model outperforms many classical and amortized methods on real-world datasets.
  • It offers rapid inference by mapping datasets to causal graphs in a single pass.

Original post by Patrick Blöbaum, Krishnakumar Balasubramanian, Shiva Prasad Kasiviswanathan

"arXiv:2606.17516v1 Announce Type: new Abstract: Causal discovery from observational data remains challenging due to the need to recover directed structure and latent confounding without interventions. We propose FoundCause, an amortized causal discovery model trained entirely on…"

