Thesis Explores Bayesian Principles for Deep Learning Uncertainty and Generalization

Luis A. Ortega · June 15, 2026

Summary

This thesis investigates how Bayesian principles can enhance the understanding of modern deep learning systems, focusing on generalization and uncertainty quantification. It introduces a scalable Bayesian framework (DVIP), post-hoc uncertainty methods (VaLLA, FMGP), and a unified theoretical framework connecting diversity, smoothness, and stochasticity to generalization.

This doctoral thesis investigates how Bayesian principles can deepen the understanding of modern deep learning systems. It focuses on two questions: how well neural networks generalize, and how reliably they quantify uncertainty in their predictions. The research approaches both from methodological and theoretical perspectives, unifying Bayesian inference, function-space modeling, and large-deviation theory.

Methodologically, the thesis introduces the Deep Variational Implicit Process (DVIP), a scalable Bayesian framework that extends implicit processes to deep architectures. It also proposes two post-hoc methods, the Variational Linearized Laplace Approximation (VaLLA) and the Fixed-Mean Gaussian Process (FMGP), designed to equip pre-trained deterministic networks with well-calibrated uncertainty estimates without retraining.

Theoretically, the thesis addresses why large, over-parameterized neural networks generalize so effectively. It develops a unified probabilistic framework that connects three key mechanisms (diversity, smoothness, and stochasticity) within PAC-Bayesian and large-deviation theory, giving a firmer theoretical foundation for understanding generalization in deep learning.
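For orientation, the classical PAC-Bayesian bound (McAllester's bound in the form tightened by Maurer) gives a sense of the kind of statement this theory works with. It is standard background, not a result taken from the thesis, and the notation is chosen here for illustration: \pi is a prior over parameters fixed before seeing data, \rho is any posterior, n \ge 8 is the number of i.i.d. training samples, L is the expected loss and \hat{L} the empirical loss, with the loss bounded in [0, 1]. With probability at least 1 - \delta over the training sample, simultaneously for all \rho:

```latex
\mathbb{E}_{\theta \sim \rho}\big[L(\theta)\big]
\;\le\;
\mathbb{E}_{\theta \sim \rho}\big[\hat{L}(\theta)\big]
+ \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}}
```

The bound is stated for a randomized predictor drawn from \rho, and the KL term penalizes posteriors that move far from the prior. How the thesis relates diversity, smoothness, and stochasticity to bounds of this type is not detailed in this summary.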

Why it matters

Professionals building and deploying AI models, especially in high-stakes applications, need reliable uncertainty estimates and a deeper understanding of generalization to ensure trustworthiness, safety, and performance.

How to implement this in your domain

1. Apply post-hoc uncertainty quantification methods like VaLLA or FMGP to existing pre-trained deep learning models to obtain calibrated uncertainty estimates.
2. Explore the Deep Variational Implicit Process (DVIP) for developing scalable Bayesian deep learning architectures.
3. Incorporate principles of diversity, smoothness, and stochasticity into model design and training to improve generalization.
4. Utilize the theoretical insights to better interpret and debug the generalization behavior of complex neural networks.
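The general idea behind the first step, adding uncertainty to a model after training while leaving its predictions unchanged, can be illustrated with a minimal sketch. The code below is not VaLLA or FMGP; it is a much simpler last-layer Gaussian (Laplace-style) approximation for regression, swapped in for illustration. The feature map `frozen_features`, the toy data, and the values of `noise_var` and `prior_precision` are all assumptions made up for this example.

```python
import numpy as np

def frozen_features(x, W, b):
    """Hypothetical stand-in for the frozen body of a pre-trained network."""
    return np.tanh(x @ W + b)

def last_layer_posterior(Phi, y, noise_var, prior_precision):
    """Gaussian posterior over last-layer weights.

    For a Gaussian likelihood the Laplace approximation is exact, so this is
    Bayesian linear regression on the frozen features.
    """
    precision = Phi.T @ Phi / noise_var + prior_precision * np.eye(Phi.shape[1])
    cov = np.linalg.inv(precision)
    w_map = cov @ (Phi.T @ y) / noise_var  # posterior mean = MAP weights
    return w_map, cov

def predict_with_uncertainty(Phi_new, w_map, cov, noise_var):
    """Predictive mean (the usual deterministic output) and variance."""
    mean = Phi_new @ w_map
    # Per-point quadratic form phi^T cov phi, plus observation noise.
    var = np.einsum("ij,jk,ik->i", Phi_new, cov, Phi_new) + noise_var
    return mean, var

# Toy 1-D regression data (assumed, for illustration only).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, size=(50, 1))
y_train = np.sin(3.0 * x_train[:, 0]) + 0.1 * rng.normal(size=50)

# Fixed random weights play the role of a pre-trained, frozen network body.
W = rng.normal(size=(1, 20))
b = rng.normal(size=20)
noise_var = 0.01        # assumed observation noise variance
prior_precision = 1.0   # assumed isotropic Gaussian prior precision

Phi_train = frozen_features(x_train, W, b)
w_map, cov = last_layer_posterior(Phi_train, y_train, noise_var, prior_precision)

x_test = np.array([[0.0], [0.5], [3.0]])
mean, var = predict_with_uncertainty(
    frozen_features(x_test, W, b), w_map, cov, noise_var
)
for x, m, s in zip(x_test[:, 0], mean, np.sqrt(var)):
    print(f"x={x:+.1f}  mean={m:+.3f}  std={s:.3f}")
```

In a real setting the last-layer weights would come from the existing trained model instead of the closed-form fit used here, and only the covariance would be computed after the fact. Whether the resulting variances are well calibrated has to be checked on held-out data.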

Who benefits

Healthcare, Autonomous Vehicles, Financial Services, Scientific Research, AI/ML Engineering

Key takeaways

Bayesian principles offer deeper insights into deep learning generalization and uncertainty.
DVIP provides a scalable Bayesian framework for deep architectures.
VaLLA and FMGP enable post-hoc uncertainty estimation for pre-trained networks.
Diversity, smoothness, and stochasticity are key mechanisms for deep learning generalization.

Original post by Luis A. Ortega

"arXiv:2606.13818v1 Announce Type: new Abstract: This thesis investigates how Bayesian principles can deepen our understanding of modern deep learning systems. While neural networks achieve remarkable predictive performance, their ability to generalize and to quantify uncertainty…"

Originally posted by Luis A. Ortega on X
