New Workflow Discovers Reaction Networks Using MCMC and Chemical-Informed GPs.

Runzhe Liu, Zihao Wang, Wenbo Yang, Shengyang Tao · June 24, 2026

Summary

This paper introduces PC-MCMC-CIGP, a gray-box workflow that combines physically constrained Markov Chain Monte Carlo (MCMC) with Chemical-Informed Gaussian Processes (CIGP) to discover reaction networks from sparse chemical data. The method improves parameter calibration and experimental design, and it distinguishes elementary pathways from deceptive fits, with reported gains in chemical optimization.

Extracting accurate governing equations from limited and noisy chemical time-series data is a significant challenge, largely due to the intertwined nature of discrete reaction topology and continuous kinetic parameters. Researchers have developed PC-MCMC-CIGP, a novel gray-box workflow designed to address this complexity. This method integrates spike-and-slab topology sampling, rigorous conservation and thermodynamic screening, and a Chemical-Informed Gaussian Process (CIGP) residual model. The core contribution lies not in isolated MCMC or GP families, but in their synergistic integration into a physically constrained workflow that incorporates explicit uncertainty-aware acquisition choices. Evaluations on benchmarks like H2 + Br2 demonstrated the constrained sampler's ability to differentiate elementary radical pathways from misleading phenomenological fits. In styrene epoxidation, the CIGP optimization loop led to a 12.5% improvement in final yield compared to a baseline. Further studies on acquisition strategies revealed that physically constrained criteria significantly reduce low-yield suggestions, while EI-style criteria still offer strong final-yield performance.
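
To make the conservation screening step concrete, here is a minimal sketch of an element-balance check on candidate reactions for the H2 + Br2 system. It is an illustration under stated assumptions, not the paper's implementation: the candidate list, the `COMPOSITION` table, and the function names are hypothetical, and the thermodynamic screen is not shown.

```python
# Atoms per molecule for each species in the H2 + Br2 system.
COMPOSITION = {
    "H2": {"H": 2},
    "Br2": {"Br": 2},
    "HBr": {"H": 1, "Br": 1},
    "H": {"H": 1},
    "Br": {"Br": 1},
}


def element_imbalance(reaction):
    """Return the net atoms created per element (products minus reactants).

    `reaction` is a (reactants, products) pair of {species: coefficient}
    dicts. An empty result means the reaction is element-balanced.
    """
    reactants, products = reaction
    net = {}
    for side, sign in ((reactants, -1), (products, +1)):
        for species, coeff in side.items():
            for element, count in COMPOSITION[species].items():
                net[element] = net.get(element, 0) + sign * coeff * count
    return {element: n for element, n in net.items() if n != 0}


def conserves_elements(reaction):
    """Hard screen: accept a reaction only if every element balances."""
    return not element_imbalance(reaction)


# Hypothetical candidate library: three radical-chain steps plus one
# deliberately unbalanced proposal that the screen should reject.
CANDIDATES = {
    "Br2 -> 2 Br": ({"Br2": 1}, {"Br": 2}),
    "Br + H2 -> HBr + H": ({"Br": 1, "H2": 1}, {"HBr": 1, "H": 1}),
    "H + Br2 -> HBr + Br": ({"H": 1, "Br2": 1}, {"HBr": 1, "Br": 1}),
    "H2 + Br2 -> HBr": ({"H2": 1, "Br2": 1}, {"HBr": 1}),
}

admissible = [name for name, rxn in CANDIDATES.items() if conserves_elements(rxn)]
```

In a topology sampler, a screen like this would plausibly run before a proposed network is scored, so that unbalanced candidates never enter the posterior. Note that element balance alone cannot separate an elementary step from a balanced overall reaction; that distinction depends on the kinetic fit.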

Why it matters

For professionals in chemical engineering and materials science, this workflow offers a powerful tool to accelerate the discovery and optimization of chemical reactions, leading to more efficient processes and novel material development.

How to implement this in your domain

  1. Apply the PC-MCMC-CIGP workflow to analyze complex chemical reaction systems with limited experimental data.
  2. Integrate spike-and-slab topology sampling to identify plausible reaction pathways and mechanisms.
  3. Utilize hard conservation and thermodynamic screening to ensure the physical validity of proposed reaction networks.
  4. Employ Chemical-Informed Gaussian Processes (CIGP) for robust parameter calibration and uncertainty quantification.
  5. Leverage the uncertainty-aware acquisition choices for intelligent experimental design, guiding future data collection to maximize information gain.
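
The acquisition step in the list above can be sketched with the standard expected-improvement (EI) formula plus a feasibility filter. This is a minimal sketch under assumptions: the posterior means and standard deviations are made-up stand-ins for the output of a fitted surrogate model, and the `temp_C` field and the 80 °C limit are hypothetical examples of a physical constraint, not values from the paper.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mean, std, best):
    """EI for maximization, given a Gaussian posterior at one candidate."""
    if std <= 0.0:
        # No predictive uncertainty: improvement is deterministic.
        return max(mean - best, 0.0)
    z = (mean - best) / std
    return (mean - best) * normal_cdf(z) + std * normal_pdf(z)


def pick_next(candidates, best, is_feasible):
    """Return the feasible candidate with the highest EI, or None."""
    feasible = [c for c in candidates if is_feasible(c)]
    if not feasible:
        return None
    return max(
        feasible,
        key=lambda c: expected_improvement(c["mean"], c["std"], best),
    )


# Hypothetical surrogate predictions of yield at three conditions.
candidates = [
    {"temp_C": 40, "mean": 0.62, "std": 0.05},
    {"temp_C": 60, "mean": 0.60, "std": 0.15},
    {"temp_C": 120, "mean": 0.70, "std": 0.30},
]

# Hypothetical physical constraint: stay at or below 80 degrees Celsius.
chosen = pick_next(candidates, best=0.65, is_feasible=lambda c: c["temp_C"] <= 80)
```

Filtering before ranking means an infeasible condition is never suggested, however attractive its EI looks. That is one simple way to combine a physical constraint with an EI-style criterion; the paper's own acquisition rules may differ.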

Who benefits

Chemical, Pharmaceutical, Materials Science, Biotechnology, Energy

Key takeaways

  • PC-MCMC-CIGP combines MCMC and CIGP for robust reaction network discovery.
  • It effectively extracts interpretable governing equations from sparse chemical data.
  • The workflow incorporates physical constraints and uncertainty-aware experimental design.
  • It improves reaction yield optimization and distinguishes true pathways from false fits.

Original post by Runzhe Liu, Zihao Wang, Wenbo Yang, Shengyang Tao

"arXiv:2606.23757v1 Announce Type: new Abstract: Extracting interpretable governing equations from sparse, noisy chemical time-series data remains difficult because discrete reaction topology and continuous kinetic parameters are tightly coupled. We present PC-MCMC-CIGP, a reprodu…"

