LLMs Guide ODE Discovery for Rare Diseases from Aggregate Data

Hanning Yang, Meropi Karakioulaki, Lennart Purucker, Tim Litwin, Cristina Has, Moritz Hess · July 2, 2026

Summary

AgentODE, a new framework, uses LLMs to propose ODE structures and a tool-augmented agent to refine parameter distributions from population-level summary statistics. This enables mechanistic modeling for rare diseases with scarce, noisy data, recovering consistent ODE structures where individual-level data methods fail.

Mechanistic modeling using ordinary differential equations (ODEs) is invaluable for understanding complex dynamics, particularly in clinical settings where interpretable descriptions are crucial. However, rare diseases present unique challenges: both the model structure and parameters are often unknown, and individual-level data is scarce, noisy, and subject to privacy concerns. Population-level summary statistics offer a practical, privacy-preserving alternative, but existing methods struggle to jointly discover ODE structure and refine parameter distributions solely from such aggregate data.

To address this gap, researchers introduce AgentODE, an end-to-end framework. AgentODE employs a large language model (LLM) to propose candidate ODE structures. A separate, tool-augmented inference agent then iteratively refines parameter distributions through a diagnosis-update loop, operating exclusively on population-level summary statistics. This approach allows for modeling parameters as distributions rather than fixed values, capturing inherent heterogeneity.

Evaluations on three benchmark problems and two clinical datasets, including the rare disease recessive dystrophic epidermolysis bullosa (RDEB) with very limited patient data, demonstrate AgentODE's effectiveness. It consistently recovers functionally sound ODE structures across all settings. Notably, for sparse and noisy data like RDEB, reasoning from summary statistics led to mechanistically principled structure discovery, outperforming baselines that, despite having access to individual-level data, yielded implausible structures. AgentODE opens new avenues for mechanistic modeling in data-scarce and privacy-sensitive contexts.
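To make the idea of refining a parameter distribution from summary statistics concrete, here is a minimal sketch. It is not AgentODE's implementation: the first-order decay structure, the log-normal rate distribution, the squared-error "diagnosis", and the coordinate-search "update" are all assumptions chosen for illustration, and every function name is hypothetical. The loop only ever sees the population mean and standard deviation at each time point, never individual trajectories.

```python
import math
import random

def rk4_step(f, x, k, dt):
    """One classical Runge-Kutta step for dx/dt = f(x, k)."""
    a = f(x, k)
    b = f(x + 0.5 * dt * a, k)
    c = f(x + 0.5 * dt * b, k)
    d = f(x + dt * c, k)
    return x + dt * (a + 2 * b + 2 * c + d) / 6.0

def decay(x, k):
    """Assumed toy structure: first-order decay, dx/dt = -k * x."""
    return -k * x

def population_summary(mu, sigma, times, n=100, seed=0, x0=1.0, dt=0.25):
    """Mean and SD of x(t) over n virtual patients with k ~ LogNormal(mu, sigma)."""
    rng = random.Random(seed)
    rates = [math.exp(mu + sigma * rng.gauss(0.0, 1.0)) for _ in range(n)]
    states = [x0] * n
    t_prev, means, sds = 0.0, [], []
    for t in times:
        for _ in range(round((t - t_prev) / dt)):
            states = [rk4_step(decay, x, k, dt) for x, k in zip(states, rates)]
        m = sum(states) / n
        means.append(m)
        sds.append(math.sqrt(sum((x - m) ** 2 for x in states) / (n - 1)))
        t_prev = t
    return means, sds

def loss(params, observed, times):
    """Diagnosis: squared mismatch between simulated and observed summaries."""
    sim_means, sim_sds = population_summary(params[0], params[1], times)
    obs_means, obs_sds = observed
    return (sum((a - b) ** 2 for a, b in zip(sim_means, obs_means))
            + sum((a - b) ** 2 for a, b in zip(sim_sds, obs_sds)))

def refine(observed, times, start=(0.0, 0.1), step=0.2, iterations=40):
    """Update: coordinate search over (mu, sigma), halving the step when stuck."""
    best, best_loss = start, loss(start, observed, times)
    for _ in range(iterations):
        mu, sigma = best
        candidates = [(mu + step, sigma), (mu - step, sigma),
                      (mu, sigma + step), (mu, max(sigma - step, 1e-3))]
        cand_loss, cand = min((loss(c, observed, times), c) for c in candidates)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
        else:
            step *= 0.5
    return best, best_loss

TIMES = [1.0, 2.0, 3.0, 4.0]
TRUE_MU, TRUE_SIGMA = math.log(0.5), 0.3
# Synthetic stand-in for published summaries; only mean and SD are retained.
OBSERVED = population_summary(TRUE_MU, TRUE_SIGMA, TIMES, n=2000, seed=123)
INITIAL_LOSS = loss((0.0, 0.1), OBSERVED, TIMES)
(MU_HAT, SIGMA_HAT), FINAL_LOSS = refine(OBSERVED, TIMES)
print(f"mu={MU_HAT:.3f} sigma={SIGMA_HAT:.3f} loss={FINAL_LOSS:.5f}")
```

Fitting `sigma` alongside `mu` is what lets the model express between-patient heterogeneity: the reported standard deviations constrain the spread of the rate parameter, while the means mainly constrain its location.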

Why it matters

Professionals in healthcare, particularly those involved in rare disease research and drug development, can leverage this framework to build interpretable mechanistic models from limited, privacy-sensitive data, accelerating understanding and therapeutic strategies.

How to implement this in your domain

  1. Explore AgentODE for developing mechanistic models in rare disease research or other data-scarce biological systems.
  2. Investigate using LLMs to generate initial ODE structures based on domain knowledge and existing literature.
  3. Apply tool-augmented inference agents to refine model parameters using only population-level summary statistics to maintain privacy.
  4. Compare AgentODE's structure discovery and parameter inference capabilities against traditional methods when working with limited and noisy data.
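The comparison of candidate structures in the steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's procedure: the three candidate right-hand sides are hard-coded stand-ins for what an LLM might propose, each is fitted with a single representative rate by grid search against a reported cohort mean, and the names `CANDIDATES`, `fit_rate`, and `rank_structures` are hypothetical.

```python
import math

def rk4_step(f, x, k, dt):
    """One classical Runge-Kutta step for dx/dt = f(x, k)."""
    a = f(x, k)
    b = f(x + 0.5 * dt * a, k)
    c = f(x + 0.5 * dt * b, k)
    d = f(x + dt * c, k)
    return x + dt * (a + 2 * b + 2 * c + d) / 6.0

# Hard-coded stand-ins for LLM-proposed structures (assumption for illustration).
CANDIDATES = {
    "first_order_decay": lambda x, k: -k * x,
    "zero_order_decay": lambda x, k: -k,
    "logistic_growth": lambda x, k: k * x * (1.0 - x),
}

def trajectory(f, k, times, x0=1.0, dt=0.1):
    """Integrate dx/dt = f(x, k) and return x at each observation time."""
    x, t_prev, out = x0, 0.0, []
    for t in times:
        for _ in range(round((t - t_prev) / dt)):
            x = rk4_step(f, x, k, dt)
        out.append(x)
        t_prev = t
    return out

def fit_rate(f, observed_mean, times):
    """Grid-search the rate k that minimises squared error to the cohort mean."""
    grid = [0.05 * i for i in range(1, 41)]
    scored = []
    for k in grid:
        sim = trajectory(f, k, times)
        scored.append((sum((s - o) ** 2 for s, o in zip(sim, observed_mean)), k))
    return min(scored)  # (error, best k)

def rank_structures(observed_mean, times):
    """Return candidate names sorted from best to worst fit."""
    results = {name: fit_rate(f, observed_mean, times)
               for name, f in CANDIDATES.items()}
    return sorted(results.items(), key=lambda item: item[1][0])

TIMES = [1.0, 2.0, 3.0, 4.0]
# Synthetic cohort mean generated from first-order decay with k = 0.5.
OBSERVED_MEAN = [math.exp(-0.5 * t) for t in TIMES]
RANKING = rank_structures(OBSERVED_MEAN, TIMES)
for name, (error, k) in RANKING:
    print(f"{name}: error={error:.6f} at k={k:.2f}")
```

In practice the candidate list would come from the LLM and the scoring would account for the full parameter distribution, but the same pattern applies: score each proposed structure against the aggregate data and keep the ones that fit and remain mechanistically plausible.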

Who benefits

Healthcare, Pharmaceuticals, Biotechnology, Medical Research

Key takeaways

  • AgentODE enables LLM-guided ODE discovery and parameter inference from aggregate data.
  • It addresses challenges of data scarcity, noise, and privacy in rare disease modeling.
  • In sparse settings, reasoning from summary statistics can yield more mechanistically principled models than fitting individual-level data.
  • The framework offers a new approach for interpretable mechanistic modeling in sensitive domains.

Original post by Hanning Yang, Meropi Karakioulaki, Lennart Purucker, Tim Litwin, Cristina Has, Moritz Hess

"arXiv:2607.00733v1 Announce Type: new Abstract: Mechanistic modeling via ordinary differential equations (ODEs) provides interpretable descriptions of complex dynamics and enables inference of underlying mechanisms, which is particularly valuable in clinical settings. However, in…"


