Zero-Shot Size Transfer for Graph Neural ODEs Proven

Mingsong Yan, Zhida Wang, Sui Tang · June 26, 2026

Summary

This paper develops a quantitative theory for zero-shot size transfer in Graph Neural Differential Equations (GNDEs) on sparse random graphs. It establishes trajectory-wise convergence of GNDE solutions to Graphon-NDE solutions and uniform-in-time convergence for adjoint systems, supporting training on small graphs for deployment on larger ones.

Graph Neural Differential Equations (GNDEs) model continuous-time dynamics on graphs by using Graph Neural Networks to parameterize the velocity field of a Neural ODE. Because their filters are local and size-independent, they suggest a "zero-shot size transfer" principle: a model trained on a smaller graph can be applied directly to larger, structurally similar graphs without retraining. This research provides a rigorous quantitative theory for that principle on sparse random graphs sampled from graphons.

The study introduces Graphon Neural Differential Equations (Graphon-NDEs) and their adjoint counterparts as the infinite-node limits of GNDE systems and proves that they are well-posed. For an $n$-node random graph, GNDE solutions converge to Graphon-NDE solutions at a rate of $O((\alpha_n n)^{-1/2})$, up to logarithmic factors, with high probability. The same convergence is established, uniformly in time, for the adjoint systems used to compute gradients during training.

The paper also compares the "discretize-then-optimize" (DTO) and "optimize-then-discretize" (OTD) training approaches. It shows that the two are asymptotically consistent: the discrepancies in their hidden-state and parameter gradients shrink as the number of discretization steps grows. Experiments on several graphon classes confirm the theoretical rates and show that learned GNDEs can be deployed accurately on larger, independently sampled graphs.
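To make the convergence statement concrete, here is a minimal NumPy sketch, not the authors' code. It assumes $\alpha_n$ is a sparsity factor that scales the edge probabilities (the summary does not define it) and sets $\alpha_n = n^{-1/4}$ purely for illustration. The graphon, the two scalar weights, the tanh velocity field, and the initial state are all hypothetical choices. A fine deterministic grid stands in for the Graphon-NDE limit, and the same fixed weights are run on sampled graphs of increasing size.

```python
import numpy as np

def graphon(x, y):
    # Hypothetical smooth graphon with values in [0, 1] (illustrative choice).
    return 0.5 * (1.0 + np.cos(np.pi * (x - y)))

def sample_sparse_graph(n, alpha, rng):
    # Latent positions x_i ~ U[0, 1]; edge (i, j) appears with
    # probability alpha * W(x_i, x_j). alpha plays the role of alpha_n.
    x = np.sort(rng.uniform(size=n))
    probs = alpha * graphon(x[:, None], x[None, :])
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)  # symmetric, no self-loops
    return x, adj

def integrate(op, h0, w_self, w_nbr, t_end=1.0, steps=50):
    # Forward Euler for dh/dt = tanh(w_self * h + w_nbr * (op @ h)).
    h, dt = h0.copy(), t_end / steps
    for _ in range(steps):
        h = h + dt * np.tanh(w_self * h + w_nbr * (op @ h))
    return h

def initial_state(x):
    return np.sin(2.0 * np.pi * x)

def limit_solution(w_self, w_nbr, m=2000):
    # Proxy for the Graphon-NDE: the integral operator on a midpoint grid.
    grid = (np.arange(m) + 0.5) / m
    op = graphon(grid[:, None], grid[None, :]) / m
    return grid, integrate(op, initial_state(grid), w_self, w_nbr)

def transfer_error(n, grid, h_limit, w_self, w_nbr, rng):
    alpha = n ** -0.25  # assumed sparsity schedule
    x, adj = sample_sparse_graph(n, alpha, rng)
    # Normalizing by alpha * n makes the aggregation a Monte Carlo
    # estimate of the graphon integral operator.
    h_n = integrate(adj / (alpha * n), initial_state(x), w_self, w_nbr)
    h_ref = np.interp(x, grid, h_limit)  # limit evaluated at the sampled nodes
    return np.sqrt(np.mean((h_n - h_ref) ** 2))

rng = np.random.default_rng(0)
w_self, w_nbr = -0.5, 1.0  # fixed, size-independent weights
grid, h_limit = limit_solution(w_self, w_nbr)
errors = {n: transfer_error(n, grid, h_limit, w_self, w_nbr, rng)
          for n in (100, 400, 1600)}
for n, err in errors.items():
    print(n, round(float(err), 4), "alpha_n * n =", round(n ** 0.75, 1))
```

If the rate in the summary applies to this toy setting, the printed error should shrink roughly like $(\alpha_n n)^{-1/2}$ as $n$ grows. A single random draw per size is noisy, so averaging over several seeds would give a cleaner picture.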

Why it matters

For practitioners working with graph-structured data, this research gives theoretical backing for training Graph Neural ODEs on smaller graphs and deploying them on much larger ones without retraining, provided the larger graphs are structurally similar to the training graphs. This can improve scalability and reduce the computational cost of training.

How to implement this in your domain

  1. Design Graph Neural Differential Equations (GNDEs) with the expectation of zero-shot size transfer for scalability.
  2. Train GNDEs on smaller, representative graph datasets to reduce computational costs.
  3. Deploy trained GNDEs on larger, unseen graphs without requiring additional retraining.
  4. Consider the theoretical convergence rates when selecting graph sizes and discretization steps for training and deployment.
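The steps above can be sketched end to end on synthetic data. This is an illustrative toy under stated assumptions, not the paper's experimental setup: the graphon, the two-parameter tanh velocity field, the sparsity schedule $\alpha_n = n^{-1/4}$, and the "ground-truth" weights `TRUE_W` are all hypothetical. A coarse grid search stands in for gradient-based training to keep the example dependency-free.

```python
import numpy as np

TRUE_W = (-0.5, 1.0)  # hypothetical weights, used only to synthesize targets

def graphon(x, y):
    # Hypothetical graphon with values in [0, 1].
    return 0.5 * (1.0 + np.cos(np.pi * (x - y)))

def sample_operator(n, rng):
    # Sparse random graph from the graphon, returned as the normalized
    # aggregation operator A / (alpha_n * n), with alpha_n = n^(-1/4) assumed.
    alpha = n ** -0.25
    x = np.sort(rng.uniform(size=n))
    probs = alpha * graphon(x[:, None], x[None, :])
    upper = np.triu(rng.uniform(size=(n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return x, adj / (alpha * n)

def integrate(op, h0, w_self, w_nbr, t_end=1.0, steps=50):
    # Forward Euler for dh/dt = tanh(w_self * h + w_nbr * (op @ h)).
    h, dt = h0.copy(), t_end / steps
    for _ in range(steps):
        h = h + dt * np.tanh(w_self * h + w_nbr * (op @ h))
    return h

def make_target(m=1000):
    # Synthetic supervision: final state of the graphon dynamics on a fine grid.
    grid = (np.arange(m) + 0.5) / m
    op = graphon(grid[:, None], grid[None, :]) / m
    return grid, integrate(op, np.sin(2.0 * np.pi * grid), *TRUE_W)

def loss(weights, x, op, grid, target):
    # Mean squared error between the GNDE output and the target at the nodes.
    pred = integrate(op, np.sin(2.0 * np.pi * x), *weights)
    return float(np.mean((pred - np.interp(x, grid, target)) ** 2))

rng = np.random.default_rng(1)
grid, target = make_target()

# Steps 1-2: fit the size-independent weights on a small sampled graph.
x_small, op_small = sample_operator(150, rng)
candidates = [(a, b) for a in np.linspace(-1, 1, 9) for b in np.linspace(-1, 1, 9)]
learned = min(candidates, key=lambda w: loss(w, x_small, op_small, grid, target))

# Step 3: deploy the same weights on a larger, independently sampled graph.
x_large, op_large = sample_operator(2000, rng)
loss_trained = loss(learned, x_large, op_large, grid, target)
loss_untrained = loss((0.0, 0.0), x_large, op_large, grid, target)

print("learned weights:", tuple(float(w) for w in learned))
print("large-graph loss, trained vs. untrained:", loss_trained, loss_untrained)
```

For step 4, the small-graph size (150 here) and the number of Euler steps are the two knobs the theory speaks to: a larger $\alpha_n n$ should tighten the match to the limit dynamics, and more steps should reduce discretization error.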

Who benefits

Social Networks, Drug Discovery, Logistics, Cybersecurity, Materials Science

Key takeaways

  • Zero-shot size transfer for GNDEs on sparse random graphs is theoretically proven.
  • GNDE solutions converge to Graphon-NDE limits with quantifiable rates.
  • Adjoint systems also exhibit uniform-in-time convergence.
  • Training on small graphs for deployment on large ones is validated.
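The adjoint takeaway ties into the DTO/OTD consistency claim in the summary, which can be illustrated on a toy system. The sketch below is an assumption-laden example, not the paper's scheme. It uses a hypothetical graphon operator on a deterministic grid, a two-weight tanh velocity field, forward Euler as the discretization, and a simple right-endpoint discretization of the continuous adjoint for the OTD side. It compares the gradient with respect to the neighbor weight under both approaches.

```python
import numpy as np

def make_operator(n):
    # Deterministic normalized operator from a hypothetical graphon
    # evaluated on a midpoint grid (no random sampling in this example).
    x = (np.arange(n) + 0.5) / n
    w = 0.5 * (1.0 + np.cos(np.pi * (x[:, None] - x[None, :])))
    return x, w / n

def forward(op, h0, w_self, w_nbr, t_end, steps):
    # Forward Euler; returns the list of states h_0, ..., h_K.
    dt = t_end / steps
    states = [h0]
    for _ in range(steps):
        h = states[-1]
        states.append(h + dt * np.tanh(w_self * h + w_nbr * (op @ h)))
    return states

def loss(op, h0, y, w_self, w_nbr, t_end, steps):
    h_final = forward(op, h0, w_self, w_nbr, t_end, steps)[-1]
    return 0.5 * float(np.mean((h_final - y) ** 2))

def nbr_gradient(op, states, y, w_self, w_nbr, t_end, mode):
    # Gradient of the loss with respect to w_nbr.
    steps = len(states) - 1
    dt = t_end / steps
    lam = (states[-1] - y) / y.size  # dL/dh_K, the terminal adjoint
    grad = 0.0
    for k in reversed(range(steps)):
        # DTO backpropagates through the Euler step, so it evaluates at h_k.
        # This OTD variant discretizes the continuous adjoint ODE backward
        # in time with explicit Euler, so it evaluates at h_{k+1}.
        h = states[k] if mode == "dto" else states[k + 1]
        agg = op @ h
        s = 1.0 - np.tanh(w_self * h + w_nbr * agg) ** 2  # tanh derivative
        grad += dt * float(np.dot(lam * s, agg))
        lam = lam + dt * (w_self * s * lam + w_nbr * (op.T @ (s * lam)))
    return grad

x, op = make_operator(200)
h0, y = np.sin(2.0 * np.pi * x), np.cos(np.pi * x)  # hypothetical data
w_self, w_nbr, t_end = -0.5, 1.0, 1.0

gaps = {}
for steps in (10, 100, 1000):
    states = forward(op, h0, w_self, w_nbr, t_end, steps)
    dto = nbr_gradient(op, states, y, w_self, w_nbr, t_end, "dto")
    otd = nbr_gradient(op, states, y, w_self, w_nbr, t_end, "otd")
    gaps[steps] = abs(dto - otd)
    print(steps, "steps: |DTO - OTD| =", gaps[steps])
```

In this toy setting the two gradients differ only in where the Jacobian is evaluated within each step, so the gap is expected to shrink as the step count increases. The paper's actual discretization schemes and rates are not given in the summary and may differ.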

Original post by Mingsong Yan, Zhida Wang, Sui Tang

"arXiv:2606.26662v1 Announce Type: new Abstract: Graph Neural Differential Equations (GNDEs) model continuous-time graph dynamics by parameterizing Neural ODE velocity fields with Graph Neural Networks. Their local, size-independent filters suggest a zero-shot size-transfer princi…"

