Emyx Accelerates All-Atom Protein Generation with High Efficiency

Nicholas J. Williams, Ward Haddadin, Matteo P. Ferla, Constantin Schneider, Nicholas B. Woodall, Ruby Sedgwick, Christian D. Madsen, Andrew L. Hopkins, Edward O. Pyzer-Knapp · June 19, 2026

Summary

Emyx is a new 140M-parameter conditional flow matching model for all-atom protein generation, aimed in particular at enzyme design. It trains approximately four times faster while outperforming larger existing models such as Proteina-Complexa and RFdiffusion3 on success rate, structural novelty, diversity, and geometric validity.

Computational enzyme design requires generative models capable of producing proteins with high geometric accuracy and structural diversity to scaffold catalytic residues and ligands effectively. Existing all-atom protein generators often suffer from high computational costs and limited sample diversity, largely due to inheriting complex architectures from structure prediction tasks. This research introduces Emyx, a novel 140-million-parameter conditional flow matching model that rethinks the architecture for generative tasks. Emyx concentrates its capacity within standard transformer blocks, replacing heavy embedding stacks with lightweight conditional representations and sparse connectivity. This design choice significantly reduces the model's complexity and training overhead. Emyx also features an exact reparametrization of the flow matching interpolant into the EDM noise-level framework, bridging training efficiency with state-of-the-art sampling methods without requiring retraining. Despite being a much smaller model, Emyx outperforms leading generators like Proteina-Complexa and RFdiffusion3 across critical benchmarks, including global fold recovery, catalytic geometry accuracy, structural novelty, scaffold diversity, and geometric validity, all while training approximately four times faster.
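To make the reparametrization idea concrete, here is a minimal sketch of how a linear flow matching interpolant can be rewritten in terms of an EDM-style noise level. It is not Emyx's code. It assumes the common convention x_t = (1 - t) * data + t * noise, with t = 0 as clean data and t = 1 as pure noise; the paper's exact convention may differ. All function names are hypothetical.

```python
# Sketch only: assumes x_t = (1 - t) * data + t * noise (t = 0 is data).
# Function names are illustrative and do not come from the Emyx paper.
from typing import List


def t_to_sigma(t: float) -> float:
    """Map flow matching time t in [0, 1) to an EDM-style noise level."""
    return t / (1.0 - t)


def sigma_to_t(sigma: float) -> float:
    """Inverse map: noise level sigma >= 0 back to flow matching time."""
    return sigma / (1.0 + sigma)


def interpolate(x_data: List[float], noise: List[float], t: float) -> List[float]:
    """Linear flow matching interpolant between data and noise."""
    return [(1.0 - t) * d + t * n for d, n in zip(x_data, noise)]


def to_edm_frame(x_t: List[float], t: float) -> List[float]:
    """Rescale x_t by 1 / (1 - t) so that it equals data + sigma * noise."""
    scale = 1.0 - t
    return [v / scale for v in x_t]


def velocity_to_denoised(x_t: List[float], velocity: List[float], t: float) -> List[float]:
    """Turn a velocity prediction (noise - data) into a clean-data estimate."""
    return [x - t * v for x, v in zip(x_t, velocity)]


if __name__ == "__main__":
    x_data = [1.0, -2.0, 0.5]
    noise = [0.2, 0.4, -1.0]
    t = 0.3
    x_t = interpolate(x_data, noise, t)
    print("sigma:", t_to_sigma(t))
    print("EDM-frame sample:", to_edm_frame(x_t, t))
    # With the exact velocity, the denoised estimate recovers x_data.
    exact_velocity = [n - d for d, n in zip(x_data, noise)]
    print("denoised:", velocity_to_denoised(x_t, exact_velocity, t))
```

Under these assumptions the change of variables is exact, because dividing the interpolant by (1 - t) gives data + sigma * noise with sigma = t / (1 - t). This is the sense in which a model trained with flow matching could be driven by a noise-level-based sampler without retraining.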

Why it matters

Professionals in biotechnology, pharmaceuticals, and materials science can leverage Emyx to accelerate the design and discovery of novel enzymes and proteins, leading to faster development cycles for new drugs, industrial catalysts, and biomaterials.

How to implement this in your domain

  1. Adopt Emyx for rapid and cost-effective generation of novel protein structures in research and development.
  2. Integrate conditional flow matching models into enzyme engineering workflows for targeted catalyst design.
  3. Explore Emyx's capabilities for generating diverse protein scaffolds for various biological applications.
  4. Benchmark Emyx against existing protein design tools to optimize internal computational pipelines.

Who benefits

Biotechnology, Pharmaceuticals, Drug Discovery, Materials Science

Key takeaways

  • Emyx is a fast and efficient all-atom protein generation model.
  • It uses a conditional flow matching approach with a smaller parameter count.
  • Emyx trains approximately four times faster and improves sample diversity.
  • It outperforms leading models in success rate, novelty, and geometric validity.

Original post by Nicholas J. Williams, Ward Haddadin, Matteo P. Ferla, Constantin Schneider, Nicholas B. Woodall, Ruby Sedgwick, Christian D. Madsen, Andrew L. Hopkins, Edward O. Pyzer-Knapp

"arXiv:2606.19377v1 Announce Type: new Abstract: Computational enzyme design requires generating proteins that scaffold catalytic residues and ligands, a task that demands both geometric accuracy and structural diversity from the underlying generative model. Current all-atom gener…"

