Transformer for Jet Tagging Implemented on Versal AI Engines.

Gram Koski, Sean Lipps, Zhenghua Ma, G. Abarajithan, Ryan Kastner · June 17, 2026

Summary

This paper presents an initial implementation of a quantized, integer-only transformer model for jet tagging on AMD Versal AI Engines (AIEs), addressing the challenge of deploying high-performance models in low-latency, resource-constrained trigger systems. A reusable software framework is introduced to automatically generate Vitis graph code from a high-level Python model description.

Transformer-based models achieve strong performance in jet tagging, a crucial task at the CERN Large Hadron Collider (LHC). However, deploying them in the LHC's low-latency, resource-constrained trigger systems is a significant engineering challenge. The authors address this with an initial implementation of a quantized, integer-only transformer designed for jet tagging. The model is mapped onto AMD Versal AI Engines (AIEs), with its dense and multi-head attention layers running on AIE tiles. A key contribution is a reusable software framework that simplifies deployment: it represents transformer layers as composable AIE building blocks and automatically generates the Vitis graph code from a high-level Python model description. The open-source framework provides a foundation for future work on deploying such models on reconfigurable hardware.
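To make "integer-only" concrete, here is a minimal sketch of a dense layer that uses nothing but integer multiply, add, shift, and clamp. It is not the authors' AIE kernel: the function names, the int8 output range, and the choice of a power-of-two rescale (a right shift) are assumptions made for illustration, since the summary does not describe the paper's quantization scheme.

```python
from typing import List


def saturate_int8(value: int) -> int:
    """Clamp an integer to the signed 8-bit range [-128, 127]."""
    return max(-128, min(127, value))


def int_dense(x: List[int], weights: List[List[int]],
              bias: List[int], shift: int) -> List[int]:
    """Integer-only dense layer (hypothetical helper, not the paper's kernel).

    Computes saturate_int8((W @ x + bias) >> shift) for each output row.
    The right shift stands in for multiplying by a scale of 2**-shift.
    """
    outputs = []
    for row, b in zip(weights, bias):
        acc = b  # Python ints do not overflow; hardware would use a wide accumulator
        for xi, wi in zip(x, row):
            acc += xi * wi
        if shift > 0:
            # Add half of the divisor before shifting so ties round upward
            acc = (acc + (1 << (shift - 1))) >> shift
        outputs.append(saturate_int8(acc))
    return outputs


print(int_dense([10, -20, 30], [[1, 2, 3], [4, 5, 6]], [0, 8], shift=2))  # [15, 32]
```

Because every step is an integer operation, the same arithmetic can in principle be reproduced bit-for-bit on fixed-point hardware, which is the usual motivation for integer-only inference.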

Why it matters

This work shows how a quantized transformer can be deployed on specialized hardware for real-time processing of scientific data, particularly in high-energy physics. Practitioners in scientific computing, hardware acceleration, and embedded AI can apply the same approach to high-throughput, low-latency applications.

How to implement this in your domain

  1. Explore the use of AMD Versal AI Engines for accelerating transformer-based models in latency-critical applications.
  2. Utilize the open-source software framework to streamline the deployment of quantized AI models on reconfigurable hardware.
  3. Investigate integer-only quantization techniques for optimizing AI models for embedded and edge devices.
  4. Apply the principles of composable building blocks for AI layers to other hardware acceleration projects.
  5. Collaborate with hardware vendors to push the boundaries of AI model deployment on specialized accelerators.
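As a starting point for step 3, the sketch below converts floating-point weights to 8-bit integers using a power-of-two scale, so that later rescaling reduces to a bit shift. This is one common scheme, chosen here as an assumption; the summary does not say which quantization method the paper uses, and the names `quantize_pow2` and `dequantize_pow2` are hypothetical.

```python
import math
from typing import List, Tuple


def quantize_pow2(values: List[float], bits: int = 8) -> Tuple[List[int], int]:
    """Return (integers, frac_bits) with value ~= integer / 2**frac_bits.

    Assumed scheme: symmetric, per-tensor, power-of-two scale.
    """
    qmax = (1 << (bits - 1)) - 1   # 127 for 8 bits
    qmin = -(1 << (bits - 1))      # -128 for 8 bits
    peak = max(abs(v) for v in values)
    if peak == 0.0:
        return [0] * len(values), 0
    # Largest exponent such that peak * 2**frac_bits still fits below qmax
    frac_bits = math.floor(math.log2(qmax / peak))
    scale = 2.0 ** frac_bits
    quantized = [max(qmin, min(qmax, round(v * scale))) for v in values]
    return quantized, frac_bits


def dequantize_pow2(quantized: List[int], frac_bits: int) -> List[float]:
    """Map integers back to floats to inspect the quantization error."""
    return [q / 2.0 ** frac_bits for q in quantized]


weights = [0.5, -0.25, 0.75]
q, frac_bits = quantize_pow2(weights)
print(q, frac_bits)                    # [64, -32, 96] 7
print(dequantize_pow2(q, frac_bits))   # [0.5, -0.25, 0.75]
```

Comparing the dequantized values against the originals gives a quick estimate of the accuracy lost at a given bit width before committing to a hardware mapping.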

Who benefits

High-Energy Physics, Scientific Computing, Aerospace & Defense, Edge AI, Semiconductor

Key takeaways

  • A quantized transformer for jet tagging is implemented on AMD Versal AI Engines.
  • The solution addresses low-latency, resource-constrained deployment challenges.
  • A reusable software framework automates Vitis graph code generation from Python.
  • This work provides a foundation for deploying advanced AI on reconfigurable hardware.
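The idea of describing a model as composable blocks in Python and generating graph code from it can be sketched as follows. Everything here is hypothetical: the class names, the layer sizes, the kernel naming, and the emitted text are illustrative only. The output is a schematic listing, not real Vitis graph code and not the output of the authors' framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Dense:
    """Hypothetical building block for a dense layer."""
    name: str
    in_features: int
    out_features: int

    def kernels(self) -> List[str]:
        return [f"{self.name}_matmul"]


@dataclass
class MultiHeadAttention:
    """Hypothetical building block for multi-head attention."""
    name: str
    embed_dim: int
    num_heads: int

    def kernels(self) -> List[str]:
        # Assumption for illustration: one kernel per head plus an output projection
        heads = [f"{self.name}_head{h}" for h in range(self.num_heads)]
        return heads + [f"{self.name}_proj"]


def generate_graph_text(layers, graph_name: str = "jet_tagger") -> str:
    """Emit a schematic graph listing that chains the blocks in order."""
    lines = [f"graph {graph_name} {{"]
    prev = "input"
    for layer in layers:
        lines.append(f"  block {layer.name}: {', '.join(layer.kernels())}")
        lines.append(f"  connect {prev} -> {layer.name}")
        prev = layer.name
    lines.append(f"  connect {prev} -> output")
    lines.append("}")
    return "\n".join(lines)


# Made-up layer sizes, chosen only to exercise the generator
model = [
    Dense("embed", 16, 64),
    MultiHeadAttention("attn", 64, 2),
    Dense("head", 64, 5),
]
print(generate_graph_text(model))
```

Keeping each layer responsible for listing its own kernels means a new layer type can be added without touching the generator, which is the main appeal of the composable-block design.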

Original post by Gram Koski, Sean Lipps, Zhenghua Ma, G. Abarajithan, Ryan Kastner

"arXiv:2606.17500v1 Announce Type: new Abstract: Transformer-based models achieve strong performance for jet tagging at the CERN LHC, but deploying them in low-latency, resource-constrained trigger systems is challenging. We present an initial implementation of a quantized, intege…"
