scGTN Improves Single-Cell RNA Sequencing Clustering with Graph Transformers

Jinke Wu, Yifan Wang, Siyu Yi, Caiyang Yu, Ziyue Qiao, Nan Yin, Jiancheng Lv, Wei Ju · June 18, 2026

Summary

Researchers propose scGTN, a deep Siamese Graph Transformer Network for clustering single-cell RNA sequencing (scRNA-seq) data. The framework integrates gene expression profiles with intercellular structural dependencies to address data sparsity, noise, and complex relationships between cells, and the authors report that it outperforms existing methods.

scGTN (deep Siamese Graph Transformer Network) is a framework for clustering single-cell RNA sequencing (scRNA-seq) data. Existing methods often struggle with the sparsity, noise, and complex intercellular structure inherent in scRNA-seq data. scGTN addresses these challenges in three steps. First, it formulates the scRNA-seq data as a graph and constructs two augmented views of that graph, which serve as dual perspectives capturing complementary intercellular information. Second, a Siamese graph transformer network processes both views and explicitly incorporates shortest-path information and node-wise distances, capturing richer structural relationships between cells. Third, an optimal transport strategy guides the cell clustering in a self-supervised manner. According to the authors, extensive experiments on multiple benchmark scRNA-seq datasets show that scGTN consistently outperforms existing methods, giving a more robust and accurate way to identify cell types and study cellular heterogeneity.
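To make the graph formulation more concrete, here is a minimal sketch of turning an expression matrix into a cell graph and deriving shortest-path hop counts from it. The summary does not say how scGTN builds its graph or encodes distances, so the k-nearest-neighbor construction, the Euclidean metric, the toy matrix, and the function names `knn_graph` and `shortest_path_hops` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import deque
import math


def knn_graph(expr, k):
    """Build a symmetric k-nearest-neighbor graph over cells (assumed construction)."""
    n = len(expr)
    adj = [set() for _ in range(n)]
    for i in range(n):
        # Euclidean distance from cell i to every other cell, nearest first
        dists = sorted(
            (math.dist(expr[i], expr[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i].add(j)
            adj[j].add(i)  # symmetrize so the graph is undirected
    return adj


def shortest_path_hops(adj):
    """All-pairs hop counts via breadth-first search; -1 marks unreachable pairs."""
    n = len(adj)
    hops = [[-1] * n for _ in range(n)]
    for src in range(n):
        hops[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if hops[src][v] == -1:
                    hops[src][v] = hops[src][u] + 1
                    queue.append(v)
    return hops


# Hypothetical toy data: 6 cells x 3 genes, forming two well-separated groups
expr = [
    [5.0, 0.0, 0.0],
    [4.0, 1.0, 0.0],
    [5.0, 1.0, 0.0],
    [0.0, 0.0, 6.0],
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 6.0],
]
adj = knn_graph(expr, k=2)
hops = shortest_path_hops(adj)
```

In graph transformers, a hop-count matrix like `hops` is commonly used to index a learned bias that is added to the attention scores between pairs of nodes. Whether scGTN uses exactly this encoding is not stated in the summary.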

Why it matters

Bioinformaticians and life science researchers can leverage scGTN to achieve more accurate and robust clustering of single-cell RNA sequencing data, leading to deeper insights into cellular heterogeneity and disease mechanisms.

How to implement this in your domain

  1. Adopt scGTN for single-cell RNA sequencing data analysis to improve cell type identification and clustering accuracy.
  2. Integrate graph transformer networks into bioinformatics pipelines for analyzing complex biological data.
  3. Utilize the Siamese network architecture to capture complementary information from dual data views.
  4. Apply optimal transport strategies for self-supervised clustering in high-dimensional biological datasets.
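The optimal transport step in the last item can be sketched with Sinkhorn iterations, a standard way to turn raw cell-to-cluster scores into soft assignments that spread mass evenly across clusters. This is a generic illustration under stated assumptions: the summary does not describe scGTN's actual formulation, and the function name `sinkhorn_assign`, the `epsilon` and `n_iters` parameters, and the toy scores are hypothetical. Plain Python is used to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
import math


def sinkhorn_assign(scores, epsilon=0.5, n_iters=50):
    """Balanced soft cluster assignments from an n_cells x n_clusters score matrix."""
    n, k = len(scores), len(scores[0])
    # Gibbs kernel; subtracting the global max keeps exp() numerically stable
    top = max(max(row) for row in scores)
    q = [[math.exp((s - top) / epsilon) for s in row] for row in scores]
    for _ in range(n_iters):
        # Scale columns so every cluster receives total mass 1/k
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (col[j] * k) for j in range(k)] for i in range(n)]
        # Scale rows so every cell carries total mass 1/n
        q = [[v / (sum(row) * n) for v in row] for row in q]
    # Rescale so each row is a probability distribution over clusters
    return [[v * n for v in row] for row in q]


# Hypothetical scores: three of four cells initially favor cluster 0
scores = [
    [2.0, 0.1],
    [1.8, 0.3],
    [1.5, 0.2],
    [0.2, 1.0],
]
assignments = sinkhorn_assign(scores)
cluster_mass = [sum(row[j] for row in assignments) for j in range(2)]
```

After the iterations, each row of `assignments` sums to 1 and `cluster_mass` is close to an equal split across clusters, which discourages the degenerate solution where every cell collapses into one cluster. In self-supervised setups, such balanced assignments are often used as pseudo-labels for training the encoder.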

Who benefits

Biotechnology, Pharmaceuticals, Healthcare, Academia (Life Sciences)

Key takeaways

  • scGTN improves single-cell RNA sequencing clustering by addressing sparsity and noise.
  • The framework integrates gene expression and intercellular structural dependencies.
  • Siamese graph transformer networks capture rich cellular relationships.
  • The authors report that it consistently outperforms existing methods on benchmark datasets.

Original post by Jinke Wu, Yifan Wang, Siyu Yi, Caiyang Yu, Ziyue Qiao, Nan Yin, Jiancheng Lv, Wei Ju

"arXiv:2606.18672v1 Announce Type: new Abstract: Single-cell RNA sequencing (scRNA-seq) serves a pivotal role in characterizing gene expression at the cellular level, enabling the identification of cell types and advancing the understanding of cellular heterogeneity. Despite the s…"
