KAN-Guided Dynamic Graph Improves Single-Cell RNA-seq Clustering.

Jun Tang, Pengwei Hu, Sicong Gao, Jie Guo, Lun Hu, Xin Luo · June 30, 2026

Summary

scKDGM is a KAN-guided (Kolmogorov-Arnold Network) dynamic graph masked learning framework for single-cell RNA sequencing (scRNA-seq) clustering, aimed at challenges such as high dimensionality and noise. It combines graph-aware gene masking, a KAN-based encoder, and dynamic graph construction with cross-view contrastive learning to identify cell types more robustly.

Clustering single-cell RNA sequencing (scRNA-seq) data is central to identifying distinct cell types, but the data are high-dimensional, sparse, and affected by dropout events and technical noise. According to the authors, existing methods fall short in two ways: masked autoencoders focus mainly on expression recovery, and graph clustering methods rely on static k-nearest-neighbor (kNN) graphs with no feedback loop to refine the graph during optimization.

This research introduces scKDGM, a KAN-guided dynamic graph masked learning framework tailored for scRNA-seq clustering. It combines three components: Graph-Aware Distribution Preserving Gene Masking (GDP-Mask) to perturb cell identity, a KAN-based TAKGCN encoder that learns masked-view representations, and mask-guided expression recovery that is used to construct a dynamic graph. Cross-view contrastive learning then transfers recovery signals into topology updates, so the graph evolves as the recovered expression improves. A Zero-Inflated Negative Binomial (ZINB) loss models the overdispersion and zero inflation of the count data.

In experiments on 12 real scRNA-seq datasets, the authors report that scKDGM significantly outperforms 10 baseline methods in average Normalized Mutual Information (NMI) and Adjusted Rand Index (ARI), which they take as evidence of more robust cell type identification.
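To make the ZINB loss mentioned above concrete, here is a minimal sketch in plain Python of the per-count negative log-likelihood under the standard ZINB parameterization (mean mu, dispersion theta, zero-inflation probability pi). The function names and the toy parameter values are illustrative assumptions, not taken from the scKDGM code, and the paper's exact loss may differ in details such as regularization.

```python
import math


def nb_log_pmf(x, mu, theta):
    """Log-probability of count x under a negative binomial
    with mean mu > 0 and dispersion theta > 0."""
    return (
        math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
        + theta * math.log(theta / (theta + mu))
        + x * math.log(mu / (theta + mu))
    )


def zinb_nll(x, mu, theta, pi):
    """Negative log-likelihood of one count under a ZINB.

    pi is the probability that the count is a structural ("dropout") zero;
    with probability 1 - pi the count is drawn from the negative binomial.
    """
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        # A zero can come from dropout or from the negative binomial itself.
        p = pi + (1.0 - pi) * nb
    else:
        # A positive count can only come from the negative binomial part.
        p = (1.0 - pi) * nb
    return -math.log(p)


# Toy usage: average loss over a few counts for one gene (values are made up).
counts = [0, 0, 3, 1, 0, 7]
mean_loss = sum(zinb_nll(x, mu=2.0, theta=1.5, pi=0.3) for x in counts) / len(counts)
```

In a trained model, mu, theta, and pi would be predicted per gene and per cell by the decoder rather than fixed by hand as they are here.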

Why it matters

For professionals in bioinformatics, drug discovery, and medical research, more accurate and robust single-cell RNA-seq clustering means better identification of cell types, leading to deeper biological insights and more targeted therapeutic development.

How to implement this in your domain

  1. Adopt scKDGM for single-cell RNA-seq data analysis to improve cell type identification accuracy.
  2. Explore integrating KAN-based encoders into other bioinformatics pipelines for complex data representation.
  3. Apply dynamic graph construction and masked learning principles to other high-dimensional biological datasets.
  4. Benchmark scKDGM against current in-house clustering methods to assess potential performance gains.
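For the benchmarking step, the two metrics named in the paper (ARI and NMI) can be computed from a pair of label vectors. The sketch below is a standard-library implementation of the textbook definitions, with NMI normalized by the arithmetic mean of the two entropies; that normalization is an assumption, since the summary does not say which variant the authors use. The function names and toy labels are illustrative.

```python
import math
from collections import Counter


def _contingency(labels_true, labels_pred):
    """Count co-occurrences of (true, predicted) labels plus the marginals."""
    pairs = Counter(zip(labels_true, labels_pred))
    return pairs, Counter(labels_true), Counter(labels_pred)


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index: pair-counting agreement corrected for chance."""
    pairs, rows, cols = _contingency(labels_true, labels_pred)
    n = len(labels_true)
    if n < 2:
        return 1.0
    sum_pairs = sum(math.comb(c, 2) for c in pairs.values())
    sum_rows = sum(math.comb(c, 2) for c in rows.values())
    sum_cols = sum(math.comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / math.comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_pairs - expected) / (max_index - expected)


def normalized_mutual_info(labels_true, labels_pred):
    """Mutual information divided by the mean of the two label entropies."""
    pairs, rows, cols = _contingency(labels_true, labels_pred)
    n = len(labels_true)
    mi = sum(
        (c / n) * math.log(n * c / (rows[u] * cols[v]))
        for (u, v), c in pairs.items()
    )
    h_true = -sum((c / n) * math.log(c / n) for c in rows.values())
    h_pred = -sum((c / n) * math.log(c / n) for c in cols.values())
    mean_entropy = 0.5 * (h_true + h_pred)
    if mean_entropy == 0:
        return 1.0  # both partitions put every cell in one cluster
    return mi / mean_entropy


# Toy usage: compare two hypothetical clusterings against known cell types.
cell_types = [0, 0, 1, 1]
method_a = [1, 1, 0, 0]   # same partition, different cluster names
method_b = [0, 0, 1, 2]   # splits one cell type in two
scores = {
    "method_a": (adjusted_rand_index(cell_types, method_a),
                 normalized_mutual_info(cell_types, method_a)),
    "method_b": (adjusted_rand_index(cell_types, method_b),
                 normalized_mutual_info(cell_types, method_b)),
}
```

Both metrics ignore how clusters are named, so a method is not penalized for numbering its clusters differently from the reference annotation.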

Who benefits

Biotechnology, Pharmaceuticals, Healthcare, Academia, Diagnostics

Key takeaways

  • scKDGM is a new framework for robust single-cell RNA-seq clustering.
  • It addresses challenges like high dimensionality and noise using KANs and dynamic graphs.
  • The framework integrates gene masking, a KAN-based encoder, and contrastive learning.
  • The authors report that scKDGM significantly outperforms 10 baseline methods in average NMI and ARI across 12 real scRNA-seq datasets.
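The difference between a static and a dynamic graph can be illustrated with a toy example: a kNN graph built once from noisy expression keeps a dropout-induced edge, while a graph rebuilt from recovered expression corrects it. This is a generic sketch using Euclidean distance in plain Python, not the paper's graph-update procedure; the expression values and the "recovered" matrix are made up for illustration.

```python
import math


def knn_graph(cells, k):
    """Map each cell index to its k nearest neighbours by Euclidean distance."""
    graph = {}
    for i, ci in enumerate(cells):
        dists = sorted(
            (math.dist(ci, cj), j) for j, cj in enumerate(cells) if j != i
        )
        graph[i] = [j for _, j in dists[:k]]
    return graph


# Toy noisy expression (2 genes): dropout zeroed the second gene in cell 1,
# so cell 1 looks closer to the low-expression cells 2 and 3 than to cell 0.
noisy = [[5.0, 5.0], [4.0, 0.0], [0.0, 0.5], [0.0, 0.0]]

# Hypothetical recovered expression after the dropped value is restored.
recovered = [[5.0, 5.0], [4.0, 4.5], [0.0, 0.5], [0.0, 0.0]]

static_graph = knn_graph(noisy, k=1)       # built once, never revisited
dynamic_graph = knn_graph(recovered, k=1)  # rebuilt from improved expression
```

With these toy values, cell 1 is linked to cell 3 in the static graph and to cell 0 in the rebuilt one, which is the kind of topology correction a feedback loop between recovery and graph construction is meant to enable.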

Original post by Jun Tang, Pengwei Hu, Sicong Gao, Jie Guo, Lun Hu, Xin Luo

"arXiv:2606.28459v1. Abstract: Single-cell RNA sequencing (scRNA-seq) clustering is essential for identifying cell types, but high dimensionality, sparsity, dropout, and technical noise hinder robust expression representation and cell graph construction. Existing…"
