Clustering single-cell RNA-seq data with a model-based deep learning approach

Tian Tian, Ji Wan, Qi Song, Zhi Wei

Research output: Contribution to journal › Article › peer-review

166 Scopus citations


Single-cell RNA sequencing (scRNA-seq) promises to provide higher resolution of cellular differences than bulk RNA sequencing. Clustering transcriptomes profiled by scRNA-seq has been routinely conducted to reveal cell heterogeneity and diversity. However, clustering analysis of scRNA-seq data remains a statistical and computational challenge, due to the pervasive dropout events obscuring the data matrix with prevailing ‘false’ zero count observations. Here, we have developed scDeepCluster, a single-cell model-based deep embedded clustering method, which simultaneously learns feature representation and clustering via explicit modelling of scRNA-seq data generation. Based on testing extensive simulated data and real datasets from four representative single-cell sequencing platforms, scDeepCluster outperformed state-of-the-art methods under various clustering performance metrics and exhibited improved scalability, with running time increasing linearly with sample size. Its accuracy and efficiency make scDeepCluster a promising algorithm for clustering large-scale scRNA-seq data.

Original language: English (US)
Pages (from-to): 191-198
Number of pages: 8
Journal: Nature Machine Intelligence
Issue number: 4
State: Published - Apr 1 2019

All Science Journal Classification (ASJC) codes

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
  • Artificial Intelligence

