Clustering gene expression data using adaptive double self-organizing map

Habtom Ressom, Dali Wang, Padma Natarajan

Research output: Contribution to journal › Article › peer-review

18 Scopus citations

Abstract

This paper presents a novel clustering technique known as adaptive double self-organizing map (ADSOM). ADSOM has a flexible topology and performs clustering and cluster visualization simultaneously, thereby requiring no a priori knowledge about the number of clusters. ADSOM is based on a recently introduced technique known as double self-organizing map (DSOM). DSOM combines features of the popular self-organizing map (SOM) with two-dimensional position vectors, which serve as a visualization tool to decide how many clusters are needed. Although DSOM addresses the problem of identifying an unknown number of clusters, its free parameters are difficult to control in a way that guarantees correct results and convergence. ADSOM updates its free parameters during training, and it allows its position vectors to converge to a fairly consistent number of clusters, provided that its initial number of nodes is greater than the expected number of clusters. The number of clusters can be identified by visually counting the clusters formed by the position vectors after training. A novel index is introduced based on hierarchical clustering of the final locations of the position vectors. The index allows automated detection of the number of clusters, thereby reducing the human error that could be incurred from counting clusters visually. The reliability of ADSOM in identifying the number of clusters is demonstrated by applying it to publicly available gene expression data from multiple biological systems such as yeast, human, and mouse. ADSOM's performance in detecting the number of clusters is compared with that of a model-based clustering method.
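
The idea of counting clusters automatically from the final two-dimensional position vectors can be illustrated with a minimal sketch. The abstract does not define the paper's index, so the code below is not that index. It swaps in a generic heuristic as an assumption: run single-linkage hierarchical clustering on the position vectors and cut the merge sequence at the largest jump in merge distance. The function names `single_linkage_heights` and `estimate_cluster_count` and the sample coordinates are hypothetical, chosen only for illustration.

```python
import math


def single_linkage_heights(points):
    """Return the merge distances of single-linkage agglomerative clustering.

    Naive implementation, adequate for the small number of SOM nodes
    a map typically has.
    """
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None  # (distance, i, j) of the closest pair of clusters
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
        heights.append(d)
    return heights


def estimate_cluster_count(points):
    """Assumed heuristic (not the paper's index): cut at the largest gap
    between consecutive merge distances."""
    heights = single_linkage_heights(points)
    if len(heights) < 2:
        return 1
    gaps = [heights[k + 1] - heights[k] for k in range(len(heights) - 1)]
    k = max(range(len(gaps)), key=gaps.__getitem__)
    # Merges 0..k happen before the largest gap, leaving n - (k + 1) clusters.
    return len(points) - (k + 1)


# Hypothetical final position vectors: 12 nodes collapsed near 3 locations.
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
offsets = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
positions = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

print(estimate_cluster_count(positions))
```

For three or more points this largest-gap rule always reports at least two clusters, so it cannot detect the case where all position vectors converge to a single group. A practical index would need an additional criterion for that case.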

Original language: English (US)
Pages (from-to): 35-46
Number of pages: 12
Journal: Physiological Genomics
Volume: 14
State: Published - Oct 2003
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Physiology
  • Genetics

Keywords

  • Cluster visualization
  • Detecting number of clusters
  • Microarray
  • Unsupervised learning
