Abstract
Background: A widely used approach for extracting information from gene expression data employs the construction of a gene co-expression network and the subsequent computational detection of gene clusters, called modules. WGCNA and related methods are the de facto standard for module detection. The purpose of this work is to investigate the applicability of more sophisticated algorithms toward the design of an alternative method with enhanced potential for extracting biologically meaningful modules. Results: We present the self-learning gene clustering pipeline (SGCP), a spectral method for detecting modules in gene co-expression networks. SGCP incorporates multiple features that differentiate it from previous work, including a novel self-learning step that leverages gene ontology (GO) information. Compared with widely used existing frameworks on 12 real gene expression datasets, we show that SGCP yields modules with higher GO enrichment. Moreover, SGCP assigns the highest statistical importance to GO terms that are mostly different from those reported by the baselines. Conclusion: Existing frameworks for discovering clusters of genes in gene co-expression networks are based on relatively simple algorithmic components. SGCP relies on newer algorithmic techniques that enable the computation of highly enriched modules with distinctive characteristics, thus contributing a novel alternative tool for gene co-expression analysis.
Original language | English (US) |
---|---|
Article number | 230 |
Journal | BMC Bioinformatics |
Volume | 25 |
Issue number | 1 |
State | Published - Dec 2024 |
Externally published | Yes |
All Science Journal Classification (ASJC) codes
- Structural Biology
- Biochemistry
- Molecular Biology
- Computer Science Applications
- Applied Mathematics
Keywords
- GO enrichment
- Gene co-expression networks
- Gene modules
- WGCNA