TY - GEN
T1 - New filter-based feature selection criteria for identifying differentially expressed genes
AU - Loo, Lit Hsin
AU - Roberts, Samuel
AU - Hrebien, Leonid
AU - Kam, Moshe
PY - 2005
Y1 - 2005
N2 - We propose two new filter-based feature selection criteria for identifying differentially expressed genes, namely the average difference score (ADS) and the mean difference score (MDS). These criteria replace the serial noise estimator used in existing criteria by a parallel noise estimator. The result is better detection of changes in the variance of expression levels, which t-statistic type criteria tend to under-emphasize. We compare the performance of the new criteria to that of several commonly used feature selection criteria, including the Welch t-statistic, the Fisher correlation score, the Wilcoxon rank sum, and the Independently Consistent Expression discriminator, on synthetic data and real biological data obtained from acute lymphoblastic leukemia and acute myeloid leukemia patients. We find that ADS and MDS outperform the other criteria by exhibiting higher sensitivity and comparable specificity. ADS is also able to flag several biologically important genes that are missed by the Welch t-statistic.
UR - http://www.scopus.com/inward/record.url?scp=33847287864&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33847287864&partnerID=8YFLogxK
U2 - 10.1109/ICMLA.2005.48
DO - 10.1109/ICMLA.2005.48
M3 - Conference contribution
AN - SCOPUS:33847287864
SN - 0769524958
SN - 9780769524955
T3 - Proceedings - ICMLA 2005: Fourth International Conference on Machine Learning and Applications
SP - 135
EP - 144
BT - Proceedings - ICMLA 2005
T2 - ICMLA 2005: 4th International Conference on Machine Learning and Applications
Y2 - 15 December 2005 through 17 December 2005
ER -