Prediction of continuous phenotypes in mouse, fly, and rice genome wide association studies with support vector regression SNPs and ridge regression classifier

Abdulrhman Aljouie, Usman Roshan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

9 Scopus citations

Abstract

The ranking of SNPs and prediction of phenotypes in continuous genome wide association studies is a subject of increasing interest with applications in personalized medicine and animal and plant breeding. The ranking of SNPs in case control (discrete label) genome wide association studies has been examined in several previous studies with machine learning techniques but this is poorly explored for studies with quantitative labels. Here we study ranking of SNPs in mouse, fly, and rice continuous genome wide association studies given by the popular univariate Pearson correlation coefficient and the multivariate support vector regression and ridge regression. We perform cross-validation with the support vector regression and ridge regression models on top ranked SNPs and compute correlation coefficients between true and predicted phenotypes. Our results show that ridge regression prediction with top ranked support vector regression SNPs gives the highest accuracy. On all datasets we achieve accuracies comparable to previously published values but with fewer SNPs. Our work shows we can learn parsimonious SNP models for predicting continuous labels in genome wide studies.
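The workflow described in the abstract (rank SNPs, keep the top-ranked ones, cross-validate a regression model, and correlate true with predicted phenotypes) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It ranks SNPs by absolute Pearson correlation only and predicts with closed-form ridge regression. The synthetic genotype matrix, the effect sizes, and the values of `top_k`, `lam`, and `n_folds` are all assumptions made for the example. The support vector regression ranking is not shown; one way to approximate it would be to replace `pearson_scores` with the absolute weights of a linear SVR from a library.

```python
import numpy as np


def pearson_scores(X, y):
    """Absolute Pearson correlation of each SNP column with the phenotype."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant SNPs get a score of 0
    return np.abs(Xc.T @ yc) / denom


def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression; centering leaves the intercept unpenalized."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    return w, y_mean - x_mean @ w


def cv_correlation(X, y, top_k=10, lam=1.0, n_folds=5, seed=0):
    """Rank SNPs on each training fold, fit ridge on the top_k SNPs, and return
    the Pearson correlation between true and held-out predicted phenotypes."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y))
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Ranking uses training samples only, so the test fold stays unseen.
        scores = pearson_scores(X[train_idx], y[train_idx])
        top = np.argsort(scores)[::-1][:top_k]
        w, b = ridge_fit(X[train_idx][:, top], y[train_idx], lam)
        preds[test_idx] = X[test_idx][:, top] @ w + b
    return np.corrcoef(y, preds)[0, 1]


# Synthetic data (assumption): 200 samples, 500 SNPs coded 0/1/2,
# with the first five SNPs carrying the phenotype signal.
rng = np.random.default_rng(42)
X = rng.integers(0, 3, size=(200, 500)).astype(float)
y = X[:, :5] @ np.array([1.0, -1.0, 0.8, -0.8, 0.6]) + rng.normal(0, 0.5, 200)

r = cv_correlation(X, y, top_k=10)
print(f"cross-validated correlation: {r:.3f}")
```

Ranking inside each training fold, rather than once on the full dataset, keeps the selection step from seeing the held-out phenotypes, which would otherwise inflate the reported correlation.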

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1246-1250
Number of pages: 5
ISBN (Electronic): 9781509002870
State: Published - Mar 2 2016
Event: IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015 - Miami, United States
Duration: Dec 9 2015 → Dec 11 2015

Publication series

Name: Proceedings - 2015 IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015

Other

Other: IEEE 14th International Conference on Machine Learning and Applications, ICMLA 2015
Country/Territory: United States
City: Miami
Period: 12/9/15 → 12/11/15

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications

Keywords

  • Genome wide association studies
  • Phenotype prediction
  • Ridge regression
  • SNP selection
  • Support vector regression
