In many practices, scientists are particularly interested in detecting which of the predictors are truly associated with a multivariate response. It is more accurate to model multiple responses as one vector rather than separating each component one by one. This is particularly true for complex traits having multiple correlated components. A Bayesian multivariate variable selection (BMVS) approach is proposed to select important predictors influencing the multivariate response from a candidate pool with ultrahigh dimension. By applying the sample-size-dependent spike and slab priors, the BMVS approach satisfies the strong selection consistency property under certain conditions, which represents the advantages of BMVS over other existing Bayesian multivariate regression-based approaches. The proposed approach considers the covariance structure of multiple responses without assuming independence and integrates the estimation of covariance-related parameters together with all regression parameters into one framework through a fast-updating Markov chain Monte Carlo (MCMC) procedure. It is demonstrated through simulations that the BMVS approach outperforms some other relevant frequentist and Bayesian approaches. The proposed BMVS approach possesses a flexibility of wide applications, including genome-wide association studies with multiple correlated phenotypes and a large scale of genetic variants and/or environmental variables, as demonstrated in the real data analyses section. The computer code and test data of the proposed method are available as an R package.
All Science Journal Classification (ASJC) codes
- Statistics and Probability
- Statistics, Probability and Uncertainty
- Bayesian analysis
- gene selection
- high dimension modelling
- multivariate genome-wide association studies