Variable selection for partially linear models via learning gradients

Lei Yang, Yixin Fang, Junhui Wang, Yongzhao Shao

Research output: Contribution to journal › Article › peer-review

5 Scopus citations


Partially linear models (PLMs) are important generalizations of linear models and are very useful for analyzing high-dimensional data. Compared to linear models, PLMs possess the desirable flexibility of nonparametric regression models because they contain both linear and non-linear components. Variable selection for PLMs plays an important role in practical applications and has been extensively studied with respect to the linear component. For the non-linear component, however, variable selection has been well developed only for PLMs with extra structural assumptions, such as additive PLMs and generalized additive PLMs. There is currently an unmet need for variable selection methods applicable to general PLMs without structural assumptions on the non-linear component. In this paper, we propose a new variable selection method based on learning gradients for general PLMs without any assumption on the structure of the non-linear component. The proposed method uses the reproducing kernel Hilbert space (RKHS) framework to learn the gradients and the group-lasso penalty to select variables. In addition, a block-coordinate descent algorithm is suggested, and some theoretical properties are established, including selection consistency and estimation consistency. The performance of the proposed method is further evaluated via simulation studies and illustrated using real data.
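The core idea, ranking variables by the magnitude of their learned gradients, can be illustrated with a simplified sketch. The snippet below is not the authors' RKHS/group-lasso estimator; it replaces the RKHS gradient learner with kernel-weighted local linear fits and replaces the group-lasso penalty with a hard threshold on the per-variable gradient norms (the same group norms the penalty would shrink). All data, bandwidths, and thresholds are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 5
X = rng.uniform(-1, 1, size=(n, p))
# Partially linear truth: linear in x0, non-linear in x1; x2..x4 irrelevant.
y = 2.0 * X[:, 0] + np.sin(np.pi * X[:, 1]) + 0.1 * rng.standard_normal(n)

def local_linear_gradients(X, y, bandwidth=0.6):
    """Kernel-weighted local linear fits: returns an (n, p) matrix whose
    i-th row estimates the gradient of the regression function at X[i]."""
    n, p = X.shape
    grads = np.zeros((n, p))
    for i in range(n):
        d = X - X[i]                                   # centred design
        w = np.exp(-np.sum(d ** 2, axis=1) / (2 * bandwidth ** 2))
        A = np.hstack([np.ones((n, 1)), d])            # intercept + slopes
        beta, *_ = np.linalg.lstsq(A.T @ (w[:, None] * A),
                                   A.T @ (w * y), rcond=None)
        grads[i] = beta[1:]                            # slope part = gradient
    return grads

G = local_linear_gradients(X, y)
# Group norm per variable: the quantity a group-lasso penalty would shrink.
norms = np.linalg.norm(G, axis=0) / np.sqrt(n)
# Hard threshold stands in for the group-lasso selection step.
selected = np.where(norms > 0.5 * norms.max())[0]
print(norms.round(2))
print(selected)
```

Both the linear variable x0 and the non-linear variable x1 produce large gradient norms, while the irrelevant variables' norms stay near zero, which is why a single group-norm criterion can select variables from both components without knowing the model structure.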

Original language: English (US)
Pages (from-to): 2907-2930
Number of pages: 24
Journal: Electronic Journal of Statistics
Issue number: 2
State: Published - 2017

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty


Keywords

  • Gradient learning
  • Group Lasso
  • High-dimensional data
  • PLM
  • Reproducing kernel Hilbert space
  • Variable selection


