Practical variable selection for generalized additive models

Giampiero Marra, Simon N Wood

Research output: Contribution to journal (article)

171 Citations (Scopus)

Abstract

The problem of variable selection within the class of generalized additive models, when there are many covariates to choose from but the number of predictors is still somewhat smaller than the number of observations, is considered. Two very simple but effective shrinkage methods and an extension of the nonnegative garrote estimator are introduced. The proposals avoid having to use nonparametric testing methods for which there is no general reliable distributional theory. Moreover, component selection is carried out in one single step as opposed to many selection procedures which involve an exhaustive search of all possible models. The empirical performance of the proposed methods is compared to that of some available techniques via an extensive simulation study. The results show under which conditions one method can be preferred over another, hence providing applied researchers with some practical guidelines. The procedures are also illustrated analysing data on plasma beta-carotene levels from a cross-sectional study conducted in the United States.

Original language: English
Pages (from-to): 2372-2387
Number of pages: 16
Journal: Computational Statistics & Data Analysis
Volume: 55
Issue number: 7
DOI: 10.1016/j.csda.2011.02.004
Publication status: Published - 1 Jul 2011

Keywords

  • shrinkage smoother
  • generalized additive model
  • penalized thin plate regression spline
  • nonnegative garrote estimator
  • practical variable selection

Cite this

Practical variable selection for generalized additive models. / Marra, Giampiero; Wood, Simon N.

In: Computational Statistics & Data Analysis, Vol. 55, No. 7, 01.07.2011, p. 2372-2387.

Marra, Giampiero ; Wood, Simon N. / Practical variable selection for generalized additive models. In: Computational Statistics & Data Analysis. 2011 ; Vol. 55, No. 7. pp. 2372-2387.
@article{80bb7637bd9445d4930022b31e1b3aeb,
title = "Practical variable selection for generalized additive models",
abstract = "The problem of variable selection within the class of generalized additive models, when there are many covariates to choose from but the number of predictors is still somewhat smaller than the number of observations, is considered. Two very simple but effective shrinkage methods and an extension of the nonnegative garrote estimator are introduced. The proposals avoid having to use nonparametric testing methods for which there is no general reliable distributional theory. Moreover, component selection is carried out in one single step as opposed to many selection procedures which involve an exhaustive search of all possible models. The empirical performance of the proposed methods is compared to that of some available techniques via an extensive simulation study. The results show under which conditions one method can be preferred over another, hence providing applied researchers with some practical guidelines. The procedures are also illustrated analysing data on plasma beta-carotene levels from a cross-sectional study conducted in the United States.",
keywords = "shrinkage smoother, generalized additive model, penalized thin plate regression spline, nonnegative garrote estimator, practical variable selection",
author = "Marra, Giampiero and Wood, {Simon N.}",
year = "2011",
month = "7",
day = "1",
doi = "10.1016/j.csda.2011.02.004",
language = "English",
volume = "55",
pages = "2372--2387",
journal = "Computational Statistics \& Data Analysis",
issn = "0167-9473",
publisher = "Elsevier",
number = "7",
}

TY  - JOUR
T1  - Practical variable selection for generalized additive models
AU  - Marra, Giampiero
AU  - Wood, Simon N
PY  - 2011/7/1
Y1  - 2011/7/1
AB  - The problem of variable selection within the class of generalized additive models, when there are many covariates to choose from but the number of predictors is still somewhat smaller than the number of observations, is considered. Two very simple but effective shrinkage methods and an extension of the nonnegative garrote estimator are introduced. The proposals avoid having to use nonparametric testing methods for which there is no general reliable distributional theory. Moreover, component selection is carried out in one single step as opposed to many selection procedures which involve an exhaustive search of all possible models. The empirical performance of the proposed methods is compared to that of some available techniques via an extensive simulation study. The results show under which conditions one method can be preferred over another, hence providing applied researchers with some practical guidelines. The procedures are also illustrated analysing data on plasma beta-carotene levels from a cross-sectional study conducted in the United States.
KW  - shrinkage smoother
KW  - generalized additive model
KW  - penalized thin plate regression spline
KW  - nonnegative garrote estimator
KW  - practical variable selection
UR  - http://www.scopus.com/inward/record.url?scp=79953654016&partnerID=8YFLogxK
UR  - http://dx.doi.org/10.1016/j.csda.2011.02.004
U2  - 10.1016/j.csda.2011.02.004
DO  - 10.1016/j.csda.2011.02.004
M3  - Article
VL  - 55
SP  - 2372
EP  - 2387
JO  - Computational Statistics & Data Analysis
JF  - Computational Statistics & Data Analysis
SN  - 0167-9473
IS  - 7
ER  -