Generalized additive models for large data sets

Simon N. Wood, Yannig Goude, Simon Shaw

Research output: Contribution to journal, Article, peer-review



We consider an application in electricity grid load prediction, where generalized additive models are appropriate, but where the data set's size can make their use practically intractable with existing methods. We therefore develop practical generalized additive model fitting methods for large data sets in the case in which the smooth terms in the model are represented by using penalized regression splines. The methods use iterative update schemes to obtain factors of the model matrix while requiring only subblocks of the model matrix to be computed at any one time. We show that efficient smoothing parameter estimation can be carried out in a well‐justified manner. The grid load prediction problem requires updates of the model fit, as new data become available, and some means for dealing with residual auto‐correlation in grid load. Methods are provided for these problems and parallel implementation is covered. The methods allow estimation of generalized additive models for large data sets by using modest computer hardware, and the grid load prediction problem illustrates the utility of reduced rank spline smoothing methods for dealing with complex modelling problems.
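The abstract's idea of obtaining factors of the model matrix while holding only subblocks in memory can be illustrated with a block-wise QR update. The following is a minimal sketch, not the authors' implementation: it assumes a Gaussian response, a fixed smoothing parameter, and a single penalty matrix. The names `blockwise_qr_update`, `solve_penalized`, `S` and `lam` are hypothetical and chosen only for illustration.

```python
import numpy as np


def blockwise_qr_update(blocks):
    """Accumulate the triangular factor R and f = Q^T y one block at a time.

    `blocks` yields (X_block, y_block) pairs, so the full model matrix
    never has to be held in memory at once.
    """
    R = None  # running triangular factor of the rows seen so far
    f = None  # running Q^T y for the rows seen so far
    for X_blk, y_blk in blocks:
        if R is None:
            stacked_X, stacked_y = X_blk, y_blk
        else:
            # Stack the previous factor on top of the new rows and refactor.
            stacked_X = np.vstack([R, X_blk])
            stacked_y = np.concatenate([f, y_blk])
        Q, R = np.linalg.qr(stacked_X)  # reduced QR of the small stacked matrix
        f = Q.T @ stacked_y
    return R, f


def solve_penalized(R, f, S, lam):
    """Minimise ||f - R b||^2 + lam * b^T S b (assumed single penalty S)."""
    return np.linalg.solve(R.T @ R + lam * S, R.T @ f)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(1000, 6))   # stand-in for a spline model matrix
    y = rng.normal(size=1000)
    # Feed the data in chunks of 100 rows.
    chunks = ((X[i:i + 100], y[i:i + 100]) for i in range(0, 1000, 100))
    R, f = blockwise_qr_update(chunks)
    beta = solve_penalized(R, f, np.eye(6), lam=0.5)
    print(beta.shape)
```

Because R^T R equals X^T X and R^T f equals X^T y, the penalized coefficients computed from `R` and `f` match those from the full model matrix, while each step only factors a matrix with (number of coefficients + block size) rows.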
Original language: English
Pages (from-to): 139-155
Number of pages: 17
Journal: Journal of the Royal Statistical Society Series C-Applied Statistics
Issue number: 1
Early online date: 27 May 2014
Publication status: Published - 1 Jan 2015

