# Generalized Additive Models for Large datasets: spatial-temporal modelling of the UK's Daily Black Smoke (1961 - 2005)

• Zheyuan Li

Student thesis: Doctoral Thesis (PhD)

### Abstract

The UK Black Smoke monitoring network has produced daily particulate air pollution data from up to 1200 monitoring stations over several decades, resulting in 10 million measurements in total. Spatial-temporal modelling of the data is desirable for accurate trend and seasonality estimation and mapping, and to provide daily exposure estimates for epidemiological cohort studies. Generalized additive models offer one way to do this if we can deal with the data volume and model size. This thesis develops a computational method for estimating generalized additive models having $O(10^4)$ coefficients and $O(10^8)$ observations. The strategy combines three elements: (i) fine-scale discretization of covariates, (ii) an efficient approach to restricted likelihood optimization that avoids computation of numerically awkward log-determinant terms, and (iii) restricted likelihood optimization algorithms that make good use of numerical linear algebra methods with high performance and good parallel scaling on modern multi-core machines. The new method enables us to estimate spatial-temporal models for daily Black Smoke data over the last four decades at a daily resolution, which had previously been infeasible. A spatial-temporal dataset of daily Black Smoke is also produced on a grid of 5 km $\times$ 5 km resolution. Our predictions are shown to suffer from little extrapolation and bias.
• Date of Award: 13 Feb 2019
• Language: English
• Awarding institution: University of Bath
• Supervisors: Gavin Shaddick, Theresa Smith & Simon Wood
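
The first element of the strategy, covariate discretization, can be illustrated with a small sketch. This is not the thesis's implementation: the cubic polynomial basis, the equal-width binning, and the function names `basis`, `discretize`, `xty_discrete` and `xty_full` are all assumptions made for illustration. The point it shows is that once a covariate is rounded to $m$ grid values, a product such as $X^\top y$ needs the basis evaluated at only $m$ points rather than at all $n$ observations.

```python
import random


def basis(x):
    """Hypothetical basis for illustration: cubic polynomial terms."""
    return [1.0, x, x * x, x ** 3]


def discretize(xs, m):
    """Round each covariate value to the midpoint of one of m equal-width bins.

    Returns the m grid values and, for each observation, its bin index.
    """
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / m
    grid = [lo + (j + 0.5) * width for j in range(m)]
    idx = [min(int((x - lo) / width), m - 1) for x in xs]
    return grid, idx


def xty_discrete(grid, idx, ys):
    """Compute X^T y using only m basis evaluations.

    Step 1: accumulate y within each bin (one pass over the n observations).
    Step 2: combine the m bin sums with the basis evaluated on the grid.
    """
    ysum = [0.0] * len(grid)
    for k, y in zip(idx, ys):
        ysum[k] += y
    p = len(basis(grid[0]))
    out = [0.0] * p
    for g, s in zip(grid, ysum):
        b = basis(g)
        for j in range(p):
            out[j] += b[j] * s
    return out


def xty_full(xs, ys):
    """Reference computation: evaluate the basis at every observation."""
    p = len(basis(xs[0]))
    out = [0.0] * p
    for x, y in zip(xs, ys):
        b = basis(x)
        for j in range(p):
            out[j] += b[j] * y
    return out


if __name__ == "__main__":
    random.seed(1)
    n, m = 10000, 50
    xs = [random.random() for _ in range(n)]
    ys = [random.gauss(0.0, 1.0) for _ in range(n)]
    grid, idx = discretize(xs, m)
    fast = xty_discrete(grid, idx, ys)
    # Same product computed row by row on the rounded covariate values.
    slow = xty_full([grid[k] for k in idx], ys)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

The two routines agree up to floating-point rounding because they compute the same sum in a different order; the approximation error relative to the undiscretized covariate comes only from the rounding step in `discretize`, which shrinks as the grid is made finer.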
