Abstract
This thesis compiles three papers, at various stages of publication, presenting novel algorithms and research on deep learning accelerated physics-based Bayesian inference. Also included is an application paper written in collaboration with colleagues in the Department of Engineering, in which some of the methodology developed in this thesis is applied to heat-transfer data from an experimental compressor cavity rig.

Before presenting the first of these papers, we begin in Chapter 1 with an introduction to and overview of the themes of the thesis. This overview includes a description of Bayesian inference and its solution methods, an outline of traditional solution methods for PDEs and integral equations, a brief introduction to neural networks and their training, and some established approaches to solving PDE-based Bayesian inverse problems. Much of the material covered here is revisited where appropriate in subsequent chapters, so Chapter 1 is not intended to be comprehensive. Instead, it serves as a gentle introduction to these topics, providing both a natural framework through which the subsequent chapters can be viewed and a reference of more established approaches to PDE-based Bayesian inversion against which our deep learning accelerations can be compared. The remaining chapters follow organically from this starting point. The statistical models considered in each chapter are ordered by increasing complexity, and the contents focus on how deep learning surrogates can accelerate Bayesian inference in these varied settings. Numerical examples appear throughout, and Python implementations are publicly available where indicated.
Chapter 2 of this thesis contains a paper [38] which is currently under review. In this chapter we describe a deep learning approach to efficiently performing Bayesian inference in partial differential equation (PDE) and integral equation models over potentially high-dimensional parameter spaces. We review some deep learning approaches to approximating the solutions of PDEs, and introduce a new neural network approach to approximating the solutions of Fredholm and Volterra integral equations of the first and second kind. These algorithms work by formulating appropriate loss functions such that the solutions of these equations are the minimisers of the corresponding optimisation problems. We then extend these algorithms to approximate parametric surrogate solutions of PDEs and integral equations. This deep learning approach allows the efficient approximation of parametric solutions in significantly higher dimensions than is possible using classical techniques. Since the approximated solutions are very cheap to evaluate, the solutions of Bayesian inverse problems become tractable using Markov chain Monte Carlo. Our method is applied to two real-world examples: Bayesian inference of PDE and integral equation parameters in an electrochemical setting, and Bayesian inference of a heat-transfer function with applications in aviation.
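The loss-function formulation described above can be sketched on a toy problem. In the minimal example below (a hypothetical Fredholm equation of the second kind, not taken from the thesis), the surrogate is trained by minimising the collocation residual of the equation; for brevity a polynomial basis stands in for the neural network, which makes the minimisation an exact linear least-squares problem rather than gradient-based training.

```python
import numpy as np

# Toy Fredholm equation of the second kind (hypothetical, for illustration):
#   u(x) = f(x) + lam * \int_0^1 K(x, t) u(t) dt,
# with K(x, t) = x * t and f(x) = (2/3) x, whose exact solution is u(x) = x.
lam = 1.0
K = lambda x, t: x * t
f = lambda x: (2.0 / 3.0) * x

# Quadrature grid (trapezoidal weights) shared by all collocation points.
t = np.linspace(0.0, 1.0, 101)
w = np.full_like(t, t[1] - t[0]); w[0] *= 0.5; w[-1] *= 0.5

# Surrogate u_theta(x) = theta . phi(x); the polynomial basis replaces the
# neural network used in the thesis, so the residual loss is linear in theta.
phi = lambda x: np.stack([np.ones_like(x), x, x**2], axis=-1)

x = np.linspace(0.0, 1.0, 50)  # collocation points
# Residual of the equation at each collocation point, linear in theta:
A = phi(x) - lam * (K(x[:, None], t[None, :]) * w) @ phi(t)
theta, *_ = np.linalg.lstsq(A, f(x), rcond=None)

u = lambda xq: phi(np.asarray(xq)) @ theta  # cheap-to-evaluate surrogate
print(u(np.array([0.25, 0.5, 1.0])))        # close to [0.25, 0.5, 1.0]
```

With a neural network in place of `phi`, the same residual would be minimised by stochastic gradient descent, and extra network inputs would carry the equation's parameters, yielding the parametric surrogates used for inference.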
Chapter 3 comprises two papers born from a collaboration with the Department of Mechanical Engineering. The main part of this chapter presents a paper that significantly extends the techniques introduced previously in order to solve a spatio-temporal extension of the heat-transfer problem introduced in Chapter 2. In this highly ill-posed inverse problem we seek to understand the heat-flux behaviour in rapidly rotating discs within engine cavities, the results of which will be used in practice with experimental data to inform future engine design. Because of this demanding application, the developments in this paper focus on ensuring that our inference is efficient and reliable despite the black-box nature of deep learning. We do this by constructing appropriate Gaussian process priors, developing a training procedure that efficiently achieves more accurate approximations by adapting to the desired posterior distribution, using delayed-acceptance sampling to mathematically guarantee the accuracy of the posteriors we obtain, and using surrogate-accelerated Hamiltonian proposals to draw these samples within a reasonable time-frame. The method is then applied in a simulation study to verify its effectiveness, before illustrating the results using real data from a multi-cavity compressor rig. Additionally included, as Subchapter 3A, is an application paper written in conjunction with colleagues in the Department of Mechanical Engineering. This is a more detailed investigation of the experimental setting and of how the inferences resulting from our method will affect engine design in the future.
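The delayed-acceptance idea mentioned above can be illustrated in a few lines. In this sketch (a hypothetical 1-D posterior and surrogate, not the thesis model), each random-walk proposal is first screened with the cheap surrogate density, and only survivors pay for an evaluation of the exact posterior; the second-stage correction ratio keeps the chain exact.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 1-D target: exact log-posterior and a cheap, imperfect surrogate.
log_post = lambda th: -0.5 * th**2                      # "expensive" model
log_surr = lambda th: -0.5 * th**2 + 0.05 * np.sin(th)  # surrogate of it

def delayed_acceptance(n_steps, step=1.0, th=0.0):
    """Two-stage Metropolis-Hastings: stage 1 screens proposals with the
    surrogate; stage 2 corrects with the exact posterior, so the chain
    still targets exp(log_post) exactly."""
    chain = []
    for _ in range(n_steps):
        prop = th + step * rng.standard_normal()
        # Stage 1: surrogate accept/reject (cheap; filters poor proposals).
        if np.log(rng.uniform()) < log_surr(prop) - log_surr(th):
            # Stage 2: correction ratio restores exactness.
            a2 = (log_post(prop) - log_post(th)) - (log_surr(prop) - log_surr(th))
            if np.log(rng.uniform()) < a2:
                th = prop
        chain.append(th)
    return np.array(chain)

samples = delayed_acceptance(20000)  # approximately standard normal draws
```

When the surrogate is accurate, the second-stage ratio is close to one, so almost every expensive evaluation results in an acceptance; replacing the random-walk proposal with a surrogate-driven Hamiltonian proposal is what makes the full scheme practical.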
In Chapter 4 we present a manuscript in preparation, containing preliminary results that explore a more flexible application of deep learning surrogates. Here, neural networks are used as physical features in a spatio-temporal statistical model of pollution levels. We are motivated by the situation in Ulaanbaatar, Mongolia, which has dangerous pollution levels that fluctuate according to daily, weekly, and annual patterns. Despite this, the number of pollution sensors in the city is low, so large uncertainties dominate our inference when modern spatio-temporal statistical models are fitted (e.g. regression with an autoregressive Matérn Gaussian field). To overcome this, we use physical knowledge, represented by a deep learning surrogate of advection-diffusion dynamics with uncertain parameterisation, as the mean in a hierarchical spatio-temporal Gaussian process model. This approach heavily weights the prior of the model towards advection-diffusion dynamics, and its inclusion within a more flexible Gaussian process model allows any behaviour that is not fully described by the physical equation to be resolved. In this chapter we perform an exploratory analysis of the atmospheric data from Ulaanbaatar, describe the statistical and physical models on which we base our analysis, and then introduce a physics-informed statistical model that incorporates aspects of both. Using the deep learning surrogate to represent the PDE, we fit all parameters of this model (both PDE and Gaussian process parameters) simultaneously, by sampling from their joint posterior distribution using Hamiltonian Monte Carlo. Finally, we marginalise these parameters to form a predictive posterior field over the city that fully accounts for the physical and statistical uncertainty and their interdependencies.
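The structure of such a physics-informed model can be sketched in one dimension. In the toy example below (all functions and parameter values are hypothetical, not from the manuscript), a parametric "physical" mean plays the role of the advection-diffusion surrogate, and the Gaussian-process residual is marginalised analytically, leaving a single Gaussian likelihood for the physical parameter; in the thesis this density is explored with Hamiltonian Monte Carlo rather than on a grid.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy physics-informed model: y = m(x; theta) + GP residual + noise,
# with m standing in for the advection-diffusion surrogate.
m = lambda x, theta: theta * np.exp(-x)         # hypothetical "physical" mean
x = np.linspace(0.0, 3.0, 40)
y = m(x, 2.0) + 0.05 * rng.standard_normal(40)  # synthetic data, true theta = 2

def log_marginal(theta, ell=0.3, sf=0.1, sn=0.05):
    """log N(y | m(x; theta), K + sn^2 I) with a squared-exponential K;
    the GP residual field is integrated out analytically."""
    d = x[:, None] - x[None, :]
    Kmat = sf**2 * np.exp(-0.5 * (d / ell)**2) + sn**2 * np.eye(len(x))
    r = y - m(x, theta)
    L = np.linalg.cholesky(Kmat)
    a = np.linalg.solve(L, r)
    return -0.5 * a @ a - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)

# Grid evaluation of the marginal likelihood, peaking near the true theta.
grid = np.linspace(1.0, 3.0, 201)
theta_hat = grid[np.argmax([log_marginal(th) for th in grid])]
```

Because the GP residual is marginalised in closed form, each evaluation of `log_marginal` is a complete likelihood for the physical parameter, so a sampler over the joint physical and hyperparameter space needs no inner inference loop.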
This chapter then concludes with some notes on possible extensions to this topic, including the description of a mass-controlled neural network which allows us to systematically control the integral of our surrogate approximation, as well as several ways that the physical and statistical model could be developed in future.
| Date of Award | 22 Jun 2022 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Tony Shardlow (Supervisor) & Eike Mueller (Supervisor) |