TY - JOUR
T1 - Parallelisation of the Lagrangian atmospheric dispersion model NAME
AU - Müller, Eike
AU - Ford, Rupert
AU - Hort, Matthew
AU - Huggett, Lois
AU - Riley, Graham
AU - Thomson, David
PY - 2013/12/31
Y1 - 2013/12/31
N2 - The NAME Atmospheric Dispersion Model is a Lagrangian particle model used by the Met Office to predict the propagation and spread of pollutants in the atmosphere. The model is routinely used in emergency response applications, where it is important to obtain results as quickly as possible. This requirement for a short runtime and the increase in core number of commonly available CPUs, such as the Intel Xeon series, has motivated the parallelisation of NAME in the OpenMP shared memory framework. In this work we describe the implementation of this parallelisation strategy in NAME and discuss the performance of the model for different setups. Due to the independence of the model particles, the parallelisation of the main compute intensive loops is relatively straightforward. The random number generator for modelling sub-grid scale turbulent motion needs to be adapted to ensure that different particles use independent sets of random numbers. We find that on Intel Xeon X5680 CPUs the model shows very good strong scaling up to 12 cores in a realistic emergency response application for predicting the dispersion of volcanic ash in the North Atlantic airspace. We implemented a mechanism for asynchronous reading of meteorological data from disk and demonstrate how this can reduce the runtime if disk access plays a significant role in a model run. To explore the performance on different chip architectures we also ported the part of the code which is used for calculating the gamma dose from a cloud of radioactive particles to a graphics processing unit (GPU) using CUDA-C. We were able to demonstrate a significant speedup of around one order of magnitude relative to the serial CPU version.
AB - The NAME Atmospheric Dispersion Model is a Lagrangian particle model used by the Met Office to predict the propagation and spread of pollutants in the atmosphere. The model is routinely used in emergency response applications, where it is important to obtain results as quickly as possible. This requirement for a short runtime and the increase in core number of commonly available CPUs, such as the Intel Xeon series, has motivated the parallelisation of NAME in the OpenMP shared memory framework. In this work we describe the implementation of this parallelisation strategy in NAME and discuss the performance of the model for different setups. Due to the independence of the model particles, the parallelisation of the main compute intensive loops is relatively straightforward. The random number generator for modelling sub-grid scale turbulent motion needs to be adapted to ensure that different particles use independent sets of random numbers. We find that on Intel Xeon X5680 CPUs the model shows very good strong scaling up to 12 cores in a realistic emergency response application for predicting the dispersion of volcanic ash in the North Atlantic airspace. We implemented a mechanism for asynchronous reading of meteorological data from disk and demonstrate how this can reduce the runtime if disk access plays a significant role in a model run. To explore the performance on different chip architectures we also ported the part of the code which is used for calculating the gamma dose from a cloud of radioactive particles to a graphics processing unit (GPU) using CUDA-C. We were able to demonstrate a significant speedup of around one order of magnitude relative to the serial CPU version.
U2 - 10.1016/j.cpc.2013.06.022
DO - 10.1016/j.cpc.2013.06.022
M3 - Article
SN - 0010-4655
VL - 184
SP - 2734
EP - 2745
JO - Computer Physics Communications
JF - Computer Physics Communications
IS - 12
ER -