The choice of the most appropriate error measure for assessing the performance of forecasting methods remains controversial. While statisticians argue for measures with good statistical properties, practitioners prefer measures that are easy to communicate and understand. Moreover, researchers argue that the loss function used to parameterize a model should be aligned with how its performance is measured afterwards. In this paper we ask: Does it matter? Will the relative ranking of the forecasting methods change significantly if we choose one measure over another? Will a mismatch between the in-sample loss function and the out-of-sample performance measure decrease the performance of the forecasting models? Focusing on the average ranked point forecast accuracy, we review the most commonly used measures in both academia and practice and perform a large-scale empirical study to understand the importance of the choice between measures. Our results suggest that there are only small discrepancies between the different error measures, especially within each measure category (percentage, relative, or scaled).
|Number of pages|18|
|Journal|Journal of the Operational Research Society|
|Early online date|13 Apr 2021|
|Publication status|Published - 31 Dec 2022|