Evaluating the discrimination ability of proper multi-variate scoring rules

C. Alexander, M. Coulon, Y. Han, X. Meng

Research output: Contribution to journal › Article › peer-review

6 Citations (SciVal)


Proper scoring rules are commonly applied to quantify the accuracy of distribution forecasts. Given an observation, they assign a scalar score to each distribution forecast, with the lowest expected score attributed to the true distribution. The energy and variogram scores are two rules that have recently gained popularity in multivariate settings because their computation does not require a forecast to have a parametric density function, which makes them broadly applicable. Here we conduct a simulation study to compare the discrimination ability of the energy score with that of three variogram scores. Compared with other studies, our simulation design is more realistic because it is supported by a historical data set containing commodity prices, currencies and interest rates, and our data generating processes include a diverse selection of models with different marginal distributions, dependence structures, and calibration windows. This facilitates a comprehensive comparison of the performance of proper scoring rules in different settings. To compare the scores we use three metrics: the mean relative score, the error rate and a generalized discrimination heuristic. Overall, we find that the variogram score with parameter p=0.5 outperforms the energy score and the other two variogram scores.
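
The abstract does not state the scoring formulas, so the following is only a minimal sketch of the standard sample-based estimators of the energy score and the variogram score of order p, as they are usually defined in the forecasting literature. The function names, the unit weights in the variogram score and the toy inputs are assumptions made for illustration, not the authors' implementation.

```python
import math


def energy_score(sample, y):
    """Sample-based energy score (lower is better).

    sample: list of m forecast draws, each a sequence of d floats.
    y: the observed vector, a sequence of d floats.
    """
    m = len(sample)
    # Mean Euclidean distance between each forecast draw and the observation.
    term1 = sum(math.dist(x, y) for x in sample) / m
    # Mean Euclidean distance over all ordered pairs of forecast draws.
    term2 = sum(math.dist(a, b) for a in sample for b in sample) / (m * m)
    return term1 - 0.5 * term2


def variogram_score(sample, y, p=0.5):
    """Sample-based variogram score of order p (lower is better).

    Assumption: all pairwise weights are set to 1.
    """
    m, d = len(sample), len(y)
    total = 0.0
    for i in range(d):
        for j in range(d):
            # Observed pairwise difference raised to the power p.
            observed = abs(y[i] - y[j]) ** p
            # Forecast expectation of the same quantity, estimated from draws.
            expected = sum(abs(x[i] - x[j]) ** p for x in sample) / m
            total += (observed - expected) ** 2
    return total


# Toy example (hypothetical numbers): two bivariate draws, one observation.
draws = [(0.0, 0.0), (2.0, 0.0)]
obs = (1.0, 0.0)
es = energy_score(draws, obs)
vs_half = variogram_score(draws, obs, p=0.5)
vs_one = variogram_score(draws, obs, p=1.0)
```

Neither function needs a forecast density, only draws from the forecast distribution, which is the property the abstract highlights. The pairwise loops cost O(m²d) for the energy score and O(md²) for the variogram score, so a vectorized version would be preferable for large ensembles.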

Original language: English
Pages (from-to): 857-883
Number of pages: 27
Journal: Annals of Operations Research
Issue number: 1-3
Early online date: 18 Mar 2022
Publication status: Published - 18 Mar 2022


Keywords

  • Discrimination heuristic
  • Energy score
  • Multivariate forecasting
  • Proper scoring rules
  • Variogram score

ASJC Scopus subject areas

  • General Decision Sciences
  • Management Science and Operations Research
