Operational comparison of rainfall-runoff models through hypothesis testing

Research output: Contribution to journal › Article

Abstract

Assessing rainfall-runoff model performance and selecting the best-suited model are important considerations in operational hydrology. However, model choice is often heuristic, based on a simplistic comparison of a single performance criterion without considering the statistical significance of differences in performance. This is potentially problematic, as the interpretation of a single performance criterion is subjective to the user. This paper removes that subjectivity by applying a jackknife split-sample calibration method to create a sample mean of performance for competing models, which is then used in a paired t-test, allowing statements of statistical significance to be made. A second method is presented, based on a hypothesis test in the binomial distribution, considering model performance across a group of catchments.
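The paired t-test step described in the abstract can be sketched as follows. This is a minimal stdlib-only illustration, not the authors' implementation: the per-fold performance scores are hypothetical, and the critical value is simply the tabulated two-sided 5% quantile of the t distribution with 9 degrees of freedom.

```python
import statistics

# Hypothetical jackknife split-sample performance scores (e.g. NSE values),
# one per calibration fold, for two competing rainfall-runoff models.
model_a = [0.72, 0.68, 0.75, 0.70, 0.71, 0.69, 0.74, 0.73, 0.70, 0.72]
model_b = [0.70, 0.66, 0.71, 0.69, 0.68, 0.67, 0.70, 0.71, 0.68, 0.69]

# Paired t-test: work with the per-fold differences.
diffs = [a - b for a, b in zip(model_a, model_b)]
n = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)           # sample standard deviation
t_stat = mean_d / (sd_d / n ** 0.5)      # t statistic with n - 1 df

# Tabulated two-sided 5% critical value for t with 9 degrees of freedom.
t_crit = 2.262
significant = abs(t_stat) > t_crit
print(f"t = {t_stat:.2f}, significant at 5%: {significant}")
```

With real data the per-fold scores would come from the jackknife calibration runs; here they only serve to show the mechanics of the test.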


A case study comparing the performance of two rainfall-runoff models across 27 urban catchments within the Thames basin shows that, while the urban signal is difficult to detect in a single catchment, it is significant across the group of catchments, depending upon the choice of performance criterion. These results demonstrate the operational applicability of the new tools and the benefits of considering model performance in a probabilistic framework.
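The group-level binomial test can likewise be illustrated with a short stdlib-only sketch. The count of catchments in which one model outperforms the other is hypothetical; under the null hypothesis that neither model is better, each catchment behaves as an independent fair coin flip.

```python
from math import comb

n = 27   # number of catchments in the group (as in the case study)
k = 19   # hypothetical count of catchments where model A beats model B

# One-sided exact binomial p-value: probability of observing k or more
# "wins" out of n under the null p = 0.5.
p_value = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
print(f"p = {p_value:.4f}")
```

A p-value below 0.05 here would indicate that one model's advantage across the group is unlikely to be chance, even if no single catchment shows a significant difference on its own.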
Original language: English
Pages (from-to): 1-26
Number of pages: 26
Journal: Journal of Hydrologic Engineering
Publication status: Accepted/In press - 4 Oct 2019

Cite this

@article{526f7bcb7fc441dfbc56023a2753a425,
title = "Operational comparison of rainfall-runoff models through hypothesis testing",
abstract = "Assessing rainfall-runoff model performance and selecting the best-suited model are important considerations in operational hydrology. However, model choice is often heuristic, based on a simplistic comparison of a single performance criterion without considering the statistical significance of differences in performance. This is potentially problematic, as the interpretation of a single performance criterion is subjective to the user. This paper removes that subjectivity by applying a jackknife split-sample calibration method to create a sample mean of performance for competing models, which is then used in a paired t-test, allowing statements of statistical significance to be made. A second method is presented, based on a hypothesis test in the binomial distribution, considering model performance across a group of catchments. A case study comparing the performance of two rainfall-runoff models across 27 urban catchments within the Thames basin shows that, while the urban signal is difficult to detect in a single catchment, it is significant across the group of catchments, depending upon the choice of performance criterion. These results demonstrate the operational applicability of the new tools and the benefits of considering model performance in a probabilistic framework.",
author = "James Fidal and Thomas Kjeldsen",
year = "2019",
month = "10",
day = "4",
language = "English",
pages = "1--26",
journal = "Journal of Hydrologic Engineering",
issn = "1084-0699",
publisher = "American Society of Civil Engineers (ASCE)",

}