Table 3

Performance metrics tested in the optimization process. Each metric produces a different set of combination weights (i.e., a different ensemble). The label shown in parentheses after each metric is used throughout the rest of the manuscript. Categorical metrics are calculated from a 2 × 2 contingency table after the probabilistic forecasts are turned into deterministic forecasts by choosing a threshold value, P_th. See Appendix A for the metric definitions.

Probabilistic                                             | Categorical
----------------------------------------------------------|----------------------------
Brier score (BRIER)                                       | Brier score (BRIER_C)
Mean absolute error (MAE)                                 | True skill statistic (TSS)
Linear correlation coefficient (LCC)^a                    | Heidke skill score (HSS)
Rank (nonlinear) correlation coefficient (NLCC_ρ, NLCC_τ) | Accuracy (ACC)
Reliability (REL)                                         | Critical success index (CSI)
Resolution (RES)                                          | Gilbert skill score (GSS)
Relative operating characteristic (ROC) curve area        |

^a NLCC_ρ and NLCC_τ are Spearman’s rank correlation and Kendall’s τ correlation, respectively.
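
As an illustration of the thresholding step described in the caption, the following minimal Python sketch turns probabilistic forecasts into yes/no forecasts at a threshold P_th, builds the 2 × 2 contingency table, and evaluates several of the categorical metrics listed above. The function names (`contingency_table`, `categorical_metrics`), the convention that a forecast counts as "yes" when the probability is greater than or equal to P_th, and the sample data are assumptions made for illustration. The formulas are the common textbook forms and may differ in detail from the definitions in Appendix A.

```python
import numpy as np


def contingency_table(prob, obs, p_th=0.5):
    """Return (tp, fp, fn, tn) after thresholding probabilities at p_th.

    Assumption: a forecast is 'yes' when prob >= p_th.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=bool)
    yes = prob >= p_th
    tp = int(np.sum(yes & obs))    # hits
    fp = int(np.sum(yes & ~obs))   # false alarms
    fn = int(np.sum(~yes & obs))   # misses
    tn = int(np.sum(~yes & ~obs))  # correct negatives
    return tp, fp, fn, tn


def categorical_metrics(tp, fp, fn, tn):
    """Textbook forms of ACC, TSS, HSS, CSI and GSS.

    No guard against zero denominators: the table is assumed to
    contain both events and non-events.
    """
    n = tp + fp + fn + tn
    # Number of correct forecasts expected by chance (used by HSS).
    chance_correct = ((tp + fp) * (tp + fn) + (tn + fp) * (tn + fn)) / n
    # Number of hits expected by chance (used by GSS).
    chance_hits = (tp + fp) * (tp + fn) / n
    return {
        "ACC": (tp + tn) / n,
        "TSS": tp / (tp + fn) - fp / (fp + tn),
        "HSS": (tp + tn - chance_correct) / (n - chance_correct),
        "CSI": tp / (tp + fp + fn),
        "GSS": (tp - chance_hits) / (tp + fp + fn - chance_hits),
    }


# Hypothetical forecasts and observations, for illustration only.
prob = [0.9, 0.8, 0.3, 0.6, 0.1, 0.2, 0.7, 0.4]
obs = [1, 1, 1, 0, 0, 0, 0, 0]

table = contingency_table(prob, obs, p_th=0.5)
scores = categorical_metrics(*table)
print(table)
print(scores)
```

Because every categorical metric depends on P_th, repeating the calculation over a range of thresholds shows how sensitive each score is to that choice.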

