Issue 
J. Space Weather Space Clim.
Volume 8, 2018
Flares, coronal mass ejections and solar energetic particles and their space weather impacts



Article Number  A17  
Number of page(s)  12  
DOI  https://doi.org/10.1051/swsc/2018005  
Published online  26 March 2018 
Research Article
Verification of real-time WSA−ENLIL+Cone simulations of CME arrival-time at the CCMC from 2010 to 2016
^{1}
American University, Physics Department,
Washington, DC, USA
^{2}
University of Colorado Boulder, Aerospace Engineering Sciences,
Boulder,
CO, USA
^{3}
NASA Goddard Space Flight Center,
Greenbelt,
MD, USA
^{4}
Catholic University of America,
Washington, DC, USA
^{5}
University of Maryland College Park,
College Park,
MD, USA
^{6}
George Mason University,
Fairfax,
VA, USA
^{*} Corresponding author: amtwold@gmail.com
Received: 3 June 2017
Accepted: 15 January 2018
The Wang-Sheeley-Arge (WSA)−ENLIL+Cone model is used extensively in space weather operations worldwide to model coronal mass ejection (CME) propagation. As such, it is important to assess its performance. We present validation results of the WSA−ENLIL+Cone model installed at the Community Coordinated Modeling Center (CCMC) and executed in real-time by the CCMC space weather team. CCMC uses the WSA−ENLIL+Cone model to predict CME arrivals at NASA missions throughout the inner heliosphere. In this work we compare model predicted CME arrival-times to in situ interplanetary coronal mass ejection leading edge measurements at Solar TErrestrial RElations Observatory-Ahead (STEREO-A), Solar TErrestrial RElations Observatory-Behind (STEREO-B), and Earth (Wind and ACE) for simulations completed between March 2010 and December 2016 (over 1,800 CMEs). We report hit, miss, false alarm, and correct rejection statistics for all three locations. For all predicted CME arrivals, the hit rate is 0.5, and the false alarm rate is 0.1. For the 273 events where the CME was predicted to arrive at Earth, STEREO-A, or STEREO-B, and was actually observed (hit event), the mean absolute arrival-time prediction error was 10.4 ± 0.9 h, with a tendency toward early prediction (mean error of −4.0 h). We show the dependence of the arrival-time error on CME input parameters. We also explore the impact of the multi-spacecraft observations used to initialize the model CME inputs by comparing model verification results before and after the STEREO-B communication loss (since September 2014) and STEREO-A sidelobe operations (August 2014–December 2015). There is an increase of 1.7 h in the CME arrival time error during single, or limited two-viewpoint periods, compared to the three-spacecraft viewpoint period. This trend suggests that a future space weather mission at L5 or L4, by providing another coronagraph viewpoint, would reduce CME arrival time errors compared to a single L1 viewpoint.
Key words: MHD / modeling / validation / forecasting / coronal mass ejection (CME)
© A.M. Wold et al., Published by EDP Sciences 2018
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
The Wang-Sheeley-Arge (WSA) coronal model (Arge & Pizzo, 2000; Arge et al., 2004) coupled with the global heliospheric ENLIL solar-wind model (Odstrčil et al., 1996; Odstrčil & Pizzo, 1999a, b; Odstrčil, 2003; Odstrčil et al., 2004) has been used extensively in space weather operations worldwide. Space weather models provide forecast capabilities that greatly enhance satellite and ground-based observations. It is essential for both model users and developers to understand the limitations and capabilities of these models. To measure model performance, the model output is generally compared to a measurable parameter and skill scores are computed. Results from model verification are also helpful as feedback to the model developers, for setting benchmarks for the current state of a model, and for determining the usefulness and capabilities of a model for operations.
Previous studies have also assessed aspects of WSA−ENLIL+Cone model performance, typically with a smaller sample of selected events and in a non-real-time setting. Taktakishvili et al. (2009) studied the performance of the ENLIL+Cone model in modeling the propagation of coronal mass ejections (CMEs) in the heliosphere by comparing the results of the simulation with ACE satellite observations. They evaluated the results of the ENLIL+Cone model for 14 fast CME events and found more early arrival predictions (9 out of 14) than late arrival predictions. The errors on the early arrivals were, on average, larger than those of the late arrival predictions. The average absolute error was approximately 6 h for the total set. Millward et al. (2013) assessed 25 CME events during the first year of WSA−ENLIL operations at the NOAA Space Weather Prediction Center (October 2011–October 2012) and found an average error of 7.5 h. Mays et al. (2015a) assessed WSA−ENLIL+Cone ensemble modeling of CMEs in the real-time Community Coordinated Modeling Center (CCMC) setting. The ensemble modeling method provides a probabilistic forecast of CME arrival time and an estimate of arrival-time uncertainty from the spread and distribution of predictions, as well as forecast confidence in the likelihood of CME arrival. For 17 predicted CME arrivals, the mean absolute arrival-time prediction error was 12.3 h with an early bias of −5.8 h. Vršnak et al. (2014) compared a drag-based model (DBM) to the WSA−ENLIL+Cone model and showed that the average arrival time error for DBM is about 14 h. They also showed that DBM performs similarly to ENLIL during low solar activity periods, but ENLIL performs better as solar activity increases.
In interpreting the results of arrival time error assessment for the WSA−ENLIL+Cone model, it is important to examine all of the factors that can contribute to this uncertainty. Mays et al. (2015a) found that the reliability of ensemble CME-arrival predictions was heavily dependent on the initial distribution of CME input parameters, especially speed and width. Millward et al. (2013) analyzed the impact that CME input parameters have on ENLIL's arrival time predictions, as well as the importance of multi-viewpoint coronagraph imagery in determining these parameters. They found that when CMEs are measured from a single viewpoint with cone analysis, it is difficult to establish objectively the correct ellipse that should be applied to a given halo CME. Any uncertainties in the projected elliptical face of the cone were seen to lead to large errors in calculations of cone angle and radial velocity. Millward et al. (2013) report that measuring CMEs from three viewpoints improves the accuracy of the measurements, and that for outputs of models like ENLIL to be meaningful, the accuracy of key CME parameters is essential.
Beyond the improvement in arrival time prediction expected when using three coronagraph viewpoints instead of one or two, tracking CME propagation beyond coronagraph fields of view leads to improved arrival time predictions. Möstl et al. (2014) found that predicting CME speeds and arrival times with heliospheric images gives more accurate results than using projected initial speeds from coronagraph measurements, with an improvement on the order of 12 h for the arrival times. By comparing predictions of speed and arrival time for 22 CMEs to the corresponding interplanetary coronal mass ejection (ICME) measurements at in situ observatories, they found the absolute difference between predicted and observed ICME arrival times was 8.1 ± 6.3 h (root mean square error RMSE = 10.9 h), with their empirical corrections improving their performance for the arrival times to 6.1 ± 5.0 h (RMSE = 7.9 h). While the error in arrival time predictions decreases with this heliospheric image tracking, the prediction lead time also decreases. Colaninno et al. (2013) used a variety of non-real-time methods to evaluate CME arrival-time predictions based on multi-viewpoint imaging data analysis only and, for a linear fit above a height of 50 solar radii (R_{⊙}), found an average absolute error of 6 h for seven out of nine CMEs, and 13 h for the full sample of nine CMEs.
Möstl et al. (2014) and Millward et al. (2013) mention important factors other than CME parameter inputs that affect prediction quality. The interaction of multiple CMEs (Lee et al., 2013), the lack of ejecta magnetic structure, differences between a direct hit or a glancing blow (Möstl et al., 2015; Mays et al., 2015b), and prediction errors from other model limitations should be considered. Reliable characterization of the ambient solar wind flow is also necessary for simulating transients and CME propagation (Lee et al., 2013; Mays et al., 2015a).
In this article we evaluate the performance of the WSA−ENLIL+Cone model installed at the CCMC and executed in real-time by the CCMC space weather team from March 2010 to December 2016. The CCMC, located at NASA Goddard Space Flight Center, is an interagency partnership to facilitate community research and accelerate implementation of progress in research into space weather operations. The CCMC space weather team is a CCMC subteam that provides space weather services to NASA robotic mission operators and science campaigns and prototypes new models, forecasting techniques, and procedures. The CCMC space weather team began performing real-time WSA−ENLIL+Cone simulations in March 2010, marking the start of our verification period. The CCMC also hosts the CME Scoreboard website (http://kauai.ccmc.gsfc.nasa.gov/CMEscoreboard), where members of the research community may submit CME arrival-time predictions in real-time for different forecasting methods.
In Section 2 we provide a brief description of the WSA−ENLIL+Cone model simulations and input parameters. We describe the methodology of our verification study in Section 3. We compute the average error (bias) and the average absolute arrival time error in Section 4.1 and examine the dependence of these errors on CME input parameters, including direction, speed, and width. In Section 4.2 we determine the significance of multi-spacecraft observations for deriving the CME parameters used to initialize the model. This is done by comparing arrival time errors before and after the Solar TErrestrial RElations Observatory (STEREO: Kaiser et al., 2008) Behind (B) communication loss in September 2014, and during STEREO-Ahead (A) sidelobe operations, from August 2014–December 2015. In Section 4.3, we report CME arrival hit, miss, false alarm, and correct rejection statistics and skill scores. For Earth arrivals, verification of K_{P} forecasts is presented in Section 5. Finally, in Section 6 we summarize our results.
2 WSA−ENLIL+Cone model simulations
The WSA−ENLIL+Cone model consists of two parts: the WSA coronal model, which approximates solar wind outflow at 21.5 R_{⊙}, beyond the solar wind critical point, and the ENLIL 3D MHD numerical model, which provides a time-dependent description of the background solar wind plasma and magnetic field into which a CME can be inserted at the inner boundary of 21.5 R_{⊙}. A common method to estimate the 3D CME kinematic and geometric parameters is to assume that the geometrical CME properties are approximated by the Cone model (Zhao et al., 2002; Xie et al., 2004), which assumes isotropic expansion, radial propagation, and constant CME cone angular width. Generally, a CME disturbance is inserted in the WSA–ENLIL model as slices of a homogeneous spherical plasma cloud with uniform velocity, density, and temperature as a time-dependent inner boundary condition with a steady magnetic field. Three-dimensional CME parameters were determined using the Stereoscopic CME Analysis Tool (StereoCAT) (Mays et al., 2015a) and the NOAA Space Weather Prediction Center CME Analysis Tool (CAT) (Millward et al., 2013). For most of the simulations in this study, ENLIL model version 2.7 was used with ambient settings “a4b1”, together with WSA version 2.2. A small subset of simulations prior to May 2011 (101 simulations; 34 hits) were performed with an earlier ENLIL version 2.6 using the ambient setting of “a3b2”. Jian et al. (2011) describe the model version differences in more detail.
In this study, the results of the simulations are compared to in situ ICME arrivals near Earth, STEREO-A, and STEREO-B from March 2010 through December 2016. This set includes simulations of about 1,800 CMEs. Coronagraph observations from the SOlar and Heliospheric Observatory (SOHO: Domingo et al., 1995) spacecraft at L1, sunward of Earth, and from the STEREO-A and B spacecraft, which orbit ahead of and behind Earth respectively, were used. Since these simulations are all made in real-time by space weather forecasters, many slow CMEs (under 500 km s^{−1}) and CMEs out of the ecliptic plane (narrow CMEs at latitudes >25^{∘}) may not be modeled. Additionally, all of the coronagraph-derived CME measurements were made in real-time, often inferred from just a few data points due to real-time data gaps. Finally, the measurements used to initialize the simulations were produced by many different space weather forecasters with varying levels of experience.
As described in Emmons et al. (2013) and Mays et al. (2015a), for Earth-directed CMEs, WSA−ENLIL+Cone model outputs are used to compute an estimate of the geomagnetic K_{P} index using the Newell et al. (2007) coupling function. Three magnetic field clock-angle scenarios of 90^{∘} (westward), 135^{∘} (southwestward), and 180^{∘} (southward) are calculated to compute the K_{P}. Verification results of the K_{P} predictions are presented in Section 5.
3 Verification methodology
The quality of model performance is evaluated by comparing the model output to the observed ICME arrival time. Both the predicted and observed arrival times refer to the arrival of the leading edge of the ICME shock or compression wave and not the magnetic cloud start time (if it exists). The CCMC’s publicly available Space Weather Database Of Notifications, Knowledge, Information (DONKI) (ccmc.gsfc.nasa.gov/donki) is populated by the CCMC space weather team and contains space weather relevant flares, solar energetic particles, CME parameters, WSA−ENLIL+Cone CME simulations, ICME arrivals, high speed streams, modeled magnetopause crossings, and radiation belt enhancements. All simulations used in this study were conducted in real-time and recorded in DONKI, while some of the observed ICME arrivals at L1, STEREO-A, and STEREO-B were added post-event. To identify observed ICME arrivals that may have been missing or incorrect in DONKI, existing ICME catalogs were compared to the arrivals already documented in the DONKI database. These included the Jian et al. (2006, 2011) Wind/ACE and STEREO ICME catalogs (Jian et al., 2013) (www-ssc.igpp.ucla.edu/forms/stereo/stereo_level_3.html, ftp://stereodata.nascom.nasa.gov/pub/ins_data/impact/level3/README.html), the Richardson & Cane (2010) ICME catalog (www.srl.caltech.edu/ACE/ASC/DATA/level3/icmetable2.html), the International Study of Earth-Affecting Solar Transients catalog (solar.gmu.edu/heliophysics/index.php/The_ISEST_Master_CME_List), and the Nieves-Chinchilla et al. (2016) Wind ICME catalog (wind.nasa.gov/fullcatalogue.php) with circular flux rope model fitting. Any CME arrivals at Earth (L1 spacecraft, e.g. Wind, ACE, and the Deep Space Climate Observatory, DSCOVR), STEREO-A, and STEREO-B from the catalogues that had not been documented in DONKI were added to the database. In the case of any disagreement of arrival times between catalogues or DONKI, the in situ data from ACE, Wind, DSCOVR, and STEREO-A and B were analyzed.
Complications arise in determining in situ ICME arrivals for several reasons including (1) weak arrivals, (2) hybrid events including arrivals of stream interaction regions (SIRs) and CMEs, and (3) CME arrivals with uncertain sources. The first complication occurs, as an example, when a CME has a slow speed and only creates a minimal jump in observed solar wind parameters. In the second case of SIR/CME hybrids, it can be difficult to distinguish the CME arrival time from the jump caused by the SIR in the in situ data. The third complication is likely to arise when there are multiple CMEs predicted to impact the same location at around the same time. The process of matching observed ICMEs to their source eruptions can be difficult.
Simulation information was retrieved from DONKI with a web-service application program interface (API). The DONKI database contains all CCMC space weather team WSA−ENLIL+Cone model runs, each linked to the CME measurement used as model input. Each CME is also linked to its ICME arrival observed at STEREO-A, STEREO-B, or Earth, verified with the ICME catalogues listed above. Arrivals at Mercury, Venus, and Mars are also recorded but are not used in this study. The predicted arrival times listed in DONKI are obtained automatically as the time at which the derivative of the simulated dynamic pressure time series crosses a threshold at each location. The forecaster may also add glancing blow CME arrival predictions by manually assessing the simulation contour plots for cases when a CME arrival is not automatically detected from the time series. The API returns a file in JavaScript Object Notation (JSON) format with all simulation and linked information. We developed automated Python routines to process the JSON by CME, retrieving the “CME analysis” measurements (velocity, direction, width) and the simulation results for each measurement. We calculated hit, miss, false alarm, and correct rejection statistics based on whether a predicted or observed arrival at Earth, STEREO-A, and/or STEREO-B is recorded (see Sect. 4.3). For the hits (CMEs both predicted and observed to arrive), we calculated the time difference, in hours, between the predicted and observed arrival.
This automated verification is complicated by the fact that multiple simulations are often performed for the same CME, either with updated CME input parameters or in conjunction with other CMEs. Each CME analysis measurement can be flagged as either “True” or “False” in DONKI to indicate the best measurement, so filtering out “False” CME analyses and keeping only the final simulation results for each CME eliminates repeated CMEs from the error calculations. In cases when a CME with unchanging parameters is simulated multiple times (which occurs only when the CME is simulated with other CMEs), we only consider the first simulation of the CME that was performed. This helps eliminate any double counting of the same CME in different simulations that could skew our results.
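The filtering just described can be sketched in Python. The record layout and field names below (`cme_id`, `is_best`, `simulations`, `run_time`, `earth_arrival`) are hypothetical stand-ins chosen for illustration, not the actual DONKI JSON schema, and the sketch covers only two of the steps: dropping analyses flagged “False” and keeping the first simulation performed for each remaining CME.

```python
import json

# Hypothetical, simplified records; the real DONKI JSON has its own schema.
SAMPLE = json.loads("""
[
  {"cme_id": "CME-1", "is_best": false, "speed": 800,
   "simulations": [{"run_time": "2012-03-07T03:00",
                    "earth_arrival": "2012-03-09T01:00"}]},
  {"cme_id": "CME-1", "is_best": true, "speed": 950,
   "simulations": [{"run_time": "2012-03-07T09:00",
                    "earth_arrival": "2012-03-08T20:00"},
                   {"run_time": "2012-03-07T06:00",
                    "earth_arrival": "2012-03-08T18:00"}]},
  {"cme_id": "CME-2", "is_best": true, "speed": 600, "simulations": []}
]
""")


def select_simulations(analyses):
    """Return {cme_id: simulation} with at most one simulation per CME."""
    selected = {}
    for analysis in analyses:
        if not analysis["is_best"]:
            continue  # drop measurements flagged "False"
        sims = analysis["simulations"]
        if not sims:
            continue  # measured but never simulated
        # ISO 8601 strings in a single format sort chronologically,
        # so min() picks the first simulation that was performed.
        first = min(sims, key=lambda s: s["run_time"])
        # If several "True" analyses share a CME, keep the first one seen.
        selected.setdefault(analysis["cme_id"], first)
    return selected


chosen = select_simulations(SAMPLE)
print(chosen)
```

Keying the result by CME rather than by simulation is what prevents the same CME from entering the error statistics more than once.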
Another problem can arise for simulations containing multiple CMEs impacting different locations. For example, if two CMEs are modeled together, ENLIL may output a predicted arrival at STEREO-A and at Earth. Only a visual examination of the simulation output can clearly show which CME is predicted to impact each location. The WSA−ENLIL+Cone model results stored in DONKI do not contain information that identifies which CME is predicted to impact each location, but this is a feature that CCMC plans to add. Since the predicted impacts of two CMEs modeled together are connected directly to the model and not also to their respective CMEs, this introduces some uncertainty to the skill score calculations because some correct rejections and misses may be counted multiple times for the same simulation, and some spurious false alarms will be counted. To test the impact of this database ambiguity, we excluded multiple simulations including the same CME (40% of our sample) and most of the skill scores remained within the error bars reported (see discussion in Sect. 4.3 and Fig. 6). Note that this uncertainty does not impact our analysis of hits and CME arrival time error, as each CME is directly linked to an observed arrival in the database.
4 CME arrival time verification
4.1 CME arrival time prediction errors
We computed the CME arrival time prediction error Δt_{err} = t_{predicted} − t_{observed} for hits at all three locations (STEREO-A, Earth, STEREO-B). A hit is defined when a CME is predicted and also observed to arrive. In the case of simulations that had arrival time prediction errors greater than 30 h, the simulation was not counted as a hit, but as a miss. Figure 1 shows the histogram distribution of CME arrival time prediction errors in hours, with negative error indicating early prediction and positive error indicating late prediction. Overall at all locations, we found an average arrival time error of −4.0 h, showing a tendency for early prediction that can be seen in Figure 1. The mean error is −4.1 h at Earth, −4.0 h at STEREO-A, and −3.9 h at STEREO-B. The distribution of errors at STEREO-B is flat compared to Earth and STEREO-A.
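As a minimal sketch of this bookkeeping, the snippet below computes Δt_{err} for a few invented (predicted, observed) arrival pairs, applies the 30 h window, and reports the mean error (bias) and mean absolute error of the remaining hits. The timestamps are made up for illustration, and treating an error of exactly 30 h as a hit is an assumption, since the text does not specify the boundary case.

```python
from datetime import datetime
from statistics import mean

HIT_WINDOW_H = 30.0  # larger errors are counted as misses, not hits


def error_hours(predicted, observed):
    """Delta t_err = t_predicted - t_observed in hours (negative = early)."""
    return (predicted - observed).total_seconds() / 3600.0


def summarize(pairs, window_h=HIT_WINDOW_H):
    """Return (hit_errors, mean_error, mean_absolute_error)."""
    errors = [error_hours(p, o) for p, o in pairs]
    # Assumption: an error of exactly window_h still counts as a hit.
    hits = [e for e in errors if abs(e) <= window_h]
    return hits, mean(hits), mean(abs(e) for e in hits)


# Invented arrival times, all compared to one observed arrival.
observed = datetime(2012, 3, 8, 12, 0)
pairs = [
    (datetime(2012, 3, 8, 8, 0), observed),    # 4 h early
    (datetime(2012, 3, 8, 18, 0), observed),   # 6 h late
    (datetime(2012, 3, 7, 22, 0), observed),   # 14 h early
    (datetime(2012, 3, 10, 0, 0), observed),   # 36 h late: outside window
]
hit_errors, bias, mae = summarize(pairs)
print(hit_errors, bias, mae)
```

The bias keeps the sign of each error, so early and late predictions partly cancel, while the mean absolute error does not; this is why the two statistics are reported separately.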
Figure 2 shows the absolute arrival time error at each location and all together for the entire time period (black). We calculated 95% confidence intervals using a bootstrapping method, resampling 10 000 times with replacement. We found a mean absolute error (MAE) of h at Earth, h at STEREO-A, 12.2 ± 2.1 h at STEREO-B, and 10.4 ± 0.9 h at all locations considered together. The slightly increased arrival time error at STEREO-B may be due to the model being initialized by the oldest magnetogram information among the three locations (although the error bars at each location overlap). As discussed earlier, prediction errors were computed for hits, defined when the CME arrival time error is less than 30 h. To examine the effect of the threshold used to define the hit, we decreased it to 24 and 18 h. As expected, the MAE at all locations decreases to 8.9 ± 0.8 and 7.4 ± 0.7 h, respectively.
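A bootstrap confidence interval of this kind can be sketched with the Python standard library alone. The percentile form of the bootstrap used here is an assumption (the text says only that the data were resampled 10 000 times with replacement), and the error values are invented.

```python
import random


def bootstrap_mae_ci(errors, n_resamples=10_000, confidence=0.95, seed=0):
    """Return (mae, lower, upper) using a percentile bootstrap."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    abs_err = [abs(e) for e in errors]
    n = len(abs_err)
    # Resample with replacement and record the MAE of each resample.
    resampled = sorted(
        sum(rng.choices(abs_err, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    lower = resampled[int(alpha * n_resamples)]
    upper = resampled[int((1.0 - alpha) * n_resamples) - 1]
    return sum(abs_err) / n, lower, upper


# Invented arrival time errors in hours.
errors = [-14.0, -9.5, -6.0, -4.0, -2.5, 1.0,
          3.0, 5.5, 8.0, 12.0, -20.0, -11.0]
mae, lo95, hi95 = bootstrap_mae_ci(errors, confidence=0.95)
_, lo60, hi60 = bootstrap_mae_ci(errors, confidence=0.60)
print(f"MAE = {mae:.2f} h, 95% CI [{lo95:.2f}, {hi95:.2f}], "
      f"60% CI [{lo60:.2f}, {hi60:.2f}]")
```

Lowering the confidence level narrows the interval, which is why two samples whose 95% intervals overlap can still be separated at a lower confidence level.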
In Figure 3 (top), we plot the prediction error in hours against the input CME radial speed in km s^{−1}. For CMEs with input speeds below 1000 km s^{−1}, errors are scattered. However, for CMEs with speeds over 1000 km s^{−1}, most arrival time predictions are early, similar to the results found by Mays et al. (2015a). This could be a sign of the modeled CME having too much momentum as defined by a combination of the input speed and half-width (which is related to the modeled CME mass). The overestimation of the modeled CME velocity compared to in situ observed values is also due to the modeled CME having a lower magnetic pressure than is observed in typical magnetic clouds. If we exclude CME input speeds below approximately 700 km s^{−1} the MAE becomes 8.5 h at Earth, 8.4 h at STEREO-A, and 12.2 h at STEREO-B. Figure 3 (bottom) shows the CME arrival time percent error versus the input CME radial velocity. The percent error is calculated as the ratio of the prediction error to the transit time from 21.5 R_{⊙} to the detecting spacecraft, and this figure shows similar trends. In investigating any correlation between arrival time error and CME input parameters, no significant correlation was found with latitude, longitude, or width. The distribution of input longitudes in Figure 4 roughly correlates with the detecting spacecraft's location. CME input latitudes in Figure 5 (left) are scattered between +60° and −60°, as expected for CMEs that are detected at spacecraft within the ecliptic plane. There is no obvious correlation between error and CME input width in Figure 5 (right).
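The percent error used in Figure 3 (bottom) is a simple ratio, sketched below. The text does not say whether the predicted or the observed transit time forms the denominator, so the function leaves that choice to the caller; the numbers in the example are invented.

```python
def percent_error(error_h, transit_time_h):
    """Arrival time error as a percentage of the transit time.

    error_h        -- t_predicted - t_observed, in hours
    transit_time_h -- travel time from the model inner boundary to the
                      detecting spacecraft, in hours (caller decides
                      whether this is the predicted or observed value)
    """
    if transit_time_h <= 0:
        raise ValueError("transit time must be positive")
    return 100.0 * error_h / transit_time_h


# Invented example: a prediction 4 h early for a 50 h transit.
example = percent_error(-4.0, 50.0)
print(example)
```

Normalizing by transit time puts fast and slow CMEs on a common footing, since a fixed error in hours is a larger fraction of a short transit than of a long one.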
Fig. 1 Distribution of CME arrival time prediction errors at Earth (green), STEREO-A (red), STEREO-B (blue), and all locations (black). The bins are as follows for each of the locations: [−30, −20], [−20, −10], [−10, 0], [0, 10], [10, 20], [20, 30]. The results for each location are distributed laterally within each bin space for clearer presentation. 
Fig. 2 Average absolute error of CME arrival time predictions at Earth, STEREO-A, STEREO-B, and all together for four different time periods. 
Fig. 3 CME arrival time prediction error versus CME input radial speed (top) and CME arrival time percent error versus CME input radial speed (bottom). 
Fig. 4 CME arrival time prediction errors versus CME longitude. 
Fig. 5 CME arrival time prediction errors versus input CME latitude (left) and CME half-width (right). 
4.2 Different spacecraft viewpoint time periods
Figure 2 also shows the error at each location for three different time periods, useful for comparing the arrival time errors during periods of two or three observing spacecraft. The full time period of March 2010–December 2016 is shown in black and was discussed in the previous section. The March 2010–September 2014 period had three coronagraph viewpoints observing CMEs, shown in green. STEREO-A longitudes ranged from 65° to 168° and STEREO-B longitudes from −71° to −161° during this time period. After September 2014, communication with STEREO-B was lost, leaving only two coronagraph viewpoints. From October 2014 through December 2015 (orange), just after the STEREO-B communication loss, STEREO-A underwent sidelobe operations, resulting in very limited STEREO-A coverage, so that effectively only a single spacecraft was available for CME measurement. Normal STEREO-A operations resumed in the last time period of January–December 2016 (brown), but there are only 18 hits during this period. STEREO-A longitudes ranged from −165° to −143° during January–December 2016, which was far from Earth and reduced the effectiveness of dual coronagraph observations.
The average absolute arrival time error at all locations from March 2010–September 2014 with 224 hits (green: three spacecraft) was 10.1 ± 1.0 h. For the 31 hits during the period of STEREO-A sidelobe operations and STEREO-B communication loss (orange: single-viewpoint in effect), the mean absolute arrival-time prediction error is h at all locations. In comparing these two time periods, there is a reduction in skill of h. However, due to the small sample size of only 31 hits in the October 2014–December 2015 (orange) STEREO-A sidelobe operation period, the 95% error bars overlap. For the 18 hits after the end of STEREO-A sidelobe operations, but still without STEREO-B communication from January–December 2016 (brown: two spacecraft), the mean absolute arrival-time error is h. If we combine the two-spacecraft viewpoint periods from October 2014–December 2016 (orange and brown) the error is h, with more events (49) reducing the error bar. Compared to the three-spacecraft period (green) the difference becomes h, still overlapping. The overlapping error bars for these time periods, due to the small number of hits in the two-viewpoint time period, show that this result does not have 95% statistical significance. However, if we reduce our bootstrapped confidence interval to 60%, the difference in arrival time errors for the three-viewpoint (green) vs. single-viewpoint in effect (orange) time period is 1.7 ± 1.2 h (without overlap). This shows a trend for multi-view coronagraph observations improving CME arrival time forecast accuracy with 60% confidence, and that more events are needed to show this with 95% statistical significance.
Similarly, considering only Earth hits, the difference in arrival time errors for the three-viewpoint (green; MAE = 9.6 ± 0.8 h) vs. single-viewpoint in effect (orange; MAE = 11.6 ± 1.4 h) time period is 2.0 ± 0.6 h if the confidence interval is reduced to 60% (the 95% confidence interval overlaps). For the 10 hits at Earth from January–December 2016 the MAE = h. This increased error compared to the sidelobe period could indicate that the STEREO-A viewpoint on the far side did not improve the accuracy of CME measurement. If we combine the two-spacecraft viewpoint periods of October 2014–December 2016 (orange and brown) the error difference with the three-spacecraft period is 2.7 ± 1.9 h (75% confidence interval). When considering only STEREO-A arrivals, the difference in arrival time error between the time period with three viewpoints (green) and the time period with an effective single viewpoint (orange) is h (with overlap). This large error may suggest that there is a reduction in CME arrival time accuracy when there is only a far-side coronagraph (Earth viewpoint) and an extreme ultraviolet instrument (STEREO-A viewpoint). However, there are only four hits in this reduced time period, so this is not conclusive.
For completeness we have examined all of the CME simulations included in our verification study. This included glancing blow CMEs, which are generally more difficult to predict. In the DONKI database, predicted glancing blows are manually entered into the system by the forecaster and are flagged as such. We performed the same analysis as in Section 4.1 excluding predicted glancing blow arrivals, which reduced the hits at all locations from 273 to 183, and found an average absolute CME arrival time error at all locations of 10.2 ± 1.1 h and h at Earth. As expected, the error decreases when glancing blows are excluded, but only slightly.
4.3 CME arrival time skill scores
For the purpose of evaluating our forecasting performance, we defined each simulation as a hit, miss, false alarm, or correct rejection. A hit is an event forecast to occur that did occur. A miss is an event forecast not to occur that did occur. A false alarm is an event forecast to occur that did not occur. A correct rejection is an event forecast not to occur that did not occur. This is summarized in Table 1. In the case of simulations that had a difference between predicted and observed arrival time greater than 30 h, the simulation was not counted as a hit, but as a miss. A variety of skill scores were calculated to evaluate model performance as defined in Table 2. The success and false alarm ratios are conditioned on the forecasts (given that an event was forecast, what was the observed outcome?) and represent the fraction of predicted events that were observed, and were not observed, respectively. The accuracy score is the overall fraction of correct forecasts. The bias score indicates whether the model is overforecasting or underforecasting by measuring the ratio of the frequency of forecast events to the frequency of observed events; it ranges from 0 to infinity, with 1 indicating an unbiased forecast. The probability of detection (POD, or hit rate) and probability of false detection (POFD, or false alarm rate) are conditioned on the observations (given that an event was observed, what is the corresponding forecast?) and represent the fraction of observed events that were predicted and the fraction of observed non-events that were incorrectly predicted as events, respectively. The Hanssen and Kuipers discriminant (HK = POD − POFD), also known as the true skill statistic or Peirce's skill score, ranges from −1 to 1, with 1 being a perfect score and 0 indicating no skill. The HK discriminant measures the ability of the forecast to discriminate between two alternative outcomes and does not rely on climatological event frequency (Jolliffe & Stephenson, 2011).
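The scores described above follow from the four contingency-table counts. The sketch below uses the standard 2×2 contingency-table formulas, which match the verbal definitions in the text; the counts in the example are invented for the arithmetic and are not the values of this study.

```python
from dataclasses import dataclass


@dataclass
class Contingency:
    hits: int
    misses: int
    false_alarms: int
    correct_rejections: int


def skill_scores(c):
    """Standard 2x2 contingency-table skill scores."""
    forecast_yes = c.hits + c.false_alarms
    observed_yes = c.hits + c.misses
    observed_no = c.false_alarms + c.correct_rejections
    total = observed_yes + observed_no
    pod = c.hits / observed_yes            # hit rate
    pofd = c.false_alarms / observed_no    # false alarm rate
    return {
        "success_ratio": c.hits / forecast_yes,
        "false_alarm_ratio": c.false_alarms / forecast_yes,
        "accuracy": (c.hits + c.correct_rejections) / total,
        "bias": forecast_yes / observed_yes,
        "POD": pod,
        "POFD": pofd,
        "HK": pod - pofd,
    }


# Invented counts, chosen only to make the arithmetic easy to follow.
scores = skill_scores(Contingency(hits=40, misses=10,
                                  false_alarms=20, correct_rejections=130))
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because correct rejections enter the accuracy score but not the success ratio, a sample dominated by correctly forecast non-arrivals can show high accuracy alongside a modest success ratio.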
Table 3 shows the number of hits, misses, false alarms, and correct rejections for the real-time simulations run at the CCMC from March 2010 to December 2016. The greater-than symbol before the misses at STEREO-A and STEREO-B indicates that there are more misses than we could fully confirm at these locations. This is because the ICME catalogues for STEREO-A and STEREO-B had not been updated past 2014 at the time of our analysis. They have since been updated through 2016 at the time of this writing.
Figure 6 shows the calculated skill scores defined in Table 2. With the exception of the bias score, the skill scores do not vary greatly between locations. At all locations over the whole time period, the hit rate, or POD, is 0.50, indicating that half of the observed arrivals were correctly predicted. The false alarm rate, or POFD, is 0.10 and is the fraction of observed non-arrivals that were incorrectly predicted. The Hanssen and Kuipers discriminant (HK = POD − POFD) of 0.39 represents how well the prediction is able to separate arrivals from non-arrivals. On the other hand, the success ratio is 0.40 and the false alarm ratio is 0.60. The success and false alarm ratios represent the fraction of predicted CME arrivals that were observed, and were not observed, respectively. The success ratio is less than the false alarm ratio, and both scores are far from the perfect scores of 1 and 0, respectively, showing poor skill for these scores, which are conditioned on predictions. However, because of the spurious false alarms possible from the DONKI database ambiguity discussed in Section 3, when we excluded all simulations of more than one CME (40% of our sample) the success and false alarm ratios became 0.55 and 0.45 respectively, showing some minor improvement. The database ambiguity did not affect any other skill scores to within the error bars shown in Figure 6. The accuracy score at all locations is 0.85 and is closer to the perfect score than the success ratio or hit rate because it is the fraction of correct forecasts overall and the correct rejections are the majority of our forecasts.
The bias score is the ratio of the frequency of forecast events to the frequency of observed events and shows whether the model is underforecasting or overforecasting. We find a bias score of 1.2 for all locations considered together, revealing a tendency to overforecast CME arrivals. There is a slight bias of 1.1 at STEREOA and STEREOB, but the highest bias toward overforecasting is at Earth, with a bias score of 1.3. This may be because the CCMC space weather team more often assesses glancing blows in simulation results that were not automatically detected as CME arrivals for Earthdirected events, as they have the potential for geoeffectiveness. There is thus a human bias toward overpredicting CME arrivals at Earth by modeling more events. Millward et al. (2013) note that forecaster judgment plays a significant role in event selection and that operational model performance assessments will include the moredifficulttoforecast glancing blow events compared to nonrealtime assessments.
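For reference, the verbal descriptions of the skill scores above correspond to the standard contingency-table definitions. Writing $H$, $M$, $FA$, and $CR$ for the numbers of hits, misses, false alarms, and correct rejections (symbols chosen here for illustration; Table 2 holds the authors' own descriptions), they can be sketched as:

```latex
\begin{align}
\mathrm{POD} &= \frac{H}{H + M}, &
\mathrm{POFD} &= \frac{FA}{FA + CR}, &
\mathrm{HK} &= \mathrm{POD} - \mathrm{POFD}, \\
\text{Success ratio} &= \frac{H}{H + FA}, &
\text{False alarm ratio} &= \frac{FA}{H + FA}, \\
\text{Accuracy} &= \frac{H + CR}{H + M + FA + CR}, &
\text{Bias} &= \frac{H + FA}{H + M}.
\end{align}
```

Under these definitions the success ratio and false alarm ratio always sum to 1, consistent with the reported pair of 0.40 and 0.60, and a bias score above 1 means that more arrivals were forecast than observed.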
Table 1 Contingency table.
Table 2 Brief description of skill scores derived from the contingency table. The false alarm rate is also known as the probability of false detection (POFD) and the hit rate as the probability of detection (POD).
Table 3 Hit, miss, false alarm, and correct rejection rates for the WSA−ENLIL+Cone model for the period March 2010–December 2016.
Fig. 6 Success ratio, false alarm ratio, accuracy score, bias score, POD, POFD, and HK (defined in Tab. 2) of total modeled CME events which predict hits at Earth, STEREOA, STEREOB, and all locations combined. Error bars derived from Wilks (2011). 
5 K_{P} prediction verification
The geomagnetic threehour planetary K index (K_{P}), which is derived from groundbased magnetometer data, categorizes geomagnetic activity on a scale from 0 to 9, with 0 being the lowest amount of geomagnetic activity and 9 being the highest (Bartels et al., 1939; Rostoker, 1972; Menvielle & Berthelier, 1991). The time series that the WSA−ENLIL+Cone model outputs at Earth is used as input to a formula derived from the Newell et al. (2007) coupling function to provide a K_{P} forecast time series for three magnetic field clockangle scenarios of 90° (westward), 135° (southwestward), and 180° (southward). Because ENLILmodeled CMEs do not contain an internal magnetic field and the magnetic field amplification is caused mostly by plasma compression, only the magneticfield magnitude is used and the three magnetic field clockangle scenarios are assumed. This provides a simple estimate of three possible maximum values of each time series that the K_{P} index might reach following arrival of the predicted CME shock/sheath. For the forecast, the K_{P} estimates are rounded to the nearest whole number (Mays et al., 2015a). With all three clockangle forecasts, we determined a range from the minimum to the maximum predicted K_{P} and compared it to the maximum observed K_{P} for the three days following the CME arrival. If the maximum observed K_{P} within three days after the observed CME arrival fell within the predicted K_{P} range, we counted a hit. If the maximum observed K_{P} was above the forecast range, we counted a miss. If the maximum observed K_{P} was below the forecast range, we counted a false alarm. For example, if the predicted K_{pmax} for the three clock angle scenarios is 5, 7, 7, and the observed value is 6, this is counted as a hit because 6 falls within the predicted K_{pmax} range of 5–7. The resulting contingency table for K_{P} range is: 47 hits, 12 false alarms, and 12 misses.
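The range-based classification described above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_kp_range` is hypothetical, and treating the range endpoints as inclusive is an assumption, since the text only says the observed value must fall "within" the range.

```python
def classify_kp_range(predicted_kp, observed_kp_max):
    """Classify one K_P forecast against the observed maximum K_P.

    predicted_kp: rounded K_P maxima for the three clock-angle
        scenarios (90, 135, and 180 degrees).
    observed_kp_max: maximum observed K_P in the three days after
        the observed CME arrival.
    """
    low, high = min(predicted_kp), max(predicted_kp)
    if observed_kp_max > high:
        return "miss"          # observed activity exceeded the forecast range
    if observed_kp_max < low:
        return "false alarm"   # observed activity stayed below the forecast range
    return "hit"               # observed value lies inside the range (endpoints assumed inclusive)


# Worked example from the text: predictions 5, 7, 7 and an observed maximum of 6.
print(classify_kp_range([5, 7, 7], 6))
```

Note that this scheme has no "correct rejection" outcome, because every case starts from a predicted and observed arrival.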
There are no correct rejections in this analysis, so we could not calculate an accuracy score. Thus, Figure 7 shows all skill scores from Table 2 except for accuracy, for the different time periods defined in Section 4.1. There were only two predictions for the January–December 2016 time period, so skill scores are not shown. For the entire time period, we found a success ratio of 0.80 and a false alarm ratio of 0.20, meaning that a high fraction of predicted K_{P} ranges contained the observed K_{P} value and a low fraction of predicted K_{P} ranges did not contain the observed value. We found a bias score of 1.0, indicating that we are neither under- nor overforecasting the K_{P} range. The hit rate (POD) of 0.80 shows that a high fraction of the observed K_{P} values were within the K_{P} range.
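As a quick check, the whole-period scores quoted above can be reproduced from the reported counts of 47 hits, 12 false alarms, and 12 misses. The sketch below assumes the standard contingency-table definitions of each score; the function name `kp_range_scores` is illustrative.

```python
def kp_range_scores(hits, false_alarms, misses):
    """Skill scores that do not require correct rejections."""
    predicted = hits + false_alarms   # all forecast events
    observed = hits + misses          # all observed events
    return {
        "success_ratio": hits / predicted,
        "false_alarm_ratio": false_alarms / predicted,
        "bias": predicted / observed,
        "hit_rate": hits / observed,
    }


# Counts reported in the text for the K_P range verification.
scores = kp_range_scores(hits=47, false_alarms=12, misses=12)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

Because the numbers of false alarms and misses are equal, the bias score is exactly 1 and the hit rate equals the success ratio.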
The K_{P} prediction error can be calculated by first taking the maximum of the three K_{pmax} predictions as a single prediction (this is generally the 180° scenario). For example, for the earlier prediction of K_{pmax} = 5, 7, 7, the single maximum K_{pmax} prediction is 7. For all K_{P} predictions, the K_{pmax} mean error is 0.6 ± 0.4, the mean absolute error is 1.7 ± 0.3, and the RMSE is 2.2 ± 0.3. The error bars are computed using a bootstrapping method similar to that used for the arrival time error. In Figure 8, K_{pmax} and K_{pmin} prediction errors are plotted against the CME input speed, at times showing large errors given the range of the index from 0 to 9. We find that there is a general overprediction of K_{pmax} for CME input speeds above ≈1000 km s^{−1}, similar to the results found by Mays et al. (2015a). This overprediction could be due to an overestimation of the CME dynamic pressure at Earth for faster CMEs, as the inserted CMEs have a lower magnetic pressure than is observed, and to the approximation of the CME as a cloud with homogeneous density. We also see that the K_{pmin} forecast underpredicts K_{P} for CME speeds under ≈1000 km s^{−1}, likely when the observed K_{P} is caused by magnetic cloud passage, and not the shock/sheath.
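The three error statistics and a bootstrap error bar can be sketched as follows. Several details are assumptions made for illustration: the error is taken as predicted minus observed (by analogy with the arrival time error), the error bar is taken as the standard deviation of the statistic over resamples drawn with replacement, and the sample values are invented rather than taken from the study.

```python
import math
import random


def mean_error(errors):
    return sum(errors) / len(errors)


def mean_absolute_error(errors):
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def bootstrap_std(errors, statistic, n_resamples=2000, seed=0):
    """Standard deviation of `statistic` over resamples drawn with replacement."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_resamples):
        sample = [rng.choice(errors) for _ in errors]
        values.append(statistic(sample))
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


# Hypothetical errors (predicted minus observed maximum K_P); not study data.
errors = [2, -1, 3, 0, -2, 1, 4, -1]

for label, stat in [("mean error", mean_error),
                    ("mean absolute error", mean_absolute_error),
                    ("RMSE", rmse)]:
    print(f"{label}: {stat(errors):.2f} +/- {bootstrap_std(errors, stat):.2f}")
```

A positive mean error indicates overprediction on average, while the mean absolute error and RMSE measure the size of the errors regardless of sign.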
Instead of using the forecast K_{P} range, a threshold was used to compute the K_{pmax} prediction skill scores in Figure 9. For example, for a threshold of K_{P} = 6, a hit occurs when the forecast K_{pmax} and the observed K_{pmax} are both at or above 6. A correct rejection is counted when the forecast K_{pmax} and the observed K_{pmax} are both below 6. A false alarm is counted when the forecast K_{pmax} is at or above 6 and the observed K_{pmax} is below 6. A miss is counted when the forecast K_{pmax} is below 6 and the observed K_{pmax} is at or above 6. We varied the threshold between 2 and 8 in Figure 9 and computed the resulting skill scores. The figure shows the success ratio, false alarm ratio, accuracy, bias score, hit rate, false alarm rate, and HK as a function of threshold. The skill scores generally decrease in performance as the K_{pmax} threshold increases, except for the false alarm ratio and HK. With events categorized as hits at low thresholds shifting to correct rejections at higher thresholds, the total number of hits decreases. The success ratio, accuracy score, bias score, hit rate, and false alarm rate are all heavily dependent on the number of hits. Only the accuracy score remains relatively unchanged, since it depends on the total number of events (unchanging). This figure shows that the K_{pmax} prediction performs best (where most skill scores perform the best) for an observed K_{pmax} of 5 and above, after which the K_{pmax} is overforecast (bias score) and the false alarm ratio increases.
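The threshold-based classification can be sketched as below. The function names and the list of forecast/observed pairs are hypothetical and serve only to show how events move between categories as the threshold is raised.

```python
def classify_threshold(forecast_kp_max, observed_kp_max, threshold):
    """Classify one forecast/observation pair for a given K_P threshold."""
    forecast_event = forecast_kp_max >= threshold
    observed_event = observed_kp_max >= threshold
    if forecast_event and observed_event:
        return "hit"
    if forecast_event:
        return "false alarm"
    if observed_event:
        return "miss"
    return "correct rejection"


def contingency_counts(pairs, threshold):
    """Tally the four contingency-table categories over all pairs."""
    counts = {"hit": 0, "false alarm": 0, "miss": 0, "correct rejection": 0}
    for forecast, observed in pairs:
        counts[classify_threshold(forecast, observed, threshold)] += 1
    return counts


# Hypothetical (forecast, observed) maximum K_P pairs; not study data.
pairs = [(7, 6), (5, 6), (7, 4), (3, 2), (8, 8)]

for threshold in range(2, 9):
    print(threshold, contingency_counts(pairs, threshold))
```

In this toy sample every pair is a hit at a threshold of 2, whereas at a threshold of 8 four of the five pairs have become correct rejections, which mirrors the shift from hits to correct rejections described in the text.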
Fig. 7 Success ratio, false alarm ratio, bias score, and hit rate skill scores based on whether the observed K_{pmax} falls within the predicted K_{P} range, grouped by forecast time period. 
Fig. 8 K_{pmax} and K_{pmin} prediction error plotted against the CME input speed, at times showing large errors given the range of the index from 0 to 9. There is a general overprediction of K_{pmax} for CME input speeds above ≈1000 km s^{−1}, whereas the K_{pmin} forecast tends to underpredict below ≈1000 km s^{−1}. 
Fig. 9 Success ratio, false alarm ratio, accuracy, bias score, and hit rate skill scores for K_{pmax} forecasts as a function of different thresholds. 
6 Summary and discussion
In this article we evaluate the performance of the WSA−ENLIL+Cone model installed at the CCMC and executed in realtime by the CCMC space weather team from March 2010 to December 2016. The simulations included over 1,800 CMEs, and 273 of these were categorized as hits, i.e., CMEs that were both observed and predicted to arrive. We computed the CME arrival time prediction error Δt_{err} = t_{predicted} − t_{observed} for hits at three locations: Earth, STEREOA, and STEREOB. The average absolute error of the CME arrival time was found to be h at Earth, h at STEREOA, 12.2 ± 2.1 h at STEREOB, and 10.4 ± 0.9 h at all locations considered together. These errors are comparable to arrival time error results from other studies using WSA−ENLIL or other models (see Sect. 1). At all locations over the whole time period, the hit rate is 0.50, indicating that half of the observed arrivals were correctly predicted. The false alarm rate is 0.10 and represents the fraction of observed nonarrivals that were incorrectly predicted.
Overall, at all locations, we found an average arrival time error of −4.0 h, showing a tendency for early predictions. Sources of CME arrival time error include input CME parameters, ambient solar wind prediction accuracy, previous CMEs, input magnetogram limitations/uncertainties, model ambient parameters, and model CME parameters. For example, Temmer et al. (2017) found that interplanetary space takes about 2–5 days to recover to normal background solar wind speed conditions after a transient CME disturbance. We are exploring methods to quantify the effect of the ambient solar wind prediction accuracy on CME arrival time, but this is beyond the scope of this article. The tendency for early predictions can also arise from CME input speeds, as measured from coronagraph data, that are too high because the shock was measured instead of the main CME driver (Mays et al., 2015b). A parametric study by Mays et al. (2015a) found that larger CME half widths also have a small tendency to produce early arrivals. They also report that reducing the ENLIL model parameter of CME cloud density ratio [dcld] may help to predict later arrivals. Overall, the parametric case study shows that after the CME input speed, the cavity ratio [radcav] (radial CME cavity width/CME width) and density ratio assumed in ENLIL have the greatest effects on the predicted CME arrival time.
For Earthdirected CMEs, using the WSA−ENLIL+Cone model outputs to compute an estimate of the geomagnetic K_{P} index, we found a mean K_{pmax} error of 0.6 ± 0.4, a mean absolute error of 1.7 ± 0.3, and an RMSE of 2.2 ± 0.3. The K_{pmax} is generally overpredicted for CME input speeds above ≈1000 km s^{−1}. This overprediction could be due to an overestimation of the CME dynamic pressure at Earth for faster CMEs, as the inserted CMEs have no magnetic pressure other than from the ambient field.
Verification of the singlespacecraft (in effect) period October 2014–December 2015 (without STEREOB and with reduced STEREOA coverage) shows an increase in CME arrival time error of 1.7 ± 1.2 h. Because the error bars for these time periods overlap, owing to the small number of hits in the singleviewpoint (in effect) time period, this result has a statistical significance of only 60% rather than 95%. Nevertheless, we show a trend for multiview coronagraph observations improving CME arrival time forecast accuracy, and more events are needed to show this with 95% statistical significance. By extension, a future space weather mission at L5 or L4 as a second coronagraph viewpoint would reduce CME arrival time errors compared to a single L1 viewpoint. Akioka et al. (2005), Simunac et al. (2009), Gopalswamy et al. (2011), Strugarek et al. (2015), Vourlidas (2015), Lavraud et al. (2016), and Weinzierl et al. (2016) all identify the potential benefit of an L5 mission. Lavraud et al. (2016) propose an L5 mission that would measure the coronal magnetic field using a polarization technique, in addition to whitelight imaging. Weinzierl et al. (2016) show how an L5 mission equipped with a magnetic imager could improve predictions of solar wind, which is particularly important for the WSA–ENLIL+Cone model.
We also discussed various factors that should be taken into consideration when interpreting these verification results. These include uncertainties arising from the determination of CME input parameters from realtime coronagraph data, CME measurements from a variety of forecasters each with their own biases, the identification of ICME arrivals in situ, glancing blow ICME arrivals, and multiple ICME arrivals. In future work, quality factors will be introduced for observed arrivals and for identifying candidate CMEs. We will further quantify model performance by evaluating how well observed in situ solar wind measurements compare to modeled values. We also plan to assess CME arrivals at other locations, such as Mercury (Winslow, 2015) and Mars.
Acknowledgements
This work was completed while A.M.W. was an undergraduate student at American University, now a graduate student at University of Colorado Boulder. A.M.W. and M.L.M. thank Christine Verbeke and the CME Arrival Time and Impact Working Team (ccmc.gsfc.nasa.gov/assessment/topics/heliocmearrival.php) for useful discussions. M.L.M. acknowledges the support of NASA grant NNX15AB80G. L.K.J. acknowledges the support of NASA’s STEREO project, NSF grants AGS 1321493 and 1259549. D. Odstrcil acknowledges the support of NASA LWSSC NNX13AI96G. The WSA model was developed by N. Arge (now at NASA/GSFC), and the ENLIL model was developed by D. Odstrcil (now at GMU). Estimated realtime planetary indices are from NOAA as archived on CCMC’s iSWA system (ccmc.gsfc.nasa.gov/iswa). The editor thanks Ward Manchester and an anonymous referee for their assistance in evaluating this paper.
References
 Akioka M, Nagatsuma T, Miyake W, Ohtaka K, Marubashi K. 2005. The L5 mission for space weather forecasting. Adv Space Res 35: 65–69. Mars International Reference Atmosphere, Living With a Star and Fundamental Physics. DOI:10.1016/j.asr.2004.09.014.
 Arge CN, Pizzo VJ. 2000. Improvement in the prediction of solar wind conditions using nearreal time solar magnetic field updates. J Geophys Res 105: 10465–10480. DOI:10.1029/1999JA000262.
 Arge CN, Luhmann JG, Odstrčil D, Schrijver CJ, Li Y. 2004. Stream structure and coronal sources of the solar wind during the May 12th, 1997 CME. J Atmos SolTerr Phys 66: 1295–1309. DOI:10.1016/j.jastp.2004.03.018.
 Bartels J, Heck NH, Johnston HF. 1939. The threehourrange index measuring geomagnetic activity. Terr Magn Atmos Electr (J Geophys Res) 44: 411. DOI:10.1029/TE044i004p00411.
 Colaninno RC, Vourlidas A, Wu CC. 2013. Quantitative comparison of methods for predicting the arrival of coronal mass ejections at Earth based on multiview imaging. J Geophys Res (Space Phys) 118: 6866–6879. DOI:10.1002/2013JA019205.
 Domingo V, Fleck B, Poland AI. 1995. The SOHO mission: an overview. Sol Phys 162: 1–37.
 Emmons D, Acebal A, Pulkkinen A, Taktakishvili A, MacNeice P, Odstrčil D. 2013. Ensemble forecasting of coronal mass ejections using the WSA−ENLIL with CONED Model. Space Weather 11: 95–106. DOI:10.1002/swe.20019.
 Gopalswamy N, Davila J, Cyr OS, Sittler E, Auchère F, et al. 2011. EarthAffecting Solar Causes Observatory (EASCO): a potential international living with a star Mission from SunEarth L5. J Atmos SolTerr Phys 73: 658–663. DOI:10.1016/j.jastp.2011.01.013.
 Jian L, Russell CT, Luhmann JG, Skoug RM. 2006. Properties of interplanetary coronal mass ejections at one AU during 1995–2004. Sol Phys 239: 393–436. DOI:10.1007/s1120700601332.
 Jian LK, Russell CT, Luhmann JG, Galvin AB, Simunac KDC. 2013. Solar wind observations at STEREO: 2007–2011. Sol Wind 13 1539: 191–194. DOI:10.1063/1.4811020.
 Jian LK, Russell CT, Luhmann JG, MacNeice PJ, Odstrčil D, Riley P, Linker JA, Skoug RM, Steinberg JT. 2011. Comparison of observations at ACE and Ulysses with Enlil model results: stream interaction regions during Carrington rotations 2016–2018. Sol Phys 273: 179–203. DOI:10.1007/s1120701198587.
 Jolliffe I, Stephenson D. 2011. Forecast verification: a practitioner’s guide in atmospheric science, 2nd edn. Wiley, New Jersey, USA.
 Kaiser ML, Kucera TA, Davila JM, Cyr OC St, Guhathakurta M, Christian E. 2008. The STEREO mission: an introduction. Space Sci Rev 136: 5–16. DOI:10.1007/s1121400792770.
 Lavraud B, Liu Y, Segura K, He J, Qin G, et al. 2016. A small mission concept to the SunEarth Lagrangian L5 point for innovative solar, heliospheric and space weather science. J Atmos SolTerr Phys 146(Supplement C): 171–185. DOI:10.1016/j.jastp.2016.06.004.
 Lee CO, Arge CN, Odstrčil D, Millward G, Pizzo V, Quinn JM, Henney CJ. 2013. Ensemble modeling of CME propagation. Sol Phys 285: 349–368. DOI:10.1007/s1120701299801.
 Mays ML, Taktakishvili A, Pulkkinen A, MacNeice PJ, Rastätter L, et al. 2015a. Ensemble modeling of CMEs using the WSAENLIL+Cone model. Sol Phys 290: 1775–1814. DOI:10.1007/s1120701506921.
 Mays ML, Thompson BJ, Jian LK, Colaninno RC, Odstrčil D, et al. 2015b. Propagation of the 7 January 2014 CME and resulting geomagnetic nonevent. Astrophys J 812: 145. DOI:10.1088/0004637X/812/2/145.
 Menvielle M, Berthelier A. 1991. The Kderived planetary indices − description and availability. Rev Geophys 29: 415–432. DOI:10.1029/91RG00994.
 Millward G, Biesecker D, Pizzo V, Koning CA. 2013. An operational software tool for the analysis of coronagraph images: determining CME parameters for input into the WSAEnlil heliospheric model. Space Weather 11: 57–68. DOI:10.1002/swe.20024.
 Möstl C, Amla K, Hall JR, Liewer PC, De Jong EM, et al. 2014. Connecting speeds, directions and arrival times of 22 coronal mass ejections from the Sun to 1 AU. Astrophys J 787: 119. DOI:10.1088/0004637X/787/2/119.
 Möstl C, Rollett T, Frahm R, Liu Y, Long D, et al. 2015. Strong coronal channeling and interplanetary evolution of a solar storm up to Earth and Mars. Nat Commun 6: 7135. DOI:10.1038/ncomms8135.
 Newell PT, Sotirelis T, Liou K, Meng CI, Rich FJ. 2007. A nearly universal solar windmagnetosphere coupling function inferred from 10 magnetospheric state variables. J Geophys Res 112: A01206. DOI:10.1029/2006JA012015.
 NievesChinchilla T, Linton MG, Hidalgo MA, Vourlidas A, Savani NP, Szabo A, Farrugia C, Yu W. 2016. A circularcylindrical fluxrope analytical model for magnetic clouds. Astrophys J 823: 27. DOI:10.3847/0004637X/823/1/27.
 Odstrčil D. 2003. Modeling 3D solar wind structure. Adv Space Res 32: 497–506. DOI:10.1016/S02731177(03)003326.
 Odstrčil D, Pizzo VJ. 1999a. Threedimensional propagation of CMEs in a structured solar wind flow: 1. CME launched within the streamer belt. J Geophys Res 104: 483–492. DOI:10.1029/1998JA900019.
 Odstrčil D, Pizzo VJ. 1999b. Threedimensional propagation of coronal mass ejections in a structured solar wind flow 2. CME launched adjacent to the streamer belt. J Geophys Res 104: 493–504. DOI:10.1029/1998JA900038.
 Odstrčil D, Smith Z, Dryer M. 1996. Distortion of the heliospheric plasma sheet by interplanetary shocks. Geophys Res Lett 23: 2521–2524. DOI:10.1029/96GL00159.
 Odstrčil D, Riley P, Zhao XP. 2004. Numerical simulation of the 12 May 1997 interplanetary CME event. J Geophys Res (Space Phys) 109: A02116. DOI:10.1029/2003JA010135.
 Richardson IG, Cane HV. 2010. NearEarth interplanetary coronal mass ejections during solar cycle 23 (1996–2009): catalog and summary of properties. Sol Phys 264: 189–237. DOI:10.1007/s1120701095686.
 Rostoker G. 1972. Geomagnetic indices. Rev Geophys Space Phys 10: 935–950. DOI:10.1029/RG010i004p00935.
 Simunac KDC, Kistler LM, Galvin AB, Popecki MA, Farrugia CJ. 2009. In situ observations from STEREO/PLASTIC: a test for L5 space weather monitors. Ann Geophys 27: 3805–3809. DOI:10.5194/angeo2738052009. https://www.anngeophys.net/27/3805/2009/
 Strugarek A, Janitzek N, Lee A, Löschl P, Seifert B, et al. 2015. A space weather mission concept: Observatories of the Solar Corona and Active Regions (OSCAR). J Space Weather Space Clim 5: A4. DOI:10.1051/swsc/2015003.
 Taktakishvili A, Kuznetsova M, MacNeice P, Hesse M, Rastätter L, Pulkkinen A, Chulaki A, Odstrčil D. 2009. Validation of the coronal mass ejection predictions at the Earth orbit estimated by ENLIL heliosphere cone model. Space Weather 7: S03004. DOI:10.1029/2008SW000448.
 Temmer M, Reiss MA, Nikolic L, Hofmeister SJ, Veronig AM. 2017. Preconditioning of interplanetary space due to transient CME disturbances. Astrophys J 835: 141. http://stacks.iop.org/0004637X/835/i=2/a=141.
 Vourlidas A. 2015. Mission to the SunEarth L5 Lagrangian point: an optimal platform for space weather research. Space Weather 13: 197–201. DOI:10.1002/2015SW001173.
 Vršnak B, Temmer M, Žic T, Taktakishvili A, Dumbović M, Möstl C, Veronig AM, Mays ML, Odstrčil D. 2014. Heliospheric propagation of coronal mass ejections: comparison of numerical WSAENLIL+Cone model and analytical dragbased model. Astrophys J Suppl Ser 213: 21. DOI:10.1088/00670049/213/2/21.
 Weinzierl M, Mackay DH, Yeates AR, Pevtsov AA. 2016. The possible impact of L5 magnetograms on nonpotential solar coronal magnetic field simulations. Astrophys J 828: 102. DOI:10.3847/0004637X/828/2/102.
 Wilks D. 2011. Statistical methods in atmospheric sciences: an introduction. Academic Press, Massachusetts, USA.
 Winslow RM. 2015. Interplanetary coronal mass ejections from MESSENGER orbital observations at Mercury. J Geophys Res 120: 6101–6118. DOI:10.1002/2015JA021200.
 Xie H, Ofman L, Lawrence G. 2004. Cone model for halo CMEs: application to space weather forecasting. J Geophys Res (Space Phys) 109: A03109. DOI:10.1029/2003JA010226.
 Zhao XP, Plunkett SP, Liu W. 2002. Determination of geometrical and kinematical properties of halo coronal mass ejections using the cone model. J Geophys Res (Space Phys) 107: 1223. DOI:10.1029/2001JA009143.
Cite this article as: Wold AM, Mays ML, Taktakishvili A, Jian LK, Odstrcil D, MacNeice P. 2018. Verification of realtime WSA−ENLIL+Cone simulations of CME arrivaltime at the CCMC from 2010 to 2016. J. Space Weather Space Clim. 8: A17