Issue 
J. Space Weather Space Clim.
Volume 5, 2015



Article Number  A7  
Number of page(s)  9  
DOI  https://doi.org/10.1051/swsc/2015008  
Published online  09 March 2015 
Research Article
Solar wind driven empirical forecast models of the time derivative of the ground magnetic field
^{1}
Swedish Institute of Space Physics, Scheelevägen 17, 223 70
Lund, Sweden
^{2}
Finnish Meteorological Institute, Erik Palménin Aukio 1, 00560
Helsinki, Finland
^{*} Corresponding author: peter@lund.irf.se
Received:
29
October
2014
Accepted:
16
February
2015
Empirical models are developed to provide 10–30min forecasts of the magnitude of the time derivative of local horizontal ground geomagnetic field (dB_{h}/dt) over Europe. The models are driven by ACE solar wind data. A major part of the work has been devoted to the search and selection of datasets to support the model development. To simplify the problem, but at the same time capture sudden changes, 30min maximum values of dB_{h}/dt are forecast with a cadence of 1 min. Models are tested both with and without the use of ACE SWEPAM plasma data. It is shown that the models generally capture sudden increases in dB_{h}/dt that are associated with sudden impulses (SI). The SI is the dominant disturbance source for geomagnetic latitudes below 50° N and with minor contribution from substorms. However, at occasions, large disturbances can be seen associated with geomagnetic pulsations. For higher latitudes longer lasting disturbances, associated with substorms, are generally also captured. It is also shown that the models using only solar wind magnetic field as input perform in most cases equally well as models with plasma data. The models have been verified using different approaches including the extremal dependence index which is suitable for rare events.
Key words: Solar wind / Ground magnetic field / Forecast
© P. Wintoft et al., Published by EDP Sciences 2015
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
The main factor for space weatherrelated effects on electrical power grids are geomagnetically induced currents (GIC). The effects of GIC can either be immediate, leading to voltage fluctuations, tripping of relays, heating and possible loss of transformers (Bolduc 2002) or cumulative with the buildup of damaging levels of gasses in the transformer oil (Gaunt 2014). Several studies have been carried out that investigate space weather events, and related GIC and technological effects (Lundstedt 2006; Wik et al. 2009; Schrijver & Mitchell 2013).
The distribution and magnitude of GIC in a power grid are determined by a geophysical component (electric fields, ground conductivity) and a technological component (grid topology, resistances) (Pirjola et al. 2004). Due to limited direct measurement of the electric field, such as measurements carried out at a few locations in the UK (private comm. A. Thomson) and Hungary (Kis et al. 2007), techniques have been developed to compute the Efields from observed geomagnetic fields such as the complex image method (Boteler & Pirjola 1998), thinsheet approximation (Beggan et al. 2013), finite element method (Dong et al. 2013). In this work we have used the spherical elementary current system (SECS) approach (Viljanen et al. 2004).
The aim of this work is to develop models predicting a quantity that can be related to electric fields using upstream solar wind data. Empirical and data driven modelling have been applied to many different geomagnetic indices, originating from the work by Burton et al. (1975) for a solar wind – Dst model. The data driven approach to modelling assumes some form of the inputoutput mapping function, either from basic physical reasoning, or from a generic function, and finds the unknown coefficients from observed data. The target output should capture some essential feature of the phenomena under study, e.g. indices for various geomagnetic and ionospheric processes (Mayaud 1980). Previous studies have shown that there is a good relation between rateofchange of the magnetic field dB/dt and calculated electric field and measured GIC (Viljanen et al. 2001; Wintoft 2005), where the temporal resolution of B should be 1 min or better to capture the dominant variability in dB/dt (Wintoft 2005). It is not feasible to predict the detailed variation of dB/dt at 1min resolution, therefore a summary measure is adopted. The relationship between the envelopes of geomagnetic disturbances and GIC is stronger than between the detailed variations themselves (Trichtchenko & Boteler 2004). In Wintoft (2005) the 10min rootmeansquare (RMS) of the magnitude of the horizontal derivate (dB_{h}/dt) was used. In this work a similar approach is used based on 30min maximum values driven by 1min realtime solar wind data.
2. Forecast parameter, data selection and models
2.1. Datasets
The data consist of ground geomagnetic field measurements, and solar wind plasma and magnetic field. Geomagnetic data, at 1min resolution, are obtained from World Data Centre for Geomagnetism (Edinburgh).^{1} One minute differences are formed as an approximation to the time derivative (dB_{x}/dt, dB_{y}/dt, dB_{z}/dt) in the XYZ coordinate system, where X points north, Y east and Z down. The data quality is generally high, however, clearly erroneous dB/dt values, with 1–2 min duration, are replaced with interpolated values. These appear as single spikes in dB/dt associated with sudden shift in baseline of B from 1 min to the next, and as double spikes with different signs in dB/dt associated with a single spike in B at a single station during quiet periods.
The solar wind plasma and magnetic field data come from two instruments on the Advanced Composition Explorer (ACE) spacecraft (Stone et al. 1998). The SWEPAM instrument (McComas et al. 1998) provides the bulk density and velocity with 64 s resolution. The magnetic field instrument (Smith et al. 1998) has a resolution of 16 s. The ACE Level 2 data are obtained from The ACE Science Center web site.^{2} Both plasma and magnetic field are resampled to 1min resolution and data gaps of typically less than 15 min are replaced by linearly interpolated values. Data gaps are more common in the density while the magnetic field data have very few data gaps.
2.2. Time derivative of horizontal geomagnetic field
We make the approximation that the horizontal electric field E is proportional to dB_{h}/dt, i.e. (E_{x}, E_{y}) ∝ (dB_{y}/dt, dB_{x}/dt). This step neglects the ground conductivity which affects both the amplitude and power spectrum of the electric field (Cagniard 1953; Viljanen et al. 2012). However, the simplification is still reasonable as shown in Section 2.5. The magnitude of the horizontal rateofchange, defined here as e_{h}, is formed$${e}_{h}=\left\mathrm{d}{\mathit{B}}_{h}/\mathrm{d}t\right=\sqrt{(\mathrm{d}{B}_{x}/\mathrm{d}t{)}^{2}+(\mathrm{d}{B}_{y}/\mathrm{d}t{)}^{2}}\propto {E}_{h}=\left{\mathbf{E}}_{h}\right.$$(1)
2.3. Temporal filtering
During events associated with sudden impulses (SI) and geomagnetic storms the changes in e_{h} are often abrupt with most of the variability taking place for time scales less than 30 min, and with the peak in the power spectrum around 1 min or less (Wintoft 2005). From 10second magnetic field data it can be shown that the peak in power spectrum is at, or slightly below, 1 min indicating that a sampling rate of 30 s (Nyquist rate) or better should be used in order to resolve the largest peaks in dB/dt. However, using 1min resolution data in estimating E works reasonably well (Pulkkinen et al. 2006) although the amplitude of E will be slightly underestimated. A similar variability is seen in measured GIC at a location in southern Sweden. The response at other locations will be different due to geographical differences in ionospheric and magnetospheric processes, and ground conductivity.
Typical prediction lead times range from about 30 min, or less, to a couple of hours depending on the dominant geophysical response. The shortest lead times are associated with solar wind shocks and other sudden increases in pressure which creates SI’s, while a longer lead time may be possible depending on local time arrival of the solar wind disturbance and the following geomagnetic storm. In this work a constant prediction lead time of 30 min is used.
From the two considerations above, temporal forward moments are formed$${{e}_{h}^{\left(a\right)}\left(t\right)=\left(\frac{1}{30}\sum _{{t}^{\text{'}}=t+1}^{t+30}\mathrm{}{e}_{h}^{a}\left({t}^{\mathrm{\prime}}\right)\right)}^{\frac{1}{a}},$$(2)where t is time in minutes. The first two moments are the arithmetic mean (a = 1) and the rootmeansquare (a = 1), respectively. Higher moments will give higher weight to large values in e_{h}. In the extreme case with a → ∞ it becomes the 30min maximum, i.e. . Applying the average (a = 1), used in e.g. Weigel et al. (2002), will result in strong damping of the signal, more strongly for shortlived features like SI’s. Another aspect is that for a sudden increase in e_{h} there will be a gradual increase in . At the other extreme, with a → ∞, the response will be immediate to any increases in e_{h}. Thus, an SI will be captured by , but it will not resolve at what time within the 30min interval the maximum occurs. For different types of activities, like SI’s, substorms or pulsations, will follow the envelope of the disturbance but the temporal location of individual peaks within 30min intervals will not be resolved. As mentioned in the Introduction, this work is based on the 30min maximum of dB_{h}/dt, i.e. . The 30min maximum dB_{h}/dt was also used in the study by Schrijver & Mitchell (2013).
2.4. Data selection
Table 1 shows the 90% and 99% percentiles, and the maximum of (30min maximum dB_{h}/dt) for all available data for stations with good temporal coverage. The stations are ordered with increasing geomagnetic latitude. It is seen that 90% of the time is below 10 nT/min, except for the very high latitude stations. At the 99% level it is still quite small and typically less than 20 nT/min up to geomagnetic latitudes of 55° N. The maximum value for each station range from about 100 nT/min to close to 3000 nT/min. Any detailed statistical comparison between stations is not possible in this table as they cover different time spans and have various degree of data gaps, and therefore are subject to different levels of geomagnetic activity.
The table lists for each station (Id) the geomagnetic latitude (Lat), the start and end dates for the period when data exist, the number of 30min samples (N), and the 90% and 99% percentiles, and the maximum of for all available data in units of nT/min. The magnetic coordinates are based on IGRF for the year 2010. Note that the statistics is only meaningful per station basis and that comparison between stations cannot be done as the they cover different periods and have varying degree of data gaps.
In Table 2, a subset of stations is selected for which there are good temporal overlap between stations. Further, only samples for which data exist simultaneously for all stations have been included. Therefore, the statistics formed for this set will include data for which all stations are subject to the same geomagnetic events. Generally, increases with increasing geomagnetic latitude, although there are a few exceptions to the general trend. The table also lists the computed conductances for different integration depths of 40, 80 and 160 km, respectively, where the ground resistivity used are those given in the map by Adam et al. (2012). It is seen that, to some degree, stations at similar geomagnetic latitudes have larger observed dB_{h}/dt when the conductance is higher. Of course, the individual stations may be subject to more localised ground conductance effects than those captured by the map. The final column shows the p coefficient from Eq. (4) and is described in the next section.
The table lists for each station (Id) the geomagnetic latitude (Lat), and the 90% and 99% percentiles, and the maximum of for all available data. The computed conductances in log 10(S) are shown using three different integration depths: 40 km (S40), 80 km (S80) and 160 km (S160). The coefficient p for the linear relation with electric field is also shown in units of (mV/km)/(nT/min). Only data with simultaneous 30min intervals are chosen. There are N = 134,452 30min samples covering the period 19990106 to 20081231.
The database is searched for events with increased activity in e_{h}. To specify criteria that defines an event is not a trivial task. The start time of an event could be defined as the time of the SI. However, not all storms are associated with a clear SI, and the identification of SI’s is complex and their is no real consensus on how to automatically identify them (Curto et al. 2007). Instead, we adopt an eventsearch algorithm as described in the following paragraph.
The first step is to find the first timestamp for which e_{h} exceeds a threshold e_{high.} The timestamps marking the start and end of event one are selected as the timestamps before when e_{h} is below a threshold e_{low}. However, as single 1min e_{h} values may fall close to zero even within an active period some temporal filtering should be applied. The autocorrelation of e_{h} indicates that there is a quite rapid decay in activity, which is latitude dependent, and has fallen below 1/e ≈ 0.37 at around 150 min or less. The latitude dependent decay is the result of different dominant processes, at high latitudes the substorms dominate, while at low latitudes the shortlived SI’s dominate. Guided by this, temporal maximum filtering with a time window of 3 h = 180 min is applied to e_{h}. Therefore, an event is defined as the time interval containing at least one value m(t) = max{e_{h}(t), …, e_{h}(t + 180)} exceeding a threshold value e_{high} and falling below e_{low} outside the event, where t is in minutes. Thus, the filtering is a 3h forward maximum. The search algorithm can be summarised as$${t}_{i}^{\mathrm{*}}=\mathrm{min}\left\{t;m\left(t\right){e}_{\mathrm{high}}\forall t{t}_{i1}^{\left(b\right)}\right\},$$ $${t}_{i}^{\left(a\right)}=\mathrm{max}\left\{t;m\left(t\right){e}_{\mathrm{low}}\forall t{t}_{i}^{\mathrm{*}}\right\},$$ $${t}_{i}^{\left(\mathrm{b}\right)}=\mathrm{min}\left\{t;m\left(t\right){e}_{\mathrm{low}}\forall t{t}_{i}^{\mathrm{*}}\right\},$$(3)where the subscript i (i ≥ 1) refers to event i, is the first time the upper threshold is exceeded for event i, and and are the start and end times, respectively. For the first event, i = 1, the limit is set to the first time stamp in the dataset which reduces the first expression to . The start and end times for the first event, and , respectively, can then be determined. The other events are then found for i > 1. The classification of an event is illustrated in Figure 1.
Fig. 1. Illustration of the 3hour maximum filtering (m(t), red curve), thresholds, and the event timestamps , and . The example shows a period in November 2003 for the ESK station (blue curve). 
The midlatitude station ESK is chosen as the reference to define the events. Generally, an event at ESK is accompanied with events at the other locations, although their amplitudes are different. There might also be events that fall below the ESK threshold that would have been selected if another station would be used as reference. Setting the limits to e_{low} = 6.4 nT/min and e_{high} = 48 nT/min results in 185 events with an average length of 38 h, ranging from 9 to 96 h. The lower limit and upper limit correspond to the 90% and 99.8% percentiles, respectively.
The target data are based on data from the stations in Table 2. However, the ACE spacecraft further limits the amount of available data due to its shorter temporal coverage and due to data gaps. When only ACE magnetic field data are used the number of available events is reduced to 80. When the density and velocity also are included the number of events decreases to 58. Another aspect is that there are more events with large e_{h} for the ACE magnetic fieldonly set; e.g. for the magnetic field set there are 19 events with a maximum e_{h} above 100 nT/min, while for the plasma set there are only 10 events. The main reason for this is the lack of data from the SWEPAM instrument during strong proton events that in turn are the result of fast CMEs generating the strongest geomagnetic storms.
2.5. The geoelectric field
The electric field E cannot be directly calculated from using the method by Viljanen et al. (2012). However, an empirical relation can be derived between observed and the calculated 30min maximum E_{h} of the horizontal electric field. We call this quantity .
We applied the EURISGIC ground conductivity model (Adam et al. 2012) to calculate the electric field in 1996–2008 at selected points in Europe using the method by Viljanen et al. (2012). Figure 2 shows the relation between and at four observatories in October 2003. We note that within each 30min interval the maximum E_{h} is not necessarily simultaneous with the maximum dB_{h}/dt. However, there is a clear linear dependence at all sites. Therefore, we express the maximum electric field as$${E}_{h}^{\left(\mathrm{\infty}\right)}=p\cdot {e}_{h}^{\left(\mathrm{\infty}\right)},$$(4)where p is an empirical coefficient specific for each site. This coefficient depends on the local ground conductivity model.
Fig. 2. Maximum of the modelled 1min horizontal electric field versus the maximum of dB_{h}/dt for 30min sequences 00:00–00:29, 00:30–00:59, … UT in October 2003. Top left: ABK, top right: UPS, bottom left: BFE, bottom right: FUR. C is the linear correlation coefficient, p(1) is the slope of the fitted linear curve. 
The coefficient p is determined at individual stations for each month using least squares fit. The variation in p from month to month is rather small, so it is reasonable to use a single value for each site, for which we select the average of monthly values. In Table 2 the coefficients and one standard deviations are shown for the selected stations.
2.6. Models
The models mapping from solar wind to consist of Elman neural networks (Elman 1990) and have previously been used for similar problems (Lundstedt et al. 2001; Wintoft et al. 2005). We will not go into the details of the models here, but instead refer to the mentioned papers and references therein. The Elman network can learn not only structures that are spatially related but also structures over time through its internal memory. This is an important feature when the system under study exhibits memory, like the solar wind coupling to geomagnetic storms and substorms. It should also be noted that the temporal resolution of the network memory decreases for more distant features. Therefore, a tapped delay line of the inputs is applied to fully resolve features that are in the near past. The network performs the following mapping$$\mathit{u}\left(t\right)=\left({B}_{y}\left(t\right),{B}_{z}\left(t\right),B\left(t\right),n\left(t\right),V\left(t\right)\right),$$(5) $$y\left(t\right)=f\left({g}_{s}\left(t\right),{g}_{c}\left(t\right),\mathit{u}\left(t\right),\mathit{u}\left(t\Delta \right),\dots ,\mathit{u}\left(t\left({n}_{\Delta}1\right)\Delta \right)\right)=f\left({x}_{1}\left(t\right),{x}_{2}\left(t\right),\dots ,{x}_{n}\left(t\right)\right),$$(6)where g_{s} and g_{c} are the sine and cosine of local time; B_{y}, B_{z}, are the ACE magnetic field y and z components in GSM; B is the magnetic field magnitude; and n and V are the plasma density and velocity. The tapped delay line has n_{Δ} terms, each separated with Δ minutes. Each input is also labelled as x_{i}(t) where the total number of inputs is n. The choice of values of n_{Δ} and Δ is described below. The g_{s} and g_{c} components are included to capture local time variation. The mapping, with sine and cosine functions, ensures that two instances close in time are numerically close, and that a specific timestamp is uniquely described. Models are also developed without the plasma n and V.
The Elman network implements the mapping$${z}_{j}\left(t\right)=\mathrm{tanh}\left(\sum _{i=1}^{n}\mathrm{}{w}_{j,i}{x}_{i}\left(t\right)+\sum _{k=1}^{m}\mathrm{}{c}_{j,k}{z}_{k}\left(t1\right)+{a}_{j}\right),$$(7) $$y\left(t\right)=\sum _{j=1}^{m}\mathrm{}{v}_{j}{z}_{j}\left(t\right)+b.$$(8)
The free parameters are the number of hidden neurons m, the weights w_{j,i}, c_{j,k}, and biases a_{j} and b. The optimisation regarding the free parameters are done in the same way as described in Wintoft (2005).
For an average L1Earth distance of 1.5 × 10^{6} km the solar wind bulk flow will have typical travel times ranging from 60 min to 25 min for velocities from 400 km/s to 1000 km/s. However, one should note that for events with shocks, the bulk velocity and the shock speed are not the same (Viñas & Scudder 1986). There are also extreme cases, like on 29th October 2003 when the solar wind magnetic disturbance (no velocity data were available) preceded the following SI with only 13 min, indicating a shock speed close to 2000 km/s.
The target output for the model is (Eq. (2)). As is a 30min forward maximum any geomagnetic disturbance at t + 30 min will be reflected at timestamp t in . For most solar wind velocities a prediction lead time of 30 min is reasonable, therefore, the model output y(t) is targeted to predict . However, as noted above, longer delays between the solar wind disturbance and the geomagnetic response are also possible and the delay line is set to (n_{Δ}, Δ) = (4, 10 min), in order to have a detailed description of the solar wind state for the past 30 min. It should also be noted that events with an L1Earth travel time less than 30 min will be predicted, if the model works, but that the lead time will be shorter, in the extreme cases just 10 min.
2.7. Training and validation
The dataset is divided into three subsets, each set having similar statistics with respect to . One set is used for training, i.e. the weights and biases are incrementally adjusted in order to minimise the errors between target and model outputs. A large number of networks with different initial weight values and different numbers of weights are trained. A second set, the validation set, is used to monitor the performance, and the network with the minimum validation error is selected as the optimal network. The third set is used for verification.
3. Results
A large number of networks, with different initial weight values and number of hidden units, have been explored through the trainingvalidation process, leading to one optimal network per station. However, there is generally several networks that have meansquareerrors (MSE) that are similar to the MSE of the optimal network. We focus the analysis on the ESK station. We discuss the result for ESK model and explore different approaches to judge the skill of the model.
We start with comparing the model output with observed data for two different events (Fig. 3). For both events, models using solar wind magnetic fields only (MFI), and plasma and magnetic fields (SWE + MFI), are compared.
Fig. 3. Two cases with forecasts using events not in the training set. The black curves are the observed , the green curves the forecast from the SWE + MFI model, and the red curves from the MFIonly model. Note that the two events have very different magnitudes. 
The first event (19990922) is of moderate magnitude with a peak observed nT/min. There is an SI around noon reaching close to 30 nT/min which is followed by the stronger substorm activity in the evening. In this case the MFIonly model quite accurately predicts the time of the SI (see insets in figures) but overpredicts the magnitude by 70%, while the SWE + MFI model shows a slower response, although the forecast is within ≈20% from the observed with a lag of about 10 min. The predicted activity during the afternoon drops slightly but not as much as the observed. The evening and nighttime activity is to some extent captured by the models although individual peaks are offset in time.
The second event (20031119) is much stronger, with peak magnitude of 300 nT/min. The SI in the morning of the 20th reaching close to 50 nT/min is accurately predicted in time and only overpredicted by 10% for the MFIonly model, and slightly underpredicted for the SWE + MFI model. The activity then settles down somewhat and is slightly better traced by the SWE + MFI model compared to the MFIonly model. The onset of activity at 13 UT is probably due to a sudden density fluctuation in the solar wind. The SWE + MFI model captures this, although overforecasting with a factor of two. The MFIonly model also seems to capture this, however, in this case it is a result of the strong solar wind magnetic field. Later in the afternoon/evening the substorm activity sets in which is predicted by both models, although the timing of the SWE + MFI model is better. However, the peak magnitude is underpredicted by a factor of two.
It is interesting to see, for these two cases, that the MFIonly model is able to capture the onset of the SI, as the SI is related to the plasma pressure pulse in the solar wind. However, a solar wind pressure pulse is almost always associated with a sharp increase in the solar wind magnetic field, and this correlation is picked up by the model. Another aspect is that the MFI data are generally of higher quality and with fewer datagaps compared to the plasma data.
We now turn to the problem of verifying the models in the general case. We run the models for all available data for the period January 1998–December 2010. We do not only run the models on the selected events, as there always are some subjective assumptions that will affect the verification. The rootmeansquareerror RMSE and the prediction efficiency PE (which is a MSEbased skill score using observed variance as reference model) are computed for the SWE + MFI model and a combined model, shown in Table 3. The combined model uses SWE + MFI when the SWE data exist, and the MFI model when only MFI data exist. For comparison the measures are also computed for two simple models: climatology model, i.e. a constant forecast using the mean value; persistence model, i.e. the next value is equal to the current value. Of course, neither of these two models provide any forecast capability. However, the persistence model will often score very high for many measures as the output values have the same statistics as the observed values. Looking at Table 3 it is seen that the persistence model scores best on these measures, and that the neural network model is on par with the climatology model.
The RMSE (nT/min) and PE are computed for three different types of models: climatology, persistence and neural network. Each model type is based on either the SWE + MFI set or the combined set. See text for detailed explanation.
In order to explore if there is a gain in using a more complex model over very simplistic models, such as climatology or persistence, or any other model, we study different approaches. Clearly, from the two examples in Figure 3, the neural network model performs better than the climatology model. Further, the persistence model is by its definition not capable of producing a forecast, but it will always trace the observed values and therefore will have small errors.
In Figure 4 we show the RMSE as function of a threshold value (τ, left) and positive difference value (δ, right), respectively. We define τ and δ in the following two paragraphs.
Fig. 4. The RMSE as function of threshold (τ, left) and as function of positive difference (δ, right). The RMSEs are calculated for the SWE + MAG, climatology, and persistence models. Only data for which SWEPAM data exist have been used. 
In the first case (left figure), only pairs of forecast and observed values are included in the RMSE calculation for which the observed values are above the threshold τ, i.e. ∀t for which . The points at τ = 0 corresponds to the values in Table 3. It is seen that the NN and persistence models have similar RMSE (τ), but that the climatology model has more rapidly increasing RMSE. Thus, the NN model now shows a better performance than the climatology model, and similar performance as the persistence model.
In the second case (right figure) pairs are selected when the observed increase is above δ, i.e. ∀t for which . Such a selection will penalise persistence, and any other model that has forecast potential should perform better. The figure shows that indeed the NN model performs better than the persistence model, thus indicating that it actually produces forecasts.
The most interesting events are the extreme events, where extreme can be defined based on different criteria depending on the system at study. However, extreme events are often synonym with rare events. Studying forecasting of dichotomous events, a contingency table can be formed by counting the number of outcomes in the four classes: a = number of hits; b = number of false alarms; c = number of misses; d = number of correct rejections. The total number of outcomes is n = a + b + c + d. The event rarity, also known as base rate, then becomes p = (a + c)/n. Extreme events can thus equivalently be defined as events occurring with small base rate p. The forecast rate is similarly defined as q = (a + b)/n. Further, the hit rate and false alarm rate are defined as H = a/(a + c) and F = b/(b + d), respectively. Under certain assumptions on the behaviour of the base rate p and the forecast rate q, it can be shown that both log H and log F are proportional to log p and can be expressed in terms of the extremal dependence index (Ferro & Stephenson 2012)$$\mathrm{EDI}=\frac{\mathrm{log}F\mathrm{log}H}{\mathrm{log}F+\mathrm{log}H}.$$(9)
The EDI describes the performance in relation to random forecasts with varying p. EDI lies in the range [−1, +1], where EDI = 0 corresponds to random forecasts, and EDI > 0 corresponds to forecasts that are better than random forecasts. Some interesting properties of EDI are that it is base rate independent and asymptotically equitable. The EDI requires that p = q, which can be achieved by recalibrating q using different thresholds on the forecasts and observations, respectively. To complement EDI the frequency bias is calculated based on the uncalibrated q as$$\mathrm{Bias}=q/p.$$(10)
The Bias indicates whether the model overforecasts (Bias > 1) or underforecasts (Bias < 1) at a certain base rate.
Figure 5 shows EDI and Bias for the ESK model as function of observed threshold τ. The base rate decreases as the threshold increases. The top left plot shows the Bias for two different models using two different datasets. The “SWE + MFI” label corresponds to the SWE + MFI model evaluated on data for which SWE data exist. The “SWE + MFI, all data” label corresponds to the same model evaluated on all data, including events with missing SWE data during which the forecasts will be zero. Finally, the “MFI only, all data” is the MFI model evaluated on all data.
Fig. 5. The Bias (top row) and EDI (bottom row) computed for the ESK model. Plots at left show the SWE + MFI model evaluated on data for which SWE data exist (SWE + MFI legend) and for all data (SWE + MFI, all data legend), and the MAG only model. Plots at right show the combined model, the climatology model, and the persistence model evaluated for all data. Lines without symbols indicate the 95% confidence levels. 
Generally, all models overforecast at the lower levels (≲30 nT/min), e.g. the number of times the “MFI only” model forecasts values above 20 nT/min is ≈3.6 times that observed. For a large range (40 ≲ τ ≲120 nT/min) the “SWE + MFI” model has Bias ∈ [0.8, 1.2], then increases to 1.9 at τ = 150 nT/min, and drops to zero for τ ≳ 190 nT/min. This means that the forecast rate q = 0, thus there are no forecasts above that threshold. The “SWE + MFI, all data” Bias values are consistently lower, a natural consequence of the zeroforecast for missing SWE data. Finally, the “MFI only” model has Bias closer to one for a large range of thresholds compared to the “SWE + MFI” model, except for small τ. The EDIs are similar for both models, within 95% confidence limits, for τ ≳ 60 nT/min. As EDI is asymptotically equitable the models can be compared even though they are based on different datasets.
The EDI stays at a high level of EDI ≈ 0.8 for large . As stated above, the forecasts are recalibrated to force q = p before EDI is computed. This is achieved by lowering the forecast threshold which will increase the number of hits (a) and false alarms (b). If the model would produce random forecasts, then both the hit rate and false alarm rate would increase equally much, with increasing threshold, leading to EDI ≈ 0. The EDI thus indicates that the model provides actual forecasts but, taken together with the Bias, the forecast values are below observed values at the high end range.
4. Conclusions
Although the transformation of 1min dB/dt to 30min maxima inhibits the calculation of the geoelectric field, it has been shown that there is a reasonably linear relationship between and the 30min maxima of the computed geoelectric field (Fig. 2). Therefore, can be used as a proxy of the electric field using the pcoefficients in Table 2. The is simple and well defined, and depends only on the measured Bfield. At the same time, if the electric field model or ground conductivity model are updated, the pcoefficients can easily be recalculated.
The verification of forecast models is an important and difficult subject. Case studies, such as those in Figure 3, are useful but difficult to summarise. Measures, as given in Table 3, summarise the verification into a single number, but are hard to interpret. In this special case, we know that the climatology and persistence models cannot provide any forecasts, but still they score better with these simple measures. A more careful selection of data, over which the measures are calculated, can provide further insight to the performance. As Figure 4 shows, the neural network model performs better than the climatology model for large values, and better than the persistence model for large increases.
From Tables 1 and 2 it is seen that the most interesting events are rare. To verify the forecasts of rare events we applied the EDI method. The analysis indicates that the models have a stable EDI around ≈0.8 for large and rare events (Fig. 5, bottom left). Within the 95% confidence limits it is not possible to decide which model performs best. However, the “MFI” model has a Bias closer to one, and especially it provides forecasts also for the largest values. In principle, a combined model could be used depending on whether the plasma data are reliable or not. For the combined model (Fig. 5, right) the EDI and Bias are very similar to that of the persistence model. Thus, in this case EDI suffers from the same problem as RMSE (Table 3) and RMSE (τ) (Fig. 4, left) in its ability to distinguish between the persistence model and the neural network models.
In a realtime situation it can be difficult to determine when to switch model. Therefore, a realtime forecast system could rely on the “MFI only” model and ignore forecasts below ≲30 nT/min. A related issue for realtime forecasting is that the models generally overforecast for the low values (Fig. 5, top left), most likely a result of the event selection that is necessary for training.
The models have been implemented for realtime operation and are available at http://corona.lund.irf.se/eurisgic/.
Acknowledgments
The research leading to these results has received funding from the European Community’s Seventh Framework Programme (FP7/20072013) under Grant Agreement No. 260330. We thank WDCEdinburgh and IMAGE, and the contributing institutes, for providing geomagnetic observatory data in Europe, and the ACE SWEPAM and MFI instrument teams and the ACE Science Center for providing ACE data. The editor thanks D. Danskin and an anonymous referee for their assistance in evaluating this paper.
References
 Adam, A., E. Pracser, and V. Wesztergom. Estimation of the electric resistivity distribution (EURHOM) in the European lithosphere in the frame of the EURISGIC WP2 project. Acta Geod. Geoph. Hung., 47, 377–387, 2012. [Google Scholar]
 Beggan, C., D. Beamish, A. Richards, G. Kelly, and A. Thomson. Prediction of extreme geomagnetically induced currents in the UK highvoltage network. Space Weather, 11, 407–419, 2013. [CrossRef] [Google Scholar]
 Bolduc, L. GIC observations and studies in the HydroQuébec power system. J. Atmos. Sol. Terr. Phys., 64, 1793–1802, 2002. [Google Scholar]
 Boteler, D.H., and R.J. Pirjola. The compleximage method for calculating the magnetic and electric fields produced at the surface of the earth by the auroral electrojet. Geophys. J. Int., 132, 31–40, 1998. [Google Scholar]
 Burton, R.K., R.L. McPherron, and C.T. Russell. An empirical relationship between interplanetary conditions and Dst. J. Geophys. Res., 80 (31), 4204–4214, 1975. [NASA ADS] [CrossRef] [Google Scholar]
 Cagniard, L. Basic theory of the magnetotelluric method of geophysical prospecting. Geophysics, 18, 605–635, 1953. [CrossRef] [Google Scholar]
 Curto, J.J., T. Araki, and L.F. Alberca. Evolution of the concept of sudden storm commencements and their operative identification. Earth Planets Space, 59, i–xii, 2007. [CrossRef] [Google Scholar]
 Dong, B., D.W. Danskin, R.J. Pirjola, D.H. Boteler, and Z.Z. Wang. Evaluating the applicability of the finite element method for modelling of geoelectric fields. Ann. Geophys., 31 (10), 1689–1698, 2013, http://www.anngeophys.net/31/1689/2013/. [Google Scholar]
 Elman, J.L. Finding structure in time. Cognitive Science, 14, 179–211, 1990. [Google Scholar]
 Ferro, C.A.T., and D.B. Stephenson. Forecast verification: A practioner’s guide in atmospheric science, 2nd edn., John Wiley and Sons Ltd, Chapter 10 Deterministic forecasts of extreme events and warnings, 2012. [Google Scholar]
 Gaunt, C.T. Reducing uncertainty – responses for electricity utilities to severe solar storms. J. Space Weather Space Clim., 4, A01, 2014. [CrossRef] [EDP Sciences] [Google Scholar]
 Kis, A., A. Koppan, I. Lemperger, T. Prodan, J. Szendröi, J. Verö, and V. Wesztergom. Longterm variation of the geoelectric activity index. Publs. Inst. Geophys. Pol. Acad. Sc., C99 (398), 353–360, 2007. [Google Scholar]
 Lundstedt, H. The sun, space weather and GIC effects in Sweden. Adv. Space Res., 37, 1182–1191, 2006. [CrossRef] [Google Scholar]
 Lundstedt, H., H. Gleisner, and P. Wintoft. Operational forecasts of the geomagnetic Dst index. Geophys. Res. Lett., 106 (A6), 1–4, 2001. [CrossRef] [Google Scholar]
 Mayaud, P.N. Derivation, meaning, and use of geomagnetic indices, American Geophysical Union, Vol. 22 of Geophysical monograph, 1980. [CrossRef] [Google Scholar]
 McComas, D.J., S.J. Bame, P. Barker, W.C. Feldman, J.L. Phillips, P. Riley, and J.W. Griffee. Solar wind electron proton alpha monitor (SWEPAM) for the Advanced Composition Explorer. Space Sci. Rev., 86, 563–612, 1998. [NASA ADS] [CrossRef] [Google Scholar]
 Pirjola, R., A. Viljanen, A. Pulkkinen, S. Kilpua, and O. Amm. Ground effects of space weather, NATO Science Series, Kluwer Academic Publishers, Chapter Space weather effects on electric power transmission grids and pipelines, 2004. [Google Scholar]
 Pulkkinen, A., A. Viljanen, and R. Pirjola. Estimation of geomagnetically induced current levels from different input data. Space Weather, 4, S08005, 2006. [CrossRef] [Google Scholar]
 Schrijver, C.J., and S.D. Mitchell. Disturbances in the US electric grid associated with geomagnetic activity. J. Space Weather Space Clim., 3, A19, 2013. [CrossRef] [EDP Sciences] [Google Scholar]
 Smith, C.W., J. L’Heureux, N.F. Ness, M.H. Acũna, L.F. Burlaga, and J. Scheifele. The ACE magnetic fields experiment. Space Sci. Rev., 86, 611, 1998. [Google Scholar]
 Stone, E., A. Frandsen, R. Mewaldt, E. Christian, D. Margolies, J. Ormes, and F. Snow. The advanced composition explorer. Space Sci. Rev., 86, 1–22, 1998, DOI: 10.1023/A:1005082526237. [Google Scholar]
 Trichtchenko, L., and D.H. Boteler. Modeling geomagnetically induced currents using geomagnetic indices and data. IEEE Trans. Plasma Sci., 32 (4), 1459–1467, 2004. [Google Scholar]
 Viñas, A.F., and J.D. Scudder. Fast and optimal solution to the “RankineHugoniot Problem”. J. Geophys. Res., 91 (A1), 39–58, 1986. [Google Scholar]
 Viljanen, A., H. Nevanlinna, K. Pajunpää, and A. Pulkkinen. Time derivative of the horizontal geomagnetic field as an activity indicator. Ann. Geophys., 19, 1108–1118, 2001. [Google Scholar]
 Viljanen, A., R. Pirjola, M. Wik, A. Ádám, E. Prácser, Y. Sakharov, and Y. Katkalov. Continental scale modelling of geomagnetically induced currents. J. Space Weather Space Clim., 2, A17, 2012, DOI: 10.1051/swsc/2012017. [CrossRef] [EDP Sciences] [Google Scholar]
 Viljanen, A., A. Pulkkinen, O. Amm, R. Pirjola, T. Korja, and B.W. Group. Fast computation of the geoelectric field using the method of elementary current systems and planar earth models. Ann. Geophys., 22, 101–113, 2004. [Google Scholar]
 Weigel, R.S., D. Vassiliadis, and A.J. Klimas. Coupling of the solar wind to temporal fluctuations in ground magnetic fields. Geophys. Res. Lett., 29 (19), 1915, 2002. [CrossRef] [Google Scholar]
 Wik, M., R. Pirjola, H. Lundstedt, A. Viljanen, P. Wintoft, and A. Pulkkinen. Space weather events in July 1982 and October 2003 and the effects of geomagnetically induced currents on Swedish technical systems. Ann. Geophys., 27, 1775–1787, 2009. [Google Scholar]
 Wintoft, P. Study of the solar wind coupling to the time difference horizontal geomagnetic field. Ann. Geophys., 23, 1949–1957, 2005. [CrossRef] [Google Scholar]
 Wintoft, P., M. Wik, H. Lundstedt, and L. Eliasson. Predictions of local ground geomagnetic field fluctuations during the 7–10 November 2004 events studied with solar wind driven models. Ann. Geophys., 23, 3095–3101, 2005. [CrossRef] [Google Scholar]
Cite this article as: Wintoft P., M. Wik & A. Viljanen. Solar wind driven empirical forecast models of the time derivative of the ground magnetic field. J. Space Weather Space Clim., 5, A7, 2015, DOI: 10.1051/swsc/2015008.
All Tables
The table lists for each station (Id) the geomagnetic latitude (Lat), the start and end dates for the period when data exist, the number of 30min samples (N), and the 90% and 99% percentiles, and the maximum of for all available data in units of nT/min. The magnetic coordinates are based on IGRF for the year 2010. Note that the statistics is only meaningful per station basis and that comparison between stations cannot be done as the they cover different periods and have varying degree of data gaps.
The table lists for each station (Id) the geomagnetic latitude (Lat), and the 90% and 99% percentiles, and the maximum of for all available data. The computed conductances in log 10(S) are shown using three different integration depths: 40 km (S40), 80 km (S80) and 160 km (S160). The coefficient p for the linear relation with electric field is also shown in units of (mV/km)/(nT/min). Only data with simultaneous 30min intervals are chosen. There are N = 134,452 30min samples covering the period 19990106 to 20081231.
The RMSE (nT/min) and PE are computed for three different types of models: climatology, persistence and neural network. Each model type is based on either the SWE + MFI set or the combined set. See text for detailed explanation.
All Figures
Fig. 1. Illustration of the 3hour maximum filtering (m(t), red curve), thresholds, and the event timestamps , and . The example shows a period in November 2003 for the ESK station (blue curve). 

In the text 
Fig. 2. Maximum of the modelled 1min horizontal electric field versus the maximum of dB_{h}/dt for 30min sequences 00:00–00:29, 00:30–00:59, … UT in October 2003. Top left: ABK, top right: UPS, bottom left: BFE, bottom right: FUR. C is the linear correlation coefficient, p(1) is the slope of the fitted linear curve. 

In the text 
Fig. 3. Two cases with forecasts using events not in the training set. The black curves are the observed , the green curves the forecast from the SWE + MFI model, and the red curves from the MFIonly model. Note that the two events have very different magnitudes. 

In the text 
Fig. 4. The RMSE as function of threshold (τ, left) and as function of positive difference (δ, right). The RMSEs are calculated for the SWE + MAG, climatology, and persistence models. Only data for which SWEPAM data exist have been used. 

In the text 
Fig. 5. The Bias (top row) and EDI (bottom row) computed for the ESK model. Plots at left show the SWE + MFI model evaluated on data for which SWE data exist (SWE + MFI legend) and for all data (SWE + MFI, all data legend), and the MAG only model. Plots at right show the combined model, the climatology model, and the persistence model evaluated for all data. Lines without symbols indicate the 95% confidence levels. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.