GNSS positioning error forecasting in the Arctic: ROTI and Precise Point Positioning error forecasting from solar wind measurements

A model forecasting ionospheric disturbances and its impact on GNSS positioning is proposed, called HAPEE (High lAtitude disturbances Positioning Error Estimator). It allows predicting ROTI index and corresponding Precise Point Positioning (PPP) error in Arctic region (i.e. latitudes > 50 ). The model is forecasting for the next hour a probability of a disturbance index or PPP error to exceed a given threshold, from solar wind conditions measured at L1 Lagrange point. Or alternatively, it is forecasting a disturbance index level that is exceeded during the next hour for a given percentage of the time. The ROTI model has been derived from NMA network measurements, considering a database covering the years 2007 up to 2019. It is demonstrated that the statistical variability of the ROTI index is mainly following a lognormal distribution. The proposed model has been tested favorably on measurements performed using measurements from stations of the NMA network that were not used for the model derivation. It is also shown that the statistics of PPP error conditioned by ROTI is following a Laplace distribution. Then a new compound model has been proposed, based on a conditional probability combining ROTI distribution conditioned by solar wind conditions and error distributions conditioned by ROTI index level.


Introduction
High-latitude regions, in the vicinity of the auroral oval, frequently experience ionospheric disturbances which leads to cycle slips, loss of lock and positioning errors. These effects are feared by the community using GNSS applications, notably air navigation that requires a high level of accuracy and integrity. To quantify disturbance effects, different indices are commonly used to characterize the impact of ionospheric fluctuations on GNSS signal degradation in amplitude and in phase. In the polar region, where phase and slant total electron content (STEC) fluctuations are particularly important, r / and ROTI (Rate Of Total Electron Content Index) indices are relevant to study such disturbances events (Cherniak et al., 2015;Jacobsen & Andalsvik, 2016). However, their forecasting remains a challenge because our understanding of the causes and consequences of the complex physics involving ionospheric storms is far from complete.
In Space Weather research, numerous magnetic indices exist and the relation between magnetic indices and ionosphere scintillation indices can be researched. Despite the complexity, some authors have investigated the topic. In (Hu et al., 1998), a physical model for ionospheric storm prediction is proposed, based on Dst index to monitor storm onset. The model is proposing short term (in the hours following a storm) predictions of ionospheric storm effects on HF communications.
Fortunately, measurements performed from spacecraft placed at the L1 Lagrange point between the Sun and the Earth allow getting information on the solar wind approximately 1 h upstream before it reaches the Earth. This type of measurement allows proposing new forecasting approaches such as the OVATION (Oval Variation, Assessment, Tracking, Intensity and Online Nowcasting) model (Newell et al., 2010(Newell et al., , 2014). The OVATION model forecasts the average characteristic energy of precipitating electrons and ions incoming in ionosphere layers in the polar region, as well as the total energy flux (North or South poles) and total number flux.
Forecasting models of Space Weather impact on ionosphere can also be found in the literature. In (Tsagouri & Belehaki, 2015) the SWIF model (Solar Wind driven auto regression model for Ionospheric short-term Forecast) is presented, producing nowcasting and forecasting of foF2 disturbances for the European region. The SWIF model is an empirical temporal model of ionospheric storms using as proxy the solar wind parameters from ACE spacecraft (Koutroumbas et al., 2008). The model predicts a global occurrence of time disturbances of foF2 due to ionospheric storms. Prikryl et al. (2012Prikryl et al. ( , 2013 are also using as proxy observations at the L1 point from ACE spacecraft to model phase scintillation occurrence at high latitude, from L1 GPS data collected with the Canadian High Arctic Ionospheric Network (CHAIN). They propose a method similar to the McPherron & Siscoe (2004) one, construed by analogy to an approach used in weather meteorology forecasting and based on atmospheric air mass concept.
Lastly, deep learning approaches have also been applied in this context, as for example the work proposed by McGranaghan et al. (2018). They have used CHAIN network scintillation measurements, Solar Wind and Geomagnetic Activity Data, and Particle Precipitation dataset to develop a predictive model for high-latitude ionospheric phase scintillation based on support vector machine (SVM) machine learning algorithm.
To the best of our knowledge, there is no published prediction model of GNSS positioning error at high latitude. In this work, the precise point positioning (PPP) technique has been selected to describe GNSS error.
PPP is a processing strategy for GNSS observations that enables the efficient computation of high-quality coordinates, utilizing undifferenced dual-frequency code and phase observations by using precise satellite orbit and clock products. More detailed descriptions of PPP can be found in e.g. Zumberge et al. (1997) and Kouba & Héroux (2001).
Previous studies by Tiwari et al. (2009) and Moreno et al. (2011) have examined the effects of ionospheric disturbances on PPP calculations at low/equatorial latitudes. Moreno et al. (2011) concluded that the presence of large ROT (Rate Of Change of TEC) can induce a significant degradation of the position estimation.
The Norwegian Mapping Authority (NMA) operates a national network of GNSS receivers, which is used for positioning services and various studies. Researchers at the NMA have previously studied the statistics of ionospheric disturbances and their correlation with GNSS positioning errors in Jacobsen & Dähnn (2014), and published case studies on the effects of space weather events on GNSS in Jacobsen & Schäfer (2012), Andalsvik & Jacobsen (2014) and Jacobsen & Andalsvik (2016).
In this study, a global statistical approach for short term forecasting of ROTI ionospheric disturbances index and associated GNSS PPP error is proposed for high latitudes. It is based on statistical regression of large datasets. Because ionospheric disturbances at high latitude is a highly random process, a statistical approach is the preferred strategy. Moreover, a statistical approach allows introducing the concept of relative risk, associated to the definition of a threshold level in ionospheric disturbance indices, positioning error, or other metrics. This paper is organized as follows. In Section 2, the NMA database and Solar wind data used for HAPEE model regression are presented. Section 3 presents the ROTI index forecasting model, its solar wind driver, the model regression and its analysis. In Section 4, a validation of the ROTI forecasting is presented. Section 5 details the conditional probability approach used to build the GNSS positioning PPP error statistical model. The conclusion of the work and the prospects are introduced in Section 6.
2 Ionospheric disturbances datasets presentation 2.1 NMA database The Norwegian Mapping Authority (NMA) has deployed instrumentation, developed software and services to monitor space weather impact on the ionosphere over Norway. NMA is also studying the impact of the ionospheric disturbances on its real-time kinematic (RTK) positioning service and precise point positioning (PPP) techniques (Jacobsen & Andalsvik, 2016).
In this study, ionospheric disturbances are measured by the Rate of TEC Index (ROTI) (Pi et al., 1997). The ROTI index is able to detect ionosphere disturbances and is a robust index, not affected by detrending of the phase measurements needed in the phase scintillation index definition (Ghobadi et al., 2020). While the use of dual-frequency observations allows the correction of ionospheric delay to the first order, higher order terms remain. ROTI main advantage over scintillation indices is that it is calculated based on measurements from standard dualfrequency GNSS receivers sampling at 1 Hz, which have been and still are more common than scintillation receivers. It is noted that ROTI based on 1 Hz measurements will often be dominated by contributions from refractive effects, whereas the scintillation indices calculated based on data with higher sampling rates are designed to measure contributions from diffractive effects (McCaffrey & Jayachandran, 2019). Rino et al. (2019) demonstrates that the phase-scintillation errors are a negligibly small fraction of the stochastic TEC component for most global navigation satellite system operating conditions. ROTI index has been chosen to characterize disturbances and/or rapid variations of TEC, and is related, but not equivalent, to scintillation (Basu et al., 1999;Mushini et al., 2012;Yang & Liu, 2016;Carrano et al., 2019).
In this study, the ROTI values are based on 1 Hz measurements (Át ¼ 1 60 min), and calculated for time intervals of 5 min (N = 300). An elevation cutoff of 5°was used for the input data. However, for the ROTI data used in the processing in this paper, a further elevation angle cutoff of 30°was applied for the ROTI data records. This is done to avoid errors that occur at low elevations, in particular those due to reflections or other types of signal degradation caused by terrain or structures nearby.
The elevation of a ROTI data record is defined as the average elevation angle of the satellite during the time period used for the ROTI calculation.
ROTI is defined as the standard deviation of the Rate Of TEC (ROT) over some time interval. It is calculated as follows, where Ln, k n and f n are the phase measurement, wavelength and frequency for the n-th frequency.
L GF (i) is the geometry-free phase combination at time i: V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 ROT (in TECU/minute) is calculated as, TECU (TEC Unit) is defined as 10 16 electrons per m 2 . Dt is the time difference between the epochs, in minutes. Finally, ROTI, calculated over N epochs is, ROT is the average of ROT for the interval k ROT ¼ À In addition to basic checks of the input data a particularly important issue is the detection of cycle slips, as undetected cycle slips will lead to erroneous and very large ROTI values. The vast majority of the database used is from a real-time ionosphere monitoring system. In the real-time system, several checks are applied: For each signal frequency used, if Doppler measurement is available, the change in phase from the previous to the current timestep is compared to the Doppler. If the difference exceeds a threshold, it is considered as a cycle slip. The acceleration of TEC (second order time difference) is compared to a threshold. If the threshold is exceeded, it is considered as a cycle slip. The change of the Melbourne-Wübbena linear combination (Melbourne, 1985;Wübbena, 1985) from the previous to the current timestep is compared to a threshold. If the threshold is exceeded, it is considered as a cycle slip. A description of this cycle slip detection method can be found in Liu (2011). If one or more of these checks trigger, the data is considered to contain a cycle slip. Among these the check using the Melbourne-Wübbena is the best, as the threshold can be set quite low compared to the change incurred in this measure by a cycle slip. However, there are some very special cases of combinations of cycle slips that the Melbourne-Wübbena check cannot detect which may be detected by the other methods. The processing software will consider the time series before and after a cycle slip as separate time series, ensuring that ROTI values are not calculated using data that contains cycle slips. Some parts of the data may be from a post-processing. The post-processing is equivalent to the real-time processing except that a more advanced cycle slip detection method, which is also based on the Melbourne-Wübbena linear combination, is used. A description of that detection method is found in Cai et al. (2013). Regardless of the outcome of the cycle slip detection, the calculated ROT values are also subject to simple checks for outliers or obviously false values (i.e. extremely large values).
A large database composed of ROTI index has been built. The ROTI database corresponds to the period (2007-2019), covering the 24th solar cycle. ROTI data have been recorded every 5 min, with a sample rate of 1 Hz, from 12 stations deployed all over Norway (see stations positions in Fig. 1). For the ROTI model developed in this study, only GPS satellites signals have been considered, leading to 78,990,363 measurements in total. For each ROTI value, the inhomogeneities position is estimated considering the link geometry and assuming an ionosphere layer altitude of 350 km. This generic value is implying that disturbances occurs at the same altitude whatever the solar conditions and close to F region peak level. This point remains questionable and should be studied in further work. For the purpose of the statistical model researched here, this choice is assumed as reasonable. In this paper, all coordinates are converted from geographical to altitude corrected geomagnetic coordinates using AACGM-v2 code (Altitude Adjusted Corrected GeoMagnetic (Shepherd, 2014). ROTI values have been corrected for elevation angle from the theoretical behavior proposed by Carrano et al. (2019) for the ROTI and assuming an isotropic ionosphere. This way, the elevation dependency of ROTI 2 is proportional to sec(h) with h the zenith angle at the ionosphere inhomogeneity position (i.e. the angle complementary of the elevation angle). A practical reason for the choice of a duration of 5 min for ROTI is that it is the time resolution used in the archive from which we retrieved ROTI data. However, it is generally known that during active conditions, the ionospheric disturbances can vary on timescales shorter than 5 min. If the disturbance levels change during a 5-min interval, the ROTI value computed will represent an average of the disturbance level during that time interval. While this may not be good enough for a scientific study of the exact physics taking place in the ionosphere, we claim that it is sufficiently good for statistical characterization.

Solar wind data
For the purpose of ionosphere disturbance forecasting, the solar wind and interplanetary magnetic field (IMF) data from a satellite which is stationed in orbit around the Sun-Earth Lagrange Point L1 are a very interesting driver. The OMNIweb site (https://omniweb.gsfc.nasa.gov/) is providing high resolution solar wind data from the OMNI database. These data, which are 1 min averages timeshifted to the bow-shock-nose, have been considered as proxies for ionosphere disturbance forecasting. The Solar Wind data covering all the ground disturbance measurements have been considered, i.e. from 2007 up to 2019.
3 Ionospheric disturbances index model

Solar wind driver
In previous studies (Fabbro et al., 2019), the two parameters B z (component of the interplanetary magnetic field perpendicular to the ecliptic, created by waves and disturbances in the solar wind) and p (solar wind pressure) were selected as proxies of the model. We have studied the correlation between ionospheric disturbance indices and these parameters. A global trend of increasing ionosphere ROTI index with increasing pressure and high negative values of B z was observed, but the influence of other solar wind parameters dependency was missing. Hence, a coupling function that can better represent the interaction between solar wind and Earth magnetosphere has been researched. Newell et al. (2007) have investigated a few of existing coupling functions over a wide variety of magnetospheric activity, and proposed a new one dU MP dt . This coupling function corresponds to the rate of magnetic flux that is opened at the magnetopause. Notably, the dU MP dt coupling function is used in OVATION-Prime auroral precipitation model (Newell et al., 2014). The definition of this coupling function is, where the clock angle is h ¼ tan À1 By , and m is the solar wind velocity in km/s, whereas dU MP dt is in Wb/s. In Figure 2 a representation of dU MP dt variability versus the IMF components B y and B z is shown. The latter shows the strong variability of this coupling function with B z sign, whereas the sign of B y sign has no impact.
In our approach, the coupling function proposed by Newell et al. (2007) has been chosen as proxy and is computed from solar wind velocity and IMF parameters individually averaged over 10 min before the coupling function calculation (considering averaged IMF parameters from t 0 -10 mn up to t 0 for a coupling function at time t 0 ). This choice of 10 min is based on the time variability of the parameters investigated by Cousins & Shepherd (2010). To study dU MP dt , Newell et al. (2007) compared it with the K p -index that is a global geomagnetic index, based on 3-h measurements of ground-based magnetometers placed around the world. Considering Figure 4 of Newell et al. (2007), the relation between dU MP dt and mean K p -index is close to be linear in the range 1-6 for K p . From these results, a simple linear regression can be proposed, K p ffi a þ b dU MP dt with a = 1.1 and b = 3.74 Â 10 À4 . In this paper, this relation is used to compute an equivalent K p -index to feed a model of Auroral oval boundaries used to be compared with our approach in Section 3.3.

ROTI model regression
The model has been regressed by a classification of the coupling function values and of the corresponding disturbance indices. To build the regression, 11 intervals of dU MP dt have been defined in linear scale, with an initial bin [0, 10 3 ), in Wb/s, followed by 9 intervals of width 2 Â 10 3 Wb/s and a final bin [19 Â 10 3 , +1) Wb/s. Considering the position of each ROTI measure at 350 km and the measurement date, MLat (Magnetic Latitude) and MLT (Magnetic Local Time) are estimated by AACGM-v2 routine (Shepherd, 2014), model recommended by SuperDARN radar community. Then, a classification in terms of MLat and MLT has been proposed, with intervals of 1°of latitude between [50, 82)°for MLat, with two additional intervals [0, 50)°and [82,90]°for remaining extra points, and intervals of half an hour between [0, 24) h for MLT. This choice leads to 34 intervals in MLat and 48 intervals in MLT. To establish the correspondence between the coupling function at bow-shock-nose and the disturbance indices at ground level, a time delay has to be defined. As we propose a statistical forecasting model, the idea is, from a coupling function value measured at time t 0 , to consider all the disturbance indices measurements from t 0 up to t 0 + Dt. This leads to propose a model able to estimate the probability of a disturbance index to be exceeded in the following period Dt.
After deep analysis, the distributions of ROTI index appears to follow a distribution close to be lognormal. Then, the ROTI PDF (probability distribution function) is modeled by: Its mean and variance are respectively EðROTIÞ ¼ e lþ r 2 2 and varðROTIÞ ¼ EðROTI 2 Þ À EðROTIÞ 2 ¼ ðe r 2 À 1Þe 2lþr 2 . The parameters l and r have been fitted for each combination of coupling function, MLat and MLT intervals of the whole database.
The relevance of the lognormal regression of ROTI variability has been tested. Three examples of ROTI index CDF (cumulative distribution function) are presented in Figure 3, corresponding to different combinations of location (characterized by MLat and MLT bins) and coupling function level. For each example the histogram plotted in blue line is compared to its lognormal CDF regression, plotted in red line. Examples (a) and (b) show a good agreement between measurements and lognormal model. For each graph, the number of point N pts used in the regression is indicated, and the case (c) clearly underline a lack of measurements with N pts = 323.
From such an analysis, the ROTI index has been studied between 0 and 5. Then, the v 2 goodness of fit test has been applied to check the null hypothesis, between the fitted and measured distributions. The p-value of the v 2 test indicates if the null hypothesis is true. One of the most commonly used p-value threshold is 0.05. If the calculated p-value turns out to be less than 0.05, the null hypothesis is considered to be false. In other words, if p-value is lower than 0.05, it is very probable that the distribution and the model are different. Then, for each combined bins in MLat, MLT and coupling function, the v 2 test is computed with ROTI values in the interval [0.1, 5] sampled with 250 values. The p-values have been computed by the Python "scipy.stats.chisquare" function.
In Figure 3, the p-value of the three cases are provided. The p-value equal to zero in the (c) case indicates that it is very probable that the distribution is not lognormal. The origin of this is presumably the lack of data points to describe the distribution in this case.
The results are reported more globally in Figure 4. To focus on measurements and ignore noise effects, only bins where mean ROTI values are higher than 1 TECu/min have been plotted. This global processing is showing that for ROTI index regression a majority of p-value are higher than the threshold 0.05. Then, it is very probable that the lognormal distribution and the measurements distribution are similar. Moreover, in Figure 4 the areas where ROTI p-value is lower than 0.05 are mainly where the number of points is low (see Fig. 5). So a too low number of points can explain some low p-values, because the fit quality is clearly conditioned by the number of points. This number of points is a function of the solar wind conditions (or coupling function level) and of the magnetic coordinates. As expected, a large number of points is observed for lower coupling function level (higher than 10 5 in most of the areas) and decreases with increasing coupling function level. Note a lower number of points in geomagnetic latitude interval [70,80], which is located north of mainland Norway and has a sparse measurement coverage, mostly from measurements performed by the Ny-Ålesund station in Svalbard (see Fig. 1).

ROTI model analysis
The mean and standard deviation of the lognormal distribution derived by this processing are shown in Figures 6 and 7, with respect to different intervals of coupling function. Note that no interpolation has been applied to get these results. For comparison, the auroral oval limits modeled by Feldstein-Starkov model (Starkov, 1994a;Sigernes et al., 2011Sigernes et al., , 1994b are reported in the graphs. This model describes the equatorward and poleward limits of the auroral oval (drawn in green full line in Figs. 6 and 7), and the equatorward limit of the diffuse aurora (represented in green dashed line in Figs. 6 and 7). In the    Feldstein-Starkov model, diffuse aurora is distinguished from the aurora oval because in diffuse aurora the precipitating protons do not contribute significantly to the excitation (Lui et al., 1973). Occasionally subject to controversy in the literature, the aurora limits proposed here must not be considered as an absolute reference, but as an indicator of trend. To feed the Feldstein-Starkov model an equivalent K p index has been estimated by the simple linear regression proposed in Section 3.1, rounded to an accuracy of 0.3 to follow K p definition. In graphs Figures 6 and 7, the equivalent K p is varying between 0.6 and 9, when dU MP dt is corresponding to intervals [1000, 3000) up to [19,000, +1) Wb/s.
Observing the variability of the mean and standard deviation of ROTI with increasing solar wind disturbance conditions, it is clear that the typical variations of ionosphere perturbations are retrieved. As the auroral oval described by Feldstein-Starkov model, ROTI disturbance index shows an oval that is wider and more asymmetric with increasing coupling function. Ionospheric disturbances is more intense during night time, and still significant during the day. For low level of perturbation,  i.e. dU MP dt in [1000, 3000) Wb/s, the auroral oval shape is not observed by ROTI index, but the cusp region can clearly be distinguished on the day side, centered around 12 h and at latitude [75-80]°. Its size and intensity increase with the level of the coupling function. The night side of the auroral oval intensifies and broadens continuously with increasing coupling function. The auroral oval shape is clearly discernible from interval [3000, 7000) Wb/s on mean ROTI (Fig. 6) and [7000, 9000) Wb/s on ROTI standard deviation (Fig. 7). From [7000, 9000) up to [11,000, 13,000) Wb/s, the ROTI standard deviation limits between 60°and 70°of magnetic latitude match fairly well with the aurora limits proposed by Feldstein-Starkov model on the night side. The dissymmetry of the oval proposed in Feldstein-Starkov model is not well reproduced by ROTI index. From [7000, 9000) Wb/s and above levels of coupling function, at the opposite side of the oval corresponding to MLT time 18 h and 6 h, areas where ROTI mean and standard deviation have high values outside the limits proposed by the Feldstein-Starkov model can be noticed.
From dU MP dt equal to 7000 Wb/s and above, on the dayside the high values of ROTI index (mean or standard deviation > 1) have a larger latitudinal spread than the Feldstein-Starkov oval model, the equatorward limit is exceeded. However, globally the ionosphere disturbances measured by ROTI index happens inside the diffuse aurora limit here described by the Feldstein-Starkov model. Finally, for dU MP dt in [11,000, 13,000) Wb/s (higher values of coupling function reported in Figs. 6 and 7), two maximum values of ROTI distribution parameters appear close to 19 h and 5 h MLT. These intensifications of ROTI index could be due to the plasma inflow and outflow along R1 and R2 field-aligned currents (FAC) that are in the same area. (A review of magnetosphere-ionosphere current systems can be found in Cowley, 2000). Particle precipitation is associated with changes in plasma density and formation of smallscale plasma structures in the ionosphere, which would cause enhanced ROTI values. The enhancements could also be a result of the combination of the particle precipitation in the FAC current regions and plasma blobs embedded in the ionospheric return convection (van der Meeren et al., 2015;Jin et al., 2017;Jin & Oksavik, 2018).
Assuming a lognormal distribution for ROTI, the cumulative distribution function (CDF) is described by PrðROTI T ROTI Þ ¼

Validation
A validation of the HAPEE model has been performed from measurements of a new set of ROTI data, measured in 2017 by receivers not used during the HAPEE model regressions. The receivers are part of the NMA GNSS receiver network, and the stations positions are plotted in graph (b) of Figure 1. ROTI data was calculated by NMAs ionosphere monitoring software. Solar wind data was obtained from NASA/GSFC's OMNI data set through OMNIWeb. For every 5 min in the time periods used, the HAPEE model was run to generate ROTI forecasts if valid solar wind data was available at that time. The output was longitude-latitude grids of the probabilities to exceed each of the thresholds during the coming hour. The grids covered the region of 10°West to 40°East, and 55-80°North. The grid resolution was 1 Â 1°. For each time and each grid point, the observed ROTI values for IPPs (ionospheric pierce points) within the coming hour and spatially close to that grid point were identified. Spatially close is defined as being within 100 km (great circle distance) of the grid point. For the overall statistics, the data are sorted into bins by two parameters; the forecasted ROTI level (low, medium or high), and the forecasted percent chance of exceeding that level within the next hour (a percentage). The number of data points in each bin is shown in Figure 9. For the Validation, three levels of ROTI have been defined: low corresponding to 0.75 TECU/min; medium corresponding to 1.5 TECU/min and high corresponding to 3.0 TECU/min. The observed percentage of data records that exceeded the ROTI level is shown by the color in Figure 10. For a perfect match, the colors of the observations in the figure (percentage of records exceeding levels) should correspond to the abscissa value describing the percentage forecasted. For the low ROTI level, the forecasts matches the observations very well. For the medium ROTI level, there is a shift in the colors on the plot versus the abscissa values, indicating a slight underestimation of the probability of occurrence. The forecasted probability is up to approximately 20 percentage points different from the observed probability. For the high ROTI level the shift is more pronounced, with maximum differences up to approximately 30 percentage points. The shifts in probability at high disturbance levels may indicate an inaccurate characterization of the tails of the distributions for the modeling and/or for the reference data. This is a challenging part which will be further explored in future work. More data are necessary and the future study of the solar cycle 25 are expected to lead to enhancement of the model and of its validation.

Positioning error estimation from ROTI
In this section, the distribution of the PPP errors is calculated for different bins of ROTI index. Then, these distributions of the positioning errors are fitted by a canonical distribution function, and their parameters computed for all the ROTI bins.

Data sources for characterization of ROTI versus PPP accuracy
The work to characterize the connection between ROTI and precise point positioning (PPP) accuracy is based on GNSS data from a set of geodetic receivers for a selection of dates (see Table 1). Figure 1 shows the locations of the receivers. All the receivers are owned and operated by NMA. The receivers record observations at a sample rate of 1 Hz. The days that are included are days that were observed to disturb NMAs RTK positioning systems and/or days with a minimum rating of 2 on NOAAs G-scale for geomagnetic storms (Poppe, 2000). The data has been processed to calculate ROTI (see Sect. 2.1) and PPP coordinates (see Sect. 5.2) at 5 min resolution.
The PPP coordinates were further processed and associated with the ROTI data as described in Section 5.2.

ROTI versus PPP data processing
We have used the GIPSY software provided by NASAs Jet Propulsion Laboratory (JPL) to compute coordinates. The coordinates were computed with a time resolution of 5 min. Important parameters/models used for the GIPSY PPP solutions are summarized in Table 2.
To determine the position error, a true coordinate was defined for each receiver by estimating a long-term trend and yearly variations. The long-term trend can be seen as an estimate of multi-year drift caused by e.g. continental drift or post-glacial rebound, and was modeled as a linear drift over the entire dataset. The yearly variations were modeled by fitting a 4-segment 5th order spline with periodic boundary conditions to the dataset sorted by day-of-year after removing the longterm trend. The true coordinate is then the sum of the long-term trend and the yearly variations. The PPP coordinate error is the difference between the instantaneous coordinate and the "true" coordinate evaluated at that time.
PPP coordinate errors were transformed from latitude, longitude, height to local North, East, Vertical coordinates. A basic outlier detection was applied to remove position records that were very clearly wrong: All PPP records with a position error greater than 20 m in East/North, or greater than 50 m in Vertical, were removed as outliers. This affected approximately 0.2% of the dataset.
For each timetag and each receiver, ROTI was averaged over all satellites. For each averaged ROTI data record, the position error of the PPP data record closest in time (and closer than 300 s) is selected as the position error for that record. Figure 11 shows the distributions of the Vertical position errors, along with fits of Gaussian and Laplace distribution functions. Figures 12 and 13 show the same for the North and East coordinate components. The histograms count the position Fig. 9. In color scale, the logarithm of the number of data points for observed percentage of records exceeding the disturbance level, in each bin, versus forecast in abscissa, and low, medium and high thresholds in ordinate for time period of validation. V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 errors to the left/right of the x-axis in the left/rightmost bar, but the fits have been made to the actual data. For the plots the distribution functions were evaluated at the points of the center of the histogram bars and scaled so that the sum of their values at those points is the same as the sum of the values of the histogram bars. Table 3 lists the distribution parameters of the Laplace distribution function, being the median and the mean absolute deviation from the median (MAD). That the Laplace distribution is a much better fit than the Gaussian distribution can be shown by comparing the Kolmogorov-Smirnov (KS) test statistic for the two distributions. Figure 14 shows an example of cumulative distribution functions (CDF), for the PPP vertical position error with ROTI in the interval 3-3.5. The blue line is the empirical CDF, while the red line is the Gaussian CDF and the green line is the  V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 Laplace CDF. The KS test statistic is the maximum vertical difference between a distribution CDF and the reference CDF (in this case, the empirical CDF). Lower values are better, with a value of zero being a perfect fit. In this example, the value is 0.182 for the Gaussian distribution and 0.0466 for the Laplace distribution. From these values, we can conclude that the Laplace distribution is a better fit to the data. Figure 15 shows the KS test statistic values for all the bins. Except for the very first ROTI bin, the Laplace distribution is significantly better. Based on the KS test statistics results and manual inspection of the distributions (including more histogram and CDF plots which are not included in this paper), we conclude that the Laplace distribution provide a good fit to the distribution of position errors and that it is a significantly better fit than a   Gaussian distribution function. Table 3 shows the fitted parameters for the Laplace distribution functions.

Modeling PPP error from solar wind conditions
For a given solar wind condition (or coupling function), the idea is to couple a model forecasting the probability of a disturbance index p(ROTI|sol.wind) with a probability of error conditioned by this ROTI value p(Pos|ROTI). The global compound model giving a probability for a positioning error Pos to exceed a threshold Te is then written as: with pðPosjsol:windÞ ¼ Z þ1 0 pðPosjROTIÞ pðROTIjsol:windÞ dROTI: The problem has been solved supposing the ROTI PDF as lognormal, such as modeled in the climatological part of the model described in Section 3. This way, from a given solar wind condition, the parameters (l sw , r sw ) are chosen for the corresponding coupling function bin and following a lognormal distribution: Then, the probability p(Pos|ROTI) has to be estimated. The following section presents the regression of this part of the model.

Regression of precise point positioning error
In Section 5.3, the statistical relation between positioning errors and ROTI values has been quantified by regression. PPP North, East and Vertical error distributions have been fitted versus ROTI level. The distribution of errors for each ROTI bin has been fitted by a Laplacian distribution: where m i is the mean and the median of the distribution value, 2b 2 i is the variance and b i the mean absolute deviation (MAD) from the median.
The Laplace distribution parameters for the PPP coordinate errors have been derived per ROTI bin and reported in Section 5.3.
In practice, the positioning error is characterized by the absolute value of the error. Moreover, as far as the PPP positioning errors are concerned, the fitted mean of North, East and Vertical PPP errors have been obtained close to zero (see Sect. 5.3). Then, the PDF considered for the problem becomes pð Pos j jjROTIÞ ¼ 1 2bi exp À Pos j j bi . This canonical distribution is therefore introduced in the global compound model.

Simple case of one link
In this section, the ROTI index is supposed to be estimated from one GNSS link. In continuous form, the global problem to solve is the following conditional probability, Inverting the integrals, introducing the Laplace distribution parameters (whose values depend on the ROTI bin) and the lognormal distribution of ROTI, the formulation becomes,  V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 Prð Pos j j > Tejsol:windÞ ¼ The second integral is R þ1 Te dPos 1 2bi exp À Pos j j bi ¼ Prð Pos j j > TeÞ and reminding the Laplace distribution properties, the complementary cumulative distribution function is, with in case of PPP positioning errors m i ; 0 (see Sect. 5.3). So Te ! m i can be assumed, and The global conditional probability becomes, Prð Pos j j > Tejsol:windÞ ¼ Considering that each i bin of ROTI is defined by a minimum and a maximum value of ROTI, ROTI i min and ROTI i max , a discretized form is proposed, via the formula, Pr Pos j j > Tejsol:wind ð Þ ¼ , and leads to, The final formulation is relatively easy to use once all the distributions parameters (Laplace for positioning PPP error and lognormal for ROTI) are known. But the problem is more complex because positioning error is commonly estimated from several GNSS links, that can be impacted differently by the ionospheric disturbances. Then, the conditional distribution must consider these different links to estimate a mean ROTI value.

Multiple link case
Several GNSS links have to be taken into account in the processes of estimating the PPP error. As these links are not corresponding to close values of ionosphere pierce point, a main assumption is that all links are considered as not correlated. Then, the average ROTI N is estimated from the ROTI values measured on N links: The number N is of the order of 10, if different satellite constellations are used. We have to estimate the distribution of N Â ROTI N , sum of N ROTI k values following lognormal distributions. A second approximation is then assumed: as the GNSS links are corresponding to the same ground station, the ROTI distribution of each link remains the same. The distribution of N Â ROTI N , estimated by the sum of independent ROTI values following the same distribution has to be derived. As no exact analytical expression is known for statistical characterization of the combination of N lognormal distributions, one will turn to approximation strategies. From Romeo et al. (2003), a sum of N lognormal random variables of parameters (l, r) can be approximated to a new lognormal distribution of parameters (l N , r N ): with l and r the parameters of the lognormal distributions of the N processes supposed statistically with the same characteristics, l N and r N are the parameters of the lognormal distribution of N Â ROTI N . Romeo et al. (2003) show in their paper that this formulation is correct with r in [0, 1.25]. For a ROTI lognormal distribution, r parameter is . From the regression presented in Section 3, the maximum values obtained for r induce that the formulation proposed by Romeo et al. (2003) is covering the problem.
From such an approach, the typical sum of lognormal distribution can be written e l N þr N Z where Z is of standard normal distribution. The average is then performed by dividing by N, introducing a normalization e l N þr N ZÀln N ¼ e l 0 N þr N Z . The final characteristics of the average ROTI N are then: V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 As a conclusion, we propose here an equivalent lognormal distribution of parameters ðl 0 N ; r N Þ of the ROTI N stochastic variable, obtained from the mean of N lognormal random processes. Thanks to this approximation, we can now use the same strategy presented in Section 5.4.2 replacing ROTI by ROTI N in order to estimate the PPP error statistics.
Typical results are shown in Figures 16-18 for East, North and Vertical errors respectively. The four graphs of each figure correspond to different levels of coupling function from 4000 to 30,000 Wb/s. Here the number of GNSS links N has been fixed to 10. The results are maps of the percentage of the time that a given error threshold is exceeded during the hour following 0h UTC, for the arctic region. To illustrate the variability with different thresholds, the values of 0.5, 0.75 and 1.5 cm are considered respectively for East, North and Vertical PPP errors, in Figures 16-18.
This way, in Figure 17 the probability of the East PPP error to exceed threshold 0.5 cm is reported. It shows that for a coupling function of 4000 Wb/s, the threshold of error 0.5 cm is exceeded less than 15% of the time, whereas for 10,000 Wb/s it is exceeded in the cusp close to 20% of the time, and when dU MP dt is increasing up to 30,000 Wb/s, in the auroral oval a percentage close to 30% of the time is reached. In Figure 17 the same representation is proposed, in the same conditions, but for the probability of North PPP error to exceed 0.75 cm. The Probability to exceed the error threshold of 0.75 cm is globally sparsely reached, with a maximum value close to 25% in the auroral oval for a coupling function of 30,000 Wb/s. Considering an error threshold of 1.5 cm, the Figure 18 shows that the Vertical PPP error is exceeded in the cusp around 20% of the time since dU MP dt = 4000 Wb/s, and reach a maximum value close to 30% in the auroral oval for dU MP dt = 30,000 Wb/s. From PPP error estimation used to establish the correspondence between ROTI and PPP errors, an exercise of validation of error model has been performed. It is based on data from a GNSS ground station named "hamc" and located at the magnetic latitude 65.5°North. The dataset used is from the days listed in Table 1. From solar wind conditions measured at L1 at time t 0 , characterized by the Newell coupling function, all the PPP errors in the following hour (up to t 0 + 1 h) have been   V. Fabbro et al.: J. Space Weather Space Clim. 2021, 11, 43 computed to establish the statistics. All MLT time have been considered. Then, this global statistic of error occurence has been compared to the proposed model. The latter has been applied to estimate Pr Pos j j > TejMLT; dU MP dt À Á at the magnetic latitude corresponding to the station position, using equation (10) and (16) Figure 19 are presented the results of comparison between direct and modeled CCDF, for Horizontal, East and Vertical PPP errors. The comparison shows correct results of compliance between the HAPEE model and the direct PPP estimations, with a clear lack of data to get convergence for probabilities lower than 10 À2 (in particular for North error). This validation is a check of the model consistency, and the primary perspective of this work is to realize an operational validation exercise of the PPP error model. It necessitates to characterize long term estimation of PPP errors statistics related to solar wind condition.

Conclusion
A new forecasting model called HAPEE is proposed, able to forecast during the next hour and at latitudes higher than 50°the distribution of the ROTI index or of the PPP error. The model proxy is the solar wind conditions at the Earth bow-shock-nose (typically measured at the L1 Lagrange point and then timeshifted). These conditions are then reduced to the Newell et al. (2007) coupling function.
The model is combining a lognormal distribution describing ROTI conditioned by solar wind conditions and a Laplace distribution describing PPP error conditioned by ROTI index. The global trend of the model has been checked regarding the mean and standard deviation of the ROTI index obtained versus magnetic local time, magnetic latitude and solar wind conditions. Mean and standard deviation of the ROTI modeled by HAPEE follow the expected behavior with respect to solar wind conditions, i.e. an increase of the auroral oval size and width whereas the solar wind conditions increases.
This study has led to a new global model that can be improved in different points. Further complementary validations by comparisons with other disturbances or PPP error data would be very interesting. An extended version of the model should also be regressed following the same philosophy and using more data, covering different countries and other solar cycle. The statistical approach applied to derive the model is very global and refinements can be imagined. For example, no distinction is made between intense solar wind events such as interplanetary coronal mass ejections (ICMEs) and corotating interaction regions (CIRs) events, associated to high-speed streams (HSSs) from coronal holes. In future work, the type of Space Weather event at the origin of disturbances could be distinguished and the regressions adapted.
From the databases available and following the same approach, fits of other outputs could be proposed to complete the model, such as the S4 and r / scintillation indices, or other error parameters.
Both the part of the model that forecasts ROTI based on solar wind parameters, and the part that forecasts PPP error based on ROTI, are empirical models. They are computationally expensive to construct but not computationally expensive to run afterwards. Input solar wind data can be retrieved in near-real-time from e.g. the data service of the Space Weather Prediction Center (SWPC) (currently available at this URL: https://services.swpc. noaa.gov/products/solar-wind/). A proof-of-concept prototype of an operational service has been implemented at NMA, retrieving the solar wind data from SWPC, running the models to generate a forecast map and visualizing this on a webpage. The data retrieval, processing and visualization add less than a minute of latency. The data latency from being measured by the satellite to appearing in the SWPC data file appears to be approximately a few minutes. Thus, forecasts can be made well in advance (typically 30 min to 1 h, depending on solar wind velocity) of the arrival of the solar wind structures. Discussions regarding establishing a full operational service are on-going.