A single-station empirical TEC model based on long-time recorded GPS data for estimating ionospheric delay

Globally distributed GPS (global positioning system) stations have been continuously running for nearly 20 years, thereby accumulating numerous observations. These long-time recorded GPS data can be used to calculate continuous total electron content (TEC) values at single stations and provide an effective modeling dataset to establish single-station empirical TEC models. In this paper, a new empirical TEC model called SSM-T1 for single stations is proposed on the basis of GPS data calculated by IONOLAB-TEC application from 2004 to 2015. The SSM-T1 model consists of three parts: diurnal, seasonal, and solar dependency variations, with 18 coefficients fitted by the nonlinear least-squares method. The SSM-T1 model is tested at four stations: Paris (opmt), France; Bangalore (iisc), India; Ceduna (cedu), Australia; and O’Higgins (ohi3) over the Antarctic Peninsula. The RMS values of the model residuals at these four stations are 3.22, 4.46, 3.28, and 3.83 TECU. Assessment results show that the SSM-T1 model is in good agreement with the observed GPS-TEC data and exhibits good prediction ability at the Paris, Bangalore, and Ceduna stations. However, at the O’Higgins station, the SSM-T1 model seriously deviates from the observed GPS-TEC data and cannot effectively describe the variation of mid-latitude summer night anomaly.


Introduction
Located from about 60 km to 1200 km (including the topside), the ionosphere is an important component of the earth's atmosphere. Under the synthetic effect of solar radiation (ultraviolet and X-rays), cosmic rays, and other settling particles, the neutral atmosphere of the ionosphere becomes ionized, that is, atoms are stripped of one or more electrons and they become positive ions. The abundance of free electrons and ions in the ionosphere significantly influence the propagation of radio waves (Yeh et al., 1972). In a global navigation satellite system (GNSS), satellite signals may be seriously disturbed by passing through the ionosphere and may even be interrupted (Klobuchar, 1987). Ionospheric delay on the satellite signals is positively correlated with total electron content (TEC) and has a negative correlation with the square of the frequency (Klobuchar, 1991). TEC is one of the crucial physical quantities used to study the ionosphere, and its unit of measurement is TECU (1 TECU = 10 16 electrons/m 2 ). This quantity is significant for the correction of radio wave propagation and the study of ionospheric theory. When the frequency of a signal is known, the ionospheric delay can be computed only by determining the TEC on the signal transmission path. Therefore, TEC can also be employed to describe the ionospheric delay of satellite signals. Dual or multi-frequency receivers of GNSS signals make it possible to compute the TEC from different ionospheric delays that signals experience at different frequencies (Ciraolo et al., 2006;Sezen et al., 2013;Minchan et al., 2014;Yang et al., 2018). However, single-frequency users cannot obtain the ionospheric delay or TEC through their own measurement data. In practical applications, an ionospheric empirical model is one of the vital means for ascertaining the TEC. An ionospheric empirical model based on the analysis of the ionospheric spatiotemporal variation and long-time recorded observation data can be established by a reasonable function (Jakowski et al., 2011a;Hoque & Jakowski, 2012;Mukhtarov et al., 2013;Feng et al., 2016Feng et al., , 2017aFeng et al., , 2017b. A global empirical TEC model based on the TEC data can depict the temporal and spatial variations of the TEC in general. However, some global empirical TEC models may have low precision and fail to accurately express the variations of the ionosphere in some local regions (Feng et al., 2016(Feng et al., , 2017b. Two likely factors restrict the accuracy of empirical TEC models. First, deviations of the ionosphere, particularly abnormal phenomena, are not fully considered and not reasonably modeled. Second, the precision of the modeling dataset, such as the TEC derived from global ionospheric maps (GIMs), are not uniform throughout the world.
Globally distributed GPS (global positioning system) stations have been operating for nearly 20 years, and an abundance of satellite observations have been accumulated. These data can be used to calculate the continuous ionospheric TEC time series over single stations, which provide rich modeling materials for the establishment of single-station TEC empirical models. The advantages of the single-station TEC empirical model are as follows: (1) Compared with the global area, the variation of the TEC on the single station is relatively simple and easy to model. (2) The TEC data on single stations are easy to obtain and have high and uniform precision for modeling. (3) The single-station TEC empirical model is simple, targeted, and easy to update. Meanwhile, there are also some disadvantages of single-station models. One of the main disadvantages is that the approach is limited to the location where the model is developed and may not be used for regional ionospheric TEC specification. However, considering that the ionospheric pierce points used to calculate TEC used in modeling may have covered a spatial resolution, the models developed for a specific station can be used to estimate TEC over other locations within a latitudinal and longitudinal coverage of some small degrees (Uwamahoro & Habarulema, 2015). In summary, it is significant to study how to build the single-station TEC models.
Based on GPS-TEC data calculated by IONOLAB-TEC application (Sezen et al., 2013) from 2004 to 2015, a new single-station TEC empirical model called SSM-T1 is proposed. The basic idea of establishing the SSM-T1 model can be described as follows. First, the diurnal, seasonal, and solar dependency variations are incorporated into the proposed models. Then, the modeling dataset derived from longtimerecorded GPS data are applied to the models, and the corresponding model coefficients are fitted using the nonlinear least-squares method. This method of SSM-T1 model is friendly and very easy for using. When the time information (day of the year [DOY] and local time [LT]) and solar activity parameter are given, the TEC over the target stations can be estimated by SSM-T1 model. And then the proposed model is tested through model residual analysis and evaluated through comparison with the GPS-TEC data and the IRI2016 model (Bilitza et al., 2017) in forecasting days.

Approach of empirical TEC model
At the single station, the variation characteristics of the TEC are unaffected by the changes in geographic/geomagnetic latitude and longitude. The deviations mainly include diurnal variation, seasonal variation, and variation related to solar activity. Therefore, the SSM-T1 model for a single station can be expressed as three components: where F 1 represents diurnal variation, F 2 represents seasonal variation, and F 3 represents variation related to solar activity.
The SSM-T1 model is a function of the DOY, LT, and the solar activity parameters. Under normal circumstances, the TEC value is smallest at night, gradually increases after sunrise, and reaches maximum at noon. Klobuchar model (Klobuchar, 1987) describes the diurnal variation of the TEC as two parts: the fixed value is taken at night, the daytime TEC changes are described in the middle part of the cosine function, and the time of the daily maximum TEC is fixed at 14:00 LT. As the Klobuchar model is extremely simple and the night is a fixed value, the model cannot accurately describe the variation of the daytime TEC and cannot depict the undulation of the nighttime TEC. The NTCM-GL model proposed by Jakowski et al. (2011b) describes the daily variation of the TEC as a combination of the diurnal, semi-diurnal, and ter-diurnal fluctuations, with diurnal TEC peak fixing at 14:00 LT. To keep the number of coefficients small as required in some applications the fixing might be justified being aware that the model accuracy is reduced. On the other hand, fixing of maximum TEC values is not set in some other similar models (Jakowski et al., 2011a;Hoque & Jakowski, 2012;Mukhtarov et al., 2013;Feng et al., 2016Feng et al., , 2017bHajra et al., 2016). Following the latter method, the diurnal variation component of the single-station model SSM-T1 proposed in this work does not set the parameters in relation to the maximum time of the TEC, but further refines the diurnal component. In the SSM-T1 model, the TEC diurnal component F 1 consists of four sub-harmonics of the solar day and four coefficients: where LT is the local time, and a i ; i ¼ 1; 2; 3; 4, and b i ; i ¼ 1; 2; 3; 4 are eight coefficients fitted by the nonlinear least-squares method. Similar to the diurnal variation component of the model, the seasonal variation component also uses four sub-harmonics (1, 1/2, 1/3, and 1/4 year) and four coefficients: where DOY is the day of year, and c i ; i ¼ 1; 2; 3; 4, and d i ; i ¼ 1; 2; 3; 4 are eight coefficients fitted by the nonlinear least-squares method.
The SSM-T1 model utilizes F10.7p as a solar activity parameter. F10.7p is the average of F10.7 and F10.7A (81-day moving average of F10.7). Compared with F10.7, F10.7p has a better linear relationship with extreme ultraviolet (EUV) lithography (Richards et al., 1994;Bilitza, 2000;Liu et al., 2006), which can effectively reflect the intensity of solar activity. The relationship between the TEC and the solar activity parameter F10.7p is considered linear, which can be expressed as where e and f are two coefficients fitted by the nonlinear least-squares method.

Modeling dataset and fitting results
The fitting ability of the SSM-T1 model to the observed GPS-TEC data on different single stations is tested in this section. Four different stations located at various places around the world are selected to test the SSM-T1 model: Paris station (opmt, 2.335°E, 48.645°N) in France, Bangalore station (iisc, 77.570°E, 12.937°N) in India, Ceduna station (cedu, 133.810°E , 31.694°S) in Australia, and O'Higgins station (ohi3, 57.901°W, 63.166°S) in the Antarctic Peninsula (Fig. 1). The four stations are located in the mid-latitude region of the northern hemisphere, the equator, the mid-latitudes of the southern hemisphere, and the vicinity of the Antarctic Peninsula, respectively. The GPS-TEC datasets of the four stations are applied to the SSM-T1 model, and the coefficients of the model are fitted by the nonlinear least-squares method. Thus, the corresponding empirical TEC models (SSM-T1-opmt, SSM-T1-iisc, SSM-T1-cedu, and SSM-T1-ohi3) on the four single stations are obtained.
The IONOLAB-TEC application (Sezen et al., 2013) is adopted to obtain the GPS-TEC datasets on the Paris (opmt), Bangalore (iisc), Ceduna (cedu), and O'Higgins (ohi3) stations from January 1, 2004 to June 30, 2015. The IONOLAB-TEC application is provided by the IONOLAB Group (www. ionolab.org). Detailed description of the IONOLAB-TEC can be found in reference of Sezen et al. (2013). Here, the overview of the algorithm is presented. First, using the inter-frequency satellite bias, inter-frequency receiver bias, and pseudorangeleveled baseline phase delay values in the model provided in Arikan et al. (2003Arikan et al. ( , 2004Arikan et al. ( , 2007 and Nayir et al. (2007), slant total electron content (STEC) for every satellite is computed with 30 s time resolution. STEC is converted into vertical total electron content (VTEC) by the application of the mapping function provided in Arikan et al. (2003Arikan et al. ( , 2004Arikan et al. ( , 2007 and Nayir et al. (2007). By combining the VTEC for each satellite in the least square sense by employing a weighting function to reduce the multipath effects with an elevation mask of 30°, Reg-Est algorithm is applied to obtain TEC estimates in the form of IONOLAB-TEC with 30 s time resolution as discussed in Arikan et al. (2003Arikan et al. ( , 2004Arikan et al. ( , 2007 and Nayir et al. (2007).
To reduce the computational complexity of the fitting procedures, the TEC values are resampled with an interval of 30 min (that is, the TEC values for every 30 min are considered). Besides, the GPS-TEC data are further selected under geomagnetic conditions. In some related studies, the Ap index not exceeding 30.0 is considered as quiet geomagnetic conditions (Hedin, 1984;Feng et al., 2016Feng et al., , 2017b. Following the previous studies, only the GPS-TEC data for which Ap index under 30.0 are used for modeling. In this way, the GPS-TEC datasets on the Paris (opmt), Bangalore (iisc), Ceduna (cedu), and O'Higgins (ohi3) stations are obtained, and sample data at 12:00 LT are shown in Figure 2.
The datasets are applied to the SSM-T1 model, and then the fitting results of the 18 · 4 coefficients of the SSM-T1(-opmt, -iisc, -cedu, and -ohi3) models under the 95% confidence interval are ascertained by using the nonlinear least-squares method, as shown in Table 1.
The quality of the fitting results can be evaluated through the model residual that is defined as the difference between the observed GPS-TEC data and the model fitting result and is expressed in equation (5). In this study, the mean error (ME), root mean square error (RMSE), and standard deviation error (STDE) of the model residuals are selected as evaluation parameters, as shown in equations (6)-(8), respectively (Mukhtarov et al., 2013). Figure 3 shows the distribution histograms of the SSM-T1(-opmt, -iisc, -cedu, and -ohi3) models residuals and the evaluation parameters.
As shown in Table 2, although most of the SSM-T1(-opmt, -iisc, -cedu, and -ohi3) model residuals are within ±5 TECU, their corresponding relative residual errors differ. The relative residual errors of SSM-T1-ohi3 are bigger than those of the other three models. This finding indicates that the fitting ability of the SSM-T1 at the ohi3 station is weaker than that of the other three stations: Then, the fitting ability of the SSM-T1 model at the selected four stations to the modeling GPS-TEC data is tested. The testing years are 2008 and 2013. The former is a low solar activity year and the latter is a high solar activity year. Three consecutive days in March, June, September, and December are used to evaluate the SSM-T1 model against the observed GPS-TEC data. Note that the three consecutive days of the selection are the 19th, 20th, and 21st day of each month, but when the missing GPS observation data cause a discontinuity in the TEC sequence, another three consecutive days are selected as the test time points.

Comparison of ionospheric models
The prediction ability of the SSM-T1 model at the four stations (opmt, iisc, cedu, and ohi3) is evaluated in this section. For objective and fair evaluation, the data involved in the assessment should be different from the modeling dataset. Therefore, the GPS-TEC data derived from IONOLAB-TEC application and the IRI2016 model are adopted for comparison with the SSM-T1 model. In addition, given that the modeling time of the four stations is from January 1         section, actual GPS-TEC data are used as a benchmark, and the SSM-T1-opmt and IRI2016 models are compared with it. Figures 8 and 9 show the following findings: (1)    November 20, December 21, 2015), the IRI2016 model is either too high or too low for the assessment of the peak TEC. At some test time points (such as February 20, August 20, 2016, April 20, July 20, August 20, 2017), the IRI2016 model is in good agreement with the GPS-TEC data.
To effectively compare the SSM-T1-opmt model and IRI2016 model to GPS-TEC data, the difference between the SSM-T1-opmt model and GPS-TEC data, and the difference

Evaluation of SSM-T1-iisc model
Comparisons of the SSM-T1-iisc model, IRI2016 model, and the GPS-TEC data in the 2016 and 2017 test time points are shown in Figures 10 and 11. The RMS values of the models' differences between the test time points in 2016 and 2017 are counted, as shown in Table 4. Figures 10 and 11 depict the following findings: (1) Except for test time points in January 20, 2016, the consistency of the SSM-T1-iisc model and the GPS-TEC data is better than that of the IRI2016 model. In test point of January 20, 2016, the SSM-T1-opmt model is in poor conformity with the GPS-TEC data, but it is in good agreement with the IRI2016 model. (2) At the test time points in January 20, February 20, March 20, May 20 and July 20, 2016, December 20, 2017, the IRI2016 model underestimates or overestimates the peak TEC. Besides, as indicated in the statistical data in Table 4, the RMS value of the difference between the GPS-TEC data and the SSM-T1-iisc model is the smallest in the test time periods. Therefore, at most of time points, the SSM-T1-iisc model has stronger predictive power than the IRI2016 model.  Table 5 indicate that the RMS value of the difference between the GPS-TEC data and the SSM-T1-cedu model is the smallest in the test periods. Therefore, in most test time points, the SSM-T1-cedu model has stronger fitting ability than the IRI2016 model.

Evaluation of SSM-T1-ohi3 model
The fitting test of the SSM-T1-ohi3 model in Section 3 shows that the SSM-T1 model is unsuitable for the O'Higgins station (ohi3). To further illustrate this problem, the prediction ability of the SSM-T1-ohi3 model is evaluated. Figures 14 and  15 present the contrast diagram of the SSM-T1-ohi3 model, the GPS-TEC data, and IRI2016 model. Table 6 presents the statistics of the RMS values of the models' differences between the test time points in 2016 and in 2017. Both Figures 14 and 15 and the statistical data in Table 6 indicate that the SSM-T1-ohi3 model is in poor agreement with the GPS-TEC data and cannot describe the change characteristics of the TEC completely. Thus, the SSM-T1-ohi3 model is hardly able to predict the variation of the TEC at the O'Higgins station over the Antarctic Peninsula.

Discussion and conclusion
The validity of the model is verified in three stations (namely, Paris, Bangalore, and Ceduna) at different latitudes. However, the O'Higgins station (ohi3) is located in a typical mid-latitude summer night anomaly (MSNA) area, where the TEC diurnal variation is not uniform, and seasonal differences occur (Horvath & Lovell, 2009;Lin et al., 2009Lin et al., , 2010Thampi et al., 2009;Liu et al., 2010). Figure 16 shows a typical MSNA phenomenon obtained by CODE GIMs on December 12, 2012 at 06:00 UT. The picture indicates the occurrence of a clear Z. Zhao et al.: J. Space Weather Space Clim. 2018, 8, A59 MSNA phenomenon at the O'Higgins station over the vicinity of the Antarctic Peninsula. Studies (Lin et al., 2010) show that the MSNA phenomenon in the southern hemisphere has a long span, usually lasting from October to February of the next year. The GPS-TEC curves in Figures 14 and 15 also reveal the characteristics of the MSNA: the TEC diurnal variation of the local time in summer (January and December) is the opposite of the TEC diurnal variation in other seasons (March, June, and September), demonstrating that the maximum value of the TEC appears at midnight, and the minimum value occurs around noon. The SSM-T1 model does not have an amendment to the MSNA and cannot describe the distinct TEC diurnal characteristics that correspond to different seasons. Therefore, the model is unsuitable for stations in the MSNA area.
The SSM-T1 model exhibits good fit and forecasting precision on testing stations in the middle and low latitudes. Besides, the SSM-T1 model's failure at the O'Higgins station may provide a reference for the construction of other types of single-station TEC empirical models for stations in the MSNA area. Based on the evaluation of the SSM-T1 model at the O'Higgins station and the variation of the MSNA, possible solutions to improve the effectiveness of the SSM-T1 model for an MSNA area are as follows: (1) The diurnal variations of the TEC in MSNA are seasonal or differ monthly, thereby presenting opposite characteristics in summer and winter. Thus, a possibly usable single-station TEC empirical model might be built seasonally or monthly, with its coefficients obtained separately for different seasons or even different months.
(2) In general, particle precipitation during higher geomagnetic activity contributes to the ionization at higher latitudes. Thus, the extension of the model to higher latitudes might require usage of an index of geomagnetic activity, e.g., the hemispheric power index, routinely provided by NOAA.