Solar radio proxies for improved satellite orbit prediction

– Speci ﬁ cation and forecasting of solar drivers to thermosphere density models is critical for satelliteorbitpredictionand debris avoidance.Satelliteoperatorsroutinely forecast orbits upto 30days into the future. This requires forecasts of the drivers to these orbit prediction models such as the solar Extreme-UV (EUV) ﬂ ux and geomagnetic activity. Most density models use the 10.7 cm radio ﬂ ux (F10.7 index) as a proxy forsolarEUV.However,dailymeasurementsatothercentimetricwavelengthshavealsobeenperformedbytheNobeyamaRadioObservatory(Japan)sincethe1950's,therebyofferingprospectsforimprovingorbitmodeling.Herewepresentapre-operationalserviceattheCollecteLocalisationSatellitescompanythatcollectsthesedifferentobservationsinonesinglehomogeneousdatasetandprovidesa30daysforecastonadailybasis.Interpolationandpreprocessingalgorithmsweredevelopedto ﬁ ll in missing data and remove anomalous values. We compared various empirical time series prediction techniques and selected a multi-wavelength non-recursive analogue neural network. The prediction of the 30 cm ﬂ ux,andto a lesser extent that of the 10.7 cm ﬂ ux, performs better than NOAA's present prediction of the 10.7 cm ﬂ ux, especially during periodsofhighsolaractivity.Inaddition,we ﬁ ndthattheDTM-2013densitymodel(DragTemperatureModel)


Introduction
Modeling the upper atmosphere density is essential for nowcasting and forecasting satellite orbits at altitudes up to approximately 1000 km.Space Agencies and satellite operators constantly seek to improve the orbit prediction of the satellites they operate for tracking and programing purposes.Another and increasingly important application is collision avoidance.Upper atmospheric density variations are primarily driven by variations in solar Extreme-UVemissions (EUV, 10-120 nm), for which the 10.7 cm radio flux (F10.7) is routinely used as a proxy (Floyd et al., 2005).However, radio observations at other centimetric wavelengths have recently been shown to offer equivalent if not better performance (Dudok de Wit and Bruinsma, 2017), hence the need for nowcasting and forecasting them.Here, in addition to the classical 10.7 cm radio flux from Penticton Observatory, we consider daily radio observations made between 3.2 and 30 cm at the Nobeyama Radio Observatory in Japan.Although these datasets are publicly available, they are provided in different formats and are occasionally affected by data gaps or corrupted by solar flares, which hinders their immediate use.Therefore, our first objective is to correct for these and deliver one single homogeneous dataset with daily values from 1951 to today.The different data-processing steps take advantage of the simultaneous measurements at the different wavelengths and are described in Section 2. Our second objective is to deliver a one-month forecast.While physical models perform better for forecasting the solar radiative output on time scales of weeks (e.g.Henney et al., 2012), at this stage we consider empirical methods only.We tested two forecast methods, based on autoregressive (AR) and neural networks algorithms, using the complete and preprocessed time series of solar fluxes.The results are described in Section 3.These extrapolated series are used as an input in thermospheric density models.Section 4 describes the Drag Temperature Model (DTM-2013) (Bruinsma, 2015), driven here with the 30 cm radio flux (F30) instead of F10.7 because of its better performance, as recently demonstrated by Dudok de Wit and Bruinsma (2017).In order to assess the contribution of F30 on orbit prediction, we compare its forecast to the forecast obtained with F10.7 by means of an error analysis of the extrapolated orbits obtained with either F30 or F10.7.The resulting errors along the satellite orbit are compared and presented in Section 5.
The objective of this study was to develop a preoperational service that provides real-time specifications and forecasts of radio fluxes at various wavelengths for use in thermospheric models for satellite orbit prediction.In the appendix we describe the organization, the content and the format of the files on the ftp server of the Collecte Localisation Satellites (CLS) company, which hosts this service at the following address: https://spaceweather.cls.fr/services/radioflux/.
Most thermospheric models use both daily values and running averages (typically, over 81 days) to capture the trend.However, because the procedure for estimating such trends is model-dependent, here we concentrate on daily values only.

Data description and preprocessing
Radio emissions at centimetric wavelengths and EUV emissions are generated by different physical processes but originate from the similar regions of the solar atmosphere; as a result, radio emissions (which can be measured from the ground) can provide reasonable proxies for solar EUV and UV emissions (which must be measured from space); see for example (Nicolet and Bossy, 1985;Tapping, 2013).Best known is the radio flux at 10.7 cm wavelength, i.e. the F10.7 index, which is used as a solar input in many space weather applications.Radio emissions at other wavelengths have a different blend of chromospheric and coronal contributions, which is one of the prime motivations for performing a multiwavelength monitoring of the Sun.
The F10.7 index has been measured continuously since 1947, first at Ottawa, and subsequently at the Penticton Radio Observatory in British Columbia.Other wavelengths have been routinely measured at several locations.Of particular interest are the radiopolarimetric observations from the Toyokawa and Nobeyama observatories in Japan because of their continuity and stability.These observations started in 1951.Although numerous other solar proxies exist, these radio observations stand out by offering over 60 years of daily observations with high radiometric stability and very few interruptions.
Here we concentrate on measurements made at 5 wavelengths (see Tab. 1) that offer a high potential for space weather applications.All consist of daily observations, and are almost devoid of flare-related contributions.Daily values of the F10.7 index are obtained from three flux determinations per day that are centered at local noon (Tapping, 2013); for the Toyokawa and Nobeyama observatories flare-corrected daily averages are provided (Tanaka et al., 1973).These instruments are calibrated at least once per day, which ensures a high stability.
The original observations are displayed in Figure 1.On time scales of several months and beyond, the radio fluxes evolve almost identically with the solar cycle.On shorter time scales, however, their evolution differs in subtle ways that are of great interest for understanding their solar sources.These aspects are beyond the scope of our study; we refer to (Schmahl and Kundu, 1994;Dudok de Wit et al., 2014) for further reading.
The measurements that are listed in Table 1 are publicly accessible with a latency of a few hours only.Such a latency is acceptable for daily values of the radio fluxes because subdaily solar transients such as flares have a limited impact on the thermospheric density (except for the largest X-class flares).Several operations and corrections are required before these  data can be used for operational purposes.Here we describe the main preprocessing steps.

Data collection
All the observations from Ottawa/Penticton are available in one single file at the NOAA's National Center for Environmental Information (ftp://ftp.ngdc.noaa.gov/STP/space-weather/solar-data/solar-features/solar-radio/noontimeflux/penticton/penticton_adjusted/listings/) while observations from Toyokawa and Nobeyama are provided in monthly files by the Nobeyama Radio Observatory (http://solar.nro.nao.ac.jp/norp/data/).This dispersion of the information may partly explain why these data have received little attention so far.The first step thus consists in collecting all the data in one single file, with daily values since November 1, 1951.Notice that the complete set of wavelengths is accessible only from 1957 onwards and so earlier values are extrapolated (see below).The fluxes are expressed in solar flux units (sfu) and two records are provided: one with values adjusted to 1 AU, and one with fluxes as measured on top of the atmosphere.For driving thermosphere and ionosphere models, the 1 AU correction is not used.

Timing
The definite value of the F10.7 index from Penticton is made available shortly after the last measurement of the day, before 24:00 UT.For Nobeyama, the provisional values come out around 12:00 UT and definite values with minor adjust-ments are published once per month, with a latency of 15-40 days.Both indices are thus available in a timely manner for operational use.However, they are measured approximately 17 h apart.No attempt was made here to resample them to a common time grid.Such a resampling may be needed for performing more detailed comparisons as in Dudok de Wit et al. (2014).However, because solar variability is undersampled in the records (i.e. the Shannon-Nyquist theorem is not verified), one should resample them with great care.

Anomalous values
The radio observations are occasionally affected by anomalous or deviant values, whose origin may be either instrumental or solar.Since these values are either erroneous (when they are instrumental) or correspond to short-lived solar transients that have limited impact on the thermospheric density, they must be identified and if possible corrected in order to maintain a statistically homogeneous data set.To identify such anomalous values we first need a measure of uncertainty for each observation.In the absence of such a common measure, we define an empirical one and consider the precision or random error, which expresses the closeness of agreement between independent measurements for a constant level of solar activity.We estimate the precision by using the same AR model approach as in Dudok de Wit et al. (2016), in which the residual error ¼ y À ŷ between observations Y and their prediction ŷ is estimated by means of an AR model; the prediction is expressed as a linear combination of past values, estimated here from the 8 preceding days.We then define the precision as the median absolute deviation (MAD) of the residual error s = 1.48MAD ().We prefer the MAD to the more usual standard deviation (std) because it is less sensitive to occasional outliers (to be addressed below).The MAD and the std give identical results for residual errors that are normally distributed.
The precision, when estimated over a time interval of 80 days (corresponding to 3 solar rotations, i.e. the typical lifetime of an active region) increases almost linearly with the average radio flux.This convenient property allows us to express the precision as a simple function of the flux, see Table 2.The relative magnitude of the precision is solar cycle-independent, except during the first few years of operation of each Fig. 2. Probability density function (PDF) of the residual error for different wavelengths, estimated with kernels.A normal distribution is also shown for reference.Here we only consider periods with the same level of solar activity, defined as 80 < F10.7 < 100 sfu.Let us stress that our precision includes both instrumental effects and solar intrinsic variability, which are difficult to separate.As a consequence, this quantity is likely to overestimate the instrumental error.This, however, is compensated by the tendency of the AR model to underestimate the residual error and, finally, precision should still reflect the instrumental error reasonably well.Our definition of the precision clearly has its limitations and some arbitrariness, but these are compounded by its simplicity and by the possibility to apply it in a systematic way to all the records.
With this definition of the precision we now have a criterion for detecting anomalous values.For a given level of solar activity, the residual error ¼ y À ŷ is normally distributed with fat tails that are indicative of outliers, see Figure 2. A simple criterion for detecting the latter then consists in selecting all observations whose residual error, in absolute value, exceeds 4s .
In practice, with this criterion, we tend to detect more outliers during solar minimum, when the residual error is relatively large with respect to the radio flux.Such small outliers represent relatively minor excursions of the radio flux and therefore have no significant impact on thermospheric models.For that reason, we consider two conditions, one relative, and one absolute jj s > 4 and > 10 sfu: With this empirical criterion, typically 0.5% of the observations are flagged as outliers and discarded in what follows.
Figure 3 shows two examples of outliers that have been detected with our criterion.While the first one is clearly suspect because it affects one single wavelength, the second one is more debatable.As it turns out, there are two categories of outliers.About one half of them can be detected by eye and most likely have an instrumental or human origin.The other half occurs during periods of intense solar activity, when observations are more easily contaminated by non-thermal emissions that originate from flares.Some flaring periods tend to last for several days, such as the 2003 Halloween storm period; while the AR model is able to identify the first one or two contaminated days because of the sharp increase in the radio flux, it usually fails to flag the subsequent days because the model starts using outliers for making its prediction.There is no simple workaround to this problem without introducing an independent measure of solar activity that would inevitably add complexity to our model.Additionally, one may improve the outlier detection by comparing values from different observatories.Indeed, since they make observations at different times, the risk of having concomitant contaminations is smaller.
With our criterion, the average number of outliers per year is (2.0, 0.90, 1.24, 0.38, 0.37) for wavelengths ranging respectively from 3.2-30 cm.This number is largest for the 3.2 cm flux, which is more sensitive to transient non-thermal emissions and to tropospheric effects.The number of outliers is also relatively larger for Penticton because its daily value is based on three observations, in contrast to Toyokawa/ Nobeyama, which uses all observations from sunrise to sunset.
We estimate the true number of outliers, including all flarecontaminated observations to be approximately 20-40% larger than the figures given above.Given that these non-detected outliers are rare and considerably more difficult to identify, we decided to consider them as regular values.Note therefore that during periods of intense activity, the radio flux may be overestimated because of the presence of non-thermal contributions.

Missing values
As shown in Table 1, all records occasionally have missing values, for which substitutes need to be found.For Toyokawa/ Nobeyama, these data gaps usually affect one to three wavelengths simultaneously.However, never so far did we observe a simultaneous interruption at all five wavelengths.These missing values tend to occur randomly in time.One exception is the relocation from Toyokawa to Nobeyama, which took place between February and May 1994.During that period, observations at 30, 15 and 3.2 cm were interrupted for three consecutive months; eighteen days are missing at 8.0 cm (Fig. 4).In addition to these gaps, there are other few outliers, which we ignore in what follows.
Data gaps are usually replaced by direct interpolation or by using proxy models.Interestingly, the radio fluxes are very coherent: observations are highly correlated across wavelength for a given time, but also in time for a given wavelength.This coherence allows us to use the expectation-maximization method (Dudok de Wit, 2011), which is ideally suited for reconstructing missing values in correlated multivariate records.The method replaces missing values with a linear combination of observations made on the same day, and on the day before, using all available wavelengths.The performance can be tested by leaving out a few days, and then checking their reconstruction.We find the reconstruction error to be comparable to the precision, which supports the excellent performance of the method.
Note that this gap filling is suboptimal during extended periods of strong solar activity, when the observations are likely to be contaminated by non-thermal emissions.Both the outlier detection and the gap filling methods are then affected, leading to an overestimation of the radio flux.Clearly, this issue can be fixed only by considering additional solar proxies, such as the MgII core-to-wing index.
3 Forecast methods

Introduction
The initial objective of our model was to forecast the F30 proxy up to 7 days ahead.We later extended the horizon to 30 days for all five wavelengths.The results of these investigations were obtained by means of the "R" software (R Core Team, 2017).
Our dataset of solar fluxes contains five records of daily values from November 1st, 1951 till today, including complete solar cycles 19-23, and most of cycle 24.We separated this dataset into two subsets: the first four solar cycles (years 1951-1995), representing 70% of the data, were used as the training set, and the last two cycles (years 1996-2016), as the validation set.Both subsets cover a complete range in solar activity, from cycle minimum to maximum and even varying levels of maximums, with moderate and strong solar cycles.The F30 data are shown in Figure 5.

Autoregression methods
Our initial attempts of forecasting were based on AR models (Percival and Walden, 1993).The basic principle of an AR model is to express the value of day D as a linear  combination of previous values, i.e. days D-1 to D-p, where p is the order of the model.This order can be chosen as the maximum lag for which the autocorrelation function of the time series has significantly non-zero values, which is 23 days in case of F30.
ARIMA (Auto Regressive Integrated Moving Average) models are a more sophisticated version of AR models, where one adds two orders, d and q: d is the order of differentiation of the time series, and q is the moving average order.The ARIMA (p, d, q) model can be expressed as: where y t is the value of the variable at time t, c i is the AR coefficient of order i; d is the average of the differentiated time series; t is the model residual at time t; u j is the moving average coefficient of order j.
There is generally no unique ARIMA model adapted to a given time series, and the determination of the different orders relies somewhat on empirical methods.We followed the widely used one proposed by Box and Jenkins (1970).In the case of the F30, several tests indicate an optimal differentiation order d = 1, which minimizes the std of the series, and for which the autocorrelation function has neither too many positive values, nor too frequent changes in sign.The AR order was chosen at p = 2, corresponding to the maximum lag where the partial autocorrelation function has significant values.Finally, determination of the order of the moving average was not straightforward, an optimal value was found at q = 2 based on systematic tests.The resulting ARIMA (2, 1, 2) model gives results comparable to the AR (23) model, but more efficient, with only 4 coefficients instead of 23.The approach to diagnose model performances was to compute the Root Mean Square Error (RMSE, also designated as RMS in this paper) for each forecast horizon and to compare it to the RMSE of the simple persistence model, which assumes all future values are equal to the one last observed.Figure 6 presents the forecast error for the AR, ARIMA and persistence models.

Neural networks methods
The main part of the modeling investigations is based on a neural networks approach.A neural network is a mathematical set of elements, each composed of an input vector x i , a weights vector w i , a bias b and an activation function c such as its output y is given by: These elements, the neurons, are organized in layers connected to each other: one network comprises an input layer, one or several intermediate (or hidden) layers, and an output layer.The learning process consists in adjusting weights so that the output fits well the data.The use of neural networks in forecasting problems has been detailed by Zhang et al. (1998).
The implementation used in this study relies on a logistic activation function for the hidden layer, and a linear one for the output layer.Several model versions have been tested, in the order of increasing complexity.
The minimal neural network to forecast solar radio flux has a unique output which is the forecast for day D þ 1.Since we want to forecast the flux up to several days out, several approaches are possible: recursive structure: The minimal network is applied recursively for each forecast step, where forecasted values for day D þ n are input data to compute the forecast for day D þn þ 1.This is the solution that requires the least parameters, but there can be an error accumulation effect for the longest forecast horizons, though it was not significant below 30 days; successive structure: Each forecast horizon has an independently adjusted network.This solution is longer to compute since the number of parameters increases linearly with the maximum forecast horizon, but each network remains as simple as the minimal network; parallel structure: A single large network is built, in which each forecast horizon is one of its outputs.In practice, this solution is the simplest to use, but the size of the network and consequently the large number of parameters makes the optimization less efficient.
These three approaches have been tested, and their performances are not significantly different.During the following investigations, both the recursive and parallel structures have been used.
The first optimization phase of the neural network model has been orientated on the input data selection.Considering the rotation period of the Sun of approximately 27 days and as confirmed by the first investigations on AR models, the flux forecast is sensitive to values until at least 27 days in the past.However, all values are not equally important for the forecast, and due to model efficiency concerns, it is preferable to limit the number inputs (and the parameters) to the necessary minimum.To achieve that, the lags were selected iteratively between 1 and 30 days, adding at each step the one minimizing the forecast RMS.Since the model was designed to forecast solar flux at different horizons in the future, a unique criterion was needed to take into account all time horizons at once, without giving too much weight to the farthest one out, which have the largest errors.Thus, we used a mean over all horizons of the ratio of the neural network model RMS to the persistence model RMS, given by: Relative where n is the maximum forecast horizon.Using this criterion, the simple recursive neural network model did not need to have more than 5 inputs, the RMS decreasing no more than 1% starting from the 4th iteration.The results of the successive iterations are listed in Table 3.
After having selected the most adapted input lags for the F30 time series, the forecast could be further improved by adding to the model inputs past values of other wavelengths of the solar radio flux.These complementary wavelengths were again added iteratively, each one with the same time lags as the 30 cm wavelength.A decrease in RMS of almost 10% was achieved by combining the 4 longest wavelengths in the inputs, though the main contribution was due to F10.7.The shortest wavelength, 3.2 cm, did not allow any improvement, which is consistent with its noisier behavior as mentioned in paragraph 2. Consequently, it was not included in the final model.Results for this test are listed in Table 4.The performances of the ARIMA model and the two NNET models (mono and multiwavelengths) are displayed in Figure 7.
After having chosen the best input to forecast F30, the second phase consisted in optimizing the more internal parameters of the neural network such as the size of the hidden layer, the maximum number of iterations during training or the initialization of the weights.
During the input optimization phase, the number of neurons in the hidden layer was arbitrarily fixed at 5, which is a reasonable assumption considering the number of input neurons (5 at first, then 20).A systematic computation of the model RMS for all sizes of hidden layers from 1 to 50 neurons, as shown in Figure 8, indicates that the results are not very sensitive to this parameter: the minimum error corresponds to a size of 10 neurons, but with less than 2% dispersion from 2 to 16 neurons.Finally, this parameter was set at 7 neurons, a good compromise between model error and network size.
The model was then tested for a possible overfitting: during the model training, if the adjustment is too close to the training set, the obtained model will have difficulties reproducing other data sets.This is visible if the model applied on the test set has an increasing error after a given number of training iterations.In our case, there was no significant overfitting, but the minimum model error on the test set was reached at 600 iterations, well before the final convergence of the model on the training set.This is shown in Figure 9.
The last optimization test performed on the model was its sensitivity to the initialization of the weights: before the neural network training, its weights have to be randomly initialized in order to avoid any systematic bias in the final solution.But since there is generally no unique solution, and particularly, since it is not always beneficial to let the model fully converge during training because of overfitting, each set of random initial weights leads to a slightly different solution.Probing 50  sets of initial weights, the 43 results have an error dispersion of around 3%, which is obviously not negligible compared to the previous optimization tests.
The neural network model development was focused on a 7 days forecast horizon.However, there is an interest for operational applications in having 30 days forecasts.Since the final multi-wavelength model has a parallel structure, with one output neuron per forecast horizon, a 30 days model network is intrinsically different from one for 7 days.All optimization steps were run again, but it led to very little improvement: the 30 days forecast obtained from a specific version has an error of only 0.7% smaller than the 30 days forecast based on the parameters of the 7 days version.
In order to estimate the error on forecasts of future values, a simple statistical model was deduced from the relationship between forecast model RMS and flux values.81 days exponential moving averages of both quantities show a strong positive linear correlation, from which a linear regression can be derived.Figure 10 illustrates the example of 1 day forecasts.

Comparison of model errors
During model development, only the global error over the whole period of the validation set has been considered.Analyzing the errors per year showed them to be positively correlated with the level of solar activity, with an error 5-6 times larger in 2001 than in 2009.The distribution of the errors between different models, however, does not significantly depend on the year, except for the fact that all models give almost similar performances during solar minimum, as displayed in Figure 11.
At this point, the only diagnosis of the models was based on the RMS, which only represents absolute values of the errors.If we consider the mean error (estimated from the difference between model and observations) then more complex models appear to have greater biases than simpler ones.At all forecast horizons, the multi-wavelength model is significantly more biased than the 3 others: the mean underestimation of the flux ranges from 0.3 sfu at a 1 day horizon to 0.7 sfu at 7 days, when all the other models have no significant bias at 1 day, and slightly overestimate of 0.1 sfu at 7 days (Fig. 12).However, for all models, the mean biases are always 1 order of magnitude smaller than the RMS.The different behavior of the multi-wavelength model also stands out when considering the temporal evolution of the biases.As shown in Figure 13, all models except this one evolve from a   bias during the ascending phase to a positive bias during the descending phase, which is one can expect from forecasts that, on average, lag behind observations.In the meantime, the multi-wavelength model remains always negatively biased, with an amplitude that roughly follows the evolution of the cycle.

Comparison to NOAA F10.7 forecast
The results of the model developed for F30 have been compared to F10.7 forecasts provided by the US Air Force, which are published by the NOAA/SWPC (ftp://ftp.swpc.noaa.gov/pub/latest/45DF.txt).These forecasts are currently used on an operational basis by CNES in orbit prediction.The comparison is focused on four different 365 days periods picked inside the validation set, covering four different levels of solar activity described in Table 5.

Results for F30
In order to compare F30 and F10.7 forecast errors, F30 is rescaled to F10.7 values.The transformation considered is the one used by the thermosphere density model DTM-2013, which takes F30 scaled to F10.7 as input (cf.Sect.4): The comparison is based on the mean error and std of the error, computed here as "observation À forecast".When comparing both quantites, F30 forecast performances are as good as F10.7 ones during low solar activity, and significantly better during the three other periods (Fig. 14).The F30 also gives logically better results when comparing maximum error, and accumulated error over all forecast horizons, a quantity that better represents the error during orbit calculations.

Results for F10.7
The same comparison was performed between NOAA F10.7 forecast and a F10.7 forecast obtained using the same method as for F30, i.e. with a neural network based model, and all the radio wavelengths as input.The error is also lower for this new model than with the NOAA model for all configuration of solar activity, especially during high activity with an improvement of 15%.However, the gain in performances is less evident than with F30, especially during low and decreasing solar activity.Besides, our model is more     than the NOAA model during low solar activity period (Fig. 15).

Thermosphere density model
Semi-empirical thermosphere models are used in satellite orbit determination and prediction programs to compute the atmospheric drag force, as well as in upper atmosphere studies.They predict temperature and (partial) density as a function of the location (altitude, latitude, longitude, local solar time), solar and geomagnetic activities, and day-of-year.The first Drag Temperature Model, DTM-78 (Barlier et al., 1978), was developed in the seventies; its most recent version is DTM-2013 (Bruinsma, 2015).
The DTM models are constructed by fitting to the underlying density database, which is considerably larger and more accurate for DTM-2013 than for DTM-78, to reproduce the climatology of the thermosphere.The spatial resolution of the models is of the order of thousands of kilometers.The solar proxy used in DTM-2013 is F30 instead of F10.7, which has improved modeling performance during the solar cycle minimum in 2008-2009, even if density is still being overestimated, as well as for the years 2010-2011.The density at about 800 km altitude in the interval 1992-2011 can also be more accurately reproduced thanks F30 (Bruinsma, 2015).The switch to F30 was made based on a study by (Dudok de Wit et al., 2014), in which test models were developed with F10.7 and F30; the results were clearly in favor of F30, which had 7% lower RMS of the observed-tomodeled density ratios.
Recently, the part of F30 versus F10.7 in the improvement of thermosphere modeling was analyzed more in detail and quantified in (Dudok de Wit and Bruinsma, 2017).They reported up to 20% smaller model bias, a better stability of the bias over time, and variations at the solar rotation period being reproduced with 35-50% smaller errors.

Contribution of the new indices to orbit extrapolation
In this section, we intend to evaluate the contribution of the F30 prediction to the orbit extrapolation using the DTM-2013 atmosphere density model (see Sect. 4).The theoretical alongtrack error is estimated for the four periods described at the end of Subsection 3.5 with our F30 and F10.7 predicted series.This error is compared to the along-track error obtained with the NOAA F10.7 predicted indices.It shall be noticed that the estimated errors cover only the uncertainty of the predicted solar flux.It is an important part of the error, but other contributions have to be taken into account to estimate the whole orbit prediction error: the uncertainty of predicted geomagnetic coefficients, the intrinsic accuracy of the density model and the drag characteristics of the spacecraft.
The goal is to estimate how our predictions of the solar flux influence satellite position on the orbit path.To this aim, only the solar fluxes used are different in the comparison: on the one hand we use the observed fluxes and on the other hand the predicted ones.We choose to use the DTM-2013 atmospheric density model as it is able to exploit either the F30 or the F10.7 as an input.The Ap geomagnetic coefficients are the predicted series from NOAA.The chosen orbit is quasi circular, sun-synchronous at an altitude of 600 km and the satellite ballistic coefficient is fixed to 0.01 kg.m À2 .As the variation of the semi major axis is proportional to this coefficient and to the atmospheric density, one can compute its time derivative using the extrapolated solar flux or the observed solar flux.Finally, the third Kepler law leads to establish the following equation: where D stands for the difference between real and predicted values; a is the position on orbit; n is the mean motion of the satellite orbit; a is the semi major axis of the satellite orbit.
A double numerical integration gives Da which corresponds to the along track error.A positive error means the predicted position is behind the real one, with a longer orbital period, indicating that the atmospheric density (and therefore the solar flux) was underestimated.A negative error corresponds to an overestimated solar flux.
Four periods of distinct solar activity are considered in this analysis (same as Tab.5), and the prediction horizon ranges from 1 to 30 days.In addition to the F30 and F10.7 extrapolated time series computed at CLS, we use the F10.7 extrapolated by NOAA as a reference.An example at high solar activity is given in Figure 16.On the left side the mean and std of the along-track error is represented on 30 days for F30/CLS compared with F10.7/NOAA.On the right side, the function distribution of the along-track error after 30 days is represented for F10.7/CLS compared with F10.7/NOAA.
Complete statistics are given in Figure 17 including respectively the absolute value of the mean error, the std, the absolute value at 1% percentile (expressing the minimum value) and the value at 99% percentile (expressing the maximum value).The latter is interesting as it is commonly used as a proxy in mission analysis The more important period is the one with the highest solar activity, as the prediction is much more difficult than for the lower activity period.Concerning the mean error, it is a bit less using the F30 indices, either at 7 or 30 days.However the biases are negligible compared to the variability of the alongtrack error, expressed by the std.At 7 days, the std values are quite the same for the three tests.But at 30 days, the error using the F30 series is dramatically lower than when using the F10.7 from NOAA.The std is about 40% lower (127 km compared to 211 km).As for the extreme values, CLS predictions (F30 and F10.7) and NOAA predictions perform equivalently after 7 days.But after 30 days, like the stds, the error is 40% lower with the F30 indices compared to the F10.7 from NOAA.
On the period during descending solar activity, the std of the along track error is slightly better with the F30 indices, whatever the horizon.The extreme negative (labeled "|min| 1%" in Fig. 17) error values with F30 are also better by 15% than with F10.7 from NOAA.As for the extreme positive (labeled "max 1%") values, they are all much bigger than the negative ones, indicating that the solar flux predictions are generally underestimated during the descending phase of the solar cycle.The maximum error at 7 days using F30 is especially quite high (15% above the error with F10.7 from NOAA).After 30 days, on the contrary, the F30 performs more than 20% better than the F10.7 from NOAA.
On the ascending and the low periods of the solar activity, the performances of the three configurations are quite similar.However it should be noticed a small advantage to the F30 indices, as its resulting along-track error is systematically equal or slightly better than the error with F10.7 from NOAA.

Conclusions
A prototype service delivering nowcasts and forecasts of radio solar fluxes at five wavelengths, from 3.2 cm (F3.2) to 30 cm (F30), has been operational at CLS since May 2016 and accessible at ftp://ftpsedr.cls.fr/pub/previsol/solarflux.The archive starts on November 1st, 1951, and new values are added on a daily basis, at 04:00 UT.The processing consists first in filling the gaps and correcting the spurious data based on an expectation-maximization method.Next, we forecast the five fluxes up to 30 days ahead by using an analogue neural network.The relative performance for F30, in terms of std of the error, can be as much as 25% better after 7 days than that of the present F10.7 forecast by NOAA, with a marked improvement at high solar activity.When applied to the F10.7, our forecast method is also better (still in terms of std) whatever the level of the solar activity.It reaches a 15% improvement rate during high solar activity.When applied to orbit error estimation with DTM-2013 thermospheric model, our F30 predictions result in lower error than the F10.7 from NOAA.The improvement is noteworthy during periods of high solar activity and for the longest prediction horizons: the difference in along-track error reaches 40% after 30 days during solar cycle 23 maximum.Future plans include the replacement of the empirical forecast by a physical one that relies on magnetic flux transport at the photospheric level.

Acknowledgments.
This work was supported by the French Space Agency (CNES) under Contract RTD-0167-CN.We greatly acknowledge the Nobeyama Radio Observatory (Nobeyama, Japan) and the Dominion Radio Astrophysical observatory (Penticton, Canada) for making their high quality solar observations publicly available.The Nobeyama Radio Polarimeters are operated by the Nobeyama Radio Observatory, a branch of the National Astronomical Observatory of Japan.Finally, we thank the anonymous referees for helping improve the manuscript.The editor also thanks anonymous referees for their assistance in evaluating this paper.This appendix describes the infrastructure of the service, including the data flow, the data organization on the server and the content of the files.
The data are stored on the following CLS ftp server: ftp:// ftpsedr.cls.fr/pub/previsol/solarflux.They are also accessible on https://spaceweather.cls.fr/services/radioflux/ by following the links at the bottom of the web page.The main steps of the data processing, as described in Figure 18, are: -flux acquisition from Nobeyama and Penticton; interpolation in case of data gap; -flux correction in case of eruptive effect; computation of predicted fluxes (and associated error) with neural network; -file formatting of the observed and predicted data; deposit of the files on CLS FTP server.
On CLS ftp server, the files are organized as described in Figure 19.The files are split into two main categories: the files in the "observation" folder contain only the observed data (whether they are final or provisional), and the ones in the "forecast" folder contain the predicted solar flux on 30 days, preceded by 81 days of observations (whether final or provisional).
The data are given both at 1 Astronomical Unit (TYPE = "adjusted"), and at the real Sun-Earth distance (TYPE = "absolute").The absolute flux is obtained from the adjusted flux with the following formula: where R 0 is the distance corresponding to 1 Astronomical Unit; R is the Sun-Earth distance, given by Zdunkowski et al. (2007): R ¼ 1:00011 þ 0:034221 cosðuÞ þ 0:00128 sinðuÞ þ 0:000719 cosð2uÞ þ 0:000077 sinð2uÞAU where: d being the day in the year.
The content of each file is described hereafter.They all have the same structure: 25 header lines (slightly different for the observation and the forecast), followed by the data with one line per day and organized in 24 rows as follow: 4 rows for the date: year; month; day; -CNES Julian day (days since 01/01/1950 at 0hUT).
4 rows for each of the five wavelengths (30, 15, 10.7, 8 and 3.2 cm): observed flux from the radio-astronomical stations (including interpolated values in case of data gaps) for the past days, and "À1" for the future days; observed flux (including interpolated values) corrected for flare effect for the past days, and extrapolated flux computed by the neural network algorithm for the future days; error on the flux value: precision for the observed flux (as computed in) or RMS for extrapolated flux; -flag indicating the processing applied (see header for detail).In the "observation" files the final data from 01/11/ 1951 (start of the Japanese measurements) to the end of month M-3 or M-2, M being the current month, and the provisional data range from the beginning of month M-2 or M-1 to the eve of the current day.In the "forecast" files, the number of lines is fixed: there are 81 days (3 solar rotations) of observed values and 30 days of forecast values.An example of a forecast file is given in Figure 20.

Fig. 3 .
Fig. 3. Two examples of detected outliers: two consecutive days in the 30 cm radio flux (upper plot) and one single spike in the 10.7 cm radio flux (lower plot).

Fig. 4 .
Fig. 4. Illustration of the gap filling at three wavelengths.The long interruption in the 30 cm flux corresponds to the relocation from Toyokawa to Nobeyama.

Fig. 5 .
Fig. 5.The F30 proxy with the training dataset (blue) and the validation dataset (red).

Fig. 8 .
Fig. 8. Performances of the multi-wavelength neural network model as a function of the hidden layer size.

Fig. 9 .
Fig. 9. Performance of the multi-wavelength neural network model as a function of number of iterations.

Fig. 10 .
Fig. 10.Model RMS as a function of the moving average flux on 81 days, and its linear regression.

Fig. 11 .
Fig. 11.Performance per year of the optimized models at a 4 days forecast horizon.

Fig. 12 .
Fig. 12. Mean error (model-observation) as a function of the forecast horizon.

Fig. 13 .
Fig. 13.Mean error per year at a 4 days forecast horizon.

Fig. 14 .
Fig. 14.Mean and þ/À 1 standard deviation of error for scaled F30 (blue) and NOAA F10.7 forecast (red), during 4 periods of different solar activity (from left to right and top to bottom: high, decreasing, low, increasing).

Fig. 15 .
Fig. 15.Mean and þ/À 1 standard deviation of error for our F10.7 forecast (green) and NOAA F10.7 forecast (red), during 4 periods of different solar activity (from left to right and top to bottom: high, decreasing, low, increasing).

Fig. 16 .
Fig. 16.Example of along track error at high activity period on 30 days of extrapolation (left fig.) and distribution function of the along track error after 30 days (right).

Fig. 17 .
Fig. 17.Statistics on along-track error in km after 7 days (upper figs.) and 30 days (lower figs.) of extrapolation, for four different level of of solar activity.

Fig. 20 .
Fig. 20.Example of a radio flux file formatted by CLS.

Table 1 .
Main characteristics of the five radio fluxes that are considered in this study.

Table 2 .
Expression of the precision s e as a function of the radio flux y, in sfu.

Table 3 .
Optimization of the lags as input of the recursive neural network model.

Table 4 .
Optimization of the wavelengths as input of the multiwavelength neural network model.

Table 5 .
Periods of various levels solar activity used for comparison to NOAA F10.7 forecast.