A new method for forecasting the solar cycle descent time

The prediction of an extended solar minimum is extremely important because of the severity of its impact on near-earth space. Here, we present a new method for predicting the descent time of the forthcoming solar cycle (SC); the method is based on the estimation of the Shannon entropy. We use the daily and monthly smoothed international sunspot numbers. For each nth SC, we compute the parameter $[T_{pre}]_n$ by using information on the descent and ascent times of the (n − 3)th and nth SCs, respectively. We find that $[T_{pre}]_n$ and the entropy $[E]_n$ of the nth SC can be effectively used to predict the descent time of the (n + 2)th SC. The correlation coefficient between $[T_d]_{n+2} - [T_{pre}]_n$ and $[E]_n$ is found to be 0.95. The prediction model is developed using these parameters. Solar magnetic field and F10.7 flux data are available for SCs 21–22 and 19–23, respectively, and they are also used to estimate the Shannon entropy. It is found that the Shannon entropy, a measure of the randomness inherent in the SC, is reflected well in the various proxies of solar activity (viz. sunspot number, magnetic field, F10.7 flux). The applicability and accuracy of the prediction model equation are verified through the association of the lowest entropy values with the Dalton minimum. The prediction model equation also provides possible criteria for the occurrence of unusually long solar minima.


Introduction
Forecasting the solar cycle (SC) characteristics is an important aspect of space weather studies. There are several statistical and mathematical models that can be used for the prediction of SC characteristics. These models utilize either precursor or extrapolation methods (Ohl 1966; Feynman 1982; Wilson 1990; Thompson 1993; Hathaway et al. 1994; Wilson et al. 1998; Kane 1999; Solanki et al. 2002; Hathaway & Wilson 2006; Kane 2007; Podladchikova & Van der Linden 2011). While some methods provide predictions well in advance or close to the time of initiation of an upcoming SC, a few other methods offer step-by-step predictions as an SC advances. It has been shown that precursor methods perform better than other prediction methods (Li et al. 2001; Brajsa et al. 2009). Some of the aforementioned methods are based on a physical approach (Dikpati & Charbonneau 1999; Schatten 2005; Dikpati et al. 2006; Dikpati & Gilman 2008; Svalgaard et al. 2005) rather than a strictly numerical approach. Pesnell (2008) examined over 50 forecasting methods and compared the amplitudes predicted by these methods for SC 24. A careful scrutiny of available models suggests that the focus of most studies has been on forecasting the peak amplitude and ascent time of SCs, and little attention has been paid to the prediction of the descent time of an upcoming SC.
Recent predictions of the minimum of SC 24 failed, and the minimum was delayed by nearly two years. Many interesting observations of the sun and near-earth environment have been reported during this solar minimum (McComas et al. 2008; De Toma et al. 2010; Echer et al. 2012; Solomon et al. 2013; Fröhlich 2013; Hajra et al. 2014). Haigh et al. (2010) found that during the declining phase of solar cycle 23, there was a four to six times larger decline in ultraviolet emissions. Past studies have reported significant changes in the earth's atmosphere during the extended solar minimum, from 2007 to 2009 (Emmert et al. 2010; Ermolli et al. 2012). Hathaway & Upton (2014) showed that meridional flow variations contributed to the weak polar fields at the end of SC 23, leading to the extraordinary SC 23/24 minimum. Another important feature of solar activity is the occurrence of a prolonged/grand minimum. Choudhuri & Karak (2012) showed that 1-4% of SCs may have conditions suitable for inducing a grand minimum. Climatologists believe that the frequent occurrence of prolonged low solar activity periods may result in significant cooling of the earth's atmosphere. The peculiar extended minimum of SC 24 has raised questions on future solar activity. An obvious question that arose was, "Are we approaching a Maunder minimum or Dalton minimum?" (Miyahara et al. 2010; Jager & Duhau 2012). The effect of the extended low solar activity period on the near-earth environment is indeed a cause of concern. Therefore, predictions of both the length and descent time of SCs are of interest.
Presently, there are no methods to predict the descent time of an upcoming SC. In this study, an empirical model is developed to predict the descent time of a forthcoming SC; the model is based on the estimation of the Shannon entropy. This paper is structured as follows. The data used and the Shannon entropy computation are described in Section 2. The development of the model is discussed in Section 3, and the results are presented in Section 4. Possible clues to the occurrence of a grand minimum are presented in Section 5, and implications of the present work are elaborated in Section 6.

Data used and Shannon entropy estimation
We use the daily and monthly smoothed international sunspot numbers (available at http://ngdc.noaa.gov and http://sidc.oma.be/sunspot-data), denoted by $S$ and $S_{ms}$, respectively. As is well known, each SC is characterized by a solar maximum $S_{max}$, a solar minimum $S_{min}$, an ascent time $T_a$, a descent time $T_d$, and a length $T_{cy}$. The occurrence times of the solar minimum and solar maximum are taken as the start time $[t_s]_n$ and peak time $[t_p]_n$ of each nth SC. The end time of the nth SC is the start time of the (n + 1)th SC, that is, $[t_e]_n = [t_s]_{n+1}$. The length of the SC is obtained from the relation $[T_{cy}]_n = [t_e]_n - [t_s]_n$. It should be noted that the SC characteristics depend on the method adopted to determine the solar maximum/minimum and their occurrence times. These SC characteristics are provided by the NGDC-NOAA (http://www.ngdc.noaa.gov/nndc/struts/results?t=102827&s=1&d=8,4,9), and they are commonly used by the scientific community for SC studies. In the NGDC-NOAA method, the minimum of an SC is determined by considering the number of spotless days and the frequency of occurrence of old and new cycle spot groups along with the mathematical minima in the monthly smoothed sunspot number. Kakad (2011) compared SC characteristics from NGDC-NOAA with those obtained using a mathematical minimum and found the SC characteristics obtained by both methods to be in good agreement, with the deviation being very small. Here, the SC characteristics are estimated using mathematical minima and maxima in the monthly smoothed sunspot number. If a minimum (maximum) value is encountered more than once, then we choose the first instance as the start (peak) time. The SC characteristics obtained are tabulated in Table 1. We also use the solar magnetic field ($B_0$) and F10.7 flux data from http://spidr.ngdc.noaa.gov; these data are available only for the last two and last five SCs, respectively.
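The bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name and the epochs are hypothetical (they are not the Table 1 values), and it assumes the usual definitions T_a = t_p − t_s and T_d = t_e − t_p together with the stated relation [t_e]_n = [t_s]_{n+1}.

```python
def cycle_characteristics(t_start, t_peak):
    """Derive ascent time, descent time and length for consecutive cycles.

    t_start: start (minimum) epochs in years, one more entry than t_peak,
             because the end of cycle n is the start of cycle n + 1.
    t_peak:  peak (maximum) epochs in years, one per complete cycle.
    """
    if len(t_start) != len(t_peak) + 1:
        raise ValueError("need one more start epoch than peak epochs")
    cycles = []
    for i, tp in enumerate(t_peak):
        ts, te = t_start[i], t_start[i + 1]
        cycles.append({
            "T_a": tp - ts,   # ascent time: minimum to maximum
            "T_d": te - tp,   # descent time: maximum to next minimum
            "T_cy": te - ts,  # cycle length
        })
    return cycles


# Hypothetical epochs (decimal years), for illustration only.
example = cycle_characteristics([1986.7, 1996.4, 2008.9], [1989.5, 2000.3])
```

By construction T_a + T_d = T_cy for every cycle, which is a convenient sanity check on any table of cycle characteristics.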
The Shannon entropy has its origin in information theory (Shannon 1948), and it is a measure of the uncertainty associated with a random variable. In recent times, it has been widely used to understand various phenomena linked with space weather, climate, and earth-related studies (Materassi et al. 2007; Bapanayya et al. 2011; De Michelis et al. 2011). As the first step, we compute the Shannon entropy for each SC. For this computation, it is necessary to obtain the variations in the daily sunspot number, which are denoted by $\Delta S$. A time series related to any natural phenomenon such as sunspots is non-stationary and needs to be transformed into a suitable form for statistical analysis. This is accomplished by applying a moving average filter to the time series (Carbone et al. 2004), which is akin to detrending the time series in order to extract statistically meaningful information. In particular, adequate caution should be exercised to avoid both under- and oversmoothing of the data (Das Sharma et al. 2012). Several moving average window sizes ($w_s$) are examined. In order to impart stationarity to the SC time series, the moving average is removed from the original time series, giving the following stochastic sequence ($\Delta S$):

$$\Delta S(t) = S(t) - \langle S(t) \rangle_{w_s} \qquad (1)$$

where $\langle S(t) \rangle_{w_s}$ is the centered moving average of $S$ over a window of $w_s$ days. This stochastic sequence is used for statistical analysis. In the present analysis, we applied centered moving average windows of various sizes ($w_s$ = 3, 5, ..., 15) to the daily sunspot number data ($S$). The original and smoothed sunspot number data for $w_s$ = 3, 9, and 15 are shown in Figures 1a-1c, respectively, as an example. It can be observed that when $w_s$ = 15, the moving average tends to oversmooth the original data, whereas for $w_s$ = 3, the smoothed signal nearly reproduces the original signal, so that $\Delta S$ represents only the high-frequency variations of the data. The optimal range of $w_s$, for which adequate information on the sunspot variations is contained in the new series ($\Delta S$), is found to be 9-13.
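The detrending step can be illustrated with a short NumPy sketch. It assumes that the stochastic sequence is simply the daily value minus its centered moving average over an odd window of w_s days, and it drops the edge samples where the centered window is incomplete; the function name and the edge handling are choices made for this illustration and are not specified in the text.

```python
import numpy as np


def detrend_moving_average(s, ws):
    """Return the series minus its centered moving average (window ws, odd).

    The first and last ws // 2 samples are dropped because a full centered
    window is not available there.
    """
    if ws < 1 or ws % 2 == 0:
        raise ValueError("ws must be a positive odd integer")
    s = np.asarray(s, dtype=float)
    if s.size < ws:
        raise ValueError("series shorter than the window")
    kernel = np.ones(ws) / ws
    smooth = np.convolve(s, kernel, mode="valid")  # length: len(s) - ws + 1
    half = ws // 2
    return s[half:s.size - half] - smooth


# A purely linear trend is removed exactly by a centered moving average.
residual = detrend_moving_average(np.arange(20.0), ws=9)
```

A small window leaves only the fastest fluctuations in the residual, while a large window lets slower variations through, which is the under- and oversmoothing trade-off discussed above.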
These new series can now be used to compute the Shannon entropy. In Figure 2a, we plot the daily (red) and monthly smoothed (black) sunspot number data for SCs 10-23. The vertical dashed lines indicate the start times ($t_s$) of SCs 10-24. In the SCs prior to SC 10, several gaps exist in the daily sunspot data, and hence, those cycles are excluded from the computation of the entropy. As an illustration, in Figure 2b, we show the change in the daily sunspot number centered on the nine-day mean. Figures 2c and 2d show the variation of the mean solar magnetic field and F10.7 flux for $w_s$ = 9 for the preceding two and five SCs, respectively. For each SC, we estimate the Shannon entropy by treating $\Delta S$ as a random variable. The Shannon entropy is given by $E = -\sum_{l=1}^{m} p(x_l) \log_2[p(x_l)]$, where $x$ is a random variable with $m$ possible outcomes and $p(x_l)$ is the probability of $x_l$. The computation of the entropy requires information on the probability distribution of the random variable, $p(x_l)$. Here, we obtain the entropy from the probability density function computed from histograms (Wallis 2006), as given by Eq. (2):

$$E = -\sum_{k=1}^{N} p_k \log_2\left(\frac{p_k}{w_k}\right) \qquad (2)$$

where $p_k$ is the probability and $w_k$ is the width of the kth bin of the histogram. The parameter $N$ represents the total number of bins in the histogram. The estimated probability density function (PDF) is such that $\sum_{k=1}^{N} p_k = 1$. It may be noted that the shape of the probability density function obtained from histograms is sensitive to the choice of the bin size. In order to get appropriate estimates of the PDF associated with $\Delta S$, we determine the bin width by using two methods: (i) Scott's method and (ii) Knuth's method. In Scott's method, the bin width $w_k$ is given by $3.49\,\sigma/m^{1/3}$, where $\sigma$ and $m$ indicate the standard deviation of $\Delta S$ and the number of random observations, respectively (Scott 1979).
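A minimal sketch of the histogram-based entropy estimate with Scott's bin width is given below. It assumes the Wallis (2006) estimator in the form E = −Σ p_k log2(p_k / w_k) and equal-width bins spanning the data range; the function names are hypothetical, and the rounding of the number of bins is a choice made for this illustration.

```python
import numpy as np


def scott_bin_width(x):
    """Scott's rule: 3.49 * sigma / m**(1/3) for m observations."""
    x = np.asarray(x, dtype=float)
    return 3.49 * x.std(ddof=1) / x.size ** (1.0 / 3.0)


def histogram_entropy(x, width):
    """Entropy (bits) from a histogram: -sum p_k * log2(p_k / w_k)."""
    x = np.asarray(x, dtype=float)
    # Number of equal-width bins needed to cover the data range.
    n_bins = max(1, int(np.ceil((x.max() - x.min()) / width)))
    counts, edges = np.histogram(x, bins=n_bins)
    p = counts / counts.sum()   # bin probabilities, sum to 1
    w = np.diff(edges)          # actual bin widths
    nz = p > 0                  # empty bins contribute nothing
    return float(-np.sum(p[nz] * np.log2(p[nz] / w[nz])))


# Check on synthetic data: a uniform variable on [0, 8) has a
# differential entropy of log2(8) = 3 bits.
rng = np.random.default_rng(0)
sample = rng.uniform(0.0, 8.0, size=20000)
entropy_bits = histogram_entropy(sample, scott_bin_width(sample))
```

Dividing p_k by w_k makes the estimate far less dependent on the bin width than the plain discrete entropy, which is why the choice between binning rules matters less than it might first appear.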
In the method presented by Knuth (2013), the number of bins associated with the maximum posterior probability is considered as the optimum number of bins ($bin_{opt}$), and the width of the histogram bins is taken as $w_k = (\Delta S|_{max} - \Delta S|_{min})/bin_{opt}$. Figure 3 shows the variation of $\Delta S$ for $w_s$ = 9 and the corresponding estimated PDF based on the histogram technique for both Scott's and Knuth's optimum bin widths for SCs 21 (left panel) and 22 (right panel). The estimates of the entropy and bin width are given in the corresponding subplots. By applying both these binning methods and Eq. (2), we compute the entropy for SCs 10-23 by utilizing $\Delta S$ obtained for $w_s$ = 9, 11, and 13. As an example, the entropy $[E]_n$ and bin width $w_k$ obtained by applying Scott's and Knuth's methods to $\Delta S$ generated using $w_s$ = 9 are provided in Table 1. It is to be noted that the estimates of the entropy obtained by using the optimum bin widths are the same for both Scott's and Knuth's methods. Here, we use the entropy computed using Scott's method.
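For readers who want to experiment with Knuth's rule, the sketch below evaluates the logarithm of the posterior probability for M equal-width bins, up to an additive constant, and picks the M that maximizes it. The expression used is the commonly quoted form of Knuth's result; the function names, the brute-force scan, and the upper limit on M are assumptions of this illustration, not details given in the text.

```python
import math

import numpy as np


def knuth_log_posterior(x, m):
    """Log posterior (up to a constant) for m equal-width bins."""
    x = np.asarray(x, dtype=float)
    n = x.size
    counts, _ = np.histogram(x, bins=m)
    return (
        n * math.log(m)
        + math.lgamma(m / 2.0)
        - m * math.lgamma(0.5)
        - math.lgamma(n + m / 2.0)
        + sum(math.lgamma(c + 0.5) for c in counts)
    )


def knuth_optimal_bins(x, max_bins=100):
    """Scan m = 1..max_bins and return (bin_opt, bin width)."""
    x = np.asarray(x, dtype=float)
    best = max(range(1, max_bins + 1),
               key=lambda m: knuth_log_posterior(x, m))
    return best, (x.max() - x.min()) / best


# Synthetic Gaussian sample, for illustration only.
rng = np.random.default_rng(0)
bin_opt, width = knuth_optimal_bins(rng.normal(size=2000))
```

The scan is quadratic in cost for large max_bins, so a numerical optimizer would be preferable for long series; the brute-force version is kept here because it is easy to verify.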

Development of the model
We treat the ascent $[T_a]$ and descent $[T_d]$ times of the SC as two variables. As a first step, we explored how the entropy of the nth SC is related to the ascent and descent times of past, present, and future SCs (in the range n − 3 to n + 2) by adopting a correlation analysis. The search range n − 3 to n + 2 is found to be optimal because it leaves a sufficient number of data points (N = 12) for a meaningful statistical evaluation. The correlation coefficients between the entropy of the nth SC and $[T_a]$, $[T_d]$ of SCs n − 3 to n + 2 (12 parameters in total) are shown in Figure 4a for $w_s$ = 9, 11, and 13. Good correlation coefficients emerge between $[E]_n$ and (i) $[T_d]_{n-3}$, (ii) $[T_a]_n$, and (iii) $[T_d]_{n+2}$, and this result is consistent for $w_s$ = 9, 11, and 13. These three parameters are marked by black dotted circles in Figure 4a. These correlation coefficients are statistically significant (confidence limit ≥ 85%) and do not vary considerably. The combination $[T_d]_{n+2} - [T_d]_{n-3} - [T_a]_n$ is used, and its correlation with $[E]_n$ is computed for n = 10-21. It is found that this particular combination yields correlation coefficients of 0.95, 0.93, and 0.90 for $w_s$ = 9, 11, and 13, respectively. However, we realize that the high value of the formal correlation coefficient (≥0.9) is not sufficient to justify the uniqueness of this particular combination. We therefore developed the following test model to check the robustness of this particular combination. We utilize the ascent and descent times of SCs n − 3 to n + 2 in the test model. Therefore, apart from the entropy $[E]_n$, we have 12 other variables, namely $[T_a]_j$ and $[T_d]_j$, where j = n − 3 to n + 2. Hence, it is reasonable to assume that the entropy of the nth SC depends on these 12 variables, that is, on $T_a$ and $T_d$ of past, present, and future SCs.
Thus, the prediction problem is now reduced to the identification of the optimal combination (from these 12 variables) that best correlates with the entropy of the nth SC. We define the following test parameter $[T_{test}]_n$ based on these 12 variables:

$$[T_{test}]_n = \sum_{j=n-3}^{n+2} \left( \pm W_{a,j}\,[T_a]_j \pm W_{d,j}\,[T_d]_j \right) \qquad (3)$$

where $W$ represents the weight on each variable and has a value of either 0 or 1 (i.e. W = [0 1]). The ± sign in the equation ensures that all possible combinations are considered; each variable can thus enter with a coefficient of −1, 0, or +1, which gives $3^{12}$ = 531,441 combinations. The correlation of each combination with $[E]_n$ is computed, and the results are shown for $w_s$ = 9. It is found that only 6 out of the possible 531,441 combinations have correlation coefficients r greater than 0.95. However, it is pertinent to note that the number of variables ($nv_i$) that contribute to these six best correlated combinations $C_i$ (i = 1-6) can vary. In Figure 4b, for the six correlated combinations $C_i$ (i = 1-6), the number of variables $nv_i$ is plotted as a function of the correlation coefficient ($r_i$). It can be seen that all the combinations yield correlation coefficients in the narrow range of 0.95-0.96. It can also be seen that combination $C_1$ is associated with contributions from the least number of variables (three), while the remaining combinations ($C_2$ to $C_6$) are associated with five or more variables. In such situations, the simplest of the models is preferred. In the present case, as $C_1$ can be modeled using only three variables as opposed to five or more variables for the other combinations, it qualifies as the simplest combination to derive the entropy of the nth SC and is given as follows:

$$C_1 = [T_d]_{n+2} - [T_d]_{n-3} - [T_a]_n \qquad (4)$$

The above test model reveals that the initially formulated simple combination, which was based only on the correlation coefficient, turns out to be the combination with the least number of parameters and passes the test for the best prediction equation.
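The exhaustive search can be sketched as follows, assuming that each variable enters the test parameter with a coefficient of −1, 0, or +1 (which reproduces the count 3^12 = 531,441 for 12 variables). The function name, the data layout, and the ordering of the output are assumptions of this illustration; a plain Python loop over all 531,441 combinations is slow, so the example uses a small deterministic toy data set.

```python
import itertools

import numpy as np


def best_combinations(X, e, r_min):
    """Enumerate signed 0/1 combinations of the columns of X.

    X: array (n_obs, n_vars), one column per candidate variable.
    e: array (n_obs,), the entropy of each cycle.
    Returns (r, number of variables used, weights) for every combination
    with correlation r > r_min, simplest combinations first.
    """
    X = np.asarray(X, dtype=float)
    e = np.asarray(e, dtype=float)
    found = []
    for weights in itertools.product((-1, 0, 1), repeat=X.shape[1]):
        w = np.array(weights)
        if not w.any():
            continue  # skip the empty combination
        t_test = X @ w
        if np.allclose(t_test, t_test[0]):
            continue  # constant series: correlation undefined
        r = np.corrcoef(t_test, e)[0, 1]
        if r > r_min:
            found.append((float(r), int(np.count_nonzero(w)), weights))
    return sorted(found, key=lambda item: (item[1], -item[0]))


# Toy data: three mutually orthogonal variables; "entropy" = a - c.
a = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
c = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
hits = best_combinations(np.column_stack([a, b, c]), a - c, r_min=0.95)
```

Sorting by the number of contributing variables first mirrors the preference for the simplest model among combinations with nearly equal correlation.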
We performed an additional test to confirm that the combination argued above to be the best has not emerged by chance. We generated a number of artificial random series of the same length as that of the original SC time series. Treating these artificial data as real data ($\Delta S$), we carried out the same analysis, starting from Eq. (2), and found that none of the combinations obtained using the artificially generated random series produced a correlation coefficient of 0.9 or above when the number of parameters was less than or equal to three, which is argued to be optimal in Figure 4b. It is pertinent to note that the results obtained from the artificial time series lie clearly away from the region of optimality. The additional test therefore reduces to below 1/9 the probability that the best combination for the SC time series has emerged by chance. It is found that the entropy of the nth SC can be determined efficiently using three parameters, namely the descent time of the (n − 3)th SC, the ascent time of the nth SC, and the descent time of the (n + 2)th SC. The contribution of the preceding, present, and future SCs to the entropy of the present SC is illustrated in Figure 5a. It is interesting to note that the variables contributing to the entropy of the nth SC are separated by 22 years, a period close to the Hale magnetic SC. In the new parameter $[T_{pre}]_n$, we therefore combine information on the past ($[T_d]_{n-3}$) and present ($[T_a]_n$) for each nth SC as follows:

$$[T_{pre}]_n = [T_d]_{n-3} + [T_a]_n \qquad (5)$$

The values of $T_{pre}$ are presented in Table 1 and fall approximately in the range of 8-13 years. Therefore, we use $[E]_n$, $[T_{pre}]_n$, and $[T_d]_{n+2}$ in the prediction model.
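As a small illustration of how the precursor parameter is assembled, the sketch below assumes that [T_pre]_n is the sum of the descent time of cycle n − 3 and the ascent time of cycle n, an inference consistent with the quoted range of 8-13 years. The function name and the numerical values are hypothetical and are not taken from Table 1.

```python
def t_pre(t_a, t_d, n):
    """Precursor parameter for cycle n (years).

    t_a, t_d: dicts mapping cycle number -> ascent / descent time in years.
    Assumes [T_pre]_n = [T_d]_(n-3) + [T_a]_n.
    """
    if n not in t_a or (n - 3) not in t_d:
        raise KeyError("need T_a for cycle n and T_d for cycle n - 3")
    return t_d[n - 3] + t_a[n]


# Hypothetical values for illustration only.
ascent = {22: 3.0, 23: 4.0}
descent = {19: 6.5, 20: 7.0}
value_22 = t_pre(ascent, descent, 22)  # 6.5 + 3.0
```

Because the parameter reaches back three cycles, it can only be formed from the fourth cycle of any record onward.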

Results and discussion
We find that the parameter $[T_{pre}]_n$ estimated for each nth SC and the Shannon entropy $[E]_n$ can be used to determine the descent time of the (n + 2)th SC. Figure 5b shows the parameter $[T_d]_{n+2} - [T_{pre}]_n$ as a function of $[E]_n$ for n in the range 10-21. On the basis of Eq. (5), the ordinate can be viewed as a modified ascent time of the nth SC. It is evident from Figure 5b that these two parameters correlate well, yielding a high correlation coefficient of r = 0.95. The obtained correlation coefficient has a confidence limit of more than 99%, and thus, these parameters can be effectively used in the prediction of the descent time of forthcoming SCs. The following equation shows the strong linear relationship obtained from the least squares fit of the parameters discussed above.
$$[T_d]_{n+2} - [T_{pre}]_n = 8.1946 \times [E]_n - 48.6 \qquad (6)$$

As an illustration, results related to SCs 24 and 25 are presented. Data for SCs 22 and 23, namely $[T_{pre}]_{22}$ = 9.4168, $[T_{pre}]_{23}$ = 11.25, $[E]_{22}$ = 5.6159, and $[E]_{23}$ = 5.2617, enable us to predict the descent times for SCs 24 and 25 as $[T_d]_{24}$ = 6.84 ± 0.09 years and $[T_d]_{25}$ = 5.77 ± 0.21 years, respectively. These predictions are presented in Table 1. The standard errors of the slope and intercept in the prediction equation are utilized to calculate the error in the prediction of the descent times for SCs 24 and 25. As an exercise, we computed the value of $T_d$ using Eq. (6) for n = 10-21, that is, for SCs 12-23. The absolute difference between the predicted and the observed values of the descent time for each SC (i.e. $\delta T_d$) is presented in the last column of Table 1. The standard error between the observed and predicted values of the descent time for SCs 12-23 is found to be 0.4 years, indicating that the proposed model can be used to predict the descent times of future SCs with good accuracy.
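The arithmetic behind the SC 24 and SC 25 forecasts can be reproduced with a few lines of Python. This sketch only rearranges the fitted relation so that the descent time of cycle n + 2 is on the left-hand side, using the slope 8.1946, the offset 48.6, and the input values quoted above; the function name is hypothetical, and the error bars are not reproduced.

```python
SLOPE = 8.1946    # years per unit entropy, from the least squares fit
OFFSET = 48.6     # years, from the least squares fit


def predict_descent_time(entropy_n, t_pre_n):
    """Descent time of cycle n + 2 from the entropy and T_pre of cycle n."""
    return SLOPE * entropy_n - OFFSET + t_pre_n


# n = 22 gives the forecast for SC 24; n = 23 gives the forecast for SC 25.
td_24 = predict_descent_time(5.6159, 9.4168)
td_25 = predict_descent_time(5.2617, 11.25)
print(round(td_24, 2), round(td_25, 2))  # 6.84 5.77
```

The two outputs match the central values of the quoted forecasts, which confirms that the precursor parameter enters the relation additively.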
In the present model, the descent time of the forthcoming SC (i.e. n + 2) is determined from parameters derived from previous (i.e. n − 3 and n) SCs, suggesting that SCs have long-term memory (extending to nearly the preceding five SCs). It is often debated whether the solar dynamo possesses long-term or short-term memory or both. Long-term (>1000 years) solar activity proxy data indicate that the occurrences of grand minima and maxima are not uncommon (Usoskin et al. 2007, 2012). It should be noted that the short-term (intracycle) memory may be insufficient to maintain grand minima/maxima. A recent study (Petrovay 2010 & references therein) indicates the presence of long-term memory in SCs apart from short-term memory. Furthermore, persistence analysis yields a Hurst exponent greater than 0.7, which is sufficiently significant to conclude that the solar dynamo indeed has long-term memory (Ruzmaikin et al. 1994; Oliver & Ballester 1996; Kilcik et al. 2009).
For any statistical forecasting model, a large number of observations is necessary to get reliable predictions. For solar activity studies, long-term sunspot number data are readily available, and hence, the scientific community has used them extensively. The use of physical parameters such as the F10.7 flux and solar magnetic field in statistical models should be encouraged. However, both solar flux and solar magnetic field observations are available only for the past few SCs, and the small number of observations makes predictions based on them less reliable. Nevertheless, we have computed the Shannon entropy using F10.7 flux and solar magnetic field observations, which are available for the previous five and two SCs, respectively. Such an exercise is important to understand how the entropy estimated from the sunspot number deviates from that obtained using the F10.7 flux and solar magnetic field. These estimates of the entropy are shown in Figure 6, and it is clear that the entropy determined from physical parameters such as the F10.7 flux and solar magnetic field shows variations similar to those in the entropy obtained from the sunspot number. Thus, it is evident that the Shannon entropy, a measure of the randomness inherent in the SC, is reflected well by the various proxies of solar activity (viz. sunspot number, solar magnetic field, F10.7 flux). The model proposed in the present study is robust and can be used to predict the descent time of future SCs.

Possible clues to the occurrence of a grand minimum
It is important to note that Eq. (6) can also be used to get estimates of the entropy for earlier SCs (n in the range 4-9) since $[T_{pre}]_n$ and $[T_d]_{n+2}$ are available. Figure 6 shows the entropy values for SCs 4-23. The entropy values for SCs 4-9 are obtained from Eq. (6) and are shown as red dots, whereas those estimated from the daily sunspot number for SCs 10-23 are depicted as black dots. Figure 6 [...] can become negative when the entropy of the system decreases to an extent that renders Term 3 ≥ Term 2. Under these conditions, the system becomes mathematically untenable. Such periods may be associated with prolonged solar minima/grand minima.

Conclusions
Here, we propose a model exclusively for the prediction of the descent time of SCs. For SCs 10-23, this model involves the use of daily international sunspot number data. We estimate the Shannon entropy for each nth SC and utilize the estimated entropy values to predict the descent time of the (n + 2)th SC. Equation (6) is a vital output of the present prediction model. The parameter $T_{pre}$ that appears on the left-hand side of the equation is derived using the descent and ascent times of the (n − 3)th and nth SCs, respectively. The average $T_{pre}$ is 11 ± 2 years, which is almost half of the Hale magnetic SC of 22 years (Hale et al. 1919). The time constant on the right-hand side of Eq. (6) (i.e. 48.6 years) is close to half of the Gleissberg cycle period of 80-90 years (Gleissberg 1939; Peristykh & Damon 2003). Our model forecasts the descent times of SCs 24 and 25 as 6.84 ± 0.09 and 5.77 ± 0.21 years, respectively, which are within the range of descent times of earlier SCs. The predicted descent time for SC 24 ($[T_d]_{24}$ = 6.84 years) suggests that this SC will cease close to February 2021. This is in agreement with recent predictions for SC 24, which are available at http://solarscience.msfc.nasa.gov/predict.shtml. An interesting feature revealed by the present model is the coincidence of the lowest values of the entropy with the period of the Dalton minimum. Our model suggests that when the entropy of the nth SC falls below the critical value of $(48.6 - [T_{pre}]_n)/8.1946$, the (n + 2)th SC may enter an extended low solar activity period or a grand minimum. If we assume the average estimate of $[T_{pre}]$ as 11, then the critical value of the entropy $[E]_c$ turns out to be 4.59. Mörner (2013) proposed that the sun may enter an extended minimum during the period 2030-2050. However, the descent times forecast for SCs 24 and 25 by the present model suggest that such extended solar minimum periods are not likely during SCs 24 and 25.
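The critical entropy quoted above follows directly from setting the predicted descent time to zero in the prediction equation. The short sketch below reproduces that arithmetic; the function name is hypothetical, and the constants are the fitted slope and offset quoted in the text.

```python
def critical_entropy(t_pre_n, slope=8.1946, offset=48.6):
    """Entropy below which the predicted descent time of cycle n + 2
    becomes negative: (offset - T_pre) / slope."""
    return (offset - t_pre_n) / slope


e_c = critical_entropy(11.0)
print(round(e_c, 2))  # 4.59

# The entropies quoted for SCs 22 and 23 (5.6159 and 5.2617) lie above
# the critical values computed from their own T_pre (9.4168 and 11.25).
above_22 = 5.6159 > critical_entropy(9.4168)
above_23 = 5.2617 > critical_entropy(11.25)
```

Both comparisons evaluate to true, which is consistent with the statement that an extended minimum is not expected during SCs 24 and 25.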