Issue 
J. Space Weather Space Clim.
Volume 7, 2017



Article Number  A25  
Number of page(s)  19  
DOI  https://doi.org/10.1051/swsc/2017019  
Published online  24 October 2017 
Research Article
Space climate and space weather over the past 400 years: 1. The power input to the magnetosphere
Department of Meteorology, University of Reading,
Earley Gate,
Reading
RG6 6BB, UK
^{*} Corresponding author: m.lockwood@reading.ac.uk
Received:
4
June
2017
Accepted:
16
August
2017
Using information on geomagnetic activity, sunspot numbers and cosmogenic isotopes, supported by historic eclipse images and in conjunction with models, it has been possible to reconstruct annual means of solar wind speed and number density and heliospheric magnetic field (HMF) intensity since 1611, when telescopic observations of sunspots began. These models are developed and tuned using data recorded by nearEarth interplanetary spacecraft and by solar magnetograms over the past 53 years. In this paper, we use these reconstructions to quantify power input into the magnetosphere over the past 400 years. For each year, both the annual mean power input is computed and its distribution in daily means. This is possible because the distribution of daily values divided by the annual mean is shown to maintain the same lognormal form with a constant variance. This study is another important step towards the development of a physicsbased, longterm climatology of space weather conditions.
Key words: magnetosphere / space environment / interplanetary medium / historical records / energy deposition
© M. Lockwood et al., Published by EDP Sciences 2017
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Historical data
In recent years, reconstruction of space climate conditions (in the form of annual means of relevant parameters), has been made possible from five main types of historic observation series relating to some aspect of solar magnetism. These records measure different aspects of space climate and extend back into the past over intervals of different lengths.
1.1 Telescopic sunspot observations
When, where, and by whom the telescope was invented is a matter of debate because it was preceded by a number of simpler optical devices (King, 1955; Van Helden, 1977; Watson, 2004). However, there is little doubt that the realisation of the telescope's application spread from the Dutch town of Middelburg, the site of a glass factory using Italian glassmaking techniques. The invention was claimed by Jacob Metius of Alkmaar and Sacharias Janssen of Middelburg, but all we know for sure is that a spectacles manufacturer in Middelburg, Hans Lippershey, filed for a patent in 1608. Following reports of this patent, scientists around Europe began experimenting with the device in 1609, including Thomas Harriot in England and Paolo Sarpi in Italy. It is probable that it was through his friend and patron, Sarpi, that Galileo Galilei came to hear of the device and his work greatly improved its design. Johannes Kepler in Prague was able to borrow one of Galileo's telescopes from Duke Ernest of Cologne and, in improving it further, he founded the science of optics in 1610.
Nakedeye observations of sunspots had been made since ancient times, many by Chinese and Korean astronomers, but only when atmospheric conditions were favourable (e.g., Willis et al., 1980) and when the political situation meant they were sought by rulers as a source of guidance (e.g., Hayakawa et al., 2015). The first known telescopic observations were by Thomas Herriot towards the end of 1610, as recorded in his notebooks (North, 1974), followed shortly after by those by Christoph Scheiner and Johannes Fabricus in March 1611 (Vaquero & Vázquez, 2009). Fabricus had been a medical student and returned home from University in Leiden with some telescopes which, with his father, he began to use to observe the Sun: he was the first to publish such observations in his 22page pamphlet “De Maculis in Sole Observatis” (“On the Spots Observed in the Sun”) which appeared in the autumn of 1611. Galileo started his scientific study of spots in April 1612 after he received a copy of Scheiner's written report and published his first reports in 1613. Telescopic observations of sunspots have been made ever since and composite records compiled to give a nearcontinuous record over 400 years. Calibration of different observers that contributed to this record is very difficult and composites vary in detail, whilst agreeing on the major features (Clette et al., 2014; Lockwood et al., 2016a; Usoskin, 2017). In particular, “daisychaining” of sequences using intercalibration observers from intervals of simultaneous observations is fraught with difficulties and potential errors (Lockwood et al., 2016b) and techniques that calibrate all observers to a common standard are greatly preferable (Usoskin et al., 2016) but can only be applied to data for after 1750. In recent years there has been an ongoing effort to improve the composites by improving the intercalibrations between different observers and to lower the weighting given to data that are not of sufficient quality (Clette et al., 2014; Usoskin, 2017).
1.2 Geomagnetic observations
Whereas regular observations of sunspots were established in just a few short years after the invention of the telescope, the next major development took over a century to come to fruition. It was made by George Graham in London, Alexander von Humboldt in various locations, Olof Peter Hiorter in Uppsala and Carl Friedrich Gauss in Göttingen. Graham first reported geomagnetic activity in 1724, Hortier noted its regular diurnal variation in 1740 and the connection of large transient events to aurora in 1741 (a magnetic disturbance also seen by Graham in London). From his geomagnetic observations at various locations, von Humboldt introduced the concept of a “geomagnetic storm” and sparked the interest of his friend Gauss in 1828. Gauss subsequently developed the first magnetometer that could reliably measure the field strength and/or its horizontal component, establishing the first geomagnetic observatory in Göttingen in 1832. Reviews of the development of the observation of geomagnetic activity have been given by Stern (2002) and Lockwood (2013). Some composites have used geomagnetic activity data from soon after this date; for example, Svalgaard & Cliver (2010) used regressions with different types of geomagnetic data to extend the sequence back to 1831. However, there are concerns about the calibration, stability and homogeneity of these earliest data and so Lockwood et al. (2014a, b) consider the usable data to start in 1845.
1.3 In situ observations
A key development was the advent of observations of Earth's magnetosphere and nearEarth interplanetary space by spacecraft. The solar wind had been postulated to occur in transient bursts which caused geomagnetic storms by Chapman & Ferraro (1931) but was deduced to be continuous from studies of comet tails by Cuno Hoffmeister in 1943 and Ludwig Biermann in 1951 (see Biermann et al., 1967). Initial in situ indicators of the solar wind were reported from the Lunik 2 and 3 spacecraft in 1959, but the first unambiguous data were recorded in 1961 by Explorer 10. When Mariner 2 flew towards Venus in 1962 it found the solar wind to be continuous and also showed fast and slow streams that repeated with the rotation of the Sun. Parker (1958) predicted that this flow should bring with it a magnetic field of solar origin and this “interplanetary magnetic field” (IMF) was also detected by Explorer 10 and Mariner 2. The Explorer 10 data were initially a puzzle, being complicated by an orbit that meant the spacecraft continuously moved between the magnetosphere and the shocked solar wind of the magnetosheath. The Mariner2 magnetometer suffered from spacecraft magnetic cleanliness issues: hence the first reliable and continuous data on the IMF were obtained only when IMP1 was launched towards the end of 1963 (see review by Parker, 2001). A full survey of the heliospheric field at all heliographic latitudes (the IMF being the heliospheric field in the ecliptic plane) only commenced in 1992 when the Ulysses spacecraft used the gravitational field of Jupiter to move out of the ecliptic plane (Balogh et al., 1992; see review by Owens & Forsyth, 2013).
Observations of the solar wind and the embedded IMF have been made since 1963, and collected into the muchused “OMNI” dataset by the Space Physics Data Facility at NASA/Goddard Space Flight Center (Couzens & King, 1986). Observations are relatively sparse until 1966 and do not become (almost) continuous until the advent of the WIND and ACE spacecraft in 1996 (Finch & Lockwood, 2007).
1.4 Solar magnetograph data
In 1904 George Ellery Hale discovered magnetic fields in sunspots using the Snow telescope and an 18foot focal length solar spectroscope at Mount Wilson Observatory (MWO) from the Zeeman splitting of spectral emission lines, confirmed in 1908 using the 60foot “solar tower” constructed the year before (Hale, 1908). However, fulldisk magnetograph observations did not begin until 1957 after Horace Babcock had developed the principles of the modern magnetograph (Babcock, 1953) and a version of his instrument was installed at the 150foot tower at MWO. Routine full disk measurements of the Sun's photospheric field (magnetograms) began at MWO in 1966, and similar measurements began both at the National Solar Observatory on Kitt Peak (NSO) and Wilcox Solar Observatory (WSO) roughly a solar cycle later in the mid1970s (Stenflo, 2015).
The Potential Field Source Surface (PFSS) method to model the solar corona from magnetograms was developed by Schatten et al. (1969) and Altschuler & Newkirk (1969). In this method, photospheric magnetic fields are mapped to the top of the solar corona on the basis of number of assumptions. This involves solving Laplace's equation within an annular volume above the photosphere in terms of a spherical harmonic expansion, the coefficients of which are derived from Carrington maps of the photospheric magnetic field (i.e., magnetograms assembled over an entire solar rotation from magnetographs on Earth's surface or on board a spacecraft in orbit around the L1 Lagrange point). Two major assumptions are that temporal variations within the 27 days taken to build up the map have no effect and that there are no current sheets in the corona (these are neglected so as to allow unique solutions in closed form). To eliminate the possibility that such simple harmonic expansions would result in all of the magnetic field lines returning to the Sun within a small heliospheric distance, the coronal field was required to become radial at the outer boundary, termed the “source surface”. Essentially PFSS yields the equilibrium (minimum energy) configuration that the corona is heading towards for a given distribution of photospheric field, whereas the state of the real corona at any time is also dependent on its time history. Despite its many assumptions and obvious limitations, PFSS has been very successful in the study of a wide range of solar and heliospheric phenomena (see review by Mackay & Yeates, 2012) and, in the present context, has supplied information about the tilt of the Heliospheric Current Sheet (HCS) (see review by Owens & Forsyth, 2013). PFSS has been applied routinely to WSO magnetograms since 1976.
1.5 Cosmogenic isotopes
In addition to these four sets of observations, there is one other dataset which both extends these data sequences and helps bring them together into a coherent understanding. Cosmogenic isotopes are products of Galactic Cosmic Rays (GCRs). They tell us about the state of the Sun because the GCR flux is modulated by the heliosphere and, in particular, the HMF dragged out of the Sun because it is embedded in the solar wind flow. Three isotopes have been particularly valuable in this context: ^{10}Be (mainly measured in ice sheets), ^{14}C (mainly measured in tree rings) and ^{44}Ti (found in meteorites) (see review by Usoskin, 2017). The ^{10}Be and ^{44}Ti are spallation products (respectively, from impacts on O and N atoms in the atmosphere and Fe and Ni atoms in the meteorite). The variations in the ^{14}C isotope production are much larger because they arise from capture of thermal neutrons that are generated in large numbers by relatively low energy secondary particles. These provide proxy datasets which, unlike the other “asithappened” observations, can be improved, for example, by taking more and/or better measurements of the terrestrial reservoir where the isotopes are deposited, and by improved modelling of the production of the isotopes and of their transport and deposition through the Earth system. These records extend back over thousands of years but their indirect nature means that there is need to use both heliospheric and Earthsystem models to interpret them. However, they can be used to check and confirm variations of reconstructed parameters such as the nearEarth HMF (the IMF, Owens et al., 2016a, b) and the calibrations used in composite data series, such as sunspot number (Asvestari et al., 2017).
1.6 Derived open solar flux
Figure 1 summarises some annual means from these data and compares the durations of the datasets. So that they can be compared, each has been used to generate the first of the spaceclimate parameters to be reconstructed, namely the signed open solar flux (OSF, F_{S}), as discussed in the next section. This is the total flux of one polarity leaving the top of the solar corona. Figure 1a shows the open solar flux from two sources. The red line shows F_{S} derived from PFSS modelling based on a combination of MWO and WSO magnetograph data (Wang et al., 2000). The blue dots joined by the thin blue line give the value derived from the OMNI data from interplanetary spacecraft: these values are derived from the radial component of the IMF in the ecliptic plane which gives information about the HMF at other latitudes (and hence the total open flux) because of the result from the Ulysses observations that the radial component of the heliospheric field is independent of latitude (Smith & Balogh, 1995; Lockwood et al., 2004; Owens et al., 2008). These data have used the solar wind velocity observations to remove the contribution of “excess” or “folded” flux using the kinematic correction developed by Lockwood et al. (2009), but similar answers can be obtained from the IMF radial field alone if averaging is carried out over 1 day, which effectively removes most of the folded flux without averaging out genuine sector structure that maps back to the solar corona. Even and odd numbered solar cycles (from minimum to minimum) are shaded white and grey, respectively.
Figure 1b shows the geomagnetic reconstructions of F_{S} by Lockwood et al. (2014a). The colours show the results from 4 different pairings of geomagnetic indices and the grey area the estimated 1σ uncertainties derived using MonteCarlo fitting techniques where random noise is added to the data and each fit repeated 10 000 times. Figure 1c shows model fits to these reconstructions from geomagnetic data using the continuity equation model formulation of Owens & Lockwood (2012): for this modelling, the rate of OSF emergence is quantified using group sunspot numbers. Results are shown for 4 different composites of sunspot group numbers which are presented in Figure 1e: (blue lines) the group sunspot number R_{G} of Hoyt & Schatten (1998) (with various corrections and additions, detailed by Lockwood et al. (2014c) and without the 12.08 multiplication factor); (red lines) the corrected sunspot number, R_{C} (again divided by the 12.08 factor so that it corresponds to a group number) of Lockwood et al. (2014b, c); (green lines) the “backbone” group number R_{BB} of Svalgaard & Schatten (2016); and (black lines) a new composite R_{14C} by that compares the 11year running means of the sunspot data with radiocarbon abundance data to calibrate observers. This calibration cannot be continued past about 1930 due to the increasing anthropogenic influence on the isotope abundances after this date, it also has to make allowance for the variations in the cosmic ray shielding effect of the geomagnetic field on the production of ^{14}C and its exchange between the atmosphere, oceans and biomass. We here use the decadal means of sunspot numbers derived from ^{14}C abundance by Usoskin et al. (2014). These are the only sunspot number composites that extend back to before the Maunder minimum (shaded in pink in Fig. 1d and e).
Figure 1c shows that we cannot distinguish these sunspot number series using the geomagnetic reconstructions. A similar conclusion was reached by Lockwood et al. (2014c, 2016a). The best fit variations of F_{S} using these four sunspot group number data sequences are almost indistinguishable and all agree very well with the geomagnetic reconstructions, as well as the data from PFSS and the in situ interplanetary spacecraft. The yellow band shows the ±1σ uncertainties computed from the fit residuals with the interplanetary data shown in Figure 1a (data that are repeated in all panels of Figure 1 that present F_{S} data series).
However, if we extend these sunspot and modelled F_{S} data series to before the start of reliable geomagnetic data (1845), and back to the start of regular telescopic monitoring of the Sun (1612), larger differences become apparent, as shown in Figure 1d. There has been a great deal of recent literature about the composite sunspot number series (see reviews by Clette et al., 2014; Lockwood et al., 2016a; Usoskin, 2017). The corrections used in R_{C} are simple and certainly not the last word on the subject; however, the similarity to R_{14C} shows that, for the time being at least, using R_{C} remains a valid option. There is now quite widespread agreement that R_{G} is too small in the interval between the end of the Maunder minimum and the middle of the Dalton minimum (c. 1710–1810) (see reviews by, e.g., Clette et al., 2014 and Lockwood et al., 2016a) and this tendency can be seen in Figure 1. R_{BB} is intended as the replacement for R_{G}; however, there is concern that the daisychaining of calibrations inherent in R_{BB} will compound errors as one goes back in time. In addition R_{BB} assumes not only will the data from two observers vary linearly but that they are proportional, an assumption that other studies have shown to be invalid (Usoskin et al., 2016). Using the Royal Greenwich Observatory photoheliographic data and progressively degraded version versions thereof (degraded my omitting groups of area below a threshold value), Lockwood et al. (2016b) showed that this will cause sunspot numbers to be consistently overestimated as one goes back in time when observer acuity was lower because of instrumental limitations. This effect is consistent with the behaviour of R_{BB} and the OSF modelled from it in Figure 1.
Figure 1d also shows two OSF sequences derived from cosmogenic isotope records. The black line is the OSF modelled using R_{14C}, i.e., the sunspot number calibrated using the 14C radiocarbon data from tree rings around the world using the same model (i.e., that by Owens & Lockwood, 2012) and fitting procedure (minimisation of r.m.s. differences using the Nelder & Mead, 1965, search algorithm) used to derive the variations from R_{C}, R_{G} and R_{BB}. The cyan line is taken from annual values of the nearEarth interplanetary magnetic field, B, derived by McCracken & Beer (2015) from ^{10}Be isotope abundances in the Dye 3 and the North Greenland Icecore Project (NGRIP) ice sheet cores. We here use their data series in which potential major transient Solar Energetic Particle events have been identified from large increases in ^{10}Be in a single year and removed by replacing with a linear interpolation from the surrounding years. We convert this estimate of the IMF B to OSF F_{S} using the polynomial fit to the geomagnetic reconstructions and nearEarth in situ data given by Lockwood et al. (2014a). It should be noted that all data shown in Figure 1d, including the in situ spacecraft data, have been passed through a 131 filter to smooth the great interannual variability in the ^{10}Be data caused by the low numbers of the isotope atoms in the ice cores, often near the 1count limit. As reported by Owens et al. (2016a, b), the reconstructions of B based on geomagnetic and solar observations are very similar over the 1845–present interval; agreement of both of these series with cosmogenicbased reconstructions of B, is clearly present but is less close. Measurements of ^{44}Ti in meteorites that have fallen to Earth since 1766 provide a valuable comparison with average levels of the early sunspot records. This proxy is affected by neither natural nor anthropogenic terrestrial influences, since the isotope is produced in space as a spallation product of Fe and Ni in the meteorite by irradiation by cosmicray protons (>70 MeV) during its path through the solar system to Earth. Because of its long halflife (59 years), ^{44}Ti is insensitive to solar cycles but tells us about average levels on centennial scales. Asvestari et al. (2017) have shown that R_{C} is the composite most consistent with the ^{44}Ti measurements and that R_{BB} is consistently too large in and around the Maunder minimum. Note that, contrary to some recent suggestions, all the sequences in Figure 1d indicate that the Maunder minimum is a deeper and longerlived depression in sunspot activity than the Dalton minimum (c. 1800–1825) (see discussion by Usoskin et al., 2015).
Fig. 1 Variations of annual values of the signed open solar flux F_{S}, from various sources, showing how modern values (from magnetogram data via PFSS modelling and from the OMNI dataset of nearEarth interplanetary observations, panel a) are extrapolated back in time using geomagnetic observations (panel b), models based on sunspot numbers (panels c and d) and cosmogenic isotopes (panel d), Sunspot number sequences used as input to the models are shown in part e. Even/oddnumbered solar cycles are shaded white and grey and the Maunder minimum is shaded pink. (See text for further details). 
2 Reconstructions of space climate
The basic technique facilitating reconstructions of average conditions in nearEarth space has been to take historic observations and interpret them using knowledge obtained from modern data (for example from spacecraft or magnetographs). Initially this was simply using single or multiple regression fits of coincident data, but they have grown increasingly complex and intricate. Table 1 summarises the development of the reconstructions and models of past space climate.
Feynman & Crooker (1978) reconstructed annual means of the solar wind speed, V_{SW}, from the geomagnetic index aa, which extends back to 1868, using the fact that aa, like all “range” geomagnetic indices, has an approximately dependence (see Lockwood, 2013). However on annual timescales, aa also has a dependence on the IMF field strength, B, which contributes considerably to the long term drift in aa. Lockwood et al. (1999) removed the dependence of aa on V_{SW} using its 27day recurrence (which varies with mean V_{SW} on annual timescales) and computed the open solar flux (OSF, the total flux leaving the top of the solar corona) using “the Ulysses result” that the radial component of the heliospheric field is largely independent of heliographic latitude (Smith & Balogh, 1995; Lockwood et al., 2004; Owens et al., 2008). The principles of modelling this variation using the OSF continuity model were laid down by Solanki et al. (2000). They used sunspot number to quantify the OSF production rate, and although that parameterisation has become more sophisticated, something like it remains in use in all models today (Lockwood, 2003; Vieira & Solanki, 2010, Owens & Lockwood, 2012; Rahmanifard et al., 2017). The free parameters in the model were then defined by fitting to the geomagnetic reconstruction of Lockwood et al. (1999). The parameterisation of OSF production rate was improved by Solanki et al. (2002) and Vieira & Solanki (2010) and allowing calculation of photospheric flux which has, in turn, allowed solar irradiance modelling and comparisons with magnetograph data. The model of Owens & Lockwood (2012) developed the parameterisation of the OSF loss rate from a simple linear loss law to one that varies over the solar cycle, as predicted theoretically by Owens et al. (2011). Because they are based on sunspot numbers, these models can make reconstructions that extend back to the start of regular telescopic observations of the Sun (c. 1612).
One point to note is that the continuity models apply to OSF but have also frequently applied to derive reconstructions of the nearEarth IMF (e.g., Rahmanifard et al., 2017) which requires understanding of how OSF and HMF are related: often the assumption is made, either explicitly or implicitly, that the two are linearly related (e.g., Svalgaard & Cliver, 2010). This assumption was indeed made in the analytic equations used in the first reconstruction of OSF by Lockwood et al. (1999) – however, this could be done only because the difference between the real OSFHMF relation and the assumed linear one was then accounted for in the regressions then used to derive OSF from the data. However, in general for a fixed OSF, the nearEarth HMF will decrease with increasing solar wind speed because of the unwinding of the Parker spiral. In addition, as the solar wind velocity increases the longitudinal structure in the solar wind also increases (Lockwood et al., 2009a) which means that the kinematic folding of open field lines is greater which increases the HMF for a given OSF. Hence these two effects are of opposite sense and have the same magnitude for nearEarth HMF of near 6 nT: below this threshold the Parker spiral effect dominates and above it the flux folding effect dominates (Lockwood et al., 2009b). The resulting relationship of OSF to nearEarth HMF has been studied by Lockwood & Owens (2011) and Lockwood et al. (2014a).
Svalgaard & Cliver (2005) noted that different geomagnetic indices have different dependencies on the IMF, B and the solar wind speed, V_{SW}, and therefore could be used in combination to derive both B and V_{SW}. The longterm variations that they derived have been questioned (see Lockwood et al., 2006) because they employed nonrobust regression procedures to also filled large data gaps the IMF and solar wind speed timeseries with interpolated values (a much more reliable option is to mask out the geomagnetic data during data gaps when the interplanetary data are missing, as used by Finch & Lockwood, 2007). However the insight of Svalgaard & Cliver (2005) is very valuable and Rouillard et al. (2007) exploited it to reconstruct both B and V_{SW}, and Lockwood et al. (2014a) used 4 different pairings of indices to derive both with an uncertainty analysis back to 1845. There is always the possibility of calibration drift in the historic geomagnetic data (Svalgaard, 2015; Holappa & Mursula, 2015), some of these have been identified, agreed upon and corrected for (e.g., Lockwood et al., 2014d) whereas others are still debated, and others noted but their effects not properly accounted for. This is an ongoing process which may possibly never arrive at a definitive conclusion (or if it does, it may not be clear that it has) for years in which data are sparse and/or of poor quality. However, there is a growing convergence between the different geomagnetic reconstructions of heliospheric parameters (Lockwood & Owens, 2011) and also with those from cosmogenic isotopes (Asvestari & Usoskin, 2016; Asvestari et al., 2016, 2017; Owens et al., 2016b) which indicates further corrections will be second order in nature, rather than fundamental.
As discussed above, the date to which we can extend back using the geomagnetic data is debatable and depends on an assessment at what date the surviving data can be considered sufficiently homogeneous, continuous and wellcalibrated. Lockwood et al. (2014a) argue that this is 1845 but, for sure, it could not be before 1832 when Gauss introduced the firstproperly calibrated magnetometer. Models based on sunspot number and the OSF continuity equation have been used to extend the sequence back from 1845 to the start of regular telescopic observations in 1612. Lockwood & Owens (2014) extended the OSF modelling to compute the flux in the streamer belt and in coronal holes and so computed the streamer belt width variation which matches well that deduced from historic eclipse images (Owens et al., 2017). The streamer belt width and OSF were used by Owens et al. (2017), along with 30 years of output from a dataconstrained magnetohydrodynamic model of the solar corona based on magnetograph data, to reconstruct solar wind speed, number density and the IMF field strength based primarily on sunspot observations. Using these empirical relations, they produced the first quantitative estimate of global solar wind variations over the last 400 years and these are employed in the present paper.
The CMIP6 project (the 6th Coupled Model Intercomparison Project) has led to the development of what are, to date, the most comprehensive and detailed set of solar forcing reconstructions for studies of global and regional climate and of space weather (Matthes et al., 2016). These extend back to 1850. The current paper is aimed at making reconstructions that extend back to close to the invention of the telescope (c. 1612), and so include the Maunder minimum, and also at including as much of the physics of solar wind generation and solar windmagnetosphere coupling as possible − thereby making the extrapolation of conditions seen during the space age to quieter solar conditions more robust and also providing a better understanding of the implications of reconstructed annual mean values for the occurrence probabilities of space weather events of a given magnitude.
The development of reconstructions of annual means of nearEarth space conditions from observations of geomagnetic activity and from models using sunspot number observations as input
3 The need to use annual means
All the reconstructions discussed in Section 2 are of annual resolution. The fundamental reason for this is that, through the process of magnetic reconnection in the dayside magnetopause, the coupling of solar wind energy, mass and momentum into the magnetosphere depends on the southward component of the IMF (in a frame relative to Earth's magnetic axis, such as Geocentric Solar Magnetospheric, GSM). Some of this variation is predictable, arising from the diurnal and annual variations in the Earth's magnetic dipole axis in a Suncentred frame or from annual variations in the heliographic latitude of Earth (Lockwood et al., 2016c). While these factors are predictable, others are not because much variation in IMF orientation in nearEarth space arises from streamstream interactions in the heliosphere (of which Coronal Mass Ejections and Corotating Interaction Regions are the large end of the spatial spectrum), from Alfvén waves and from the solar wind's turbulent evolution from the solar corona to Earth (Lockwood et al., 2016c). We have no historic measurements from which we can separate these variations in IMF orientation from other variations in the solar wind. Hence the only way they can be dealt with is to average them out (Lockwood, 2013).
To illustrate how this works, we here make use of the IMF orientation factor in solar windmagnetosphere coupling, sin^{4}(θ/2) where the IMF “clock angle”, θ = tan^{−1}(B_{Y}/B_{Z}) and B_{Y} and B_{Z} are the Y and Zcomponents of the IMF in the GSM reference frame. When the IMF component in the YZ plane points northward in the GSM frame, θ = 0 and sin^{4}(θ/2) = 0; when it points southward in the GSM frame θ = π and sin^{4}(θ/2) = 1. In general, the function sin^{4}(θ/2) is somewhat similar in effect to a halfwave rectified function such as B_{S}/B where B_{S} = −B_{Z}[GSM] when B_{Z}[GSM] < 0 and B_{S} = 0 when B_{Z}[GSM] ≥ 0. However it is preferable because, unlike B_{S}/B, it has a continuous gradient as a function of clock angle θ and also because it does not give a sharp cut off in transfer to the magnetosphere when the IMF turns northward (θ falls below π/2). Figure 2 shows distributions of observed sin^{4}(θ/2) for 1996–2016 (inclusive): this interval is used because the IMF data are almost continuous and data gaps are few and short. As a consequence, when we average over subintervals of duration τ, the number of 1minute samples in each subinterval is essentially the same. The distributions are given for various timescales, τ, over which the IMF data are averaged. In each panel, the red dashed line is the overall mean of 1minute sin^{4}(θ/2) values for the whole 21year dataset, 〈sin^{4}(θ/2)〉_{all} = 0.355. Because the numbers of samples in each averaging interval τ is the same, the “mean of the means” (sometimes called the “grand mean”) equals 〈sin^{4}(θ/2)〉_{all}, irrespective of the value of τ employed. Hence the red dashed lines give the mean of each of the distributions shown. The distribution yscales are normalised with N being the number of samples in each sin^{4}(θ/2) bin, the largest value of which for a given τ is N_{max}. Panel (a) is for τ = 5 min. Rounding up of the data to two decimal places artificially increases the occurrence of samples with sin^{4}(θ/2) = 0 and sin^{4}(θ/2) = 1, but only slightly. The distribution shows the full range of sin^{4}(θ/2) factors from 0 to 1 and the peak of the distribution is at sin^{4}(θ/2) = 0 with no peak at or near the mean value 〈sin^{4}(θ/2)〉_{all}. Part (b) is for τ = 1 h, and the averaging smoothes out some of the variations but the full range of sin^{4}(θ/2) factors is still seen and the peak is again at sin^{4}(θ/2) = 0 with no peak at or near 〈sin^{4}(θ/2)〉_{all}. For τ = 6 h (part c), the distribution is broad and asymmetric with a mode value that is increased but is still less than 〈sin^{4}(θ/2)〉_{all}. Only for τ = 1 day is the mode value approximately the same as 〈sin^{4}(θ/2)〉_{all} and the distribution close to symmetric. The standard deviation of the distribution for τ of 1 day is σ = 0.42 × 〈sin^{4}(θ/2)〉_{all}. This means that the error in applying the overall mean 〈sin^{4}(θ/2)〉_{all} to 1day averages would give an error (at the 1σ level) of 42.0%. For τ = 27 days (part e) the distribution is centred on 〈sin^{4}(θ/2)〉_{all} but narrower in width, such that this 1σ error is 10.3%, and for τ = 1 yr (part f) this error is further reduced to 4.9%.
The reduction in the error caused by using the overall average value 〈sin^{4}(θ/2)〉_{all} with increasing τ is the main reason why annual means have been used in the reconstructions. Averaging over a year allows us to use 〈sin^{4}(θ/2)〉_{all}, for each annual value to within a 1σ uncertainty of just 5% and hence the unknown effect introduced by the IMF orientation is averaged to the constant value of 〈sin^{4}(θ/2)〉_{all} to within this small error.
Fig. 2 Distributions of the IMF orientation solar windmagnetosphere coupling function factor, sin^{4}(θ/2), for 1996–2016 (inclusive) for various averaging timescales, τ, where θ = tan^{−1}(B_{Y}/B_{Z}) where B_{Y} and B_{Z} are the Y and Zcomponents of the IMF in the Geocentric Solar Magnetospheric (GSM) reference frame: (a) τ = 5 min; (b) τ = 1 h; (c) τ = 6 h; (d) τ = 1 day; (e) τ = 27 days; (f) τ = 1 yr. N is the number of samples in each sin^{4}(θ/2) bin, the largest value of which for a given τ is N_{max}. In each panel the red dashed line is the overall mean for the whole dataset which, because the averaging intervals of length τ contain equal numbers of samples, equals the mean of each distribution (the “mean of means” or the “grand mean”). The raw B_{Y} and B_{Z} data used are of resolution 5 min and have been rounded up/down to 2 decimal figures. 
4 Power input into the magnetosphere
An equation to compute the power input from the solar wind into the magnetosphere, P_{α}, was derived by Vasyliunas et al. (1982): (1) where L_{o} is the mean radius of the magnetosphere as seen from the Sun, such that the magnetic field presents an area to the solar wind flow and is the dominant energy density in the solar wind, the bulkflow kinetic energy density, so that is the total energy flux incident on the magnetosphere (m_{sw} is the mean ion mass, N_{sw} the number density and V_{sw} the speed). The term t_{r} is a dimensionless transfer function, being the fraction of the incident energy flux that is transferred into the magnetosphere.
Assuming a hemispheric shape to the dayside magnetosphere, pressure balance at the nose of the magnetosphere gives (2) where k_{1} is a geometric factor for a bluntnosed object, M_{E} is Earth's magnetic dipole moment, μ_{o} is the magnetic constant, and P_{sw} is the solar wind dynamic pressure, given by (3)Vasyliunas et al. (1982) adopted the dimensionless transfer function: (4)where α is the “coupling exponent” (a free fit parameter), θ is the clock angle that the IMF makes with the north in the Earth's GSM frame of reference (as discussed above) and M_{A} is the Alfvén Mach number of the solar wind flow, related to m_{sw}, N_{sw}, V_{sw} and the IMF field strength B by (5)
Substituting equations (2)–(5) into (1) yields (6)
The interplanetary parameters B, N_{sw} and V_{sw} have been routinely measured by nearEarth, interplanetary craft. We here adopt a mean solar wind composition of 96% proton and 4% helium ions, giving m_{sw} = 1.12 a.m.u. The constant (k_{1}k_{2}) is here removed by normalising to the mean Pα for a reference period, Po. This also removes any dependence on the assumed m_{sw} − but this assumes that both (k_{1}k_{2}) and m_{sw} remain constant. Note that the assumption of constant m_{sw} is accurate to within a few percent. Consider the two dominant constituents of the thermal solar wind, namely alpha particles and protons: using the nearcontinuous number density data for these ions from near the L1 Lagrange point over 1996–2016, m_{sw} averages 1.102 a.m.u. and the standard deviation of 1hour values is 0.062 a.m.u. and in annual values is 0.023 a.m.u., which gives errors in using the constant value at the 1σ level of 5.63% and 2.09%, respectively. There is a clear but weak solar cycle variation in this m_{sw} estimate, with the largest annual mean being 1.139 a.m.u. (a deviation from the mean of +3.4%) in the year 2000 (sunspot maximum) and the lowest value being 1.050 a.m.u. (a deviation of −4.7%) in the low sunspot minimum year of 2009.
The Earth's dipole moment varies on the centennial timescales of interest to this paper. We use the dipole moment variation provided by the gufm1 model prior to 1900 (Jackson et al., 2000) and by the IGRF11 model after that date (Finlay et al., 2010): some intercalibration and smoothing is needed to prevent a discontinuity at 1900. One major and very important advantage of equation (6) is that there is just the one free fit parameter, α. It is possible, in principle, to include all of the factors in equation (6) with their own weighting factors and exponents and get extremely good fits that are, nevertheless, completely statistically meaningless and have no predictive capability. This pitfall is called “overfitting” and it is a serious but underappreciated problem in multiple regression analysis of geophysical time series that have internal “geophysical” noise. Overfitting refers to the situation when a fit has too many degrees of freedom and starts to approximate the noise in the training subset, which is not robust throughout the whole dataset. This is a recognised pitfall in areas where quasichaotic behaviours give large internal noise such as climate science and population growth (for example, Knutti et al., 2006; Knape & de Valpine, 2011) but is not recognised as widely in space physics where systems tend to be somewhat more deterministic with lower internal variability. The above theory constrains the relationship between the parameters, giving just one free fit parameter which greatly reduces the danger of overfitting.
The coupling exponent α influences almost all the factors in equation (6) but is unknown and has to be derived empirically. The remainder of this section looks at best practice in deriving the optimum α for the annual mean data available from the reconstructions discussed in the previous section. It is shown that, although compiling Pα from annual means of N_{SW}, V_{SW}, B and sin^{4}(θ/2) is a less satisfactory approach and does slightly alter the optimum α, it does not lower the correlation obtained, It is also shown that derivation of α is best done using the interplanetary data after 1996 which is continuous because data gaps have considerable influence.
To determine α, we need to assume a functional form for the dependence of a terrestrial space weather parameter or index on Pα. In general that form can be nonlinear and is likely to be different for different space weather indices. We here demonstrate the principles using planetary geomagnetic index, Ap, which like all such “range” indices shows a linear dependence on Pα on all timescales (Finch & Lockwood, 2007). This linearity will be demonstrated below.
The solid lines in Figure 3 show the linear correlation coefficients r between Ap and Pα, as a function of the α value used to compute Pα. These correlations are made using annual mean data but there is a complication here in that annual Pα values can be constructed in two different ways. The simplest is to take annual means of variables, M_{E}, B, N_{sw}, V_{sw} and sin^{4}(θ/2) and then put these into equation (6). We here term results from such an “averagethencombine” procedure as Pα_{ann}. The second way is to take high resolution data (we here use 5minute means), evaluate Pα for each and then average them. We here term the results of such a “combinethenaverage” procedure as 〈Pα〉_{τ=1yr} . The correlograms with Ap as a function of α are shown in Figure 3 for Pα_{ann} by the red solid line and for 〈Pα〉_{τ=1yr} by the black solid line. The peak correlations for the two occur at quite similar α values (0.53 for Pα_{ann} and 0.46 for 〈Pα〉_{τ=1yr} marked, respectively, by the red and black vertical solid lines). The peak is more marked for 〈Pα〉_{τ=1yr} as r falls rapidly with α above the peak, whereas it falls only slowly for Pα_{ann}. Both constructions generate extremely high peak correlations with Ap (r > 0.99), being almost unity in both cases (details are given by cases A and B in Table 2). The fraction of the variation of Ap “explained” by a linear dependence on these Pα estimates is r^{2} and hence more than 98.6% of the variation in annual Ap values can be explained by the annual mean power input to the magnetosphere, as estimated by either procedure. In general, potential nonlinearities make the combinethenaverage procedure preferable, but because only annual mean data are available in reconstructions of past behaviour, the averagethencombine procedure is needed. The fact that that both give extremely high peak correlations at similar α values shows us that using annual means of interplanetary parameters is acceptable. Table 2 also gives the confidence level S of each correlation evaluated by comparison against the autoregressive AR1 red noise model (thereby allowing for the “persistence” − also termed “conservation” or “autocorrelation” − in the two data series). In all cases S is less than 100% by only an extremely small amount.
The dashed lines in Figure 3 show the corresponding correlograms for Pα_{ann} and 〈Pα〉_{τ=1yr} against the fraction of time in each year that Ap is in the top 5% of its overall distribution of values (f[Ap > Apo] where Apo = 36 nT, the 95 percentile of the overall distribution of all daily means of Ap observations which began in 1932). The peak correlations are not as great as for 〈Ap〉_{τ=1yr} but are still very high, exceeding 0.95, and are for almost the same α that gives peak correlations with the mean values. This fact will be exploited further in a second paper in this series.
Figure 4 shows a scatter plot of the estimated normalised power input, Pα/Po, for the bestfit α against Ap for 1996–2016. The blue dots are for daily means 〈Pα〉_{τ=1day} and 〈Ap〉_{τ=1day} and use the bestfit α of 0.46. The best linear regression fit is the black line, given by Ap_{p} = 13.32(Pα/Po)−0.43. The correlation coefficient is 0.905 (case C in Table 2). The distribution of the fit residuals [Ap − Ap_{p}]_{τ=1day} is Gaussian (required for a valid linear regression) with a standard deviation of σ = 5.4 nT. The red dots show the annual means with Pα_{ann}/Po values computed from annual means of interplanetary parameters and using the best fit α of 0.53. The correlation coefficient is 0.994 (case B of Table A). The red line is the best fit linear regression to these points, Ap_{p} = 12.32(Pα/Po) − 0.01. The distribution of the fit residuals [Ap − Ap_{p}]_{τ=1yr} is also Gaussian with a standard deviation of σ = 1.1 nT.
The additional scatter in daily values is to be expected, physically as well as statistically, because the magnetosphere shows some persistence on timescales of several hours and so a daily value of Ap is expected to show some effect of energy input on the previous day (or maybe even a few days) as well as that on the day in question. This does not apply on annual timescales.
Figure 5 shows the annualresolution time series giving the correlations listed in Table 2. Panels (a) and (b) are for the period of almost continuous data coverage (1996–2016), whereas (c) and (d) are for the full interval of interplanetary observations, 1964–2016. Panels (b) and (d) give the fraction, f, of hours giving valid means of Pα (requiring over 45 out of the 60 1minute samples for each of B, V_{SW}, N_{SW} and θ available in each hour): (b) shows that f is close to unity after 1996, but (d) shows that availability is very low for many of the years before that date. A threshold value of f = 0.15 is shown by the dashed line. In panel (a), the green line is the observed annual means of Ap, 〈Ap〉_{τ=1yr}, the black line is the linearly regressed variation of 〈Pα〉_{τ=1yr} (〈Pα〉′_{τ=1yr} , the prime denoting it has been linearly regressed with 〈Ap〉_{τ=1yr}) and the red line is Pα′_{ann}. The extremely close agreement between the three is quantified by the nearunity correlation coefficients for cases A and B in Table 2. Underneath the green line, the variation of Ap for only data that is coincident with a valid Pα value, 〈Ap_{c}〉_{τ=1yr}, was also plotted; however, f is so close to unity throughout this interval that it cannot be seen. The way Ap_{c} was constructed was to linearly interpolate the 3hourly observed values of Ap into hourly values and then all of those hourly values that were not coincident with a valid Pα value (to within a propagation uncertainty of 15 min) were removed (“masked out”). The remaining values were then averaged over the year to give 〈Ap_{c}〉_{τ=1yr}.
Although 〈Ap〉_{τ=1yr} and 〈Ap_{c}〉_{τ=1yr} are essentially identical in Figure 5a but Figure 5c shows this is far from true for several years before 1996. For example, between 1988 and 1996 there is a considerable difference and although the green line shows the true variation of Ap, the blue line (which we must correlate the means of Pα with) is considerably different. Note that the means of 〈Ap_{c}〉_{τ=1yr} and 〈Pα〉′_{τ=1yr} (black line) are not shown for some years because f is too low to form a valid mean: the threshold availability for such means was set at f = 0.15, shown by the horizontal dashed line in 5d. The correlation between the blue and black lines in 5d, given by case E in Table 2, is exceptionally high. Repeating this correlation for the averagethencombine estimates, Pα_{ann}, very slightly increases the correlation, as it did for the post1996 period (case F). Lastly the red line shows the results of correlating the Pα_{ann} estimates with the Ap means, without first removing the noncoincident data (case D in Table 2). There are now enough years with f ≈ 1 to make the correlation very high but it is reduced and this is the only correlation that does not give a confidence level S that is not 100% to within 3 decimal places.
This analysis of the correlation of the power input to the magnetosphere, Pα, tuned with the coupling exponent α to give the best fit to the Ap geomagnetic data, shows that compiling Pα from annual means of N_{SW}, V_{SW}, B and θ does not lower the correlation obtained, although it is the less satisfactory approach. It does very slightly increase the optimum α from 0.46 to 0.53. This is important as it means we can use reconstructed annual means of these parameters to derive a valid reconstruction of annual means of Pα (using α = 0.53). The analysis also shows that the correlations found for the nearcontinuous Pα data after 1996 also apply to all the available data, provided data gaps in the Pα data series are also masked out in the Ap data series so one compares likewithlike.
Fig. 3 Correlation coefficients, r, as a function of α for the interval of nearcontinuous Pα data (1996–2016). Solid lines are for correlations between annual Pα and Ap and dashed lines are for correlations between Pα and the fraction of days when Ap exceeds the overall 95 percentile level, f_{[Ap>Apo]}. The black lines use annual means (τ = 1 yr) of Pα values computed from 5minute means of interplanetary data, 〈Pα〉_{τ=1yr} (i.e., the data are combined and then averaged) and the peak value (marked by the vertical solid black line) gives correlation A in Table 2. The red lines use Pα_{ann} which is compiled using annual means of all available N_{SW}, V_{SW}, B and sin^{4}(θ/2) data (i.e., the data are averaged and then combined) and the peak values gives correlation B in Table 2. The 95percentile of the overall distribution of all Ap values (observed since the data series began in 1932) is Apo = 36 nT. 
Details of correlations between the Ap geomagnetic index and power input to the magnetosphere Pα.
Fig. 4 Scatter plots of Ap_{c} as a function of normalised Pα values (for the best fit α of 0.46) based on 1hour resolution interplanetary data for 1964–2016. Po is the average of Pα for all the available data from 1964–2016, and Ap_{c} are hourly Ap values (linearly interpolated from the observed 3hourly Ap values) that are coincident with a valid hourly Pα value. The blue dots are for daily means (τ = 1 day) and the red dots for annual means (τ = 1 year). The black line is the linear regression fit to the τ = 1 day data and gives the correlation C in Table 2, the red line is the linear regression fit to the τ = 1 year data and gives the correlation D in Table 2. 
Fig. 5 Time series of observed Ap and bestfits to Ap using Pα estimates, both using annual averaging intervals. The fitted series use linear regressions and are denoted by a prime: tests for homoskedasticity, normal distribution of residuals, linearity are all readily passed because correlation coefficients are so high. (a) and (b) are for 1996–2016, (c) and (d) for 1964–2016. Panels (b) and (d) show the data availability, giving the fraction f of hours for which an hourly mean value of Pα is available. In (a) and (c) the green line gives the means of all the Ap data, 〈Ap〉_{τ=1yr} and the blue line gives the means of only the Ap data that are coincident with valid Pα data 〈Ap_{c}〉_{τ=1yr}. (Note that in (a) the blue line is essentially identical to the green and so is completely overwritten by it). In (a) the black line uses annual means (τ = 1 yr) of Pα values computed from 5minute means of interplanetary data, 〈Pα〉′_{τ=1yr} (i.e., the data are combined and then averaged) and is based on the correlation A in Table 2. The red line uses Pα_{ann} which is compiled using annual means of all available N_{SW}, V_{SW}, B and sin^{4}(θ/2) data (i.e., the data are averaged and then combined) and is based on correlation B in Table 2. In panel (d) the black line uses 〈Pα〉_{τ=1yr} and 〈Ap_{c}〉_{τ=1yr} and is based on correlation E in Table 2 whereas the red line uses Pα_{ann} and 〈Ap_{c}〉_{τ=1yr} and is based on correlation D in Table 2. 
5 The relation of distributions of daily values to annual means
Section 3 stresses how important averaging timescale τ can be by looking at the behaviour of the sin^{4}(θ/2) term in Pα. In this section we investigate the annual distributions of daily values (τ = 1 day) of both Ap and Pα and how they relate to the annual means (τ = 1 year).
The colour pixels in the middle panel of Figure 6 show the annual probability density function (pdf, evaluated in Ap bins 1 nT wide) of daily Ap values, 〈Ap〉_{τ=1day}, as vertical slices and as a function of year along the horizontal axis. The black line gives the annual means values, 〈Ap〉_{τ=1yr} (shown by the green lines in Figure 5). The fraction of time for which the Ap data are available (top panel) is unity for all years. The bottom panel shows the pdfs of 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} (evaluated in bins 0.1 wide). The black line shows the means of these normalised Ap distributions which, by definition, are always unity. What is striking is how uniform the distributions of 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} have been, despite the variation in 〈Ap〉_{τ=1yr}. There is a slight tendency for the mode value to increase slightly toward the mean value in the sunspot minimum years 1996 and 2009, but this is slight and overall these normalised distributions are remarkably similar. The implications of this will be discussed further and exploited in a second paper (“Paper 2”).
This is of interest here as the good correlations between Ap and Pα discussed in the previous section suggest the same should be true for the magnetospheric power input estimate Pα. This is investigated in Figure 7, which is the same as Figure 6 for Pα. The complication is that, unlike Ap, the data availability f is often much smaller than unity, and insufficient samples mean that the pdfs become very noisy. We here require that f (shown in the top panel), exceed 0.5 to construct a valid pdf. This is a compromise between the noise levels in the pdfs and the number of years available: the choice of f > 0.5 allows us to use data from 31 of the 53 years in the interval 1964–2016. The pdfs of 〈Pα〉_{τ=1day}/Po are shown in the middle panel of Figure 7 (evaluated for bins 0.1 wide) and those for 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} (also evaluated for bins 0.1 wide) in the bottom panel. Where sufficient data are available (f > 0.5) the distributions of the normalised daily Pα in the bottom panel of Figure 7 are remarkably constant in form. Indeed they are more so than for Ap as even the slight increase in the mode value for 2009 cannot be detected, although it is (just) detectable for 1996.
The grey histograms in Figure 8 present the overall distributions (all data from all years 1964–2016) of (a) 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} and (b) 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr}. In both cases a lognormal distribution has been fitted and is shown by the magenta line. By definition, both these distributions have arithmetic mean values of unity and so there is only one fit parameter, the variance, v. (Note that the mean and variance determine the other pair of logarithmic parameters, μ and σ, that can be used to define the lognormal distribution). It can be seen that both distributions are very close to lognormal. The distributions for individual years (the vertical slices in the bottom panels of Figs. 6 and 7) are very similar and give essentially the same variance v values as for the overall distributions. For Ap the overall distribution gives v = 0.934, whereas the mean v for the 53 years individually is 0.934 ± 0.031 (the uncertainty being the standard deviation). For Pα, the overall distribution gives v = 0.913, whereas the mean v values for the 31 years with f > 0.5 is 0.915 ± 0.025 (the uncertainty again being 1 standard deviation).
From this analysis we conclude that the distributions shown in Figure 8 apply, to a good approximation, to all years individually. This has been shown to be true for the high 〈Pα〉_{τ=1yr}/Po solarmaximum years, down to the lowest value seen since interplanetary measurement began, which was detected in 2009.
The nearconstancy of the shape of the distributions of 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} has an interesting implication: variations in the annual means will be reflected in the variations in the numbers of days in the year that Pα will exceed any fixed and large threshold that is used to define “large” events . The variation will not, in general, be linear but it will be monotonic. To test this implication, Figure 9 looks at the variation of the fraction of days f_{p}[〈Pα〉_{τ=1day} ≥ Pα_{pc}] on which 〈Pα〉_{τ=1day} exceeds or equals a threshold value Pα_{pc}. This is shown as a function of 〈Pα〉_{τ=1yr}/Po for two fixed thresholds: Pα_{pc} = 2.38Po (which defines the 95th percentile of the cumulative probability distribution of all available 〈Pα〉_{τ=1day} values from the interval 1964–2016); and Pα_{pc} = 3.77Po (which defines the 99th percentile of the same cdf). Only the 31 years with data availability f > 0.5 from the interval 1964–2016 are used, which gives a total of 10186 daily means (out of the total available for the full interval of 12198). Figure 9 shows the annual fraction of daily mean Pα values that are in the top 5% (solid points) and top 1% (open triangles) of all the 12198 available daily means. These data use the optimum α of 0.53. The solid line is a third order polynomial fit to the data points for Pα_{pc} = 2.38Po and the dashed line is that for Pα_{pc} = 3.77Po. Both reveal that fp shows an increasing trend with the annual mean value, but it is not, in general, linear. The smaller numbers associated with higher percentiles of the cdf mean that the scatter is increased. The variations shown in Figure 9 have an important implication because they mean that information about the annual averages, which can be reconstructed as described in Section 2, gives us information about the number of days of high power input into the magnetosphere that leads to them being disturbed space weather days.
Fig. 6 Analysis of Ap distributions over 1964–2016. (Top) The fraction of time in each year, f, for which a valid Ap is available. (Middle) Annual probability distribution function (pdfs) of dailymean Ap (〈Ap〉_{τ=1day} ) are colour coded and the black line gives the annual means of Ap, 〈Ap〉_{τ=1yr} . Probabilities are evaluated in bins of Ap that are 1 nT wide. (Bottom) Annual probability distribution function (pdfs) of normalised daily Ap, 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} , are colour coded and the black gives the annual means of this normalised Ap which are unity. Probabilities are evaluated in bins of normalised Ap that are 0.1 wide. 
Fig. 7 Same as Figure 5 for the estimated power input into the magnetosphere, Pα. (Top) The fraction of time in each year, f, for which a valid Pα value is available. (Middle) Annual probability distribution function (pdfs) of daily means of Pα/Po (〈Pα〉_{τ=1day}/Po ) are colour coded and the black line gives the annual means of 〈Pα〉_{τ=1yr}/Po . Probabilities are evaluated in bins of Pα/Po that are 0.1 wide and years for which f < 0.5 are omitted. (Bottom) Annual probability distribution function (pdfs) of normalised daily Pα, 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} , are colour coded and the black gives the annual means of this normalised Pα which is unity. Probabilities are evaluated in bins of normalised Pα that are 0.1 wide. 
Fig. 8 (Grey histograms) Average pdfs for all years of (a) 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} and (b) 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} . In each case the bestfit lognormal distribution is shown, with mean of unity (by definition) and a variance v = 0.934 in (a) and v =0.913 in (b). Distributions for individual years (not shown) are a bit noisier, particularly for 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} in years when data availability f is low, but give very similar best fit v values, as indicated by the horizontal uniformity of the bottom panels of figures (6) and (7): for the annual normalised Ap distributions, the mean ± 1 standard deviation of the fitted v values for the 53 years is 0.934 ± 0.031; for the annual normalised Pα distributions, v values for the 31 years with f > 0.5 is 0.915 ± 0.025. 
Fig. 9 Variation of the fraction f_{p} of daily means of Pα in one year that are in the top 5% (solid points) and top 1% (open triangles) of all the 12198 available daily means of Pα for the interval 1964–2016, shown as a function of the annual mean (〈Pα〉_{τ=1yr}/Po). These data use the optimum α of 0.53. The annual fraction f_{p}[〈Pα〉_{τ=1day} ≥ Pα_{pc}] is shown for two fixed thresholds: Pα_{pc} = 2.38Po (which defines the 95th percentile of the cumulative probability distribution of all available 〈Pα〉_{τ=1day} values from the interval 1964–2016 which number 12198); and Pα_{pc} = 3.77Po (which defines the 99th percentile of the same cdf). Points for these two thresholds are shown by solid circles and open triangles, respectively. Only the 31 years in which data availability f exceeds 50% are shown which account for 10186 samples out of the available total of 12198 (the other years show the same trend but the smaller number of samples increases the scatter). The solid line is a third order polynomial fit to the data points for Pα_{pc} = 2.38Po and the dashed line is that for Pα_{pc} = 3.77Po. Both reveal that fp shows an increasing trend with the annual mean value, but it is not, in general, linear. The smaller numbers associated with higher percentiles of the cdf mean that the scatter is increased. 
6 The variation of power input into the magnetosphere since 1612
The model reconstructions of Owens et al. (2017) give the annual mean values of N_{SW}, V_{SW} and B needed to evaluate Pα from equation (6). Figure 2 shows that on annual time scales we can use a constant value of 〈sin^{4}(θ/2)〉_{all} = 0.355 to an accuracy of a bit better than 5%. Other values are removed by normalising and we here divide by Po, the mean of Pα for the period 1964–2016 (as used in previous sections). The Earth's magnetic moment M_{E} is obtained from a spline of the IGRF and gufm1 models, as described earlier.
Figure 10 shows the time series of modelled annual values on nearEarth interplanetary parameters over 1612–2015 generated by Owens et al. (2017). Panel (a) shows the input sunspot number data series, R_{C}. The results do depend on which of the sunspot composite data series discussed in the introduction was adopted, but the study by Owens et al. (2017) shows that this sensitivity is not great.
The time series of annual Pα computed from equation (6) with these model values depends on the coupling function α used. Much of the previous sections was based on α = 0.53 as this gives Pα that is proportional to Ap and predicts the Ap index to within 1 nT on annual timescales. However, this does not mean the same value of α would apply to all terrestrial disturbance indices as they will have different responses to a given energy input into the magnetosphere. Hence we here study the implications of a range of α values in Figure 11.
Negative values of α have an obvious implication in that they generate solar cycle variations of Pα that are in antiphase with the known solar cycles. Thus we can immediately discount their use as predictors of space weather conditions. Figure 11a shows the variation for α = 0. From equation (6) this yields a Pα that varies as and , but is independent of the IMF B. This combination yields solar cycle variations that are almost flat, the main feature being a spike in the declining phase of the cycles when average solar wind speeds are enhanced by more frequent fast solar wind streams emanating from coronal holes at lower latitudes. The longterm variation is almost flat with only a slight depression during the Maunder minimum. Figure 11b is for α = 1/3. From equation (6) this yields a Pα that varies as , and B^{2/3}. This yields small but recognisable solar cycles and a clear longerterm variation with a drift since the Maunder minimum that exceeds the amplitudes of recent solar cycles. This drift is clear in both sunspot minimum and sunspot maximum values. Figure 11c is for α = 2/3 which yields a Pα that varies as and B^{4/3} but is independent of N_{sw}. This yields a longterm drift in solarcycle averages that is similar in magnitude to the amplitude of the recent cycles, but the drift in sunspot minimum values is smaller than that in sunspot maximum values. This dependence on α continues with increasing α as shown by Figure 11d for α = 1 (that gives variations with , and B^{2}), Figure 11e for α = 4/3 (that gives variations with V_{sw}, and B^{8/3}) and Figure 11f for α = 5/3 (that gives variations with , and B^{10/3}). In Figure 11f, the variation is much greater in the solar maximum values than the solar minimum values, such that the overall variation rather resembles that in sunspot numbers. Note that for the larger α values, the power input decreases with increased N_{sw} because the rise in energy density in the solar wind with N_{sw} is smaller than the effect of reduced crosssectional area of the magnetosphere intersecting the solar wind flow. These two effects balance at α = 2/3 which is somewhat greater than all of the bestfit α values shown in Table 2. Hence although values are close enough to 2/3 to make the dependence on N_{SW} weak, the dominant effect is that of the solar wind energy density and energy input to the magnetosphere (weakly) increases with increased solar wind number density.
The value of the coupling exponent derived here to fit Ap data of α = 0.53 gives factors of B^{1.06}, , and . This is very close to the dependence of found by Lockwood (2013) (without using Pα) for “range” geomagnetic indices, where n = 1.9 for the aa index, n = 1.8 for the Am index, and n = 1.6 for the Ap index. The black line in Figure 12 shows the centennial variation of 〈Pα〉_{τ=1yr}/Po for α =0.53. This is superposed on coloured pixels giving the corresponding pdf of 〈Pα〉_{τ=1day}/Po evaluated annually and in bins of 0.01. These are computed using the distribution of 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} shown in Figure 8b which in Section 5 was shown to be valid for individual years. The pdfs are evaluated using the fitted lognormal distribution for values of 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} that are 10^{−3} apart. These are then rescaled as 〈Pα〉_{τ=1day}/Po values using the 〈Pα〉_{τ=1yr} value for the year in question and then averages of the pdf taken over bins of 〈Pα〉_{τ=1day}/Po that are 0.01 wide. The high preponderance of quiet days during deep solar activity minima when 〈Pα〉_{τ=1yr} is low is shown by red and brown patches. The pale yellow pixels at high values show the variation of the probability of large space weather events when energy input into the magnetosphere is high.
Fig. 10 Time series of modelled annual values on nearEarth interplanetary parameters over 1612–2015 by Owens et al. (2017): (a) the input sunspot number data series, R_{C}; (b) the IMF field strength, B; (c) the solar wind speed, V_{SW}; (d) the solar wind number density, Nsw. Panel (e) shows the Earth's magnetic dipole moment, M_{E}, from a spline of the gufm1 (for before 1900) and IGRF (for after 1900) models . 
Fig. 11 Predicted variations of annual Pα/Po values for six values of the coupling exponent α: (a) α = 0; (b) α = 1/3; (c) α = 2/3; (d) α =1; (e) α = 4/3; and (f) α = 5/3. 
Fig. 12 Annual pdfs and mean values of the normalised power input to the magnetosphere, Pα/Po for 1612–2015 and for the bestfit coupling exponent for the Ap index of 0.53. The colour contours give the pdf, evaluated for bins in Pα/Po that are of width 0.01. The Po used here is the average value of Pα from spacecraft measurements over 1964–2016 (inclusive). 
7 Discussion and conclusions
We have presented the first analysis of the variation of power input into Earth's magnetosphere over the last 400 years. This analysis has reconstructed not only the annual mean values, but also the annual probability distribution of daily means. Hence the reconstructions give both the average space climate, but also the probabilities of various levels of space weather conditions in the past.
The power input is not given here in absolute units of Watts, but as ratio of the overall mean for all observations in the space age (1964–2016), Po, partly because this removes the requirement to quantify the constants k_{1} (the flow factor for a bluntnosed object) and k_{2} (the amplitude of the transfer function term), and partly because it cancels the dependence on the assumed constant average solar wind ion mass. We have shown how the derived 〈Pα〉/Po values depend on α but have concentrated on α = 0.53 which gives the best fit to the Ap geomagnetic index when annual mean data is used. Paper 2 in this series will carry this work forward and make reconstructions of geomagnetic activity indices and so look at the occurrence of geomagnetic storms and substorms over the past 400 years. We here note that the value of α = 0.53 is a close match to the dependence derived by Lockwood (2013) of a great many geomagnetic indices (specifically Ap, Kp, IHV, Am, AL and AE). These all show a dependence that is close to . Other indices, such as Dst, IDV and m, depend almost solely on B. Lockwood (2013) has explained the origin of the term for the first of these two sets of indices as being due to their sensitivity to the substorm current wedge and predicts that the total current disrupted from the nearEarth crosstail current and deflected down into the auroral electrojet in the current wedge will depend on the solar wind dynamic pressure and hence, at least approximately, on . Hence the expectation is that geomagnetic indices that respond strongly to substorm cycles of solar wind extraction of energy from the solar wind, its storage in the geomagnetic tail and its subsequent energy deposition in upper atmospheric currents, will correlate well with Pα for α somewhere between about 1/3 (which would give a dependence) and 1/2 (which would give a dependence). Paper 2 will study the long term variations in these geomagnetic indices, based on their high correlation with, and linear and proportional dependence on Pα. Other indices, in particular, the Dst index, do not show this proportionality with Pα and the implication for Dst and ring current energetic particles will be discussed in later papers.
Acknowledgements
The authors are grateful to the staff of Space Physics Data Facility, NASA/Goddard Space Flight Centre, who prepared and made available the OMNI2 dataset used. The data were downloaded from http://omniweb.gsfc.nasa.gov/ow.html. The geomagnetic data were obtained from a variety of data centres including the UK Solar System Data Centre at RAL (https://www.ukssdc.ac.uk/), The British Geological Survey (http://www.geomag.bgs.ac.uk/data_service/data/home.html), The World Data Centre WDC for Geomagnetism, Kyoto (http://wdc.kugi.kyotou.ac.jp/aedir/). The work of ML and MJO is supported by the SWIGS NERC Directed Highlight Topic Grant number NE/P016928/1 and the work of LAB, CEW and CJD by a STFC consolidated grant number ST/M000885/1. The editor thanks two anonymous referees for their assistance in evaluating this paper.
References
 Altschuler MA, Newkirk G. 1969. Magnetic fields and the structure of the solar corona. Sol Phys 9: 131–149. DOI:10.1007%2FBF00145734. [NASA ADS] [CrossRef] (In the text)
 Asvestari E, Usoskin IG. 2016. An empirical model of heliospheric cosmic ray modulation on longterm time scale. J Space Weather Space Clim 6: A15. DOI:10.1051/swsc/2016011. [CrossRef] [EDP Sciences] (In the text)
 Asvestari E, Usoskin IG, Cameron RH, Krivova NA. 2016. Semiempirical longterm reconstruction of the heliospheric parameters: validation by cosmogenic radionuclide records. In: Dorotovič I, Fischer CE, Temmer M, eds. GroundBased Solar Observations in the Space Instrumentation Era. ASP Conference Series, Vol. 504. Astronomical Society of the Pacific, http://aspbooks.org/custom/publications/paper/5040269.html. (In the text)
 Asvestari E, Usoskin IG, Kovaltsov GA, Owens MJ, Krivova NA, Rubinetti S, Taricco C. 2017. Assessment of different sunspot number series using the cosmogenic isotope 44Ti in meteorites. Mon Not R Astron Soc 467(2): 1608–1613, DOI:10.1093/mnras/stx190. (In the text)
 Babcock HW. 1953. The solar magnetograph. Astrophys J 118: 387–396, DOI:10.1086/145767. [NASA ADS] [CrossRef] (In the text)
 Balogh A, Beek TJ, Forsyth RJ, Hedgecock PC, Marquedant RJ, Smith EJ, Southwood DJ, Tsurutani BT. 1992. The magnetic field investigation on the Ulysses mission – instrumentation and preliminary scientific results. Astron Astrophys Suppl Ser 92(2): 221–223, ISSN: 03650138. (In the text)
 Biermann L, Brosowski B, Schmidt HU. 1967. The interaction of the solar wind with a comet. Sol Phys 1(2): 254–284, DOI:10.1007/BF00150860. [NASA ADS] [CrossRef] (In the text)
 Chapman S, Ferraro VC. 1931. A new theory of magnetic storms. Terr Mag Atmos Elect 36(2): 77–97, DOI:10.1029/TE036i002p00077. [CrossRef] (In the text)
 Clette F, Svalgaard L, Vaquero JM, Cliver EW. 2014. Revisiting the Sunspot Number: a 400year perspective on the solar cycle. Space Sci Rev 186(1–4): 35–103, DOI:10.1007/s1121401400742. [NASA ADS] [CrossRef] (In the text)
 Couzens DA, King JH. 1986. Interplanetary Medium Data Book – Supplement 3, 1977–1985. Greenbelt, MD: NSSDC/WDCA/R&S, 8604, NASA, http://www.archive.org/details/nasa_techdoc_19890001401. (In the text)
 Finch ID, Lockwood M. 2007. Solar windmagnetosphere coupling functions on timescales of 1 day to 1 year. Ann Geophys 25: 495–506, DOI:10.5194/angeo254952007. [CrossRef] (In the text)
 Finlay CC, Maus S, Beggan CD, Bondar TN, Chambodut A, Chernova, T. A., Chulliat A, Golovkov VP, Hamilton B, Hamoudi M, Holme R, Hulot G, Kuang W, Langlais B, Lesur V, Lowes FJ, Lühr H, Macmillan S, Mandea M, McLean S, Manoj C, Menvielle M, Michaelis I, Olsen N, Rauberg J, Rother M, Sabaka TJ, Tangborn A, TøffnerClausen L, Thébault E, Thomson AWP, Wardinski I, Wei Z, Zvereva TI. 2010. International Geomagnetic Reference Field: the eleventh generation. Geophys. J. Int., 183, 1216–1230, DOI: 10.1111/j.1365246X.2010.04804.x. (In the text)
 Feynman J, Crooker NU. 1978. The solar wind at the turn of the century. Nature 275: 626–627, DOI:10.1038/275626a0. [CrossRef] (In the text)
 Hale GE. 1908. On the probable existence of a magnetic field in sunspots. Astrophys J 28: 315–347, DOI:10.1086/141602. [NASA ADS] [CrossRef] (In the text)
 Hayakawa H, Tamazawa H, Kawamura AD, Isobe H. 2015. Records of sunspot and aurora during CE 960–1279 in the Chinese chronicle of the Sòng dynasty, Earth Planets Space 67: 82, DOI:10.1186/s406230150250y. [CrossRef] (In the text)
 Holappa L, Mursula K. 2015. Toward more reliable longterm indices of geomagnetic activity: correcting a new inhomogeneity problem in early geomagnetic data. J Geophys Res Space Phys, 120, 8288–8297, DOI: 10.1002/2015JA021752. [CrossRef] (In the text)
 Hoyt DV, Schatten KH. 1998. Group sunspot numbers: a new solar activity reconstruction. Sol Phys 181(2): 491–512, DOI:10.1023/A:1005056326158. [NASA ADS] [CrossRef] (In the text)
 King HC. 1955. The history of the telescope. Mineola, New York: Courier Corporation, Dover Publications, ISBN 0486432653. (In the text)
 Knape J, de Valpine, P. 2011. Effects of weather and climate on the dynamics of animal population time series. Proc R Soc Lond B: Biol Sci 278(1708): 985–992, DOI:10.1098/rspb.2010.1333. [CrossRef] (In the text)
 Knutti R, Meehl GA, Allen MR, Stainforth DA. 2006. Constraining climate sensitivity from the seasonal cycle in surface temperature. J Clim 19(17): 4224–4233, DOI:10.1175/JCLI3865.1. [CrossRef] (In the text)
 Jackson A, Jonkers AR, Walker MR. 2000. Four centuries of geomagnetic secular variation from historical records, Philos. T. Roy. Soc. A, 358, 957–990, DOI: 10.1098/rsta.2000.0569. (In the text)
 Lockwood M. 2003. Twentythree cycles of changing open solar flux. J Geophys Res 108(A3): 1128, DOI:10.1029/2002JA009431. [CrossRef] (In the text)
 Lockwood M. 2013. Reconstruction and prediction of variations in the open solar magnetic flux and interplanetary conditions. Living Rev Solar Phys 10(4), DOI:10.12942/lrsp20134. [CrossRef] (In the text)
 Lockwood M, Owens MJ. 2011. Centennial changes in the heliospheric magnetic field and open solar flux: the consensus view from geomagnetic data and cosmogenic isotopes and its implications. J Geophys Res 116: A04109, DOI:10.1029/2010JA016220. (In the text)
 Lockwood M, Owens MJ. 2014. Centennial variations in sunspot number, open solar flux and streamer belt width: 3. Modelling. J Geophys Res Space Phys 119(7): 5193–5209, DOI:10.1002/2014JA019973. [CrossRef] (In the text)
 Lockwood M, Stamper R, Wild MN. 1999. A doubling of the sun's coronal magnetic field during the last 100 years. Nature 399: 437–439, DOI:10.1038/20867. [NASA ADS] [CrossRef] (In the text)
 Lockwood R, Forsyth B, Balogh A, McComas DJ. 2004. Open solar flux estimates from nearEarth measurements of the interplanetary magnetic field: comparison of the first two perihelion passes of the Ulysses spacecraft. Ann Geophys 22: 1395–1405, DOI:10.5194/angeo2213952004. [NASA ADS] [CrossRef] (In the text)
 Lockwood M, Rouillard AP, Finch ID, Stamper R. 2006. Comment on “The IDV index: its derivation and use in inferring longterm variations of the interplanetary magnetic field strength” by Svalgaard and Cliver. J Geophys Res 111: A09109, DOI:10.1029/2006JA011640. (In the text)
 Lockwood M, Owens M, Rouillard AP. 2009a. Excess open solar magnetic flux from satellite data: 2. A survey of kinematic effects. J Geophys Res 114: A11104, DOI:10.1029/2009JA014450. (In the text)
 Lockwood M., Rouillard AP, Finch ID. 2009b. The rise and fall of open solar flux during the current grand solar maximum, Astrophys. J., 700(2), 937–944, DOI: 10.1088/0004637X/700/2/937. (In the text)
 Lockwood M, Nevanlinna H, Barnard L, Owens MJ, Harrison RG, Rouillard AP, Scott CJ. 2014a. Reconstruction of geomagnetic activity and nearearth interplanetary conditions over the past 167 years: 4. Nearearth solar wind speed, IMF, and open solar flux. Ann Geophys 32: 383–399, DOI:10.5194/angeo323832014. [NASA ADS] [CrossRef] (In the text)
 Lockwood M, Owens MJ, Barnard L. 2014b. Centennial variations in sunspot number, open solar flux, and streamer belt width: 1. Correction of the sunspot number record since 1874. J Geophys Res Space Phys 119, DOI:10.1002/2014JA019970. (In the text)
 Lockwood M, Owens J, Barnard L. 2014c. Centennial variations in sunspot number, open solar flux and streamer belt width: 2. Comparison with geomagnetic data. J Geophys Res Space Phys 119(7), 5183–5192, DOI:10.1002/2014JA019972. [CrossRef] (In the text)
 Lockwood M, Nevanlinna H, Vokhmyanin M, Ponyavin D, Sokolov S, Barnard L, Owens MJ, Harrison RG, Rouillard AP, Scott CJ. 2014d. Reconstruction of geomagnetic activity and nearearth interplanetary conditions over the past 167 years: 3. Improved representation of solar cycle 11. Ann Geophys 32: 367–381, DOI:10.5194/angeo323672014. [CrossRef] (In the text)
 Lockwood M, Owens MJ, Barnard ELA, Usoskin IG. 2016a. An assessment of sunspot number data composites over 1845–2014. Astrophys J 824: 54, DOI:10.3847/0004637X/824/1/54. [NASA ADS] [CrossRef] (In the text)
 Lockwood M, Owens MJ, Barnard LA, Usoskin IG. 2016b. Tests of sunspot number sequences: 3. Effects of regression procedures on the calibration of historic sunspot data. Sol Phys 291: 2829–2841, DOI:10.1007/s1120701508292. [NASA ADS] [CrossRef] (In the text)
 Lockwood M, Owens MJ, Barnard LA, Bentley S, Scott CJ, Watt CE. 2016c. On the origins and timescales of geoeffective IMF. Space Weather 14: 406–432, DOI:10.1002/2016SW001375. [CrossRef] (In the text)
 Mackay DH, Yeates AR. 2012. The Sun's global photospheric and coronal magnetic fields: observations and models. Living Rev Solar Phys 9(6), DOI:10.12942/lrsp20126. [CrossRef] (In the text)
 McCracken KG, Beer J. 2015. The annual cosmicradiation intensities 1391–2014; the annual heliospheric magnetic field strengths, 1391–1983; and identification of solar cosmic ray events in the cosmogenic record 1800–1983. Sol Phys 290(10): 3051–3069, DOI:10.1007/s112070150777x. [CrossRef] (In the text)
 Matthes K, Funke B, Andersson ME, Barnard L, Beer J, Charbonneau P, Clilverd MA, Dudok de Wit T, Haberreiter M, Hendry A, Jackman CH, Kretzschmar M, Kruschke T, Kunze M, Langematz U, Marsh DR, Maycock A, Misios S, Rodger CJ, Scaife AA, Seppälä A, Shangguan M, Sinnhuber M, Tourpali K, Usoskin I, van de Kamp M, Verronen PT, Versick S. 2016. Solar Forcing for CMIP6 (v3.1). Geosci Model Dev Discuss, in press, DOI:10.5194/gmd201691. (In the text)
 Nelder JA, Mead R. 1965. A simplex method for function minimization. Comput J 7: 308–313, DOI:10.1093/comjnl/7.4.308. [CrossRef] (In the text)
 North J. 1974. Thomas Harriot and the first telescopic observations of sunspots. In: J. Shirley, ed. Thomas Harriot: Renaissance scientist. Oxford Clarendon Press, pp. 129–165. (In the text)
 Owens MJ, Forsyth RJ. 2013. The heliospheric magnetic field. Living Rev Sol Phys 10(5), DOI:10.12942/lrsp20135. [CrossRef] (In the text)
 Owens MJ, Lockwood M. 2012. Cyclic loss of open solar flux since 1868: the link to heliospheric current sheet tilt and implications for the Maunder Minimum. J Geophys Res 117: A04102, DOI:10.1029/2011JA017193. [CrossRef] (In the text)
 Owens MJ, Arge CN, Crooker NU, Schwadron NA, Horbury TS. 2008. Estimating total heliospheric magnetic flux from singlepoint in situ measurements. J Geophys Res 113: A12103, DOI:10.1029/2008JA013677. (In the text)
 Owens MJ, Cliver E, McCracken K, Beer J, Barnard LA, Lockwood M, Rouillard AP, Passos D, Riley P, Usoskin IG, Wang YM. 2016a. Nearearth heliospheric magnetic field intensity since 1800. Part 1: Sunspot and geomagnetic reconstructions. J Geophys Res 121(7): 6048–6063, DOI:10.1002/2016JA022529. [CrossRef] (In the text)
 Owens MJ, Cliver E, McCracken K, Beer J, Barnard LA, Lockwood M, Rouillard AP, Passos D, Riley P, Usoskin IG, Wang YM. 2016b. Nearearth heliospheric magnetic field intensity since 1800. Part 2: cosmogenic radionuclide reconstructions. J Geophys Res 121(7): 6064–6074, DOI:10.1002/2016JA022550. [CrossRef] (In the text)
 Owens MJ, Crooker NU, Lockwood M. 2011. How is open solar magnetic flux lost over the solar cycle? J Geophys Res 116: A04111, DOI:10.1029/2010JA016039. [CrossRef] (In the text)
 Owens MJ, Lockwood M, Riley P. 2017. Global solar wind variations over the last four centuries. Nat Sci Rep 7: Article number 41548, DOI:10.1038/srep41548. (In the text)
 Parker EN. 1958. Dynamics of the interplanetary gas and magnetic fields. Astrophys J 126: 664–676, DOI:10.1086/146579. [NASA ADS] [CrossRef] (In the text)
 Parker EN. 2001. A history of early work on the heliospheric magnetic field. J Geophys Res 106(A8): 15797–15801, DOI:10.1029/2000JA000100. [CrossRef] (In the text)
 Rahmanifard F, Schwadron NA, Smith CW, McCracken KG, Duderstadt KA, Lugaz N, Goelzer ML. 2017. Inferring the heliospheric magnetic field back through Maunder minimum. Astrophys J 837(2), 165/1–165/14, DOI:10.3847/15384357/aa6191. [CrossRef] (In the text)
 Rouillard, AP, Lockwood M, Finch I. 2007. Centennial changes in the solar wind speed and in the open solar flux. J. Geophys. Res., 112, A05103, DOI: 10.1029/2006JA012130. (In the text)
 Schatten KH, Wilcox JM, Ness NF. 1969. A model of interplanetary and coronal magnetic fields, PFSS method. Sol Phys 6, 442–455, DOI:10.1007/BF00146478. [NASA ADS] [CrossRef] (In the text)
 Smith EJ, Balogh A. 1995. Ulysses observations of the radial magnetic field. Geophys Res Lett 22: 3317–3320, DOI:10.1029/95gl02826. [NASA ADS] [CrossRef] (In the text)
 Solanki SK, Schüssler M, Fligge M. 2000. Evolution of the Sun's largescale magnetic field since the Maunder Minimum. Nature 408: 445–447, DOI:10.1038/35044027. [NASA ADS] [CrossRef] [PubMed] (In the text)
 Solanki SK, Schüssler M, Fligge M. 2002. Secular variation of the Sun's magnetic flux. Astron Astrophys 383: 706–712, DOI:10.1051/00046361:20011790. [CrossRef] [EDP Sciences] (In the text)
 Stern D. 2002. A millennium of geomagnetism. Rev Geophys 40(3): 1007, DOI:10.1029/2000RG000097. [CrossRef] (In the text)
 Stenflo JO. 2015. History of solar magnetic fields Since George Ellery Hale. Space Sci Rev 1–31, DOI:10.1007/s112140150198z. (In the text)
 Svalgaard L. 2015. Correction of errors in scale values for magnetic elements for Helsinki. Annales Geopjys., 32(6), 633–641, DOI:10.5194/angeo326332014. (In the text)
 Svalgaard L, Cliver EW. 2005. The IDV index: its derivation and use in inferring longterm variations of the interplanetary magnetic field strength. J Geophys Res 110: A12103, DOI:10.1029/2005JA011203. [NASA ADS] [CrossRef] (In the text)
 Svalgaard L, Cliver EW. 2010. Heliospheric magnetic field 1835–2009. J Geophys Res 115: A09111, DOI:10.1029/2009JA015069. [NASA ADS] [CrossRef] (In the text)
 Svalgaard L, Schatten KH. 2016. Reconstruction of the sunspot group number: the backbone method. Sol Phys 291(9): 2653–2684, DOI:10.1007/s1120701508158. [NASA ADS] [CrossRef] (In the text)
 Usoskin IG. 2017. A history of solar activity over millennia. Living Rev Sol Phys 14(3), DOI: 10.1007/s4111601700069. [NASA ADS] [CrossRef] (In the text)
 Usoskin IG, Hulot G, Gallet Y, Roth R, Licht A, Joos F, Kovaltsov GA, Thébault E, Khokhlov A. 2014. Evidence for distinct modes of solar activity. Astron Astrophys 562: L10, DOI:10.1051/00046361/201423391. [CrossRef] [EDP Sciences] (In the text)
 Usoskin IG, Arlt R, Asvestari E, Hawkins E, Käpylä M, Kovaltsov GA, Krivova N, Lockwood M, Mursula K, O'Reilly J, Owens M, Scott CJ, Sokoloff DD, Solanki SK, Soon W, Vaquero JM. 2015. The Maunder minimum (1645–1715) was indeed a Grand minimum: a reassessment of multiple datasets. Astron Astophys 581: A95, DOI:10.1051/00046361/201526652. [CrossRef] [EDP Sciences] (In the text)
 Usoskin IG, Kovaltsov GA, Lockwood M, Mursula K, Owens MJ, Solanki SK. 2016. A new calibrated sunspot group series since 1749: statistics of active day fractions. Sol Phys 291: 2685–2708, DOI:10.1007/s1120701508381. [NASA ADS] [CrossRef] (In the text)
 Van Helden A. 1977. The invention of the telescope. Trans Am Phil Soc 67(4): 1–67, DOI:10.2307/1006276. http://www.jstor.org/stable/1006276. [CrossRef] (In the text)
 Vasyliunas VM, Kan JR, Siscoe GL, Akasofu SI. 1982. Scaling relations governing magnetospheric energy transfer. Planet Space Sci 30: 359–365, DOI:10.1016/00320633(82)900411. [CrossRef] (In the text)
 Vaquero JM, Vázquez M. 2009. The sun recorded through history. Springer, DOI:10.1007/9780387927909. (In the text)
 Vieira LEA, Solanki SK. 2010. Evolution of the solar magnetic flux on time scales of years to millennia. Astron Astrophys 509: A100, DOI:10.1051/00046361/200913276. [CrossRef] [EDP Sciences] (In the text)
 Wang YM, Lean J, Sheeley Jr NR. 2000. The longterm evolution of the Sun's open magnetic flux. Geophys Res Lett 27: 505–508, DOI:10.1029/1999GL010744. [NASA ADS] [CrossRef] (In the text)
 Watson F. 2004. Stargazer: the life and times of the telescope. Crows Nest: Allen & Unwin, ISBN 1865086584. (In the text)
 Willis DM, Easterbrook MG, Stephenson FR. 1980. Seasonal variation of oriental sunspot sightings. Nature 287: 617–619, DOI:10.1038/287617a0. [NASA ADS] [CrossRef] (In the text)
Cite this article as: Lockwood M, Owens MJ, Barnard LA, Scott CJ, Watt CE. 2017. Space climate and space weather over the past 400 years: 1. The power input to the magnetosphere. J. Space Weather Space Clim. 7: A25
All Tables
The development of reconstructions of annual means of nearEarth space conditions from observations of geomagnetic activity and from models using sunspot number observations as input
Details of correlations between the Ap geomagnetic index and power input to the magnetosphere Pα.
All Figures
Fig. 1 Variations of annual values of the signed open solar flux F_{S}, from various sources, showing how modern values (from magnetogram data via PFSS modelling and from the OMNI dataset of nearEarth interplanetary observations, panel a) are extrapolated back in time using geomagnetic observations (panel b), models based on sunspot numbers (panels c and d) and cosmogenic isotopes (panel d), Sunspot number sequences used as input to the models are shown in part e. Even/oddnumbered solar cycles are shaded white and grey and the Maunder minimum is shaded pink. (See text for further details). 

In the text 
Fig. 2 Distributions of the IMF orientation solar windmagnetosphere coupling function factor, sin^{4}(θ/2), for 1996–2016 (inclusive) for various averaging timescales, τ, where θ = tan^{−1}(B_{Y}/B_{Z}) where B_{Y} and B_{Z} are the Y and Zcomponents of the IMF in the Geocentric Solar Magnetospheric (GSM) reference frame: (a) τ = 5 min; (b) τ = 1 h; (c) τ = 6 h; (d) τ = 1 day; (e) τ = 27 days; (f) τ = 1 yr. N is the number of samples in each sin^{4}(θ/2) bin, the largest value of which for a given τ is N_{max}. In each panel the red dashed line is the overall mean for the whole dataset which, because the averaging intervals of length τ contain equal numbers of samples, equals the mean of each distribution (the “mean of means” or the “grand mean”). The raw B_{Y} and B_{Z} data used are of resolution 5 min and have been rounded up/down to 2 decimal figures. 

In the text 
Fig. 3 Correlation coefficients, r, as a function of α for the interval of nearcontinuous Pα data (1996–2016). Solid lines are for correlations between annual Pα and Ap and dashed lines are for correlations between Pα and the fraction of days when Ap exceeds the overall 95 percentile level, f_{[Ap>Apo]}. The black lines use annual means (τ = 1 yr) of Pα values computed from 5minute means of interplanetary data, 〈Pα〉_{τ=1yr} (i.e., the data are combined and then averaged) and the peak value (marked by the vertical solid black line) gives correlation A in Table 2. The red lines use Pα_{ann} which is compiled using annual means of all available N_{SW}, V_{SW}, B and sin^{4}(θ/2) data (i.e., the data are averaged and then combined) and the peak values gives correlation B in Table 2. The 95percentile of the overall distribution of all Ap values (observed since the data series began in 1932) is Apo = 36 nT. 

In the text 
Fig. 4 Scatter plots of Ap_{c} as a function of normalised Pα values (for the best fit α of 0.46) based on 1hour resolution interplanetary data for 1964–2016. Po is the average of Pα for all the available data from 1964–2016, and Ap_{c} are hourly Ap values (linearly interpolated from the observed 3hourly Ap values) that are coincident with a valid hourly Pα value. The blue dots are for daily means (τ = 1 day) and the red dots for annual means (τ = 1 year). The black line is the linear regression fit to the τ = 1 day data and gives the correlation C in Table 2, the red line is the linear regression fit to the τ = 1 year data and gives the correlation D in Table 2. 

In the text 
Fig. 5 Time series of observed Ap and bestfits to Ap using Pα estimates, both using annual averaging intervals. The fitted series use linear regressions and are denoted by a prime: tests for homoskedasticity, normal distribution of residuals, linearity are all readily passed because correlation coefficients are so high. (a) and (b) are for 1996–2016, (c) and (d) for 1964–2016. Panels (b) and (d) show the data availability, giving the fraction f of hours for which an hourly mean value of Pα is available. In (a) and (c) the green line gives the means of all the Ap data, 〈Ap〉_{τ=1yr} and the blue line gives the means of only the Ap data that are coincident with valid Pα data 〈Ap_{c}〉_{τ=1yr}. (Note that in (a) the blue line is essentially identical to the green and so is completely overwritten by it). In (a) the black line uses annual means (τ = 1 yr) of Pα values computed from 5minute means of interplanetary data, 〈Pα〉′_{τ=1yr} (i.e., the data are combined and then averaged) and is based on the correlation A in Table 2. The red line uses Pα_{ann} which is compiled using annual means of all available N_{SW}, V_{SW}, B and sin^{4}(θ/2) data (i.e., the data are averaged and then combined) and is based on correlation B in Table 2. In panel (d) the black line uses 〈Pα〉_{τ=1yr} and 〈Ap_{c}〉_{τ=1yr} and is based on correlation E in Table 2 whereas the red line uses Pα_{ann} and 〈Ap_{c}〉_{τ=1yr} and is based on correlation D in Table 2. 

In the text 
Fig. 6 Analysis of Ap distributions over 1964–2016. (Top) The fraction of time in each year, f, for which a valid Ap is available. (Middle) Annual probability distribution function (pdfs) of dailymean Ap (〈Ap〉_{τ=1day} ) are colour coded and the black line gives the annual means of Ap, 〈Ap〉_{τ=1yr} . Probabilities are evaluated in bins of Ap that are 1 nT wide. (Bottom) Annual probability distribution function (pdfs) of normalised daily Ap, 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} , are colour coded and the black gives the annual means of this normalised Ap which are unity. Probabilities are evaluated in bins of normalised Ap that are 0.1 wide. 

In the text 
Fig. 7 Same as Figure 5 for the estimated power input into the magnetosphere, Pα. (Top) The fraction of time in each year, f, for which a valid Pα value is available. (Middle) Annual probability distribution function (pdfs) of daily means of Pα/Po (〈Pα〉_{τ=1day}/Po ) are colour coded and the black line gives the annual means of 〈Pα〉_{τ=1yr}/Po . Probabilities are evaluated in bins of Pα/Po that are 0.1 wide and years for which f < 0.5 are omitted. (Bottom) Annual probability distribution function (pdfs) of normalised daily Pα, 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} , are colour coded and the black gives the annual means of this normalised Pα which is unity. Probabilities are evaluated in bins of normalised Pα that are 0.1 wide. 

In the text 
Fig. 8 (Grey histograms) Average pdfs for all years of (a) 〈Ap〉_{τ=1day}/〈Ap〉_{τ=1yr} and (b) 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} . In each case the bestfit lognormal distribution is shown, with mean of unity (by definition) and a variance v = 0.934 in (a) and v =0.913 in (b). Distributions for individual years (not shown) are a bit noisier, particularly for 〈Pα〉_{τ=1day}/〈Pα〉_{τ=1yr} in years when data availability f is low, but give very similar best fit v values, as indicated by the horizontal uniformity of the bottom panels of figures (6) and (7): for the annual normalised Ap distributions, the mean ± 1 standard deviation of the fitted v values for the 53 years is 0.934 ± 0.031; for the annual normalised Pα distributions, v values for the 31 years with f > 0.5 is 0.915 ± 0.025. 

In the text 
Fig. 9 Variation of the fraction f_{p} of daily means of Pα in one year that are in the top 5% (solid points) and top 1% (open triangles) of all the 12198 available daily means of Pα for the interval 1964–2016, shown as a function of the annual mean (〈Pα〉_{τ=1yr}/Po). These data use the optimum α of 0.53. The annual fraction f_{p}[〈Pα〉_{τ=1day} ≥ Pα_{pc}] is shown for two fixed thresholds: Pα_{pc} = 2.38Po (which defines the 95th percentile of the cumulative probability distribution of all available 〈Pα〉_{τ=1day} values from the interval 1964–2016 which number 12198); and Pα_{pc} = 3.77Po (which defines the 99th percentile of the same cdf). Points for these two thresholds are shown by solid circles and open triangles, respectively. Only the 31 years in which data availability f exceeds 50% are shown which account for 10186 samples out of the available total of 12198 (the other years show the same trend but the smaller number of samples increases the scatter). The solid line is a third order polynomial fit to the data points for Pα_{pc} = 2.38Po and the dashed line is that for Pα_{pc} = 3.77Po. Both reveal that fp shows an increasing trend with the annual mean value, but it is not, in general, linear. The smaller numbers associated with higher percentiles of the cdf mean that the scatter is increased. 

In the text 
Fig. 10 Time series of modelled annual values on nearEarth interplanetary parameters over 1612–2015 by Owens et al. (2017): (a) the input sunspot number data series, R_{C}; (b) the IMF field strength, B; (c) the solar wind speed, V_{SW}; (d) the solar wind number density, Nsw. Panel (e) shows the Earth's magnetic dipole moment, M_{E}, from a spline of the gufm1 (for before 1900) and IGRF (for after 1900) models . 

In the text 
Fig. 11 Predicted variations of annual Pα/Po values for six values of the coupling exponent α: (a) α = 0; (b) α = 1/3; (c) α = 2/3; (d) α =1; (e) α = 4/3; and (f) α = 5/3. 

In the text 
Fig. 12 Annual pdfs and mean values of the normalised power input to the magnetosphere, Pα/Po for 1612–2015 and for the bestfit coupling exponent for the Ap index of 0.53. The colour contours give the pdf, evaluated for bins in Pα/Po that are of width 0.01. The Po used here is the average value of Pα from spacecraft measurements over 1964–2016 (inclusive). 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.