Semi-annual, annual and Universal Time variations in the magnetosphere and in geomagnetic activity: 2. Response to solar wind power input and relationships with solar wind dynamic pressure and magnetospheric flux transport



1 Introduction
Paper 1 in this series (Lockwood et al., 2020, hereafter "Paper 1") reviewed the semi-annual variation in geomagnetic activity, as seen in a number of indices, and highlighted a number of puzzles that require explanation. We study these empirically in the present paper using the power input into the magnetosphere, P a , deduced from interplanetary measurements using the formula that was originally derived theoretically by Vasyliunas et al. (1982) from dimensional analysis. This formula is given by equation (2) of Paper 1 and the derivation has also recently been given, and expanded upon, by Lockwood (2019). Analysis by Finch & Lockwood (2007) showed that P a performed consistently better than a basket of other widely-used solar-wind/magnetosphere coupling functions on all averaging timescales tested (between 1 day and 1 year). An important consideration for all coupling functions was studied by Lockwood et al. (2019a), namely the effect of data gaps. These have often been ignored on the grounds that their effects average out, but this was shown not to be the case by Lockwood et al. (2019a), who introduced synthetic data gaps into near-continuous data. This analysis revealed that data gaps add noise to the correlation studies and lead to erroneous fits. In particular, there is a problem with "overfitting", whereby the use of too many free fit parameters can generate seemingly good fits to the training data by fitting to the noise, leading to incorrect fits that have reduced, little or, in extreme cases, even zero predictive power when applied to data other than the training dataset. Overfitting is a problem that is well recognized in disciplines such as climate science and population studies but has not often been considered in space physics. A key point about P a is that it has just one free fit parameter, the coupling exponent a, and this minimises the risk of overfitting. At the same time, P a makes allowance for solar wind speed, V SW , number density, N SW , mean ion mass, m SW , and Interplanetary Magnetic Field (IMF) strength, B, as well as an orientation factor, A h . The test devised by Vasyliunas et al. (1982) shows that the optimum form of A h is sin^4(h/2), where h is the IMF clock angle in the GSM frame (see the supporting information file to Lockwood et al., 2019b and Sect. 1.1). These factors could all be given their own fit exponent but this greatly increases the risk of overfitting; to avoid this, P a uses the theory to make the exponent of each variable a fixed function of the one free fit parameter, a.
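To illustrate how a single free exponent can fix the exponents of all the solar wind variables, the following is a minimal sketch rather than the formula of Paper 1 itself. It assumes the proportionality usually quoted for the Vasyliunas et al. (1982) formulation, in which B is raised to the power 2a, the mass density (m SW N SW) to (2/3 − a) and V SW to (7/3 − 2a), multiplied by sin^4(h/2). Those exponents, the function names and the omission of the normalising constant are assumptions made here; equation (2) of Paper 1 remains the authority.

```python
import math


def orientation_factor(theta_deg):
    """IMF orientation factor sin^4(theta/2) for a clock angle in degrees."""
    return math.sin(math.radians(theta_deg) / 2.0) ** 4


def power_input_relative(b, v_sw, n_sw, m_sw, theta_deg, alpha):
    """Un-normalised power input with one free parameter, alpha.

    ASSUMPTION: the exponents below are the commonly quoted form of the
    Vasyliunas et al. (1982) dimensional-analysis result; they are not
    stated in the text and should be checked against equation (2) of Paper 1.
    Units are arbitrary because the normalising constant is omitted.
    """
    return (
        b ** (2.0 * alpha)
        * (m_sw * n_sw) ** (2.0 / 3.0 - alpha)
        * v_sw ** (7.0 / 3.0 - 2.0 * alpha)
        * orientation_factor(theta_deg)
    )
```

Because every exponent is tied to the one value of alpha, fitting this function to a geomagnetic index involves a one-dimensional search only, which is the property the text identifies as limiting the risk of overfitting.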
As also discussed in Paper 1, P a is based on the dominant energy flux in the solar wind, namely the bulk-flow kinetic energy flux of the particles, whereas some other coupling functions incorrectly use the Poynting flux in the solar wind, which is very small in comparison. The energy input to the magnetosphere across the magnetopause is in the form of Poynting flux, but most of that is generated from the bulk-flow solar wind kinetic energy by currents that flow in the bow shock and magnetosheath (Cowley, 1991; Lockwood, 2004, 2019; Pulkkinen et al., 2016). Lockwood (2019) has recently added the minority solar wind Poynting flux to the Vasyliunas et al. (1982) formulation for P a at the expense of adding a second free fit parameter. The improvements to the correlations with geomagnetic activity are sometimes statistically significant but only on short timescales and they are always extremely small. We do not include this additional solar wind Poynting flux term here because it increases the risk of overfitting.
One particular puzzle about the semi-annual variation in geomagnetic activity is the origin of the equinoctial pattern of geomagnetic response with time-of-year, F, and Universal Time, UT (de La Sayette & Berthelier, 1996; Cliver et al., 2000; Chambodut et al., 2013). This has been a matter of debate, with a large number of mechanisms proposed as to how the implied association with Earth's dipole tilt originates, and it is not clear which, if any, of the proposed mechanisms is active (see review in Sect. 1 of Paper 1).
A related, unexplained feature, also highlighted by Paper 1, is that the semi-annual variation in geomagnetic activity is undoubtedly caused by the modulation of power input into the magnetosphere, P a , by the Russell-McPherron (R-M) effect (Russell & McPherron, 1973) but is amplified, being of larger fractional amplitude in most geomagnetic indices than in P a . The need for some such amplification was noted by Russell and McPherron in their original paper but the mechanism responsible for it has never been identified.
There is a third major puzzle concerning large storms. The R-M mechanism is based on the idea that the IMF lies close to its average orientation, parallel to the solar equatorial plane (the XY plane of the Geocentric Solar Equatorial, GSEQ, reference frame) and that the component in the +Y or −Y direction of GSEQ is converted into "geoeffective" southward field in the Geocentric Solar Magnetospheric (GSM) reference frame by the tilt of Earth's magnetic moment in the YZ plane. This tilt sets the consequent rotation of the GSM frame with respect to the GSEQ frame about their common X-axis. The GSEQ and GSM reference frames are organised with respect to the solar rotation axis and the Earth's magnetic dipole axis, respectively, and formal definitions and discussion of both are given in Paper 1. The R-M effect predicts that, through the dipole tilt effects on the rotation between the GSEQ and GSM frames, IMF with [B Y ] GSEQ < 0 gives southward field in the GSM frame ([B Z ] GSM < 0) around the March equinox, whereas [B Y ] GSEQ > 0 generates southward field in GSM around the September equinox. Hence, as noted by Zhao & Zong (2012), splitting the data into these two polarities of [B Y ] GSEQ allows us to directly identify the contribution of the R-M effect by which of the two equinoxes is enhanced by which polarity of [B Y ] GSEQ . We here refer to the [B Y ] GSEQ polarity which, at a given equinox, increases/decreases the southward [B Z ] GSM (and hence increases/decreases geomagnetic activity) as the "favoured"/"unfavoured" polarity. Paper 1 uses this dependence on the IMF [B Y ] GSEQ polarity to reveal that the R-M effect is at the heart of the equinox peaks of several parameters, including: the mean power input into the magnetosphere, P a ; the mean of the am geomagnetic index; and the occurrence of both P a and am values in the top 5% of their overall distributions, f[P a > q(0.95)] and f[am > q(0.95)], respectively. Interestingly, this means that the R-M effect is at the heart of even the equinoctial-like time-of-day (UT)/time-of-year (F) patterns seen for geomagnetic data, which, as discussed in Paper 1, are distinct from the pattern that is predicted for the R-M effect. Hence, although the equinoctial patterns indicate that another mechanism is at work, the core cause of their semi-annual variation is undoubtedly the R-M effect.
The occurrence of large storms generates another puzzle in relation to the R-M effect because it has long been recognized that the largest storms are produced by interplanetary disturbances during which the IMF, at some stage at least, points strongly southward in the GSM frame (e.g., Taylor et al., 1994; Webb et al., 2000; Echer et al., 2005; Xie et al., 2006; Kilpua et al., 2017; Li & Yao, 2020). Crooker et al. (1992) proposed a mechanism to give the observed increased occurrence of large storms at the equinoxes that is consistent with the R-M effect by arguing that in the CME sheath the IMF remains in the equatorial plane of the GSEQ frame but its component in the ±Y direction is enhanced by the compression due to the event. This enhanced equatorial [B Y ] GSEQ is then converted into enhanced southward field in the GSM frame by the R-M effect. However, Lockwood et al. (2016) and Paper 1 show unambiguously that large southward field in GSM is almost always associated with large southward field in GSEQ, and not the near-zero southward field that Russell & McPherron (1973) and Crooker et al. (1992) invoked in applying the R-M mechanism. Appendix B of Paper 1 shows that when there is a dominant southward component of the IMF in the GSEQ frame, far from enhancing the geoeffective southward component in GSM, the R-M effect actually reduces it. Paper 1 therefore highlights a paradox: although the R-M effect is clearly at work to enhance average geomagnetic disturbance levels and the occurrence of small and moderate geomagnetic disturbances, large disturbances are driven by out-of-equatorial southward field in GSEQ (usually deflected out of the solar equatorial plane by a CME or CIR), for which the R-M effect actually tends to reduce the southward component in GSM and hence the geo-effectiveness of the event. Hence the observed equinoctial peaks in the occurrence of large storms are not attributable to the R-M effect, at least not directly, in the same way that the peaks for smaller disturbances and average conditions are.
This point is further emphasised here by Figure 1, which is an update of the plot by Lockwood et al. (2016). It compares the northward component of the IMF in the GSEQ frame ([B Z ] GSEQ ) to that in the GSM frame ([B Z ] GSM ), both averaged over 3-h intervals that are 1 h ahead of the 3-h intervals in which the geomagnetic activity level is measured (here quantified by the am index). In other words, we have employed an average response lag of the am index to the IMF orientation of dt o = 1 h (see Sect. 3.3 below). The plots show the numbers of samples, N, in bins of width 2 nT in both [B Z ] GSEQ and [B Z ] GSM . The data cover 1995-2017, inclusive, the interval of near-continuous IMF observations. Results are shown for: (a) all data; (b) am exceeding its 90-percentile, am > q(90) (i.e., am is in the top 10% of all its measured values); (c) am exceeding its 95-percentile, am > q(95); and (d) am exceeding its 99-percentile, am > q(99). Figure 1 confirms the result presented in Paper 1, that large [B Z ] GSEQ < 0 is actually the dominant driver of large southward field in the GSM frame ([B Z ] GSM < 0) and the R-M effect gives relatively minor deviations from the diagonal mauve line that is at [B Z ] GSEQ = [B Z ] GSM . It can be seen that for larger geomagnetic events, the IMF data are increasingly in the [B Z ] GSEQ < 0, [B Z ] GSM < 0 sector. Note that there are some exceptions to this, with a small number of values in the [B Z ] GSEQ > 0, [B Z ] GSM > 0 sector: these can be attributed to the fact that using the prior IMF orientation averaged over 3 h with a lag dt o of 1 h is not always appropriate (see Sect. 3.3). The pure R-M effect invokes [B Z ] GSEQ = 0 and so points would lie along the vertical white line only: the small deviations from the diagonal show that the R-M effect is contributing much less to [B Z ] GSM < 0 than does [B Z ] GSEQ < 0 in most cases. In Figure 1a (for all data), the R-M effect causes both increases and decreases in [B Z ] GSM around [B Z ] GSEQ = 0, as expected, depending on the polarity of the IMF [B Y ] GSEQ . For small, moderate and some large geomagnetic disturbances we see more points below the mauve diagonal line at small [B Z ] GSEQ , which demonstrates that the R-M effect is operating (by making [B Z ] GSM more negative during the favourable polarity of [B Y ] GSEQ and so enhancing the am index). However, for large disturbances (Fig. 1d) and strongly negative [B Z ] GSEQ there are actually more points above the diagonal mauve line than below it, showing the R-M effect is tending to reduce the southward field in GSM in these cases, as predicted in Appendix B of Paper 1.
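The construction of a plot like Figure 1 can be sketched as a two-dimensional histogram of the lagged 3-h means, optionally restricted to intervals in which am exceeds a chosen quantile. This is an illustrative sketch only: the function name, the ±20 nT plotting range and the assumption that the three input arrays are already time-aligned (with the 1-h lag applied) are choices made here, not details taken from the paper.

```python
import numpy as np


def bz_histogram(bz_gseq, bz_gsm, am, quantile=None, bin_width=2.0, limit=20.0):
    """Count samples in (BZ_GSEQ, BZ_GSM) bins of the given width (nT).

    The inputs are assumed to be 3-h means that are already aligned in time.
    If `quantile` is given (e.g. 0.90), only samples with am above that
    quantile of the valid am values are counted.
    """
    bz_gseq = np.asarray(bz_gseq, dtype=float)
    bz_gsm = np.asarray(bz_gsm, dtype=float)
    am = np.asarray(am, dtype=float)

    # A gap in any one series removes the sample from all of them.
    keep = np.isfinite(bz_gseq) & np.isfinite(bz_gsm) & np.isfinite(am)
    if quantile is not None:
        threshold = np.quantile(am[keep], quantile)
        keep &= am > threshold

    # Hypothetical plotting range of +/- limit nT; samples outside are dropped.
    edges = np.arange(-limit, limit + bin_width, bin_width)
    counts, _, _ = np.histogram2d(bz_gseq[keep], bz_gsm[keep], bins=[edges, edges])
    return counts, edges
```

Calling the function with `quantile=None`, `0.90`, `0.95` and `0.99` would give the counts for the four selections described for panels (a) to (d).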
1.1 The paradox of the R-M effect and of southward field in the GSEQ frame
Figure 2 is a demonstration of the important point made in Appendix B of Paper 1. The graphs are taken from the supporting information to Lockwood et al. (2019b), in which their derivation is explained in greater detail. This plot demonstrates the test of IMF orientation factors, A h , in solar-wind magnetosphere coupling functions that was devised by Vasyliunas et al. (1982) and evaluates the various proposed forms for A h . The black dots are based on the geomagnetic SML index data (the SuperMAG version of the auroral AL index but compiled from a northern hemisphere network of over 100 stations) and show a linear regression of SML/G against h, where G is the best-fit estimate of the power input into the magnetosphere from solar wind data but without an IMF orientation factor (i.e., G = P a /A h ) and h is the IMF "clock angle" in the GSM frame, h = arctan(|[B Y ] GSM |/[B Z ] GSM ). Note that this definition means that h is independent of the polarity of [B Y ] GSM and that h varies from zero for purely northward IMF in GSM ([B Z ] GSM = B YZ , where B YZ is the magnitude of the field in the YZ plane that is the same in the GSEQ, GSE, and GSM reference frames) to 180° for purely southward IMF in GSM ([B Z ] GSM = −B YZ ). It can be seen from Figure 2 that the optimum fit to the SML data is for A h = sin^4(h/2) (the blue line). Almost identical plots to Figure 2 for the AE and am geomagnetic indices have been presented in Figure 4 of Bargatze et al. (1986) and Figure 9 [...]. Russell & McPherron (1973) used the average IMF orientation with [B Z ] GSEQ = 0, for which the rotation between the GSEQ and GSM frames will cause h to vary over a year (with fraction of the year F and UT) between 90° − 33.5° = 56.5° and 90° + 33.5° = 123.5° (the range shaded light pink in Fig. 2). In addition, the area shaded dark pink shows the range for the annual effect of the Earth's orbit without the UT effect. In these ranges of h, the optimum A h is increasing in gradient (i.e., d^2 A h /dh^2 > 0). This is true for all the proposed coupling functions shown except sin^2(h/2) (in orange), which we do not consider as it is not a good fit to the data (for all tested geomagnetic indices, the A h = sin^4(h/2) formulation yields the best fit). The original R-M paper employed a "half-wave rectified" coupling function such as U(h) cos(h) (shown in black), where U(h) = 0 for [B Z ] GSM > 0 and U(h) = 1 for [B Z ] GSM ≤ 0, for which the change in slope (with d^2 A h /dh^2 > 0) all takes place at h = 90°. The nonlinearity of the A h (h) curve around h = 90° is vital to the R-M explanation of the semi-annual variation. This is because the distribution of h values in the GSM frame will be approximately symmetric around a mode value of 90°. However, the nonlinearity in A h means that increases in geomagnetic activity for h > 90° (caused by the "favourable" polarity of [B Y ] GSEQ at a given F) will be greater in amplitude than the decreases caused by h < 90° (caused by the other, "unfavourable" [B Y ] GSEQ polarity). Hence the net effect in this case is a rise in average activity and this provides the R-M explanation of the semi-annual variation even though the occurrence of favourable [B Y ] GSEQ is the same as that of unfavourable [B Y ] GSEQ . The same applies for all coupling functions that show d^2 A h /dh^2 > 0 around h = 90°. However, contrast this to what happens if [B Z ] GSEQ is not zero but rather is strongly negative, as is the case for the solar wind transients that drive most major geomagnetic disturbances (as demonstrated by Fig. 1). At h near 120° all the A h coupling functions become approximately linear (d^2 A h /dh^2 ≈ 0). This means that there is no longer an R-M effect because the effects of the "favourable" increases in h are cancelled by the effects of the "unfavourable" decreases in h. At the largest h, d^2 A h /dh^2 < 0. Here the dark and light orange regions are the ranges in h caused by the annual and the annual-plus-UT variations in dipole tilt, respectively, for [B Z ] GSEQ = −B YZ (purely southward field normal to the solar equator). In this case, any effect of the dipole tilt reduces the value of A h and so has the opposite effect to the classical R-M effect (which relies on [B Z ] GSEQ being near zero). This point is made in greater detail in Appendix B to Paper 1.
The key point is that the non-linearity in A h , upon which the R-M effect depends, is present only for small [B Z ] GSEQ . A large [B Z ] GSEQ contribution to [B Z ] GSM during large geomagnetic events (as shown in Fig. 1) means that the coupling function is far removed from the part of the response curve with the required non-linearity. This leads to either a reduced effect, no effect or even an effect in the opposite sense to the R-M effect, as it is presently understood (which applies only when [B Z ] GSEQ is small). This means that, because of the increased role of non-zero [B Z ] GSEQ , the R-M effect on the semi-annual variation should be weaker or absent for the occurrence of larger events, the opposite of the behaviour that is observed (as shown in the original paper by Russell & McPherron, 1973 and, for example, by Paper 1). Hence Figures 1 and 2 present a considerable puzzle in terms of understanding the equinox peaks in the occurrence of large storms that are driven by IMF that is deflected strongly southward by transient events such as CMEs and CIRs.
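The argument of this section can be checked numerically with a short sketch. It rotates an IMF vector from GSEQ to GSM about the common X-axis by an angle beta, evaluates the clock angle and the sin^4(h/2) factor, and averages over the two [B Y ] GSEQ polarities. The sign convention of the rotation, the function names and the 20° test angle are assumptions made for illustration; `atan2` is used so that the clock angle spans 0° to 180° as described in the text.

```python
import math


def clock_angle_deg(by, bz):
    """Clock angle in degrees: 0 for purely northward, 180 for purely southward."""
    return math.degrees(math.atan2(abs(by), bz))


def orientation_factor(theta_deg):
    """The sin^4(theta/2) orientation factor."""
    return math.sin(math.radians(theta_deg) / 2.0) ** 4


def rotate_to_gsm(by_gseq, bz_gseq, beta_deg):
    """Rotate (BY, BZ) about X by beta. ASSUMED sign convention, for illustration."""
    beta = math.radians(beta_deg)
    by_gsm = by_gseq * math.cos(beta) + bz_gseq * math.sin(beta)
    bz_gsm = -by_gseq * math.sin(beta) + bz_gseq * math.cos(beta)
    return by_gsm, bz_gsm


def mean_factor_over_polarities(by_mag, bz_gseq, beta_deg):
    """Average the orientation factor over equal occurrence of +BY and -BY."""
    total = 0.0
    for by in (by_mag, -by_mag):
        by_gsm, bz_gsm = rotate_to_gsm(by, bz_gseq, beta_deg)
        total += orientation_factor(clock_angle_deg(by_gsm, bz_gsm))
    return 0.5 * total
```

With [B Z ] GSEQ = 0 the two polarities give clock angles of 90° ± beta and the polarity-averaged factor exceeds its 90° value, which is the classical R-M enhancement. For a GSEQ clock angle of 120° the average is almost unchanged, and for purely southward GSEQ field any non-zero beta lowers the factor below unity.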

1.2 The timing of the peaks of the semi-annual variation
Large southward field in the GSEQ frame during large geomagnetic disturbances has another implication. As discussed in Paper 1, many studies attempt to use the precise timing of the equinox peaks in average geomagnetic activity levels, or in the occurrence of storms, as a potential discriminator of the various mechanisms (Russell & McPherron, 1973; Le Mouël et al., 2004). However, the random nature of events of large negative [B Z ] GSEQ hitting the Earth's magnetosphere means that data from a great many years are needed before a systematic peak can be observed such that the precise timing of the peak can be properly assessed (Russell & McPherron, 1973).

1.3 The role of solar wind dynamic pressure
Another factor that the literature indicates that we should consider is solar wind dynamic pressure (p SW = m SW N SW V SW^2, where m SW is the mean ion mass, N SW the number density and V SW the speed of the solar wind). Caan et al. (1973) showed that the magnetic energy density in the near-Earth tail lobes was increased by both prior intervals of southward-pointing IMF (i.e., substorm growth phases) and by increased p SW . Mid-latitude "range" indices (such as am) are strongly modulated by the auroral electrojet of the substorm current wedge and so have a strong correlation with indices such as AE, AL, SME and SML. Hence they are influenced by substorm expansion phases when the stored tail lobe energy is released (Adebesin, 2016; Lockwood et al., 2019a, 2019d; Appendix A of Paper 1). These mid-latitude range indices also have a strong dependence on V SW^2 and hence on p SW (Lockwood, 2013; Lockwood et al., 2014). Using the standard deviation of geomagnetic variations, Finch et al. (2008) found that the nightside auroral electrojet was the source both of the equinoctial F-UT pattern and of the V SW^2 dependence. Lockwood (2013) pointed out that this implies that p SW influences the auroral electrojet and the equinoctial F-UT pattern by constraining the near-Earth tail such that, on appending open flux to the tail lobes of the magnetosphere (during periods of southward IMF), the lobe field (and hence the stored energy density and magnetic shear across the near-Earth cross-tail current sheet) increases by a greater factor if p SW is large. This does not happen further down the tail where the magnetopause boundary becomes aligned with the solar wind flow: here the magnetic field, the energy density and the cross-tail current are all set by the static pressure in interplanetary space with no influence of p SW . At these large negative X GSE , adding open flux just causes the tail to flare in cross-sectional area, and the field in the lobes and the magnetic shear across (and hence total current in) the cross-tail current sheet all remain constant for a given solar wind static pressure.
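As a worked illustration of the dynamic pressure definition, the sketch below evaluates p SW = m SW N SW V SW^2 in nanopascals from the units in which solar wind data are usually tabulated. The function name, the choice of input units and the default mean ion mass of 1 amu are assumptions made here for the example.

```python
AMU_KG = 1.66053907e-27  # atomic mass unit in kg


def dynamic_pressure_npa(n_cm3, v_kms, mean_ion_mass_amu=1.0):
    """Solar wind dynamic pressure m*N*V^2 in nPa.

    n_cm3: number density in cm^-3; v_kms: speed in km/s;
    mean_ion_mass_amu: mean ion mass in amu (default of 1.0 is illustrative).
    """
    mass_kg = mean_ion_mass_amu * AMU_KG
    density_m3 = n_cm3 * 1.0e6      # cm^-3 -> m^-3
    speed_ms = v_kms * 1.0e3        # km/s -> m/s
    return mass_kg * density_m3 * speed_ms ** 2 * 1.0e9  # Pa -> nPa
```

For example, N SW = 5 cm^-3 and V SW = 400 km/s with a mean ion mass of 1 amu give a pressure of about 1.33 nPa, and doubling the speed at fixed density quadruples it, which is why a V SW^2 dependence and a p SW dependence are hard to separate.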
An important effect of p SW was demonstrated directly by Karlsson et al. (2000), who showed that near-Earth tail energy content was reduced if p SW decreased and that such sudden decreases caused quenching of any substorm expansion that had recently begun. Conversely, increases in p SW have been seen to trigger onsets of full substorm expansion phases (Schieldge & Siscoe, 1970; Kokubun et al., 1977; Yue et al., 2010). Various studies suggest that increased p SW enhances general magnetospheric convection and field-aligned current systems as well as geomagnetic activity (e.g., Lukianova, 2003; Lee et al., 2004; Palmroth et al., 2004; Boudouridis et al., 2005; Stauning & Troshichev, 2008): this is additional to, and separate from, the known generation of transient filamentary field-aligned currents and travelling convection vortices by the boundary deformation (e.g., Lühr et al., 1996). Some of these observations are interpreted as being the result of enhanced magnetopause reconnection; however, in many cases the response delay appears to be too long for this to be the explanation.

1.4 The F-UT response of the am index and the ar indices
For studies of the F-UT pattern of geomagnetic activity, the am index (Mayaud, 1980) is by far the best geomagnetic index to employ because it is based on data from longitudinal rings of magnetometers in both hemispheres that are as uniform as possible (and deploys weighting functions to reduce the effects of necessary non-uniformity of the rings). This gives the am index an exceptionally uniform F-UT response pattern, especially at higher levels of geomagnetic activity, as demonstrated by the modelling of the index response by Lockwood et al. (2019d). As in Paper 1, we also make use of the ar indices (Chambodut et al., 2013), which are based on the same data as the am index but restrict the data used to stations that are within 6-h magnetic local time (MLT) sectors around dawn (03-09 h MLT, giving ar dawn ), noon (09-15 h MLT, giving ar noon ), dusk (15-21 h MLT, giving ar dusk ), and midnight (21-24 and 00-03 h MLT, giving ar midnight ).
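The MLT sector boundaries quoted above can be written as a small lookup, sketched here for illustration. The function name and the treatment of the boundaries as half-open intervals (a station at exactly 09 h MLT is assigned to noon) are assumptions; the paper does not say how boundary values are assigned.

```python
def ar_sector(mlt_hours):
    """Return the ar-index MLT sector name for a magnetic local time in hours."""
    mlt = mlt_hours % 24.0          # wrap so that 24 h is treated as 0 h
    if 3.0 <= mlt < 9.0:
        return "dawn"
    if 9.0 <= mlt < 15.0:
        return "noon"
    if 15.0 <= mlt < 21.0:
        return "dusk"
    return "midnight"               # 21-24 h and 00-03 h MLT
```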

1.5 The aims of the present paper
The main aim of the present paper is to establish some observed behaviours of the semi-annual variation. We start with an initial look in Section 2 at the relationship between annual and semi-annual variations of magnetospheric flux transport (convection) and geomagnetic activity. This is an initial and interim study of a limited dataset but still reveals some important behaviours. In Section 3 we investigate the geomagnetic response to power input into the magnetosphere, including an analysis of response delays as a function of activity level and time of year, F. In Section 4, we study annual variations in the geomagnetic response to events of large power input to the magnetosphere and to average power input, both with and without the separation by the polarity of the prevailing IMF [B Y ] GSEQ component that enables us to identify the R-M effect. We study variations with the solar wind dynamic pressure p SW in Section 5. However, we do not isolate the effects of solar wind mean ion mass, m SW , number density, N SW , or speed, V SW , independently, and we note that any dependencies on p SW that are detected could arise from a dependence on different combinations of these parameters, potentially leading to different interpretations. The literature described in Section 1.3 strongly suggests that dynamic pressure, as a physical entity, plays a role in magnetospheric responses by squeezing the near-Earth tail; however, the discussion of such effects will be in a later paper that uses magnetopause and magnetospheric models to discuss the mechanisms in the light of the empirical results presented here.
2 Initial survey of semi-annual variations in power input into the magnetosphere and its response in geomagnetic activity and magnetospheric flux circulation
In this section, we make use of a database of 20,430 polar cap traversals made by DMSP satellites during 2001 and 2002, analysed using the procedure of Lockwood et al. (2009). We use only passes that cross the dawnside convection reversal boundary at 02-10 MLT and the duskside convection reversal boundary at 14-22 MLT, and most of the data come from the DMSP-F13 satellite, which was in a suitable orbit. Figure 3 shows the semi-annual variations in various parameters in fully simultaneous data: in each data sequence, a data gap in any one parameter is introduced into all other data sequences and, in each case, the data have been averaged into 365 equal-sized bins of time-of-year F (i.e., of one-day duration for the non-leap years used). This yields between 48 and 63 samples in each 1-day bin, with an average of just under 56. A running boxcar mean was then taken over 27 bins to cover whole solar rotation intervals and give an even mix of the two polarities of the Y-component of the IMF. The dawn-to-dusk component (E DD , in the +Y GSM direction) of the electric field in interplanetary space (E SW = −V SW × B IMF ) is shown in Figure 3a. This has been half-wave rectified such that all negative (i.e., dusk-to-dawn) E DD values are set to zero. Given that Figure 3 shows averages over 27-day intervals, we should expect steady state to apply, in which case, by Faraday's law, the electric field would be curl-free and map from interplanetary space into the ionosphere down the open polar cap field lines. The AU index (Fig. 3b) shows only a strong annual variation due to the dayside conductivity enhancement in summer, which is also present in AL (Fig. 3c) but to a much lesser extent, so that the semi-annual variation can be seen. The polar cap flux, W PC (Fig. 3d), and dawn-dusk transpolar voltage, U PC (Fig. 3e), are derived from the convection reversal boundaries detected by the DMSP satellites using the procedure described by Lockwood et al. (2009): in both cases the black line is for all passes and the red and blue lines are for northern and southern polar cap passes, respectively. The procedure applies a statistically-derived correction to the observed difference in potential between the dawnside and duskside convection reversal boundaries to normalise to the value U PC for an ideal 06-18 MLT orbit. Note that U PC is often referred to as the "cross-cap potential" but is, in reality, a potential difference and hence a voltage: we here use the term "transpolar voltage". Note also that, by Faraday's law, a voltage is physically identical to a magnetic flux transfer rate. The statistical model of Lockwood et al. (2009) also gives the variable polar cap shape, which is used with the observed polar cap diameter to generate estimates of the polar cap flux, W PC . All three lines (for northern- and southern-hemisphere data separately and all data) are similar in Figure 3d, as indeed they should be given that, by Maxwell's equation ∇·B = 0, the total open flux in the two hemispheres is always identical. The small difference between the independent values for the two hemispheres gives strong support to the method used to compute W PC . In addition, for these 27-day averaging intervals, the total rate of flux transfer across the polar caps (the transpolar voltages) should also be very similar, as they are seen to be in Figure 3e.
(On shorter timescales non-steady-state effects can become apparent and the flux transfer rate across the central dawn-dusk diameters of the two polar caps does not have to be the same at any one instant.) Because of the long averaging timescale, the reconnection efficiency can be taken to be the ratio of the voltages across the polar cap and the whole magnetosphere, g 15 = U PC /(2E DD R*), which in Figure 3f is computed assuming that the cross-sectional radius of the magnetosphere is constant at R* = 15 R E (where a mean Earth radius is 1 R E = 6370 km). The value of g 15 derived is relatively constant over the year and does not show a semi-annual variation. Figure 3g shows the corresponding values of the am index (linearly interpolated from the 3-hourly data to the central time of each polar cap pass).
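The processing steps described for Figure 3 (half-wave rectification of E DD , the 27-bin running mean and the efficiency ratio g 15 ) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, the solar wind flow is assumed to be purely anti-sunward so that E DD = −V SW [B Z ] GSM , and the running mean is wrapped cyclically across the year boundary, which the text does not specify.

```python
import numpy as np

R_E_M = 6.370e6  # mean Earth radius in metres, as quoted in the text


def rectified_dawn_dusk_field(v_sw_kms, bz_gsm_nt):
    """Half-wave rectified dawn-to-dusk field E_DD in mV/m.

    ASSUMES radial flow V = (-V_SW, 0, 0), so (-(V x B))_Y = -V_SW * B_Z.
    Dusk-to-dawn (negative) values are set to zero.
    """
    e_y = -(np.asarray(v_sw_kms, dtype=float) * 1.0e3) * (
        np.asarray(bz_gsm_nt, dtype=float) * 1.0e-9
    )  # V/m
    return np.maximum(e_y, 0.0) * 1.0e3


def cyclic_boxcar(values, width=27):
    """Running mean over `width` bins, wrapping across the ends (assumed)."""
    values = np.asarray(values, dtype=float)
    half = width // 2
    padded = np.concatenate([values[-half:], values, values[:half]])
    return np.convolve(padded, np.ones(width) / width, mode="valid")


def reconnection_efficiency(u_pc_kv, e_dd_mv_per_m, radius_re=15.0):
    """g = U_PC / (2 E_DD R*), with R* given in Earth radii."""
    voltage_across_kv = 2.0 * (e_dd_mv_per_m * 1.0e-3) * radius_re * R_E_M / 1.0e3
    return u_pc_kv / voltage_across_kv
```

For instance, E DD = 1 mV/m applied across a diameter of 30 R E corresponds to about 191 kV, so a transpolar voltage of 50 kV would give an efficiency of roughly 0.26.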
The bottom panel (Fig. 3h) plots the ratio am/U PC and shows that the semi-annual variation in am is amplified compared to that in magnetospheric convection. Figure 3 shows that the average semi-annual variations in E DD , U PC , W PC , −AL, and am are all very similar in waveform. Figure 4 demonstrates that the amplification of the semi-annual variation in am, relative to U PC , is a general feature of the am index by plotting daily means of am as a function of daily means of U PC . Note that during each day there are typically 25 polar cap satellite passes and there are 8 am measurements that are linearly interpolated to the central times of the passes. The error bars give the standard errors in these daily means. The mauve line is the best-fit 3rd-order polynomial and the grey area is the 2-σ uncertainty band in that fit. The Appendix details the ensemble fitting procedure used and gives the polynomials for the best fit and its 2-σ uncertainty limits. The Appendix also gives the corresponding fits for U PC as a function of am, which could be useful for studies of magnetospheric flux transfer (i.e., convection) that wish to make use of the geomagnetic activity data. The coefficients for the U PC^3 and am^3 terms in the polynomial fits (for am and U PC , respectively) are very small and the fitted variations are very close to being quadratic in form. It can be seen that the average am increases monotonically with average U PC , but not linearly, and this is consistent with the amplification of the semi-annual variation in am relative to that in U PC . Note that Figure 5 of Lockwood et al. (2019a) shows that daily means of the power input into the magnetosphere, P a , estimated from interplanetary measurements, have a linear relationship with daily means of the am index. If we assume that steady state applies to averages taken over one day, the magnetospheric power input, P a , would be equal to the total daily power deposited in the magnetosphere-ionosphere-thermosphere system. This averaging timescale would cover several substorm cycles, which are the dominant oscillation of the storage/release magnetospheric system, and so should largely smooth them out; however, we note that large storms often last longer than a day and so we should expect this relation to break down during such events. Hence, together, Figure 5 of Lockwood et al. (2019a) and Figure 4 of the current paper strongly suggest that the total power deposited has an approximately U PC^2 dependence on transpolar voltage.
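A fit of the kind shown in Figure 4 can be sketched with an ordinary least-squares cubic and a bootstrap estimate of its uncertainty. The bootstrap used here is a stand-in chosen for illustration: it is not the ensemble fitting procedure of the Appendix, and the function name, the number of resamples and the 50-point evaluation grid are all hypothetical.

```python
import numpy as np


def fit_cubic_with_band(u_pc, am, n_boot=200, seed=0):
    """Least-squares 3rd-order polynomial of am against U_PC.

    Returns (coefficients, grid, fitted values on grid, 2-sigma half-width).
    The band comes from a simple bootstrap over the data pairs, which is an
    ASSUMED substitute for the paper's ensemble procedure.
    """
    rng = np.random.default_rng(seed)
    u_pc = np.asarray(u_pc, dtype=float)
    am = np.asarray(am, dtype=float)

    best = np.polyfit(u_pc, am, 3)          # highest power first
    grid = np.linspace(u_pc.min(), u_pc.max(), 50)

    curves = np.empty((n_boot, grid.size))
    for i in range(n_boot):
        idx = rng.integers(0, u_pc.size, u_pc.size)   # resample with replacement
        curves[i] = np.polyval(np.polyfit(u_pc[idx], am[idx], 3), grid)

    return best, grid, np.polyval(best, grid), 2.0 * curves.std(axis=0)
```

If the cubic coefficient returned in `best[0]` is negligible compared with the quadratic one, the fit is effectively quadratic, which is the behaviour the text reports for the real data.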
[...] F13) and normalised for the satellite track to an ideal 06-18 MLT path using the procedure of Lockwood et al. (2009). The error bars are plus and minus one standard error in the means. The mauve line is the best-fit 3rd order polynomial and the grey area is bounded by the 2-sigma uncertainty level in that fit. The fitting procedure is given in the Appendix to this paper, along with polynomial expressions for the best fit and the uncertainty band edges. These can be used to estimate the am level associated with a given U PC ; the Appendix also gives the corresponding 3rd order polynomials that allow computation of the U PC value associated with a given am value.

Figure 5 presents time-of-year/time-of-day (F-UT) plots for transpolar voltage U PC and polar cap flux W PC for this dataset. Data are sorted into 24 equal-width bins in F and 16 equal-width bins of UT using the time of the centre of the polar cap crossing. The data are then smoothed with a 1-3-1 triangular-weighting filter, applied in both the F and UT dimensions. The top panels are for passes of the northern polar cap; the lower panels are for passes of the southern polar cap. Figures 5d and 5h show that the number of samples in each F-UT bin, n bin , is high and relatively constant, varying between 26 and 30. The transpolar voltage U PC (Figs. 5a and 5e) and, to a lesser extent, the polar cap flux W PC (Figs. 5b and 5f) show the semi-annual variation but no hint of any equinoctial pattern. However, we cannot draw any firm conclusion from this because neither does the am index for simultaneous data (Figs. 5c and 5g). In contrast, Paper 1 demonstrated that the equinoctial pattern is clearly present in larger am datasets. This will be revisited in a later paper that deploys a much longer database of transpolar voltage observations.
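The F-UT binning and 1-3-1 smoothing described for Figure 5 can be sketched as follows. This is a minimal illustration, not the processing code used for the figure: the function names are hypothetical, and the periodic wrap-around at the edges of both the F and UT axes is an assumption (the text does not state how the filter treats bin edges).

```python
import math

def bin_means(samples, n_f=24, n_ut=16):
    """Average (F, UT, value) samples on an n_f x n_ut grid.

    F is the fraction of year in [0, 1) and UT is in hours in [0, 24).
    Returns (means, counts); bins with no samples hold NaN.
    """
    sums = [[0.0] * n_ut for _ in range(n_f)]
    counts = [[0] * n_ut for _ in range(n_f)]
    for f, ut, value in samples:
        i = min(int(f * n_f), n_f - 1)
        j = min(int(ut / 24.0 * n_ut), n_ut - 1)
        sums[i][j] += value
        counts[i][j] += 1
    means = [[sums[i][j] / counts[i][j] if counts[i][j] else math.nan
              for j in range(n_ut)] for i in range(n_f)]
    return means, counts

def smooth_131(grid):
    """Apply a 1-3-1 triangular filter along F, then along UT.

    Weights are (1, 3, 1) / 5. Both axes are treated as periodic,
    which is an assumption made for this sketch.
    """
    n_f, n_ut = len(grid), len(grid[0])
    along_f = [[(grid[(i - 1) % n_f][j] + 3.0 * grid[i][j]
                 + grid[(i + 1) % n_f][j]) / 5.0
                for j in range(n_ut)] for i in range(n_f)]
    return [[(along_f[i][(j - 1) % n_ut] + 3.0 * along_f[i][j]
              + along_f[i][(j + 1) % n_ut]) / 5.0
             for j in range(n_ut)] for i in range(n_f)]
```

Because the weights sum to one along each axis, the filter preserves the grid total; any empty (NaN) bin would propagate into its neighbours and would need separate handling.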
The F-UT plots for U PC may not show an equinoctial variation, but they do show a fascinating UT variation with a major minimum at 0-10 UT at all times of year. Figure 5 shows this is seen in both hemispheres and it is also found in both of the two years when analysed in isolation (not shown). Paper 1 showed that in this UT band there is a persistent, but smaller, drop in geomagnetic indices and, notably, in am with its highly uniform F-UT response. There is also an interesting UT variation in the polar cap flux seen in both polar caps, with W PC almost in antiphase with U PC . This behaviour is seen at all times of year except early in the year (F ≈ 0.2). Again this is seen in both polar caps and in the data from both years when analysed separately.
For the expanding-contracting polar cap model of ionospheric convection excitation (Cowley & Lockwood, 1992), the transpolar voltage measured by a satellite that passes through the centre of the convection polar cap, in the approximation that the polar cap remains circular in shape, is given by Lockwood (1991): [...] Equation (2) [...] Hence the combined behaviour of a fall in U PC and a rise in W PC is qualitatively consistent with a fall in U N at around 22-05 UT. Figure 5 shows that the rise in W PC does lead the minimum in U PC by 2-3 h. This could potentially be explained by the initial motions in the ionospheric footprints of the reconnection lines and the consequent inductive decoupling of the voltages appearing along the X-lines and their ionospheric footprints, the "merging gaps". This means that ionospheric flows (and hence voltages) are not established straight away as the open/closed boundary moves (Morley & Lockwood, 2006; Lockwood et al., 2006). However, the rates of change dW PC /dt correspond to voltage differences of between about 5 and 10 kV only, whereas the drop in U PC is of order 25 kV, so quantitatively the two are not in such good agreement with a common cause of a drop in U N . This may arise from the limitations in the method used to derive W PC or may point to another factor contributing to the observed minimum in U PC at 0-10 UT. There is a systematic variation with UT of the orbit path of the main spacecraft employed (DMSP-F13) and that may have contributed to the 0-10 UT minimum in U PC . In theory, this should have been corrected for by the procedure of Lockwood et al. (2009) that normalises U PC measured for the actual orbit path to an ideal 06-18 MLT pass; however, if the correction is too small it could give an overestimation of the depth of the 0-10 UT minimum in U PC .
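For reference, the expanding-contracting polar cap relations are commonly written in the form below. This is a sketch in the notation used here and not a reproduction of the paper's own equations: U D (the dayside reconnection voltage) and U V (a viscous-like voltage) are symbols assumed for illustration, and the circular polar cap with a dawn-dusk satellite pass is taken as given.

```latex
\begin{align}
U_{PC} &= \tfrac{1}{2}\left(U_{D} + U_{N}\right) + U_{V} \\
\frac{dW_{PC}}{dt} &= U_{D} - U_{N}
\end{align}
```

Under these assumed forms, and with U D and U V held fixed, a fall of ΔU N in the nightside voltage lowers U PC by ΔU N /2 and raises dW PC /dt by ΔU N . A 25 kV drop in U PC would then imply a change in dW PC /dt of about 50 kV (50 kWb/s), several times the 5-10 kV inferred from the observed flux changes, which is the quantitative mismatch noted in the text.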
To summarise this section, we have presented an initial, limited study of the semi-annual variation in transpolar voltage and flux transfer. We find there is a clear semi-annual variation in transpolar voltage that closely mirrors the waveform of that in geomagnetic activity but is not as large in amplitude. This is consistent with the latter showing a non-linear (quadratic) variation with transpolar voltage. We find that during the persistent minimum of the UT variation in geomagnetic activity defined in Paper 1 there is a persistent decrease in transpolar voltage which may be, at least in part, consistent with a decrease in reconnection voltage in the cross-tail current sheet. Confirmation of these findings using a larger dataset of transpolar voltage measurements will be presented in a later paper in this series, but the great consistency with which we find the features described (in both polar caps and in each of the three years studied here) leads us to conclude they are probably real phenomena.

3 The geomagnetic response to power input to the magnetosphere

Estimation and distributions of power input
As in Paper 1, we use the power input to the magnetosphere, P a , computed from interplanetary measurements using the Vasyliunas et al. (1982) theoretical formulation. To remove some constants in the formulation we divide by the overall mean for the interval studied (1995-2017), P o = <P a > all . The correlations between P a /P o and the am index were studied for the near-continuous data after 1995 and are very high, being 0.978 for annual means, 0.932 for means over Carrington rotation intervals (≈27 days), 0.908 for daily means, and 0.842 for 3-hourly means (Lockwood, 2019).
Figure 6 shows the cumulative distribution functions (c.d.f.s) of P a /P o . The P a /P o data are divided into 288 bins: 36 equal-sized bins in fraction of year F and the same eight 3-h Universal Time (UT) bins over which the range am index is evaluated. For each F-UT bin the c.d.f. of P a /P o is coloured according to the mean value for that bin, <P a /P o > F,UT . Figure 6 corresponds to the plot for am presented in Figure 8b of Paper 1, covering the same interval, so these two families of c.d.f.s can be directly compared and are found to be very similar. The origin of the distributions (which are log-normal in form) has been explained by Lockwood et al. (2019b, 2019c) and arises from the effect of averaging timescale on the distribution of the IMF orientation factor sin⁴(h/2) at high (1 min) time resolution, where h is the IMF clock angle. It should be stressed here that the 4th power exponent in this sin⁴(h/2) factor is not a free fit parameter but arises from the analysis proposed by Vasyliunas et al. (1982). As discussed in the introduction to this paper and in Paper 1 (Appendix B), the IMF orientation factor is critical to the analysis of the semi-annual variation presented here because it gives rise to the R-M effect. Lockwood et al. (2019b) and Lockwood (2019) have shown that this sin⁴(h/2) factor performs better than all suggested alternatives, including the "half-wave rectified" southward field which was the basis of the original R-M theory (Russell & McPherron, 1973). The sin⁴(h/2) factor preserves the non-linearity in the geomagnetic response which results in the R-M effect giving the semi-annual variation, but avoids the discontinuity in slope at [B Z ] GSM = 0 (h = π/2). The formulation also allows for a continued, lower rate of magnetopause reconnection opening field lines, even when the IMF is northward, as has been deduced in a number of studies, including observations of ionospheric O + ions escaping the magnetosphere on open field lines (Chandler et al., 1999). However, note that although the sin⁴(h/2) factor is used here and other studies have found it to be optimum, other forms for the IMF orientation factor will also generate the R-M effect; the results may then vary in detail, but not in the basic principle.
We use 3-hourly means of the power input P a /P o to compare with the 3-hourly values of the am index, even though the latter are not mean values over the 3 h, being based on the range (maximum minus minimum) of variation within the 3-h interval, as seen at the various magnetometer stations employed by the index. Because over fixed 3-h intervals the mean value of P a /P o is proportional to the integrated value, this is consistent with the storage/release concept of magnetospheric behaviour, whereby integrated energy input into the magnetosphere is stored in the geomagnetic tail and released to give bursts of geomagnetic activity that set the range values detected by each of the am magnetometers. There is an issue as to what is the most appropriate response lag between the 3-hourly power input means and the geomagnetic response and that is analysed in Section 3.3 below. Confirmation that it is appropriate to compare 3-hourly means with 3-hourly geomagnetic range values (such as am) is provided by Appendix A of Paper 1, which compares the am index data with both the averages of the auroral electrojet AE and −AL indices and with their maxima over the 3-h intervals and finds equally good correlations.
Figure 7 stresses the excellent correlations between am and P a /P o for large averaging timescales, s, but also shows there is structure in the scatter. Part (a) is for annual means (s = 1 year). The black squares are for all data, to which the green line is the best linear regression fit. The coloured squares are for the data separated into the 8 UT ranges over which the am index is computed; these squares are coloured according to the centre value of those UT ranges using the scale given at the top of the panel. There is a persistent pattern with a fixed P a /P o value giving the lowest am for the 00-06 UT data and the highest am for the 21-24 UT data. This is consistent with the UT variation discussed in Paper 1 and in Section 2 above. Figure 7b shows the scatter plot for s = 10 days. The points are here colour-coded by separation in time from the closest equinox of the centre of the 10-day averaging interval, Dt eq . There is again a consistent pattern with a fixed P a /P o value giving the lowest am around the solstices (which are at Dt eq = 0.25) and highest am at the equinoxes (at which Dt eq = 0). The green line is the least-squares regression line for all data and is very similar to that for annual data shown in Figure 7a, and the orange and black lines are the linear regression fits to all data taken in intervals of length 0.25 yr around the solstices and equinoxes, respectively. The best-fit coefficients and their 2-sigma errors for the ordinary least squares (OLS) linear regression are given in Table 1. The slope of the fit for the equinox data is significantly smaller than that for the solstices (difference equal to −3.7 × 10⁻³ ± 0.3 × 10⁻³) so it takes a lower P a /P o to generate a given am value at the equinox. This demonstrates that there is a significant difference between the solstices and equinoxes in the geomagnetic response to a given power input P a /P o . From the linear regression coefficients for the 6-month intervals around the equinoxes and solstices used in Table 1, this amplification of the equinox am value over the solstice am value for a given P a /P o is by a factor of 1.085. However, this figure is obtained by dividing all data in the year into either solstice or equinox and will greatly underestimate the amplitude of the semi-annual variations seen in higher time-resolution data. This was investigated in Paper 1, using 36 equal-width bins of time-of-year, F (each just over 10 days long), and for am the semi-annual variation amplification factor was found to be about 2 at this 10-day time resolution. The important point about Figure 7b and Table 1 is that they establish that the ratio am/P a varies with time of year and is statistically higher at the equinoxes than at the solstices. Similarly, Figure 7a shows that am/P a is reduced at 00-06 UT. This means that at least some of both the semi-annual variation and the UT variation arises not from the solar wind-magnetosphere energy input but from the response of the magnetosphere-ionosphere-thermosphere system to that energy input.

Fig. 6. Cumulative probability distributions (c.d.f.s) of the power input to the magnetosphere, P a , as a ratio of its mean value P o = <P a > all for the full interval of near-continuous interplanetary data (1995-2017, inclusive). The P a /P o data are divided into 288 bins: 36 equal-sized bins in fraction of year F and the same eight 3-h Universal Time (UT) bins over which the range am index is evaluated. For each F-UT bin the c.d.f. of P a /P o as a ratio of the mean value for that bin <P a /P o > F,UT was plotted and coloured according to the value of that ratio. The black line is for all P a /P o data, the orange line for the F-UT bin giving the largest <P a /P o > F,UT and the green line for the F-UT bin giving the smallest <P a /P o > F,UT .

In Paper 1 it was shown that the amplitude of the variation of ar midnight / <ar midnight > over the year is 0.317 ± 0.040 whereas for P a /P o the corresponding amplitude is 0.136 ± 0.035. Thus the semi-annual variation in the ar midnight index is significantly amplified compared to that in power input to the magnetosphere and the best estimate of the factor is 0.317/0.136 ≈ 2.33. The ar midnight index shows the strongest semi-annual variation, ar dawn and ar dusk show similar amplitude variations to am, and ar noon shows the weakest (with no significant amplification over P a /P o ). For both ar dawn and ar dusk the best estimate of the amplification factor is 2.01, for ar noon it is 1.08, and for the am index it is 2.08. Hence all the am and ar indices, apart from ar noon , show amplification of the semi-annual variation in input power into the magnetosphere and this, as for the equinoctial F-UT patterns reported by Chambodut et al. (2013), is almost non-existent at noon and strongest at midnight. The studies of McPherron et al. (2013) and Chu et al. (2015), using the AL auroral electrojet index and a mid-latitude nightside "bay" index, find that the R-M effect on solar wind/magnetosphere coupling is about 40% of the geomagnetic response, which yields an amplification factor of 2.5, in good agreement with the amplification factor found here for ar midnight .
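The amplification factors quoted above are simple ratios of fractional amplitudes, and the arithmetic can be checked with a short sketch. The uncertainty on the ratio is not given in the text; the value computed here assumes that the two quoted errors are independent and combine in quadrature, which is an assumption of this illustration only. The function name is hypothetical.

```python
from math import sqrt

def ratio_with_error(a, sigma_a, b, sigma_b):
    """Return a/b and its uncertainty, assuming independent errors
    that add in quadrature as fractional errors (an assumption)."""
    ratio = a / b
    rel = sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return ratio, ratio * rel

# Fractional semi-annual amplitudes quoted from Paper 1.
amp, amp_err = ratio_with_error(0.317, 0.040, 0.136, 0.035)

# If the R-M coupling effect is about 40% of the geomagnetic response,
# the implied amplification is the reciprocal of that fraction.
implied_amp = 1.0 / 0.40

print(f"ar_midnight amplification: {amp:.2f} +/- {amp_err:.2f}")
print(f"amplification implied by a 40% R-M share: {implied_amp:.1f}")
```

Under that error assumption the two estimates (about 2.33 and 2.5) agree well within the propagated uncertainty.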

Response lag of the am index to power input
In order to study shorter averaging timescales than the approximately 10-day intervals used in Figure 7, we evaluate the correlation between P a /P o and the am index as a function of F, UT and activity level, allowing for an optimum response lag of am, dt o . To do this, we make 1-min values of P a /P o using the procedure of Lockwood et al. (2019a). This employs 1-min interplanetary observations for 1995-2017 (inclusive), downloaded from the Omni database which is compiled and maintained by the Space Physics Data Facility at NASA's Goddard Space Flight Center. These Omni interplanetary data have been lagged so that the timings apply to the conditions arriving at the nose of Earth's bow shock. One-minute mean ion mass data are obtained from the highest resolution available (which is often hourly) using Piecewise Cubic Hermite Interpolating Polynomial (PCHIP) interpolation. This interpolation procedure was tested using available 64-s observations of the solar wind mean ion mass from the Advanced Composition Explorer (ACE), which were averaged into hourly values to which the same procedure was then applied. It was found that the distribution of differences was Gaussian with mode and mean value at essentially zero and a standard deviation that was 2.5% of the overall average of the mean ion mass.

[...] it can be seen that am is persistently a little smaller than the average response at 0-9 UT and persistently a little larger at 15-4 UT. In (b) the points are coloured by the separation of time-of-year F from the value at the closest equinox, Dt eq . The linear correlation for all data is r = 0.96 and the best fit linear regression is again the green line. The am response at low Dt eq (around the equinoxes) is persistently greater than at high Dt eq (around the solstices), showing that there is a contribution to the semi-annual variation in am that is not associated with that in (P a /P o ) and hence not directly attributable to the R-M effect on solar wind-magnetosphere coupling. The black and orange lines are, respectively, the best-fit linear regressions for around the equinoxes (Dt eq ≤ 0.125) and around the solstices (Dt eq > 0.125). The ordinary least squares (OLS) linear regression coefficients and errors are given in Table 1.

We employ the optimum coupling exponent a of 0.44 found by Lockwood et al. (2019b). These 1-min P a /P o values are then combined into hourly means, rejecting values where the predicted error in P a /P o due to data gaps exceeds 5%, using the criterion established by Lockwood et al. (2019a) from the introduction of synthetic data gaps. Note that interpolation, such as used above to obtain 1-min mean ion mass values, cannot be used to fill data gaps in the P a /P o sequence because of the much greater variability introduced by the IMF orientation factor A h (Lockwood et al., 2019b, 2019c). These hourly values are then averaged into 3-hourly values, keeping data only when all three 1-h values are available. This procedure was repeated many times to generate 3-hourly values centred on time t am − dt, where t am are the times of the centres of the 3-hourly intervals over which am is computed and the response lag dt was varied between −10 min and +300 min in steps of 1 min.
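The two-stage averaging described above (1-min values to hourly means with gap rejection, then hourly to 3-hourly means only when all three hours are present) can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, and `min_valid_fraction` is a stand-in for the 5% predicted-error criterion of Lockwood et al. (2019a), whose actual mapping from data availability to error is not reproduced here.

```python
def block_means(values, block, min_valid_fraction):
    """Average consecutive blocks of `block` samples.

    Missing samples are marked by None. A block is kept only if the
    fraction of valid samples is at least `min_valid_fraction`;
    otherwise the block mean is None.
    """
    out = []
    for start in range(0, len(values) - block + 1, block):
        valid = [v for v in values[start:start + block] if v is not None]
        if valid and len(valid) >= min_valid_fraction * block:
            out.append(sum(valid) / len(valid))
        else:
            out.append(None)
    return out

def three_hourly_from_minutes(minute_values, min_valid_fraction=0.8):
    """1-min values -> hourly means -> 3-hourly means.

    The default 0.8 is a hypothetical placeholder threshold. The second
    stage requires all three hourly means to be available.
    """
    hourly = block_means(minute_values, 60, min_valid_fraction)
    return block_means(hourly, 3, 1.0)
```

To apply a response lag, the 1-min series would be shifted by the chosen number of minutes before calling `three_hourly_from_minutes`, so that each 3-hourly window is centred on t am − dt.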
All 310 of the series of 3-hourly P a /P o data obtained this way were then subdivided into the 288 F-UT bins used in Figure 3. The total number of am data points in the interval (1 January 1995-1 September 2019) is 75,008 and dividing into 288 F-UT bins gives an average of 260.4 F-UT samples per bin. To make the lag correlations we use one day of data (i.e., eight values around each F-UT sample at times −9 h, −6 h, −3 h, 0, +3 h, +6 h, +9 h and +12 h relative to the UT of the bin), giving an average of 2083.3 observations per bin for all the am data. The data were then further subdivided into nine quantile ranges of geomagnetic activity, as quantified by am. We use the notation that, for example, the 50% quantile of am (i.e., the median) is q(0.5). The quantile ranges used are: q(0) < am < q(1) (i.e., all data, the red points in Fig. 8, for which there are 2083.3 samples in each correlation analysis); five non-overlapping bands each containing 20% of the data (i.e., an average of 416.7 am samples in each correlation analysis), q(0) < am < q(0.2) (orange points), q(0.2) < am < q(0.4) (pink points), q(0.4) < am < q(0.6) (light green points), q(0.6) < am < q(0.8) (cyan points), and q(0.8) < am < q(1) (dark green points). We also study the largest 10% of am values, q(0.9) < am < q(1) (blue points, each based on 208.3 samples on average), the largest 5%, q(0.95) < am < q(1) (the mauve points, each based on 104.2 samples on average), and the largest 1%, q(0.99) < am < q(1) (the black points, each based on just 20.8 samples on average). In fact some data sub-divisions had fewer data points to correlate than this because of data gaps in the P a data series. In each of these 288 × 9 = 2592 cases, a lag correlogram was generated and the peak correlation, r p , and the lag giving it, dt p , were determined.

Fig. 8. Peak correlation r p between am and 3-hourly means of P a /P o as a function of the logarithm of the lag dt p giving that peak correlation, for data from 1995-2017, inclusive. One-minute P a /P o data were averaged into hourly and then 3-hourly intervals, centred on the mid-points of the am data intervals, minus a response lag dt that was varied between −10 min and 5 h in steps of one minute. At each dt the correlation between P a (t − dt)/P o and am(t) is evaluated for each of the data subsets studied. The data are divided and colour-coded into 9 quantile ranges of am given by the legend. We use the notation that 20% of all the data have am values lower than q(0.2).

Figure 8 is a scatter plot of r p against log 10 (dt p ). Points are colour-coded by the quantile range using the key given (as described above, the red points are for all data and the black points for the largest 1%). Points are plotted only where the number of data points exceeded 16 and the p-value of the null hypothesis (that the correlation was zero) was less than 0.05. As expected, the lowest correlations are for the lower-activity quantiles, the higher quantiles generally giving values that exceed the value of 0.842 that is obtained for all 3-hourly means. The values of dt p are generally between 15 min and 2 h, although we note that the peak correlation is sometimes found at larger dt p for the largest 1% of am values: these are long-duration storms where the activity level remains high for a day or longer. Note also that these values for the largest am are the least reliable because they are based on the smallest number of samples, although the p-values of points used always meet the 95% significance threshold. The Omni data have been lagged to the nose of the bow shock and so dt p includes allowance for the propagation time across the magnetosheath to the magnetopause, the time for open flux to be appended to the tail lobe, the accumulation of sufficient open flux in substorm growth phases to drive the onset of expansion phase activity, and then a further delay as am rises to its peak response.
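The lag-correlogram step (correlate am with the 3-hourly driver mean at each trial lag, then take the peak correlation r p and the lag dt p that gives it) can be sketched as follows. This is a simplified illustration with hypothetical function names: it works on a gap-free 1-min driver series indexed by minute, uses a plain Pearson correlation, and omits the quantile subdivision, the minimum-sample rule and the p-value test applied in the text.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # correlation undefined for a constant series
    return sxy / sqrt(sxx * syy)

def window_mean(series, centre, half_width=90):
    """Mean of a 1-min series over [centre - 90, centre + 90) minutes,
    or None if the window runs off either end of the series."""
    lo, hi = centre - half_width, centre + half_width
    if lo < 0 or hi > len(series):
        return None
    return sum(series[lo:hi]) / (hi - lo)

def peak_lag(driver_1min, response, centres, lags):
    """Return (r_p, dt_p): the peak correlation and the lag giving it.

    `centres` are the minute indices of the mid-points of the 3-h
    response intervals; the driver window is centred `lag` minutes earlier.
    """
    best_r, best_lag = None, None
    for lag in lags:
        xs, ys = [], []
        for centre, value in zip(centres, response):
            m = window_mean(driver_1min, centre - lag)
            if m is not None:
                xs.append(m)
                ys.append(value)
        if len(xs) < 3:
            continue
        r = pearson_r(xs, ys)
        if best_r is None or r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag
```

Calling `peak_lag(driver, am_values, centres, range(-10, 301))` scans the same range of trial lags as the text; in practice each F-UT bin and quantile range would be passed through separately.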
Figure 9 plots the distributions of 288 dt p values (one for each of the F-UT bins) for various quantile ranges of am. The distributions are taken in 1-min ranges and then smoothed with a 10-min running mean. The upper panel shows the distribution for all values (q(0) ≤ am ≤ q(1), the red points in Fig. 8) and the lower panel shows the distributions for the eight other quantile ranges, using the same colour scheme as for the points in Figure 8. It can be seen in the lower panel that for lower am the lag is somewhat longer (peaking around 75 min). For am values above q(0.8) there are hints of two peaks to the distribution, one around 65 min, the other near 40 min, and as am is further increased the relative magnitude of these two peaks appears to change. It is not certain that there really are two peaks to these distributions, but the net effect of a decrease in the mode value at higher quantile levels is clear. For the largest 1% of am values there is a very broad distribution with a single peak near 45 min. The top (red) curve shows that the optimum single lag is dt o = 60 min. As expected, there are almost no values at negative lags and there are very few at lags below 10 min, which would be quasi-instantaneous responses. These cases may be chance occurrences or could be the effects of a second pulse of reconnection, as modelled for ionospheric convective flow by Morley & Lockwood (2006). We searched for a consistent pattern in F and UT in this distribution of dt p lags but found none (the lack of variation with UT in particular being expected because values for a whole day were selected around each UT).
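The construction of the lag distributions (counts in 1-min bins followed by a 10-min running mean) can be sketched as below. The function names are hypothetical, and two details are assumptions of this illustration: for an even window width the window covers the samples from i − width/2 to i + width/2 − 1, and the window is simply truncated at the ends of the histogram.

```python
from collections import Counter

def lag_histogram(lags_min, lo=-10, hi=300):
    """Counts of lag values in 1-min bins from lo to hi (inclusive).
    Values outside the range are ignored."""
    counts = Counter(int(round(l)) for l in lags_min if lo <= l <= hi)
    return [counts.get(m, 0) for m in range(lo, hi + 1)]

def running_mean(values, width=10):
    """Running mean over `width` samples, truncated at the edges."""
    half = width // 2
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - half):min(n, i - half + width)]
        out.append(sum(window) / len(window))
    return out
```

Dividing the smoothed counts by their sum would convert them into an estimate of the probability density per 1-min bin, from which the mode lag can be read off.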
In summary, in this section we have studied the response of geomagnetic activity, as quantified by the am index, to power input into the magnetosphere. We find no consistent variation in the response delay with time-of-year, F, and the use of one-day intervals around nominal UTs (to keep sample numbers high and correlations significant) precluded the detection of any UT variation. The mode of the distribution consistently moves to slightly lower lags with increased activity levels and the width of the distribution increases. However, we note that the lower number of samples for the highest am values means that these results are the least reliable. Overall, a lag of 1 h is optimum for the whole dataset (dt o = 60 min), but there is considerable spread, with the 2-sigma points being near 10 and 100 min. For all studies in the remainder of this paper that compare the am index with any interplanetary parameters (for example power input into the magnetosphere, P a /P o , solar wind dynamic pressure, p SW , or the IMF [B Y ] GSEQ component) we make allowance for this mean lag by taking the interplanetary data dt o = 60 min before the time of the am value (which is the mid-point of the 3-h interval over which the range and hence the am value is determined).
Fig. 9 legend (p.d.f. of dt p for each am quantile range): q(0) < am ≤ q(1); q(0) < am ≤ q(0.2); q(0.2) < am ≤ q(0.4); q(0.4) < am ≤ q(0.6); q(0.6) < am ≤ q(0.8); q(0.8) < am ≤ q(1); q(0.9) < am ≤ q(1); q(0.95) < am ≤ q(1); q(0.99) < am ≤ q(1).

4 Year-to-year fluctuations in the semi-annual variations in power input to the magnetosphere and the geomagnetic response

As discussed in Paper 1 and in the introduction to the present paper, the semi-annual variation generates a paradox because the R-M effect, which is undoubtedly active, enhances the average magnetic activity levels and the occurrence of small and moderate storms. However, because the largest storms are driven by southward pointing field in the GSEQ frame, the R-M effect should actually, on average, decrease the occurrence of large storms at the equinoxes, the opposite of what is observed. Hence, in this section we investigate the occurrence of large geomagnetic disturbances. Panels (d)-(i) of Figure 10 are "F-year spectrogram" plots, and allow us to study the year-to-year fluctuations in the time-of-year variation in the am index and in the ar indices and compare them to the corresponding behaviour of P a /P o . Each panel gives the normalized mean value in the 36 equal-sized bins of time-of-year F as a function of year and F: hence, for example, the pixels in Figure 10e are coloured according to the value of <am> F /<am> 1yr . We normalize by dividing by the annual means, <am> 1yr , so that the time-of-year variations can always be seen, even at sunspot minimum. The top row of plots shows the solar cycle variations by plotting the variations in the annual means as a ratio of the overall mean value: panel (a) is for power input into the magnetosphere, <P a > 1yr /P o ; (b) is for the am index, <am> 1yr /<am> all (and also shows in green the number of substorm onsets N o derived from the SML index, <N o > 1yr /<N o > all , as used in Paper 1); (c) shows the annual means for the four ar indices, <ar> 1yr /<ar> all . The solar-cycle variations of annual means are very similar. The F-year patterns in parts (d)-(i) are also very similar indeed, in terms of both average levels and individual large events (storms), which show up as orange and yellow pixels. The colour scale used in all panels is the same and one can see that the amplitude in the am and ar response patterns is larger than for P a /P o , particularly during the larger events. Because the results for am and the four ar indices are so similar, we hereafter show only the results for am.
As in Paper 1, in Figure 11 we subdivide the data for P a /P o and am into times when the [B Y ] GSEQ was positive (middle column) or negative (right-hand column), in order to identify the role of the R-M effect. We allow for the optimum lag dt o = 60 min between the IMF conditions at the nose of the bow shock and the am response that was derived in Section 3.3. The left-hand column repeats the variations for all data for comparison. The top row shows the variations in annual means of P a /P o (in black) and am (in mauve). These solar cycle variations are all very similar, although the amplitude is slightly smaller for [B Y ] GSEQ < 0 than for [B Y ] GSEQ > 0 for the years covered by this study. The middle row shows <P a /P o > F /<P a /P o > 1yr and the bottom row <am> F /<am> 1yr . The R-M effect is clearly evident with persistent increases at the September and March equinoxes for [B Y ] GSEQ > 0 and [B Y ] GSEQ < 0, respectively. The annual variations for the two [B Y ] GSEQ polarities separately are, again, slightly greater for am (Figs. 11h and 11i) than for P a /P o (Figs. 11e and 11f).
For both P a /P o and am, Figure 11 demonstrates a clear R-M effect in mean values, detected by the dependence of the favoured equinox on the polarity of the IMF [B Y ] GSEQ . However, the picture is not so clear in the occurrence of large storms, particularly in am. Taking a threshold of <am> F /<am> 1yr = 1.7 (yellow pixels), we can see in Figure 11h that for [B Y ] GSEQ > 0, 13 events occurred at the favoured September equinox, 5 at the unfavoured March equinox, and 4 around the solstices. In Figure 11i we see that for [B Y ] GSEQ < 0 there were 10 events at the favoured March equinox, 3 at the unfavoured September equinox, and 6 around the solstices. Hence the tendency for events to cluster around the favoured equinox is present as predicted for the R-M effect, but Paper 1 (and Appendix B in particular) has explained why we might expect the R-M effect to, if anything, be reducing the magnitude of the largest storms (and hence the number of events over a large threshold) at the favoured equinox.
One additional point to note about the unseparated means (Figs. 11d and 11g) is that we are averaging over roughly 10-day intervals. Given that for 2-sector, 3-sector and 4-sector structures in the heliospheric field we remain in one sector for, on average, 13.5 days, 9 days and 6.75 days, respectively, the averages are usually taken for one dominant polarity of IMF [B Y ] GSEQ , and there is, in most cases, relatively minor cancellation of effects that depend on the polarity of [B Y ] GSEQ . This is no longer true if we average over all years (instead of one year at a time as in Fig. 11). The results now show a significant difference in the behaviour for am and for P a /P o . For P a /P o (Fig. 12b) both the [B Y ] GSEQ > 0 and [B Y ] GSEQ < 0 cases have almost sine-wave forms that are in antiphase. In fact, the positive deflection (at the equinox for which the [B Y ] GSEQ polarity is favoured) is only very slightly larger than the negative deflection (at the equinox for which the [B Y ] GSEQ polarity is unfavoured). The near-perfect antisymmetry between the variations for the two polarities means that they almost cancel, and so the average for all data has only a very weak semi-annual variation. Contrast this with the corresponding variations for am (Fig. 12a), for which the positive deflections at the favoured equinox in the single-polarity [B Y ] GSEQ curves are considerably larger than the negative deflections at the unfavoured equinox. As a result, the variation for all data (thick black line) has a much more marked semi-annual variation. It is this much larger difference between the results at any one equinox between the "favourable" and "unfavourable" polarities of IMF [B Y ] GSEQ that causes the amplification of the semi-annual variation in the am geomagnetic activity index, compared to that in P a /P o . This demonstrates that although the R-M effect is working, it is not working in quite the way that is commonly thought, which is the way that was envisaged by Russell & McPherron (1973) in their seminal paper. If the semi-annual variation were working solely through the modulation of solar wind-magnetosphere coupling by the non-linearity of the response of power input to the magnetosphere to IMF orientation (as envisaged in the original paper), we would expect am and P a /P o to show the same behaviour in Figures 12a and 12b. This is clearly not the case. Figure 12 shows that for average P a /P o , the R-M effect is effective, but the resulting semi-annual variation is much smaller than that in average am. The am index reflects the increase in P a /P o for the favourable polarity of [B Y ] GSEQ but does not reflect the almost equal magnitude decrease in P a /P o for the unfavourable polarity of [B Y ] GSEQ . Hence it is not the non-linearity of the solar wind-magnetosphere energy coupling that gives us most of the R-M effect; rather, it is the non-linearity of the geomagnetic response to that energy input. Separating the two polarities of the IMF [B Y ] GSEQ component thus reveals that it is the non-linearity of the am response to P a /P o , rather than just that of the P a /P o response to IMF orientation, that is causing the two subsets for [B Y ] GSEQ > 0 and [B Y ] GSEQ < 0 to combine to give such a strong semi-annual variation in average am.
Figures 12a and 12b deal only with average values and the lower panels of Figure 12 look at the corresponding behaviour in the occurrence of large events.The thick blue, orange and mauve lines in Figures 12c-12f plot the number of events, N, in which a 3-hourly am exceeds the 90, 95 and 99 percentile, respectively, i.e., am>q(0.90),am>q(0.95), and am>q(0.99). .Analysis of the differences between semi-annual variations in the am index (left-hand panels) and in power input to the magnetosphere, P a /P o (right-hand panels).In all panels a thick line denotes the variation for all data, a thin line connecting open circles is for IMF [B Y ] GSEQ > 0 and a thin line connecting filled triangles is for [B Y ] GSEQ < 0. The lines in parts (a and b) are mean values whereas (c-f) show the average number of days per year N when the daily means of am (panels c and e) or of P a /P o (panels d and f) exceed their 90% quantile (blue lines), 95% quantile (orange lines) and 99% quantile (mauve lines).Parts (g) and (h) plot N/<N>, the number of days per year N normalized by their average value.Note that the variations of N in (e) and (f) for the separate IMF [B Y ] GSEQ polarities are plotted on a scale that is different to that for all data in (c) and (d).The 90% quantile levels, q(0.9), are 2.031 for am/<am> all and 2.081 for P a /P o ; the 95% quantile levels, q(0.95), are 2.556 for am/<am> all and 2.588 for P a /P o ; the 99% quantile levels, q(0.99), are 4.193 for am/<am> all and 4.400 for P a /P o .The close similarity between these pairs quantile values arises from the similarity of the normalised c.d.f.s shown in Figure 6 of the present paper and Figure 8b of Paper 1.
events above the thresholds by plotting N/<N> for both polarities. By definition, these variations in N/<N> are always about unity and the amplitudes of the semi-annual variations in event occurrence can be compared. It can be seen in Figures 12g and 12h that for both am and P a /P o , the fractional amplitude of the semi-annual variation in storm occurrence is larger for the larger storms. Comparing each line in Figure 12g with its corresponding line in Figure 12h, we can see the amplification of the semi-annual variations in the occurrence of large am events compared to the corresponding variation for P a /P o . However, this also shows the amplification is greatest for the smaller events and decreases with a higher event threshold, such that for the top 1% of events, the amplitude of the semi-annual variation in P a /P o is approaching that in am and the amplification is not much greater than unity. This runs counter to the Russell & McPherron (1973) explanation of large storms, which was that small variations in P a /P o induced by the R-M effect caused a large increase in the number of great storms through the amplification effect. Figure 12 shows that in fact the amplification factor is smallest for the largest storms. This variation in the amplification in the am response to P a /P o variations is puzzling. We know it is not a general feature of the overall variation of am with P a /P o as that is linear on all timescales. However, there is scatter about that linear variation and the effect must be hidden in that scatter. We will return to this point in Section 7.

Figure 13 clarifies how the behaviour noted in Figure 12 is occurring. The upper plots show the variation of the probability distribution functions (p.d.f.s) of P a /P o with F for (left) IMF [B Y ] GSEQ < 0 and (right) [B Y ] GSEQ > 0. These are consistent with the means and occurrence frequencies for P a /P o shown in the right-hand panels of Figure 12. The bottom panels of Figure 13 show the am amplification, expressed as (am/<am> all )/(P a /P o ), as a function of F and for the same P a /P o bins as for the p.d.f.s in the upper panels. What is noticeable is that the amplification of am, relative to P a , is greater at the equinoxes and is particularly effective in raising am at low P a /P o . It is this effect that causes increases in am for the unfavourable [B Y ] GSEQ polarity, which means that the decreases for the unfavourable [B Y ] GSEQ polarity cancel the increases for the favourable [B Y ] GSEQ polarity to a relatively small extent only. This cancellation occurs to a much greater extent for P a /P o , giving a smaller semi-annual variation.
A key point about Figure 13 is that it shows that the mechanism responsible for this amplification in am is only working at the equinoxes because we do not see any such amplification at the solstices.
To summarise this section, we have shown that the pattern of variations in F-year spectrogram plots in geomagnetic activity and power input into the magnetosphere are very similar in both average values and the occurrence of large events and that the R-M effect is at the heart of both; however, there is a non-linear amplification of the semi-annual variation in the geomagnetic response. We show that there is only a very small asymmetry in the increase in power input into the magnetosphere P a /P o at a given equinox for the "favourable" polarity of [B Y ] GSEQ (via the R-M effect), compared to the decrease for the "unfavourable" polarity of [B Y ] GSEQ , but that this difference is much greater in the am response and it is this that amplifies the semi-annual variation in am. We conclude that there is a second mechanism at work at the equinoxes that amplifies the am response to a given P a /P o .

Variations with solar wind dynamic pressure
As discussed in the introduction, there is evidence that geomagnetic activity depends on both the amount of magnetospheric open flux and on the solar wind dynamic pressure, p SW . Given that the dominant mechanism that allows power input into the magnetosphere is the generation of open flux, this implies that geomagnetic activity should be increased by both solar wind power input to the magnetosphere and the solar wind dynamic pressure. This is confirmed by Figure 14 which colour-codes the average am in bins of width 0.15 in both P a /P o (along the x axis) and p SW /<p SW > all (along the y axis). Figure 14 shows that am increases both with increased P a at a fixed p SW and with increased p SW at a fixed P a . (For both p SW and P a /P o , values are taken a time dt o before the am data, where dt o = 60 min is the optimum response lag, as derived in Sect. 3.3.)

The grey points in Figure 15a are a scatter plot of 3-hourly am data as a function of the 3-hourly P a /P o values (again using the optimum response lag dt o of 60 min) and Figure 15b studies the extent to which the scatter around the best regression is explained by p SW . In Figure 15a the mauve line is the linear regression fit to the 3-hourly data, am fit . Higher-order polynomial fits were carried out but differences were always negligible. The orange points are mean values in 1 percentile ranges of P a /P o and the error bars are plus and minus one standard deviation. The correlation coefficient between am and P a /P o for this 3-h timescale is r c = 0.867, which means P a /P o is explaining $100 \times r_c^2 = 75.2\%$ of the variation in am. The fit residuals for this linear regression, given by $\Delta am = am - am_{fit}$, were then evaluated and are plotted in Figure 15b as a function of normalized solar wind dynamic pressure p SW /<p SW > all .
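The regression, residual and variance-explained steps described above can be sketched as follows. This is a minimal illustration on synthetic numbers, not the data or code used in the paper: the arrays `pa_norm` and `am`, their distributions and the toy coefficients are all assumptions made only so that the example runs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for the lagged 3-hourly series (NOT the paper's data).
n = 5000
pa_norm = rng.lognormal(mean=-0.3, sigma=0.7, size=n)   # toy P_a/P_o
am = 20.0 * pa_norm + rng.normal(0.0, 6.0, size=n)      # toy am response

# Ordinary least-squares linear fit: am_fit = s * (P_a/P_o) + c
s, c = np.polyfit(pa_norm, am, deg=1)
am_fit = s * pa_norm + c

# Fit residuals: delta_am = am - am_fit
delta_am = am - am_fit

# Linear correlation coefficient and percentage of variation explained
r_c = np.corrcoef(pa_norm, am)[0, 1]
explained_pct = 100.0 * r_c**2
print(f"toy data: r_c = {r_c:.3f}, explained = {explained_pct:.1f}%")

# The same arithmetic applied to the correlation quoted in the text
print(f"100 * 0.867**2 = {100 * 0.867**2:.1f}%")
```

The residual array `delta_am` is the quantity that would then be sorted by time of year, UT and dynamic pressure.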
As expected from Figure 14, the average values of Δam increase with p SW /<p SW > all but there is extremely large scatter, indicating that although p SW is a factor, it is certainly not the only factor influencing these fit residuals. However, Figures 14 and 15 show that the amplification of the am response to a given P a /P o , identified in Figures 12 and 13 as a key component of the semi-annual variation, increases with increased solar wind dynamic pressure, p SW .

Figure 16 plots the fit residuals Δam as a function of F and UT. Given that am shows an equinoctial pattern and that P a /P o does not, it is not surprising that the fit residuals Δam form an equinoctial pattern. This can be seen in Figure 16a which is for all data. Figures 16b-16d subdivide the data into the three terciles of solar wind dynamic pressure, p SW . These plots all show the equinoctial pattern and comparing them reveals that this pattern grows in amplitude as p SW increases, revealing that solar wind dynamic pressure is a key factor in the generation of the equinoctial pattern as well as a contributor to the scatter in the relationship between P a /P o and the am index.
Figure 17 studies this relationship in more detail using 20 quantile ranges in p SW that each include 5% of the data. It shows that the amplitude of the equinoctial F-UT pattern, here quantified by the standard deviation of the 288 data points that go into each pattern, σ(<Δam> F,UT ), varies almost linearly with p SW . The alternate grey and white bands define the quantile ranges in p SW that were used.
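A minimal sketch of this amplitude measure is given below, under stated assumptions. The 288 points are taken here to be a grid of 36 bins in F by 8 three-hourly bins in UT (an assumption: the text gives only the total), the data are synthetic, and 5 quantile ranges are used instead of 20 purely to keep the example short. The function name `pattern_amplitude` is hypothetical.

```python
import numpy as np

def pattern_amplitude(f, ut, delta_am, n_f=36, n_ut=8):
    """Standard deviation of the bin-mean residuals over an F-UT grid.

    f        : fraction of year, 0 <= f < 1
    ut       : Universal Time in hours, 0 <= ut < 24
    delta_am : fit residuals, same length as f and ut
    Returns (amplitude, grid of bin means with shape (n_f, n_ut)).
    """
    i_f = np.minimum((np.asarray(f) * n_f).astype(int), n_f - 1)
    i_ut = np.minimum((np.asarray(ut) / 24.0 * n_ut).astype(int), n_ut - 1)
    flat = i_f * n_ut + i_ut
    sums = np.bincount(flat, weights=delta_am, minlength=n_f * n_ut)
    counts = np.bincount(flat, minlength=n_f * n_ut)
    means = np.full(n_f * n_ut, np.nan)      # empty bins stay NaN
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return np.nanstd(means), means.reshape(n_f, n_ut)

# Toy data: a pattern peaking near the equinoxes whose size scales with p_SW.
rng = np.random.default_rng(1)
n = 200_000
f = rng.random(n)
ut = rng.random(n) * 24.0
p_sw = rng.lognormal(0.0, 0.5, n)
delta_am = 3.0 * p_sw * np.cos(4 * np.pi * (f - 0.22)) + rng.normal(0.0, 5.0, n)

# Amplitude of the pattern in each equal-population range of p_SW
edges = np.quantile(p_sw, np.linspace(0.0, 1.0, 6))
amps = []
for lo, hi in zip(edges[:-1], edges[1:]):
    sel = (p_sw >= lo) & (p_sw <= hi)
    amp, _ = pattern_amplitude(f[sel], ut[sel], delta_am[sel])
    amps.append(amp)
print(np.round(amps, 2))
```

In this toy case the amplitude rises from one quantile range to the next because the imposed pattern was made proportional to `p_sw`; with real data the trend is the result being tested, not an input.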
To summarise this section, we have confirmed the results of previous studies discussed in the introduction (Schieldge and Siscoe, 1970; Caan et al., 1973; Kokubun et al., 1977; Karlsson et al., 2000; Yue et al., 2010) that indicated enhanced solar wind dynamic pressure enhances geomagnetic activity. We have quantified the effect and show that it is responsible for the equinoctial pattern of the geomagnetic response, which increases in amplitude with enhanced solar wind dynamic pressure.
6 The time-of-year and Universal Time variation

Figure 18 analyses the contributions to the (a) time-of-year F and (b) UT variations in am. The orange lines are am fit variations, derived from the P a /P o data using the linear regression equation (1). As expected from Figure 12, the semi-annual variation with F (averaged over all UT) in am fit is much smaller in amplitude than that in am (black line). The mauve, blue and cyan lines show the means of the fit residuals Δam from equation (4) for the upper, middle and lower tercile ranges of the solar wind dynamic pressure, p SW . Figure 18a shows that the semi-annual variation is present for the lower and middle terciles of p SW but the largest contribution comes from the highest p SW . Figure 18b is the corresponding plot for the UT variations. In this case, there is no UT variation in P a /P o and hence none in am fit . The UT variation, like the F variation, is seen in all of the average variations of Δam but again is strongest for the upper tercile of p SW . This indicates that p SW also plays a role in generating the UT variation.

Fig. 15. (a) The grey points form a scatter plot of am against 3-hourly means of power input into the magnetosphere, P a /P o , generated by averaging over 3 h intervals that are shifted forward in time by the derived best lag of dt o = 60 min (see Fig. 7a) relative to the three-hourly intervals in which am is evaluated. The orange points are the mean values averaged in 1% quantile ranges of P a /P o , i.e., q(0) ≤ <P a /P o > s=3h < q(0.01), q(0.01) ≤ <P a /P o > s=3h < q(0.02), up to q(0.99) ≤ <P a /P o > s=3h < q(1). The black error bars are the plus and minus one standard deviation in those means. The mauve line is the best-fit OLS linear regression to the 3-hourly data. (b) The grey points are a scatter plot of the fit residuals Δam for the fit shown in (a): this is the difference between each three-hourly am value and the best-fit linear regression value based on the corresponding <P a /P o > s=3h value (Δam = am − am fit ), plotted as a function of the simultaneous three-hourly mean of the normalized solar wind dynamic pressure <p SW > s=3h /<p SW > all . The orange points in (b) are means in 1% quantile ranges of <p SW > s=3h and error bars are plus and minus one standard deviation in the mean Δam. The mauve line is the best linear regression to the 3-hourly values.
This section has provided further tests of theoretical and model interpretations of the variations with dynamic pressure. It shows that not only does the amplification of the semi-annual variation increase with increased solar wind dynamic pressure, so do the amplitudes of the equinoctial pattern and of the UT variation.

Discussion
We have presented a purely empirical analysis of the response of the am geomagnetic index (and its 4 MLT sector sub-indices, ar dawn , ar noon , ar dusk , and ar midnight ) to interplanetary conditions arriving at the nose of Earth's bow shock as a function of time of year, F, Universal Time, UT, and geomagnetic activity level, using one-minute interplanetary data averaged into 3-h windows. We have made use of the analysis of the effect of data gaps by Lockwood et al. (2019a) to ensure that the error they cause to estimates of power input into the magnetosphere is below 5%.

Fig. 16. ... (b) The mean Δam for the lower tercile of the simultaneous solar wind dynamic pressure, <Δam> F,UT for q(0) ≤ <p SW > s=1h < q(0.33). (c) The mean Δam for the middle tercile of the simultaneous solar wind dynamic pressure, i.e., <Δam> F,UT for q(0.33) ≤ <p SW > s=1h < q(0.67). (d) The mean Δam for the upper tercile of the simultaneous solar wind dynamic pressure, <Δam> F,UT for q(0.67) ≤ <p SW > s=1h < q(1). Note that in parts (b-d) the same colour scale is used to emphasize that the equinoctial pattern, although present for all three ranges of p SW , is of amplitude that increases with p SW and is much larger in amplitude for the largest p SW values. This relationship is further studied by Figure 17.

Fig. 18. ... (orange lines) am fit , the best fit of power input to the magnetosphere, P a /P o , to am; (mauve lines) the fit residual Δam for the upper tercile of the solar wind dynamic pressure, q(0.67) < p SW ≤ q(1); (blue lines) the fit residual Δam for the middle tercile of the solar wind dynamic pressure, q(0.33) < p SW ≤ q(0.67); and (cyan lines) the fit residual Δam for the lower tercile of the solar wind dynamic pressure, q(0) < p SW ≤ q(0.33).

In order to compare the interplanetary data with the geomagnetic data we need to know the best response lag to employ to ensure we are comparing the most relevant interplanetary data. In Section 3.3, we used averages over the same interval durations as used to compile the am geomagnetic data (3-hourly) but the averaging intervals were shifted in steps of one minute in order to determine the best response lag (taken to be the lag giving peak correlation). We show that there is considerable variability in response times between about 10 min and about 100 min and there is some systematic variation in the derived distribution of response lag with activity level, the mode of the overall distribution being close to 60 min. Hence using a single value for the response lag is an approximation; however, in this context it must be remembered that using an ideal individual lag for each case would also be an approximation because the driving power input may be relevant over an interval that could be longer, or shorter, than the 3 h used. However, using 3-h interplanetary data and neglecting such a response lag completely (by using simultaneous data) would result in, on average, one third of each interplanetary sample not being the most relevant data, and using an overall average lag is hence better than using simultaneous data.
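The lag search described above (3-h averaging windows shifted in 1-min steps, with the best lag taken as the one giving peak correlation) can be sketched as follows. This is an illustration on synthetic series with a known imposed lag, not the analysis of Section 3.3; the function `best_lag`, the red-noise driver and the noise level are all assumptions.

```python
import numpy as np

def best_lag(driver_1min, am_3h, lags_min, window=180):
    """Return (lag in minutes giving peak correlation, correlation per lag).

    driver_1min : regularly sampled 1-minute driver series (e.g. P_a/P_o)
    am_3h       : one value per 3-h interval; interval k spans
                  minutes [k*window, (k+1)*window)
    lags_min    : candidate lags in minutes (driver window is shifted
                  earlier than the am interval by the lag)
    """
    csum = np.concatenate(([0.0], np.cumsum(driver_1min)))
    k = np.arange(len(am_3h))
    corrs = np.empty(len(lags_min))
    for i, lag in enumerate(lags_min):
        start = k * window - lag
        ok = (start >= 0) & (start + window <= len(driver_1min))
        # 3-h means of the driver over the shifted windows, via cumulative sums
        means = (csum[start[ok] + window] - csum[start[ok]]) / window
        corrs[i] = np.corrcoef(means, am_3h[ok])[0, 1]
    return int(lags_min[int(np.argmax(corrs))]), corrs

# Synthetic test case with an imposed 60-minute response lag.
rng = np.random.default_rng(2)
window, n_int, true_lag = 180, 4000, 60
n_min = window * n_int
cs = np.cumsum(rng.normal(size=n_min + 600))
driver = (cs[600:] - cs[:-600]) / 600.0        # red noise, length n_min
delayed = np.roll(driver, true_lag)            # delayed[t] = driver[t - true_lag]
am = delayed.reshape(n_int, window).mean(axis=1)
am = am + rng.normal(0.0, 0.2 * am.std(), size=n_int)

lags = np.arange(0, 121)                       # 1-minute steps
lag_hat, corrs = best_lag(driver, am, lags)
print("lag giving peak correlation:", lag_hat, "min")
```

Because the correlation varies smoothly with lag near its peak, the recovered lag carries some uncertainty even in this idealised case, which is consistent with treating a single overall lag as an approximation.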
We have presented a number of empirical analyses that will be used to constrain interpretation in a later paper in this series which will employ both empirical and MHD models of the magnetosphere.
The results presented in the current paper confirm that the R-M effect is active and is the key part of the semi-annual variation in geomagnetic activity. In the midnight sector we find that 42.9% of the semi-annual variation in geomagnetic activity can be explained as the semi-annual variation in the power input into the magnetosphere, P a , which is almost entirely due to the Russell-McPherron effect. This agrees well with the results of McPherron et al. (2013) who employed the AL auroral electrojet index and a simpler interplanetary coupling function (the half-wave rectified dawn-to-dusk interplanetary electric field E DD = V SW B S ) and the linear prediction filter formalism, which enabled them to allow for systematic variations in the relationship between E DD and the AL index (for example with time of year, because AL is a northern hemisphere index only and so has a strong seasonal response). At the same time, we find that in 3-hourly means the linear correlation coefficient between the am index and P a is 0.867, meaning that 75.2% of the variation of am is explained by P a . Furthermore, in this paper and in Paper 1, by sorting the data by [B Y ] GSEQ we have shown that the R-M effect dominates the am response in both average levels and the occurrence of large events. The patterns in the F-year spectrogram plots presented in Figure 10 for P a , am and the four ar indices are very similar; only the amplitude of the pattern varies, with the amplification factor for the geomagnetic index in question. Furthermore, sorting by the polarity of the IMF [B Y ] GSEQ component confirms the patterns originate from the R-M effect. Hence we conclude that it is not that the R-M effect is partially responsible for the semi-annual variation, rather it is almost wholly responsible.

However, Figures 12 and 13 show us how the amplification of a very small semi-annual variation in P a to give a much larger semi-annual variation in am occurs. The R-M effect only causes a slight asymmetry in how much average P a (or the occurrence of large P a events) is enhanced for the "favourable" polarity of the IMF [B Y ] GSEQ at a given equinox compared to how much it is decreased by the "non-favourable" polarity of the IMF [B Y ] GSEQ . However, because of a non-linearity in the am response this drives a larger asymmetry in geomagnetic activity. We note that in their original paper, Russell & McPherron (1973) did comment on the need for this amplification; however, the discussion was conflated with that on the occurrence of large storms which, as discussed in the introduction, raises the involvement of large, negative [B Z ] GSEQ which, when it becomes dominant over |[B Y ] GSEQ | (i.e., clock angles in GSEQ become greater than about 120°), makes the R-M mechanism have the opposite effect to that of the traditional R-M effect as proposed by Russell & McPherron (1973).
However, there is still a puzzle to resolve here because when we look at a scatter plot of all am data as a function of P a (as for the 3-hourly means in Fig. 15a) we see only a very slight non-linearity, and not one large enough to generate the contrasting behaviours of am and P a seen in Figure 12.
Figures 14 and 15b both demonstrate that am depends on both P a and the solar wind dynamic pressure, p SW . These two factors have common elements but differ in the terms that depend on the IMF and its orientation. For the best-fit coupling exponent used here of a = 0.44, as derived for all averaging timescales by Lockwood et al. (2019b), P a varies as $m_{SW}^{0.227}\,N_{SW}^{0.227}\,V_{SW}^{1.453}\,B^{0.88}\,\sin^4(h/2)$, where m SW is the mean solar wind ion mass, N SW is the number density, V SW is the solar wind speed, B the IMF magnitude and h is the IMF clock angle in the GSM frame. On the other hand, $p_{SW} = m_{SW} N_{SW} V_{SW}^2$, and so the two factors are different but also have terms in common and there will be some systematic dependence of one on the other. Figure 15b shows that the fit residuals of the best fit of P a /P o to am depend on p SW , implying there is an additional dependence of am on p SW . Interestingly, the fit residuals (Fig. 16) show the equinoctial F-UT pattern (Cliver et al., 2000), which increases in amplitude linearly with increased p SW (Fig. 17). Given that the equinoctial pattern is present in am but absent in P a , this strongly implies that p SW introduces an additional variation to geomagnetic activity that includes the equinoctial F-UT pattern. Naturally, because these fit residuals show an equinoctial F-UT pattern, they will also show an additional semi-annual variation that is not accounted for by P a on its own. Figure 18a confirms this and that the amplitude of the semi-annual variation increases with p SW . An interesting finding is presented in Figure 18b, which shows that these fit residuals also show the UT variation, which also increases in amplitude as p SW increases.
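The exponents quoted above follow from the single free parameter a. As a quick check, the sketch below assumes the usual form of the Vasyliunas et al. (1982) expression, in which m SW and N SW carry the exponent (2/3 − a), V SW carries (7/3 − 2a) and B carries 2a. That form is not restated in this section, so it is an assumption here, although it does reproduce the numbers in the text for a = 0.44. The function name `pa_exponents` is hypothetical.

```python
from math import isclose

def pa_exponents(a):
    """Exponents of m_SW, N_SW, V_SW and B in P_a for coupling exponent a.

    Assumed form: P_a ~ m^(2/3 - a) * N^(2/3 - a) * V^(7/3 - 2a) * B^(2a) * sin^4(h/2)
    """
    return {
        "m_SW": 2.0 / 3.0 - a,
        "N_SW": 2.0 / 3.0 - a,
        "V_SW": 7.0 / 3.0 - 2.0 * a,
        "B": 2.0 * a,
    }

exps = pa_exponents(0.44)
for name, value in exps.items():
    print(f"{name}: {value:.3f}")
# prints:
# m_SW: 0.227
# N_SW: 0.227
# V_SW: 1.453
# B: 0.880
```

Comparing these with the unit exponents of m SW and N SW and the exponent of 2 on V SW in p SW makes the shared dependence of the two quantities explicit, while B and the clock angle enter P a only.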
Figure 19 investigates further the complex inter-relationship between p SW and P a and its implications. The combined am, p SW and P a /P o data are averaged into 36 equal-width bins of F and divided into the two polarities of IMF [B Y ] GSEQ (only to show there is no significant difference between the two) and 5 quantile ranges of p SW . The averages of am are plotted as a function of the corresponding averages of P a /P o for each of the 5 quantile ranges of p SW : the [B Y ] GSEQ polarities are distinguished by the symbols used, and the p SW quantile ranges by the colours used. The black lines are 3rd-order polynomial fits to the data and are drawn only between the largest and smallest mean value of P a /P o for the quantile range of p SW in question (note that, because of the common factors in p SW and P a , that range varies with p SW ). These polynomial fit lines are re-plotted in Figure 19c, using the same colour scheme as for the data points in part (a), and superposed on a scatter plot of averages for all data in time bins of 1/36 yr (the grey points). It can be seen that at each constant p SW value, am has a non-linear dependence on P a /P o which is not apparent in the overall scatter plot. Figures 19a and 19c show the variation of am with P a /P o in 5 equal-sized quantile ranges of p SW . Figures 19b and 19d show the corresponding variation of am with p SW in 5 equal-sized quantile ranges of P a /P o : they show that am also depends non-linearly on p SW at a fixed P a /P o .
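The stratified averaging and fitting described above can be sketched as below. The data are synthetic and the toy dependences of `pa` and `am` on `p_sw` are invented purely so that the two quantities share a common factor; the separation by [B Y ] GSEQ polarity is omitted for brevity. Only the procedure (5 equal-population ranges of p SW , 36 bins of F, a 3rd-order polynomial per range) follows the text.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
f = rng.random(n)                                   # fraction of year
p_sw = rng.lognormal(0.0, 0.4, n)                   # toy p_SW/<p_SW>
# toy P_a/P_o: semi-annual modulation and a factor shared with p_SW
pa = (1.0 + 0.3 * np.cos(4 * np.pi * (f - 0.22))) * p_sw**0.3 \
     * rng.lognormal(0.0, 0.5, n)
am = pa**1.5 * p_sw**0.5 + rng.normal(0.0, 0.2, n)  # toy non-linear response

n_q, n_f = 5, 36
edges = np.quantile(p_sw, np.linspace(0.0, 1.0, n_q + 1))
# index of the p_SW quantile range each sample falls in (0 .. n_q-1)
which = np.clip(np.searchsorted(edges, p_sw, side="right") - 1, 0, n_q - 1)
i_f = np.minimum((f * n_f).astype(int), n_f - 1)

fits = []
for q in range(n_q):
    sel = which == q
    count = np.bincount(i_f[sel], minlength=n_f)
    # means of P_a/P_o and am in each of the 36 time-of-year bins
    x = np.bincount(i_f[sel], weights=pa[sel], minlength=n_f) / count
    y = np.bincount(i_f[sel], weights=am[sel], minlength=n_f) / count
    coeffs = np.polyfit(x, y, deg=3)                # 3rd-order polynomial
    fits.append({"x_range": (x.min(), x.max()), "coeffs": coeffs})

for q, fit in enumerate(fits):
    lo, hi = fit["x_range"]
    print(f"p_SW range {q}: fit valid for {lo:.2f} <= P_a/P_o <= {hi:.2f}")
```

Storing `x_range` with each fit mirrors the practice of drawing each polynomial only over the span of the averaged data for that quantile range, since the fits are not constrained outside it.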
The key point about Figure 19 is that when we look at a constant solar wind dynamic pressure p SW we do see a highly non-linear increase in geomagnetic activity (am) with the power input into the magnetosphere P a , of the kind needed to explain the amplification of the geomagnetic response (and hence of the semi-annual variation) to a relatively small R-M effect in P a . This is lost in the overall scatter plots of P a against am (which appear linear with some scatter) because of the common factors in p SW and P a . However, Figure 19 strongly suggests that p SW has a distinct and different physical effect from P a and this is supported by the findings presented in the current paper that the amplification of the am response, the equinoctial pattern and the UT variation arise from the effect of p SW rather than the effect of P a .

The concept of a distinct and different influence of p SW on the nightside magnetosphere and the substorm phenomenon has been reported several times in the literature. Increases in p SW have been reported to trigger onsets of full substorm expansion phases by, for example, Schieldge & Siscoe (1970), Kokubun et al. (1977) and Yue et al. (2010), and Caan et al. (1973) showed that, statistically, the magnetic energy density in the near-Earth tail lobes (that powers substorm expansion phases) was increased both by prior intervals of southward-pointing IMF and by solar wind dynamic pressure, p SW . An effect of p SW was demonstrated directly by Karlsson et al. (2000), who showed that near-Earth tail energy content was reduced if p SW decreased and that sudden decreases caused quenching of any substorm expansion that had recently begun. Lockwood (2013), Lockwood et al. (2014) and Finch et al. (2008) have noted that indices that are strongly influenced by the substorm current wedge have a strong dependence on V 2 SW and hence p SW . Finch et al. (2008) noted that only magnetometer stations closest to the nightside auroral electrojet showed both the equinoctial F-UT pattern and the V 2 SW dependence. Lockwood (2013) argues that this implies that p SW influences the auroral electrojet and the equinoctial F-UT pattern by constraining the near-Earth tail such that the appending of open flux to the tail lobes of the magnetosphere increases the field and hence the stored energy density and magnetic shear across the cross-tail current sheet. This does not happen further down the tail where the magnetopause boundary becomes aligned with the solar wind flow and the magnetic pressure in the lobes, and hence the field, is set by the balance with static pressure in interplanetary space. Hence, at these large negative X GSE , adding open flux just causes the tail to flare in cross-sectional area, and the field in the lobes and the magnetic shear across, and hence total current in, the cross-tail current sheet all remain constant. Thus there is considerable evidence for a separate role of solar wind dynamic pressure p SW in modulating the storage and release of the energy extracted from the solar wind, P a , in the near-Earth tail of the magnetosphere.

Conclusions
This paper has presented a collection of empirical results concerning the semi-annual variation in geomagnetic activity. Interpretation will largely be left to a subsequent paper in this series which will make use of empirical and numerical MHD models of the magnetosphere. In this section we bring together the summary conclusions of each section of this paper.
We have presented an initial study of 2 years' data on the semi-annual variation in transpolar voltage and flux transfer. We find there is a semi-annual variation in transpolar voltage but it is not as large in amplitude as that in geomagnetic activity, which is consistent with the latter showing a non-linear (quadratic) variation with transpolar voltage. We find that during the persistent minimum of the UT variation in geomagnetic activity defined in Paper 1 there is a persistent decrease in transpolar voltage which is potentially consistent with a decrease in reconnection voltage in the cross-tail current sheet. Confirmation of these findings using a larger dataset of transpolar voltage measurements will be presented in a later paper in this series, but the great consistency with which we find the features described (in both polar caps and in each of the two years studied here) leads us to believe that they are real effects.

Fig. 19. Analysis of the effects on am of (left column) power input to the magnetosphere P a /P o at a given solar wind dynamic pressure p SW and of (right column) the effects of p SW at a given P a /P o . Values of am/<am> all are plotted as a function of corresponding normalised values of (left) P a and (right) p SW . In (a) data are averaged into 5 quantile ranges of the normalised solar wind dynamic pressure p SW /<p SW > all , 36 bins of time-of-year F, and by the polarity of the IMF Y component in GSEQ, [B Y ] GSEQ : the means of am/<am> all are plotted against the corresponding means of P a /P o . In (b) data are averaged into 5 quantiles of normalised power input to the magnetosphere P a /P o , 36 bins of time-of-year F, and by the [B Y ] GSEQ polarity, and the means of am/<am> all are plotted against the corresponding means of p SW /<p SW > all . The black lines in the upper panels are 3rd-order polynomial fits in each case, fitted to data for all [B Y ] GSEQ and drawn over the range of the averaged data for the quantile range in question. The points are coloured according to the quantile range as given by the key. The lines are also plotted (this time using the same colours as the points in the upper panels) in the corresponding lower panel. The grey points show a scatter plot of all 10-day average values.
We have studied the response of geomagnetic activity to power input to the magnetosphere, estimated using interplanetary data from 1995 onwards which is relatively free of data gaps. This is important because previous work has demonstrated that large errors are introduced into studies of solar wind-magnetosphere coupling by data gaps (Lockwood et al., 2019a). We find no consistent variation in the response delay with time-of-year, F. There is a 2 sigma spread in the lag values of approximately 10-100 min and some systematic change in the distribution with activity level but, overall, a lag of 60 min is optimum for the whole dataset. Using this lag we have shown that the pattern of variations in F-year spectrogram plots in geomagnetic activity and power input into the magnetosphere are very similar, both for average values and the occurrence of large events, and that the R-M effect is at the heart of both. However, the effect on power input into the magnetosphere is very small and there is a non-linear amplification of the semi-annual variation in the geomagnetic response such that a very small asymmetry in power input into the magnetosphere P a between the "favourable" and "unfavourable" polarities of [B Y ] GSEQ generates a greatly amplified geomagnetic semi-annual response.
The origin of this amplification will be discussed in full in a later paper. However, as discussed in the last section, the analysis presented here indicates strongly that it is associated with solar wind dynamic pressure and its role in squeezing the near-Earth tail and so modulating the storage and release of energy extracted from the solar wind in the near-Earth tail, where the tail radius is still flaring with the −X GSE coordinate and so dynamic pressure is a factor. In this paper we have shown that the equinoctial pattern is found in the residuals of fits of P a to the am index and that the amplitude of these equinoctial patterns in the am fit residuals increases linearly with solar wind dynamic pressure p SW . Similarly, the UT variation in am is found in these fit residuals and also increases in amplitude with solar wind dynamic pressure. This strongly suggests the solar wind pressure is a distinct influence on geomagnetic activity. In a later paper we will use empirical and numerical MHD models of the magnetosphere to study the role of solar wind dynamic pressure as a function of dipole tilt and see if the models can produce the influence of solar wind dynamic pressure on the equinoctial F-UT pattern, on the UT variation, and on the amplification of the variation of power input that is inferred here. One finding that needs explanation is why the amplification of the am response is greater at the equinoxes, as shown by Figure 13. With understanding gained about these effects we aim to revisit the paradox of large geomagnetic storms and the fact that their occurrence peaks at the equinoxes, yet they are often driven by large southward field in the GSEQ frame, which should make them less common at the equinoxes.

Appendix Relationships between the am index and the transpolar voltage
Figure 4 of the main text presents a scatter plot of daily means of the transpolar voltage U PC against the means of the am index for the same day. The U PC values are derived from observations in 2001-2002 by the DMSP satellites and are normalised to an ideal satellite path along the 06-18 MLT polar cap diameter using the procedure of Lockwood et al. (2009). The error bars in the plot are plus and minus one standard error in the means. The mauve line is the best-fit 3rd-order polynomial and the grey area is bounded by the 2-sigma uncertainties in that fit.
The procedure employed to make the fit was as follows. The data points were fitted 10,000 times using a 3rd-order polynomial fit. For each fit, every data point was shifted in both am and U PC by an error drawn at random from a Gaussian distribution with mean value equal to the observed mean value for the point in question and of standard deviation equal to the observed standard deviation around that mean. The c.d.f. of the ensemble of fitted am values at each value of U PC was then computed and the best fit for that U PC was taken to be the median value of this distribution. The maximum and minimum uncertainties were taken to be the 5th and 95th percentiles of these distributions (i.e. the upper and lower 2-sigma points of the ensemble).
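A minimal sketch of this ensemble fitting procedure is given below. It is not the code used for Figure 4: the function name `ensemble_polyfit`, the toy U PC and am values and their assumed uncertainties are all hypothetical, and 2,000 fits are used in the example call (rather than 10,000) only to keep it quick.

```python
import numpy as np

def ensemble_polyfit(x_mean, x_sd, y_mean, y_sd, x_eval,
                     n_fits=10_000, deg=3, seed=0):
    """Median and 5th/95th percentiles of an ensemble of polynomial fits.

    Each fit uses data points redrawn from Gaussians centred on the
    observed means with the observed standard deviations, in both x and y.
    """
    rng = np.random.default_rng(seed)
    curves = np.empty((n_fits, len(x_eval)))
    for i in range(n_fits):
        x = rng.normal(x_mean, x_sd)          # perturbed abscissa values
        y = rng.normal(y_mean, y_sd)          # perturbed ordinate values
        curves[i] = np.polyval(np.polyfit(x, y, deg), x_eval)
    median = np.median(curves, axis=0)        # best fit at each x_eval
    lower, upper = np.percentile(curves, [5, 95], axis=0)
    return median, lower, upper

# Toy example: U_PC in kV (x) against am (y) with an invented relationship.
rng = np.random.default_rng(4)
u_pc = np.linspace(20.0, 100.0, 30)
am_obs = 2.0 + 0.004 * u_pc**2 + rng.normal(0.0, 1.0, u_pc.size)
u_eval = np.linspace(30.0, 90.0, 7)

median, lower, upper = ensemble_polyfit(
    u_pc, np.full(u_pc.size, 3.0),            # assumed sd in U_PC
    am_obs, np.full(u_pc.size, 2.0),          # assumed sd in am
    u_eval, n_fits=2000)
for u, m, lo, hi in zip(u_eval, median, lower, upper):
    print(f"U_PC = {u:5.1f} kV: am = {m:6.2f} ({lo:6.2f} to {hi:6.2f})")
```

Swapping the roles of the two variables in the call (am as x, U PC as y) gives the reverse conversion in the same way.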
The best fit (ensemble median) is the ordinate (along the vertical axis). The fit procedure described above was repeated but this time the c.d.f. of U PC was taken at each fitted am value and the best fit, and its 2-sigma uncertainties, derived in the same way. This gives the corresponding polynomial fits to allow us to convert am into the corresponding U PC value.
Figure 2 is a demonstration of the important point made in Appendix B of Paper 1. The graphs are taken from the supporting information to Lockwood et al. (2019b), in which their derivation is explained in greater detail. This plot demonstrates the test of IMF orientation factors, A h , in solar wind-magnetosphere coupling functions that was devised by Vasyliunas et al. (1982) and evaluates the various proposed forms for A h . The black dots are based on the geomagnetic SML index data (the SuperMAG version of the auroral AL index but compiled from a northern hemisphere network of over 100 stations) and show a linear regression of SML/G against h, where G is the best-fit estimate of the power input into the magnetosphere from solar wind data but without an IMF orientation factor (i.e., G = P a /A h ) and h is the IMF "clock angle" in the GSM frame, h = arctan(|[B Y ] GSM |/[B Z ] GSM ). Note that this definition means that h is independent of the polarity of [B Y ] GSM and that h varies from zero for purely northward IMF in GSM ([B Z ] GSM = B YZ , where B YZ is the magnitude of the field in the YZ plane that is the same in the GSEQ, GSE, and GSM reference frames) to 180° for purely southward IMF in GSM ([B Z ] GSM = −B YZ ). It can be seen from Figure 2 that the optimum fit to the SML data is for A h = sin 4 (h/2) (the blue line). Almost identical plots to Figure 2 for the AE and am geomagnetic indices have been presented in Figure 4 of Bargatze et al. (1986) and Figure 9 of Lockwood (2019), respectively. The original formulation of the R-M effect by Russell & McPherron

Fig. 1. The southward interplanetary magnetic field (IMF) in the Geocentric Solar Magnetospheric (GSM) frame, [B Z ] GSM , as a function of its value in the Geocentric Solar Equatorial (GSEQ) frame, [B Z ] GSEQ , from a survey of 23 years' interplanetary and geomagnetic activity data (1995-2017, inclusive). The number of samples in 0.2 nT-by-0.2 nT bins, N, as a ratio of the total number, ΣN, is colour contoured as a function of [B Z ] GSM and [B Z ] GSEQ . The data are averages over 3-h intervals for 1995-2017, inclusive, which are the intervals over which each am value is compiled, shifted by an average optimum am response lag dt o = 1 h (see Sect. 3.3). (a) is for all data whereas (b-d) are for am exceeding, respectively, its 90%, 95% and 99% quantile for the years studied. The diagonal mauve lines are [B Z ] GSEQ = [B Z ] GSM .

Fig. 2. The test devised by Vasyliunas et al. (1982) of proposed IMF orientation factors, A h (h), where h is the IMF clock angle in the GSM frame: (orange line) sin 2 (h/2); (cyan line) sin 3 (h/2); (blue line) sin 4 (h/2); (mauve line) sin 5 (h/2); and (black line) U(h)cos(h), where U(h) = 0 for [B Z ] GSM > 0 and U(h) = 1 for [B Z ] GSM ≤ 0. The black dots are s(SML/G) + c, where s and c are the best-fit linear regression coefficients, SML is the SuperMAG westward electrojet index, and G = P a /A h , where P a is the best-fit estimate of the power input into the magnetosphere. The dark pink shaded area is the range over which h varies due to the R-M effect for [B Z ] GSEQ = 0 (i.e., IMF lying in the solar equatorial plane) and the range of variation of the dipole tilt over a full year caused by Earth's rotational axis tilt. The light pink area is for the full range caused by the annual variation plus the diurnal variation due to the offset of Earth's rotational and magnetic axes. The light and dark orange regions are the corresponding ranges for [B Z ] GSEQ = B (i.e., purely southward field normal to the solar equatorial plane).

Fig. 3. Simultaneous data from the years 2001 and 2002. In each case the data have been averaged into 365 equal-sized bins of time-of-year F (one day long for these non-leap years) and then a running mean taken over 27 bins to cover whole solar rotation intervals. From top to bottom: (a) the half-wave rectified dawn-to-dusk electric field, E DD , in interplanetary space; (b) the AU index; (c) the AL index; (d) the polar cap flux, W PC , from the DMSP satellite data; (e) the transpolar voltage, U PC , from the DMSP satellite data; (f) the reconnection efficiency, g 15 , assuming that the cross-sectional radius of the magnetosphere is 15 R E (where a mean Earth radius 1 R E = 6370 km); (g) the am geomagnetic index; and (h) the ratio of the am index to the transpolar voltage, am/U PC . In panels (d) and (e) for W PC and U PC , the red and blue lines are for the northern and southern polar caps, respectively.

Fig. 4. Scatter plot of daily means of the am index as a function of the corresponding daily mean transpolar voltage, U PC , as determined for 2001-2002 by DMSP satellites (principally F13) and normalised for the satellite track to an ideal 06-18 MLT path using the procedure of Lockwood et al. (2009). The error bars are plus and minus one standard error in the means. The mauve line is the best-fit 3rd-order polynomial and the grey area is bounded by the 2-sigma uncertainty level in that fit. The fitting procedure is given in the Appendix to this paper, along with polynomial expressions for the best fit and the uncertainty band edges. These can be used to estimate the am level associated with a given U PC ; the Appendix also gives the corresponding 3rd-order polynomials that allow computation of the U PC value associated with a given am value.
where U D is the voltage along the reconnection X-line (or X-lines) in the dayside magnetopause, where open flux is generated, and U N is the voltage along the reconnection X-line (or X-lines) in the cross-tail current sheet, where open flux is destroyed. We can apply equation (1) because the U PC data have been normalised to be along the 06-18 MLT line by using the procedure of Lockwood et al. (2009). By Faraday's law, applied to the open-closed field line boundary, the rate of change of the open flux in the polar cap (W O ) is $dW_O/dt = U_D - U_N \approx dW_{PC}/dt$ (2). Equation (2) is effectively a continuity equation for the open flux, W O . The equality with the rate of change of W PC is only approximate because the polar cap flux determined from the spacecraft data is the flux inside the convection polar cap, which is generally greater than the open flux W O ; however, the difference is generally small and the rate of change of the difference even smaller. By equation (2), the rise in W PC (dW PC /dt > 0) seen at about 22-05 UT implies either that U D has increased or that U N has decreased, but the former would, by equation (1), cause a rise in U PC whereas the latter would cause a fall.
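The continuity relation for the open flux described above (its rate of change equals the dayside reconnection voltage minus the nightside reconnection voltage) can be made concrete with a short numerical sketch. The function names and the example voltages are hypothetical and chosen only for illustration; the one physical fact used is that a voltage of 1 kV corresponds to a flux transfer rate of 1000 Wb/s.

```python
def open_flux_rate(u_d_kv, u_n_kv):
    """Rate of change of open flux in Wb/s from the dayside (U_D) and
    nightside (U_N) reconnection voltages in kV (1 kV = 1e3 Wb/s)."""
    return (u_d_kv - u_n_kv) * 1.0e3


def integrate_open_flux(w0_wb, u_d_kv, u_n_kv, dt_s):
    """Step the open flux forward with a simple Euler scheme.

    w0_wb is the initial open flux (Wb); u_d_kv and u_n_kv are
    equal-length sequences of voltages (kV), one pair per step of
    duration dt_s (s). Returns the flux at the start and after each step.
    """
    flux = [w0_wb]
    for u_d, u_n in zip(u_d_kv, u_n_kv):
        flux.append(flux[-1] + open_flux_rate(u_d, u_n) * dt_s)
    return flux


# Illustrative values: U_D = 60 kV and U_N = 40 kV sustained for one hour
# add (60 - 40) * 1e3 * 3600 = 7.2e7 Wb (0.072 GWb) of open flux.
history = integrate_open_flux(0.5e9, [60.0], [40.0], 3600.0)
```

In this sketch a rise in open flux can come from either a larger U D or a smaller U N , which is why the accompanying change in U PC is needed to discriminate between the two.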

Fig. 5. Time-of-year/time-of-day (F-UT) plots for transpolar voltage and polar cap flux for 2001-2002, inclusive, from the DMSP satellites. Data are sorted into 24 equal-width bins in F and 16 equal-width bins of UT using the central time of the polar cap traversal. The top panels (a-d) are for passes of the northern polar cap; the lower panels (e-h) are for passes of the southern polar cap. (a and e) show the transpolar voltage, U PC ; (b and f) show the polar cap flux, W PC ; (c and g) show the simultaneous am index value (linearly interpolated from the three-hourly values to the time of the centre of the polar cap crossing); and (d and h) show the number of data points, n bin , in each F-UT bin.

3.2 Amplification of the semi-annual variation and the equinoctial F-UT pattern

Paper 1 used the four ar indices, compiled by Chambodut et al. (2013) from stations in four 6-h magnetic local time (MLT) intervals around 06, 12, 18 and 24 MLT, to show that all display a semi-annual variation and an equinoctial F-UT pattern. It is interesting to note how the index compilation has influenced behavior. The ar indices are based on the range (between maximum and minimum) of the horizontal field component detected in 3-h intervals and are all, like the am index, dominated by substorm expansion phases and sawtooth events; in other words, by the unloading part of the storage-release magnetospheric response.

Fig. 7. The relationship of am and normalized power input to the magnetosphere P a /P o for large averaging timescales s: (a) s = 1 year and (b) s = 10 days. In (a) the black circles are for all data, for which the linear correlation coefficient is r = 0.97 and the best-fit OLS (Ordinary Least-Squares) linear regression, shown by the green line, is (P a /P o ) = s·am + c. Averages for the 8 UTs of the am index are shown by the squares, colored by the scale shown at the top of the figure: it can be seen that am is persistently a little smaller than the average response at 0-9 UT and persistently a little larger at 15-4 UT. In (b) the points are colored by the separation of time-of-year F from the value at the closest equinox, Dt eq . The linear correlation for all data is r = 0.96 and the best-fit linear regression is again the green line. The am response at low Dt eq (around the equinoxes) is persistently greater than at high Dt eq (around the solstices), showing that there is a contribution to the semi-annual variation in am that is not associated with that in (P a /P o ) and hence not directly attributable to the R-M effect on solar wind-magnetosphere coupling. The black and orange lines are, respectively, the best-fit linear regressions for around the equinoxes (Dt eq ≤ 0.125) and around the solstices (Dt eq > 0.125). The OLS linear regression coefficients and errors are given in Table 1.

Fig. 9. Distribution of response lags for the 288 F-UT bins studied for the response of the am index to the power input to the magnetosphere P a /P o for 1995-2017, inclusive. (Top) The red line shows the distribution of the lags giving peak correlation, dt p , between P a /P o and am for all the data. (Bottom) The same analysis but the am data have been further subdivided into the 8 other quantile ranges used in Figure 8. The distributions are taken in 1-min ranges and then smoothed with a 10-min running mean.
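The lag giving peak correlation between a driver series (such as P a /P o ) and a response series (such as am) can be found by a direct search over candidate lags. The sketch below is one simple way of doing this and is not taken from the paper: the function name is hypothetical, and it assumes two equally sampled, gap-free series of the same length, with the lag expressed in samples.

```python
import numpy as np


def peak_correlation_lag(driver, response, max_lag):
    """Return (lag, r) where lag in 0..max_lag (in samples) maximises the
    linear correlation between driver[t] and response[t + lag]."""
    driver = np.asarray(driver, dtype=float)
    response = np.asarray(response, dtype=float)
    lags = np.arange(max_lag + 1)
    r = np.empty(lags.size)
    for k, lag in enumerate(lags):
        x = driver[: driver.size - lag]  # driver values at time t
        y = response[lag:]               # response values at time t + lag
        r[k] = np.corrcoef(x, y)[0, 1]
    best = int(np.argmax(r))
    return int(lags[best]), float(r[best])
```

Applying such a search separately to the data in each F-UT bin would give one lag per bin, from which a distribution of lags like that in the figure can be built.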

Fig. 10. Top panel: plots of annual means of an index or variable x, <x> 1yr , for the years 1995-2017, inclusive, where x is: (a) the power input into the magnetosphere, P a ; (b) the am index (in black) and the number of substorm onsets derived from the SML index, N o (in green); (c) the ar indices for dawn (orange), noon (cyan), dusk (mauve) and midnight (blue). Middle and lower panels: year-F spectrogram plots of the means in 36 equal-sized bins of F, divided by the mean for that year, <x> F /<x> 1yr , where x is: (d) P a ; (e) am; (f) ar midnight ; (g) ar dawn ; (h) ar noon ; and (i) ar dusk . The data have been smoothed by applying a 3-point running mean to the time series of means in the bins of width (1/36) yr.

Figure 12. The left-hand plots in Figure 12 are for am and the right-hand plots are for P a /P o . The top row shows the variations with F of the means for all years for all data (thick black line), for data with [B Y ] GSEQ > 0 (thin line joining open circles) and for data with [B Y ] GSEQ < 0 (thin line joining solid triangles). The results now show a significant difference in the behaviour for am and for P a /P o . For P a /P o (Fig. 12b) both the [B Y ] GSEQ > 0 and [B Y ] GSEQ < 0 cases have almost sine-wave forms that are in antiphase. In fact, the positive deflection (at the equinox for which the [B Y ] GSEQ polarity is favoured) is only very slightly larger than the negative deflection (at the equinox for which the [B Y ] GSEQ polarity is unfavoured). The near-perfect antisymmetry between the variations for the two polarities means that they almost cancel, and so the average for all data has only a very weak semi-annual variation. Contrast this with the corresponding variations for am (Fig. 12a), for which the positive deflections at the favoured equinox in the single-polarity [B Y ] GSEQ curves are considerably larger than the negative deflections at the unfavoured equinox. As a result, the variation for all data (thick black line) has a much more marked semi-annual variation. It is this much larger difference between the results at any one equinox between the "favourable" and "unfavourable" polarities of IMF [B Y ] GSEQ that causes the amplification of the semi-annual variation in the am geomagnetic activity index, compared to that in P a /P o . This demonstrates that although the R-M effect is working, it is not working in quite the way that is commonly thought, which is the way that was envisaged by Russell & McPherron

Fig. 11. Identification of the Russell-McPherron effect using the polarity of the IMF [B Y ] GSEQ component. The left-hand column is for all data, the middle column for [B Y ] GSEQ > 0 and the right-hand column for [B Y ] GSEQ < 0. The top row shows annual means of am and P a divided by their overall means for the whole interval: respectively, <am> 1yr /<am> all (mauve lines) and <P a > 1yr /P o (black lines). The middle row shows the variations of P a /P o as a function of fraction-of-year, F, and year, <P a > F /<P a > 1yr , and the bottom row the same for am, <am> F /<am> 1yr . As for Figure 10, data are for 1995-2017 and have been averaged into 36 equal-width bins of F in each year, and a 3-point running mean applied to the time series to smooth the data.

Fig. 13. Plots of (top) the probability distribution functions of (P a /P o ) as a function of F and (bottom) the am amplification factor, (am/<am> all )/(P a /P o ), as a function of F and in the same (P a /P o ) bins as the p.d.f.s in the top panels. The left-hand panels are for IMF [B Y ] GSEQ < 0; the right-hand panels are for IMF [B Y ] GSEQ > 0.

Fig. 15. Analysis of 3-hourly am data. (a) The grey points form a scatter plot of am against 3-hourly means of power input into the magnetosphere, P a /P o , generated by averaging over 3-h intervals that are shifted forward in time by the derived best lag of dt o = 60 min (see Fig. 7a) relative to the three-hourly intervals in which am is evaluated. The orange points are the mean values averaged in 1% quantile ranges of P a /P o , i.e., q(0) ≤ <P a /P o > s=3h < q(0.01), q(0.01) ≤ <P a /P o > s=3h < q(0.02), up to q(0.99) ≤ <P a /P o > s=3h < q(1). The black error bars are plus and minus one standard deviation in those means. The mauve line is the best-fit OLS linear regression to the 3-hourly data. (b) The grey points are a scatter plot of the fit residuals Dam for the fit shown in (a): this is the difference between each three-hourly am value and the best-fit linear regression value based on the corresponding <P a /P o > s=3h value (Dam = am − am fit ), plotted as a function of the simultaneous three-hourly mean of the normalized solar wind dynamic pressure <p SW > s=3h /<p SW > all . The orange points in (b) are means in 1% quantile ranges of <p SW > s=3h and error bars are plus and minus one standard deviation in the mean Dam. The mauve line is the best linear regression to the 3-hourly values.
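The two steps behind a figure of this kind, a linear OLS fit with its residuals and means taken in equal-population quantile ranges, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names are hypothetical, and it assumes continuous-valued data so that no quantile range is empty.

```python
import numpy as np


def ols_residuals(x, y):
    """Fit y = slope * x + intercept by ordinary least squares and
    return (residuals, slope, intercept), with residuals = y - fit."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept), slope, intercept


def quantile_bin_means(x, y, n_bins=100):
    """Mean of x and of y in n_bins equal-population quantile ranges of x
    (n_bins = 100 gives the 1% quantile ranges described in the caption)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))
    # Index of the quantile range containing each sample; the maximum
    # value is assigned to the last range.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_bins - 1)
    x_mean = np.array([x[idx == k].mean() for k in range(n_bins)])
    y_mean = np.array([y[idx == k].mean() for k in range(n_bins)])
    return x_mean, y_mean
```

In the terms of the caption, `x` would be the lagged 3-hourly means of P a /P o and `y` the am values for panel (a); for panel (b), `x` would be the normalized dynamic pressure and `y` the residuals returned by `ols_residuals`.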
Fig. 16. F-UT pattern plots of the fit residuals Dam in Figure 15b for 1995-2017, inclusive. In each case the plots are constructed using hourly means of the solar wind dynamic pressure p SW , and the three-hourly fit residuals Dam are interpolated to the mid-point of those hourly intervals. All plots are then smoothed with a 1-3-1 triangular weighting filter in both the F and UT dimensions. (a) The overall mean value of Dam. (b) The mean Dam for the lower tercile of the simultaneous solar wind dynamic pressure, i.e., <Dam> F,UT for q(0) ≤ <p SW > s=1h < q(0.33). (c) The mean Dam for the middle tercile of the simultaneous solar wind dynamic pressure, i.e., <Dam> F,UT for q(0.33) ≤ <p SW > s=1h < q(0.67). (d) The mean Dam for the upper tercile of the simultaneous solar wind dynamic pressure, i.e., <Dam> F,UT for q(0.67) ≤ <p SW > s=1h < q(1). Note that in parts (b-d) the same color scale is used to emphasize that the equinoctial pattern, although present for all three ranges of p SW , has an amplitude that increases with p SW and is much larger for the largest p SW values. This relationship is further studied in Figure 17.

Fig. 17. The amplitude of the equinoctial pattern as a function of normalised solar wind dynamic pressure. The pattern amplitude is quantified by the standard deviation of the mean values of Dam in the 288 F-UT bins used to construct patterns like those shown in Figure 10, r(<Dam> F,UT ). Patterns were constructed for 20 quantile ranges of the hourly mean solar wind dynamic pressure p SW , and r(<Dam> F,UT ) for each is plotted as a function of <p SW > q /<p SW > all , where <p SW > q is the mean dynamic pressure in the quantile range and <p SW > all is the mean for all the data (which are for 1995-2017). The grey and white bands define the 20 quantile ranges of p SW employed. There are a total of 192,719 valid hourly means of p SW in the dataset and so each of the 20 quantile ranges contains about 9636 samples.
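The amplitude measure described in this caption, the standard deviation of the bin-mean residuals over the F-UT grid, evaluated separately in quantile ranges of dynamic pressure, can be sketched as below. This is an illustration only. The function names are hypothetical, and the split of the 288 bins into 36 bins of F by 8 bins of UT is an assumption (the caption gives only the total); empty bins are simply ignored.

```python
import numpy as np


def pattern_amplitude(f, ut_hours, resid, n_f=36, n_ut=8):
    """Standard deviation of the mean residual over the populated F-UT bins.

    f is fraction of year in [0, 1), ut_hours is UT in [0, 24), resid is
    the fit residual. The 36 x 8 grid is an assumed layout of 288 bins.
    """
    f = np.asarray(f, dtype=float)
    ut_hours = np.asarray(ut_hours, dtype=float)
    resid = np.asarray(resid, dtype=float)
    i_f = np.minimum((f * n_f).astype(int), n_f - 1)
    i_ut = np.minimum((ut_hours / 24.0 * n_ut).astype(int), n_ut - 1)
    flat = i_f * n_ut + i_ut  # single index for each F-UT bin
    sums = np.bincount(flat, weights=resid, minlength=n_f * n_ut)
    counts = np.bincount(flat, minlength=n_f * n_ut)
    filled = counts > 0
    return float(np.std(sums[filled] / counts[filled]))


def amplitude_by_quantile(f, ut_hours, resid, p_sw, n_q=20):
    """Pattern amplitude in each of n_q equal-population ranges of p_sw."""
    f, ut_hours, resid, p_sw = (np.asarray(a, dtype=float)
                                for a in (f, ut_hours, resid, p_sw))
    edges = np.quantile(p_sw, np.linspace(0.0, 1.0, n_q + 1))
    idx = np.clip(np.searchsorted(edges, p_sw, side="right") - 1, 0, n_q - 1)
    return np.array([pattern_amplitude(f[idx == k], ut_hours[idx == k],
                                       resid[idx == k]) for k in range(n_q)])
```

Using the standard deviation of the bin means as the amplitude has the advantage of not presupposing the shape of the pattern, although sampling noise in sparsely populated bins will contribute to it.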

Fig. 18. Analysis of the contributions to the (a) semi-annual and (b) UT variations of the am index observed in the interval 1995-2017, inclusive. Mean values are shown as a function of (a) time-of-year F and (b) UT of: (black lines) the observed am data; (orange lines) am fit , the best fit of power input to the magnetosphere, P a /P o , to am; (mauve lines) the fit residual Dam for the upper tercile of the solar wind dynamic pressure, q(0.67) < p SW ≤ q(1); (blue lines) the fit residual Dam for the middle tercile of the solar wind dynamic pressure, q(0.33) < p SW ≤ q(0.67); and (cyan lines) the fit residual Dam for the lower tercile of the solar wind dynamic pressure, q(0) < p SW ≤ q(0.33).
… 427 U PC − 2.13, (A.1)
where U PC is in kV and am is in nT. The upper 2-sigma level is [am] max = 9.15 × 10^−5 … in Figure A.1 we present the same data but plotted with am as the abscissa (along the horizontal axis) and U PC as

Fig. A.1.Scatter plot of daily means of the transpolar voltage U PC , as determined for 2001-2002 by the DMSP satellites and normalised for the satellite track to an ideal 06-18 MLT path using the procedure of Lockwood et al. (2009), as a function of the means of the am index for the same day.The error bars are plus and minus one standard error in the means.The mauve line is the best-fit 3rd order polynomial fit of U PC and the grey area is bounded by the 2-sigma uncertainty level in the fit.

Table 1. Coefficients and their 2-sigma errors for the ordinary least squares (OLS) linear regression fits shown in Figure 7b, for 10-day means of P a /P o and am, where (P a /P o ) fit = s·am + c. Column headings: Error in s, n s ; Intercept, c; Error in c, n c .

Fig. 14. Mean values of am as a function of normalised power input into the magnetosphere, P a /P o (horizontal axis), and normalised solar wind dynamic pressure, p SW /<p SW > all (vertical axis). The mean values of am are evaluated in bins of width 0.15 in both P a /P o and p SW /<p SW > all , and pixels are colour-contoured according to the scale shown. Only bins containing at least 5 samples are considered.
The best fit is

Cite this article as: Lockwood M, McWilliams KA, Owens MJ, Barnard LA, Watt CE, et al. 2020. Semi-annual, annual and Universal Time variations in the magnetosphere and in geomagnetic activity: 2. Response to solar wind power input and relationships with solar wind dynamic pressure and magnetospheric flux transport. J. Space Weather Space Clim. 10, 30.