A homogeneous aa index: 2. hemispheric asymmetries and the equinoctial variation

Paper 1 [Lockwood et al., 2018] generated annual means of a new version of the $aa$ geomagnetic activity index which includes corrections for secular drift in the geographic coordinates of the auroral oval, thereby resolving the difference between the centennial-scale change in the northern and southern hemisphere indices, $aa_N$ and $aa_S$. However, other hemispheric asymmetries in the $aa$ index remain: in particular, the distributions of 3-hourly $aa_N$ and $aa_S$ values are different and the correlation between them is not high on this timescale ($r = 0.66$). In the present paper, a location-dependent station sensitivity model is developed using the $am$ index (derived from a much more extensive network of stations in both hemispheres) and used to reduce the difference between the hemispheric $aa$ indices and improve their correlation (to $r = 0.79$) by generating corrected 3-hourly hemispheric indices, $aa_{HN}$ and $aa_{HS}$, which also include the secular drift corrections detailed in Paper 1. These are combined into a new, 'homogeneous' $aa$ index, $aa_H$. It is shown that $aa_H$, unlike $aa$, reveals the 'equinoctial'-like time-of-day/time-of-year pattern that is found for the $am$ index.


The aa and am indices
As discussed in Paper 1, the aa index was devised by Mayaud (1971, 1972, 1980) to give a continuous, well-calibrated and homogeneous record of geomagnetic activity that extends back to 1868. It uses just two stations at similar geomagnetic latitudes, one in each hemisphere, and averaging the data from them gives, to a large extent, cancellation of the seasonal variation in the geomagnetic response to solar forcing that is seen at either one of the stations individually. Figure 1a shows that this is effectively achieved for the ''classic aa'' (i.e., the official aa index generated by EOST (École et Observatoire des Sciences de la Terre), as available from ISGI (International Service of Geomagnetic Indices, http://isgi.unistra.fr/) and other data centers around the world). The aa indices show the well-known semi-annual variation in geomagnetic activity (Cortie, 1912; Chapman & Bartels, 1940; Cliver et al., 2002; Le Mouël et al., 2004), with equinoctial peaks in average values: this can be seen in Figure 1a for the northern hemisphere index, $aa_N$ (in red), for the southern hemisphere index, $aa_S$ (in blue) and for the average of the two, aa (in black). The average annual variation is, however, different in $aa_S$ and $aa_N$, such that in northern-hemisphere winter (i.e., around time-of-year F = 0, which is defined to be at midnight between 31 December and 1 January and so is equivalent to F = 1), $\langle aa_N \rangle$ and $\langle aa_S \rangle$ (and therefore $\langle aa \rangle$) are very similar. However, in northern-hemisphere summer (F around 0.5), $\langle aa_N \rangle$ is considerably greater than $\langle aa_S \rangle$. This difference is averaged out in $\langle aa \rangle$, such that only in $\langle aa \rangle$ is the December minimum the same depth as the June minimum, showing that the annual and seasonal variations have been averaged out leaving only the semi-annual variation, with its peaks near the equinoxes.
All plots in Figure 1 are restricted to data from the years 1959-2017 so that they can be compared to the am indices, the average variations of which are shown in the second row of the figure. The am index (Mayaud, 1980) is, like aa, a 3-hourly range index (i.e., based on the range of variation in each 3-hour interval) but compiled using area-weighted means of data from rings of mid-latitude stations, currently with 11 in the northern hemisphere and 10 in the southern. It is also compiled by ISGI (and collaborating institutes) who make available the northern hemisphere index, an, the southern hemisphere index, as, and am = (an + as)/2 for 1959 to the present day. The annual variations of $\langle am \rangle$, $\langle an \rangle$ and $\langle as \rangle$ are shown in Figure 1c (in black, red and blue, respectively), which shows that the behaviour is very similar indeed to that for the aa indices in Figure 1a. Therefore, at least in terms of its variation over the year, the aa index certainly succeeds in its aim of replicating an equivalent index derived using a more extensive array of observatories.
The two aa stations are also roughly 10 h apart in local time and it was hoped in the construction of aa that this would largely cancel out the diurnal variation at the two stations. Comparison of Figures 1b and 1d shows that this is considerably less well achieved for aa than it is for am. The distribution of am stations with longitude in each hemisphere is not ideal, which introduces a small spurious UT variation; however, this is very much smaller than for aa, which has only one station in each hemisphere. The diurnal variations in $\langle aa_N \rangle$ and $\langle aa_S \rangle$ are not quite in antiphase, nor are they exactly the same in amplitude or waveform: as a result, $\langle aa \rangle$ shows considerable average diurnal variation (Fig. 1b). On the other hand, the use of rings of longitudinally-spaced stations to construct am has suppressed the diurnal variations in both $\langle an \rangle$ and $\langle as \rangle$ (Fig. 1d) such that average am is almost constant with UT.

Time-of-day/time-of-year response patterns
The mean values for a given time-of-year (F) in the left-hand plots of Figure 1 are averaged over all times of day (UT), and the mean values at a given UT in the right-hand plots are averaged over all F. In general, we are concerned with the full time-of-day/time-of-year (UT-F) patterns of variation of the geomagnetic responses. The top row of Figure 2 shows the three main UT-F patterns predicted from geometric considerations of solar-terrestrial interactions and the bottom row of Figure 2 shows an example of each type of pattern, as seen in averages of observations, either in near-Earth space or in geomagnetic activity (after Lockwood et al., 2016).
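As an illustration of how such a UT-F pattern of mean values can be assembled from a 3-hourly index, the following is a minimal sketch. The function names and the choice of 20 time-of-year bins are assumptions made for this example; the 8 UT bins follow from the 3-hourly cadence of the index.

```python
from collections import defaultdict
from datetime import datetime, timezone

N_UT_BINS = 8    # a 3-hourly index gives 8 UT bins per day
N_F_BINS = 20    # assumed number of time-of-year bins (illustrative choice)

def fraction_of_year(t):
    """Time-of-year F in [0, 1), with F = 0 at 00 UT on 1 January."""
    start = datetime(t.year, 1, 1, tzinfo=timezone.utc)
    end = datetime(t.year + 1, 1, 1, tzinfo=timezone.utc)
    return (t - start) / (end - start)

def ut_f_pattern(samples):
    """Mean index value in each (UT bin, F bin) cell.

    samples: iterable of (timezone-aware UTC datetime, index value) pairs.
    Returns a dict keyed by (ut_bin, f_bin); empty cells are absent.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t, value in samples:
        ut_bin = t.hour // 3
        f_bin = min(int(fraction_of_year(t) * N_F_BINS), N_F_BINS - 1)
        sums[(ut_bin, f_bin)] += value
        counts[(ut_bin, f_bin)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}
```

Averaging the returned cells over all UT bins at fixed F, or over all F bins at fixed UT, gives the one-dimensional annual and diurnal variations of the kind plotted in Figure 1.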
All three patterns arise from geometrical considerations associated with Earth's orbit around the Sun, combined (in the first two cases, at least) with the effects of Earth's rotation. The ''Russell-McPherron'' (R-M) pattern (Fig. 2a) arises from considering the changes in the angle between the GSM (Geocentric Solar Magnetospheric) and GSE (Geocentric Solar Ecliptic) reference frames (Russell & McPherron, 1973); the equinoctial pattern (Fig. 2b) arises from considering the angle between the solar wind direction and Earth's magnetic axis (Bartels, 1925; McIntosh, 1959); and the axial pattern (Fig. 2c) arises from the variation in Earth's heliographic latitude (Cortie, 1912) and also from the annual variation of the angle between the heliocentric Radial-Tangential-Normal (RTN) and geocentric GSE reference frames (Lockwood et al., 2016). All three predict peaks in geomagnetic activity at or near the equinoxes (but different UT dependencies). Figure 2d demonstrates that the R-M effect is seen in the average (half-wave rectified) southward component of the IMF in the GSM frame (O'Brien & McPherron, 2002), which is well understood to be the main driver of geomagnetic activity. However, neither of the geomagnetic indices shown in Figure 2, am and Dst, displays the R-M pattern.
The idea behind the axial effect is that near the equinoxes, Earth is at slightly higher heliographic latitudes, which increases the probability of it leaving the streamer belt and encountering the fast solar wind (Hundhausen et al., 1971), especially at solar minimum (McComas et al., 2008): hence in this case there is no effect of Earth's rotation and so no UT variation. There is a second annual geometric effect associated with the variable difference between the GSE and heliocentric RTN reference frames: this effect is in antiphase with the heliographic latitude effect, favouring solstices over the equinoxes in terms of giving southward IMF and hence geomagnetic activity. It also has no UT variation but is relatively small. The axial effect appears to be present in the Dst index (as shown in Fig. 2f, where Dst has been corrected for the longitudinal inhomogeneity in the ring of equatorial stations using the procedure of Takalo & Mursula, 2001). However, Lockwood et al. (2016) point out that Dst is not responding to the variation in Earth's heliographic latitude; rather, the long duration of large Dst responses (storms) to southward IMF (in the GSM frame) smooths out the UT variations seen in Figure 2d, giving an axial-like behaviour.
The UT-F pattern seen in the am index in Figure 2e has similarities to the equinoctial pattern in Figure 2b, although it is not an exact match and there are elements of all three patterns in the am response (Cliver et al., 2000; Chambodut et al., 2013). The equinoctial element indicates that the tilt of the Earth's rotational and/or magnetic axes towards or away from the Sun has an influence, introducing differences between the two solstices and between 4 UT and 16 UT which are not predicted by the R-M effect (O'Brien & McPherron, 2002). There have been a number of explanations proposed for this observed equinoctial pattern. These include tilt-induced changes in the ionospheric conductivity within the nightside auroral electrojet of the substorm current wedge and the postulate (as yet unproven and somewhat counter-intuitive) that electrojet currents are stronger when conductivities caused by solar extreme ultraviolet (EUV) are low in both midnight-sector auroral ovals (Lyatsky et al., 2001); tilt influence on the magnetopause reconnection voltage (Crooker & Siscoe, 1986; Russell et al., 2003); the effect of tilt on the proximity of the ring current and auroral electrojet (Alexeev et al., 1996); and tilt effects on the stability of the cross-tail current sheet (Kivelson & Hughes, 1990; Danilov et al., 2013).
Finch et al. (2008) used a global network of geomagnetic stations to show that the equinoctial behaviour originates during substorm expansion phases and in the substorm current wedge and is not a feature of dayside currents and flows during the substorm growth phase. (These authors showed that the dayside currents do not depend on UT and vary only with season, being greater in summer when conductivities are higher.) The results of Finch et al. (2008) therefore strongly support the explanations of the equinoctial effect invoking nightside magnetospheric or ionospheric effects rather than those that postulate modulation of the magnetopause reconnection voltage. Note also that indices influenced by the substorm current wedge also depend on the solar wind dynamic pressure ($P_{SW} = m_{SW} N_{SW} V_{SW}^2$, where $m_{SW}$ is the mean ion mass, $N_{SW}$ the number density and $V_{SW}$ the speed of the solar wind), because it compresses the near-Earth geomagnetic tail and so modulates the near-Earth cross-tail current there for a given open magnetic flux content in the tail (Lockwood, 2013): Finch et al. (2008) showed that a $V_{SW}^2$ dependence was present in the equinoctial pattern response but not in the directly-driven dayside response. As discussed in Paper 1, mid-latitude range indices respond primarily to the substorm current wedge, and so the results of Finch et al. (2008) explain why it is the am index that displays the equinoctial pattern most clearly.

The aims of the present paper
In the present paper, we employ the concept introduced by Finch (2008) of the sensitivity $S_o$ of a mid-latitude geomagnetic observatory to solar wind forcing, which depends on its location (geomagnetic and geographic latitudes), its Magnetic Local Time (MLT) (and hence the UT) and on the time of year, F. Our motivation is to remove effects caused by the geomagnetic and geographic coordinates of the site and so homogenise the aa index on sub-annual timescales, such that $aa_N$ and $aa_S$ are more highly correlated and have distributions of values that are more alike.
This also allows us to evaluate how the equinoctial time-of-year/time-of-day pattern should appear in the aa data once the station location effects are accounted for. Chambodut et al. (2013) have mapped the am index data into 4 MLT sectors and they show that the equinoctial pattern is present in the am data from each one. It is weakest in the noon sector, particularly at 0-9 UT. However, the fact that it can be detected over such a wide range of MLTs and UTs indicates that some information about the equinoctial variation should be available in the 2-station aa index, if the spurious diurnal variation caused by having only one station in each hemisphere can be removed. We develop Finch's numerical model of the stations' sensitivities by comparing the time-of-day/time-of-year pattern of response for various stations to that of the am index. Our aim (as in Paper 1) is to reduce the difference between the hemispheric aa indices on 3-hourly and daily timescales so that we have greater confidence that the average of the two is a representative index of the response of the global magnetosphere-ionosphere-thermosphere system to events of enhanced solar wind forcing. This would make the 150-year record of major storms from aa data much more reliable and give a more reliable rank order of the severity of major geomagnetic disturbance events. As a test of this, in the present paper we study the extent to which aa can reproduce the equinoctial variation that is found in equivalent range indices from more extensive and evenly-distributed networks of observatories. We show that allowing for this modelled station sensitivity can (along with the long-term recalibration described in Paper 1) effectively remove the spurious diurnal variation and known hemispheric asymmetries in the classic aa index and reveal the equinoctial pattern in the aa index.
The bottom row in Figure 1 shows the corresponding averages of the new, ''homogenized'' aa indices ($aa_{HN}$, $aa_{HS}$, and $aa_H = (aa_{HN} + aa_{HS})/2$) that are developed in the subsequent sections of the present paper. It can be seen that the difference in the average annual variation of $aa_N$ and $aa_S$ has almost been eliminated (leaving only a small seasonal variation with summer means slightly greater than winter ones at both solstices and not just around the June solstice), as has most of the difference in their average diurnal variations, such that average $aa_H$ is almost independent of UT, even though it is compiled from just two stations.

Methodology
Finch (2008) introduced the concept of the location-dependent magnetometer station sensitivity, $S_o$, defined for a given type of single-station geomagnetic activity measure by
$$S_o = G_A / I_S \qquad (1)$$
where $G_A$ is the geomagnetic activity measure in question and $I_S$ is a measure of the input solar forcing, which includes the effects of both induced currents in near-Earth space and of conductivity changes due to variations in the ionizing EUV and X-ray radiations from the Sun or particle precipitations. Finch considered $S_o$ to be a function of the instrument co-ordinates only because instrument and local site characteristics are accounted for by other inter-calibration procedures. By taking ratios of $G_A$ seen simultaneously at many pairs of different stations, the $I_S$ factor is cancelled and the ratios of the station sensitivities are known. If the data from different stations are combined into a geomagnetic index using linear mathematics (such as taking an average) then the sensitivities are similarly combined. From comparisons of these ratios for many pairs of stations, Finch (2008) derived a functional form for computing the sensitivity of a station as a function of its geographic coordinates, date, time-of-year and time-of-day (equation (2), with the MLT of peak response, T*, given by equation (3)). In these equations, A and B are constants, v is the solar zenith angle, T is the MLT of the station (in hours), F is the fraction of the year and $F = F_1$ at the spring equinox (taken to be 100/365.25 for the northern hemisphere and 283/365.25 for the southern hemisphere). Lastly, m is a normalising factor that ensures that the average value of $S_o$, over all times-of-day (UT) and all times-of-year (F), is unity for a given station and year: it is used to retain calibrations that allow for instrument characteristics and local site effects. The first term on the right of equation (2) allows for the effect of solar zenith angle v on the ionospheric conductivity over the station due to solar EUV and X-ray radiation and thus depends on the station's geographic latitude, the time-of-day and the time-of-year (see discussion at the end of this section about the choice of computing v at the location of the magnetometer station).
If the Sun is below the horizon, v is set to $\pi/2$: hence the coefficient A controls the extent to which the effect of dayside conductivity at a given v is enhanced over residual nightside values. Note that there are small changes to the precise formulation of Finch (2008), who used a $\cos^{0.5}(v)$ dependence, as predicted by Chapman production-layer theory and also used in a great many prior applications. However, Ieda et al. (2014) show that a conductivity dependence on $\cos^{0.7}(v)$ fits better with observations and is also predicted by theory when the upward gradient of the neutral atmospheric scale height is accounted for.
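The zenith-angle dependence described above can be sketched in a few lines. This shows only the cosine-power factor with the below-horizon cap; it is not the full sensitivity expression of equation (2), and the function name and default argument are assumptions made for this example.

```python
import math

def zenith_angle_factor(zenith_rad, exponent=0.7):
    """Cosine-power dependence on solar zenith angle (radians).

    The angle is capped at pi/2, so the factor is effectively zero
    whenever the Sun is below the horizon. exponent=0.7 follows the
    Ieda et al. (2014) dependence quoted in the text; exponent=0.5
    gives the Chapman-layer form.
    """
    capped = min(zenith_rad, math.pi / 2)
    # max() guards against a tiny negative cosine from rounding
    return max(math.cos(capped), 0.0) ** exponent
```

With the Sun overhead the factor is 1, and it falls to (effectively) zero at and beyond the terminator, so any coefficient multiplying it only contributes on the dayside.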
The second term on the right of equation (2) is the station's sensitivity due to its distance from the location of peak response, which is at an MLT of T* in the midnight sector. The sine term in equation (3) is used to model the known earlier onset of enhanced substorm activity in summer (which is likely to also be a conductivity effect). Equation (3) yields T* of 1 h MLT and 22 h MLT for the winter and summer solstices, respectively. This is based on the survey of mid-latitude station responses to substorm expansion phases by Finch (2008) and agrees well with the results of Liou et al. (2001), who found substorm onset was typically at T = 22 h in summer but 23.5 h in winter. Similar behaviour was deduced by Wang et al. (2007). We note that we are most interested in the MLT where auroral electrojet currents have peak effect on mid-latitude K indices: this is close to, but not the same as, the MLT of onset (Clauer & McPherron, 1974; Chu et al., 2014). Finch (2008) assumed that the factors A and B were constants and had considerable success in modelling the average response of different stations and indices. However, there are reasons to also think that the relative importance of the two terms in equation (2) might change systematically with the level of geomagnetic activity. Firstly, particle precipitation fluxes are higher during enhanced activity over a wide range of locations (including mid-latitudes; e.g., Shiokawa et al., 2005), which could lead to the relative contribution of photon-induced conductivity, and hence the dependence on $\cos^{0.7}(v)$, becoming weaker: hence the factor A might be reduced at higher activity levels. Secondly, the auroral oval expands equatorward when activity is enhanced, making the second factor (associated with the spatial proximity of the auroral electrojet) more important. This could have a number of effects.
The factor B sets the amplitude of the diurnal variation seen by the station because of the variation in its proximity to the peak of the substorm current wedge. For these reasons the factors A and B are here treated as functions of the geomagnetic activity level.
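The seasonal shift of the MLT of peak response can be illustrated with a simple sinusoid. The functional form below is an assumption: it is one sine-based expression that reproduces the stated values of T* (22 h MLT at the summer solstice and 1 h MLT at the winter solstice) and should not be read as the exact form of equation (3). The spring-equinox values of $F_1$ are those quoted in the text.

```python
import math

F1_NORTH = 100 / 365.25   # spring-equinox time-of-year, northern hemisphere
F1_SOUTH = 283 / 365.25   # spring-equinox time-of-year, southern hemisphere

def peak_response_mlt(F, F1):
    """Assumed sinusoidal form for T* in hours MLT.

    Gives 22 h at the summer solstice (F = F1 + 0.25), 23.5 h at the
    equinoxes and 1 h at the winter solstice (F = F1 + 0.75).
    """
    return (23.5 - 1.5 * math.sin(2 * math.pi * (F - F1))) % 24
```

Under this assumed form the peak response moves about 3 h earlier in MLT between winter and summer, which is the sense of the seasonal shift described above.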
Note that the Finch (2008) model employs the photon-induced conductivity above the station, which may not be the most appropriate location given that the majority of the current flows along the auroral oval in the auroral electrojet. We investigated this using three different locations at which the solar zenith angle v was evaluated, namely: (A) the nominal auroral oval latitude at the same MLT as the station; (B) the location of the station; and (C) midway between these two. The goodness-of-fit metric (the root-mean-square (r.m.s.) deviation, $D_{RMS}$) was very similar in all three cases (for the aa index $D_{RMS}$ was 0.122, 0.116, and 0.118 for options (A), (B) and (C) respectively, whereas assuming the station sensitivity was a constant gave $D_{RMS}$ = 0.132) but the small differences give a preference ranking order of (B), (C), then (A). We here use option (B), largely because it avoids using a nominal latitude of the auroral oval rather than because it gives a better fit (the differences between the three cases being minimal and not statistically significant).
That option (B) performs at least as well as the others can be understood physically by thinking about the extremes of conductivity production in the auroral oval and considering the auroral electrojet to be linked to a pair of filamentary field-aligned currents (upward and downward at its westward and eastward ends, respectively) in the current wedge. If the conductivity were purely due to particle precipitation, the current along the oval would be a pure Cowling current (i.e., the Hall current is suppressed) along the oval where the precipitation is enhancing the conductivity. In this case, there would be no solar zenith angle dependence. If the conductivity were purely generated by solar photons, it would be enhanced both inside and outside the auroral oval. In this case, the Pedersen current (and therefore electric field and Hall current) would spread out in latitude from the line connecting the two filamentary currents (see, for example, Fig. 3 of Southwood, 1987). If the conductivity were spatially homogeneous, this spreading would be symmetric to the north and to the south of the oval; however, in reality it will be preferentially on the low-latitude side of the oval where v is lower. Adding to this the distance-squared decrease in the effect on the field at the station that is inherent in the Biot-Savart law, it is clear that the most relevant photon-induced conductivity would be equatorward of the oval and so closer to the station than is the auroral oval. Hence using the location of the station (option B) is a reasonable way of quantifying the photon-enhanced conductivity effect.

Derivation of the coefficients A and B for sensitivity modelling of the aa stations
To complete the set of equations used in this paper to compute S o for a given station at a given F, UT and year, we derive empirical expressions for A and B, here quantifying the level of geomagnetic activity by the aa index (after implementation of the corrections for the effect of the secular change in the geomagnetic field, as detailed in Paper 1) so that we can use the equations to correct all aa values back to 1868.
Our approach is to assume that the spatial distribution of the am observing stations is ideal, so we neglect any influence of limitations to the am network on the UT-F pattern shown in Figure 2e. This is an assumption, but as am is by far the most homogeneous and most global range-based index that we have, it is an assumption that has been made, often tacitly, in a great number of previous studies. This being the case, the UT-F pattern for the ratio of any index, divided by the simultaneous am value, reveals (at least to first order) the UT-F pattern for the sensitivity of that index, the solar forcing term in equation (1) having been cancelled. Figures 3a-3c show the average UT-F patterns for $aa'_N/am$, $aa'_S/am$ and $aa'/am$. The prime denotes that the correction for the secular drift, as developed in Paper 1, has been applied: the corrected values are computed from the ''classic'' hemispheric aa indices, $aa_N$ and $aa_S$, using the time-dependent scaling factor s(d) for the station in question and the daisy-chained calibration factors $a_c$ and $b_c$ that make all corrected aa values consistent with the am index for 2002-2009. Note also that in Paper 1, annual means of $aa'_N$, $aa'_S$ and $aa'$ were referred to as $aa_{HN}$, $aa_{HS}$ and $aa_H$, respectively. This nomenclature applies to the annual means because in the present paper we amend 3-hourly $aa'_N$, $aa'_S$ and $aa'$ values (to 3-hourly values we call $aa_{HN}$, $aa_{HS}$ and $aa_H$) in such a way that their annual means remain unchanged. Figure 3d gives the pattern for the north-south anisotropy in $aa'$, $(aa'_N - aa'_S)/(aa'_N + aa'_S)$. We use all available data between 1959 and 2017 to keep the numbers of samples in each (UT, F) bin as high as possible. The data used to generate the example shown in Figure 3 are for all data points (since 1959) giving an $aa'$ index value that was relatively large (in the range $70 \le aa' < 110$ nT, for which the mean $aa'$ is 88.15 nT).
Figure 4 shows the UT-F patterns of the best fit to Figure 3 of the modelled sensitivity from the implementation of the Finch (2008) model used here, as given by equations (2) and (3). In Figure 4, the solar zenith angle and MLT are evaluated for the relevant station location, UT and F and for the year 1998, which is the midpoint of the data interval used in Figure 3. The coefficients A = 0.11 and B = 0.28 were derived by iteration using the Nelder-Mead search method to minimise the mean square of the deviation of each of the 640 pixels in Figure 4 from its corresponding pixel in Figure 3. The number 640 arises from the use of the 8 UT bins of the aa and am indices with our choice of 20 F bins, so each panel contains 160 pixels and there are 4 panels. Note that all pixels in all four patterns are given equal weight by this procedure. The relatively low value of A in this case means that the peak associated with the conductivity term in equation (2) is modest: this peak appears at the minima of the solar zenith angle v, which is near 17 UT and F = 0.5 for the northern hemisphere aa station and near 7 UT and F = 0 (and hence also F = 1) for the southern hemisphere aa station. The main effect in Figure 4 is the diurnal variation caused by the station's daily journey in MLT and hence the main feature is the UT variation in its sensitivity, the amplitude of that variation being set by B.
At lower average $aa'$ values, the peak sensitivity at minimum solar zenith angle becomes much more pronounced. This can be seen in Figure 5, which is the same as Figure 3, but for the range $10 \le aa' < 20$ nT (for which the mean $aa'$ is 14.42 nT). The best-fit model patterns for this case are shown in Figure 6, which are for A = 0.58 and B = 0.33.
It was possible to keep enough samples in each bin to see the average sensitivity patterns by dividing the full range of $aa'$ into 8 bins: $0 \le aa' < 10$ nT; $10 \le aa' < 20$ nT (the example presented in Figs. 5 and 6); $20 \le aa' < 30$ nT; $30 \le aa' < 40$ nT; $40 \le aa' < 60$ nT; $50 \le aa' < 90$ nT; $70 \le aa' < 110$ nT (the example presented in Figs. 3 and 4); and $aa' \ge 100$ nT. Figure 7 gives a scatter plot of the modelled sensitivities in each of the 160 UT-F pixels of the relevant pattern (using the best-fit A and B values for each $aa'$ bin) and all 8 $aa'$ bins (giving 1280 data points in total), as a function of the index ratio that they are fitted to. If the model were perfect, then all the points would lie on the diagonal red line. It can be seen that the model has captured that trend. However, there is scatter around the line. We can compare the use of the modelled sensitivity factor to that assumed for the corrected aa, $aa'$, which is always unity ($S_o$ = 1, which means that points would all lie on the blue line). If we take the r.m.s. deviation of all the modelled sensitivities from observed ratios (1280 values, being 160 UT-F pattern pixels in each of 8 $aa'$ range bins), $D_{RMS}$, the ideal value would be zero in each case. (However, remember such a result would render all but one of the network of mid-latitude geomagnetic observatories redundant for space science studies, as instead we could use the one station in conjunction with the model.) For the $aa'_N$ index, the model reduces r.m.s. uncertainties (compared to not considering the station sensitivity) by 35%. For the $aa'$ index, assuming the sensitivity was constant at unity gives $D_{RMS}$ = 0.132, whereas using the fitted model value gives $D_{RMS}$ = 0.116. Therefore the model is reducing r.m.s. uncertainties (compared to not considering the station sensitivity) in $aa'$ by 11%.
It is not surprising that the model improves the agreement with the am pattern by much less for $aa'$, because the averaging of $aa'_N$ and $aa'_S$ to give $aa'$ is carried out precisely to also achieve this error reduction. These improvements are all quite modest. However, they are not the most important point. One of the key objectives in introducing the model is to bring the northern and southern hemisphere aa indices into better agreement with each other, and so give us greater confidence that the average of the two is meaningful on timescales less than 1 year. If we consider the north-south anisotropy, $(aa'_N - aa'_S)/(aa'_N + aa'_S)$, assuming $S_o$ = 1 in both hemispheres gives $D_{RMS}$ = 0.989, whereas using the fitted model value gives $D_{RMS}$ = 0.112. Therefore, in this case the model is reducing uncertainties (compared to not considering the station sensitivity) by about 90%. The improvement is proportionally much greater in this case of the hemispheric anisotropy because there is self-consistent improvement to both $aa'_N$ and $aa'_S$. Hence the model gives improvements to both the hemispheric $aa'$ indices, but they are relatively modest (35-40%), and improvements are quite small for $aa'$ (~10%). However, the model can be very significant in reducing the asymmetry between northern and southern hemisphere indices.
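The goodness-of-fit comparison used here can be sketched as follows: the r.m.s. deviation of modelled sensitivities from observed ratios, and the fractional reduction obtained relative to assuming a constant sensitivity of unity. The function names are assumptions made for this example. Using the rounded $D_{RMS}$ values quoted in the text gives reductions of roughly 12% and 89%, close to the quoted 11% and 90%; the small differences presumably arise from rounding of the quoted values.

```python
import math

def rms_deviation(modelled, observed):
    """Root-mean-square deviation between two equal-length sequences."""
    squared = [(m - o) ** 2 for m, o in zip(modelled, observed)]
    return math.sqrt(sum(squared) / len(squared))

def fractional_reduction(d_constant, d_model):
    """Fractional reduction in r.m.s. deviation achieved by the model,
    relative to assuming a constant sensitivity."""
    return 1.0 - d_model / d_constant

# Rounded values quoted in the text
print(round(100 * fractional_reduction(0.132, 0.116), 1))  # 12.1 (aa' index)
print(round(100 * fractional_reduction(0.989, 0.112), 1))  # 88.7 (anisotropy)
```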
The best-fit values of A and B for the 8 $aa'$ bins used are plotted as black dots and mauve circles, respectively, in Figure 8. These points are plotted at the mean $aa'$ value for the (overlapping) $aa'$ bins, which are shown by the cyan and grey bars at the top of the plot. The figure shows how the conductivity term is relatively more important at low $aa'$ levels but the MLT term becomes more important at higher $aa'$ levels. The black and mauve lines are simple ad-hoc fits to the points (equations (4) and (5)); for B the fit is
$$B = 0.38\left(1 - \{(40 - aa')/70\}^2\right) \ \text{for } aa' < 75.9\ \text{nT}; \qquad B = 0.28 \ \text{for } aa' \ge 75.9\ \text{nT}.$$
Equations (4) and (5) can be used to give the A and B coefficients for a given $aa'$ value, which can be used in equation (2) to compute the station sensitivity.
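The ad-hoc fit for B quoted in the text can be evaluated directly; a minimal sketch follows (the function name is an assumption made for this example, and the corresponding fit for A is not reproduced here). The two branches join continuously at 75.9 nT, and the fit returns values close to the best-fit B of 0.33 and 0.28 quoted for the bins with mean $aa'$ of 14.42 nT and 88.15 nT.

```python
def coefficient_b(aa_prime):
    """Ad-hoc fit for the MLT-term coefficient B as a function of the
    secular-drift-corrected aa value (in nT)."""
    if aa_prime < 75.9:
        return 0.38 * (1.0 - ((40.0 - aa_prime) / 70.0) ** 2)
    return 0.28

for value in (14.42, 40.0, 88.15):
    print(value, round(coefficient_b(value), 3))
```

In this fit B peaks at 0.38 for $aa'$ = 40 nT and is held constant above 75.9 nT.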
Note that the UT-F patterns of $S_o$ will vary with secular change in the geomagnetic field because the MLT, T, of the station at a given UT will vary. As in Paper 1, we use a spline of the IGRF-12 (Thébault et al., 2015) and gufm1 (Jackson et al., 2000) models to predict the MLT at a given UT for each date and station, and so allow for this effect. This is an additional secular change to that considered in Paper 1 (associated with the latitude of the station and the closest proximity of the average auroral oval) but only influences the time-of-year/time-of-day variation around the annual mean and not the annual mean itself. The UT-F maps of $S_o$ that we generate show that, over the centennial timescales of the aa index, this can have a significant effect on the patterns of $S_o$ for a given station. The solar zenith angle calculation is made for a given site at the UT, F and year at the centre of each of the 438,296 3-hourly aa index intervals during 1868-2017, allowing for all variations in the Sun's declination.
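The number of 3-hourly intervals quoted above can be checked with a short calculation: there are 8 intervals per day, and the day count for 1868-2017 must allow for leap years (1900 not being one). The function name is an assumption made for this example.

```python
from datetime import date

def count_three_hourly_intervals(first_year, last_year):
    """Number of 3-hour intervals from 1 January of first_year to
    31 December of last_year inclusive."""
    n_days = (date(last_year + 1, 1, 1) - date(first_year, 1, 1)).days
    return 8 * n_days

print(count_three_hourly_intervals(1868, 2017))  # 438296
```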
In this paper, we use the computed sensitivities to correct 3-hourly $aa_N'$ and $aa_S'$ values (i.e., after the time-dependent scaling factors s(d) that allow for the effects of the secular drift in the geomagnetic field, as derived in Paper 1, have been applied). This gives corrected hemispheric indices:

$$aa_{HN} = \frac{aa_N'}{f_N \, S_N}$$

$$aa_{HS} = \frac{aa_S'}{f_S \, S_S}$$

where S N and S S are the station sensitivities (computed from Eqs. (2)-(5)) for, respectively, the northern and southern hemisphere aa station in use at the time. The factors f N and f S ensure that the annual (calendar-year) means derived in Paper 1 are not altered by the allowance for the variations of station sensitivity on timescales less than a year. Hence

$$f_N = \frac{\langle aa_N'/S_N \rangle_{1\,\mathrm{yr}}}{\langle aa_N' \rangle_{1\,\mathrm{yr}}}$$

and

$$f_S = \frac{\langle aa_S'/S_S \rangle_{1\,\mathrm{yr}}}{\langle aa_S' \rangle_{1\,\mathrm{yr}}}$$

Note that, even though S N and S S are normalised to be unity when averaged over all UT and times-of-year, the factors f N and f S will still, in general, differ from unity because of non-uniformity of activity occurrence within the year (for example, if in any one year, more geomagnetic activity happened, by chance, to occur when S N > 1 than when S N < 1, then f N will be less than unity). The homogenised hemispheric indices, aa HN and aa HS , are then averaged to give the corrected 3-hourly aa index:

$$aa_H = \frac{aa_{HN} + aa_{HS}}{2}$$
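The correction procedure described in this section can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes that each drift-corrected 3-hourly value is divided by the modelled sensitivity and by a per-calendar-year factor f chosen so that the annual mean is unchanged, and that the two hemispheric results are then averaged with equal weight. The function names `homogenise` and `combine` and the sample numbers are hypothetical.

```python
from collections import defaultdict

def homogenise(aa_prime, sens, years):
    """Divide by sensitivity, then rescale so each calendar-year mean is preserved.

    aa_prime : drift-corrected 3-hourly values for one hemisphere
    sens     : modelled station sensitivity for each value
    years    : calendar year of each value
    Returns (corrected values, {year: f}).
    """
    raw = [a / s for a, s in zip(aa_prime, sens)]
    sum_raw = defaultdict(float)
    sum_aa = defaultdict(float)
    for y, r, a in zip(years, raw, aa_prime):
        sum_raw[y] += r
        sum_aa[y] += a
    # f = <aa'/S> / <aa'> over the year (assumes some activity in every year)
    f = {y: sum_raw[y] / sum_aa[y] for y in sum_aa}
    corrected = [r / f[y] for y, r in zip(years, raw)]
    return corrected, f

def combine(aa_hn, aa_hs):
    """Equal-weight average of the two corrected hemispheric indices."""
    return [0.5 * (n + s) for n, s in zip(aa_hn, aa_hs)]

# Hypothetical data: in 2000 the largest value falls where S > 1, in 2001 where S < 1.
years = [2000] * 4 + [2001] * 4
aa_n = [10.0, 20.0, 40.0, 10.0, 5.0, 5.0, 30.0, 20.0]
s_n = [0.8, 1.0, 1.25, 0.8, 1.25, 1.25, 0.8, 0.8]
aa_hn, f_n = homogenise(aa_n, s_n, years)
```

In this toy case f is below unity for 2000 and above unity for 2001, while the sum of the corrected values in each year equals the sum of the inputs, which is the property the factors are there to enforce.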

Comparison of the new hemispheric indices
A measure of the degree of success of this procedure would be the extent to which the corrected aa HN and aa HS are similar, compared to the classic hemispheric indices, aa N and aa S . Complete success would mean that aa HN and aa HS were identical (but, as noted above, this would also mean that the sensitivity model was so good that we could dispense with a full array of magnetometer stations and have just one, used in conjunction with the model). There is a limit to how much this approach can achieve. Consider a situation where S N is large (>1) and S S small (<1) such as around 20 UT and F = 0.5. Dividing $aa_N'$ by S N should give a reliable value of aa HN , but if S S is small enough, the required signal may have fallen below the noise level and so dividing $aa_S'$ by S S (<1) increases both the noise and the signal and a reliable value of aa HS is not obtained. This effect was often noted at times of low activity when making comparisons of the am index with the signal from a single station at a time when its S o value was low. The point is that an am value is the average of the data from a number of stations, which gives addition of the signal and cancellation of the noise, and so am has greater sensitivity to small fluctuations than does aa or $aa'$. There are other limitations which are discussed in Section 6.

Figure 9 plots the occurrence of combinations of the classic aa N and aa S values in the upper panels and of the new corrected aa HN and aa HS values in the lower panels, with left-hand plots being for 3-hourly values and the right-hand plots being for daily means. The plot is for all available data. The diagonal mauve lines are the ideal case, with equal values for the two hemispheres. The pixels are logarithmically-sized and the number of samples, N, in each pixel is color-coded.
Figure 9a stresses the quantised nature of the classic aa indices and that, especially at average and low aa, almost any value of aa S is possible for a given aa N , and vice-versa. The correlation coefficient, r, between the full sequence (1868-2017) of 3-hourly aa N and aa S values is 0.66 and the r.m.s. deviation of the two from the aa value, as a ratio of that aa value, is d = 0.53. Figure 9c is the corresponding plot for the new indices, aa HN and aa HS , and is very different in character. The values have been moved toward the diagonal and have become continuous in nature (although there is still some clustering of data points around the allowed combinations of the classic aa indices). The correlation between aa HN and aa HS is increased to r = 0.79 and d reduced to 0.45. For the daily means of the classic aa indices, Aa N and Aa S (Fig. 9b), r is 0.92 and d is 0.28 and for the daily means of the new indices (Fig. 9d), Aa HN and Aa HS , r is (very slightly) increased to 0.93 and d further reduced to 0.22. Hence the allowance for the station sensitivities has succeeded in increasing the agreement between the two hemispheric indices.

Figure 10 gives further comparisons of the hemispheric agreement for the classic and new aa indices. The lower panel shows the coefficients of determination r² (where r is the correlation coefficient) between southern and northern hemisphere indices, evaluated in calendar year intervals. The green and black lines are for the 3-hourly indices, green being for classic aa indices (aa N and aa S ) and the black lines being for the new homogenized indices (aa HN and aa HS ). It can be seen that in all years the new indices have been brought into closer agreement, with r² typically raised from around 0.4 to 0.55. The value is always higher for the new indices but there are a small number of years for which r² is high (≈0.7) in both the classic and the new indices: this appears to be a limit to how far the hemispheric indices can agree when they are compiled from just a single station. The red and blue lines are for daily mean data, red being for Aa N and Aa S and blue for Aa HN and Aa HS . Again agreement is always slightly better for the new indices but the improvement is small for daily averages. The upper panel shows the annual means of aa H , and it can be seen that the long term trend does not influence the hemispheric agreement. There is a tendency for the years of high agreement to occur one year before solar cycle minima. In all cases, there are slightly lower levels of agreement before 1880, which appears to indicate that there were increased measurement errors in one, or both, of the magnetometers at this time.
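The agreement statistics used in this comparison can be sketched as below. The correlation coefficient is the standard Pearson form. The r.m.s. fractional deviation d is only described in words in the text, so the definition coded here (deviation of each hemispheric value from the two-hemisphere mean, divided by that mean, with zero-mean samples skipped) is our assumed reading. All function names are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def fractional_rms_deviation(north, south):
    """Assumed form of d: r.m.s. of (hemispheric value - mean) / mean.

    With mean = (north + south) / 2 the two hemispheres deviate by equal and
    opposite amounts, so only the northern deviation needs to be evaluated.
    """
    terms = []
    for n_val, s_val in zip(north, south):
        mean = 0.5 * (n_val + s_val)
        if mean > 0:
            terms.append(((n_val - mean) / mean) ** 2)
    return math.sqrt(sum(terms) / len(terms))

def yearly_r_squared(north, south, years):
    """Coefficient of determination r^2 evaluated per calendar year."""
    groups = {}
    for n_val, s_val, y in zip(north, south, years):
        pair = groups.setdefault(y, ([], []))
        pair[0].append(n_val)
        pair[1].append(s_val)
    return {y: pearson_r(a, b) ** 2 for y, (a, b) in groups.items()}

r_example = pearson_r([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
d_example = fractional_rms_deviation([1.0, 3.0], [3.0, 1.0])
```

Applying `yearly_r_squared` to the classic and then to the homogenised hemispheric series would give the two 3-hourly curves of the kind compared in Figure 10.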
The left hand plots of Figure 11 compare the cumulative probability distributions (c.d.f.s) of the classic and new indices with the classic indices in the upper panel and the new indices underneath. The quantised nature of aa N and aa S (and to a lesser extent aa) can be seen in the upper panel. Note that larger southern hemisphere values are consistently more common than northern hemisphere values for aa < 58 nT, but the opposite is true for aa > 58 nT. These asymmetries in the classic aa index distributions were pointed out by Bubenik & Fraser-Smith (1977) and by Love (2011). The lower panel shows the corresponding c.d.f.s for the new indices aa HN , aa HS and aa H . It can be seen that the new indices are essentially continuous, rather than quantised, and that the major asymmetry between the northern and southern distributions has been removed. The agreement of the aa HN and aa HS distributions is not perfect, but it is much better than for aa N and aa S . The right-hand plots of Figure 11 show scatter plots of the old and new values. The upper panel shows aa HN as a function of aa N as red dots and aa HS as a function of aa S as blue dots. The vertical spreads reflect the range of sensitivity values applied. Careful inspection reveals that the corrections are not independent of the index value. For example at large aa N , aa HN is more often reduced compared to aa N rather than the other way round. In other words, high aa N tends to be recorded at times of high station sensitivity, S N , which amplifies the value detected. However, this is not always the case and the plot also shows cases where high aa N was recorded at times of S N < 1. The tendency is reflected in the change to the CDF. The lower-right plot is the scatter plot between aa H and aa. The increased effect of the conductivity term in the sensitivity model at low aa can be seen as a very slight non-linearity in the plot. In general, aa H values are lower than aa by between about 0 and 30%. 
The average decrease of about 15% is mainly due to the calibration of the new indices against am data over the interval 2002-2009 (see Paper 1).
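The distinction between quantised and continuous distributions is easy to see in an empirical cumulative distribution function. The following is a generic sketch (the function name `ecdf` and the sample values are hypothetical, not taken from the data): repeated values collapse into a single step, so an index restricted to a few allowed levels produces a staircase, whereas a continuous index produces a near-smooth curve.

```python
def ecdf(values):
    """Return (distinct sorted values, cumulative probability P(X <= x))."""
    data = sorted(values)
    n = len(data)
    xs, ps = [], []
    for i, v in enumerate(data, start=1):
        if xs and xs[-1] == v:
            ps[-1] = i / n      # repeated value: raise the existing step
        else:
            xs.append(v)
            ps.append(i / n)
    return xs, ps

levels, probs = ecdf([1, 1, 2, 4])
print(levels, probs)  # [1, 2, 4] [0.5, 0.75, 1.0]
```

The number of distinct levels returned, relative to the number of samples, is a simple indicator of how strongly quantised a series is.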
Quantile-quantile (q-q) plots are a standard method for testing if two populations share the same form of distribution, because points lie along the diagonal if they do. Figure 12a is the q-q plot for 3-hourly values of aa N and aa S : the quantization of aa N and aa S is evident, and the scatter of points away from the green diagonal shows that the distributions are not closely matched at all levels, as also shown by Figure 11a. Figure 12c is for 3-hourly values of aa HN and aa HS . It can be seen that these distributions are continuous and similar up to the 99.97 percentile (the orange point). For the largest 0.03% of 3-hourly values (above the orange point) we see some divergence of the two distributions, with the occurrence of large events being slightly lower for the southern hemisphere (although the distributions agree around the 99.99 percentile). Figures 12b and 12d are the same comparison for daily mean values. Above about the 99.5 percentile, the tails of the Aa N and Aa S distributions are not generally well matched, with quantiles for Aa S slightly, but persistently, at lower values than for Aa N , although they do agree better near the 99.97 percentile (the orange point). For daily means Aa HN and Aa HS , the distributions agree well all the way up to the 99.97 percentile but they disagree above this percentile, with quantiles for Aa HS again persistently at lower values than for Aa HN . Thus the homogenization has resulted in the northern and southern index distributions being of more similar shape, except for the extreme values above the 99.97 percentile (the orange points), which is where the rarity of events is likely to make their occurrence in the two hemispheric indices more dissimilar. As discussed in Section 5, there are combinations of UT and MLT when the southern station records lower values, possibly because of a UT variation in geomagnetic activity, but at no time is the northern station subject to this effect (because it is at a different longitude).
Hence the divergence of the extreme event tails of these q-q plots (with generally fewer events seen in the southern hemisphere) appears to be a real physical effect, associated with the longitudes of the stations, and not due to measurement error and noise.

Figure 13a shows the pattern for am, and reveals the quasi-equinoctial pattern discussed in Section 1. Figure 13b shows the pattern for the classic aa index for the same years as are available for am: it can be seen that the spurious diurnal variation caused by having just one station in each hemisphere has seriously disrupted the pattern, with a marked minimum in the response in aa at 1-8 UT, and an excessive response at 12-23 UT that appears more axial than equinoctial in form. Figure 13c shows the pattern for aa H for 1959-2017. It can be seen that this pattern in the new index is more equinoctial and quite similar to that for am. This means that the shrinking of the difference between the annual variations of the new hemispheric indices, seen in Figure 1e, and the flattening of the UT variation in Figure 1f have been achieved in a self-consistent way in the new homogenized aa indices. Figure 13d shows the pattern for aa H before the start of the am index, i.e., for 1868-1958. It can be seen that it too shows an equinoctial-like pattern. As discussed in Section 1, we are not yet certain of the physical origin of the equinoctial pattern, but none of the proposed mechanisms offer any reason why it should not be present before 1959 as well as after, and Figure 13d shows that it is indeed present in aa H . Note that the colour scale on which aa H is plotted in Figure 13d has been reduced by the ratio of average aa H values before and after 1959.

Figure 14 shows a much more stringent and revealing test of the new indices by looking to see if the equinoctial pattern is present in the new hemispheric indices on their own. Figures 14a and 14b show the UT-F patterns for the classic aa indices (respectively, aa N and aa S ).
It can be seen that the pattern is dominated by the MLT variation of the station in both cases, with strong peaks at all times of year around 21 UT in aa N and around 11 UT in aa S . The effect of the semi-annual variation in the solar wind forcing of geomagnetic activity can also be seen, with peaks at the equinoxes, but the pattern is very far from equinoctial. Figures 14c and 14d show the UT-F patterns for the new homogenised hemispheric aa indices (respectively, aa HN and aa HS ). It can be seen that, remarkably, the equinoctial pattern has partially emerged in both cases, although in neither case is the variation for am (shown in Fig. 13a) perfectly reproduced. Figures 14e and 14f show the UT-F patterns for the hemispheric am indices (respectively, an and as). The equinoctial pattern is again seen but, as for aa HN and aa HS , neither is an exact replica of the am variation. Some of the anomalous features in the patterns for the new aa indices are also seen in the hemispheric am indices: for example, the minimum in the response of the southern hemisphere indices at around 5 UT is present in both as and aa HS . Figure 14d is interestingly consistent with the results of Chambodut et al. (2013), who found that the equinoctial pattern in am almost disappeared in the noon MLT sector at roughly 0-9 UT: the southern hemisphere aa station is at about 10.6-19.6 MLT in this interval and so has passed through the noon sector. Thus the loss of the equinoctial pattern in the aa HS data occurs at the time that we would expect from the results of Chambodut et al. (2013). On the other hand, the northern hemisphere aa station is at 23.7-8.7 h MLT in the 0-9 UT interval and so in the midnight and dawn sectors, for which Chambodut et al. (2013) strongly detect the equinoctial pattern at all UT. Hence we see no such gap in the equinoctial pattern in aa HN . This appears to reflect a genuine UT variation in solar wind-magnetosphere coupling and/or in the response of the magnetosphere-ionosphere-thermosphere system. This is an issue that we will return to in a later paper.
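The UT-F patterns discussed in this section are maps of the mean index value in bins of time-of-day and time-of-year. A minimal sketch of that binning is given below; the function name `ut_f_pattern`, the default of 36 F bins and the sample values are our assumptions for illustration, while the 8 UT bins follow from the 3-hourly cadence of the index.

```python
def ut_f_pattern(values, ut_hours, f_year, n_f_bins=36):
    """Mean of `values` in bins of F (rows) and 3-hourly UT (8 columns).

    ut_hours : UT of each sample in hours (0 <= UT < 24)
    f_year   : fraction of the calendar year of each sample (0 <= F <= 1)
    Empty bins are returned as None.
    """
    sums = [[0.0] * 8 for _ in range(n_f_bins)]
    counts = [[0] * 8 for _ in range(n_f_bins)]
    for v, ut, f in zip(values, ut_hours, f_year):
        i = min(int(f * n_f_bins), n_f_bins - 1)  # F = 1 falls in the last bin
        j = int(ut // 3) % 8
        sums[i][j] += v
        counts[i][j] += 1
    return [
        [sums[i][j] / counts[i][j] if counts[i][j] else None for j in range(8)]
        for i in range(n_f_bins)
    ]

grid = ut_f_pattern([10.0, 20.0, 30.0], [1.5, 1.5, 22.5], [0.01, 0.02, 0.99], n_f_bins=4)
```

Running the same function on a classic hemispheric series and on its homogenised counterpart would give patterns of the kind compared in the panels of Figure 14.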

Conclusions
By using a model of the sensitivity of a geomagnetic observing site that takes account of its solar zenith angle and its MLT (and to a small extent the geomagnetic activity level), we have generated a new ''homogenised'' data series of 3-hourly and daily-mean values: aa HN , aa HS and aa H . These also make use of the long-term recalibration of stations and the allowance for the change in the geomagnetic field that was implemented in Paper 1. The new indices are generated by a fixed algorithm that is the same for all magnetometer stations and show a number of improvements over the classic aa indices, namely:
1. The long-term drift of the northern and southern hemisphere indices is the same (see Paper 1).
2. The distributions of index values are continuous and not quantised.
3. The distributions of values for the northern and southern hemisphere indices are very similar.
4. The difference between simultaneous 3-hourly northern and southern hemisphere index values is reduced.
5. The correlation between the 3-hourly northern and southern hemisphere index values is increased (overall and in all individual years).
6. The correlation between the daily means of northern and southern hemisphere index values is slightly increased (overall and in all individual years).
7. The mean annual variations in the northern and southern hemisphere indices are very similar, and the remaining difference between them is consistent with a seasonal effect only.
8. The difference between the mean diurnal variations in the northern and southern hemisphere indices is greatly reduced and there is almost no residual diurnal variation in the new aa H index.
9. The equinoctial time-of-day/time-of-year pattern in the new aa H index matches that in am.
10. The equinoctial time-of-day/time-of-year pattern appears in the new hemispheric indices (but does not exactly replicate those in the hemispheric am indices).
We note that there are limits to how much a station sensitivity model can do in terms of correcting a one-station hemispheric range index such as aa N and aa S into a more representative global index made from a longitudinal ring of stations, such as an and as. As discussed above, one reason is that sensitivity at one station could be low enough for the signal to fall below the noise level. In such cases, dividing by the low (<1) sensitivity amplifies both the noise and the signal and will not recover the signal that was seen in the other hemisphere. Furthermore, Caan et al. (1978) found that, in addition to the amplitude of the response to a substorm varying with the MLT of the station, the waveform of the response varies also. This means that some substorms will cause a large range measurement in one 3-hour interval, but another station, at a different MLT, might detect the largest range measurement in the previous or the next 3-hour interval. Correction using division by the modelled sensitivity could not correct for such an occurrence. Hence the method has its limitations. However, in all the ways that we have tested the new homogenized indices, they out-perform the classic aa indices and so application of the sensitivity model has made improvements. Essentially, factors that ideally would average out when taking the mean of the two hemispheric indices but in practice do not exactly cancel, have here been allowed for, at least to some extent. The fact that there is a UT range when, because of its longitude, the southern hemisphere station sees lower values and this never occurs for the northern hemisphere station, will have been accounted for in average values of our new index because we calibrated both hemispheric data series against the am index.
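The first of these limitations can be made explicit with a simple additive-noise picture. This is an illustrative assumption on our part, not a model stated in the text, and the symbols R, A and n are hypothetical. Suppose the range recorded at a station is $R = S\,A + n$, where $A$ is the activity that a station of unit sensitivity would record, $S$ is the station sensitivity and $n$ is noise. Dividing by the modelled sensitivity then gives

$$\frac{R}{S} = A + \frac{n}{S}$$

so the signal is restored to $A$ but the noise is scaled by $1/S$. For $S = 0.5$, for example, the noise contribution is doubled, and if $S\,A$ was already smaller than $n$ in the recorded range, the corrected value is dominated by $n/S$ rather than by $A$.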
There are two possible objections to using the modelled sensitivity to improve aa to aa H that we can foresee. The first is that we are correcting scaled and quantized K values that were generated against a fixed (K) scale. However, we note that this is already done in the generation of the classic aa indices because of the use of the station scaling factors. For the classic aa, these are constants for the station location, whereas here we are using scale factors based on the station location but that change with time because of secular change in the geomagnetic field and because of Earth's orbit and rotation. We allow for such effects using a repeatable algorithm that can be evaluated by anyone. The second objection is that we are using model values to adjust observations. Again, the principle of this is already inherent in the classic aa index because the range thresholds that define the K-index bands are set by a model (specifically, they are set by a model of the dependence of the observed range value on the separation of the station and the auroral oval). In addition, we note that the IHV geomagnetic index (Svalgaard & Cliver, 2007) assumes the equinoctial model of the UT-F dependence in its construction. Hence we think that in neither case are we doing something that is not already inherent in the generation of the classic aa index: we are just implementing a more complex scheme to correct for the limitations of the original K-index scaling.
In a later paper we will study how these station sensitivity considerations influence more complex range indices (such as ap, Kp, am, an and as) and how they have been influenced by changes to the network of stations from which they have been compiled. In the present paper we have restricted our attention to the aa index and through the ten improvements listed above, shown that applying the sensitivity model to the stations can greatly improve how the index quantifies geomagnetic activity, even though it only employs two stations. Note that in correcting the aa data series we have allowed for the effects of observatory location on sensitivity (including the observatory changes), secular drifts in the geomagnetic field and the intercalibration of the instrumentation and local site characteristics at the different observatories.
In another subsequent paper we will study large and extreme events in both the daily and 3-hourly data of the homogenised index. The 3-hourly classic aa values have been used to rank historic geomagnetic storms since 1868 by Vennerstrom et al. (2016), but the rank order for the new 3-hourly aa H values varies considerably from that for the 3-hourly classic aa values.
The annual means of the new indices (aa HN , aa HS and aa H ) are supplied in the supplementary material attached to Paper 1. The three-hourly values of the new indices (aa HN , aa HS and aa H ) and their daily averages (Aa HN , Aa HS and Aa H ) are supplied in separate files in the supplementary material attached to the present paper. For some applications it may be useful to use 3-hourly or daily indices that have been corrected for the effects of secular field change and recalibrated but have not been further modified using our sensitivity model: these are also given in the supplementary material (3-hourly aa/s, aa N /s and aa S /s values and their daily means ⟨aa/s⟩, ⟨aa N /s⟩ and ⟨aa S /s⟩).