Issue 
J. Space Weather Space Clim.
Volume 8, 2018



Article Number  A53  
Number of page(s)  27  
DOI  https://doi.org/10.1051/swsc/2018038  
Published online  03 December 2018 
Research Article
A homogeneous aa index: 1. Secular variation
^{1}
Department of Meteorology, University of Reading, Whiteknights Campus Earley Gate, PO Box 243, Reading
RG6 6BB, UK
^{2}
Institut de Physique du Globe de Strasbourg, UMR7516; Université de Strasbourg/EOST, CNRS, 5 rue René Descartes, 67084
Strasbourg Cedex, France
^{3}
British Geological Survey, Edinburgh
EH14 4AP, UK
^{4}
International Service of Geomagnetic Indices, 5 rue René Descartes, 67084
Strasbourg cedex, France
^{*} Corresponding author: m.lockwood@reading.ac.uk
Received:
7
April
2018
Accepted:
25
September
2018
Originally complied for 1868–1967 and subsequently continued so that it now covers 150 years, the aa index has become a vital resource for studying space climate change. However, there have been debates about the intercalibration of data from the different stations. In addition, the effects of secular change in the geomagnetic field have not previously been allowed for. As a result, the components of the “classical” aa index for the southern and northern hemispheres (aa _{S} and aa _{N}) have drifted apart. We here separately correct both aa _{S} and aa _{N} for both these effects using the same method as used to generate the classic aa values but allowing δ, the minimum angular separation of each station from a nominal auroral oval, to vary as calculated using the IGRF12 and gufm1 models of the intrinsic geomagnetic field. Our approach is to correct the quantized a _{ K }values for each station, originally scaled on the assumption that δ values are constant, with timedependent scale factors that allow for the drift in δ. This requires revisiting the intercalibration of successive stations used in making the aa _{S} and aa _{N} composites. These intercalibrations are defined using independent data and daily averages from 11 years before and after each station change and it is shown that they depend on the time of year. This procedure produces new homogenized hemispheric aa indices, aa _{HS} and aa _{HN}, which show centennialscale changes that are in very close agreement. Calibration problems with the classic aa index are shown to have arisen from drifts in δ combined with simpler corrections which gave an incorrect temporal variation and underestimate the rise in aa during the 20th century by about 15%.
Key words: Space climate / Space weather / Geomagnetism / Space environment / Historical records
© M. Lockwood et al., Published by EDP Sciences 2018
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1 Introduction
1.1 The derivation of the classic aa index
In his book (Mayaud, 1980), PierreNoël Mayaud attributes the origins of the idea for the aa index to the 1969 IAGA (International Association of Geomagnetism and Aeronomy) meeting in Madrid, where a request for an effort to extend geomagnetic activity indices back in time was made by Sydney Chapman on behalf of the Royal Society of London. Mayaud’s subsequent work resulted in an index somewhat different from that which Chapman had envisaged, but which covered 100 years between 1868 and 1967 (Mayaud, 1971) and has become a key component of research into space climate change. This index, termed aa, was adopted at the 1975 IAGA meeting in Grenoble (IAGA, 1975). It was made possible by the availability of magnetic records from two old observatories, Greenwich in southern England and Melbourne in Australia. These two stations are almost antipodal, roughly at the same geomagnetic latitude and 10 h apart in local time. To make a full data sequence that extends from 1868 to the present day, it is necessary to use 3 stations in each hemisphere. In England they are: Greenwich (IAGA code GRW, 1868–1925, geographic latitude 51.477°N, 0.000°E), Abinger (ABN, 1926–1956, 51.185°N, 359.613°E), and Hartland (HAD, 1957–present, 50.995°N, 355.516°E). In Australia they are: Melbourne (MEL, 1868–1919, −37.830°N, 144.975°E), Toolangi (TOO, 1920–1979, −37.533°N, 145.467°E) and Canberra (CNB, 1980–present, −35.315°N, 149.363°E).
The aa index is based on the K values for each station, as introduced by Bartels et al. (1939). These are derived from the range of variation observed at the station in 3hour intervals. The formal procedure for deriving K is: the range (between minimum and maximum) of the irregular variations (that is, after elimination of the regular daily variation) observed over a 3hour interval in either of the horizontal components (X northward or Y eastward, whichever gives the larger value) is ranked into 1 of 10 classes (using quasilogarithmic band limits that are specific to the observatory) to which a K value of 0–9 is assigned. The advantage of this procedure is that the scale of threshold values used to convert the continuous range values into the quantized K values is adjusted for each station to allow for its location and characteristics such that the K value is a standardized measure of the geomagnetic activity level, irrespective of from where it is measured. In practice, the range limits for all K bands are all set by just one number, L, the lower limit of the K = 9 band because the same relative scale is used at all stations and so the thresholds for the K bands 1–8 are scaled from L, the lower limit for the K = 0 band being set to zero (Menvielle & Berthelier, 1991). The derivation of the K values (and from them the a _{ K } value and aa _{N} and aa _{S}) is illustrated schematically in Figure 1.
Fig. 1 Schematic illustration of the generation of K and a _{ K } indices. Illustrative variations of the two orthogonal horizontal field components measured at one site are shown, X (toward geographic north, in blue) and Y (toward geographic east, in orange). These variations are after the regular diurnal variation has been subtracted from the observations. In the fixed 3hour UT windows (00–03 UT, or 03–06 UT, and so on up to 21–24 UT), the range of variation of both components between their maximum and minimum values is taken, ΔX and ΔY. The larger value of the two is kept and scaled according to a standard, quasilogarithmic scale (illustrated by the black and mauve bands to the right) for which all Kband thresholds are set for the site in question by L, the threshold range value for the K = 9 band. The value of L for the site is assigned according to the minimum distance between the site and a nominal (fixed) auroral oval position. The K value is then converted into the relevant quantised value of a _{ K } (in nT) using the standard “midclass amplitudes” (K2aK) scale. In the schematic shown, ΔX > ΔY, thus the X component gives a K value of 8 (whereas the Y component would have given a K of 5). Thus for this 3hour interval, a _{ K } value would be 415 nT. In the case of the classic aa indices, the hemispheric index (aa _{N} or aa _{S}, for the observatory in the northern or southern hemisphere, respectively) is f × a _{ K }, where f is a factor that is assumed constant for the observing site. 
The value of L used for a station is set by its closest proximity to a nominal auroral oval. To understand this, we note that midlatitude range indices respond most strongly to the substorm current wedge (e.g. Saba et al., 1997; Lockwood, 2013), resulting in very high correlations with auroral electrojet indices such as AE and AL (e.g. Adebesin, 2016). For example, the correlation coefficient between the available coincident 50 annual means of the standard auroral electrojet AE(12) index and the ap index (based on the K values from a network of stations) is 0.98 (significant at the 99.99% level), and the correlation between the 17461 coincident daily means of AE(12) and Ap (Ap being daily means of ap) is 0.84 (significant to the same level). This means that the range response of a station is greatest in the midnight Magnetic Local Time (MLT) sector (Clauer & McPherron, 1974). As well as the response being smaller away from midnight, the typical time variation waveform also varies with MLT (Caan et al., 1978). The range variation in a substorm is generally greatest in the auroral oval and decreases with decreasing latitude. This is mainly because the response of hightimeresolution geomagnetic measures (such as the H component at the ground or the equivalent currents at 1minute resolution) show a marked decrease in amplitude with increasing distance from the auroral oval (an example of the former is presented by Rostoker (1972) and a statistical survey of the latter during 116 substorms seen from 100 geomagnetic stations is presented by Gjerloev & Hoffman (2014)). This means that the range in the H values in 3hour intervals also shows a decrease with increasing distance from the auroral oval. However, we note that at lower latitudes the variation becomes rather more complex. Ritter & Lühr (2008) surveyed the effects of 4000 substorm responses statistically at 4 stations, the most poleward of which was Niemegk. They found (their Fig. 8) that the initial response to substorm expansion phase onset in 1minute H values is actually almost constant with latitude at these low and middle latitudes, but at the higher magnetic latitude stations there was a faster subsequent decay in the substorm perturbation to H. The resulting effect on the values of the range in H during 3hour intervals is again a tendency for them to decrease with decreasing latitude, but it appears to have a different origin from that seen at higher latitudes, closer to the auroral oval.
To account for the latitude variation of the range response, the value of L used to set the K band limits is set by the minimum distance between the station and a nominal auroral oval position. Because of the offset of the auroral oval towards the nightside, this minimum distance (quantified by the geocentric angle between the station and the point of closest approach of the nominal auroral oval, δ) is set using a nominal oval at corrected geomagnetic latitude Λ_{CG} = 69°, which is an average oval location in the midnight sector where substorm expansions occur.
A key point is that in compiling the classic aa index, the L values have been assumed to remain constant over time for a given station, which means that the effects of secular changes in the geomagnetic field on δ have not been accounted for. Mayaud was aware of the potential for secular change in δ values but discounted it as small stating “note that the influence of the secular variation of the field on the distances to the auroral zone is such that the resulting variations of the lower limits for K = 9 are practically negligible at a scale of some tens of years” (Mayaud, 1968). Hence, in part, his view arose because saw aa as being generated to cover the previous 100 years and did not foresee its continued extension to cover another 50 years. Being aware that the effect of secular change in the intrinsic field could not be ignored indefinitely, Chambodut et al. (2015) proposed new α _{15} indices, constructed in a way that means that the secular drift in the magnetic latitude of the observatories used is accounted for. In addition, Mursula & Martini (2007) also noted the potential effect of the secular change on the Kvalues from the Sodankylä observatory.
The approach taken to generate aa is that the range data were scaled into Kvalues using the band limits set by assigned L values for the stations used to generate the northern and southern hemisphere indices. The values of L used by ISGI to define the Kband scales are 500 nT for all aa stations except Canberra (CNB) for where L = 450 nT is used, because of its greater distance from the auroral oval. These K values are then converted into a _{ K } values using a standard scale called “midclass amplitudes”, K2aK (Mayaud, 1980), given by Figure 1. However, in order to achieve intercalibration of the data from different stations, the a _{ K } values from each station were multiplied by a constant correction factor for that station to give aa _{N} and aa _{S} for the northern and southern hemisphere, respectively. The correction factors took into account two things: a constant magnetic latitude correction and an induction effect correction. The correction factors adopted were: 1.007 for Greenwich; 0.934 for Abinger; 1.059 for Hartland; 0.967 for Melbourne; 1.033 for Toolangi; and 0.976 for Canberra (using L = 450 nT for Canberra). Note that this has an effect on the allowed quantization levels of the indices. Without the correction factors there would be 10 allowed levels for both aa _{N} and aa _{S}. Averaging them together to get aa would give 19 possible values. Using the scaling factors means that at any one time there are still only 19 possible quantized levels, but those levels change a little with each station change (i.e. at 1920, 1925, 1957, and 1980).
Having the two aa stations roughly 10 h of local time apart means that one of the two is on the nightside at any time. This means that we cannot expect the two stations to agree at any given time. However, ideally there would be no systematic hemispheric asymmetries and, on average, the behavior of aa _{N} and aa _{S} should be the same. It has long been recognized that this is not the case for the classic aa index. Bubenik & FraserSmith (1977) studied the overall distributions of aa _{N} and aa _{S} and found that they were different: they argued that the problem was introduced by using a quantization scheme, a potential problem discussed by Mayaud (1980). Love (2011) investigated the difference in distributions of the K values on which aa _{N} and aa _{S} are based. This asymmetry will be investigated in Paper 2 of this series (Lockwood et al., 2018b) using a model of the timeofyear and timeofday response functions of the stations, allied to the effects of secular change in the main field (and associated station intercalibration issues) that are the subject of the present paper.
1.2 Hemispheric asymmetry in the centennialscale change of the classic aa index
Figure 2a illustrates another hemispheric asymmetry in the classic aa index. It shows annual means of aa _{N} (in red) and aa _{S} (in blue). These are the values averaged together in the generation of the official aa index by L’École et Observatoire des Sciences de la Terre (EOST), a joint of the University of Strasbourg and the French National Center for Scientific Research (CNRS) institute, on behalf of the International Service of Geomagnetic Indices (ISGI). The magnetometer data are now supplied by British Geological Survey (BGS), Edinburgh for the northern hemisphere and Geoscience Australia, Canberra for the southern hemisphere. We here refer to these aa _{N}, aa _{S} and aa data as the “classical” values, being those that are used to derive the official aa index by EOST, as available from ISGI (http://isgi.unistra.fr/) and data centers around the world.
Fig. 2 Variations of annual means of various forms of the aa index. (a) The published “classic” northern and southern hemisphere indices (aa _{N} and aa _{S} in red and blue, respectively). Also shown (in green) is 1.5 × a _{NGK}, derived from the Kindices scaled from the Niemegk data. The vertical dashed lines mark aa station changes (cyan: Melbourne to Toolangi; green: Greenwich to Abinger; red: Abinger to Hartland; and blue: Toolangi to Canberra). (b) The homogenized northern and southern hemisphere indices (aa _{HN} and aa _{HS} in red and blue, respectively) generated in the present paper. The thick green and cyan line segments are, respectively, the a _{NGK} and am index values used to intercalibrate segments. (c) The classic aa data series, aa = (aa _{N} + aa _{S})/2 (in mauve) and the new homogeneous aa data series, aa _{H} = (aa _{HN} + aa _{HS})/2 (in black). The orange line is the corrected aa data series aa _{C} generated by Lockwood et al. (2014) by recalibration of the AbingertoHartland join using the Ap index. (Note that before this join, aa and aa _{C} are identical and the orange line is not visible as it is underneath the mauve line). The cyan line and points show annual means of the am index. The grayshaded area in (c) is the interval used to calibrate aa _{HN} and aa _{HS} (and hence aa _{H}) against am. 
It can be seen that although aa _{N} and aa _{S} agree well during solar cycles 14–16 (1900–1930), aa _{N} is progressively larger than aa _{S} both before and after this interval. The vertical lines mark station changes (cyan for MEL to TOO; green for GRW to ABN; red for ABN to HAD; and blue for TOO to CNB). There has been much discussion about possible calibration errors between stations at these times. In particular, Svalgaard et al. (2004) pointed out that the classic aa _{N} values showed a major change across the ABNHAD join. These authors argued from a comparison against their “interhour variability” index, IHV, that this was responsible for an extremely large (8.1 nT) step in aa, such that all the upward drift in aa during the 20th century was entirely erroneous. However, the early version of IHV that Svalgaard et al. had employed to draw this conclusion came from just two, nearby, Northern Hemisphere stations, Cheltenham and Fredricksburg, which were intercalibrated using the available 0.75 yr of overlapping data in 1956. This calibration issue only influenced aa _{N} and Lockwood (2003) pointed out that, as shown in Figure 2a, aa _{S} also showed the upward rise over the 20th century, albeit of slightly smaller magnitude than that in aa _{N} (and hence, by definition aa). Using more stations, Mursula et al. (2004) found there was an upward drift in IHV over the 20th century, but it depended on the station studied; nevertheless, they inferred that the upward drift in aa was probably too large. As a result, Svalgaard et al. (2003) subsequently revised their estimates of a 1957 error in aa down to 5.2 nT (this would mean that 64% of the drift in aa was erroneous). However, Mursula & Martini (2006) showed that about half of this difference was actually in the IHV estimates and not aa, being caused by the use of spot samples by Svalgaard et al., rather than hourly means, in constructing the early IHV data. This was corrected by Svalgaard & Cliver (2007), who revised their estimate of the aa error further downward to 3 nT. Other studies indicated that aa needed adjusting by about 2 nT at this date (Jarvis, 2004; Martini & Mursula, 2008). A concern about many of these comparisons is that they used hourly mean geomagnetic data which has a different dependence on different combinations of interplanetary parameters to range data (Lockwood, 2013). Recent tests with other range indices such as Ap (Lockwood et al., 2014, Matthes et al., 2016) confirm that an upward skip of about 2 nT at 1957 is present in aa (about one quarter of the original estimate of 8.1 nT). However, it is important to stress that this calibration arises for data which do not contain any allowance for the effects of the secular change in the geomagnetic field (in the present paper, we will show that the rise in the classic aa between 1902 and 1987 is indeed slightly too large, but this arises more from neglecting the change in the intrinsic geomagnetic field than from station intercalibration errors).
The argument underpinning the debate about the calibration of aa was that the minimum annual mean in 1901 (near 6 nT) was much lower than any seen in modern times (14 nT in 1965) and so, it was argued, erroneous. This argument as shown to be specious by the low minimum of 2009 when the annual mean aa fell to 8.6 nT. Furthermore, subsequent to that sunspot minimum, solar cycle 24 in aa has been quite similar to cycle 14 (1901–1912) and so the rise in average aa levels between cycles 14 and 22 has almost been matched by the fall over cycles 23 and 24. This does not necessarily mean that the classic aa for cycle 14 is properly calibrated, but it does mean that the frequentlyused argument that it must be in error was false.
An upward 2 nT calibration skip in aa implies a 4 nT skip in aa _{N} and Figure 2a shows that after 1980 aa _{N} exceeds aa _{S} by approximately this amount. Hence it is tempting to ascribe this difference between aa _{N} and aa _{S} to the one calibration skip. However, inspection of the figure reveals aa _{N} grows relative to aa _{S} before the ABNHAD change in 1957. In Figure 2a, also plotted (in green) are annual mean a _{ K } values based on the Kindex data from Niemegk (NGK, 1880–present). These have been scaled using the same midclass amplitudes (K2aK) to give a _{NGK} and then multiplied by a bestfit factor of 1.5 to bring it into line with aa _{S}. It can be seen that 1.5 a _{NGK} and aa _{S} are very similar in all years, implying that the upward drift in aa _{N} is too large, even if it is not the ABNHAD change that is solely responsible.
1.3 Studies of space climate change using the aa index
Feynman & Crooker (1978) reconstructed annual means of the solar wind speed, V _{SW}, from aa, using the fact that aa, like all range geomagnetic indices, has an approximately V _{SW} ^{2} dependence (Lockwood, 2013). However, on annual timescales, aa also has a dependence on the IMF field strength, B, which contributes considerably to the long term drift in aa. Lockwood et al. (1999) removed the dependence of aa on V _{SW} using its 27day recurrence (which varies with mean V _{SW} on annual timescales) and derived the open solar flux (OSF, the total magnetic flux leaving the top of the solar corona) using “the Ulysses result” that the radial component of B is largely independent of heliographic latitude (Smith & Balogh, 1995; Lockwood et al., 2004; Owens et al., 2008). This variation was modelled using the OSF continuity equation by Solanki et al. (2000), who employed the sunspot number to quantify the OSF emergence rate. This modelling can be extended back to the start of regular telescopic observations in 1612. Svalgaard & Cliver (2005) noted that different geomagnetic indices have different dependencies on the IMF, B and the solar wind speed, V _{SW}, and therefore could be used in combination to derive both. This was exploited by Rouillard et al. (2007) who used aa in combination with indices based on hourly mean geomagnetic data to reconstruct annual means of B, V _{SW} and OSF back to 1868. Lockwood et al. (2014) used 4 different pairings of indices, including an extended aa data series (with a derived 2 nT correction for a presumed aa _{N} calibration skip in 1957) to derive B, V _{SW} and OSF, with a full uncertainty analysis, back to 1845. Lockwood & Owens (2014) extended the modelling to divide the OSF into that in the streamer belt and in coronal holes and so computed the streamer belt width variation which matches well that deduced from historic eclipse images (Owens et al., 2017). The streamer belt width and OSF were used by Owens et al. (2017), along with 30 years’ of output from a dataconstrained magnetohydrodynamic model of the solar corona based on magnetograph data, to reconstruct solar wind speed V _{SW} and number density N _{SW} and the IMF field strength B, based primarily on sunspot observations. Using these empirical relations, they produced the first quantitative estimate of global solar wind variations over the last 400 years and these were employed by Lockwood et al. (2017) to compute the variation in annual mean power input into the magnetosphere and by Lockwood et al. (2018a) to estimate the variation in geomagnetic storm and substorm occurrence since before the Maunder minimum. The aa index data were also used by the CMIP6 project (the 6th Coupled Model Intercomparison Project) to give a comprehensive and detailed set of solar forcing reconstructions for studies of global and regional climate and of space weather (Matthes et al., 2016). Vennerstrom et al. (2016) used the aa index to investigate the occurrence of great geomagnetic storms since 1868.
Hence the aa index has been extremely valuable in reconstructing space climate, and in taking the first steps towards a space weather climatology that covers more general conditions than do the direct satellite observations (which were almost all recorded during the Modern Grand Maximum (Lockwood et al., 2009)). In addition, the aa data have been hugely valuable in facilitating the exploitation of measured abundances of cosmogenic isotopes, ^{14}C, ^{10}Be and ^{44}Ti (Usoskin, 2017). These records of past solar variability, stored in terrestrial reservoirs such as tree trunks, ice sheets and fallen meteorites, do not overlap much (or at all) with modern spacecraft data. For example, ^{14}C cannot be used after the first atomic bomb tests, and recent ^{10}Be data is less reliable as it is taken from the firn rather than the compacted snow of the ice sheet, whereas ^{44}Ti accumulates in meteorites over very long intervals. The extension of spacecraft data by reconstructions based on aa has given an overlap interval since 1868 which can be used to aid the interpretation of the cosmogenic data (Asvestari & Usoskin, 2016; Owens et al., 2016).
1.4 Making a homogeneous aa index
From Section 1.3, it is apparent that the aa index is very important to studies of past space climate. The issues (such as hemispheric asymmetries and calibration glitches) in the aa index discussed here and other limitations (such as the strong artefact diurnal variation caused by the use of just 2 stations) will not invalidate the space climate work that has been done using aa, although they may call for some corrections. However, the increasing use and importance of aa makes it timely to take a comprehensive look at these issues. In Paper 2 (Lockwood et al., 2018b) we study how the compilation of the aa index influences its timeofday and timeofyear response and, as far as is possible, we make corrections for this and explain and correct the northsouth asymmetries in the distributions of 3hourly aa values. In the present paper, we study the difference in the longterm drift of the northern and southern aa indices. We show that the intercalibration glitches in aa, particularly that between Abinger and Hartland, were actually not just errors, but were also necessary to compensate for the drifts introduced into the data by the secular change in the intrinsic geomagnetic field. Figure 2b shows the end result of the process detailed in the present paper – a process that makes allowance for the effects of these drifts on the aa _{N} and aa _{S} values and then recalibrates the joins between data from the different stations. It can be seen from Figure 2 that the resulting “homogenized” aa _{HN} and aa _{HS} indices obtained from this process are much more similar to each other than are the classic aa indices, aa _{N} and aa _{S}.
Note that in this paper, we do just two things. Firstly, we correct Mayaud’s derivation to allow for secular drift in the main geomagnetic field – a factor which he understood but decided could be neglected. Indeed, part of the brilliance of Mayaud’s formulation was to use the minimum distance to the auroral oval, which is less subject to secular change than the geomagnetic latitude of the station. This is because both the geomagnetic latitude of the station and the geographic latitude of the average auroral oval drift with the secular change in and, although the two do not change in precisely the same way, there are similarities and so part of the secular drift is cancelled out by taking the difference between the two, δ. (Of course they do not cancel completely and that is why there is still a requirement to correct for the secular change in the main field). Secondly we revisit the intercalibration of the stations which becomes necessary when the station data has been corrected for the effect of the secular field change. We take the opportunity to calibrate the revised aa to modern data from the am index which is derived from a global network of 24 stations. As a test of the validity of our approach we show that it makes the variations of the annual means of the northern and southern hemisphere aa indices, aa _{N} and aa _{S}, much more similar although we make no changes that were designed in advance to make them similar. The reason why this is a useful improvement to the index comes from the rationale for averaging aa _{N} and aa _{S} together to get an index (aa) that is hoped to be global in its application and implications. In deriving aa, Mayaud selected the sites to be as close to antipodal as possible and give a continuous data sequence: he did not do calculations that showed that although aa _{N} and aa _{S} are different, the sites are in somehow special such that the difference between aa _{N} and a true global value (that would be detected from an extensive global network) is equal and opposite to that for aa _{S} – a condition that would guarantee that on averaging one gets a valid global mean. This being the case, the only rationale for averaging aa _{N} and aa _{S} to get a valid representation of a global mean is that they should the same. Note that this does not alone solve the asymmetry between the distribution of the aa _{N} and aa _{S} values which is investigated in Paper 2 (Lockwood et al., 2018b).
2 The effect of secular change in the magnetic field
Figure 3 shows the variation of the scale factor, s(δ), derived from the threshold range value L that defines K = 9, with the minimum geocentric angular separation of the station from a nominal auroral oval, δ. The oval is defined to be along typical corrected geomagnetic latitude (Λ_{CG}) of the nightside aurora of 69°. This empirical variation is taken from Mayaud (1968) and is the basis of the L values used to scale Kindices from observed range for all midlatitude stations. The scale factor s(δ) normalizes to an idealized Niemegk station (for which δ = 19° and L = L _{o} = 500 nT, the constant reference values established by Mayaud). The curve is described by the polynomial:$$s\left(\delta \right)=(L/{L}_{\mathrm{o}})=\mathrm{}3.83090.32401.\delta +0.01369.{\delta}^{2}(2.7711.{10}^{4}).{\delta}^{3}+(2.1667.{10}^{6}).{\delta}^{4}$$(1)where δ is in degrees. Equation (1) applies over the range 11° < δ < 40° which requires that the station be at midlatitudes (the relationship not holding for either equatorial or auroral stations).
Fig. 3 The variation of the scale factor s(δ) derived from threshold range value L that defines the K = 9 band, with the minimum angular separation of the station from a nominal auroral oval, δ. This empirical variation is scaled from Mayaud (1968, 1972) and is the basis of the L values used to scale Kindices from observed range for all midlatitude stations. The scale factor s(δ) normalizes to the idealized Niemegk station for which δ = 19° and L = Lo = 500 nT (ideal static Mayaud values). 
In this paper, corrected geomagnetic latitudes (Λ_{CG}), and Magnetic Local Times (MLT), are computed using the IGRF12 model (Thébault et al., 2015) for dates after 1900. For dates before this (not covered by IGRF12) we employ the historical gufm1 model (Jackson et al., 2000), values being scaled using linear regression of values from IGRF12 for an overlap intercalibration interval of 1900–1920. Figure 4a shows the variations of Λ_{CG} for the various stations used to generate aa, plus that of Niemegk (NGK, in orange). The vertical lines show the dates of transfer from one station to the next, using the same color scheme as Figure 2. It can be seen that for much of the 20th century the geomagnetic latitude of the northern and southern hemisphere stations changed in opposite directions, with the northern stations (GRW, ABN and HAD) drifting equatorward and southern (MEL and TOO) drifting poleward. This changed around 1984 when CNB began to drift equatorward, the same direction as the northern hemisphere station at that time, HAD.
Fig. 4 Analysis of the effect of secular change in the geomagnetic field on the aa magnetometer stations using a spline of the IGRF12 and the gufm1 geomagnetic field models (for after and before 1900, respectively). (a) The modulus of the corrected geomagnetic latitude, Λ_{CG} of the stations; (b) the angular separation of the closest approach to the station of a nominal nightside auroral oval (at Λ_{CG} = 69°), δ; and (c) the scale factor s(δ) = L/L _{o} where L is given as a function of δ by Figure 3 and L _{o} = 500 nT, the reference value for the Niemegk station (for which δ is taken to be 19°) except for Canberra which, because of its more equatorward location, is scaled using L _{o} = 450 nT. The northern hemisphere stations are Greenwich (code GRW, in mauve), Abinger (ABN, in green) and Hartland (HAD, in red). The southern hemisphere stations are Melbourne (MEL, in black), Toolangi (TOO, in cyan) and Canberra (CNB, in blue). Also shown is Niemegk (NGK, in orange: data available since 1890). Vertical dashed lines mark aa station changes. 
These changes in the Λ_{CG} of stations were accompanied by changes in the geographic latitude of the nominal aurora oval at Λ_{CG} = 69°. To compute δ or a given date, we use the geomagnetic field models to calculate the Λ_{CG} = 69° contour in the relevant hemisphere in geographic coordinates and then spherical geometry to find the angular great circle distances between the station in question and points on this contour: we then iterate the geographic longitude of the point on the contour until the minimum angular distance is found, which is δ. The variations of δ derived this way for each station are shown in Figure 4b. Using equation (1), this gives the variation of scale factors s(δ) in Figure 4c for each station. It can be seen that the secular change in the intrinsic field has caused a considerable drift in the threshold value for the K = 9 band, L, that should have been used. In compiling the original aa index, it was assumed that s(δ) for each station remained constant (the scale factors given in Section 1.1 being 1/s(δ) and assumed constant). Remember also that larger s(δ) means a higher L which would give a lower aa value. We could consider reanalyzing all the range data using Kscale band thresholds that varied according to Figure 4c: correcting the band thresholds would change many Kvalues, but would also leave many unchanged. However, there are now 150 years of aa data which gives 0.87 million 3hourly intervals to analyse from the two stations, many of which are not available as digital data. Clearly this would be a massive undertaking but it would also be a change in the construction philosophy because aa values have been scaled using constant L values (500 nT for all stations except Canberra for which 450 nT is used). The station correction factors applied in constructing the classic aa values include an allowance for the fact that the L values used are not optimum for the station in question: however, where in the classic aa these factors are constants over time, we here vary them to allow for the secular change in the intrinsic geomagnetic field. Therefore we divide classic aa _{N} and aa _{S} values by the s(δ) that applies for that station at that date. From the above, we stress that this type of correction is already employed in the classic aa data, as it is the same principle as adopted when applying the scale factors for the station. The only difference is that here we use the IGRF12/gufm1 model spline to apply timedependent scale factors, s(δ), rather than the constant ones for each station used in deriving the classic aa.
Introducing these timedependent scaling factors reduces the rise in aa _{N} by 4.11%, over the interval of the Greenwich data (compared to a constant factor) – a rate of drift of 0.0721% p.a.; by 0.83% over the interval of the Abinger data (0.0258% p.a.) and by 5.37% over the interval of the Hartland data (0.0895% p.a.). On the other hand, they increase the rise in aa _{S} by 4.77% over the interval of the Melbourne data (0.0917% p.a.); by 5.28% over the interval of the Toolangi data (0.0880% p.a.); but decrease the rise in aa _{S} by over the interval of the Canberra data by 1.84% (0.0497% p.a.). Thus allowing for the secular change in the intrinsic magnetic field reduces the disparity in the long termdrifts in aa _{N} and aa _{S} that can be seen in Figure 2a.
Figure 5 summarizes the differences between the computation of the classic aa index and that of the new homogenized indices presented in this paper. The lefthand plots compare the variations in the minimum angular distance of the stations to the auroral oval δ and compares them to the constant values used in generating the classic aa index. The righthand plots show the corresponding scale factors, s(δ). The (constant) correction factors used in constructing aa were derived account for several factors in addition to δ and their reciprocals are shown in the righthand plots as dotdash lines. (Reciprocals are plotted because the correction factors were multiplicative whereas we divide by the s(δ) scale factors).
Fig. 5 Variations of (left) the minimum angular distance to the auroral oval, δ, and (right) the scalefactors, s(δ), for the aa stations. The colours used are as in Figure 4 (namely mauve for Greenwich, green for Abinger, red for Hartland, black for Melbourne, cyan for Toolangi and blue for Canberra). The thin lines are the variations shown in Figure 4 and the thick lines are constant values used in generating the classic aa. The dotdash lines in the righthand panels show the reciprocals of the standard multiplicative correction factors and the thick lines the factors corresponding to the constant δ values in the lefthand panels. 
The Mayaud latitude correction formulation has also been used to generate the am, an and as indices since their introduction in 1959. In generating new 15minute indices in four local time sectors, Chambodut et al. (2015) used a different approach employing a polynomial in the stations’ geomagnetic latitudes. Although the purpose of the two schemes is the same, a comparison cannot be made between them because the new Chambodut et al. (2015) indices are 15minute range values, as opposed to the 3hour range (K index) values used by the aa, am, as and an indices. There are four separate indices in the Chambodut et al. (2015) set, one for each of four Magnetic Local Time (MLT) sectors whereas the Mayaud formulation is designed to account predominantly for the midnight sector by taking the minimum geomagnetic latitude offset to the auroral oval (which occurs in the midnight sector). The advantage of using geomagnetic latitude is that greater precision can be obtained (because there is no need to employ a nominal oval location) but the station calibration factor needs considerable annual updates because of the secular drift in the station’s geomagnetic latitude. On the other hand, the Mayaud formulation has the advantage of being less influenced by secular change in the main field, as discussed above.
We here use Mayaud’s formulation to correct for secular change via division by the s(δ) factors. However, was also taking the opportunity to recalibrate (via linear regression) the aa index against the am index which is based on 14 stations in the northern hemisphere and 10 stations in the south. This recalibration is carried out in Section 3.1 for the Hartland and Canberra data using linear regression over 2002–2009 (inclusive), and then passed back (“daisychained”) to earlier stations (from Hartland to Abinger and then Greenwich in Sections 3.2 and 3.3 and from Canberra to Toolingi and then Melbourne in Section 3.4). Figure 6 demonstrates how well this approach works by (top panel) comparing the results of applying this procedure to modern a _{ K } data from a range of stations at different geographic latitudes, λ _{G}: (mauve) Sodankylä, SOD, λ _{G} = 67.367°N; (brown) Eskdalemuir, ESK, λ _{G} = 55.314°N; (orange) Niemegk, NGK, λ _{G} = 52.072°N; (red) Hartland, HAD, λ _{G} = 50.995°N; (blue) Canberra, CNB, λ _{G} = 35.315°S; and (green) a spline of Gangara, GNA, λ _{G} = 31.780°S and nearby Gingin, GNG, λ _{G} = 31.356°S, Gingin is the replacement for Gangara after January 2013 and the spline was made using the overlap data between August 2010 and January 2013: this station pair is chosen as they are in the same southern hemisphere longitude sector as Melbourne but are at lower geomagnetic latitude (see below). The black line shows the am index data, the linear regression against which over the calibration interval (2002–2009 inclusive) gives the slope m and an intercept i for each station. The data are means over 27day Bartels solar rotation intervals and cover 1995 to the present day for reasons discussed in later in this section. It can be seen that the level of agreement between the station data processed this way and the am calibration data is very close for all stations. The scalefactors s(δ) used in Figure 6 vary with time and location between a minimum of 0.896 (for Gangara/Gingin) and maximum of 2.298 (for Sodankylä). The range covered by the aa stations is 0.940 (for Melbourne in 1875) and 1.102 (for Greenwich in 1868) – hence our test set of stations covers all of the range of δ for the aa stations, plus a considerable amount more. The bottom panel of Figure 6 shows the rootmeansquare (rms) deviation of the individual station values from the am index, ε _{rms}. For most Bartels’ rotations this is around 5%, but in the low solar minimum of 2008/2009 rises to consistently exceed 15% and in one 27day interval reaches almost 50%. This is partly because these are percentage errors and the values of am are low, but also because by averaging 24 stations, am has much greater sensitivity at low values than a _{ K } values from a single station. For these 27day intervals the mean ε _{rms} is 9.2% and this is reduced to 3.1% in annual mean data. Hence the procedure we deploy makes modern stations give, to a very good degree of accuracy, highly consistent corrected a _{ K } values, even though they cover a much wider range of δ, and hence correction factors s(δ), than are covered by the aa stations since the start of the aa data in 1868. We estimate that for the range of s(δ) involved in the historic aa data, the latitudinal correction procedure for annual means is accurate to better than 1% on average.
Fig. 6 Top: Scaled variations of modern a _{ K } values from various stations using the station location correction procedure used in this paper. For all stations, the observed a _{ K } values have been corrected for any secular magnetic field change by dividing by the s(δ) factor and then scaled to the am index using the linear regression coefficients m and i obtained from the calibration interval (2002–2009, inclusive). The plot shows 27day Bartels rotation means for data from: (mauve) Sodankylä, SOD; (brown) Eskdalemuir, ESK; (orange) Niemegk, NGK; (red) Hartland, HAD; (blue) Canberra, CNB; and (green) a spline of Gangara, GNA and nearby Gingin, GNG (see text for details). The black line is the am index. Bottom: the rms. fit residual of the rescaled station a _{ K } indices compared with the am index, ε _{rms}, for the 27day means. The average of ε _{rms} for the whole interval shown (1995–2017), is 〈ε _{rms}〉 = 9.7% 
As discussed in the introduction, a major application of the aa index is in reconstructing the nearEarth interplanetary conditions of the past and so it is useful to evaluate if the errors shown in Figure 6 are significant in this context. The data in Figure 6 are restricted after 1995 because this allows to make comparisons with nearcontinuous data from nearEarth interplanetary space. Lockwood et al. (2018c) have shown that gaps in the interplanetary data series render most “coupling functions” (combinations of nearEarth interplanetary parameters used to explain or predict geomagnetic disturbance) highly inaccurate if they are derived using data from before 1995. By introducing synthetic data gaps into nearcontinuous data, these authors show that in many cases differences between derived coupling functions can arise because one is fitting to the noise introduced by the presence of many and long data gaps. After 1995 the WIND, ACE and DISCOVR satellites give much more continuous measurements with fewer and much shorter data gaps. Because of the danger of such “overfitting”, Lockwood et al. (2018c) recommend the power input into the magnetosphere, P _{ α,} as the best coupling function. This is because P _{ α } uses the theoretical basis by Vasyliunas et al. (1982) to reduce the number of free fit variable to just one, the coupling exponent α, and yet achieves almost as high correlations with range geomagnetic indices as coupling functions that have separate exponents for different solar wind variables which, if they do achieve a slightly higher correlation, tend to do so by overfitting and with reduced significance because of the increased number of free fit parameters. The equation for P _{ α } shows a dependence on B ^{2α } V _{SW} ^{(7/3α)}(m _{sw} N _{sw})^{(2/3α)} (where B is the interplanetary magnetic field V _{SW} is the solar wind speed and (m _{sw} N _{sw}) is the mass density on the solar wind) and so accounts for all three nearEarth interplanetary parameters with one free fit parameter, the coupling exponent, α. This is much preferable to forms such as B ^{a} V _{SW} ^{b}(m _{sw} N _{sw})^{c} which have three free fit parameters and so are much more prone to “overfitting”.
In evaluating P _{ α }, great care is here taken in handling data gaps because the oftenused assumption that they have no effect on correlation studies can be a serious source of error. As pointed out by Lockwood et al. (2018c), the muchused Omni2 interplanetary dataset gives an hourly mean value even if there is just one sample available within the hour. This is adequate for parameters such as V _{SW} that have high persistence (i.e. long autocorrelation timescales) but inadequate for parameters such as the IMF orientation factor that has and extremely short autocorrelation timescale. Another complication is that, although coupling functions made by averaging interplanetary parameters and then combining them are valid and valuable, they are not as accurate as ones combined at high time resolution and then averaged. Hence we here start from 1minute Omni data (for after 1995 when data gaps are much fewer and shorter). Hourly means of a parameter are then constructed only when there are sufficient 1minute samples of that parameter to reduce the uncertainty in the hourly mean to 5%. The required number of samples for each parameter was obtained from the MonteCarlo sampling tests carried out by Lockwood et al. (2018c). From these data, hourly means of P _{ α } are constructed (for a range of α values between 0 and 1.25 in steps of 0.01). Note that a data gap in the P _{ α } sequence is formed if any of the required parameters is unavailable. These hourly P _{ α } samples are then made into 3hourly means (matching the 8 timeofday intervals of the geomagnetic range indices) only when all three of the required hourly means of P _{ α } are available. Lastly, as used by Finch & Lockwood (2007), each geomagnetic index data series is masked out at times of the data gaps in the 3hourly P _{ α } samples (and the P _{ α } data correspondingly masked out at the times of any gaps in the geomagnetic data it is being compared to) so that when averages over a longer interval are taken (we here use both 27day Bartels solar rotation intervals and 1year intervals) only valid coincident data are included in the averages of both data sets to be correlated. We find this rather laborious procedure improves the correlations and removes many of the apparent differences between the responses of different geomagnetic observatories.
Figure 7 shows the resulting correlograms for the Bartels rotation (27day) means for the stations also used in Figure 6. The correlation coefficient is shown as a function of the coupling exponent, α. The peak correlations for these 27day means are of order 0.93 and rise to over 0.98 for annual means. Using the three separate exponents a, b and c (discussed above) causes only very small increases in the peak correlation that are not statistically significant when one allows for the additional number of degrees of freedom. The optimum exponent for am for the 27day means is α = 0.45 ± 0.07 (see Lockwood et al. (2018c) for description of the two error estimation techniques that are used to generate these 1σ uncertainties) giving a peak correlation of 0.93. For annual means the peak correlation for am is 0.99 at α = 0.44 ± 0.02 (Lockwood et al., 2018c). The optimum values for all but two of the a _{ K } stations tested fall in, or close to, this range (shown by the coloured dots and vertical dashed lines). The optimum α for Sodankylä (0.42 ± 0.10, in mauve), Niemegk (0.46 ± 0.09, in orange), Hartland (0.42 ± 0.09, in red), Canberra (0.42 ± 0.11, in blue), for Gangara/Gingin (0.49 ± 0.12, in green) and Eskdalemuir (0.56 ± 0.16, in brown) all agree with that for am to within the estimated uncertainties and all show considerable overlap in estimated uncertainty range with that for am. Note that the peak correlation coefficient is also considerably lower for ESK and we find, in general, that increased geomagnetic station noise, and in particular lower instrument sensitivity, increases the optimum α (and its uncertainty range) as well as lowering the peak correlation. We find no consistent variation with magnetic latitude nor with the minimum distance to the auroral oval, δ and effectively the same coupling exponent applies at Sodankylä (considerably closer to the auroral oval than any of the aa stations at any date) as at Gangara/Gingin (further away from the auroral oval than any aa stations at any date). Hence this test shows that the changing magnetic latitudes of the aa stations is not introducing longterm changes into the response of the index to interplanetary conditions.
Fig. 7 Correlogams showing the correlation between 27day Bartels solar rotation means of power input into the magnetosphere, Pα, with the corrected a _{ K } indices, a _{ K }/s(δ), as a function of the coupling exponent, α. The colours are for the same data as used in Figure 6: (mauve) Sodankylä, SOD; (brown) Eskdalemuir, ESK; (orange) Niemegk, NGK; (red) Hartland, HAD; (blue) Canberra, CNB; and (green) a spline of Gangara, GNA and nearby Gingin, GNG (see text for details). The black line is the am index. The coloured dots and vertical dashed lines show the optimum α that gives the peak correlation. The horizontal bars show the uncertainty in the optimum α which is the larger of the two 1σ uncertainties computed using the two procedures described by Lockwood et al. (2018c). 
3 Recalibrating the stations
The drift in the scaling factors will have influenced the intercalibration of the stations. Consider the AbingerHartland join in 1957, which has been the cause of much debate, as discussed in Section 1.2. By end of the interval of the Abinger data, the use of a constant scale factor means that the classic aa was giving aa _{N} values that were too high by 1.44/2 = 0.72%, compared to the mean value for the Abinger interval. On the other hand, for the start of the Hartland data, classic aa _{N} values were too low compared to the average for the Hartland interval by 4.41%. Given that the average aa _{N} value was 24.6 nT for 1956 and 31.6 nT for 1957, this makes a difference of 1.6 nT which is approximately half that required to explain the apparent calibration skip between the Abinger and Hartland data. This throws a new light on the calibration “glitch” at the ABNHAD join which can be regarded as being as much a necessary correction to allow for the effect of the drift in the intrinsic magnetic field as a calibration error.
If we knew the precise dates for which the classic aa index (constant) scalefactors applied, we could generalize them using the s(δ) factors and so employ Mayaud’s original station intercalibrations. However, these dates are not clear and so the corrected indices aa _{N}/s(δ) and aa _{S}/s(δ) need new intercalibrations, which is done in this section using independent data. We take the opportunity to make calibrations that can also allow for other potential factors, such as any change in the subtraction of the regular diurnal variation associated with the change from manual to automated scaling. For both the two northern hemisphere station changes we use data from the Niemegk (NGK) station in Germany, K indices from where are available from 1890. Figure 4c shows that the s(δ) factor is relatively constant for NGK (orange line) but there are nevertheless some small changes (the range of variation in s(δ) for NGK in Figure 4c is 1.8%). Hence we use a _{NGK}/s(δ), where a _{NGK} is scaled from the NGK K values using the standard midclass amplitudes scale (K2aK). For the southern hemisphere we have no independent Kindex record that is as long, nor as stable, as that from NGK. For the ToolangiCanberra join, we use the am index (compiled for a network of stations in both hemispheres, Mayaud, 1980; Chambodut et al., 2013), but find we get almost identical results if we use the southern hemisphere component of am, as, or its northern hemisphere component, an, or even a _{NGK}/s(δ). For the MelbourneToolangi join we have no other data of the duration and quality of Niemegk and so we use use a _{NGK}/s(δ).
The procedure used is to take 11 years’ data from each side of the join (roughly one solar cycle). For both the “before” and “after” interval we compare the aa station data with the calibration station data. We employ daily means, thereby averaging out the diurnal variations. As discussed in the next paragraph, we carry out the calibration separately for eight independent equallength timeofyear (F) ranges in which we regress the corrected aa station data against the corrected calibration set (for the 11 years before and after the join, respectively). This means that each regression is carried out on approximately 500 pairs of daily mean values (11 × 365/8). All regressions were tested to ensure problems did not arise because of lack of homoscedacity, outliers, nonlinearity, interdependence and using a QQ test to ensure the distribution of residuals was Gaussian (thereby ensuring that none of the assumptions of ordinary least squares regression, OLS, are violated). The scatter plot was also checked in the 11 annualmean data points because the main application of the regressions in this paper is to annual means. The “before” and “after” regressions were then compared, as discussed below.
There are a number of reasons to be concerned about seasonal variation in magnetometer calibrations. These may be instrumental, for example early instruments were particularly temperature and humidity sensitive. In addition, induced Earth currents can depend on the height of the water table (although their effect is predominantly in the vertical rather than the horizontal components). In the case of Hartland, its coastal location makes ocean currents, and their seasonal variation, a potential factor. All these may differ at different sites. The conductivities of the ionosphere, and their spatial distribution above the station, and between the station and the auroral oval, will have a strong seasonal component and again this factor may not be exactly the same at different sites. Possibly the largest concern is the quiettime regular variation, S _{R}, that must be subtracted from the data before the range is evaluated and this correction may vary with season as the S _{R} pattern moves in location over the year (Mursula et al., 2009). We note that Matthes et al. (2016) used the Ap index, derived from a wider network of midlatitude magnetometers, to recalibrate the AbingerHartland join in the aa _{N} data and found that the calibration required varied with timeofyear. For this reason, the calibrations were carried out separately in the 8 independent timeofyear (F) bins: the number of F bins was chosen as a compromise between resolution of any annual variation and maintaining a high number of samples in each regression. Although, there was general agreement between the results from the different F bins, there were also consistent differences at some times of year. Note that this procedure allows us to recalibrate not only instrumental effects but also any changes in the background subtraction and scaling practices used to derive the Kindices. Scaling has changed from manual to automated and although the latter are repeatable and testable, the former are not; however, it helps increase homogenity that most of the classic aa data up to 1968 was scaled by Mayaud himself. Lastly, we note that Bartels recognized the need to allow for changes during the year in the intercalibration of stations because the conversion factors that he derived (and are still used to this day to derive the Kp index) not only depend on the station location, the Universal Time, and the activity level, but also depend on the time of year. Bartels employed 4 intervals in the year with three calibration categories (summer, winter and equinox).
By virtue of its more extensive network of stations in both hemispheres, and its use of areaweighted groupings, the am index is, by far, the best standard available to us for a global range index. Starting in 1959, it is coincident in time with all the Canberra data and almost all of the Hartland data. It therefore makes good sense to scale both the aa _{N}/s(δ) and aa _{S}/s(δ) data to recent am data, and then “daisychain” the calibration back to the prior two stations. As noted in the case of the sunspot number data composite (Lockwood et al., 2016), there are always concerns about accumulating errors in daisy chaining; however, we note that the calibration is here passed across only two joins in each hemisphere and the correlations with independent data used to calibrate the joins are exceptionally high. Furthermore, we have an additional check (of a kind not available to use when making the many joins needed for the sunspot number composite), namely that we have independent data from other stations (and equivalent data in the IHV index) that continues through much of the sequence and across all four joins. Strictlyspeaking, the Niemegk data are also a composite, the data series coming from three nearby sites that are within 40 km of each other: Potsdam (1880–1907), Seddin (1908–1930), and Niemegk (1931–present). The site changes were made to eliminate the influence of local electrical noise. Of these site changes, only that in 1930 falls within the 11year calibration periods (either side of an aa station change) that are deployed here, being 5 years after the GreenwichAbinger join and 10 years after the MelbourneToolangi join. We note there are probably improvements that could be made to the Potsdam/Seddin/Niemegk a _{NGK} composite, particularly using data from relatively nearby observatories, such as Swider (SWI), Rude Skov (RSV), Lovö (LOV) and Wingst (WNG) (e.g. Kobylinski and Wysokinski, 2006). Using local stations is preferable because the more distant they are, the larger the difference in the change in their s(δ) factors and hence the more they depend on the main field model used. Some calibration jumps in a _{NGK} have been discussed around 1932 and 1996: the latter is not in an interval used for calibration in this paper, but 1932 does fall within the 22year spline interval used to calibrate the GreenwichAbinger join in 1925.
To test the suitability of the Niemegk a _{ K } index data for use as a calibration spline, we search for longterm drifts relative to independent data. Given that fluctuations within the 11year “before” and “after” intervals will be accommodated by the relevant regression with the aa station data, our only concern is that the mean over the before interval is consistent with that over the after interval. One station that provides Kindices that cover all the aa calibration intervals is Sodankylä (SOD) from where Kindex data is available since 1914 and the SOD data have been used to test and recalibrate aa in the past (Clilverd et al., 2005). The correlation between daily means of a _{NGK } and a _{SOD} exceeds 0.59 for the calibration intervals and the corresponding correlation of annual means always exceeds 0.97. However, this is not an ideal site (geographic coordinates 67.367°N, 26.633 E) in that it is closer to the auroral oval than the midlatitude stations that we are calibrating: its δ falls from 6.11 in 1914 to 4.69 in 2017 and these δ values are below the range over which Mayaud recommends the use of the polynomial given in Equation (1). Figure 3 highlights why this a concern, as it shows that the effects of secular changes in the geomagnetic field on the required scaling factor are increasingly greater at smaller δ. Equation (1) predicts that s(δ) for Sodankylä (SOD) will have risen from 2.302 to 2.586 over the interval 1914–2017, which would make the corrected SOD data more sensitive to the secular change correction than the data from lowerlatitude stations. However, at this point we must remember that in applying Equation (1) to the SOD data we are using it outside the latitude range which Mayaud intended it to be used and also outside the latitude range of data that Mayaud used to derive it. However, Figure 6 shows that using Equation (1) with SOC data over two solar cycles has not introduced a serious error into the a _{SOD}/s(δ) and so it does supply a valuable additional test of the NGK intercalibration data (which also covers 2 solar cycles).
Nevertheless, because of these concerns over the a _{SOD}/s(δ) data, we have also used data from other stations, in particular the Kindices from Lerwick (LER) and Eskdalemuir (ESK) for the 22 years around the AbingerHartland join. We find it is important to correct the Kindices from these stations to allow for effect of changing δ because otherwise one finds false drifts relative to Niemegk, where the change in δ has been much smaller (see Fig. 4). The procedure employed here is to linearly regress 〈a _{NGK}/s(δ)〉_{τ=1yr} and 〈a _{XXX}/s(δ)〉_{τ=1yr,} where XXX is a generic IAGA code of the station used (giving regression slope α and intercept β) then compare the ratio$$M={\langle {a}_{\mathrm{NGK}}/s\left(\delta \right)\rangle}_{\tau =11\mathrm{yr}}/(\alpha {\langle {a}_{\mathrm{XXX}}/s\left(\delta \right)\rangle}_{\tau =11\mathrm{yr}}+\beta )$$(2)for the 11year intervals before and after (M _{B} and M _{A}, respectively). The ideal result would be M _{A}/M _{B} = 1, which would mean that any change across the join in a _{NGK}/s(δ) and a _{XXX}/s(δ) was the same. Because it is highly unlikely that Neimegk and station XXX share exactly the same error at precisely the time of the join, this would give great confidence in the intercalibration.
The steps taken to generate the “homogenous” aa indices, aa _{HN}, aa _{HS} and aa _{H}, are given sequentially in the following subsections. It should be noted that we are using daisy chaining of calibrations which was partially avoided in the classic aa index only because it was assumed that the station scale factors were constant, an assumption that we here show causes its own problems. Even then, the use of the station scale factors was, in effect, a form of daisy chaining.
3.1 Scaling of the Hartland and Canberra data to the am index
The first step is to remove the constant scale factors used in the compilation of the classic aa index to recover the 3hourly a _{ K } indices, i.e. for Greenwich we compute a _{GRW} = [aa _{N}]_{GRW}/1.007, and similarly we use a _{ABN} = [aa _{N}]_{ABN}/0.934, a _{HAD} = [aa _{N}]_{HAD}/1.059, a _{MEL} = [aa _{S}]_{MEL}/0.967, a _{TOO} = [aa _{S}]_{TOO}/1.033, and a _{CNB} = [aa _{S}]_{CNB}/1.084. Given that the major application of the aa index is to map modern conditions back in time, it makes sense to scale a new corrected version to modern data. Hence we start the process of generating a new, “homogeneous” aa data series by scaling modern a _{ K }/s(δ) data (i.e. the a _{ K } values corrected for the secular change in the geomagnetic field) against a modern standard. We use the am index as it is by far the best rangebased index in terms of reducing the false variations introduced by limited station coverage and being homogeneous over time in the distribution stations it has taken data from. However, it contains no allowance for the effects of longterm change in the geomagnetic field and therefore we carry out scaling of a _{HAD}/s(δ) and a _{CNB}/s(δ) data (from Hartland and Canberra, respectively) against am for a limited period only. We employ daily means (Am, A _{CNB} and A _{HAD}) to average out the strong diurnal variation in the a _{ K } indices caused by the use of just one station and the (much smaller) residual diurnal variation in am caused by the slightly inhomogeneous longitudinal coverage (particularly in the southern hemisphere) of the am stations. We use an interval of 7 years because we find that it is the optimum number to minimise estimated uncertainties: we employ 2002–2009 (inclusive) because that interval contains the largest annual mean aa index in the full 150year record (in 2003) and also the lowest in modern times (in 2009), which is only slightly larger than the minimum in the whole record. Hence this interval covers almost the full range of classic aa values. The correlation of the daily means in this interval (23376 in number) are exceptionally high being 0.978 for Am and A _{HAD}/s(δ) and 0.969 for Am and A _{CNB}/s(δ). Linear regressions (ordinary least squares) between these pairs of data series pass all tests listed above and yield the scaling factors given in Table 1. In all regressions between data series we use both the slope (i.e. a gain term, s _{c}) and the intercept (an offset term, c _{c}) because, in addition to differences in instrument sensitivity, noise levels and background subtraction means that there may, in general, also be zerolevel differences. Hence we scale a _{HAD}/s(δ) from Hartland using:$$\begin{array}{c}{\left[{\mathrm{aa}}_{\mathrm{HN}}\right]}_{\mathrm{HAD}}=0.9566.{\mathrm{aa}}_{\mathrm{HAD}}/s\left(\delta \right)1.3448\\ (\mathrm{for}1957\u2013\mathrm{present})\end{array}$$(3) and we scale a _{CNB}/s(δ) from Canberra using: $$\begin{array}{c}{\left[{\mathrm{aa}}_{\mathrm{HS}}\right]}_{\mathrm{CNB}}=0.9507.{a}_{\mathrm{CNB}}/s\left(\delta \right)+0.4660\\ (\mathrm{for}1980\u2013\mathrm{present})\end{array}$$(4)
The correlation coefficients (r _{b} and r _{a} for daily means in 11 years before and after the joins, respectively) and the slope s _{c} and intercept c _{c} for recalibrating stations for the 8 timeofyear (F) bins employed.
Over the interval 1980–present, this gives a distribution of 3hourly ([aa _{HN}]_{HAD} − [aa _{HS}]_{CNB}) values with a mode value of zero, which means there is no systematic difference between the rescaled indices from the two sites.
3.2 Intercalibration of the Hartland and Abinger
Figure 8 details the method by which the Abinger data is calibrated to provide a backwards extension of the Hartland data which is as seamless as possible. As discussed above, the calibration was separated into 8 independent, equalduration bins of the fraction of the year, F. Bin 1 is for 0 ≤ F < 0.125; bin 2 is 0.125 ≤ F < 0.25; and so on, up to bin 8 for 0.875 ≤ F ≤ 1. The left hand column of Figure 8 shows scatter plots between the a _{ABN}/s(δ) values (i.e. the classic aa values from Abinger after removal of the original scalefactor correction and allowance for the effect of the changing intrinsic field) against the a _{NGK}/s(δ) values (the similarlycorrected values from the Niemegk K indices) for the 11year period before the join and the middle column gives the scatter plots of the corrected and rescaled aa index values from Hartland, [aa _{HN}]_{HAD}, as given by Equation (2), for the 11year period after the join, again against the simultaneous a _{NGK}/s(δ) values. In each case, the grey dots are the scatter plot for daily values and black dots are the annual means (for the range of F in question). The correlation coefficients for the daily values are given in Table 1 (we do not give the corresponding correlations for annual means as they all between 0.99 and 0.999 but of lower statistical significance, coming from just 11 samples). The red lines are linear leastsquares regression fits to the daily values and all tests show that this is appropriate in all cases. The third column plots the best linear fit of a _{NGK}/s(δ) in the interval after the join (“fit 2”) as a function of the best linear fit of a _{NGK}/s(δ) in the interval before the join (“fit 1”). The dashed line is the diagonal and would apply if the relationship of the data before the join to a _{NGK}/s(δ) were identical to that after the join. The red lines in the righthand column have slope s _{c} and intercept c _{c}. Assuming that there is no discontinuity in a _{NGK}/s(δ) coincidentally at the time of the join (which means that the relationship between the calibration data and the real aa index before the join is the same as that after the join) we can calibrate the Abinger data (corrected for secular drift) with that from Hartland (rescaled to am, as discussed in the previous section) for a given F using:$${\left[{\mathrm{aa}}_{\mathrm{HN}}\right]}_{\mathrm{ABN}}\left(F\right)={s}_{\mathrm{c}}\left(F\right).{a}_{\mathrm{ABN}}\left(F\right)/s\left(\delta \right)+{c}_{\mathrm{c}}\left(F\right)$$(5)
Fig. 8 The intercalibration of aa _{N} data across the join between the Hartland (HAD) and Abinger (ABN) observations in 1957. The data are divided into eight equallength fractionofyear (F) bins, shown in the 8 rows, with the bottom row being bin 1 (0 ≤ F < 0.125) and the top row being bin 8 (0.875 ≤ F < 1). The lefthand column is for an interval of duration 11years (approximately a solar cycle) before the join and shows scatter plots of the aa data from Abinger (after division by s(δ) to allow for secular changes in the geomagnetic field) against the similarlycorrected simultaneous NGK data, a _{NGK} /s(δ). The middle column is for an interval of duration 11years after the join and shows the corresponding relationship between the alreadyhomogenized aa data from Hartland [aa _{H}]_{HAD} and the simultaneous a _{NGK} /s(δ) data. All axes are in units of nT. The grey dots are daily means to which a linear regression gives the red lines which are then checked against the annual means (for the F bin in question) shown by the black dots. The righthand column shows the fitted lines for the “before” interval, 1, against the corresponding fitted line for the “after” interval, 2: the red line would lie on the dotted line if the two stations had identical responses at the F in question. The slope and intercept of these lines, giving the intercalibration of the two stations at that F, are given in Table 1. 
The first group of values in Table 1 gives the s _{c} and c _{c} values in each F bin for this join between the HAD and ABN data. We here ascribe these values to the centre of the respective F bin and used PCHIP interpolation to get the value required for the F of a given [aa _{N}]_{ABN} data point. The annual variations in both s _{c} and c _{c} are of quite small amplitude but are often not of a simple form. This is not surprising considering the variety of different factors that could be influencing the variations with F, and that they are not generally the same at the two stations being intercalibrated nor at Niemegk.
We use the variation with F of both the scaling factor, s _{c}, and the offset, c _{c}, because at least some of the variation of the intercalibration with F will be associated with the seasonal variation in the regular diurnal variations at the two sites and the background subtraction, which could give offset as well as gain (sensitivity) differences between the two sites.
Inspection of Figure 8 and Table 1 show that there is a variation with F in the relationship between the two sites and our procedure takes account of this. Note that the intercept values are all small and that the red lines are actually shifted from the diagonal by the ratio of the classic aa scalefactors. This emphasizes that the data from these two stations is, after allowance had been made for the secular geomagnetic drift through the s(δ) factor, similar. This reinforces the point that the large “calibration skip” between the Hartland and Abinger aa _{N} values that has been widely discussed in the literature was, in the main, a necessary correction step to allow for the effects of the secular changes in the intrinsic field. Hence making a correction for this apparent calibration error, without first correcting for the temporal variation in the scaling factor s(δ), is only a first order correction and will give somewhat incorrect results in general.
As discussed above, we use Equation (2) to check the intercalibration data from Niemegk, where station XXX is SOD, LER and ESK for this join. If we do not correct for the effect of changing δ on the scaling factor s(δ) for these stations, we obtain values of M _{A}/M _{B} of between 1.018 and 1.052, which implies there is drift in the average Neimegk data (to values that are slightly too low) of between about 3% and 5% over the intercalibration interval. However, after correcting the change in the stations’ δ (in the same way as done for the aa stations and Niemegk in Fig. 4) we get an M _{A}/M _{B} of 1.053, 1.022 and 0.946 for LER, ESK and SOD, respectively. Giving these 3 estimates equal weight gives an average of 1.007, which implies the Niemegk calibration is stable to within 0.7% for our purposes. We note that this is not a test that we can repeat in such detail for all station joins. Hence we do not attempt to correct the NGK intercalibration data, beyond allowing for the effect of the drift in δ on s(δ). However, note that we will test this approach in the level of agreement in the final full aa _{HN} and aa _{HS} data sequences and in section 5, we will compare the longterm variation of these new aa indices with the equivalent IHV index as well as with a _{NGK}/s(δ), a _{ESK}/s(δ) and a _{SOD}/s(δ).
3.3 Intercalibration of Abinger and Greenwich
Figure 9 corresponds to Figure 8, but is for the join between the Abinger and Greenwich data. Note that because the “after” data in this case are the corrected and rescaled Abinger data, [aa _{HN}]_{ABN} given by Equation (2), the slope and intercept values (s _{c} and c _{c}) for this join are influenced by both the scaling of the Hartland data to am and by the AbingertoHartland join. Hence the calibration of Hartland against am is passed back to Greenwich, as is in the nature of daisychaining. Given the data are taken from older generations of instruments and the fact that this second join is influenced by the first, we might have expected the plots to show more scatter than in Figure 8. In fact this is not the case and Table 1 shows the correlations are actually slightly higher for this intercalibration than the one discussed in the last section. Because concerns have been raised about a potential skip in the calibration of the a _{NGK} composite in 1932, we use an “after” interval of 1926–1931 (inclusive, i.e. 6 years rather than the 11 years used for other joins). The correlations for all 8 F bins were indeed found to be marginally lower if the full 11 years (1926–1936) were used but the regression coefficients were hardly influenced at all.
The corrected Sodankylä Kindices give M _{A}/M _{B} = 0.943 for this join which could imply a 6% problem with the Niemegk spline. However, we note that Sodankylä gave a lower value than the average for the AbingerHartland calibration interval which is likely to be a consequence of its close proximity to the auroral oval. As for that join, we here use the Niemegk data as a calibration spline without correction, but will test the result in Section 5.
The Greenwich data are intercalibrated using the equivalent equation to Equation (4):$${\left[{\mathrm{aa}}_{\mathrm{HN}}\right]}_{\mathrm{GRW}}\left(F\right)={s}_{\mathrm{c}}\left(F\right).{a}_{\mathrm{GRW}}\left(F\right)/s\left(\delta \right)+{c}_{\mathrm{c}}\left(F\right)$$(6)using the appropriate s _{c} and c _{c} values given in Table 1 and the interpolation in F scheme described above.
3.4 Intercalibration of the southern hemisphere stations
Figures 10 and 11 are the same as for Figure 8 for the joins between, respectively, the Canberra and Toolangi stations and between the Toolangi and Melbourne stations (note that the colours of the regression lines matches the colours used to define the joins in Fig. 2). The Toolangi and Melbourne data are corrected using the corresponding Equations to (4) and (5) to give [aa _{HS}]_{TOO} and [aa _{HS}]_{MEL}.
Figure 10 uses the am data to make the CanberraToolangi intercalibration but, as mentioned above, almost identical results were obtained if either the as index or a _{NGK}/s was used. Using a _{NGK}/s did increase the scatter in the daily values slightly, but the regression fits remained almost exactly the same. In the case of the ToolangiMelbourne join, the best comparison data available are the Niemegk K indices, but based on the above experience of using it for the CanberraToolangi join, it is not a major concern that the intercalibration data are from the opposite hemisphere, although, as expected, it does increase the scatter between the daily means.
Note that the only operation to make aa _{HN} and aa _{HS} similar is the scaling of both to am over the interval 2002–2009, achieved by Equations (2) and (3). Thereafter the northern and southern data series are generated independently of each other. Therefore the degree to which the two hemispheric indices agree with each other over time becomes a test of the intercalibrations and the stability of the datasets.
4 The homogeneous composite
We can then put together 150year composite of aa _{HN} (using [aa _{HN}]_{GRW}, [aa _{HN}]_{ABN}, and [aa _{HN}]_{HAD}) and the red line in Figure 2b shows the resulting variations in annual means. The blue line is the corresponding composite of aa _{HS} (using [aa _{HS}]_{MEL}, [aa _{HS}]_{TOO}, and [aa _{HS}]_{CNB}). Comparison with Figure 2a shows that the calibrations described in the previous section have produced hemispheric data series which agree much more closely with each other than do aa _{N} and aa _{S.} To quantify the improvement, Figure 12 compares the distributions of the differences in daily means of northern and southern hemisphere indices in 50year intervals, Δ_{NS}. The top row is for the classic aa indices (so Δ_{NS} = aa _{N} − aa _{S}). The bottom row is for the homogenised aa indices (so Δ_{NS} = aa _{HN} − aa _{HS}). The left column is for 1868–1917 (inclusive); the middle column for 1918–1967; and the righthand column for 1968–2017. Note that distributions are narrower and taller for the first time interval because mean values were lower and so hemispheric differences are correspondingly lower.
Fig. 12 Distributions of the differences in daily means of northern and southern hemisphere indices, Δ_{NS}, for 50year intervals. The top row is for the classic aa indices, so that Δ_{NS} = aa _{N} − aa _{S}. The bottom row is for the homogenised aa indices, so that Δ_{NS} = aa _{HN} − aa _{HS}. Parts (a) and (d) are for 1868–1917 (inclusive); parts (b) and (e) are for 1918–1967; and parts (c) and (f) are for 1968–2017. In each panel, the vertical orange line is at Δ_{NS} = 0, the vertical cyan line is the median of the distribution, the vertical red line the mean (〈Δ_{NS}〉), and the green lines the upper and lower deciles. Note they are plotted in the order, orange, cyan, then red and so the mean can overplot the others (this particularly occurs in the bottom row). In each panel the distribution mean, 〈Δ_{NS}〉 and the standard deviation, σ_{ΔNS}, are given. 
A number of improvements can be seen in the distributions for (aa _{HN} − aa _{HS}), compared to those for (aa _{N} − aa _{S}). Firstly the mean of the distributions has been reduced to zero (to within 10^{−3}) in all three time intervals by the homogenized index. Not only is this smaller than for the corresponding classical index, but also the upward drift in the mean value Δ_{NS} has been removed. This improvement in the mean difference quantifies the improvement that can be seen visually by comparing Figures 2a and b. Secondly, the width of the distribution in (aa _{HN} − aa _{HS}) is always lower than for the corresponding distribution of (aa _{N} − aa _{S}): this can be seen in the given values of the standard deviation, σ_{ΔNS} and in the separation of the decile values (which are given by the vertical green lines). Thirdly the Δ_{NS} distributions for the classic index show a marked asymmetry: this can be seen by the fact that the median of the distributions (vertical cyan line) is consistently smaller than the mean and that the modulus of the lower decile value is always less than the upper decile value. This asymmetry has been removed completely in the homogenized data series after 1917. (For 1868–1917 the 1σ points are symmetrical but the mode is slightly lower than the mean.) Lastly the distributions for the classic index show a tendency for quantized levels (particularly for 1868–1917) and more kurtosis in shape than for the homogenized indices. On the other hand, (aa _{HN} − aa _{HS}) shows very close to a Gaussian form at all times. If there is a physical reason why the distribution should diverge from a Gaussian, it is not clear. Hence, agreement between the northern and southern hemisphere indices has been improved, in many aspects, by the process described in this paper.
Lastly, Figure 2c compares the annual means of the homogenised aa index derived here, defined by$${\mathrm{aa}}_{\mathrm{H}}=({\mathrm{aa}}_{\mathrm{HN}}+{\mathrm{aa}}_{\mathrm{HS}})/2$$(7)with the classic aa index and the corrected aa index, aa _{C}, that was generated by Lockwood et al. (2014) by correcting the classic aa index for the HartlandAbinger intercalibration using the Ap index. The black line is the aa _{H} index from Equation (6) and so contains allowance for the secular drift in the main field and for the recalibration of stations presented in Section 3. The mauve line is the classic aa index. It can be seen that, because of the scaling to the recent am index data, the aa _{H} index values are always a bit lower than aa. The cyan line and points show annual means in the am index. It is noticeable that as we go back in time towards the start of these data, these am means follow the classic aa rather well and so become slightly larger than the corresponding annual means in aa _{H}. This indicates that the secular drift in the intrinsic geomagnetic field is having an influence on even am over its lifetime. The orange line is the corrected aa data series, aa _{C}. By definition, this is the same as aa before the AbingerHartland join 1957: hence the orange line lies underneath the mauve one in this interval. Between 1957 and 1981, aa _{C} is slightly larger than aa _{H} most of the time, but after 1981 the orange line can no longer be seen because it is so similar to aa _{H}. Hence correcting for the AbingerHartland join, without correcting for the effects of the secular drift in the intrinsic field have caused corrected indices such as aa _{C} (and others like it) to underestimate the upward rise in aa _{H}.
Taking 11point running means to average out the solar cycle, aa, aa _{C} and aa _{H} all give smoothed minima in 1902 of, respectively, 11.66 nT, 11.77 nT and 10.87 nT. The maxima for aa and aa _{H} are both in 1987, shortly after the peak of the sunspot grand maximum (Lockwood & Fröhlich, 2007), being 27.03 nT and 24.25 nT, giving a rises of 15.37 nT in aa and 13.38 nT in aa _{H} over the interval 1902–1987. The corrected index, aa _{C}, is somewhat different with a value of 24.51 nT in 1987, but a slightly larger peak of 24.80 nT in 1955. Over the interval 1902–1987 the rise in aa _{C} is 12.73 nT.
5 Comparison of the homogenized aa index with the IHV index and corrected a _{ K } values from Niemegk, Eskdalemuir and Sodankylä
The development of the InterHour Variability (IHV) index was discussed in Section 1.2. The most recent version was published by Svalgaard & Cliver (2007). It is based on hourly means of the observed horizontal magnetic field at each station and its compilation is considerably simpler than, and completely different to, that of the range indices such as aa. It is defined as the sum of the unsigned differences between adjacent hourly means over a 7hour interval centered on local midnight (in solar local time, not magnetic local time). The daytime hours are excluded to reduce the effect of the regular diurnal variation and UT variations are removed assuming an equinoctial timeofday/timeofyear pattern, which reduces the requirement to have a network of stations with full longitudinal coverage. Using data from 1996–2003, Svalgaard & Cliver (2007) showed that IHV has major peaks in the auroral ovals, but equatorward of Λ_{CG} = 55° it could be normalized to the latitude of Niemegk using a simple adhoc function of Λ_{CG}. Note that IHV does not allow for the changes in the stations’ Λ_{CG} due to the secular change in geomagnetic field. This will be a smaller factor for IHV than for the range indices as the latitude dependence is weaker. However, in IHV this effect will also be convolved with that of the changing distribution and number of available stations. This is because the number of stations contributing to the annual mean IHV values tabulated by Svalgaard & Cliver (2007) varies, with just one for 1883–1889, two for 1889–1900, rising to 51 in 1979 and before falling again to 47 in 2003. Although the removal of the diurnal variation (by assuming an equinoctial variation) and the removal of the Λ_{CG} variation (by using the polynomial fit to the latitudinal variation in the 1996–2003 data) allows the IHV index to be compiled even if only one station is available, such an index value will have a much greater uncertainty because it will not have the noise suppression that is achieved by averaging the results from many stations in later years. It must be remembered, therefore, that the uncertainties in the IHV index increase as we go back in time.
Lockwood et al. (2014) show that in annual mean data IHV correlates well (correlation coefficient, r = 0.952) with BV _{SW} ^{ n }, where B is the IMF field strength, V _{SW} is the solar wind speed and n = 1.6 ± 0.8 (the uncertainty being at the 1σ level), whereas the corrected aa index and gave r = 0.961 with n = 1.7 ± 0.8. The difference in the exponent n is small (and not statistically significant) and so we would expect the longterm and solar cycle variations in aa and IHV to be very well correlated. Indeed, Svalgaard & Cliver (2007) found that even in Bartel’s rotation period (27day) means IHV and the range am index were highly correlated (r = 0.979).
Figures 13a–f compare annual means of the new homogenised indices aa _{H,} aa _{HN} and aa _{HS} to the IHV index. The lefthand plots show the time series and the bestfit linear regression of IHV. The right plots so scatter plots of the new indices against IHV and the least squares bestfit linear regression line in each case. For comparison, the bottom panel compares the hemispheric homogenized indices aa _{HN} and aa _{HS}. The agreement is extremely good in all cases: for aa _{HS} and IHV the coefficient of determination is r ^{2} = 0.937; for aa _{HN} and IHV r ^{2} = 0.962; for aa _{H} and IHV, r ^{2} = 0.958; and for aa _{HN} and aa _{HS}, r ^{2} = 0.992. This level of agreement is exceptionally high, considering IHV is constructed in an entirely different manner, and from different data and with different assumptions (e.g. it assumes an equinoctial timeofday/timeofyear pattern). In particular, note that IHV is not homogeneous in its construction as the number of stations contributing decreases as we go back in time: this would increase random noise but not explain systematic differences. Also IHV only uses nightside data whereas k indices use data from all local times; however, k indices respond primarily to substorms (see supplementary material file of Lockwood et al., 2018c) which occur in the midnight sector. Also shown in Figure 13 are the corresponding comparisons with the corrected a _{ K } indices from Niemegk, Sodankylä, and Eskdalemuir a _{NGK}/s(δ), a _{SOD}/s(δ) and a _{ESK}/s(δ) (parts g/h, i/j, and k/l respectively). The coefficients of determination (r ^{2}) are 0.945 and 0.958 and 0.914, respectively.
Fig. 13 Comparison of the new homogenised indices aa _{H,} aa _{HN} and aa _{HS}, with the IHV index and the δcorrected a _{NGK}, a _{SOD} and a _{ESK} values. The lefthand plots show the time series and their respective bestfit linear regressions. The righthand plots show scatter plots of the new indices against the test indices (a _{NGK}/s(δ), a _{SOD}/s(δ) or IHV) and the leastsquares bestfit linear regression line. The linear correlation coefficient r is given in each case. Parts (a) and (b) are for aa _{HN} and IHV; parts (c) and (d) are for aa _{HS} and IHV; parts (e) and (f) are for aa _{HS} and IHV; parts (g) and (h) are for aa _{H} and the corrected a _{ K } values from Niemegk, a _{NGK}/s(δ); parts (i) and (j) are for aa _{H} and the δcorrected a _{ K } values from Sodankylä, a _{NGK}/s(δ), parts (k) and (l) are for aa _{H} and the δcorrected a _{ K } values from Eskdalemuir, a _{ESK}/s(δ). For comparison, the bottom panel (parts m and n) compares the hemispheric homogenized indices aa _{HN} and aa _{HS}. In each panel, the grey area defines the estimated ±1σ uncertainty in aa _{H}. 
Hence Figure 13 is a good test of the intercalibrations used in constructing aa _{HN} and aa _{HS} in the context of annual mean data. There are differences between all the regressed variations but they are small. The internal correlation between the hemispheric aa indices is now greater than that with other equivalent data series: the worst disagreements are that aa _{HN} exceeds aa _{HS} around the peak of solar cycle 17 (around 1940) and aa _{HS} exceeds aa _{HN} around the peak of solar cycle 14 (around 1907). In both cases, the independent data in Figure 13 indicate that the error is in both aa _{HN} and aa _{HS} as these data follow aa _{H} more closely. In the case of the largest error (around 1940), IHV, a _{NGK}/s(δ), a _{SOD}/s(δ) and a _{ESK}/s(δ) all also suggest that aa _{HS} is an underestimate by slightly more than aa _{HN} is an overestimate and so aa _{H} is very slightly underestimated, but only by less than 0.5 nT. It should be noted that this largest deviation between aa _{HN} and aa _{HS} occurs when the data are supplied by the Abinger and Toolangi observatories, respectively and that Figure 2b shows that aa _{HN} and aa _{HS} agree more closely both earlier and later in the interval 1925–1956 when these two stations are used. Hence the deviation is caused by relative drifts in the data from these stations and not by the intercalibrations developed in this paper.
The grey areas in the lefthand panels of Figure 13 show the estimated ±1σ uncertainty in annual aa _{H} estimates, where σ = 0.86 nT is the standard deviation of the distribution of annual (aa _{HN} − aa _{HS}) values.
Figure 14 gives a more stringent test of the station joins in the aa _{H} index using means over 27day Bartels rotation intervals. The dots in the lefthand panels show the deviations of 27means of the aa _{H} index from scaled test indices. Longterm trends in those deviations are searched for by looking at 13point running means (just under 1 year) of those deviations, shown by the black lines. The histograms to the right give the overall occurrence distribution of the deviation of the dots from the blacked lines and hence give an indication of the scatter around the trend. In all cases, the test data are well correlated with aa _{H}, the linear correlation coefficients for Bartels rotation means being: 0.938 for IHV; 0.964 for a _{NGK}/s; 0.948 for a _{SOD}/s; and 0.900 for a _{ESK}/s. For comparison, the values are 0.989 and 0.987 for aa _{HN} and aa _{HS} (but of course they are not independent data from aa _{H}). To derive the deviation, the test index is linearly regressed with aa _{H} index over a common calibration interval of 1990–2003 (for which data are available for all indices) and test data that have been scaled aa _{H} using the leastsquares linear regression fit for this interval are denoted with a prime. What we are searching for are consistent steplike change in the deviations (on timescales comparable to the ~1 year smoothing time constant) at the time of one of the aa station joins (which are marked by the vertical dashed lines using the same colour scheme as in Fig. 2). The appearance of such a step in several of the test data series would indicate a calibration error in aa _{H}. The mosttested join is that between Toolangi and Canberra in aa _{HS} (blue dashed line). In generating aa _{H} this was calibrated using a spline of the am index. The only steplike feature is shortly after that join in the Eskdalemuir data and, as this is the least well correlated of the test indices and shows the latest scatter in its 27day means, this is not good evidence for a problem with this join. The muchdiscussed AbingertoHartland join (red dashed line) also does not produce a consistent signature in the test data, with equal mean deviations before and after in all the corrected a _{K}index data, i.e. from Sodankyla, Neimegk, Eskdalemuir and aa _{HS} (which at this time is data from Toolangi). There is a small step in the Niemegk deviations around 1970, but this is after the calibration interval for this join (which is 1946–1967). The deviations for IHV’ show a step between 1960.5 and 1963.5, but this is after the join and coincides with the large fall in all indices seen at the end of solar cycle 19. Hence this appears to be related to a slight nonlinearity between the IHV index and rangebased indices, rather than the calibration join. For the two earlier joins, GreenwichtoAbinger (green dashed line) and MelbournetoToolangi (cyan dashed line), the independent test data available are IHV, the aa _{H} data from the opposite hemisphere and to a lesser extent a _{SOD}/s (which only extends back to 1914 which is only 6 years before the MELTOO join). Note, however, that IHV is compiled using hourly means from just 2 stations before 1900, rising to 11 by 1920, and one of stations is Niemegk, and so it does not provide a fully independent test of our station intercalibrations. We note that the data use to make the joins, a _{NGK}/s, show some fluctuation over the relevant intervals, but no consistent step. The IHV data do show a step 3 years before the MELTOO join but this is at the same time as the strong rise in all indices at the start of cycle 15 from the very low values during the minimum between cycles 14 and 15. Hence, as for the 1961/2 step in IHV, this appears to be more associated with a slight nonlinearity between IHV and a _{K} /s values at low activity than with a calibration skip caused by an aa station change.
Fig. 14 (Lefthand plots) Bartels rotation interval (27day) means of the deviation of various scaled indices from the new aa _{H} index (points) and 13point running means of those 27day means (black lines). The indices are all scaled by linear regression to the aa _{H} index over the interval 1990–2003 (the interval shaded in gray). A prime on a parameter denotes that this scaling has been carried out. (Righthand plots) The distribution of the deviations of the 27day means (the dots in the corresponding lefthand plot) from their simultaneous 13point smoothed running means (the black line in the corresponding lefthand plot). (a) and (b) are for the scaled IHV index, IHV′; (c) and (d) for the corrected and scaled a _{ K } index from Niemegk, (a _{NGK}/s)′; (e) and (f) for the corrected and scaled a _{ K } index from Sodankylä, (a _{SOD}/s)′; (g) and (h) for the corrected and scaled a _{ K } index from Eskdalemuir, (a _{ESK}/s)′; (i) and (j) for the homogenized northern hemisphere aa index, aa _{HN}; and (k) and (l) for the homogenized southern hemisphere aa index, aa _{HS}. The dashed lines in the lefthand plots mark the dates of aa station joins, using the same colour scheme as used in Figure 2. Horizontal grey lines are at zero deviation. 
6 Conclusions
The classic aa indices now cover 150 years, an interval long enough that there are significant effects on the indices due to the effects of secular changes in the intrinsic geomagnetic field. We here correct for these using the standard approach to calculating Kindices, but making the scale factors employed for each station a function of time. We also show that this improves the intercalibration of the rangebased data from other stations which had also been influenced by the assumption of constant scale factors. The intercalibrations are shown, in general, to depend on timeof year (F), which is here accommodated using 8 equalsized bins in F and interpolating to the date of each 3hourly measurement. This allows us to correct for seasonal effects on both the instrumentation and the background subtraction procedures.
We call the corrected data series that we have produced the “homogenized” aa data series, because it eliminates a number of differences between the data series from the two hemispheres. In this paper we have concentrated on the results in annual means. In a companion paper (Paper 2), we make further allowances for the effects of the variations of each station’s sensitivity with timeofday and timeofyear (Lockwood et al., 2018b). These further corrections are carried out such that the annual means presented here (〈aa _{H}〉_{τ=1yr}, 〈aa _{HN}〉_{τ=1yr}, and 〈aa _{HS}〉_{τ=1yr}) remain unchanged. In the supplementary material to the present paper we give the annual mean values of the homogenized aa _{H}, aa _{HN}, and aa _{HS} data series as these will be subject to no further corrections. We also attach a file containing the annual δ and s(δ) values used to make allowance for the secular field changes. The supplementary data attached to Paper 2 will give the daily and 3hourly values of aa _{H}, aa _{HN} and aa _{HS}. The equations given in the text of the present paper, along with the coefficients given in Table 1, give a complete recipe for generating this first level of the aa homogenized data series from the classical aa values.
Given the close agreement between the independentlycalibrated aa _{HN} and aa _{HS} indices, we can be confident that the 13.38 nT rise in the 11year averages of aa _{H} is accurate. The standard error in the difference of annual means of aa _{HN} and aa _{HS} is ξ_{1} = 0.082 nT for 1996–2017 and ξ_{2} = 0.039 nT for 1868–1889. Treating these as the uncertainties in the average levels in these intervals gives an estimate uncertainty in the difference between them of (ξ_{1} ^{2} + ξ_{2} ^{2})^{1/2} = 0.091 nT which is just ≈1% of the 13.38 nT rise in aa _{H}. Also, because aa _{HN} and aa _{HS} are calibrated against the am index over the last 5 years, we can also be confident that the values of 10.87 nT and 24.25 nT for 1902 and 1987 are correct to within the above accuracies. Thus the new 11year smoothed aa _{H} values reveal a rise of 123% between these two dates. In comparison, the corresponding rise in the classic aa between these dates was 15.36 nT (132%) and that in aa _{C} was 12.73 nT (108%). Therefore, although the rise in the classic aa was excessive (by 9%), correcting for the AbingerHartland intercalibration in isolation, without allowing for the drifts caused by the secular change in the main field, gives an overcorrection and the rise in aa _{C} is here found to be too small by 15%.
As a result, reconstructions of solar wind parameters that have been based on aa _{C}, such as the open solar flux, the solar wind speed and the nearEarth IMF (Lockwood et al., 2014) will also have underestimated the rise that took place during the 20th century and may not have the correct temporal waveform (given that the peak aa _{C} was in 1955 rather than 1987). However, it should be noted that aa _{C} was one of four geomagnetic indices (including IHV) used by Lockwood et al. (2014) which means the effect of its underestimation of the rise will be reduced in the reconstructions. Replacing aa _{C} with aa _{H} should also have the effect of reducing the uncertainties. Note also that the dependencies of the various indices on the IMF, B, and the solar wind speed, V _{SW}, are such that it is the V _{SW} estimates that will be most affected. This will be investigated, and the reconstructions amended, in subsequent publications.
A point of general importance to geomagnetic indices is that, as shown in Figure 2c, the am index follows the classic aa series very closely, but as we go back in time towards the start of the am index in 1959, the homogenized index aa _{H} becomes progressively smaller than aa by a consistent amount. Similarly, close inspection of Figure 13 shows that IHV has fallen by slightly more than aa _{H} in this interval. These differences are the effects of changes in the intrinsic geomagnetic field on the indices. In the case of all previous indices based on K values (Kp, ap, am, as, an, and the classic aa) this arises from drift in the L values (the threshold value of the K = 9 band) in the case of IHV it arises from the normalization to a reference latitude (and potentially also from the use of an equinoctial pattern used to remove the diurnal variation). Hence we conclude that all geomagnetic indices should make allowance for these effects if they are to be fit for purpose in studies of space climate change.
Supplementary Material
Supplementary files supplied by authors
Access hereAcknowledgments
Without either group knowing it, much of the work reported in this paper was carried out completely independently at University of Reading (UoR) and at École et Observatoire des Sciences de la Terre (EOST) but when we discovered that we had independently reached almost identical results, we decided a single publication would minimise duplication and potential confusion. The authors thank all staff of ISGI and collaborating institutes, past and present, for the longterm compilation and databasing of the aa and am indices which were downloaded from http://isgi.unistra.fr/data_download.php. We also thank the staff of Geoscience Australia, Canberra for the southern hemisphere aastation Kindex data, and colleagues at British Geological Survey (BGS), Edinburgh for the northern hemisphere aastation Kindex data, and the Helmholtz Centre Potsdam of GFZ, the German Research Centre for Geosciences for the Niemegk Kindex data. The work at UoR and BGS is supported by the SWIGS NERC Directed Highlight Topic Grant number NE/P016928/1 with some additional support at UoR from STFC consolidated grant number ST/M000885/1. The work at EOST is supported by CNES, France.
The editor thanks two anonymous referees for their assistance in evaluating this paper.
References
 Adebesin BO. 2016. Investigation into the linear relationship between the AE, Dst and ap indices during different magnetic and solar activity conditions. Acta Geod Geophys 51(2): 315–331. DOI: 10.1007/s4032801501282 [CrossRef] [Google Scholar]
 Asvestari E, Usoskin IG. 2016. An empirical model of heliospheric cosmic ray modulation on longterm time scale. J Space Weather Space Clim 6: A15, 2016. DOI: 10.1051/swsc/2016011. [Google Scholar]
 Bartels J, Heck NH, Johnston HF. 1939. The three hourly range index measuring geomagnetic activity. Terr Magn Atmos Electr 44: 411–424. DOI: 10.1029/te044i004p00411. [NASA ADS] [CrossRef] [Google Scholar]
 Bubenik DM, FraserSmith AC. 1977. Evidence for strong artificial components in the equivalent linear amplitude geomagnetic indices. J Geophys Res 82(19): 2875–2878. DOI: 10.1029/ja082i019p02875. [CrossRef] [Google Scholar]
 Caan MN, McPherron RL, Russell CT. 1978. The statistical magnetic signature of magnetospheric substorms. Planet Space Sci 26(3): 269–279. DOI: 10.1016/00320633(78)900922. [CrossRef] [Google Scholar]
 Chambodut A, Marchaudon A, Menvielle M, ElLemdani F, Lathuillere C. 2013. The Kderived MLT sector geomagnetic indices. Geophys Res Lett 40: 4808–4812. DOI: 10.1002/grl.50947. [CrossRef] [Google Scholar]
 Chambodut A, Marchaudon A, Lathuillère C, Menvielle M, Foucault E. 2015. New hemispheric geomagnetic indices with 15 min time resolution. J Geophys Res Space Phys 120: 9943–9958, DOI: 10.1002/2015JA021479. [CrossRef] [Google Scholar]
 Clilverd MA, Clarke E, Ulich T, Linthe J, Rishbeth H. 2005. Reconstructing the longterm aa index. J Geophys Res 110: A07205. DOI: 10.1029/2004JA010762. [Google Scholar]
 Clauer CR, McPherron RL. 1974. Mapping the local timeuniversal time development of magnetospheric substorms using midlatitude magnetic observations. J Geophys Res 79(19): 2811–2820. DOI: 10.1029/JA079i019p02811. [CrossRef] [Google Scholar]
 Feynman J, Crooker NU. 1978. The solar wind at the turn of the century. Nature 275: 626–627. DOI: 10.1038/275626a0 [CrossRef] [Google Scholar]
 Finch ID, Lockwood M. 2007. Solar windmagnetosphere coupling functions on timescales of 1 day to 1 year. Ann Geophys 25: 495–506. DOI: 10.5194/angeo254952007. [CrossRef] [Google Scholar]
 Gjerloev JW, Hoffman RA. 2014. The largescale current system during auroral substorms. J Geophys Res Space Phys 119: 4591–4606. DOI: 10.1002/2013JA019176. [CrossRef] [Google Scholar]
 International Association of Geomagnetism and Aeronomy. 1975. IAGA Bulletin, 37, 128 p. resolution 3. [Google Scholar]
 Jackson A, Jonkers ART, Walker MR. 2000. Four centuries of geomagnetic secular variation from historical records. Phil Trans Royal Soc London A 358(1768): 957–990. DOI: 10.1098/rsta.2000.0569. [CrossRef] [Google Scholar]
 Jarvis MJ. 2004. Observed tidal variation in the lower thermosphere through the 20th century and the possible implication of ozone depletion. J Geophys Res 110: A04303. DOI: 10.1029/2004JA010921. [Google Scholar]
 Kobylinski Z, Wysokinski A. 2006. On the LongTerm Consistency of the Magnetic H Component and K Indices at the Swider and Niemegk Observatories (1921–1975). Sun Geosphere 1(2): 37–41. [Google Scholar]
 Lockwood M. 2003. Twentythree cycles of changing open solar magnetic flux, J Geophys Res 108(A3): 1128. DOI: 10.1029/2002JA009431. [CrossRef] [Google Scholar]
 Lockwood M. 2013. Reconstruction and prediction of variations in the open solar magnetic flux and interplanetary conditions. Living Rev Sol Phys 10: 4. DOI: 10.12942/lrsp20134. [CrossRef] [Google Scholar]
 Lockwood M, Fröhlich C. 2007. Recent oppositely directed trends in solar climate forcings and the global mean surface air temperature. Proc Roy Soc A 463: 2447–2460. DOI: 10.1098/rspa.2007.1880. [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood M, Owens MJ. 2014. Centennial variations in sunspot number, open solar flux and streamer belt width: 3. Modelling,. J Geophys Res Space Phys 119(7): 5193–5209. DOI: 10.1002/2014JA019973. [CrossRef] [Google Scholar]
 Lockwood M, Stamper R, Wild MN. 1999. A doubling of the sun’s coronal magnetic field during the last 100 years. Nature 399: 437–439. DOI: 10.1038/20867. [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood RB, Forsyth AB, McComas DJ. 2004. Open solar flux estimates from nearEarth measurements of the interplanetary magnetic field: comparison of the first two perihelion passes of the Ulysses spacecraft. Ann Geophys 22: 1395–1405. DOI: 10.5194/angeo2213952004. [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood M, Rouillard AP, Finch ID. 2009. The rise and fall of open solar flux during the current grand solar maximum. ApJ 700(2): 937–944. DOI: 10.1088/0004637X/700/2/937. [CrossRef] [Google Scholar]
 Lockwood M, Nevanlinna H, Barnard L, Owens MJ, Harrison RG, Rouillard AP, Scott CJ. 2014. Reconstruction of Geomagnetic Activity and NearEarth Interplanetary Conditions over the Past 167 Years: 4. NearEarth Solar Wind Speed, IMF, and Open Solar Flux. Ann Geophys 32: 383–399. DOI: 10.5194/angeo323832014. [Google Scholar]
 Lockwood M, Owens MJ, Barnard LA, Usoskin IG. 2016. Tests of sunspot number sequences: 3. Effects of regression procedures on the calibration of historic sunspot data. Sol Phys 291: 2829–2841. DOI: 10.1007/s1120701508292. [NASA ADS] [CrossRef] [Google Scholar]
 Lockwood M, Owens MJ, Barnard LA, Scott CJ, Watt CE. 2017. Space Climate and Space Weather over the past 400 years: 1. The Power input to the Magnetosphere. J Space Weather Space Clim 7: A25. DOI: 10.1051/swsc/2017019. [Google Scholar]
 Lockwood M, Owens MJ, Barnard LA, Scott CJ, Watt CE, Bentley S. 2018a. Space Climate and Space Weather over the past 400 years: 2. Proxy indicators of geomagnetic storm and substorm occurrence. J Space Weather Space Clim 8: A12. DOI: 10.1051/swsc/2017048. [Google Scholar]
 Lockwood M, Finch ID, Barnard LA, Owens MJ, Clarke E. 2018b. A homogeneous aa index: 2. hemispheric asymmetries and the equinoctial variation. J Space Weather and Space Clim, submitted with the present paper. [Google Scholar]
 Lockwood M, Bentley S, Owens MJ, Barnard LA, Scott CJ, Watt CE, Allanson O. 2018c. The development of a space climatology: 1. Solarwind magnetosphere coupling as a function of timescale and the effect of data gaps. Space Weather 16. DOI: 10.1029/2018SW001856. [Google Scholar]
 Love JJ. 2011. Longterm biases in geomagnetic K and aa indices. Ann Geophys 29: 1365–1375. DOI: 10.5194/angeo2913652011. [CrossRef] [Google Scholar]
 Martini D, Mursula K. 2008. Centennial geomagnetic activity studied by a new, reliable longterm index. J Atmos SolTerr Phys 70: 1074–1087. DOI: 10.1016/j.jastp.2008.01.008. [CrossRef] [Google Scholar]
 Matthes K, Funke B, Andersson ME, Barnard L, Beer J, et al. 2016. Solar Forcing for CMIP6 (v3.1). Geoscientific Model Development 10(6): 2247. DOI: 10.5194/gmd201691. [Google Scholar]
 Mayaud PN. 1968. Indices Kn, Ks et Km 1964–1967. IAGA Bull 21, Éditions du CNRS, Paris. Available from http://isgi.unistra.fr/Documents/References/Mayaud_CNRS_1968.pdf. [Google Scholar]
 Mayaud PN. 1971. Une mesure planétaire d’activité magnetique, basée sur deux observatoires antipodaux. Ann Geophys 27: 67–70. [Google Scholar]
 Mayaud PN. 1972. The aa indices: A 100year series characterizing the magnetic activity. J Geophys Res 77: 6870–6874. DOI: 10.1029/JA077i034p06870. [NASA ADS] [CrossRef] [Google Scholar]
 Mayaud PN. 1980. Derivation, meaning and use of geomagnetic indices, Geophysical Monograph, 22. American Geophysical Union, Washington, DC. [Google Scholar]
 Menvielle M, Berthelier A. 1991. The Kderived planetary indices: description and availability. Rev Geophys 29: 415–432. DOI: 10.1029/91RG00994. [CrossRef] [Google Scholar]
 Mursula K, Martini D. 2006. Centennial increase in geomagnetic activity: Latitudinal differences and global estimates. J Geophys Res 111: A08209. DOI: 10.1029/2005JA011549. [CrossRef] [Google Scholar]
 Mursula K, Martini D. 2007. New indices of geomagnetic activity at test: Comparing the correlation of the analogue ak index with the digital A_{h} and IHV indices at the Sodankylä station. Adv Space Res 40: 1105–1111. DOI: 10.1016/j.asr.2007.06.067. [CrossRef] [Google Scholar]
 Mursula K, Martini D, Karinen A. 2004. Did open solar magnetic field increase during the last 100 years? a reanalysis of geomagnetic activity. Sol Phys 224: 85–94. DOI: 10.1007/s112070054981y. [CrossRef] [Google Scholar]
 Mursula K, Usoskin IG, Yakovchouka O. 2009. Does sunspot number calibration by the “magnetic needle” make sense. J Atmos SolTerr Phys 71(17/18): 1717–1726. DOI: 10.1016/j.jastp.2008.04.017. [CrossRef] [Google Scholar]
 Owens MJ, Arge CN, Crooker NU, Schwadron NA, Horbury TS. 2008. Estimating total heliospheric magnetic flux from singlepoint in situ measurements. J Geophys Res 113: A12103. DOI: 10.1029/2008JA013677. [Google Scholar]
 Owens MJ, Cliver E, McCracken K, Beer J, Barnard LA, Lockwood M, Rouillard AP, Passos D, Riley P, Usoskin IG, Wang YM. 2016. NearEarth Heliospheric Magnetic Field Intensity since 1800. Part 2: Cosmogenic radionuclide reconstructions. J Geophys Res 121(7): 6064–6074. DOI: 10.1002/2016JA022550. [CrossRef] [Google Scholar]
 Owens MJ, Lockwood M, Riley P. 2017. Global solar wind variations over the last four centuries. Nature Scientific Reports 7: 41548. DOI: 10.1038/srep41548. [Google Scholar]
 Ritter P, Lühr H. 2008. NearEarth magnetic signature of magnetospheric substorms and an improved substorm current model. Ann Geophys 26: 2781–2793. DOI: 10.5194/angeo2627812008. [CrossRef] [Google Scholar]
 Rostoker G. 1972. Geomagnetic indices. Rev Geophys 10(4): 935–950. DOI: 10.1029/RG010i004p00935. [CrossRef] [Google Scholar]
 Rouillard AP, Lockwood M, Finch ID. 2007. Centennial changes in the solar wind speed and in the open solar flux. J Geophys Res 112: A05103. DOI: 10.1029/2006JA012130. [Google Scholar]
 Saba MMF, Gonzalez WD, Clua de Gonzalez AL. 1997. Relationships between the AE, ap and Dst indices near solar minimum (1974) and at solar maximum (1979). Ann Geophys 15(10): 1265–1270. [CrossRef] [Google Scholar]
 Smith EJ, Balogh A. 1995. Ulysses observations of the radial magnetic field. Geophys Res Lett 22: 3317–3320. DOI: 10.1029/95gl02826. [Google Scholar]
 Solanki SK, Schüssler M, Fligge M. 2000. Evolution of the Sun’s largescale magnetic field since the Maunder Minimum. Nature 408: 445–447. DOI: 10.1038/35044027. [Google Scholar]
 Svalgaard L, Cliver EW. 2005. The IDV index: Its derivation and use in inferring longterm variations of the interplanetary magnetic field strength. J Geophys Res 110: A12103. DOI: 10.1029/2005JA011203. [CrossRef] [Google Scholar]
 Svalgaard L, Cliver EW. 2007. Interhourly variability index of geomagnetic activity and its use in deriving the longterm variation of solar wind speed. J Geophys Res 112: A10111. DOI: 10.1029/2007JA012437. [Google Scholar]
 Svalgaard L, Cliver EW, Le Sager P. 2003. Determination of interplanetary magnetic field strength, solar wind speed and EUV irradiance. In: Wilson A, Editor, Solar Variability as an Input to the Earth’s Environment, ESA Special Publication, vol. 535, 15–23, European Space Agency, Nordwijk. [Google Scholar]
 Svalgaard L, Cliver EW, Le Sager P. 2004. IHV: a new longterm geomagnetic index. Adv Space Res 34: 436–439. DOI: 10.1016/j.asr.2003.01.029 [CrossRef] [Google Scholar]
 Thébault E, Finlay CC, Beggan CD, Alken P, Aubert J, et al. 2015. International Geomagnetic Reference Field: the 12th generation. Earth Planet Space 67: 79. DOI: 10.1186/s4062301502289. [CrossRef] [Google Scholar]
 Usoskin IG. 2017. A history of solar activity over millennia. Living Rev Sol Phys 14: 3. DOI: 10.1007/s4111601700069. [NASA ADS] [CrossRef] [Google Scholar]
 Vasyliunas VM, Kan JR, Siscoe GL, Akasofu SI. 1982. Scaling relations governing magnetospheric energy transfer. Planet Space Sci 30: 359–365. DOI: 10.1016/00320633(82)900411. [CrossRef] [Google Scholar]
 Vennerstrom S, Lefevre L, Dumbović M, Crosby N, Malandraki O, Patsou I, Clette F, Veronig A, Vršnak B, Leer K, Moretto T. 2016. Extreme Geomagnetic Storms – 1868–2010. Sol Phys 291(5): 1447–1481. DOI: 10.1007/s112070160897y. [NASA ADS] [CrossRef] [Google Scholar]
Cite this article as: Lockwood M, Chambodut A, Barnard L, Owens M, Clarke E, et al. 2018. A homogeneous aa index: 1. Secular variation. J. Space Weather Space Clim. 8, A53.
All Tables
The correlation coefficients (r _{b} and r _{a} for daily means in 11 years before and after the joins, respectively) and the slope s _{c} and intercept c _{c} for recalibrating stations for the 8 timeofyear (F) bins employed.
All Figures
Fig. 1 Schematic illustration of the generation of K and a _{ K } indices. Illustrative variations of the two orthogonal horizontal field components measured at one site are shown, X (toward geographic north, in blue) and Y (toward geographic east, in orange). These variations are after the regular diurnal variation has been subtracted from the observations. In the fixed 3hour UT windows (00–03 UT, or 03–06 UT, and so on up to 21–24 UT), the range of variation of both components between their maximum and minimum values is taken, ΔX and ΔY. The larger value of the two is kept and scaled according to a standard, quasilogarithmic scale (illustrated by the black and mauve bands to the right) for which all Kband thresholds are set for the site in question by L, the threshold range value for the K = 9 band. The value of L for the site is assigned according to the minimum distance between the site and a nominal (fixed) auroral oval position. The K value is then converted into the relevant quantised value of a _{ K } (in nT) using the standard “midclass amplitudes” (K2aK) scale. In the schematic shown, ΔX > ΔY, thus the X component gives a K value of 8 (whereas the Y component would have given a K of 5). Thus for this 3hour interval, a _{ K } value would be 415 nT. In the case of the classic aa indices, the hemispheric index (aa _{N} or aa _{S}, for the observatory in the northern or southern hemisphere, respectively) is f × a _{ K }, where f is a factor that is assumed constant for the observing site. 

In the text 
Fig. 2 Variations of annual means of various forms of the aa index. (a) The published “classic” northern and southern hemisphere indices (aa _{N} and aa _{S} in red and blue, respectively). Also shown (in green) is 1.5 × a _{NGK}, derived from the Kindices scaled from the Niemegk data. The vertical dashed lines mark aa station changes (cyan: Melbourne to Toolangi; green: Greenwich to Abinger; red: Abinger to Hartland; and blue: Toolangi to Canberra). (b) The homogenized northern and southern hemisphere indices (aa _{HN} and aa _{HS} in red and blue, respectively) generated in the present paper. The thick green and cyan line segments are, respectively, the a _{NGK} and am index values used to intercalibrate segments. (c) The classic aa data series, aa = (aa _{N} + aa _{S})/2 (in mauve) and the new homogeneous aa data series, aa _{H} = (aa _{HN} + aa _{HS})/2 (in black). The orange line is the corrected aa data series aa _{C} generated by Lockwood et al. (2014) by recalibration of the AbingertoHartland join using the Ap index. (Note that before this join, aa and aa _{C} are identical and the orange line is not visible as it is underneath the mauve line). The cyan line and points show annual means of the am index. The grayshaded area in (c) is the interval used to calibrate aa _{HN} and aa _{HS} (and hence aa _{H}) against am. 

In the text 
Fig. 3 The variation of the scale factor s(δ) derived from threshold range value L that defines the K = 9 band, with the minimum angular separation of the station from a nominal auroral oval, δ. This empirical variation is scaled from Mayaud (1968, 1972) and is the basis of the L values used to scale Kindices from observed range for all midlatitude stations. The scale factor s(δ) normalizes to the idealized Niemegk station for which δ = 19° and L = Lo = 500 nT (ideal static Mayaud values). 

In the text 
Fig. 4 Analysis of the effect of secular change in the geomagnetic field on the aa magnetometer stations using a spline of the IGRF12 and the gufm1 geomagnetic field models (for after and before 1900, respectively). (a) The modulus of the corrected geomagnetic latitude, Λ_{CG} of the stations; (b) the angular separation of the closest approach to the station of a nominal nightside auroral oval (at Λ_{CG} = 69°), δ; and (c) the scale factor s(δ) = L/L _{o} where L is given as a function of δ by Figure 3 and L _{o} = 500 nT, the reference value for the Niemegk station (for which δ is taken to be 19°) except for Canberra which, because of its more equatorward location, is scaled using L _{o} = 450 nT. The northern hemisphere stations are Greenwich (code GRW, in mauve), Abinger (ABN, in green) and Hartland (HAD, in red). The southern hemisphere stations are Melbourne (MEL, in black), Toolangi (TOO, in cyan) and Canberra (CNB, in blue). Also shown is Niemegk (NGK, in orange: data available since 1890). Vertical dashed lines mark aa station changes. 

In the text 
Fig. 5 Variations of (left) the minimum angular distance to the auroral oval, δ, and (right) the scalefactors, s(δ), for the aa stations. The colours used are as in Figure 4 (namely mauve for Greenwich, green for Abinger, red for Hartland, black for Melbourne, cyan for Toolangi and blue for Canberra). The thin lines are the variations shown in Figure 4 and the thick lines are constant values used in generating the classic aa. The dotdash lines in the righthand panels show the reciprocals of the standard multiplicative correction factors and the thick lines the factors corresponding to the constant δ values in the lefthand panels. 

In the text 
Fig. 6 Top: Scaled variations of modern a _{ K } values from various stations using the station location correction procedure used in this paper. For all stations, the observed a _{ K } values have been corrected for any secular magnetic field change by dividing by the s(δ) factor and then scaled to the am index using the linear regression coefficients m and i obtained from the calibration interval (2002–2009, inclusive). The plot shows 27day Bartels rotation means for data from: (mauve) Sodankylä, SOD; (brown) Eskdalemuir, ESK; (orange) Niemegk, NGK; (red) Hartland, HAD; (blue) Canberra, CNB; and (green) a spline of Gangara, GNA and nearby Gingin, GNG (see text for details). The black line is the am index. Bottom: the rms. fit residual of the rescaled station a _{ K } indices compared with the am index, ε _{rms}, for the 27day means. The average of ε _{rms} for the whole interval shown (1995–2017), is 〈ε _{rms}〉 = 9.7% 

In the text 
Fig. 7 Correlogams showing the correlation between 27day Bartels solar rotation means of power input into the magnetosphere, Pα, with the corrected a _{ K } indices, a _{ K }/s(δ), as a function of the coupling exponent, α. The colours are for the same data as used in Figure 6: (mauve) Sodankylä, SOD; (brown) Eskdalemuir, ESK; (orange) Niemegk, NGK; (red) Hartland, HAD; (blue) Canberra, CNB; and (green) a spline of Gangara, GNA and nearby Gingin, GNG (see text for details). The black line is the am index. The coloured dots and vertical dashed lines show the optimum α that gives the peak correlation. The horizontal bars show the uncertainty in the optimum α which is the larger of the two 1σ uncertainties computed using the two procedures described by Lockwood et al. (2018c). 

In the text 
Fig. 8 The intercalibration of aa _{N} data across the join between the Hartland (HAD) and Abinger (ABN) observations in 1957. The data are divided into eight equallength fractionofyear (F) bins, shown in the 8 rows, with the bottom row being bin 1 (0 ≤ F < 0.125) and the top row being bin 8 (0.875 ≤ F < 1). The lefthand column is for an interval of duration 11years (approximately a solar cycle) before the join and shows scatter plots of the aa data from Abinger (after division by s(δ) to allow for secular changes in the geomagnetic field) against the similarlycorrected simultaneous NGK data, a _{NGK} /s(δ). The middle column is for an interval of duration 11years after the join and shows the corresponding relationship between the alreadyhomogenized aa data from Hartland [aa _{H}]_{HAD} and the simultaneous a _{NGK} /s(δ) data. All axes are in units of nT. The grey dots are daily means to which a linear regression gives the red lines which are then checked against the annual means (for the F bin in question) shown by the black dots. The righthand column shows the fitted lines for the “before” interval, 1, against the corresponding fitted line for the “after” interval, 2: the red line would lie on the dotted line if the two stations had identical responses at the F in question. The slope and intercept of these lines, giving the intercalibration of the two stations at that F, are given in Table 1. 

In the text 
Fig. 9 The same as Figure 8 for the join between the Greenwich and Abinger data. 

In the text 
Fig. 10 The same as Figure 8 for the join between the Toolangi and Canberra data. 

In the text 
Fig. 11 The same as Figure 8 for the join between the Melbourne and Toolangi data. 

In the text 
Fig. 12 Distributions of the differences in daily means of northern and southern hemisphere indices, Δ_{NS}, for 50year intervals. The top row is for the classic aa indices, so that Δ_{NS} = aa _{N} − aa _{S}. The bottom row is for the homogenised aa indices, so that Δ_{NS} = aa _{HN} − aa _{HS}. Parts (a) and (d) are for 1868–1917 (inclusive); parts (b) and (e) are for 1918–1967; and parts (c) and (f) are for 1968–2017. In each panel, the vertical orange line is at Δ_{NS} = 0, the vertical cyan line is the median of the distribution, the vertical red line the mean (〈Δ_{NS}〉), and the green lines the upper and lower deciles. Note they are plotted in the order, orange, cyan, then red and so the mean can overplot the others (this particularly occurs in the bottom row). In each panel the distribution mean, 〈Δ_{NS}〉 and the standard deviation, σ_{ΔNS}, are given. 

In the text 
Fig. 13 Comparison of the new homogenised indices aa _{H,} aa _{HN} and aa _{HS}, with the IHV index and the δcorrected a _{NGK}, a _{SOD} and a _{ESK} values. The lefthand plots show the time series and their respective bestfit linear regressions. The righthand plots show scatter plots of the new indices against the test indices (a _{NGK}/s(δ), a _{SOD}/s(δ) or IHV) and the leastsquares bestfit linear regression line. The linear correlation coefficient r is given in each case. Parts (a) and (b) are for aa _{HN} and IHV; parts (c) and (d) are for aa _{HS} and IHV; parts (e) and (f) are for aa _{HS} and IHV; parts (g) and (h) are for aa _{H} and the corrected a _{ K } values from Niemegk, a _{NGK}/s(δ); parts (i) and (j) are for aa _{H} and the δcorrected a _{ K } values from Sodankylä, a _{NGK}/s(δ), parts (k) and (l) are for aa _{H} and the δcorrected a _{ K } values from Eskdalemuir, a _{ESK}/s(δ). For comparison, the bottom panel (parts m and n) compares the hemispheric homogenized indices aa _{HN} and aa _{HS}. In each panel, the grey area defines the estimated ±1σ uncertainty in aa _{H}. 

In the text 
Fig. 14 (Lefthand plots) Bartels rotation interval (27day) means of the deviation of various scaled indices from the new aa _{H} index (points) and 13point running means of those 27day means (black lines). The indices are all scaled by linear regression to the aa _{H} index over the interval 1990–2003 (the interval shaded in gray). A prime on a parameter denotes that this scaling has been carried out. (Righthand plots) The distribution of the deviations of the 27day means (the dots in the corresponding lefthand plot) from their simultaneous 13point smoothed running means (the black line in the corresponding lefthand plot). (a) and (b) are for the scaled IHV index, IHV′; (c) and (d) for the corrected and scaled a _{ K } index from Niemegk, (a _{NGK}/s)′; (e) and (f) for the corrected and scaled a _{ K } index from Sodankylä, (a _{SOD}/s)′; (g) and (h) for the corrected and scaled a _{ K } index from Eskdalemuir, (a _{ESK}/s)′; (i) and (j) for the homogenized northern hemisphere aa index, aa _{HN}; and (k) and (l) for the homogenized southern hemisphere aa index, aa _{HS}. The dashed lines in the lefthand plots mark the dates of aa station joins, using the same colour scheme as used in Figure 2. Horizontal grey lines are at zero deviation. 

In the text 
Current usage metrics show cumulative count of Article Views (fulltext article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.
Data correspond to usage on the plateform after 2015. The current usage metrics is available 4896 hours after online publication and is updated daily on week days.
Initial download of the metrics may take a while.