The Space Weather Atmosphere Models and Indices (SWAMI) project: Overview and first results

Space weather driven atmospheric density variations affect low Earth orbit (LEO) satellites during all phases of their operational lifetime. Rocket launches, re-entry events and space debris are also similarly affected. A better understanding of space weather processes and their impact on atmospheric density is thus critical for satellite operations as well as for safety issues. The Horizon 2020 project Space Weather Atmosphere Model and Indices (SWAMI) project, which started in January 2018, aims to enhance this understanding by: Developing improved neutral atmosphere and thermosphere models, and combining these models to produce a new whole atmosphere model. Developing new geomagnetic activity indices with higher time cadence to enable better representation of thermospheric variability in the models, and improving the forecast of these indices. The project stands out by providing an integrated approach to the satellite neutral environment, in which the main space weather drivers are addressed together with model improvement. The outcomes of SWAMI will provide a pathway to improved space weather services as the project will not only address the science issues, but also the transition of models into operational services. The project aims to develop a unique new whole atmosphere model, by extending and blending the Unified Model (UM), which is the Met Office weather and climate model, and the Drag Temperature Model (DTM), which is a semi-empirical model which covers the 120–1500 km altitude range. A user-focused operational tool for satellite applications shall be developed based on this. In addition, improved geomagnetic indices shall be developed and shall be used in the UM and DTM for enhanced nowcast and forecast capability. In this paper, we report on progress with SWAMI to date. The UM has been extended from its original upper boundary of 85 km to run stably and accurately with a 135 km lid. Developments to the UM radiation scheme to enable accurate performance in the mesosphere and lower thermosphere are described. These include addition of non-local thermodynamic equilibrium effects and extension to include the far ultraviolet and extreme ultraviolet. DTM has been re-developed using a more accurate neutral density observation database than has been used in the past. In addition, we describe an algorithm to develop a Topical Issue Scientific Advances from the European Commission H2020 projects on Space Weather Corresponding author: david.jackson@metoffice.gov.uk + Current Affiliation: Norwegian Meteorological Institute, Tromso, Norway J. Space Weather Space Clim. 2020, 10, 18 D.R. Jackson et al., Published by EDP Sciences 2020 https://doi.org/10.1051/swsc/2020019 Available online at: www.swsc-journal.org OPEN ACCESS RESEARCH ARTICLE This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. new version of DTM driven by geomagnetic indices with a 60 minute cadence (denoted Hp60) rather than 3-hourly Kp indices (and corresponding ap indices). The development of the Hp60 index, and the Hp30 and Hp90 indices, which are similar to Hp60 but with 30 minute and 90 minute cadences, respectively, is described, as is the development and testing of neural network and other machine learning methods applied to the forecast of geomagnetic indices.

The project stands out by providing an integrated approach to the satellite neutral environment, in which the main space weather drivers are addressed together with model improvement. The outcomes of SWAMI will provide a pathway to improved space weather services as the project will not only address the science issues, but also the transition of models into operational services. The project aims to develop a unique new whole atmosphere model, by extending and blending the Unified Model (UM), which is the Met Office weather and climate model, and the Drag Temperature Model (DTM), which is a semi-empirical model which covers the 120-1500 km altitude range. A user-focused operational tool for satellite applications shall be developed based on this. In addition, improved geomagnetic indices shall be developed and shall be used in the UM and DTM for enhanced nowcast and forecast capability. In this paper, we report on progress with SWAMI to date. The UM has been extended from its original upper boundary of 85 km to run stably and accurately with a 135 km lid. Developments to the UM radiation scheme to enable accurate performance in the mesosphere and lower thermosphere are described. These include addition of non-local thermodynamic equilibrium effects and extension to include the far ultraviolet and extreme ultraviolet. DTM has been re-developed using a more accurate neutral density observation database than has been used in the past. In addition, we describe an algorithm to develop a Topical Issue -Scientific Advances from the European Commission H2020 projects on Space Weather

Introduction
Precise knowledge of the location and motion of space objects is important, and this importance is rapidly increasing with the advent of large constellations (e.g., National Academies of Sciences, Engineering, and Medicine, 2016;Muelhaupt et al, 2019). The 250-800 km altitude range, which is chiefly in the thermosphere, hosts scientific and operational Low Earth Orbit (LEO) satellites and thousands of other LEO objects. LEO satellite orbital tracking, satellite launch and re-entry operations, satellite conjunction analysis, and maintaining the space debris catalogue all need knowledge of the mean state and variability of the atmosphere, as detailed further below: LEO atmospheric operations require knowledge of the state of the thermosphere near the satellite orbit in order to calculate satellite drag effects to predict the satellite orbit, and apply orbital correction if needed. Accurate orbital prediction is needed in order to track the object accurately in both time and space. Tracking of LEO objects is therefore an indispensable task for space agencies (Vallado & Finkleman, 2014;Hejduk & Snow, 2018). The ever-increasing amount of space debris poses problems to active satellites because of increased risk of collisions. Currently, orbit predictions often are not accurate enough to determine whether a manoeuvre to avoid a collision is warranted. Conjunction analysis requires not only more accurate thermosphere models, but also accurate uncertainty estimates, which models currently do not provide (Poore, 2016;Bussy-Virat et al., 2018). The re-entry region is loosely defined as 80-140 km. A controlled satellite re-entry (de-orbit) requires the application of a change in satellite velocity, at the right location and of the right magnitude, to ensure the de-orbit takes place over a specific area (e.g. the South Pacific). Uncontrolled re-entries also occur quite frequently. In both cases accurate knowledge of the mean density and its variability along the final orbital revolutions is needed, but sufficient accuracy is currently lacking. As an example, for the uncontrolled re-entry of the European Space Agency (ESA) Gravity field and steady-state Ocean Circulation Explorer (GOCE; Drinkwater et al., 2003) satellite in 2013, predictions of the re-entry interface at 80 km varied by thousands of kilometres even one orbit (~90 min) before the actual event (e.g. Visser & van den IJssel, 2016). Satellite launch operations require knowledge of the state of the atmosphere from the surface up to the LEO satellite altitude. A better knowledge of the small-scale and local fluctuations in density and wind in this altitude range is needed in order to accurately estimate the aerodynamic forces acting on the launcher fairing as well as to optimize booster performance.
Key to estimating atmospheric drag is accurate modelling of the thermosphere (chiefly density) along the trajectory of the object. The air density exhibits a regular pattern that is controlled by the diurnal, seasonal and solar cycle variations of solar irradiance (Qian & Solomon, 2012), but it also undergoes irregular spatiotemporal variations of a few to hundreds of percent that result from space weather. For instance, during the geomagnetic storms of October-November 2003 near 400 km altitude the air density was enhanced by 300%-800% (Sutton et al., 2005;Bruinsma et al., 2006, Krauss et al., 2015, 2018. Increases in atmospheric drag due to such space weather events can lead to large changes in the predicted position of objects in LEO. Models of the thermosphere currently used for LEO satellite operations typically use a semi-empirical approach, and include the Drag Temperature Model (DTM; Bruinsma et al., 2012;Bruinsma, 2015), the Jacchia-Bowman (JB) model (e.g. Bowman et al, 2008) and the Naval Research Laboratory for mass spectrometer and incoherent scatter radar extended (NRL MSISE00) model (e.g. Picone et al., 2002). These models provide coarse resolution representations of the thermosphere which are adequate to track or predict satellite orbital location. In addition, DTM and JB do not represent the atmosphere below the lower thermosphere, and NRL MSISE00 only represents these lower levels very simply, so currently re-entry or launcher trajectory calculations are produced by running different types of models at different altitudes.
A significant amount of variability in the thermosphere results from coupling with the lower atmosphere through upward propagation of planetary waves, tides, and gravity waves. For example, Liu et al. (2013) indicated that day to day tidal amplitude variations related to changes in tropospheric weather patterns can be 25%-50% in the lower thermosphere, while a number of authors (e.g. Fuller-Rowell et al., 2008;Goncharenko et al., 2010) have shown the impact on ionospheric total electron content associated with stratospheric sudden warmings. The upward propagating waves that contribute to the coupling are not represented in semi-empirical models, or are only represented incompletely. But, since an ability to model such coupling processes is important for thermospheric forecasting and LEO satellite operations, there is therefore a strong case for the use of a single whole atmosphere model that represents the neutral atmosphere from the surface up to the top of the thermosphere (see Jackson et al., 2019 for a summary). Examples of existing whole atmosphere models include the Whole Atmosphere Community Climate Model with thermosphere and ionosphere extension (WACCM-X, Liu et al., 2010Liu et al., , 2018, the Whole Atmosphere Model (WAM, Akmaev et al., 2008;Fuller-Rowell et al., 2008) and the Ground to topside model of the Atmosphere and Ionosphere for Aeronomy (GAIA, Fujiwara & Miyoshi, 2010).
In SWAMI the aim is to develop a new whole atmosphere model, which we call the MOdel of the Whole Atmosphere (MOWA). MOWA is a blend of the first-principles UM, which is the Met Office operational numerical weather prediction and climate model (e.g. Walters et al., 2019), and DTM. We adopt this approach, rather than using an existing whole atmosphere model, for a number of reasons.
First, the UM uses a non-hydrostatic formulation of the atmospheric dynamics, whereas existing surface to thermosphere models use hydrostatic dynamics (although some models that use non-hydrostatic dynamics are now in development; e.g. Ullrich et al., 2017). The hydrostatic assumption involves making the approximation that all terms in the vertical momentum equation may be neglected except for the gravity and pressure terms. This can be a very good assumption at the proper scales, i.e. when the horizontal length scale is significantly larger than the vertical length scale, as is the case when only simulating the lower atmosphere. However, it can be poor assumption in the thermospherefor example, Larsen & Meriwether (2012) report large vertical winds in the thermosphere. These winds should be better represented in a non-hydrostatic model than a hydrostatic one, and consequently the interactions between constituent transport, radiation and chemistry and thus thermosphere/ionosphere interactions should be better represented in the UM. Furthermore, the robustness of the extended UM dynamical core shall be enhanced by incorporating the innovative implicit representation of molecular viscosity detailed by Griffin & Thuburn (2018). This scheme damps upward travelling waves (chiefly acoustic waves and gravity waves) in a physically plausible way in contrast to other commonly used approaches such as enhanced diffusion and damping of vertical velocities.
Second, DTM2013 is already the most accurate thermosphere model presently available, with relative errors in the 200-300 km altitude range between 5% and 10% (Bruinsma, 2015). The accuracy is achieved through assimilation of recent density data (from the CHAllenging Minisatellite Payload [CHAMP; Reigber et al., 1996], GOCE and the Gravity Recovery and Climate Experiment [GRACE; Tapley et al., 2004]) combined with a new solar activity proxy, the 30 cm solar radio flux (F30) (Dudok de Wit et al., 2014), which is shown to be better suited than the more commonly used radio flux at 10.7 cm (the F10.7 index) for upper atmosphere specification. Here, we report improvements to the DTM mainly in the 170-450 km altitude region, by assimilating a more extensive dataset of observations than has been used in the past. The aim is for the storm time modeling performance to be significantly improved thanks to the high cadence Hp indices. This additional data assimilation and the new and higher cadence geomagnetic indices will further strengthen DTM's position as the leading model in this area.
A representation of the impacts of space weather in a whole atmosphere model also requires nowcasts and forecasts of solar and geomagnetic indices. As mentioned above, recent research includes replacement of the F10.7 index with the F30 index as the solar proxy in upper atmosphere models. The progress in geomagnetic index development and forecasts however is lagging with respect to the solar activity research, and therefore this aspect is addressed in SWAMI. The Kp index, which is commonly used to drive existing thermosphere models, is provided with a 3 h cadence. Models developed with this cadence cannot reflect the very short timescale response of the atmosphere to solar wind input (e.g. Oliveira et al., 2017). A higher cadence index allows a much better timing of such events.
In addition, atmospheric density reacts much faster to solar wind input (with order of minutes, locally), than can be described with a 3 hourly cadence. Here, we develop improved geomagnetic indices in two ways: We report on work to develop a new high cadence version of Kp, called Hp, with cadences of 30 min, 60 min and 90 min to better represent storm-time onset and thermospheric variability. The cadence chosen is partly guided by a user consultation. We develop novel predictions of both traditional and higher cadence indices. Current automated forecast algorithms are usually simple (to ensure operational robustness). Both linear and non-linear approaches for prediction are applied (Wing et al., 2005;Wintoft et al., 2017), and the prediction capability of implemented models differ (see e.g. Boberg et al., 2000;Wing et al., 2005;Bala & Reiff, 2012).
A further aim of SWAMI is to make the output available for operational use by satellite operators and other end users. Currently, when accurate re-entry or launcher trajectory calculations are needed, they must be executed by running different types of models at different altitudes, and in addition the atmospheric variability is not provided by those models. SWAMI will provide the mean state of the atmosphere as well as the variability, which can then be used as uncertainty in the calculations. MOWA will be used to produce a look-up table version, the MOWA Climatological Model (MCM), which can be run very quickly on any machine. MCM is an innovative, user-focused tool for use by satellite operators, launch service providers, satellite re-entry analysts as well as space weather scientists. Therefore, it shall benefit the entire user community that requires knowledge of the mean state of the atmosphere and its variability as a function of location and time from the surface up to the top of the thermosphere. The aim is for MCM to be more accurate and of much higher resolution than the currently available Committee on Space Research International Reference Atmosphere (CIRA) model (100 s vs. 1000 s of kilometres, about 30 vs. 360 min). As such, it will be a first of its kind. The more accurate reference from the Earth's surface will also enable global space weather science in the 80-140 km altitude range.
Regarding operational implementation, MCM will be made available to the broad community, including via ESA's Virtual Space Weather modelling Centre (VSWMC). VSWMC uses a cutting-edge framework aimed at providing an integrated environment for installing and operating models and applications. This includes support for installing model components on a central server as well as on distributed platforms, and definition of data streams, data exchange mechanisms, and definition and implementation of coupling mechanisms between models. There is also process control to allow synchronous or asynchronous execution of model runs, making use of the defined coupling mechanisms, and the use of a standard approach for implementation of interoperability based on the High Level Architecture (HLA) IEEE 1516 standard.
In summary, the overarching goal of SWAMI is to give Europe a strategic advantage in orbit prediction through thermosphere modelling, and associated user services, and this will arise via the three main objectives of SWAMI, which are: MODELS: To develop MOWA (covering the 0-1500 km altitude range), and an operational tool for satellite re-entry and launch applications (MCM) based on MOWA. Both models shall provide estimates of both climatology and space weather variability. DRIVERS: To provide nowcasts and forecasts of the new high cadence Hp geomagnetic indices, to be used in the UM and DTM. These products are equally useful for a wide range of space weather services that rely on rapid geomagnetic activity specification. END USER FOCUS: To develop steps (e.g. provision of software) to transition MCM into operations. A set of acceptance criteria, e.g. regarding robustness of model code, near-real-time data processing and forecast delivery, are being defined to ensure that the model system can be considered ready for operational use.
The outline of the paper is as follows. An overview of the aims of SWAMI appears in Section 2, while progress in developing the DTM and the upward extended version of the UM appears in Section 3. Work on the higher cadence version of the geomagnetic index, and on nowcasts and forecasts of this index, is in Section 4. Section 5 includes plans to develop an operational version of MOWA and conclusions appear in Section 6.

Concept and methodology
The concept of the SWAMI project appears in Figure 1. The new models MOWA and MCM, with improved geomagnetic activity drivers and forecast, are developed via a four-step procedure shown in the figure. The coupling of the UM and DTM shall be done over an altitude range that preserves highest accuracy in both models before and after the transition: up to about 140 km for UM, and DTM above approximately 170 km, by means of a fairing function in the 120-180 km altitude range.
In order to produce MOWA and MCM, further development of both UM and DTM is first required. The following methodology is adopted.
The UM includes a comprehensive representation of atmospheric physics and a coupled atmospheric chemistry scheme. Overall concept of SWAMI project. Solar activity proxies, in grey, are not developed in SWAMI, but an existing operational data server is used. D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 It is currently used with an upper boundary near 85 km altitude and the aim in this project is to extend its upper boundary to around 170 km altitude to enable coupling with the DTM, which has a lower boundary at 120 km. The upward extension of the UM requires the following steps: Additions to the radiation scheme to account for Non-Local Thermodynamic Equilibrium (NLTE) effects in the mesosphere and lower thermosphere (MLT) (Fomichev et al., 2004), and to extend the UM radiation scheme (the Suite Of Community Radiative Transfer codes based on Edwards and Slingo [SOCRATES]; Edwards & Slingo, 1996) to include Extreme Ultra-violet (EUV) and Far Ultraviolet (FUV) wavelengths (< 200 nm). Pass the photolysis rates from the radiation scheme to the United Kingdom Chemistry and Aerosol (UKCA) scheme (e.g. Morgenstern et al., 2009), which is the chemistry scheme used in the UM. This is needed to drive exothermic chemical reactions. The associated exothermic heating is responsible for the large rise in temperature with altitude in the lower thermosphere (see e.g. Marsh et al., 2007). To complete the UM lower thermosphere heat budget, a representation of Joule heating (e.g. Billett et al., 2018) needs to be included, together with the NO cooling often observed in the aftermath of geomagnetic storms (Knipp et al., 2017). The UM has no representation of the ionosphere nor electrodynamics. Accordingly, the Joule heating and NO cooling will likely be extracted from simulations of the Thermosphere Ionosphere Electrodynamics General Circulation Model (TIEGCM, e.g. Qian et al., 2013), which will be run for similar periods to the extended UM simulations. Make changes to the UM dynamics to ensure stability as the model lid is raised and resolved model wave amplitudes increase as a result of the reduced density. This shall include the addition of molecular viscosity and diffusion (Griffin & Thuburn, 2018).
In addition, the UM can be linked to current atmospheric conditions, by relaxing the UM troposphere and stratosphere to regular (6 hourly) meteorological analyses, using the "nudging" approach (Telford et al., 2008). Pedatella et al. (2013) showed that constraining whole atmosphere model simulations with lower atmosphere observations reduced root mean square wind errors by 30%-40% in the lower thermosphere. Therefore, this relaxation shall help to improve the UM thermosphere simulation by linking it to daily varying conditions. The DTM lower boundary from approximately 120-220 km is not fitted to data, and it has never been extensively evaluated because data in that region are extremely rare. Successful coupling of DTM and UM in the 150-170 km zone requires enhancing the accuracy of the former model in order to be as compatible with the UM as is possible taking the different model algorithms into account. The few available data at the lower boundary have to be assimilated: GOCE densities from 270 km to 170 km (i.e. up to 3 days before atmospheric re-entry); Global Ultraviolet Imager (GUVI; Paxton et al., 2004) O/N 2 ratios from the Thermosphere-Ionosphere-Mesosphere Energetics and Dynamics (TIMED) mission; and US Air Force daily mean (Storz et al., 2005) and SETA accelerometer densities (Forbes et al., 1995). Updated and rescaled GRACE and CHAMP densities shall also be used together with new data from Swarm-A.
MOWA can be used operationally in the future but is too computationally expensive and complicated to exploit for users themselves. Therefore, a multi-year UM run will be produced and the climatology of this used to develop a look-up table driven climatological model (MCM) which will combine the UM climatology and DTM. MCM will use both the mean state and migrating diurnal and semi-diurnal tides calculated from the UM run. The variability in MCM and MOWA shall be represented at the lower levels by using the standard deviation of the difference between the UM multi-year climatology and UM results for each year. Since the multi-year run is planned to cover 11 years, this will enable a representation of multi-year atmospheric variability, including contributions from the solar cycle. The analysis can also be extended to calculate the climatology plus variability related to geomagnetic activity (binned by Hp index). For speed and simplicity, the MCM shall be designed based on monthly means and their associated standard deviations. However, the output from the UM run shall be 3 hourly, which will enable the analysis of MOWA (within the domain covered by the UM) to be done down to semi-diurnal timescales. Analysing the variability as a function of time scale and location enables us to qualitatively as well as quantitatively evaluate the impact of space weather on the lower thermosphere in the 120-150 km altitude range.
As part of the development of higher cadence geomagnetic indices, it is necessary to ensure that the DTM can work both with Kp and the higher cadence Hp indices. For comparison with DTM, the high cadence values can be reconstructed back in time to 1995, as there is 1-minute geomagnetic observatory data available. An essential part will be concerned with developing and implementing an effective forecast of the indices for different lead times. The starting point is to develop forecasts using neural networks, but other machine learning approaches will also be investigated. The neural network approach has been proven to be one of the best techniques for finding the multivariate, non-linear transfer function between input variables (solar wind parameters) and output variables (the geomagnetic index).
In order to transition the models to an operational state, interfaces, coupling mechanisms and data streams between the various components have to be designed, documented and implemented. The ESA VSWMC project is developing a framework based on HLA that provides mechanisms and tools for these purposes. The VSWMC architecture will be analysed in preparation of an implementation of MCM, DTM and the geomagnetic indices, and the associated coupling between these components. In addition, smaller scale prototype services will be set up outside the VSWMC framework in order to test implementations. The prototype environment will also allow rapid deployment of components for testing and validation purposes.
3 Thermospheric modelling 3.1 DTM Semi-empirical thermosphere models are mainly used in satellite orbit determination and prediction to compute aerodynamic drag. They predict temperature, total density and most often composition too as a function of location (altitude, D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 latitude, longitude, local solar time), solar and geomagnetic activities, and season. The first Drag Temperature Model, developed in the seventies (DTM78; Barlier et al., 1978), has been upgraded several times (Berger et al,. 1998;Bruinsma et al., , 2012Bruinsma et al., , 2014. DTM models are constructed by fitting to a density database, with the objective of reproducing the mean climatology of the thermosphere. The quality of the database, i.e. its consistency, accuracy and its distribution in time and space, is essential to the achievable model accuracy. The spatial resolution of the models is of the order of thousands of kilometres, and density variations with smaller scales are sources of geophysical noise. The geomagnetic index used to date, Kp, limits the temporal resolution to 3 h.

DTM development and database
For SWAMI, two models are actually under development: an operational and a research grade model. The density database, of which the main datasets are shown in Figure 2, underlying the model development is also not the same. However, for both models they notably contain the full CHAMP (Reigber et al., 1996), GRACE (Tapley et al., 2004) and GOCE  high-resolution accelerometer-inferred density datasets. The research grade model is based on density data after 2000, because the high cadence geomagnetic activity indices Hp cannot be computed for the entire density database back to the sixties. Results presented here on both models are preliminary. The final models will be computed as late as possible in the project, in 2020, in order to benefit from new density data (e.g., GRACE Follow-On densities, additional year of Stella and Cryosat2).
The new density database is very different from the one used in the construction of DTM2013. We provide in the following a summary of the changes. It should be noted that even if data from the same satellites are used, there can be differences in the accuracy and consistency. The GOCE  and Swarm-A (Van den IJssel et al., 2019) data used were computed by TU Delft (TUD) for ESA, and the CHAMP data was computed with the same aerodynamic model by TUD too. The TUD satellite aerodynamic modelling is much more complex, and as a result more accurate and realistic, than the models used in the US Air Force operational High Accuracy Satellite Drag Model (HASDM; Storz et al., 2005) or at the Centre National D'Etudes Spatiales (CNES). As a result, their inferred densities are more accurate. The data were not re-scaled to HASDM, as was done for DTM2013. An example of the data scaling as applied to GOCE in DTM2013, i.e. by scaling to HASDM densities, is displayed in Figure 3. The ESA/TUD daily mean densities are 19% smaller on average over the mission than the HASDM densities, and a scaling factor of 1.23 was applied. CHAMP daily-mean densities are also presented in Figure 3, and the presently used CHAMP TUD densities are also about 20% smaller than the CNES CHAMP densities used in DTM2013. The main consequence of using the more accurate datasets is that DTM2019 predicts smaller densities, by about 20%, when compared with DTM2013.  The GRACE data were processed by CNES, and were subsequently scaled to the CHAMP and Swarm-A densities in a procedure that compares densities after normalization to the mean altitude of both missions only when the orbital planes are within 15°(i.e. 1 h local solar time) of each other at the equator. The Two Line Element (TLE) global daily mean densities (Emmert, 2009) are the longest datasets available, spanning four solar cycles, and they were used to estimate solar activity and seasonal variations only. Stella daily-mean densities, computed by CNES, did not require scaling because the aerodynamic model for a sphere is simple. All other datasets used in the construction of DTM2013, and notably the mass spectrometer data, were used with very low weights in the present modelling.
Daily-mean densities inferred from satellite drag data of eight satellites in elliptical orbits, with perigee heights in the 200-500 km range, were kindly provided by Bruce Bowman. Latitude coverage is from pole to pole taking all objects into account, and most satellites cover more than a solar cycle. The densities are computed from the estimated energy dissipation rates (EDR), i.e. orbit decay, which are derived from the orbits fitted directly to radar and optical tracking data. This method thus benefits from the true accuracy of the classified radar data. The uncertainty of the daily EDR densities is estimated to be 2%-4%. For a complete description of the method and the data we refer to Bowman et al. (2004). A new data type has been used in the assessment of the models only, the O/N 2 ratio at 135 km from 2002 to 2007 measured by the TIMED/ GUVI instrument (http://guvitimed.jhuapl.edu/).
The solar and geomagnetic activity indices that drive the model determine if it is the operational or research version. The indices used in the research model, F30 (Dudok de Wit et al 2014) and Hp60 (see Sect. 4) respectively, are not considered sufficiently robust and unfailing in an operational environment, as F10.7 and Kp respectively are. The algorithm of the research model has been updated in order to benefit from the higher cadence of the Hp60 indices, but has otherwise remained identical to the DTM2013 algorithm.

DTM algorithm
The representation of the total density in the altitude range 120 to approximately 1500 km is achieved by summing the contributions of the main thermosphere constituents (N 2 , O 2 , O, He, H), under the hypothesis of independent static diffuse equilibrium. The total density q at altitude z is calculated as follows: Partial densities i are specified at 120 km altitude and propagated to higher altitudes employing the height function f(z), which is based on temperature. The exospheric temperature and the partial density variations as a function of the environmental parameters L (latitude, local solar time, solar flux, and geomagnetic activity) are modelled by means of the spherical harmonic function G(L).
The function G in case of the research model required modification because of the higher cadence of the Hp60 indices of 60 min instead of 3 h in case of Kp. The modifications consisted of purely logistic as well as algorithmic changes. The latter concerned in particular applying delays as a function of latitude, as well as using averages over 2-24 h.

Model assessments
A short selection of comparisons is presented of the ISO models NRLMSISE-00 (Picone et al., 2002), JB2008 (Bowman et al., 2008) and DTM2013 (Bruinsma, 2015), and the two new DTM2019 versions. Table 1 lists the statistics of the observed (O) to model (C) density ratios computed for the complete main datasets in the DTM2019 database. The effect of the simplified aerodynamic coefficient modelling on the densities used in all other models is reflected in the mean O/C, which is systematically smaller than unity for all datasets except Stella (a sphere).
The statistics over entire datasets cannot reveal improvement in geomagnetic perturbation modelling because storm events are rare. The solar activity modelling, i.e. the choice of proxy, is on the other hand reflected in these numbers. The performance is best with the research model, but the operational model is, despite using F10.7 and Kp, rather good and even slightly better than DTM2013. A slightly improved performance through different weighting and parameter estimation, and notably the high mean O/C with Swarm-A data, is expected for the final models.

Extended UM
The UM is the Met Office weather and climate model and in most applications is used with an upper boundary near 85 km altitude (e.g. Walters et al., 2019). The UM was originally documented by Cullen (1993). A revision to the model dynamical core followed around 10 years later (Davies et al., 2005). Its current dynamical core, ENDGame, solves the non-hydrostatic, fully compressible deep-atmosphere equations of motion on a rotating sphere using a semi-implicit semi-Lagrangian formulation (Wood et al., 2014).
The UM parametrizes a range of atmospheric processes, which operate on smaller length scales than those used in the dynamical scheme, and are therefore not resolved on the numerical grid. Furthermore, some physical processes such as radiation are not represented in the equations of motion used in the UM. Only a few of the parametrizations remain relevant in the simulation of the MLT region. Radiative heating and chemistry are represented by the SOCRATES radiation scheme (Edwards & Slingo 1996) and the UKCA scheme (e.g. Morgenstern et al., 2009), respectively. Gravity waves generated by non-orographic sources are parametrized by the Ultra Simple Spectral Parametrization (USSP) scheme (Warner & McIntyre, 2001).

Initial UM runs and the nudging scheme
Developments for the UM radiation scheme (as detailed in Sect. 3.2.2) were not ready at the start of the SWAMI project, which made the radiation scheme unsuitable for use in the MLT. Initial UM tests were performed with a raised UM upper boundary, but with pre-SWAMI parametrizations. This is an important first test, in order to investigate how stable the UM is at higher altitudes.
These first simulations are run at a horizontal resolution of 1.25°in latitude Â 1.875°in longitude, using existing parametrizations and a raised model boundary (specifically 88 levels extending up to 100 km), developed from the original 85 levels that extend up to 85 km. Two simulations were carried out, with September and March start dates, and both failed in the following solstice with excessive and unrealistic model winds seen near the summer pole. Further analysis indicated that these winds are caused by the inappropriate use of the SOCRATES radiation scheme at higher altitudes and the consequent production of excessive and erroneous heating rates at upper levels (see Griffith et al., 2020 for further details).
The solution to this is to replace the LTE heating rates produced by SOCRATES with the new NLTE scheme, which will give more accurate heating rates above~65 km. At the time of the initial tests, the NLTE scheme was not ready. Therefore, to maintain progress, we developed a scheme to relax the UM temperatures to a climatological profile, using a nudging scheme similar to that described by Telford et al. (2008). This nudging roughly approximates the impact of NLTE radiative heating and exothermic chemical heating, until such time as these are added to the UM. The nudging scheme developed here is a Newtonian relaxation to a globally uniform temperature profile over a given timescale (by default, 24 h is used). Further details appear in Griffith et al. (2020).
In the new runs, the UM radiative heating was switched off above 70 km, and temperature increments from the nudging scheme were used above 70 km instead. The immediate impact was that the extended UM run with a 100 km upper boundary was stabilized, and this paved the way for further UM experiments with higher lids and different vertical resolutions.
At the upper boundary of the 88 level model configuration, the vertical level height increment is approximately 6 km. However, this resolution is not appropriate to accurately represent vertically travelling waves in the MLT. In order to accurately model vertical wavenumbers in atmospheric dispersion relations (see e.g. Griffin & Thuburn, 2018) the vertical resolution must be finer than the atmospheric scale height. Ideally, the vertical resolution should be one quarter of the atmospheric scale height, so as to interpolate a vertically propagating atmospheric wave appropriately over this scale height, and resolve its sinusoidal variation. One half of a scale height will not resolve all vertically propagating waves over this scale height, but will resolve larger scale waves in an acceptable manner. The analysis by Griffith et al. (2020) indicated that the respective vertical resolutions used should be 1.5 km and 3 km. As well as testing different vertical resolutions, simulations with different UM lid heights were carried out. Heights of 120 km and 135 km were tested, in addition to the 100 km lid used previously.
A stable simulation with a 1.5 km resolution was only possible with a 100 km lid. With higher lids the enhanced resolution led to greater vertical velocities (associated with upward propagating waves) near the top of the model, which caused the 120 km and 135 km lid runs to fail, even when the model "sponge layer" was strongly applied. The sponge layer is additional damping applied near the UM upper boundary, and is commonly used in atmospheric models to prevent spurious reflection of vertically propagating waves from the model lid. In the UM, this sponge layer scheme works by damping vertical velocity (e.g. Melvin et al., 2010). Even when a 3 km vertical resolution is used, the sponge layer has to be strengthened by a factor of at least 4 (for the 120 km lid) and of at least 8 (for the 135 km lid), compared to the 85 km or 100 km runs, to ensure a stable run.
An example of zonal mean zonal wind and zonal mean temperature for June for the UM run with a 3 km vertical resolution appears in Figure 4. The wind and temperature structures below 100 km are consistent for all runs. Griffith et al. (2020) also show good agreement between the 85 km and 100 km runs below 85 km and also good agreement with the Microwave Limb Sounder temperature climatology and the Upper Atmosphere Research Satellite Reference Atmosphere Project (URAP) (Swinbank & Ortland, 2003) zonal wind climatology. However, easterly winds at equatorial uppermost levels in the 135 km run are too strong and the latitudinal and longitudinal structure of the models fields above around 100-110 km is increasingly poor compared to fields from other models (not shown). This is due to the very strong relaxation to the global mean temperature used in the nudging scheme above around 100 km, where the temperatures rise strongly with height. This issue is likely to be solved with the introduction of exothermic heating from the UKCA but in the interim could be addressed by replacing the global profile in the nudging scheme with a more detailed temperature climatology. We also had some success in producing simulations with a 150 km lid. However, given the above issues we will not produce any more such simulations until the issues with the latitudinal/longitudinal structure in the lower thermosphere are addressed.

The NLTE radiation scheme
The UM NLTE radiation scheme was not ready when the above UM experiments were run, but this scheme has now been tested and implemented in the UM. This scheme represents NLTE infrared cooling for the 15 lm CO 2 band (Fomichev et al., 1998) and 9.6 lm O 3 band (Fomichev & Blanchet, 1995), and NLTE heating from the 1.05-4.3 lm CO 2 band and the 200-310 nm O 3 Hartley band. This NLTE radiation scheme is commonly used in MLT models, and has been rewritten from scratch to make it compatible with UM coding standards. In the UM, the radiative heating is calculated twice, once with SOCRATES and once with the Fomichev scheme, and then they are blended at the point where NLTE processes begin to become important (above 65 km altitude). This transition is quite smooth and no averaging across the transition zones seems D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 Fig. 4. Zonal mean zonal wind (top) and zonal mean temperature (bottom) for UM simulations with a 3 km vertical resolution and with lids at 100 km (left), 120 km (middle) and 135 km (right). to be needed. It can be seen in Figure 5 that including NLTE effects removes erroneous SW heating at the top boundary that caused instabilities in the UM.
With the inclusion of NLTE radiation, the UM still requires the nudging scheme to be used to remain stable, but the nudging can be switched on at 90 km instead of at 70 km as before. Figure 5 shows that the zonal mean zonal wind in the NLTE simulations is fairly similar to that in Figure 4, which shows that the strategy of using nudging in the model development stage is robust and acts to approximately represent the effects of the NLTE heating. There are some differences between the wind fields. The start dates for the runs shown in Figures 4 and 5 are different which explains the different phases of the Quasi Biennial Oscillation in the equatorial stratosphere in Figures 4 and 5. In addition, the diagonal pattern of westerly winds extending from around 50 km and 45°S to 45°N and around 100 km is more pronounced in Figure 5 than in Figure 4 and is in slightly better agreement with the URAP climatology.

Treatment of radiative heating, photolysis and photoionisation in the FUV and EUV
Solar radiation at FUV and EUV wavelengths <200 nm is generally absorbed in the mesosphere and thermosphere, primarily by O 2 , N 2 , and O. This results both in photolysis/ photoionisation of these major species along with a number of minor species, and to direct radiative heating of the atmosphere. Current parametrizations of radiative transfer used in the UM deal with radiative heating and photolysis in two separate schemes. Radiative heating is handled using the two-stream SOCRATES scheme (Edwards & Slingo, 1996;Manners et al., 2018) covering wavelengths >200 nm (Walters et al., 2019), while photolysis >177 nm is handled by the eight-stream Fast-JX scheme (Wild et al., 2000;Telford et al., 2013).
To parametrize a unified treatment of radiative heating, photolysis and photoionisation at shorter wavelengths (namely for the wavelengths <177 nm not currently included in SOCRATES), a flexible new scheme has been implemented in the SOCRATES code. This introduces two novel techniques: 1. Replacement of the plane-parallel approximation with the pseudo-spherical approximation, allowing the calculation of solar radiation absorbed in the twilight regions of the atmosphere where the solar zenith angle is greater than 90°. Under the plane-parallel approximation a column within the UM is treated as a set of flat layers extending infinitely in the horizontal. In the pseudo-spherical approximation, the plane-parallel atmosphere is replaced by a set of spherical shells. Where the local zenith angle is greater than zero, the direct flux arriving at each layer will take a different path through these shells, so a separate calculation for the direct beam is required for each layer. This allows calculations where the solar zenith angle is below the horizontal to the limit where the solar beam would intersect the surface. The scheme implemented in SOCRATES has been applied to exoplanet atmospheres (Lines et al., 2018), but is also crucial to accurately model radiative transfer in the MLT region. 2. Accurate spectral calculation of photolysis rates using a novel technique to map the fluxes calculated using the correlated-k method onto a high resolution spectrum.
The correlated-k method essentially reorders the wavelengths within a spectral band in order of increasing strength of absorption and then bins up wavelengths with similar absorption coefficients. Radiative transfer calculations need then only be done for each absorption bin, or k-term. The novel technique here is to retain the information on the wavelengths that each k-term represents, so that the calculated fluxes for each k-term are mapped back to spectral channels at these wavelengths. The high resolution fluxes are then convolved with the solar spectrum, photoabsorption cross-sections and quantum yields in order to derive accurate photolysis rates for any given photolysis pathway. An added benefit is that spectra can be diagnosed at high-resolution, effectively providing line-by-line resolution for the cost of a broad-band calculation.
The rate of photolysis or photoionisation for a given molecule is: where r is the absorption cross-section of the molecule (in m 2 ), Q is the quantum yield, and A is the actinic flux, which is the integrated radiative intensity over all directions. The integral is over all wavelengths (k). The quantum yield is read in as part of the standard configuration file (or spectral file). In the case of highly energetic photons causing photoionisation it is possible for the photoelectron to cause further dissociation/ionisation events. This can be parametrized by providing scaled quantum yields. The summation over channels is done by weighting each channel by the incoming solar spectrum, which may be specified at arbitrary time resolution in an extended spectral file. The radiative heating rate is determined from the flux divergence across the layer. Where photolysis occurs some of the absorbed flux is used to dissociate or ionise the molecules. This fraction of the energy is therefore not (immediately) available to increase the temperature of the gas and should instead be handled by the chemistry scheme for possible release in exothermic reactions. The proportion of the flux used for photolysis is determined at the same time as the photolysis rates within each channel. The flux divergence used for photolysis is then summed over channels in the same way as the photolysis rates and then summed over all photolysis pathways. The residual heating is determined by subtracting this from the total flux divergence.
The scheme developed here has been designed to be applicable over the entire spectral range from X-ray to near-infrared wavelengths. This is entirely configurable through the use of an external spectral file which contains all the information on the wavelength breakdown, the gas absorbers present, gas absorption coefficients, photolysis pathways and quantum yields, the solar spectrum (optionally time varying), Rayleigh scattering coefficients, and eventually cloud and aerosol absorption and scattering parameters. An initial configuration has been developed covering the wavelengths 0.05-320 nm. Following Solomon & Qian (2005) we have used absorption cross-sections and quantum yields from Fennelly & Torr (1992) for EUV, and Henke et al. (1993) for the X-ray, for the species O 2 , N 2 , O and N. Shortward of 65 nm the photoelectron enhancement factors derived by Solomon & Qian (2005) are used to enhance the quantum yields. In the FUV, gas absorption coefficients are D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 derived from the recommendations in the JPL report 15-10 (Burkholder et al., 2015). Quantum yields for the important region around the solar H Lyman-a line (121.6 nm) are taken from Lacoursiere et al. (1999). O 2 absorption dominates throughout the far-UV with complex absorption spectra over particular wavelength ranges such as the Schumann-Runge bands at 175-205 nm. The k-term mapping technique automatically concentrates channels in these complex parts of the spectrum.
Future work will refine this configuration for use in the UM to provide photolysis and heating rates in the MLT region. These can then be passed to the improved UKCA, which is being developed in a separate project.

The viscous scheme
While the non-hydrostatic formulation of the UM should lead to a more accurate representation of thermospheric dynamics, at this stage the model is not stable enough to run in the thermosphere. Griffin & Thuburn (2018) showed in idealised studies that the UM has difficulties simulating an acoustic wave propagating upwards into the thermosphere. These acoustic waves are simulated by a non-hydrostatic core, but are absent from any hydrostatic solution. The acoustic waves can be important in geomagnetic storms (e.g. Deng et al., 2008), so they need to be represented as realistically as possible. This can be done using realistic damping that controls the growth of wave amplitude with decreasing density, rather than the crude use of high levels of artificial model diffusion.
Molecular viscosity and diffusion are real physical processes that have a significant damping effect on vertically propagating waves in the thermosphere (above 130 km) where the timescale at which molecular viscosity and diffusion act becomes shorter than the growth rate of acoustic waves (Griffin, 2018). Griffin (2018) found that with a standalone version of ENDGame the inclusion of vertical molecular viscosity and diffusion acts to reduce acoustic wave amplitudes above 130 km, and hence improves the model stability. Including molecular viscosity and diffusion in the UM will be very important for its stability as it is extended upwards. Its inclusion may also negate the need for the sponge layer in the UM in simulations with a top model boundary extended higher than 130 km.
Including vertical molecular viscosity and diffusion in the UM is non-trivial. They act on very short timescales compared to the timestep used by the UM's semi-implicit dynamical solver, and so it must be included with the dynamics in a fully coupled way. Trying to include it separately from the dynamics causes time-splitting errors to be generated if the timestep used is too large (greater than one minute). As the vertical molecular viscosity and diffusion cannot simply be added as an increment to the right-hand side of the dynamical equations, as with other UM parametrization schemes, this makes the coding of the scheme very complex. In addition, the full UM uses a different solver (an incremental solver close to that described by Allen & Zerroukat, 2016) to that used by the stand-alone version of ENDGame used by Griffin (2018) (multigrid solvere.g. Buckeridge & Scheichl, 2010), so the scheme needs to be re-implemented.
The Euler governing equations with stress tensors including viscosity coefficients is given by Griffin (2018). The additional viscous coefficients are all functions of thermodynamic variables and include the coefficients of molecular viscosity and thermal diffusivity. After Allen & Zerroukat (2018), a final form of the equation which includes viscous terms is derived, and which can be used as input for the incremental solver. The viscous scheme will be implemented as a change to the inputs of the UM's incremental solver in the dynamical core. Less approximations are made with this method compared to that used with the multigrid solver in the stand-alone version of ENDGame, so it may even give an improvement in accuracy.
Pseudocode for the entire viscous scheme has been written up and internally reviewed, to confirm the veracity of the scheme. The process of producing the code for the scheme is underway. Unit tests have been being added to the various routines in the scheme to make the code more robust and easier to debug. These have involved, for example, checking that all the inputs and outputs from the code are within expected parameters, checking that the thermodynamic variables take physically realistic values, or checking that the arrays being passed through the routines are the correct sizes.

New geomagnetic indices 4.1 High cadence, open-ended geomagnetic index Hp
The SWAMI project aims at a high cadence geomagnetic index that resembles the Kp index. Here, high cadence means that the time resolution of the index is higher than the 3-hourly resolution of the Kp index. A number of geomagnetic indices, which are available for different purposes and based on different principles, are shown in Table 2. The newly developed, high cadence index Hp is based, as much as possible, on the  (Hp30). Likewise, the geomagnetic ap index, which is the linear version of the Kp index, is provided in the versions ap90, ap60 and ap30. The Kp index has a maximum value of 9. This means that all geomagnetic events that are above a certain threshold will be assigned a Kp index of 9. An adaptation to make the Hp indices open-ended is in development and will allow to describe very high levels of geomagnetic activity with Hp = 9+, 10À, 10o, 11À and so on. The Kp index (Bartels, 1957a, b;Mayaud, 1980;Menvielle & Berthelier, 1991;Siebert & Meyer, 1996) is a measure of particle radiation effects observed in sub-auroral to mid-latitude geomagnetic field records and thus a proxy for the energy input from the solar wind into the ionosphere and thermosphere. This energy input is modulated by magnetospheric processes and the magnetic effects tracked by Kp are mostly caused by ionospheric electric currents associated to the polar electrojet and their field aligned currents. The Kp index (and ap index) is produced and distributed by the German Research Centre for Geoscience, and is endorsed by the International Association of Geomagnetism and Aeronomy (IAGA).
The PC index describes polar cap currents and also represents a measure of the solar wind energy input (Troshichev et al., 1988(Troshichev et al., , 2006. The AE indices (Davis & Sugiura, 1966;Tomita et al, 2011) are caused by ionospheric currents associated to the polar electrojet and are a measure of solar wind energy input and of magnetospheric processes. AE and PC, like Kp, are based on data corrected for quiet-time geomagnetic variations as well as for local time and seasonal effects.
AE and PC are based on instantaneous values of the geomagnetic field, they are available in near real-time (NRT) with 1 minute time resolution. Kp is a range index, related to the maximum difference of the magnetic variation within a UT 3-hourly interval, i.e. one index value is assigned for the period from 00:00 UT to 03:00 UT, one from 03:00 to 06:00 UT, and so on.
The low temporal resolution of the Kp index is a disadvantage for space weather applications. On the other hand, the Kp index goes back to 1932 and is very well established as space weather tool (e.g. NOAA space weather scales are based on it).
Another advantage of the Kp index arises from the location of the 13 geomagnetic observatories it is based on. During strong events, the auroral oval expands and this leads to an increase in geomagnetic activity at sub-auroral and midlatitudes, which are tracked by Kp.
Other indices with higher time resolution, like the Dst (1-hour resolution, a measure of the magnetospheric ring current; Sugiura & Hendricks, 1967) or the Wp index (1-min resolution, indicator of substorm onset time; Nosé et al., 2012), are not very well correlated to energy input from the solar wind.
In contrast to the other geomagnetic indices listed in Table 2, the Kp index values 0, 0+, 1À, 1o, 1+, 2À, . . ., 9À, 9 are discrete and limited to a maximum value of 9. This is another disadvantage of the Kp index, since extreme events above a certain threshold will always be given a Kp = 9, instead of being subdivided into different values, depending on the strength of the event.
The new index shall be as useful as possible for existing services and models, which predominantly use Kp. Our aim here is to provide a new geomagnetic index that resembles, as much as possible, the well-established Kp index both in terms of reliability, availability and properties, e.g. in the frequency distribution of its index values. The development of a new, Kp-like index with higher cadence than Kp will be briefly described in the following. This index will be open-ended, i.e. extreme events can be assigned values of Hp = 9+, 10À, 10o, 10+, . . ., depending on the amplitude of the event.
Kp is determined from the standardised local K values for each observatory. While Kp is a measure of global energy input from the solar wind and magnetosphere into the ionosphere and thermosphere, the local K values from individual observatories are also used for regional investigations and therefore we also develop high cadence K value equivalents, i.e. H90, H60, and H30 values.
Both the definitive Kp index and the nowcast Kp index are available from GFZ. The definitive index is calculated at GFZ semi-monthly from the K values provided by the Kp observatories. The nowcast K and Kp indices are calculated at GFZ from near real-time observatory data provided. For the K values, the FMI-method is used (Sucksdorff et al., 1991), which was found to be the best computer-based algorithm (Menvielle et al, 1995). The methodology to determine H and Hp values, described in the following, is based on the nowcast K and Kp used at GFZ. In the following we describe this process.
The Hp indices are based on the same geomagnetic observatories as the Kp index. The subtraction of the quiet time curve is taken from the FMI method. A slightly different approach is taken when determining the range of the geomagnetic variation (after subtraction of the quiet curve) in the index interval (90, 60, 30 min). For K, the range between the variation's minimum and maximum is always used. For H, we use either this range, or the maximum difference between the geomagnetic variation and the quiet curve, whichever is larger. In practice, we found that this deviation from the K determination method had very little influence on the result.
Thus, for each index interval we determine one disturbance value in nT for the two horizontal components and choose the larger one as being representative for this index interval. We found the resulting disturbance values (in nT) for the shorter index intervals to be systematically smaller than for the 3-hourly intervals. The disturbance values are then mapped to the index values H = 0, 1, 2, . . ., 9, such that the frequency distribution of the H indices for a specific observatory is identical to the frequency distribution of the K index values. For example, for Niemegk observatory, a disturbance between 330 nT and 500 nT yields K = 8. These boundaries for H = 8 at Niemegk are 254 nT and 371 nT for H90 = 8; 218 nT and 337 nT for H60 = 8 and 190 nT and 267 nT for Hp30 = 8. The lower boundaries compensate for the systematically lower disturbance values in the shorter index interval.
The H values are standardized with the same tables for local time and seasonal dependency as the K values and averaged over all observatories to yield a mean value. These mean values are then mapped to the Hp values such that the D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 resulting Hp index has the same frequency distribution as the Kp index.
In Figure 6 an example for Kp and Hp indices is shown for 2 days where strong geomagnetic activity occurred. Strong geomagnetic variation was observed at all 13 Kp stations from 22:30 on Sept. 7 to 02:15 on Sept. 8 as well as from 12:00 to 15:00 on Sept. 8. These times are marked in light blue. The time series of the two consecutive days shows several aspects of the characteristics of Kp and Hp and their relationship: The Hp indices follow Kp closely if Kp changes are small and gradually, as, e.g., on Sept. 7 from 00:00 to 15:00 and on Sept. 8 from 03:00 to 09:00. At times of strong geomagnetic activity (light blue area) Kp and Hp indices show similarly high values. The Hp indices follow Kp closely during larger changes in Kp, if these are caused by geomagnetic activity that (by chance) commences at or shortly after the start of a 3-hourly Kp interval, such as on Sept. 8, at 12:00 UT. The temporal evolution of the Hp indices show significant differences from the Kp values for strong changes in geomagnetic activity that commences at other times than the start of a 3-hourly Kp interval. This is observed on Sept. 7 at 21:00 UT to 21:30 UT, when Kp = 8À, while Hp90 = 4, Hp60 = 4+ and Hp30 = 5À.
In summary, these examples indicate that the high cadence Hp indices track the Kp index well. A more detailed description of the derivation and the properties of the Hp indices will be given in a dedicated publication emerging from the SWAMI project.
Currently, a test dataset for the years 2003-2005 and 2017 and its description is available for download . These years were chosen to include the Halloween storm in 2003 and a large number of moderate geomagnetic storms (in 2005) as well as the recent geomagnetic storm in September 2017 (e.g., Curto et al., 2018). It should be noted that the published test dataset of Hp is a version of the index that is not yet open-ended, and consequently the maximum value of Hp is limited to 9o. In 2020, the high cadence, open-ended Hp index will be made available by GFZ in near real-time and as an archive back to 1995. (Disclaimer to users of the dataset: Please carefully test and validate all your model output and services for which you use the Hp indices [including the ap90, ap60, ap30] as input parameter. This is especially true when these models and services were originally derived or parameterized with the Kp index).

Index nowcasting and forecasting
The prediction of satellite orbits relies on the prediction of the thermosphere conditions that in turn relies on the prediction of geomagnetic conditions. The Kp index is widely used for a number of applications. For the orbit propagation up to the current time one can use the near real-time Kp index when available as discussed in Section 4.1. However, such a near-real-time Kp still comes with a delay, and projections into the future require a nowcast of Kp as well as predictions.
One simple approach can be to use an average Kp index ("average Kp") model. That would certainly be inaccurate but would allow predictions. Slightly more elaborate predictions can be based on the most recent values of Kp ("persistence Kp"), where Kp is forecast to be the same as the last available Kp. While this model would also not result in an "accurate" prediction, it will statistically reflect the general state of the geomagnetic activity as it will be based on recently estimated Kp from the measurements. An even more elaborate approach would be to use the measurements from the previous solar rotation ("recurrence Kp"). Such predictions would not only account for the general level of activity during a particular solar cycle, but may provide a more or less accurate forecast of the recurrent geomagnetic activity that is driven by the fast solar wind streams originating from the coronal holes. An additional improvement of the forecast can be made by utilizing the solar wind measurements ("SW model"). Combining all these datasets one can utilize all of the observations and data including history of Kp and solar wind conditions ("full model").
While predictions using various tools and inputs similar to described above have been done in the past (Costello, 1997;Fig. 6. Kp and Hp indices on September 7 and 8, 2017. Two periods of strong magnetic variation at all Kp stations are marked in light blue. D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 Boberg et al., 2000Balikhin et al., 2001;Boaghe et al., 2001;Wintoft et al., 2017;Tan et al., 2018), until recently it remained unclear how each of these models based on different data compare to each other, if they were combined what the relative contribution of each of these models would be, and how their performance depends on the forecast horizonthe time interval in the future for which the forecast is issued.
To objectively evaluate the importance of different input data, Shprits et al. (2019) developed a series of models based on neural networks. To make sure that the evaluation of the accuracy of the models does not depend on the interval for validation of the network, the authors used a K-fold cross validation technique. The main idea of K-fold cross validation is that the entire interval of time-series is divided into a number of intervals. After that, one of the intervals is excluded and models are developed (hereafter referred to as "trained") on the remaining dataset and tested on the excluded time interval. Such a validation is repeated for each of the time intervals one by one. The net value of the error is estimated as an average from all models trained in the manner as described above. Shprits et al. (2019) utilized 1-minute propagated solar wind measurements from 2005 until 2016 obtained from the OMNI-Web data service. They first divided the considered time interval into 11 yearly periods and then performed a K-fold cross validation. Each cross-validation used one year that was excluded from training for validation and the rest of the data for training. Training and validation were repeated 11 times, resulting in 11 training errors and 11 validation errors. The average of all of these errors was used as an indicator for the average over a solar cycle of each of the models. For a more detailed description of the model parameters and datasets, please see .
The results of the models and their K-fold cross validation are shown in Figure 7. For nowcasting (estimation of the current conditions or short-term predictions), the simple models based on average values of Kp and values from the previous solar cycle turned out to be more accurate than the persistence predictions. Addition of the solar wind data resulted in a much more accurate prediction. Interestingly enough, the accuracy of the model based only on the solar wind was very similar to the accuracy of the model that utilized near-real-time and historical values of Kp, leading to the rather surprising result that the accuracy of predictions cannot be improved by augmenting them with real-time estimates. That does of course not mean that real-time measurements are not important as they can give the most accurate values of Kp for the recent history.
For predictions more than 12 h into the future, the value of the solar wind predictions decreases as current solar wind conditions cannot help predict the future. For such longer-term predictions the historical measurements and in particular recurrence (values from the previous solar cycle) start playing an important role. The models based on historical values for horizons of over two days outperform the solar-wind based models.
It should be noted that while some of the applications of geomagnetic conditions depend on the average values, predictions of the majority of space-weather related phenomena depend on storm-time values of the Kp index. The accuracy of storm-time predictions is much lower. First of all, conditions may be changing fast during the CME-driven storms. Another complication is that machine-learning based models tend to be more tailored towards lower values of Kp, as quiet times occur much more often than disturbed geomagnetic conditions. Storm-time Kp accounts for less than 1% of all Kp time series. Forecasts trained on the original Kp time series tend to systematically underestimate storm-time Kp data as the majority of the data points used as an input are quiet-time values. Figure 8 shows the errors in different forecasts models for Kp 4 and Kp > 4. D.R. Jackson et al.: J. Space Weather Space Clim. 2020, 10, 18 To account for this, Shprits et al. (2019) suggested to perform a resampling of the data. They showed that resampling can significantly improve the results. For the development of the neural network the input dataset consists of values of Kp and values of the solar wind immediately preceding the current time. Such a dataset can be augmented by duplicating the points corresponding to high Kp. Such a duplication or "resampling" continues until the number of high Kp values becomes equal to the number of low Kp values. When such a rebalanced dataset is used for training, the empirical model puts much more weight on storm-time values and the accuracy of the forecast for high Kp may be improved. Shprits et al. (2019) also suggested that other methods that can put more weight on high Kp values can result in similar improvements for the storm-time. These can potentially be methods that allow to specify the cost function for the training that can "pay more attention" to high values of Kp than to low values. While tailoring of the model to disturbed geomagnetic conditions may result in improvements during storm times, it will also decrease the accuracy during quiet time as less attention will now be paid to the most typical values of Kp from 0 to 4-5. Tailoring of the methods and testing of the methods for particular applications should be the subject of future research.
While They also analyzed the input data and tested various ML methods to reduce the dimensionality of the input data, which in this case was historical measurement of the solar wind parameters. This procedure is usually referred to as "feature selection" in ML. As the predictions utilize time series, the number of input values for the training (in ML referred to as "features") grows when high cadence is considered, and the number of inputs may be very large when the long-term history is considered. Handling of such large input datasets may be computationally demanding and can result in so-called "overtraining" (or "overfitting"), which will decrease the accuracy.
Moreover, it becomes more difficult to evaluate the importance of each of the inputs for the model.
To find the most important subsets of inputs, Zhelavskaya et al. (2019) used a number of methods that included: Random Forest (RF) (Ho, 1995), Fast Function Extractor (FFX) (McConaghy, 2011), Maximum Relevancy Minimum Redundancy (MRMR) , and Mutual Information Maximization (MIM) (Bollacker & Ghosh, 1996). The Random Forest is widely used for ML due its fast training speed. The FFX is based on symbolic regression and allows to infer the main features due to its white box approach. The MIM and MRMR methods are based on the concept of mutual information (MI). MI is a quantity that represents the "amount of information" that can be obtained from one random variable through observation of another random variable. See Figure 1 in Zhelavskaya et al. (2019) for further illustration of these methods.  The obtained results (Fig. 9) show that the performance of the models as measured by the RMS error and Correlation coefficient for all considered models was rather similar and much lower than that of the reference models based on persistence of Kp or average Kp (not shown on Fig. 9). This most likely implies that significant improvements to the prediction of Kp cannot be obtained by simply using another regression method. To achieve future improvements, predictions based on solar wind observations such as predictions with global heliospheric codes should be used. Such codes can provide advanced forecasts of the solar wind that may produce much more accurate Kp predictions for longer term horizon up to 1-2 days, which is the time it takes for the disturbances on the Sun to reach the Earth. 5 Research to operations 5.1 Design of MCM As indicated above, the extended UM and the improved DTM shall be blended together to produce MOWA. The exact blending methods (fairing functions) have not yet been established. MOWA can be used operationally in the future but is too computationally expensive and complicated to exploit for users themselves. Therefore, MOWA will be used to develop MCM, in the form of look-up tables. Optimum search and interpolation algorithms will be researched in the last year of the project (2020), and the temporal and spatial resolution of the look-up tables will be evaluated to guarantee minimal loss of precision. This model, together with the DTM, shall provide a quick and easy method for satellite operators to plan re-entry and launch operations, without using the full UM.
MCM is focused on providing an operational model for orbit propagation and re-entry computations. This requires: Easy usage and integration in larger projects; Good computational performance.
In order to achieve this, the design shown in Figure 10 has been devised. It exposes a common interface for the UM and DTM models, so that the user of the MCM can use both combined models (together with the blend function) without worrying about the underlying internals.
Finally, as detailed in the user and software requirements (see Sect. 5.2), the model will be accessible with just a date and time, an altitude and a location (specified in terms of latitude and longitude), although the input interface will also allow the user to employ their own estimation of space weather indices.

Requirements for operational application
To decide and establish the user requirements for operational application, a survey was sent to 19 samples, and 8 answers were obtained. The participants came from public, private, research and non-research institutions. The questions covered topics related to the distribution of the computational model, the information that the possible users think is of more interest, in which aspects space weather affected their operations or research, what interface would be more interesting for their applications, etc.
Using the information drawn from this survey, the following user requirements were derived: The prototype models implementation shall be made available in the form of source code; The prototype models must be accessible from Python code; DTM and MCM shall be made available to the users with a single-entry point; The prototype library shall be made available for Linux systems; The prototype shall be disseminated in a website that allows running it from a web browser; The prototype shall allow using the models with minimal inputs (date, location and altitude); The prototype shall allow the users to optionally use their own space weather indices; The prototype library will reject erroneous user inputs; The models shall be integrated in the VSWMC framework; The prototype library models shall expose a unique atmospheric model comprising the DTM and MCM; The library will cover altitudes from 0 to 1500 km; The run-time performance of the library will be comparable to the performance of the isolated underlying models; The prototype shall be verified against results of the standalone models.
These requirements will ensure that the software developed for this project is in line with what a typical user in need of this model is expecting. A Python package following the design specified in Section 5.1 will be published and made easily available at the SWAMI project website, the VSWMC framework and other possible channels of distribution. In addition, there will be a web API integrated in the project website that will allow for running requests over the Internet.

Conclusions
In this paper, we report on progress with SWAMI to date. The aim is to produce a high quality whole atmosphere model that is suitable for operational use.
The DTM has been re-developed using a more accurate and consistent neutral density observation database than has been used in the past. In addition, new data from Swarm-A, TLE global mean densities and the US Air Force EDR densities have been assimilated into a new operational model DTM2019_oper. Assessment of this model shows it is slightly more accurate than DTM2013 and that DTM as a whole maintains its advantage over other models like NRLMSISE-00 and JB2008. A research version of DTM, called DTM2019_res, has also been developed which uses Hp60 indices instead of Kp as a geomagnetic proxy, and F30 instead of F10.7 as solar activity proxy. The performance of this model over the entire observational dataset is good, and this reveals DTM's ability to represent the impacts of solar activity on the thermosphere. Assessment of the impacts of individual geomagnetic storms, and thus the impact of using Hp60 instead of Kp, is a more complex task that has yet to be completed.
The UM has been extended from its original upper boundary of 85 km to run stably and accurately with a 135 km lid. Initial simulations did not employ any changes to the model to account for the MLT other than the use of a raised lid. It was quickly realised that use of a nudging scheme (Griffith et al., 2020) to approximate MLT radiative and chemical heating was required, and initial runs used this, with the UM SOCRATES radiative heating switched off above 70 km. The zonal mean wind and temperature fields from these runs compare quite well with climatology but the latitude/longitude structure above around 100 km is increasingly inaccurate, which reveals issues with nudging to a global mean temperature profile. This can be addressed by introducing improved radiation and chemistry schemes to the UM and switching off the nudging. A first step towards this has been the introduction of a NLTE radiation scheme (Fomichev et al., 2004) and the use of nudging only above 90 km instead of 70 km. Work to extend the SOCRATES radiation scheme to include the FUV and EUV wavelengths was presented. An initial configuration covering the wavelengths 0.05-320 nm for the species O 2 , N 2 , O and N has been developed and its performance is satisfactory. The next step is to further refine and validate this scheme and extend it to other relevant species and to pass the associated photolysis rates to an upgraded UKCA scheme in order to calculate exothermic chemical heating. At this stage the nudging scheme can be turned off.
The availability of the UM up to 135 km means that code has been written to read, compare and blend UM output with DTM, which is the first step towards developing MOWA and MCM. This initial comparison has highlighted the inaccuracies in the UM mentioned above. The blending between UM and DTM will likely also be more accurate if the UM lid can be raised further, to around 150-170 km. Initial attempts to run the UM at these altitudes has required a very strong "sponge" layer or have failed. These can be addressed by introducing a molecular diffusion and viscosity scheme. This will introduce realistic damping and enhance model stability. The scheme has already been developed by Griffin & Thuburn (2018) but needs to be adapted to run with the UM dynamical solver.
A higher cadence geomagnetic index is required in order to better represent thermospheric variability related to geomagnetic storms. Hp30, Hp60 and Hp90 indices, which are like the Kp index, but use 30 min, 60 min and 90 min cadences, respectively, are described here. The new indices are designed to resemble, as much as possible, the Kp index in terms of reliability, availability and properties such as the frequency distribution of its index values. It was shown that the Hp indices follow Kp for small and gradual Kp changes and during strong geomagnetic activity Kp and Hp indices show similarly high values. The temporal evolution of the Hp indices show significant differences from the Kp values for strong changes in geomagnetic activity that commences at other times than the start of a 3-hourly Kp interval. Overall, this suggests the Hp indices are behaving in the way they were designed to. A test Hp dataset is now available for users. An open-ended version, not capped at Hp = 9, is under development to allow a more detailed description of extreme events.
The development and testing of neural network and other machine learning methods applied to the forecast of geomagnetic indices is also an important part of SWAMI since this will lead to improved thermospheric forecasts for all the models used or developed in SWAMI. An initial assessment showed that for nowcasting, simple models based on average values of Kp and values from the previous solar cycle turned out to be more accurate than the persistence predictions. Addition of the solar wind data resulted in a much more accurate prediction. For predictions more than 12 h into the future, recurrence (values from the previous solar cycle) starts playing an important role.
Forecasts trained on the original Kp time series tend to systematically underestimate storm-time Kp data as the majority of the data points used as an input are quiet-time values. Duplication (or "resampling") of the points corresponding to high Kp, in order to produce a rebalanced dataset for training, can lead to improved index forecasts during storm times, though at the expense of decreasing the accuracy during quiet times. Such tailoring should be the subject of future research. While the above tests provided a systematic error estimation for the models based on neural networks, it remained unclear if the accuracy could further be improved by utilizing different methods for regression. To address this question, the performance of the LR, GB and FNN machine learning methods were compared. The performance of all these models was rather similar, which most likely implies that significant improvements to the prediction of Kp cannot be obtained by simply using another regression method. To achieve future improvements, predictions based on solar wind observations such as predictions with global heliospheric codes should be used.
In summary, we demonstrate significant scientific and operation progress towards improved European orbit prediction capabilities and provide guidance for future development in this field.