Thermosphere modeling capabilities assessment: geomagnetic storms

The specification and prediction of density fluctuations in the thermosphere, especially during geomagnetic storms, is a key challenge for space weather observations and modeling. It is of great operational importance for tracking objects orbiting in near-Earth space. For low-Earth orbit, variations in neutral density represent the most important uncertainty for propagation and prediction of satellite orbits. An international conference in 2018 conducted under the auspices of the NASA Community Coordinated Modeling Center (CCMC) included a workshop on neutral density modeling, using both empirical and numerical methods, and resulted in the organization of an initial effort of model comparison and evaluation. Here, we present an updated metric for model assessment under geomagnetic storm conditions by dividing a storm into four phases with respect to the time of minimum Dst and then calculating the mean density ratios, standard deviations and correlations. Comparisons between three empirical models (NRLMSISE-00, JB2008 and DTM2013), two first-principles models (TIE-GCM and CTIPe) and neutral density data sets that include measurements by the CHAMP, GRACE, and GOCE satellites for 13 storms are presented. The models all show reduced performance during storms, notably much increased standard deviations, but DTM2013, JB2008 and CTIPe did not on average reveal a significant bias in the four phases of our metric. DTM2013 and TIE-GCM driven with the Weimer model achieved the best results taking the entire storm event into account, while NRLMSISE-00 systematically and significantly underestimates the storm densities. Numerical models are still catching up to empirical methods on a statistical basis, but as their drivers become more accurate and they become available at higher resolutions, they will surpass them in the foreseeable future.


Introduction
Thermosphere models are used operationally mainly in the determination and prediction of orbits of active satellites and orbital debris, and conjunction analysis is becoming a major issue with the fast-growing number of objects in space. The accuracy of the determination and prediction of ephemerides of objects in Low Earth Orbit (LEO; altitudes lower than 1000 km) hinges on the quality of the force model for atmospheric drag (Hejduk & Snow, 2018). Besides satellite characteristics (Doornbos, 2011; Mehta et al., 2017), this force depends heavily on the total neutral density, which is highly variable both spatially and temporally, and to a lesser degree on composition and temperature. Thermosphere variability is driven, on different time scales, by the changing solar extreme UV (EUV) emissions, by Joule heating and particle precipitation due to the interaction of the magnetosphere with the solar wind (referred to as "geomagnetic activity" as opposed to "solar activity"), and by upward propagating perturbations that originate in Earth's lower atmosphere, which are currently not accurately quantified (Pedatella et al., 2014). As a result, the selected solar and geomagnetic activity drivers, and in the case of density prediction, the accuracies of their forecasts, are crucial to thermosphere model accuracy. The impact of errors in the driver forecasts is outside the scope of this paper but has recently been addressed by Bussy-Virat et al. (2018) and Hejduk & Snow (2018).
In order to track the progress over time of thermosphere first principle (FP) and semi-empirical (SE) models, appropriate metrics are required. Secondly, high quality neutral density data sets, preferably with high-spatial resolution covering long intervals of time (i.e. years, and ideally a complete solar cycle), are needed. The neutral density observations can then be used to verify model accuracy in space and time, i.e. with respect to latitude-longitude-local time variations, and solar and geomagnetic activity levels and seasonal variations. Data and metrics allow benchmarking of the models, quantifying errors and performance, and giving detailed descriptions of the improved (or degraded) performance.
Metrics and results for thermosphere model assessment on timescales from years to days have already been discussed and published (Bruinsma et al., 2018), but the metrics are not well suited to describe model performance during geomagnetic storms. This is due to the relative rareness of strong storms (average frequency of 60-200 days per 11-year cycle; https://www.swpc.noaa.gov/noaa-scales-explanation), but also due to their sudden occurrences and relatively short durations (1-3 days typically). We propose specific additional metrics for storm time, using the high-quality and high-resolution density data that are required for model comparisons under storm conditions, and using most of the events defined in Bruinsma et al. (2018). The assessment procedure and metrics are tailored to geomagnetic storms by unambiguously defining the time interval of evaluation and by applying the same metrics as in the Bruinsma et al. (2018) study, but to four specific phases of the storm as well as to the total interval. Secondly, additional metrics concern the maximum and the timing of the storm peak density.
Presently, evaluations of both SE and FP models are available for single storms (Forbes et al., 1987, 2005; Bruinsma et al., 2006), for several storms (Liu & Luehr, 2005; Knipp et al., 2017), or for data of a specific satellite mission (Kalafatoglu Eyiguler et al., 2019). Because different metrics, data, or event durations were used, results are not directly comparable even when the same storm was analyzed. The ultimate goal of this exercise is to evaluate all thermosphere models available at the CCMC (Community Coordinated Modeling Center: https://ccmc.gsfc.nasa.gov) by comparing them to the same data for the same events and applying identical metrics, in order to establish score cards that can help users select the best model for their objective.
The same three SE models, NRLMSISE-00 (Picone et al., 2002), JB2008 (Bowman et al., 2008) and DTM2013 (Bruinsma, 2015), and two FP models, TIE-GCM (Roble et al., 1988; Richmond et al., 1992) and CTIPe (Fuller-Rowell et al., 1996), which were used in the Bruinsma et al. (2018) assessment, are considered in this paper. These five models are implemented at CCMC. The model resolution and drivers used in this assessment are listed in Table 1. Note that the JB2008 solar driver files, which are computed and modified by the modelers, change regularly. The driver files (SOLFSMY and DTCFILE) that were online in March 2019 were used for the assessment given in this paper.
Section 2 provides short descriptions of the five models tested in this first part of the assessment. Section 3 presents the storm-time metrics designed to assess the models, which are applied to the five models in Section 4. After a short summary, the conclusions are given in Section 5.

Model descriptions
The following three sections are rather similar to the descriptions in Bruinsma et al. (2018), but are repeated here so that the paper is self-contained and for ease of reading.

Semi-empirical thermosphere models: NRLMSISE-00, JB2008 and DTM2013
SE models are mainly used in orbit computation and mission design and planning, and sometimes to provide initial conditions for FP models. They are easy to use and computationally fast, providing density and temperature estimates for a single location at a time (e.g., for each orbit position). They are climatology (or "specification") models that have a low spatial resolution of the order of thousands of kilometers, which is due to the low maximum degree (typically < 6) of the spherical harmonic expansion used in the algorithm, and low temporal resolution of hours, imposed by the cadence of the geomagnetic indices. Consequently, SE models cannot reproduce the realistic wave-like activity during geomagnetic storms such as large-scale traveling atmospheric disturbances (TAD; Bruinsma & Forbes, 2007), the complex dynamics in the polar caps, or the time-variable effects of tidal perturbations propagating from the lower atmosphere. They also do not take into account any storm pre-conditioning; the estimate is entirely based on a combination of statistical fits to the driver inputs. The minimum altitude of JB2008 and DTM2013 is 120 km, whereas for NRLMSISE-00 it is 0 km, and each can be used to approximately 1500 km. SE models are constructed by optimally estimating the model coefficients to data in a least-squares sense. Each model is based on a different combination of density, temperature, and composition measurements. The main sources of density data are satellite-drag inferred total densities by means of orbit perturbation analysis (Jacchia & Slowey, 1963) or accelerometers (e.g. Champion & Marcos, 1973), and neutral mass spectrometers (e.g. Nier et al., 1973), with the latter providing composition measurements. Drag-inferred and spectrometer data have in common that they do not provide an absolute measurement of density.
This is due to calibration issues and, e.g., unknown oxygen recombination rates in the case of mass spectrometers. The most recent and precise accelerometer-inferred density datasets are also not absolute. Their magnitude depends on the satellite model, and in particular on the aerodynamic coefficient that was assumed in the computation, which effectively is a scaling factor. As a consequence, the SE thermosphere models fit to the scales of the satellite models used in the respective databases, and these are rarely consistent between modelers. Intercalibration is a necessary and complicated activity for all modelers, but it is not always entirely successful due to, e.g., a lack of overlap in time or a drifting offset between datasets. JB2008 (up to 2008 at least) and DTM2013 predict densities that are close (often within 5%) to the US Air Force operational thermosphere model HASDM (Storz et al., 2005), and therefore they are considered to have the same scale.

TIE-GCM
The NCAR Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM) is a first-principles upper atmospheric general circulation model that solves the Eulerian continuity, momentum, and energy equations for the coupled thermosphere-ionosphere system (Roble et al., 1988; Richmond et al., 1992). It uses pressure surfaces as the vertical coordinate and extends in altitude from approximately 97 km to 600 km. The model resolution on the geographic grid employed throughout this study is 5° in the horizontal and 1/2 scale height H in the vertical, while a 2.5° horizontal and H/4 vertical resolution is also available. Tidal forcing at the lower boundary is specified by the Global Scale Wave Model (Hagan et al., 2001), and semi-annual and annual density periodicities are enhanced by applying seasonal variation of the eddy diffusivity coefficient at the lower boundary (Qian et al., 2014). Solar inputs are driven using F10.7 radio solar flux measurements as a proxy for XUV/EUV/FUV solar flux as described by Solomon & Qian (2005), which were derived using the same model resolution and tidal lower boundary specification. The electrodynamo potential field is internally generated at middle and low latitudes using the model densities and neutral winds. This is merged with a magnetospheric potential at high latitudes, using one of two available empirically driven models. First, the Heelis et al. (1982) empirical formulation, driven by the Kp index, follows the method described in Solomon et al. (2012). Second, the Weimer empirical model, which uses upstream solar wind and IMF as input as described in the following subsections, is described by Weimer (2005). Separate simulations were carried out using each of these high-latitude potential models for this work. Recent developments include the addition of helium for high-altitude extension and lower boundary options as described in Qian et al. (2014), Sutton et al. (2015), and Maute (2017). The version 2.0 of TIE-GCM used in this work is a community release that was issued in March 2016.
S. Bruinsma et al.: J. Space Weather Space Clim. 2021, 11, 12

CTIPe
The Coupled Thermosphere-Ionosphere-Plasmasphere electrodynamics (CTIPe) model is a global, three-dimensional, time-dependent, nonlinear, self-consistent model that solves the momentum, energy, and composition equations for the neutral and ionized atmosphere (Fuller-Rowell et al., 1996; Millward et al., 2001; Codrescu et al., 2012). The global atmosphere in CTIPe is divided into a series of elements in geographic latitude, longitude, and pressure. The latitude resolution is 2°, the longitude resolution is 18°, and model parameters are calculated with a 1 min time step. In the vertical direction, the atmosphere is divided into 15 levels in logarithm of pressure from a lower boundary of 1 Pa at 80 km to more than 500 km altitude. The magnetospheric input is based on the statistical models of auroral precipitation and electric fields described by Fuller-Rowell & Evans (1987) and Weimer (2005), respectively. Auroral precipitation is keyed to the hemispheric power index (PI), based on the TIROS/NOAA auroral particle measurements. The Weimer electric field model is keyed to the solar wind parameters impinging on the Earth's magnetosphere, and its input drivers include the magnitude of the interplanetary magnetic field (IMF) in the y-z plane, together with the velocity and density of the solar wind. A combination of measurements from Advanced Composition Explorer and Wind spacecraft instruments, obtained at NOAA Space Weather Prediction Center, NASA's Space Physics Data Facility and Los Alamos National Laboratory, has been used to address data gaps and quality issues (e.g., Skoug et al., 2004). The (2,2), (2,3), (2,4), (2,5), and (1,1) propagating tidal modes are imposed at 80 km altitude (Fuller-Rowell et al., 1991; Müller-Wodarg et al., 2001). The amplitudes and phases for the Hough modes are based on results from the Global Scale Wave Model-09 (GSWM-09; https://www2.hao.ucar.edu/gswm-global-scalewave-model).
In this paper, the lower boundary conditions in CTIPe simulations are specified using monthly averaged wind and temperature fields from the Whole Atmosphere Model (WAM; Akmaev et al., 2008; Fuller-Rowell et al., 2008). CTIPe uses time-dependent estimates of nitric oxide (NO) obtained from the Marsh et al. (2004) empirical model, which is based on Student Nitric Oxide Explorer (SNOE) satellite data, rather than solving for minor species photochemistry self-consistently. For higher altitude applications, helium needs to be included in the model. Solar heating, ionization and dissociation rates, and their variation with solar activity are specified by the Solomon & Qian (2005) solar EUV energy deposition scheme for upper atmospheric general circulation models.

Model assessment procedure
The next two subsections describe the density data and the necessary preprocessing, and the phases, storms, and new storm performance metrics selected for comprehensive model evaluation. All models will be tested according to the same standards, allowing unambiguous comparisons and quantification of the improvement of future model upgrades.

Selected density data
The density variability in the thermosphere is very large, typically hundreds of percent, on long time scales (the solar cycle) but also on short time scales of hours to days in the event of strong geomagnetic storms. The variability depends on location, season, and the level of solar and geomagnetic activity, and it increases with altitude. However, most density measurements are in-situ along the satellite orbits, and the spatial and temporal data distribution of a single satellite is in fact rather poor. The latitudinal extent depends on the orbital inclination, and the local time coverage is essentially limited to the time of its ascending and descending pass, which changes slowly due to the precession of the orbital plane. The temporal resolution achieved with a single satellite is one orbital period of roughly 1.5 h (and then the satellite passes the same latitude and local time, but at a different longitude), even if the measurement cadence is 5 or 10 s. Precise density datasets inferred from accelerometer data of recent satellite missions are selected because only these are compatible with storm-time evaluation, notably precise measurements from pole to pole. However, the assessments are based on densities from one, or two satellites at best, and one cannot reconstruct the complete picture of the thermosphere at any given time; we only see the densities in a latitude-local time frame at a relatively constant altitude, with a temporal resolution of about 95 min. Table 2 lists the essential information of the three selected datasets, CHAMP (Doornbos, 2011), GRACE (Bruinsma;unpublished) and GOCE (Bruinsma et al., 2014). The densities used in this study were smoothed to suppress variations with scales smaller than 600 km, and then down-sampled to 80 s cadence. The original GOCE densities can be obtained, after registration, on the ESA server (https://earth.esa.int). 
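The smoothing and down-sampling step described above can be illustrated with a minimal sketch. The filter actually used for the datasets is not specified in the text, so the boxcar average, the function name `smooth_and_downsample`, and the nominal orbital speed of 7.6 km/s are all assumptions made for illustration only.

```python
import numpy as np

def smooth_and_downsample(t_s, rho, speed_km_s=7.6,
                          min_scale_km=600.0, out_cadence_s=80.0):
    """Boxcar-smooth a uniformly sampled density series, then decimate it.

    Assumptions (not from the paper): a simple running mean is used as the
    low-pass filter and the along-track speed is a nominal LEO value.
    """
    t_s = np.asarray(t_s, dtype=float)
    rho = np.asarray(rho, dtype=float)
    dt = t_s[1] - t_s[0]                    # input cadence, assumed uniform
    # Time the satellite needs to cover the smallest retained spatial scale.
    window_s = min_scale_km / speed_km_s
    n = max(1, int(round(window_s / dt)))   # samples per smoothing window
    kernel = np.ones(n) / n
    # mode="same" keeps the length; values near both ends are edge-affected.
    smoothed = np.convolve(rho, kernel, mode="same")
    step = max(1, int(round(out_cadence_s / dt)))
    return t_s[::step], smoothed[::step]
```

With a 10 s input cadence, the 600 km scale corresponds to a window of about 8 samples, and every eighth sample is kept to reach the 80 s output cadence.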
Table 3 lists the dates, minimum Dst and maximum ap/Kp of the storms, and the satellite data available for model assessment. The intervals are selected to cover a strong storm sequence that returns to low geomagnetic activity (low Kp), and secondly, observations from two satellites are preferred. This led to a slightly different selection of storms than proposed in (Bruinsma et al., 2018). Thirteen storms, eleven of which were CME-driven and two driven by High Speed Streams, were selected for this assessment using the available data provided by the three satellites. The eight storms in 2005 are the so-called problem storms for the US Air Force (Knipp et al., 2013). Strong storms cause the largest satellite orbit perturbations, together with degraded tracking performance, and model assessment is most pertinent and needed for those events. The impact of weak storms is not dramatic, and only one is selected because it was a problem storm. Although only of moderate strength, the three storms in 2012 and 2013 were selected because they are the only ones for which high resolution density data is available in solar cycle 24, from GRACE and GOCE.
Table 3. Selected storm intervals and type (* = coronal mass ejection; ** = high speed stream), minimum Dst and maximum ap/Kp, satellite density data (CH = CHAMP, GR = GRACE, GO = GOCE) and rounded local time at equator (hr).
Fig. 1. The four phases of the assessment interval, centered on the time of minimum of (hourly) Dst. The 3-hourly Kp index is also shown because it is used in DTM2013, and in NRLMSISE-00 after conversion to ap.
The local time at the equator is also given in Table 3, and it is clear that the distribution is sparse for any single storm.

Metrics for model-data comparison
The updated metrics for storms differ from (Bruinsma et al., 2018) mainly in how the time interval for assessment is defined. Storms are divided into four phases, two before and two after the minimum Dst value. After verifying that differences with a physics-based definition of the phases are small, and to facilitate automation, it was decided to use intervals of fixed lengths for the phases. The phases correspond to pre-storm (1), onset (2), recovery (3), and post-storm (4). Figure 1 illustrates these four phases with respect to the minimum in Dst. The pre-storm interval is used to de-bias the model with respect to the observations by computing a scaling factor, which is then applied to the model densities in phases 1-4. The scaling factor is determined by computing the ratio of the sum of all observations to the sum of all model densities in phase 1. This de-biasing procedure is used to minimize the effect of non-storm related model errors on the assessment. All data for each storm are selected from 30 h before to 48 h after the minimum in Dst, which is defined as t0. The Dst and densities centered on t0 for all storms listed in Table 3 are displayed per satellite in Figure 2. The choice of fixed lengths of the intervals is supported by the density profiles shown in Figure 2. Note that GOCE is missing data in phase 1 for the storm in 2012; there is no data for GRACE. This example of a data gap in one of the phases, maintained in this analysis because of the few observed storms in solar cycle 24, is another reason for the sparseness of the storm density database. Data is regularly missing during storms. In this case for GOCE in 2012, the scaling factor for de-biasing was not determined according to the metrics, but with part of the data in phase 2.
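The phase assignment and de-biasing described above can be sketched as follows. Only the outer limits of the interval (30 h before and 48 h after the minimum in Dst) are stated in the text; the inner phase boundaries in `PHASE_EDGES_H` are hypothetical placeholders, and the function names are illustrative.

```python
import numpy as np

# Hours relative to the time of minimum Dst. Only -30 h and +48 h come from
# the text; the inner boundaries (-12 h, 24 h) are hypothetical placeholders.
PHASE_EDGES_H = (-30.0, -12.0, 0.0, 24.0, 48.0)

def assign_phase(hours_from_t0, edges=PHASE_EDGES_H):
    """Return the phase number (1-4) per sample, or 0 outside the interval."""
    h = np.atleast_1d(np.asarray(hours_from_t0, dtype=float))
    phase = np.digitize(h, edges)          # 0 below first edge, 5 from last edge
    phase[(phase < 1) | (phase > 4)] = 0
    return phase

def debias(obs, model, phase):
    """Scale the model by sum(obs) / sum(model), both evaluated in phase 1."""
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    pre_storm = phase == 1
    scale = obs[pre_storm].sum() / model[pre_storm].sum()
    # The same factor is applied to the model densities in all phases.
    return scale, scale * model
```

Using a ratio of sums rather than a mean of ratios follows the description in the text; samples with `phase == 0` would be discarded before computing any metric.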
The models and data can be compared by computing density residuals, which are absolute differences (observed minus computed). A better quantity to express a model's skill in reproducing the observations, i.e. reality, is the observed-to-computed (O/C) density ratio. Density ratios of one indicate perfect duplication of the observations, i.e. an unbiased model that reproduces all features; deviation from unity points to underestimation (larger than one) or overestimation (smaller than one). Because of the very large dynamic range in density, mainly due to differences in altitude and solar activity (i.e. phase of the solar cycle), it is rather difficult to analyze and interpret model performance in absolute values. The relative precision given in the form of density ratios is always simple to comprehend. A model bias, i.e. a mean of the density ratios that differs from unity, is most damaging to orbit extrapolation because it causes position errors that increase with time. The standard deviation (SD) of the density ratios, computed as percentage of the observation, represents a combination of the ability of the model to reproduce observed density variations, and the geophysical noise (e.g. waves, the short duration effect of large flares) and instrumental noise in the observations. The mean and SD of the density ratios, due to their distribution, are computed in log space over the N observations (Sutton, 2018). The correlation coefficients R are also computed. The correlation coefficient is independent of model bias and R² represents the fraction of observed variance captured by the model. Mean, SD and correlation are computed for each storm, for each separate phase as well as for the entire interval. These metrics are the same as in (Bruinsma et al., 2018) but applied to the defined storm intervals only in order to isolate the performance of the geomagnetic storm algorithm of the models.
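A minimal sketch of log-space statistics of this kind is given below. It assumes the common geometric-mean form, mean(O/C) = exp[(1/N) Σ ln(O_n/C_n)], with the SD taken as the root-mean-square deviation of ln(O_n/C_n) about its mean; the exact expressions, and the conversion of the SD to a percentage, should be taken from Sutton (2018). Computing R on the densities themselves rather than on their logarithms is also an assumption.

```python
import numpy as np

def ratio_metrics(obs, model):
    """Log-space mean and SD of the O/C density ratio, plus correlation R.

    Assumed forms (to be checked against Sutton, 2018):
      mean = exp(mean(ln(O/C)))
      sd   = sqrt(mean((ln(O/C) - mean(ln(O/C)))**2))   (in log units)
    """
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    log_ratio = np.log(obs / model)
    mean_ratio = np.exp(log_ratio.mean())
    sd_log = np.sqrt(np.mean((log_ratio - log_ratio.mean()) ** 2))
    # Pearson correlation between observed and modeled densities.
    r = np.corrcoef(obs, model)[0, 1]
    return mean_ratio, sd_log, r
```

A model that is low by a constant factor gives a mean ratio equal to that factor, zero SD, and R = 1, which illustrates that bias and correlation are independent measures.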
Fig. 4. The mean μ and the standard deviation σ of the 24 mean density ratios, per phase (black) and overall (red), using data from three satellites.
Fig. 5. The mean μ and the standard deviation σ of the 24 standard deviations of the density ratios (SD; %), per phase (black) and overall (red), using data from three satellites.
A second assessment, and updated metrics compared with (Bruinsma et al., 2018), concerns the amplitude and timing of the maximum density peak considering the entire time interval. The absolute relative amplitude error is expressed as a percentage of the measured maximum, and the timing of the peak with respect to the observed peak is expressed in hours, an example of which is shown in Figure 3. However, these two quantities cannot always be determined unambiguously, for example when two peaks are present, or a broad peak is present. For that reason, we have in the next section rejected results that could not be well determined.
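The peak amplitude and timing comparison can be sketched as below, assuming that the observed and modeled series share the same time stamps and that each has a single, well-defined maximum. The function name `peak_errors` is illustrative; ambiguous cases (double or broad peaks) would have to be rejected beforehand, which this sketch does not attempt.

```python
import numpy as np

def peak_errors(t_hours, obs, model):
    """Amplitude error (% of observed maximum) and peak delay (hours).

    The delay is modeled minus observed peak time, so a negative value
    means that the model peaks too early.
    """
    t = np.asarray(t_hours, dtype=float)
    obs = np.asarray(obs, dtype=float)
    model = np.asarray(model, dtype=float)
    i_obs = int(np.argmax(obs))     # index of the observed density maximum
    i_mod = int(np.argmax(model))   # index of the modeled density maximum
    amp_err_pct = 100.0 * abs(model[i_mod] - obs[i_obs]) / obs[i_obs]
    delay_h = t[i_mod] - t[i_obs]
    return amp_err_pct, delay_h
```

Because the amplitude error is taken as an absolute value, the sign of the amplitude difference (under- or overestimation) has to be tracked separately if it is needed.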

Storm-time assessment results
The means of the density ratios per phase are displayed in Figure 4 for the five models, using different colors per satellite. The printed numbers are the means (μ) and standard deviations (σ) of the 24 colored symbols plotted vertically for each phase (in black) and for the entire storm event (phases 1-4; in red). The de-biasing (i.e. applying the scaling factor determined in phase 1 to all model densities in phases 1-4) results in density ratios close to unity in phase 1, and less scatter (smaller σ) in all phases. The TIE-GCM runs with the Heelis and Weimer models as drivers are named TIEGCM-H and TIEGCM-W, respectively. From the satellite operational point of view, the overall mean is the most important number to consider as it directly relates to the satellite position at the end of the storm. It informs on the performance of the model over a complete storm, and a mean density ratio of one means that thermospheric density (proportional to satellite aerodynamic drag) was correct on average. Densities that were predicted too large compensated those too low and vice versa during the storm interval, leading to a correct mean density and consequently a correct satellite position at the end of the storm (but not necessarily during the storm). The most accomplished SE (FP) model according to this criterion is DTM2013 (TIEGCM-W), which obtains a mean of 0.99, i.e. only 1% mean bias, and a small standard deviation of 0.07. DTM2013, JB2008 and CTIPe have stable means per phase, i.e. the bias does not evolve significantly as a function of storm activity. NRLMSISE-00 has a clear storm signature, underestimating the density during storm phase 2 and even more during phase 3. TIE-GCM underestimates density in Phase 2, but the Weimer model radically improves performance in the decaying storm phase (Phase 3), enhancing the mean from 0.90 to 0.97 and reducing the standard deviation from 0.16 to 0.09.
Fig. 6. The mean μ and standard deviation σ of the 24 correlation coefficients per phase (black) and overall (red), using data from three satellites.
Fig. 7. The absolute relative amplitude difference (top) and the modeled minus observed delay in time with respect to the maximum density peak, i.e. negative values mean that the model peaks too early (bottom) for 11 storms, using data from three satellites.
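The summary numbers printed in Figures 4-6, i.e. the mean and standard deviation over the 24 storm-satellite cases for each phase and for the whole interval, can be sketched as follows. The array layout and the function name `summarize_cases` are illustrative, and the use of the population standard deviation (ddof = 0) is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def summarize_cases(values):
    """Mean and standard deviation over cases for each column.

    values: array of shape (n_cases, n_columns), e.g. 24 rows (one per
    storm-satellite pair) and 5 columns (phases 1-4 and the whole interval).
    """
    v = np.asarray(values, dtype=float)
    mu = v.mean(axis=0)
    sigma = v.std(axis=0)   # population SD (ddof=0): an assumption
    return mu, sigma
```

The same helper applies unchanged to the per-case mean density ratios, the per-case SDs, and the per-case correlation coefficients.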
The mean standard deviations of the density ratios (SD; %) per phase are displayed in Figure 5. The mean standard deviation for GOCE (green symbols) is smallest, followed by CHAMP (red symbols), and GRACE (blue symbols) has the largest mean standard deviation; this nicely demonstrates that relative variability increases with altitude, although it is partly due to the data becoming noisier too. All models display significant degradation during the main storm phases 2 and 3, and all have the worst performance (largest standard deviation) for phase 2. The standard deviation of the SE models nearly doubles from phase 1 to 2, and the largest standard deviations are seen for NRLMSISE-00 and JB2008 (33.0%). JB2008 then achieves the best performance for phase 3, while NRLMSISE-00 and CTIPe have the largest mean standard deviations (29.5%) for that phase. The most accomplished SE and FP models over the entire storm are DTM2013 and TIEGCM-W, which have means of 25.5% and 28.3%, respectively. However, the positive impact on the standard deviation thanks to using the Weimer model instead of the Heelis model in TIE-GCM is very modest.
The mean correlation coefficients per phase are displayed in Figure 6. As expected, all models have reduced correlations in phases 2 and 3, and the lowest correlations are reached in phase 2. In line with results shown in Figures 4 and 5, the highest mean correlations over the entire storm are attained with DTM2013 and TIEGCM-W, which have a mean of 0.85.
The amplitudes and phases of the peaks in density are compared for 11 storms (18/01/2005 and 11/09/2005 do not allow unambiguous testing) and the results are displayed in Figure 7. The amplitudes are on average underestimated with NRLMSISE-00, JB2008, and TIE-GCM, whereas DTM2013 and CTIPe overestimate them. CTIPe and JB2008 estimate the amplitudes with the smallest mean difference of slightly over 20%, while NRLMSISE-00 underestimates on average by 60%. DTM2013 and NRLMSISE-00 on average predict the storm peaks too early, while TIE-GCM and CTIPe by a small amount (0.37 h) predict them too late. The phasing of the storm peak is best on average with JB2008, which has a 4-min mean delay; however, its standard deviation is the largest of all models, 4 h. The largest mean timing error of 2.89 h is reached with TIEGCM-H; using the Weimer instead of the Heelis model improves the results considerably to an average delay of 0.51 h (the amplitude is more correct too).

Summary and conclusions
The density data and indices for the 13 selected storms as well as updated metrics for thermosphere model assessment have been described in this paper. All storms are divided into four phases, which are relative to the time of minimum Dst. The mean and standard deviation of the density ratios, the correlations and the amplitude and timing of the peak density, are computed using available CHAMP, GRACE and GOCE density data and the CIRA models NRLMSISE-00, JB2008 and DTM2013, and the first principles models CTIPe and TIE-GCM using Heelis or Weimer drivers. The best results over the entire 4-phase storm period are obtained with DTM2013 and TIEGCM-W, while the oldest model, NRLMSISE-00, is the least precise. Compared to the assessments presented in (Bruinsma et al., 2018), in which the same models except TIEGCM-W were evaluated using entire years of data, this study confirms that best results are obtained with DTM2013 and that NRLMSISE-00 is trailing. During storms, TIEGCM-W is more precise than JB2008, NRLMSISE-00 and CTIPe, and using the Weimer instead of the Heelis model has a large impact during storms. The standard deviations of the density ratios increase with altitude, as in (Bruinsma et al., 2018), but around 450 km (GRACE) they are 40-60% during storms, which is 2-3 times larger than when comparing over entire years.
An important result is that the means of the density ratios of DTM2013, JB2008 and CTIPe do not significantly depend on the storm phase (1-4), even if the standard deviations very much do for all models. NRLMSISE-00 underestimates during the onset and decaying phases (2 and 3) of storms, TIEGCM-H underestimates in phase 2 and overestimates in phase 3, and TIEGCM-W underestimates in phase 2. The results per phase, and the fidelity of the storm peak amplitude and time of maximum, showed strengths and weaknesses that the model developers can presently focus on.
The current model assessment is far from comprehensive. A more thorough model assessment under storm conditions, e.g. of solar cycle, local time and altitude effects, requires more and better-distributed observations of density. In this study, with data from two (or one) satellites, density variations are monitored in four local time planes at best, and that only at two altitudes. Only four Kp = 9 storms were observed since 2000, thanks to CHAMP, GRACE and GOCE; bear in mind that the assessment is possible only thanks to data of opportunity, since the objective of those missions was geodesy.