Evaluation of the DTM-2009 thermosphere model for benchmarking purposes

Aims: The Drag Temperature Model (DTM) is a semi-empirical model describing the temperature, density, and composition of the Earth’s thermosphere. DTM2009 and DTM2000, and the COSPAR reference models NRLMSISE-00 and JB2008, are evaluated in order to establish benchmark values for new DTM models that will be developed in the framework of the Advanced Thermosphere Modelling and Orbit Prediction (ATMOP) project. Methods: The total density data used in this study, including the high-resolution CHAMP and GRACE data, cover the 200– 1000 km altitude range and all solar activities. DTM2009, using an improved DTM2000 algorithm, was constructed with most data assimilated in DTM2000, but also with CHAMP and GRACE data. The bias and precision of the four models is evaluated by comparing to the observations according to a metric, which consists of computing mean, RMS, and correlation. Secondly, the residuals are binned, which procedure aids in revealing specific model errors. Results: This evaluation shows that DTM2009 is the most precise model for the data that were assimilated. Comparison to independent density data shows that it is also the most accurate model overall and a significant improvement over DTM2000 under all conditions. JB2008 is the most accurate model below 300 km, JB2008 and DTM2009 perform best in the 300–500 km altitude range, whereas above 500 km NRLMSISE-00 and DTM2009 are most accurate. The precision of JB2008 decreases with altitude, which is due to its modeling of variations in local solar time and seasons in particular of the exospheric temperature rather than modeling these variations for the individual constituents. Specific errors in DTM2009, for example related to the employed solar activity proxy, will be fixed in the next model release, DTM2012. A specific analysis under geomagnetic storm conditions is outside the scope of the present paper.


Introduction
Atmospheric density models are, besides their use in atmospheric studies, required in the computation of the atmospheric drag force in satellite orbit determination and prediction.They predict instantaneous temperature and (partial) density as a function of the location (altitude, latitude, longitude, local solar time), solar and geomagnetic activities, and day-of-year.The latest version of the Drag Temperature Model (DTM2009), is evaluated in this paper together with the COSPAR reference models NRLMSISE-00 (Picone et al. 2002) and JB2008 (Bowman et al. 2008), and the last published version of DTM, DTM2000 (Bruinsma et al. 2003) by comparison to total density data.
The evaluation of DTM2009 is done in the framework of the Advanced Thermosphere Modelling and Orbit Prediction project (ATMOP; http://www.atmop.eu),which is a European Union 7th Framework project.The main purpose of the present evaluation is to establish benchmark values for the new DTM models that will be developed and released in the autumn of 2012 (DTM2012) and 2013 (DTM2013).The main objective of the ATMOP research project is to update the DTM thermosphere model and to develop an operational version.It will enable atmospheric density calculation, which is mandatory for improved survey and precise tracking of objects in Low Earth Orbit (LEO) and the initiation of appropriate measures to minimize risks to satellites (tracking loss, collisions) and ground assets (re-entry zone).A full description of the project is given in the Science Document, which is available on the website.
Besides serving as benchmark values necessary to quantify the improvement of the ATMOP models over the a-priori DTM2009 model, a complete evaluation of the above-listed models has never been published using an extensive total density database and applying the same test metric.Recent evaluations of some models or with only a few satellites are given in Marcos et al. (2006) and Doornbos (2012).Specifically, model bias and precision as a function of altitude, latitude, local solar time, season, and solar-and geomagnetic activity are currently only approximately known.A detailed evaluation can actually be performed only recently thanks to the high-resolution total density data inferred from CHAMP (Reigber et al. 1996) and GRACE (Tapley et al. 2004) accelerometer measurements over almost a full solar cycle.These density data sets only allow the fine analysis of the latitudinal structure per-orbit (about 95 min) basis, and from pole-to-pole.However, the density database is augmented for completeness with total densities inferred from: d Accelerometers onboard Atmosphere Explorers-C and -E (AE-C/E); d Two-Line Elements (TLEs) of a selection of specific spacecraft; d Stella and Starlette precise orbit determination; d Deimos-1 precise orbit determination; d GOCE thruster data during drag-free flight; d Densities inferred from energy dissipation rates (EDR densities).
The densities listed under bullets 2-4 are low-resolution observations, averaged over 1 day or longer, but their spatial and solar cycle coverage is good.AE-C/E and GOCE data are of high resolution and moreover at low altitudes.Bruce Bowman has kindly made the EDR data under bullet 6 available as additional validation data in the 200-500 km altitude range.
The models tested in this paper, which are constructed by fitting to the underlying density database as good as possible in the least-squares sense (i.e.semi-empirical model), reproduce the mean climatology of the thermosphere.The spatial resolution of these models is of the order of thousands of kilometers.As a consequence, all density variations at smaller scales are sources of geophysical noise.The solar-and geomagnetic proxies limit the temporal resolution of these models to 1 day and 1-3h, respectively.Therefore, small-scale and high-frequency density perturbations, which are primarily present at high latitudes, cannot be modeled either and contribute to the prediction uncertainty.
The next section reviews the model and data used in this study.The test results are presented in Section 3 and discussed in Section 4. The last section summarizes the findings.Note that a specific analysis under geomagnetic storm conditions has not been done in the present study.This will be done in a separate study next year in order to evaluate new geomagnetic indices, developed within ATMOP, and the model DTM2013.

Model and data
The five distinct types of density data as well as the (external to the project) data made available by Bruce Bowman that are used in this evaluation are described in Section 2.1.Section 2.2 summarily presents the DTM2009 model and lists the notable differences with the other models in this evaluation.Reigber et al. 1996) in the altitude range 450-250 km, and the Gravity Recovery and Climate Experiment (GRACE; Tapley et al. 2004) near 490 km altitude using the methodology described in Bruinsma et al. (2004).The CHAMP data set used in this study covers the period 20/05/2001 through 2/9/2010, i.e. up to 2 weeks before atmospheric reentry.The GRACE data cover the period 1/3/2003 through 31/12/2010.Both data sets cover solar cycle maximum to minimum conditions.CHAMP (GRACE) 24-h local solar time sampling is achieved approximately every four (five) months.The accelerometers provide high-resolution measurements from which densities are inferred with 80 and 40 km in-track resolution for CHAMP and GRACE, respectively, from pole-to-pole.The horizontal resolution of these observations is more than an order of magnitude higher than that of the models.The Atmosphere Explorer (AE) satellites also carried accelerometers in their payload, and densities have been derived from the Miniature Electrostatic Accelerometer (MESA) instruments.These data are made available by NASA's National Space Science Data Center.In this study we use density data from AE-C (68.1°inclination) from 1973.0 to 1976.3, and AE-E (19.7°inclination) from 1975.0 to 1977.8, in the altitude range 150-250 km.The in-track resolution of these measurements is approximately 120 km.

Total density inferred from Starlette and Stella orbit analysis
The cannonball-shaped geodetic satellites Stella and Starlette are in a 96°inclination and near-circular orbit at approximately 813 km altitude, and 49°inclination slightly eccentric orbit (e = 0.02) with a perigee altitude of 800 km, respectively.They are suitable spacecraft for this kind of analysis because of their spherical shape (no attitude-related errors), perfect knowledge of the characteristics (mass, surface, reflectivity), and the very accurate laser tracking by the International Laser Ranging Service (ILRS; Pearlman et al. 2002) for more than a solar cycle.The densities are inferred from the analysis of orbit perturbations, a technique first used in 1962 (Jacchia & Slowey 1962) that essentially ties the observed decay to a mean density, and span the periods 1992-2010 and 1994-2010 for Starlette and Stella, respectively.The GINS precise orbit determination software (Marty et al. 2011), employing a state-of-the-art force model, is used in the reduction of the laser ranging data in 6day orbits.Drag scale factors are estimated on a daily basis and they are connected linearly in the orbit computation, i.e., the background model NRLMSISE-00 densities are rescaled using factors linearly evolving in time.The densities are however not purely scaled copies of NRLMSISE-00 because the model for the drag coefficient C d used in GINS (Sentman 1961) also holds small latitudinal variations because of its dependence on atmospheric composition and temperature.The drag coefficient varies from 2.15 to 2.38 from high to low solar activity.These values correspond to a model for diffuse reflection with full accommodation.For 80% or 60% accommodation, the drag coefficients are approximately 10% and 20% larger, i.e., the inferred densities would be 10% and 20% smaller.Unfortunately, the true accommodation is not known and the topic of drag coefficient modeling is still largely an open question even for spherical satellites.
The Stella densities are downsampled to 2100 km (300 s) spacing along the orbit, which is sufficient in view of the resolution of the models.This procedure, which favors comparisons to NRLMSISE-00, is used because density scale factors estimated in the orbit determination cannot pinpoint the location of the density model error.Computing daily averages of both data and models would solve this problem, but our analysis software can only process one density at a time at a defined location.
The Starlette densities, because of the slightly eccentric orbit, can be referenced to the perigee positions.Similar to the Stella processing, the background model NRLMSISE-00 densities at the perigees are rescaled using factors linearly evolving in time.The orbit determination yields 14 density values per day, which are averaged per day (Starlette perigee positions can be considered constant over 24h).

Total density inferred from TLE data
The TLE data (mean orbital element sets; result of the orbit adjustment using radar tracking data) used in this study are available on the celestrak.comwebsite.Mean total mass densities for the period 1967-2010 are derived for 68 objects using the TLE method described in Picone et al. (2005).The subset of 68 spacecraft is selected because the ballistic coefficients are rather stable and it covers the altitude range of interest here, namely 200-1000 km.The densities obtained with this method are averaged over periods of 3 days and longer and time stamped in the middle of the interval.The accuracy of these data is at the 20-30% (depending on altitude and phase of the solar cycle), and all variations with smaller time scales than the averaging period (e.g.due to geomagnetic storms, but also the daily solar flux) are much attenuated.

Total density inferred from Deimos-1 orbit analysis
Precise orbit determination of Deimos-1, a small Earth observation satellite in a circular Sun-synchronous orbit (LST approximately 10:40/22:40) at about 680 km, using the navigation bulletins of the onboard GPS receiver was performed for 2010-2011.The satellite has a fixed attitude and a simple shape (cube of about 90 kg), thus minimizing errors in the drag computation due to uncertainties in the frontal area perpendicular to the speed.The inferred densities are comparable in quality to the Stella/Starlette densities, since they are computed using the GINS orbit determination software but with a different type of tracking data.The densities are again daily scaled NRLMS-ISE-00 densities along the orbit with an approximate spacing of 1800 km.

Daily total densities by means of the EDR method
Daily-mean densities inferred from satellite drag data of 23 satellites in elliptical orbits, with perigee heights in the 200-500 km range, were kindly provided by Bruce Bowman for this evaluation.Latitude coverage is from pole-to-pole taking all objects into account, and most satellites cover 2-3 solar cycles of which we only use the last (1997)(1998)(1999)(2000)(2001)(2002)(2003)(2004)(2005)(2006)(2007)(2008)(2009)(2010).The densities are computed from the estimated energy dissipation rates, which are derived from the orbits fitted directly to radar and optical tracking data.This method thus benefits from the true accuracy of the classified radar data, which result in much more accurate densities than can be computed with TLE data as will be shown in Section 3. The uncertainty of the daily densities is estimated to be 2-4%.For a complete description of the method and the data, we refer to Bowman et al. (2004).

Data distribution
The distribution of the data sets described in the previous subsections is summarized in Figure 1 as a function of mean solar flux and altitude.The TLE and EDR densities are available for the entire period and the full and 200-500 km altitude range depicted, respectively, and therefore they are not shown explicitly.Although the coverage is not complete, it is sufficient to evaluate the models over the most crucial range of altitudes (200-500 km) and for minimum to maximum solar activity conditions.

Upgrading of the DTM2000 algorithm: DTM2009
The representation of the total density in the altitude range 120-1500 km is achieved by summing the contributions of the main thermosphere constituents (N 2 , O 2 , O, He, H), under the hypothesis of independent static diffuse equilibrium (which is not always attained (Aikin et al. 1993), but its effect on density is not taken into account).The height function f i (z) per constituent i is the result of the integration of the differential equation of diffusive equilibrium: with T 1 = exospheric temperature, a = thermal diffusion coefficient for He and H (À0.38), c i = m i g (120 km)/(rkT 1 ), m i = atomic or molecular mass, g (120 km) = gravity acceleration at 120 km altitude, r = relative vertical temperature gradient dT 120 /(T 1 À T 120 ), It is employed in all DTM models (DTM78, Barlier et al. 1978;DTM94, Berger et al. 1998;DTM2000, Bruinsma et al. 2003).The partial densities specified at 120 km altitude are propagated to higher altitudes employing this function.The exospheric temperature and the partial density variations as a function of the environmental parameters L (latitude, local solar time, solar flux, and geomagnetic activity) are modeled by means of a spherical harmonic function G(L).The total density q at altitude z is then calculated as follows: DTM2009 models the exospheric temperature and the atmospheric constituents each with up to 50 coefficients.The function G is used to describe periodic and non-periodic variations.Periodic variations are defined as annual and semi-annual terms, as well as diurnal, semidiurnal, and terdiurnal terms.The non-periodic terms consist of constant zonal latitude coefficients, and coefficients relating solar-and geomagnetic activities to temperature and density.Noteworthy extensions to the function G with respect to the one used in DTM2000 are direct coupling of diurnal, semidiurnal, and seasonal amplitudes with mean solar activity.
2.3.Relevant information on NRLMSISE-00 and JB2008 NRLMSISE-00 employs F10.7 and the ap index to model temperature and major and minor constituents.JB2008 is constructed using a combination of solar-and geomagnetic proxies and indices, which are not all available before 1997 (SOHO/SEM in particular), to reconstruct temperature and total density.The JB2008 model can only be compared to density data from 1997 through the present.Figure 2 shows the JB2008 and NRLMSISE-00 model predictions for low solarand geomagnetic activities conditions (20 October 2009; 81day mean F10.7 = 72, Ap = 3) for two altitudes, 400 and 800 km.The large difference in structure at 800 km is due to the simpler algorithm used in JB2008.The consequences for the model performance of this simplification are shown in Section 3.2.
The density data described in Section 2.1 have not been assimilated in NRLMSISE-00, whereas a few years of CHAMP data were used in JB2008 for storm-time modeling only.However, nearly all CHAMP, GRACE, Starlette, and Stella data have been assimilated in DTM2009.The EDR densities have been assimilated from 1997 to 2007 in JB2008.Table 1 lists the density data used in this evaluation and specifies if they were assimilated in the models.This has to be taken into account when interpreting the results of the comparisons that are presented in the next section.

Results
The models are evaluated by computing the mean and RMS of the density ratios and residuals, which we have defined as ''observed-to-calculated'' (O/C) and ''observed minus calculated'' (O-C).They reflect relative and absolute precision of the models, respectively.A model bias, i.e. the mean, is most damaging in orbit extrapolation because it causes position errors that increase with time.The RMS represents a combination of the ability of the model to reproduce the observed variations and the geophysical and instrumental noise in the observations.The correlation coefficients R are also computed.Contrary to RMS, they are insensitive to model bias and R 2 represents the fraction of observed variance captured by the model.The density ratios are binned as a function of altitude, latitude, local time, month, mean F10.7, and Kp.Analysis of the mean and RMS of the ratios may give clues about the modeling errors, as well as about data inconsistencies.Specific results are presented in the following subsections.The comparisons to the data described in Section 2.1 are presented in the same order in Section 3.1.
The density data may be biased due to errors and approximations in the satellite macro models, in the drag coefficients, and surface-to-mass ratios used in their derivation.In this study we assume that the JB2008 model (i.e. the COSPAR reference for drag computation) has no significant bias, at least below 500 km, and when necessary the data are rescaled accordingly.

Comparison to CHAMP, GRACE, and AE-C/E densities
The mean and RMS, and the correlation coefficients, are computed per year for these data sets.JB2008 models the CHAMP densities without bias over the period 2001-2009 (the mean density ratio is 1.00).The GRACE densities were scaled in such a way as to be consistent with the CHAMP densities.This was done by comparison of normalized densities (to the average altitude) for two periods (in 2003 and 2005) for which the orbits of both satellites were coplanar, and then scaling the GRACE to the CHAMP densities (Bruinsma & Forbes 2010).The scale factor applied to the GRACE densities is 1.23.The mean and RMS of the CHAMP density ratios and residuals per year are displayed in Figure 3, respectively.The GRACE results per year are shown in Figure 4.The best results are obtained, as expected, with DTM2009.The test statistics for the complete data sets are summarized in Table 2.
A notable feature in the CHAMP and GRACE figures is the dip in 2008 at the solar minimum, which can be explained by the very low solar EUV emissions.The actual EUV flux is overestimated with the F10.7 proxy during the last solar minimum, and consequently models overestimate density (Solomon et al. 2010).Only the 2010 data were not assimilated in DTM2009, but that year all models deviated in a similar way, which may point at data problems.This is very likely due to the much less stringent thermal control of the accelerometers, in case of GRACE caused by insufficient battery power.Since 2011 the GRACE accelerometers have to be turned off during eclipse periods for that reason.The correlation coefficients of the models are all 0.9 or higher for CHAMP, except for DTM2000 (0.86-0.95), and always 3-6% less for GRACE.The smallest correlation coefficients for DTM2009 and JB2008 are for the year 2009.
The RMS of the density residuals follows the solar cycle, i.e., the absolute error is largest during solar maximum, whereas the relative precisions present an opposite variation: the models are least precise relatively during solar minimum.This is due to the amplitude of the relative density variations (variation with respect to a mean or trend), which was found to be inversely proportional to solar activity (Bruinsma & Forbes 2008).This effect is clearly visible for GRACE in Figure 4 and much less in Figure 3, because CHAMP's altitude decreased about 200 km over this period (Fig. 1) thereby counteracting the solar cycle trend.A more general assertion is that the absolute error is proportional to the magnitude of density; the magnitude increases with solar activity but also as the satellite decays.The relative precision is inversely proportional to solar activity and altitude (because relative density variations amplify with altitude).
Binning the density ratios in the model input parameters has tested the quality of the models with respect to specific variations.The most interesting results are obtained when the density ratios are averaged in 2-h local time bins, and secondly in latitude bins of 20°. Figure 5 presents the local time binned CHAMP density ratios when all the data are used, and the bias of NRLMSISE-00 and DTM2000 is clearly visible.DTM2000 has a marked terdiurnal signature, DTM2009 a slight one but a pronounced peak at LST = 13, whereas NRLMSISE-00 displays a more diurnal variation.JB2008 has the smallest variations.Figure 6 shows the latitude binned CHAMP and GRACE density ratios, which we only show for the two best models, DTM2009 and JB2008.The small bias of DTM2009 (most profiles lie within the ±5% interval indicated by the dashed lines) without apparent latitudinal structure is clearly visible, as well as the large biases for 2010 for both models and both satellites.The year 2008 stands out also, except for   DTM2009 and GRACE data.JB2008 has the highest biases at the equator for CHAMP, but at high latitude for GRACE.For CHAMP, we also observe a clear and unexplained separation between the profiles for years 2001-2005 and 2006, 2007, and 2009.As stated in Section 2.2, the AE-C/E densities cannot be compared to JB2008 because all indices are not available before 1997.The best results are obtained with DTM2000, which model assimilated these data.The DTM2009 and NRLMS-ISE-00 results are slightly more biased, underestimating the densities by 5-10%.However, if we assume that DTM2009 is unbiased (supported by Sects.3.3 and 3.5) and in view of that rescale the MESA data, best results are obtained with DTM2009.The test statistics of the comparison to AE data are listed in Table 3.

Comparison to Starlette and Stella densities
The Starlette and Stella densities from before 1998 cannot be compared to JB2008.All data are assimilated in DTM2009 except for 2010, and Starlette from 1992 to 1993.The Stella data are used per year, whereas the Starlette densities are compared in four batches, corresponding to high-medium-low solar activity, and the 2-year validation set 1992-1993.The data have not been rescaled using JB2008 because they are accurate thanks to the satellite shapes and quality of the orbit fits.This choice is justified in Section 4.
The mean and RMS of the Stella density ratios and residuals per year are displayed in Figure 7, respectively.The Starlette results are shown in Figure 8.The best results are obtained, again as expected, with DTM2009.Most importantly, it has the smallest means and RMS of the residuals (bias) during solar cycle maximum 2000-2002.The RMS of the density ratios computed for the entire period 1994-2010 is also the smallest, 0.14, NRLMSISE-00 being the second best with 0.15.
A prominent trait in the figures is the rather low relative precision (RMS of the density ratios of about 0.60, compared to 0.15-0.20 for the other models) and the systematically higher RMS of the residuals of JB2008 for medium-to-low solar activity (years 2003-2010), whereas the mean of the residuals is small.This is due to the modeling of variations in solar-and geomagnetic activities, local solar time, and season for each constituent separately, Helium and Oxygen in particular (DTM and NRLMSISE-00), as compared to the modeling of the variations in exospheric temperature only (JB2008).
The models' latitudinal structures as a function of solar activity and season are very different because Helium mainly sits in the winter hemisphere, which causes the density maximum to be over the winter pole at Stella/Starlette altitude during solar minimum conditions.This so-called winter Helium bulge (Keating & Prior 1968) is absent in JB2008, which model predicts density maxima always in the summer hemisphere.The resulting erroneous structure causes the larger RMS of the residuals and the small correlations shown in Figure 9.However, on average (i.e. the mean of the density residuals) JB2008 is better than NRLMSISE-00 and DTM2000, and only slightly worse than DTM2009.
A second notable point is the means of the residuals during the minima of solar cycles 22 (1995-1996) and 23 (2007-2009) in Figures 7 and 8.For NRLMSISE-00 and DTM2000, the means during the cycle 22 minimum are higher than for cycle 23, which is probably due to the lower densities during the latter cycle caused primarily by the lower solar EUV emissions (Solomon et al. 2010).The same behavior is not seen in DTM2009, which has assimilated the Stella and Starlette data and therefore has accommodated this factual density decrease in some (invalid) way.This flaw can be corrected for in a model update employing a proxy or index that takes, contrary to F10.7, the lower EUV emissions during solar cycle 23 well into account.
The Starlette and Stella data have also been binned.Because of the complete solar cycle coverage, the most interesting results are obtained when the density ratios are averaged in mean solar flux bins.Figure 10 shows this for Stella, which due to its orbit configuration is more suitable to reveal solar activity related modeling errors than Starlette.The JB2008 model has  highly variable bias, ranging from 1.02 to 1.30, but without any recognizable signature.NRLMSISE-00 biases increase about linearly from low to high solar activity, whereas DTM2000 appears to have a quadratic component (due to not correctly modeling the nonlinear effect of mean F10.7 on density).The biases are rather constant with DTM2009, the largest excursion being for the 190-200 sfu bin.

Comparison to TLE densities
The TLE densities are only used in the interval 1998-2010 imposed by the availability of the JB2008 indices.Secondly, five altitude intervals are selected that are representative of different chemical composition and cover the accelerometerinferred densities: 200-400 km, 400-500 km, 500-600 km, 600-700 km, and 700-1000 km.The JB2008 model, assuming that it is unbiased on average, is used to rescale the densities.
The scale factors for the densities determined in this procedure and applied in the comparisons are 1.22, 1.16, 1.12, 1.05, and 1.05 for the four altitude intervals, respectively.Figure 11 gives the results of the comparisons, and it reveals that DTM2009 has nearly the same absolute scale as JB2008 (i.e. the mean of the density residuals and ratios).On the other hand, the NRLMS-ISE-00 and DTM2000 absolute scales are quite different below 600 km (and presumably wrong).The relative precision (RMS of the density ratios) of DTM2009 is best, whereas JB2008, despite its unbiasedness due to scaling, performs significantly worse than the other models at all altitudes.However, the absolute precision of JB2008, which is most important for the objective it was developed for, namely orbit extrapolation, is equivalent to the other models.The correlation coefficients of the models are displayed in Figure 12.Except for the 200-400 km altitude range for which they are equal, DTM2009 and NRLMSISE-00 always have higher correlation coefficients than JB2008 and DTM2000.The correlation decreases with altitude for all models, but more notably so for JB2008 for reasons explained in Section 3.2.The JB2008 correlation coefficients are noticeably higher for the TLE data in the 700-1000 km interval than for the pointwise Starlette densities (Fig. 9), because they are mean densities at the equator.Binning of the TLE data revealed some interesting model differences.Figure 13 shows the TLE density ratios in the 200-400 km range binned in day-of-year.There is a clear offset of almost 10% between JB2008 and DTM2009 on the one hand and NRLMSISE-00 and DTM2000 for the second half of the year.All models present a peak around the March equinox, whereas nothing is visible around the September equinox.Annual or semi-annual signals are not visible, indicating that    these variations are well modeled at the equator (note: the models contain symmetric and asymmetric, seasonal terms).The TLE density ratios in the 700-1000 km range binned in solar local time are presented in Figure 14.The scatter of all models except DTM2009 is much larger at this altitude than for CHAMP (Fig. 5).However, DTM2009 exhibits a small (a few percent) semidiurnal signal.A diurnal signal is revealed in the JB2008, NRLMSISE-00, and especially DTM2000 (about 20%) density ratios.The TLE densities were not used in the construction of DTM2009, or NRLMSISE-00, and only indirectly in JB2008, so these comparisons constitute independent validation.

Comparison to Deimos-1 densities
The Deimos-1 densities have not been rescaled because Helium is the main constituent for a large part of each orbit, and therefore JB2008 is not the best choice to do so for reasons elaborated in Section 3.2.Secondly, the TLE densities in the 600-700 km and 700-1000 km intervals compare about equally well on average with DTM2009, JB2008, and NRLMSISE-00, i.e., the models have nearly the same absolute scale for the highest altitude intervals of the evaluation; only JB2008 has a significant bias with Deimos-1 data.The Deimos-1 densities equally were not assimilated in any of the tested models, so these comparisons constitute independent validation.Table 4 lists the results of the tests, and it shows that NRLMSISE-00 has the highest relative and absolute precision.The large bias and RMS values for JB2008 are again due to the erroneous latitudinal structure during low solar activity at altitudes above 500 km.This leads to a correlation coefficient of 0.87, compared to 0.89, 0.93, and 0.96 for DTM2000, DTM2009, and NRLMSISE-00, respectively.
Binning the Deimos-1 data is only meaningful in day-ofyear because of the Sun-synchronous orbit and the limited solar cycle coverage.This is presented in Figure 15.Again a peak is detected close to the March equinox for all models.We also note a rather similar evolution of the means of NRLMSISE-00 and DTM2009, and quite different for DTM2000 and more so for JB2008 (see, e.g., days 145-235 and 315-365).

Comparison to preliminary GOCE densities
The high-resolution and low-altitude preliminary GOCE densities have not been ingested in any model yet, and give insight specifically into the Nitrogen (N 2 ) and Oxygen (O) modeling.Table 5 gives the results of the comparisons per month without rescaling of the densities, which is justified in view of the small biases in 2009 for JB2008.The models performed considerably worse in 2011.This can be explained by the mean solar flux (F10.7) in 2009 and 2011 periods, which was 71-75 and 90-106, respectively.Secondly, geomagnetic activity was much higher in 2011 also, with several days of enhance storm activity.The overall performance of JB2008 is best for GOCE, followed by DTM2009.
Binning the GOCE data in latitude is most informative, and Figure 16 displays this for November 2009 and March 2011.The latitude profiles are flattest (uniform relative precision) with JB2008.DTM2009, DTM2000, and NRLMSISE-00 have differences from pole to equator of the order of 15%, 30%, and 20%, respectively, in November 2009.For March 2011, DTM2009 displays a significant hemispherical asymmetry that is most likely due to a weakness in the Nitrogen modeling.

Comparison to EDR densities
JB2008 has been fitted to the EDR densities, whereas the other three models have not.Therefore, we may expect a better fit to JB2008 in this case.The densities are only used in the interval 1998-2010, imposed by the availability of the JB2008 indices, in two altitude intervals: 200-400 km, 400-500 km.These intervals are representative of the accelerometer-inferred densities from GOCE and CHAMP (i.e.no significant Helium contribution), and GRACE, respectively.The results, averaged over all satellites in the respective altitude interval, are presented in Table 6.The significantly better performance of JB2008 and DTM2009 compared to that of NRLMSISE-00 and DTM2000 is confirmed for the low altitudes.Secondly, the relative precision of JB2008 and DTM2009 is equal for the 200-400 km range, but for the 400-500 km interval DTM2009 becomes more precise.In terms of density residuals however, JB2008 has approximately a 25% smaller RMS for the  200-400 km range, which slinks to 5% for the 400-500 km interval.The lesser performance of JB2008 with increasing altitude is in agreement with the findings presented in the previous subsections.
Considering the results of the comparisons for the individual spacecraft, we have done some additional tests with JB2008 and DTM2009 by separating the high (1999-2004) and low (2005-2010) solar activity years in the 200-400 km altitude interval.These comparisons reveal (not shown) that JB2008 is more accurate for the high solar activity years and DTM2009 for the low solar activity years; because the errors are about four times bigger for 1999-2004, JB2008 performs better taking the entire period into account (Table 6).

Summary and conclusions
The latest version of DTM, DTM-2009, has been constructed using CHAMP and GRACE accelerometer-inferred densities      Unsurprisingly, DTM2009 is most precise for the data sets that were assimilated.For CHAMP and GRACE it performs significantly better than JB2008, whereas NRLMSISE-00 and DTM2000 are much less precise and biased.For the higher altitude Starlette and Stella data, DTM2009 performs significantly better than NRLMSISE-00, whereas JB2008 and DTM2000 are much less precise and appear to be biased.
Based on the only independent low-altitude density data set, GOCE, JB2008 is the most accurate and unbiased model below 300 km, closely followed by DTM2009.However, JB2008's precision decreases with altitude, and this is already visible in the comparison results with the EDR densities.These data confirm that JB2008 is most accurate for altitudes below 400 km and for high solar activity, whereas during low solar activity, from 2005 through 2010, DTM2009 is more accurate.For altitudes above 500 km DTM2009 and NRLMSISE-00 are clearly more accurate.This is in large part due to the modeling of individual constituents versus total density only in JB2008, and probably also due to the decreasing accuracy with altitude of the mean densities used to construct the model.
Taking all of the above into account, DTM2009 is the most accurate model overall and a significant improvement over DTM2000 under all conditions.JB2008 is more accurate below 300 km.A higher accuracy will be achieved as soon as GOCEinferred densities at 250 km altitude will be assimilated.NRLMSISE-00 is slightly more accurate than DTM2009 for altitudes above 500 km and during solar low-to-medium activity conditions, which suggests a better Helium modeling.The next update of the model in the summer of 2012 will be made with more density data, and moreover employing the S10.7 index, which is more representative of solar EUV activity than F10.7 is (Dudok de Wit & Bruinsma 2011).The low levels of density during the last solar minimum compared to 1996, which are due to lower solar EUV emissions but not radio emissions, can then also be modeled correctly.
2.1.5.Total density inferred from thruster data of the drag-free GOCE satelliteThe Gravity field and steady-state Ocean Circulation Explorer(ESA 1999), the first of ESA's Earth Explorer missions, was launched in March 2009 in a 96°inclination, dawn-dusk orbit.The mission altitude of 255 km was reached in October 2009 and has been maintained since thanks to the ion thrusters that compensate for atmospheric drag continuously.The attitude, controlled in a 3°dead band, is very accurately measured, making precise calculation of the frontal area possible.The satellite positions are determined by GPS.The thrust is recorded onboard every 8 s with an accuracy at the few percent level.The evolution of the satellite mass is monitored accurately, which allows the calculation of the drag acceleration.The resulting preliminary (official products are expected end 2012) GOCE densities, only for November-December 2009 and March-April 2011 at the time of writing, have a resolution of 64 km along the orbit with an accuracy of a few percent.The accuracy was established through comparison of November 2009 with densities from the High Accuracy Satellite Drag Model (HASDM; Storz et al. 2005), which model is corrected in near real time from observed drag effects, made available by Bruce Bowman from the Air Force Space Command.

Fig. 1 .
Fig. 1.The total density data and the altitude used in the model evaluation superimposed on solar activity (F10.7;left axis).

Fig. 6 .
Fig. 6.The bias of the CHAMP (left) and GRACE (right) density ratios in latitude bins per year for JB2008 (top) and DTM2009 (bottom), in percent.

Table 1 .
The model dependencies on the density data used in this evaluation (-: not used).

Table 2 .
The test results of the models compared to all CHAMP and GRACE data, except for 2010.

Table 3 .
The test results of the models compared to AE-C and AE-E data.Best results are underlined.

Table 4 .
The test results of the models compared to Deimos-1 data.Best results are underlined.

Table 5 .
The test results of the models compared to preliminary GOCE data for November and December 2009 and March and April 2011.Best results are underlined.

Table 6 .
The test results of the models compared to EDR density data in the 200-400 km and 400-500 km ranges.Best results are underlined.