New thermosphere neutral mass density and crosswind datasets from CHAMP, GRACE, and GRACE-FO

,


Introduction
Accurate knowledge of the atmospheric density is critical for orbit predictions at altitudes below 600 km (Vallado & Finkleman, 2014) and for studying the coupling between the thermosphere and the ionosphere (Palmroth et al., 2021). Neutral mass density observations derived from precise accelerometer measurements on board satellites are the only current in-situ data source in the altitude range of 200-550 km with a high resolution along the orbit . The accelerometer measurements allow deriving crosswind when the aerodynamic acceleration is large enough (Sutton et al., 2007;Doornbos et al., 2010), which gives additional information on the dynamics of the thermosphere.
Since 2000, several satellites carrying precise accelerometers have been launched. The first was the CHAllenging Mini satellite Payload (CHAMP) satellite, which was launched in 2000 and re-entered in 2010 (Reigber et al., 2002). The Gravity Recovery And Climate Experiment (GRACE) twin satellites followed in 2002 and provided data until decommissioning in October 2017, although they remained in orbit for another few months (Tapley et al., 2004). Then, the Gravity Field and Steady-State Ocean Circulation Explorer (GOCE) satellite was launched in 2009 and re-entered in November 2013 (Floberghagen et al., 2011), the same month in which the three Swarm satellites were launched (Olsen et al., 2013). Finally, the twin GRACE-Follow-On (GRACE-FO) satellites were launched in 2018 and provide measurements until today. All of these satellites were launched into circular, near-polar orbits with an initial altitude of roughly 500 km, except for the GOCE satellite, which was in a circular, sun-synchronous orbit at dawn/dusk local time at an altitude of 256 km for the largest fraction of its lifetime.
Early CHAMP and GRACE thermosphere density and crosswind datasets were produced by Bruinsma et al. (2004), Sutton (2008), and Doornbos (2011). They relied on panel models to derive the aerodynamic coefficient vector and evaluate the radiation pressure. More recently, Mehta et al. (2017) provided CHAMP and GRACE neutral mass density observations until 2010, and March et al. (2021) released new versions of the CHAMP and GRACE neutral mass density observations until 2009. Both authors used detailed geometry models of the satellites to simulate the aerodynamic coefficient vector by direct simulation Monte Carlo methods (Bird, 1994). The GOCE data was processed similarly by Doornbos et al. (2014) to obtain neutral mass density and crosswind observations. The Swarm satellites are a special case because the accelerometer measurements suffer from perturbations that require a substantial effort to correct (Siemes et al., 2016). Nevertheless, Swarm C neutral mass density observations have been produced with a nearly complete temporal coverage until today, and, recently, also Swarm A neutral mass density observations for 2014 have been released (Iorfida et al., 2022). Though some authors presented GRACE-FO neutral mass density observations, e.g., Krauss et al. (2020), we are unaware of systematically produced GRACE-FO neutral mass density datasets.
All these datasets have been produced with different methods for (a) the calibration of the accelerometer data, (b) the modeling of the radiation pressure to extract the aerodynamic acceleration, (c) accounting or not accounting for thermosphere wind in the neutral mass density retrieval, and (d) different assumptions on the gas-surface interaction, which affects the aerodynamic coefficient vector modeling and, hence, the scale of the neutral mass density observations. We aim to establish a database of thermosphere density and crosswind observations that is as complete, consistent, and accurate as possible. Such a database is the basis for thermosphere models like the Drag Temperature Model 2020 (Bruinsma & Boniface, 2021) and investigations of the gas-surface interaction (Bernstein & Pilinski, 2022).
This paper focuses on two critical aspects of neutral mass density and crosswind processing: the calibration of the accelerometer data and the modeling of radiation pressure. The neutral mass density and crosswind observations are derived from the along-track and cross-track accelerations, respectively (Doornbos, 2011). Therefore, the accuracy of the accelerometer data calibration and the radiation pressure modeling directly affects the accuracy of the neutral mass density and crosswind observations. The data from electrostatic accelerometers are affected by biases and small instrument imperfections, which include a deviation of the scale factors from unity (Touboul et al., 2012(Touboul et al., , 2016. Thus, the accelerometer data need to be calibrated, and we refer to Kang et al. (2003) and Vielberg et al. (2018) for comparing the different calibration methods. The ones based on precise orbit determination (POD) give the best calibration results for the acceleration in the flight direction, although they perform less well in the other directions. van Helleputte et al. (2009) reported an accuracy of the GRACE along-track and cross-track biases of 0.5 nmÁs À2 and 22 nmÁs À2 , respectively. Further, the reported accuracy of along-track and cross-track scale factors is 1-3% and about 15%, respectively. The accuracy of the cross-track bias is of concern for the crosswind observations since Doornbos et al. (2010) demonstrated that a 10 nmÁs À2 cross-track acceleration error leads to a crosswind error of about 1 km Ás À1 , which is larger than the expected average crosswind signal (Drob et al., 2015). A further point of attention is the thermally-induced bias variation of the GRACE accelerometer data, which is significant after the deactivation of the accelerometer thermal control in April 2011 (Klinger & Mayer-Gürr, 2016).
The size of the radiation pressure acceleration exceeds the size of the aerodynamic acceleration in the along-track direction when a satellite orbits at and above altitudes of 500 km during solar minimum, which was shown for the Swarm satellites by Van den IJssel et al. (2020). Since the GRACE and GRACE-FO satellites spent several years at such high altitudes during solar minimum, the accuracy of the radiation pressure modeling is essential for the neutral mass density calculation. Because the aerodynamic acceleration in the cross-track direction is much smaller than the one in the along-track direction, we expect that the accuracy of the radiation pressure modeling is relevant for the crosswind observations throughout the entire mission lifetime.
The key elements of the radiation pressure modeling are the outer satellite geometry, the satellite surface properties, the radiation flux, and the satellite thermal emissions. Solar radiation flux is near constant at a fixed distance to the Sun, with variations of only 0.1% (Dewitte & Clerbaux, 2017). Earth albedo and infrared radiation contribute less than 1 nmÁs À2 to the along-track and cross-track accelerations, which was shown by Vielberg & Kusche (2020) for the GRACE satellites. This is possibly a consequence of the symmetric satellite shape and Earth-pointing satellite attitude. Since the situation is very similar for the CHAMP and GRACE-FO satellites, we do not expect to gain much accuracy for the neutral mass density and crosswind observations by improving the radiation flux modeling.
The logical next step for advancing our processing is to augment the detailed satellite geometry models from March et al. (2019a) with thermo-optical surface properties to replace the presently used panel models. These models can then be used in ray-tracing simulations to determine the radiation pressure force coefficients. The key features of ray-tracing techniques for radiation pressure modeling have already been described in the 1990s (Klinkrad et al., 1991), which includes shadowing and multiple reflections.
We have two options for the implementation of the raytracing simulations. The first is to perform the simulations for each position along the orbit, where the calculations are accelerated by the graphics processing unit as suggested by Kenneally & Schaub (2020). Its advantage is the simplicity of calculating the amount of absorbed radiation per surface element, which we could use as an input to calculate the satellite's thermal emissions and account for moving parts such as steerable solar arrays. The disadvantage is the need to perform one simulation per incidence angle of incoming radiation, which quickly escalates when considering Earth's albedo and infrared radiation. More importantly, we intend to process the entire CHAMP, GRACE, and GRACE-FO mission data, which renders this option infeasible due to the very long runtime of the calculations. Therefore, we perform the ray-tracing simulations in advance and store the results in a lookup table. Such an approach has been adopted by the University College London, which developed a tool for radiation pressure modeling based on ray tracing (Bhattarai et al., 2019). It has been extensively used for radiation pressure modeling within the context of precise orbit determination of GNSS satellites (Ziebart, 2004;Li et al., 2020;Bhattarai et al., 2022). Similarly, Darugna et al. (2018) used ray tracing to evaluate the radiation pressure for the first satellite of the Japanese Quasi-Zenith Satellite System, QZS-1. These methods have in common that they use detailed geometry models and generate grids of rays that illuminate the satellite, where one grid is generated for each incidence angle. Another example is provided by Wöske et al. (2019), who used ray tracing with a detailed, finite element model of the GRACE satellites to precalculate the radiation pressure for all incidence angles. Wöske et al. (2019), as well as Vielberg & Kusche (2020), demonstrated for the GRACE satellites that the satellite thermal emissions are the second largest contributor to the radiation pressure acceleration, only surpassed by the solar radiation pressure. The thermal emissions are more substantial in the cross-track and radial directions due to the solar arrays on the sides, the satellites' top panels, and the radiator panels on the bottom. Thus, we need to model the thermal emissions accurately to obtain usable crosswind observations. Montenbruck et al. (2015) model thermal emissions as an instant reradiation of the absorbed radiation. However, the results of Wöske et al. (2019) show that thermal inertia is significant for the GRACE satellites. Therefore, we need to account for the history of absorbed radiation, implying that the assumption of an instant reradiation is inaccurate.
Though the methods to model radiation pressure have advanced significantly and detailed satellite geometry models were produced recently, our knowledge of the thermo-optical surface properties and thermal properties is not accurate for the satellites under consideration. For instance, the reflection and absorption coefficients for the GRACE and GRACE-FO satellites are specified identically (Bettadpur, 2012;Wen et al., 2019). In contrast, photos of the satellites clearly show that at least the solar arrays are different. Thus, empirically updating these properties based on the in-flight data was necessary.
The paper is structured as follows. Section 2 describes the methods we used to derive the neutral mass density and crosswind observations. We emphasize the accelerometer data calibration and the radiation pressure modeling, presented in Sections 2.1 and 2.4, respectively. The input data is specified in Section 3. We present and discuss the results of the accelerometer data calibration, radiation pressure modeling, and neutral mass density and crosswind retrieval in Section 4. We conclude with Section 5, which also provides an outlook for future improvements.

Methods
The method for deriving neutral mass density and crosswind observations is described in detail by Doornbos (2011), which we briefly summarize here. First, we preprocess the accelerometer data to remove the effects of thruster activations and other artificial accelerations generated by the satellite, e.g., "twangs" in the case of the GRACE mission (Flury et al., 2008). The acceleration due to the thruster activations is stored since it is a perturbation when deriving density but a signal when estimating accelerometer data calibration parameters as described in Section 2.1.1.
Then, the measured acceleration vector a meas transforms into the calibrated acceleration vector a cal by where S is a diagonal scaling matrix and b is the bias vector. The calibrated acceleration vector is the sum of the aerodynamic acceleration vector a aero and the radiation pressure acceleration vector a rp , i.e.
It should be noted that this equation holds only when the accelerometer is placed into the satellite's center of mass so that accelerations due to rigid body dynamics and gravity gradients are zero by design. This is the case for CHAMP, GRACE, and GRACE-FO satellites. The radiation pressure acceleration needs to be modeled to extract the aerodynamic acceleration. Therefore, we decompose the radiation pressure acceleration vector into several contributions: where a srp , a alb , a ir , and a te are the acceleration vectors due to solar radiation pressure, Earth's albedo, Earth's infrared radiation, and the satellite's thermal emission, respectively. The aerodynamic acceleration vector is related to neutral mass density q by where v is the velocity of the atmosphere relative to the satellite, m is the satellite mass, and C aero is the aerodynamic coefficient vector, which is intrinsically multiplied by the crosssection area. The relative velocity v is the Euclidian norm of the relative velocity vector v, which is the sum of the satellite velocity vector v sat , the velocity vector of the corotating atmosphere v atm , and the wind vector v wnd : In a nominal situation, the satellite's x-axis is aligned with the flight direction within a few degrees so that the largest fraction of the drag signal is contained in the x-component of a aero (the orientation of the satellite reference frame is shown in Fig. 1). Further, the x-component is the most accurately calibrated component. Therefore, we derive the neutral mass density observations by projecting equation (4) onto the x-axis of the satellite reference frame and solving for q, which gives The crosswind observations are also derived from equation (4), where the aerodynamic coefficient vector C aero is a function of the relative velocity v defined in equation (5). The algorithm of (Doornbos et al., 2010) retrieves crosswind observations by C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 iteratively adjusting the relative velocity vectors' directions such that the modeled aerodynamic acceleration in equation (4) aligns with the observed aerodynamic acceleration in equation (2). The observed wind vector v obs is the difference between the initial relative velocity vector and the adjusted one. We must interpret this difference as the (scalar) wind speed v obs = |v obs | into the direction of the unit vector u obs = v obs /v obs . Since the satellite velocity dominates the relative velocity vector, and the adjustments are small rotations about the satellite z-axis, which approximately points into the nadir direction, the observed wind direction u obs is roughly in the horizontal plane in the cross-track direction. Thence, we speak of crosswind observations. We elaborate on the methods for the accelerometer data calibration and the radiation pressure modeling in Sections 2.1 and 2.4, respectively. We complement the information by describing the satellite geometry models, augmented by surface properties, in Section 2.2 and the simulation of the aerodynamic coefficient vector of the GRACE-FO satellites in Section 2.3.

Estimation of calibration parameters
We used the method described in Visser & van den IJssel (2016) to calibrate the accelerometer data, which we employed earlier to calibrate the individual GOCE accelerometers. The method is based on dynamic orbit determination with the NASA Goddard Space Flight Center's GEODYN software (Pavlis et al., 2006), which uses an iterative Bayesian weighted least-squares method to solve the linearized observation equations. Table 1 gives an overview of the comprehensive gravitational force models. Since the accelerometer observations represent the total non-gravitational accelerations, we exclude non-gravitational models from the estimation. We processed the orbits in daily 30-h arcs, centered on a day with a 3-h overlap at the day boundaries, and based the satellite orientation on the star tracker observations. For efficient processing, we use precise orbits resulting from a GPS-based POD as input observations instead of the actual GPS observations. To avoid calibration terms absorbing part of the aerodynamic signal, we estimate the limited set of calibration parameters in equation (1) in addition to the initial position and velocity, resulting in an estimation close to a purely dynamic orbit determination. Constant daily biases do not represent the sizable temperature-induced bias variations well. Therefore, we remove the thermally-induced bias variations using the accelerometer's temperature data as detailed in Section 2.1.2, after which a constant daily bias parameterization is suitable again. Since the z-component of the CHAMP accelerometer data is not usable (Perosanz et al., 2005), we estimate a set of constant and one-cycle-per-revolution empirical accelerations in this direction instead of using the actual measurements.
To reduce the high correlation between the scale factor and bias parameters, we use a multi-step approach for the calibration, based on a stacked normal matrices approach (Van Helleputte, 2011). The first step determines unconstrained daily calibration parameters for the entire mission. Although the estimated scale factors are very noisy, we identify periods in which they show stable behavior. For all missions, we expect hardly any variation of the scale factors over time. Therefore, we calculate the mean and standard deviation of the time series of daily scale factors, remove all scale factors from the time series that deviate by more than three times the standard deviation from the mean, and recalculate the mean scale factor in each direction. In the next step, we constrain the daily scale factors to these values and estimate unconstrained daily biases. In that estimation process, we store the part of the normal equations related to the scale factors. Then, we sum these parts of the normal equations to estimate new unconstrained scale factors (one set for the entire time series). This method has the advantage that days with a strong acceleration signal contribute more to the estimation. In the final step, we estimate constant daily biases using these newly estimated scale factor values. If necessary, we apply additional constraints to the bias estimation.

Modeling thermally-induced bias variations
The GRACE accelerometer biases are subject to a significant variation that correlates with the instrument's temperature.  C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 McGirr et al. (2022) used the lowpass filtered cross-track acceleration as a proxy for the along-track bias variation. While that approach is effective in the gravity field model estimation context, it is not suited for our purposes since it implicitly removes low-frequency signals in the cross-track acceleration and, hence, the crosswind observations. Therefore, we developed a correction that accurately models the acceleration's delayed response to changes in the measured temperature (Klinger & Mayer-Gürr, 2016). The reason for the delayed response is the thermistor's location outside the sensor structure, whereas the bias is sensitive to the temperature inside the sensor structure. To model this delay, we assume a discretized equation for radiative heat transfer where T is the absolute temperature measured by the thermistor on the sensor unit (variable TSU_Yp contained in the AHK1B files) and U is the modeled absolute temperature. 0 is the time step of the discretization, and k u is the model parameter that controls the delay in the response. To achieve a good fit, we assume a second heat transfer where V is the second modeled absolute temperature. The thermally induced bias variations are where S U and S V are diagonal matrices with temperature sensitivity coefficients for the three accelerometer axes. Since we correct for the thermally induced bias variations prior to the accelerometer data calibration, equation (1) expands to

Cross-track acceleration data calibration
As discussed in the introduction, the accuracy of the calibration of the cross-track acceleration is too low to derive meaningful crosswind observations. Therefore, we transferred the high accuracy of the along-track to the cross-track acceleration as follows. From solving the x-component of equation (4) for neutral mass density and inserting the result into the y-component follows: We used this equation to calculate the cross-track acceleration from the along-track acceleration and our knowledge of the aerodynamic coefficient vector. However, equation (11) requires knowledge of the crosswind because the aerodynamic coefficient vector is a function of the relative velocity vector, C aero (v). Changing the wind vector v wnd changes the relative velocity vector v and, hence, the aerodynamic coefficient vector. We used the Horizontal Wind Model 2007 (HWM-2007) Drob et al. (2008) to obtain the wind along the x-and z-axes of the satellite reference frame and set the wind along the y-axis to zero to avoid a biasing of the wind observations toward the model. With these assumptions about the wind, we first calculated the aerodynamic coefficient vector and then which is not zero, mainly because of the inaccurate y-axis bias in equation (10). Finally, we applied a lowpass filter with a cut-off frequency of 1 d À1 to Da aero,y to obtain a more accurate estimate of the bias, noting that the choice of the cut-off frequency is linked to the daily bias estimation.
To test the impact of setting the wind along the y-axis to zero, we calculated the aerodynamic coefficient vector using the HWM-2007 for the y-axis for GRACE A in 2014. Even though 2014 coincides with a solar maximum, it is a year with moderate geomagnetic activity, noting that the latter drives HWM-2007 while solar activity does not. Therefore, we selected it as a representative test case. After lowpass filtering, we found a rootmean-square (RMS) between the original and recalculated Da aero,y of 0.15 nmÁs À2 , which demonstrates that the assumption of zero wind along the y-axis has a negligible influence on the bias estimation. Before lowpass filtering, we found an RMS of 3.6 nmÁs À2 , which means that a aero,y contains a significant signal due to crosswind at periods shorter than one day.

Satellite geometry models
The basis for the radiation pressure and aerodynamic simulations are the detailed geometry models for the CHAMP and GRACE satellites created by March et al. (2019a). The models for the CHAMP and GRACE satellites consist of 2248 and 5640 triangular facets, respectively. The model for the GRACE-FO satellites is new, consists of 23,746 facets, and was created specifically for this paper.
We augmented the models with thermo-optical surface properties for the radiation pressure modeling, as illustrated in Figure 1. In the augmentation process, we used the absorption, specular, and diffuse reflection coefficients for the visible and infrared wavelengths from the missions' documentation (Bettadpur, 2012;Wen et al., 2019). We used the material properties provided by Fortescue et al. (2011) for the parts of the satellites for which the material properties are not specified in the documentation, e.g., the antennas.

Aerodynamic coefficient vector simulation
The simulation of the aerodynamic coefficient vectors using the detailed geometry model of the CHAMP satellite is described in detail by March et al. (2019b). Assuming diffuse reflections with incomplete energy accommodation as the model for gassurface interaction (Sentman, 1961), the authors found an energy accommodation coefficient of 0.85 to be a suitable choice. This value was later adopted for the GRACE A satellite (March et al., 2021). For this paper, we used the same approach and value of the energy accommodation coefficient for the simulation of the aerodynamic coefficient vector of the GRACE-FO satellites.

Radiation pressure coefficient vector simulation
We formulate the radiation pressure acceleration a src ða; bÞ ¼ P src ða; bÞ m C w ða; bÞ ð 13Þ C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 as the product of the radiation pressure coefficient vector C w and the radiation pressure P src , divided by the satellite mass m. Subscript "src" denotes the external radiation sources in equation (3), i.e., solar radiation, Earth's albedo, and Earth's infrared radiation, and subscript "w" indicates the wavelength, which corresponds either to visible light or infrared radiation. The direction of incidence of the radiation is described by the angles a and b. They can be calculated from the unit vector that is defined in the satellite reference frame and points from the radiation source to the satellite, i.e., in the direction of the ray propagation, by The definition of the angles follows from the setup of the ray tracing simulation illustrated in Figure 2, where the transformation matrix cos a 0 sin a relates the reference frame of the simulation and the satellite. The rays are emitted in the direction such that u sat = R sim?sat u sim . In the case of the solar radiation pressure, we have where r is the distance between the Sun and the satellite and P 1AU = 4.56 Â 10 À6 is the solar radiation pressure at a distance of 1 astronomical unit (AU), i.e., the mean distance between the Earth and the Sun (Montenbruck & Gill, 2012). We refer to Vielberg & Kusche (2020) for calculating the radiation pressure due to Earth's albedo and infrared radiation. The advantage of the formulation in equation (13) is that we can use the same radiation pressure coefficient vectors for solar radiation pressure and Earth's albedo. We calculate the radiation pressure coefficient vector C w (a, b) in a ray tracing simulation, where our method is very similar to the one described by Bhattarai et al. (2019). The simulation assumes that a radiation source emits parallel rays in the direction of the satellite and that the radiation source is larger than the satellite. With these assumptions, it is straightforward to simulate the radiation pressure that the source exerts on the satellite. Conceptually, we place the satellite into a rectangular simulation box, which is just large enough to contain the satellite. Then, we subdivide one side of the box into a square grid, where the centers of the grid cells are the origins of the rays, which are emitted perpendicularly away from the grid into the simulation box. This configuration is illustrated in Figure 2.
We employ a simple geometric ray tracing algorithm to identify the nearest ray intersection with a satellite surface element. When the ith ray intersects the jth surface element, it adds an increment to the radiation pressure coefficient vector C w . The factors c w,a,j , c w,d,j , and c w,s,j are the coefficients of absorption, diffuse reflection, and specular reflection, respectively, of the j-th surface element. Subscript "w"denotes again the wavelength, i.e. visible light or infrared radiation. Note that these coefficients must sum to one. The coefficient vector increments on the right-hand side are defined as for absorption, for diffuse reflection, and for specular reflection, where A i is the size of a grid cell, r i is a unit vector pointing into the propagation direction of the ith ray, and n j is the outward normal of the jth surface element. Note that the distance between the radiation source and the satellite is irrelevant in the simulation. We loop over all grid nodes and sum all coefficient vector increments to obtain the radiation pressure coefficient vector: Grid configuration for the ray tracing simulation. The orientation of the grid's axes is indicated by +X, +Y, and +Z along the edges of the box. The grid's origin coincides with the satellite center of mass. The red line indicates a ray emitted from one of the grid cells on the +X side of the simulation box. The yellow patch indicates the cell area that is associated with the ray. The angles a and b describe the satellite's orientation to the radiation source.
The ray tracing simulation accounts not only for self-shadowing but also for multiple reflections. Each ray intersecting the surface spawns new rays if the surface element is reflective. The amount of radiation the initial ray carries is proportional to the cell size A i . Unless the surface is a perfect mirror, a spawned ray with index k will carry a smaller amount of radiation, which we consider by assigning a smaller area A k . The direction of the spawned rays is determined from vectors r i and n j and the reflection mode, which can be diffuse and specular. In the case of a specular reflection, the new direction is and its new area is where c w,s,j is the same specular reflection coefficient as in equation (20). For diffuse reflections, we subdivide the half sphere centered around the surface normal into sections of size Df Â Dk as illustrated in Figure 3. The angles f and k are the local zenith angle and azimuth, respectively, where the surface normal points to the local zenith and zero azimuth are defined arbitrarily. We spawn a new ray in the center of each section, calculate the fraction of radiation that passes through the section according to Lambert's cosine law, and multiply the fraction by the amount of diffusely reflected radiation: where k 1 , k 2 , f 1 , and f 2 define the corners of a section and c w,d,j is the same diffuse reflection coefficient as in equation (20).

Modeling the satellite's thermal emission
We need the temperature of the outer surfaces to calculate the satellite's thermal emission. Since no measurements of the surface temperatures are available at the time of writing, we rely on modeling to infer the surface temperatures. For that purpose, we use a thermal satellite model that consists of a set of panels and the inner body of the satellite, which is conceptually very similar to the model of Wöske et al. (2019). In our model, the panels gain heat by absorption of radiation, lose heat by diffusely emitting infrared radiation, and conductively exchange heat with the inner body. The latter will not only exchange heat with the panels but also generate heat P gen because a fraction of the electric power is converted to heat by the payload and other electric parts of the satellite. We select P gen such that the inner body's temperature is about 25°C, noting that most instruments operate at room temperature.
The heat gain due to the absorption of radiation is where subscript j indicates the panel, U is the radiation flux, c a,j is the coefficient of absorption of visible light or emissivity in the case of infrared radiation, A j is the area of the panel, and h is the angle between the panel normal and the vector from the satellite to the radiation source. Note that we need to sum the heat gain due to the absorption of visible light and infrared radiation from all radiation sources and angles of incidence.
The heat loss due to the emission of infrared radiation follows the Stefan-Boltzmann law, where P emit,j is the emitted power, e j is the emissivity, r is the Stefan-Boltzmann constant, and T j is the absolute temperature of the panel.
The conductive heat exchange between the panels and the inner body is where k j is the thermal conductivity and T body is the inner body's temperature. Therefore, the total heat exchange for a panel is For the inner body, it is At each time step, we update the panels' and inner body's temperature based on their total heat exchange: and where C j and C body are the thermal capacity of the jth panel and the inner body, respectively. At the first epoch, we initialize the panels' temperatures with 273 and the body temperature with 298. The initial temperatures no longer influence the thermal model after about six hours. When calculating the temperatures, we use a time step of 1 for convenience because this is the sampling of the accelerometer data.
After calculating the temperatures, we insert them into equation (29) to obtain P emit,j and calculate the radiation pressure due to the diffuse thermal emission by where c is the speed of light. The thermal models for the CHAMP, GRACE, and GRACE-FO satellites are provided in the Appendix in Tables A1, A2, and A3, respectively.
3 Data 3.1 Input data for neutral mass density and crosswind data processing The input to neutral mass density and crosswind data processing consists of acceleration, attitude, position, velocity, thruster, satellite mass, and accelerometer temperature data. We take these data from the Level 1 data archives of the CHAMP, GRACE, and GRACE-FO missions, all accessible from FTP ftp://isdcftp.gfz-potsdam.de. For the CHAMP, GRACE A, and GRACE C satellites, we used Level 1A data, allowing for cleaner removal of thrust events due to the higher temporal resolution compared to the Level 1B data. Only for the GRACE B satellite, we relied on Level 1B data because Level 1A data was unavailable to us.
The CHAMP acceleration data was measured by the Space Three-axis Accelerometer for Research (STAR) instrument developed by the Office National d'Études et de Recherches Aérospatiales (ONERA). The GRACE and GRACE-FO satellites were equipped with the more advanced SuperSTAR and SuperSTAR-FO accelerometers, respectively, described by Touboul et al. (2012) and Christophe et al. (2015). The nominal noise level of these accelerometers is reported in Table 2, noting that the value applies to the along-track acceleration measurement, whereas the cross-track acceleration measurement is about ten times worse by the design. In the case of the GRACE mission, perturbations stemming from the satellite platform limit the accuracy of the accelerometer data (Flury et al., 2008). The noise level of the acceleration data is about ten times higher than the nominal noise level of the accelerometers (Murböck et al., 2023), i.e., about 1 nmÁs À2 , noting that this value depends on the satellite operations. As part of preprocessing the acceleration data, we applied a low-pass filter, whose cut-off frequency was carefully selected to suppress the perturbations. However, this low-pass filtering limits the resolution of the density and crosswind observations. The resolution along the orbit is the orbital velocity divided by the cut-off frequency, multiplied by two following the Nyquist-Shannon sampling theorem. Table 2 reports relevant characteristics of the initial orbits, acceleration data, and density and crosswind observations.
The neutral mass density and crosswind data processing require modeling of the atmospheric composition and temperature, for which we use the NRLMSISE-00 model described in Section 3.3.1. Wind in the direction of the satellite's x-axis is accounted for by the HWM07 model (Drob et al., 2008).

Input data for accelerometer data calibration
The input to the calibration consists of the same satellite attitude and accelerometer data as mentioned above. However, thrust events are not removed from the accelerometer observations, as they contain an actual signal influencing the satellites' orbits. We use the GRACE and GRACE-FO missions' navigation orbits available in the GRACE Level 1 data as pseudoobservations (Case et al., 2010). For CHAMP, we use the precise orbit solutions determined at the Astronomical Institute of the University of Bern (AIUB) (Prange, 2010). This dataset is, unfortunately, not available for the early mission phase. For the period 2001-2002, therefore, the GeoForschungs Zentrum (GFZ) Rapid Science Orbits (RSO) are used as input (König et al., 2005). All input orbits are determined using a reduced-dynamic approach, which does not use any accelerometer data. We use the calibrated accelerations for thermospheric density retrieval, and therefore, ideally, the input orbits should not contain prior dynamical model information. However, when using only a minimal number of calibration parameters and with differences of only a few cm between both types of orbits, the impact of using reduced-dynamic instead of kinematic orbits on the calibration is small. To assess this, we also used kinematic orbits computed by the Graz University of Technology 1 (Zehentner & Mayer-Gürr, 2016) as input observations for the CHAMP calibration for 2003-2008. The median RMS of the calibrated acceleration differences in flight direction is

Thermosphere models
In-situ observations of neutral mass density above 200 km altitude are very sparse. While this emphasizes the importance of the observations presented in this paper, it also highlights the challenge of finding suitable data for validation. Therefore, we choose three thermosphere models to evaluate the new observations presented in this paper. The first is the Naval Research Laboratory Mass Spectrometer Incoherent Scatter radar Expanded model 2000 (NRLMSISE-00), a semi-empirical model with a wide user base. The second is the Drag Temperature Model 2020 (DTM-2020), one of the most recent semiempirical models. The NRLMSISE-00 and DTM-2020 models output neutral mass density, but no wind. Finally, we use the Thermosphere-Ionosphere-Electrodynamics General Circulation Model (TIE-GCM), a physics-based model providing neutral mass density and wind. We provide some more information on these models is in the following sections.

NRLMSISE-00
The semi-empirical NRLMSISE-00 model is constructed using atmospheric composition from mass spectrometer data, thermosphere temperature derived from incoherent scatter radar data, and molecular oxygen (O 2 ) number density derived from solar ultraviolet occultation data amongst others (Picone et al., 2002). Further, it is based on neutral mass density datasets derived from accelerometer measurements and orbital decay data from the 1970s and 1980s. The model drivers are the daily F10.7 solar flux index and its 81-day centered mean F 10:7 for solar activity, the 3-hourly ap index, and its daily average, the Ap index, for geomagnetic activity. There are two reasons to include the NRLMSISE-00 model in the evaluation. First, it is largely independent of the neutral mass density observations of the CHAMP, GRACE, and GRACE-FO satellites. The only minor dependency stems from using atmospheric composition and temperature as specified by the NRLMSISE-00 model in the calculation of the aerodynamic coefficient vector (Doornbos, 2011). Second, the model is widely used in science and operations.

DTM-2020
We use the operational version of the semi-empirical DTM-2020, driven by the F10.7 solar radio flux and geomagnetic K p index (Bruinsma & Boniface, 2021). In sharp contrast to the NRLMSISE-00 model, it was constructed using neutral mass density observations derived from the accelerometer data of the GOCE, CHAMP, and GRACE satellites, GNSS tracking data of the Swarm A satellite, and laser ranging data on the Stella satellite. It is worthwhile to note that the neutral mass density observations presented in this paper were not used in the construction of the DTM-2020 model.

TIE-GCM
TIE-GCM is a three-dimensional, time-dependent, physicsbased model of the Earth's upper atmosphere (Richmond et al., 1992). The website, www.hao.ucar.edu/modeling/tgcm hosts the open-source TIE-GCM code. Kodikara (2019) and Qian et al. (2014) provide summaries of the recent developments in the model. This study uses TIE-GCM version 2.0 (released on 21 March 2016) with a horizontal resolution of 2.5°Â 2.5°i n geographic latitude and longitude, and a vertical resolution of 0.25 scale height. The solar irradiance input to the model is specified by the average of the daily solar flux F 10.7 , and its 81-day centered mean F 10:7 . The high latitude mean energy, energy flux, and electric potential are described by the default ion convection model and auroral particle precipitation scheme. The tidal forcing from the lower atmosphere is specified via perturbations from the Hagan et al. (2001)  For comparison with the satellites, the model neutral mass densities are interpolated in logarithmic space to the satellite location, first horizontally and then vertically to the satellite altitude. To minimize errors due to extrapolation, model estimates are not provided where the satellite altitude is 20 km above the model top. The zonal (east-west) and meridional (north-south) wind components of the model are only interpolated to the satellite altitude. In other words, horizontal winds are not extrapolated vertically upward beyond the model domain. To compare the model winds with the satellite crosswind speeds, the former is projected onto the observed crosswind directions.

Accelerometer data calibration
To extract the thermally-induced bias variations, we calculate the residual acceleration a res ¼ a meas À S aero;mod a aero;mod À a rp À b trend ð36Þ by subtracting the scaled aerodynamic acceleration a aero,mod based on the NRLMSISE-00 model, where S aero,mod is a diagonal matrix, the radiation pressure acceleration a rp from equation (3), and a linear trend b trend . Scaling of the modeled aerodynamic acceleration is necessary to compensate for biases in the thermosphere model, and removing the trend reduces the part of the bias unaffected by the temperature. The scaling matrix S aero,mod and the trend b trend are fitted to a meas in the intervals from 2007-01-01 00:00 UTC to 2007-01-14 21:20 UTC and 2007-01-24 03:33 UTC to 2007-02-01 00:00 UTC to fitting to the thermally-induced bias variations in-between. Then, we fit the model for the thermally-induced bias variations in equation (9) to a res to obtain the temperature sensitivity coefficients S U and S V and the temperature delay parameters k U and k V . The co-estimation of a linear trend ensures that the other parameters fit the temperature swing instead of the mean temperature. We illustrate the results in Figure 4, in which the left and center panels show the fit to a res,x and a res,y and the right panel illustrates the measured and modeled temperatures, T and U, respectively.
In the center panel, we see that the acceleration in the y-direction is most sensitive to temperature variations, showing a peak of À800 nmÁs À2 in response to a temperature swing of about 4.5°C. Further, the measured temperature (red curve) does not fit the residual acceleration (blue curve, which is behind the yellow curve). In contrast, the model temperature (yellow curve) fits almost perfectly, demonstrating the need to model the temperature delay as described in Section 2.1.2. The acceleration in the x-direction shows a response of 30 nmÁs À2 to the same temperature swing. Even though the acceleration in the x-direction is much less sensitive to temperature variations, we can still see that the model temperature fits better than the measured temperature. The acceleration in the z-direction (not shown) has an almost negligible temperature sensitivity of a few tenths of nmÁs À2 .
We analyzed several similar temperature swings in the same way, and the results combined to obtain the values reported in Table 3. We modeled the thermally-induced bias variations with two temperatures for GRACE B to get a better fit. The need for two temperatures suggests that the heat is transferred through two paths. However, when summing the coefficients for GRACE B, e.g., S Uy + S Vy = 69.3 nmÁs À2 ÁK À1 + 65.0 nmÁs À2 ÁK À1 = 134.3 nmÁs À2 ÁK À1 , we obtain qualitatively similar values as for GRACE A, which is not surprising as the satellites and the instruments are identical.
Finally, we point out that the model for the thermallyinduced bias variations acts as a lowpass filter on the temperature data. Even after deactivating the accelerometer's thermal control on the GRACE satellites in April 2011, the modeled bias variations at the orbital period have a tiny amplitude of only a few tens of pmÁs À2 , which is negligible. Table 4 summarizes the scale factors and their formal standard deviations determined with the stacked normal matrices approach for all satellites. Since the unconstrained estimation of daily scale factors for GRACE A has a stable behavior over time (not shown), we applied one set of constant scale factors for the entire lifetime. For GRACE B, we observed a small drop in the time series of daily scale factors after the accelerometer's thermal control deactivation (not shown). Therefore, we estimated two sets of constant scale factors for this satellite.
The scale factors' formal standard deviations are all smaller than 10 À3 , indicating a high estimation precision. Considering that the scale factors and the aerodynamic coefficient vectors in the x-direction both have a scaling effect on the density observations, we conclude that the scale factors' formal standard deviations are negligible compared to the uncertainty of at least a few percent in the aerodynamic coefficient vector 0.83 ± 2 Â 10 À5 0.94 ± 2 Â 10 À5 0.95 ± 2 Â 10 À5 0.93 ± 3 Â 10 À5 1.05 ± 2 Â 10 À4 S Y 0.85 ± 4 Â 10 À4 0.96 ± 4 Â 10 À4 0.97 ± 6 Â 10 À4 0.96 ± 6 Â 10 À4 0.94 ± 9 Â 10 À4 S Y -0.93 ± 1 Â 10 À4 0.95 ± 5 Â 10 À4 0.95 ± 9 Â 10 À4 0.98 ± 7 Â 10 À4 C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 (Mehta et al., 2022. Thence, we rounded the scale factors to two digits. For GRACE A, the standard deviations of the scale factors are 2 Â 10 À5 , 4 Â 10 À4 , and 1 Â 10 À4 for S x , S y , and S z , respectively, which indicates that especially in the x-direction, the scale factors are well-determined with the POD-based calibration. We obtained similar standard deviations for GRACE B and CHAMP. For GRACE C, the uncertainty in the scale factor calibration is substantially larger, with standard deviations of 2 Â 10 À4 , 9 Â 10 À4 , and 7 Â 10 À4 for S x , S y , and S z , respectively. The worse results are due to the shorter dataset and the small non-gravitational acceleration signal encountered at about 520 km altitude, the highest of all satellites considered in this paper (cf. Fig. 6, bottom panel). We expect to improve these results when more data at lower altitudes and higher solar activity becomes available. After determining the scale factors, the next step in the accelerometer data calibration is the estimation of the biases. We illustrate the results of the bias estimation for GRACE A in Figure 5. The blue curve shows the unconstrained estimates of the daily biases without modeling the thermally-induced bias variations. As expected, the accelerometer biases have large offsets on the order of 1 for the x-and z-directions and 30 in the y-direction (Touboul et al., 2016). On top of the offsets, the biases show an exponential behavior until mid-2010, likely due to instrument aging. In mid-2010, we observed a step caused by a 5°C drop in the sensor unit's temperature due to a change in the accelerometer thermal control. In April 2011, the accelerometer's thermal control was entirely deactivated, leading to temperature variations up to 10°C, which has a significant impact on the bias estimates (cf. also Fig. 4).
The red curves show the unconstrained estimates of the daily biases when modeling the thermally-induced bias variations, noting that the red curves are covered by the yellow/ purple curves when they are not visible. The large offsets to the unconstrained estimates without modeling the thermallyinduced bias variations (blue curves) are due to using absolute temperatures in equation (9). The offsets are compensated by the daily biases and, thus, are of no concern. More importantly, when modeling the thermally-induced bias variations, the biases are much less affected by temperature variations, except for the wiggle in 2013-2014, where the GRACE A accelerometer temperature was unavailable. After experimenting, we found that using the GRACE B temperature was the best solution. The RMS in the bottom panel is improved, particularly from April 2011 onward. This reduction indicates that the calibration parameters are more accurate when modeling the thermallyinduced bias variations (red dots are behind the yellow dots).
The yellow curve results from applying a constraint on the z-bias, suppressing artificial bias variations of about 200 nmÁs À2 in that component. This suppression reduces the number of outliers in the x-and z-components and removes a slight artificial variation in the x-bias with an amplitude of 1 nmÁs À2 as shown in the zoom box in the top panel.
The purple line is only relevant for the y-bias, which we adjusted as described in Section 2.1.3. The most significant adjustments have a size of 100 nmÁs À2 , which we highlight in the zoom box in the second panel from the top. We can also see that the purple line is much smoother than the yellow, demonstrating improved precision. After data gaps, the purple line shows a spike. The spikes are due to the imperfect C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 modeling of the increasing accelerometer sensor temperature after the instrument is switched on, noting that the y-axis is most sensitive to temperature (cf. Table 3). In that case, the heat source is the instrument's electronics, whereas the heat source is external to the instrument otherwise, which explains the poor modeling of that feature. We provide the crosswind observations with a flag that marks the affected periods as invalid.
We obtained very similar results for the GRACE B satellite. We also modeled the thermally-induced bias variations for the CHAMP satellite, though they had a much smaller impact than for the GRACE A satellite. In the case of the GRACE C satellite, the accelerometer's thermal control improved a lot, eliminating the need to model thermally-induced bias variations. Further, we did not apply z-bias constraints for the GRACE C satellite because they did not significantly affect the already smooth x-bias. The appendix provides figures of the estimated biases of the CHAMP, GRACE B, and GRACE C satellites.

Radiation pressure modeling
The radiation pressure acceleration is significant compared to the aerodynamic acceleration at altitudes above 450 km during low solar activity. Figure 6 compares the size of the radiation pressure acceleration to the size of the aerodynamic acceleration, where the RMS calculated within a one-month time window indicates the size. For the along-track direction, we see that the size of the radiation pressure acceleration is about 100% of the aerodynamic acceleration in 2007-2009 for GRACE A and even 200% in 2018-2020 for GRACE C. Suppose that radiation pressure modeling has an uncertainty of 5%, which is a realistic assumption. Since the neutral mass density observations are approximately proportional to the along-track acceleration, the resulting uncertainty in the neutral mass density observations will be 5% for GRACE A and 10% for GRACE C. The situation is much more difficult for the cross-track accelerations, where the radiation pressure acceleration is 20 times larger than the aerodynamic acceleration for the GRACE A and GRACE C satellites. In this context, we point out that lift causes the most significant fraction of the cross-track aerodynamic acceleration, and only a small fraction is due to crosswind. Thus, accurately modeling the radiation pressure is crucial to obtain usable crosswind observations for the GRACE A and GRACE C satellites. The radiation pressure acceleration modeling is less critical for the CHAMP satellite because the aerodynamic acceleration is larger throughout the mission lifetime.
The CHAMP, GRACE, and GRACE-FO density and crosswind observations presented in this paper are available on the FTP ftp://thermosphere.tudelft.nl, designated as version 2 (V2). On the same FTP, we also provide the previous version presented by March et al. (2021), designated as version 1 (V1). We implemented two significant upgrades in V2 to improve the radiation pressure modeling compared to V1. First, we switched from commonly used panel models to augmented Fig. 6. Size of the radiation pressure acceleration relative to the aerodynamic acceleration, where the size is measured by the RMS of the acceleration within a sliding one-month window. The curves show the ratio of the RMS of the radiation pressure acceleration over that of the aerodynamic acceleration in the along-track (top) and cross-track direction (center). The bottom panel shows the altitude evolution of the satellites and the solar activity as indicated by the F 10.7 solar radio flux index.
C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 detailed geometry models described in Section 2.2 and finetuned the absorption and reflection coefficients of the solar array (c a = 0.65 ? 0.72, c d = 0.30 ? 0.05, c s = 0.05?0.23) and the nadir panel (c a = 0.12 ? 0.12, c d = 0.20 ? 0.06, c s = 0.68 ? 0.82). We verified these changes by inspecting photos of the satellites, which confirmed that the surfaces were indeed more specular reflecting. Second, we model the radiation pressure due to the satellite thermal emission as described in Section 2.4.2, which is a new feature of the V2 dataset. We will elaborate on these changes and their effects in another paper and only present the total effect of the changes here. Figure 7 shows the GRACE A radiation pressure acceleration in the along-track and cross-track directions as modeled in V1 and V2. The magnitude of the radiation pressure acceleration is about 30 nmÁs À2 in the along-track direction and 50 nmÁs À2 in the cross-track direction. The periodic behavior results from the beta angle progression, i.e., the angle between the orbital plane and the sun's direction, which is 360°in about 11 months. The differences between V1 and V2 amount to several nmÁs À2 in both directions, which is particularly significant for the cross-track direction. Table 5 reports the RMS of the radiation pressure acceleration for the contributions from solar radiation pressure, Earth's albedo, Earth's infrared radiation, and the satellite's thermal emission. The difference between V1 and V2 radiation pressure acceleration a rp is 1.9 nmÁs À2 and 4.4 nmÁs À2 in the along-track and cross-track directions, respectively. These differences are 6% and 9% of the radiation pressure acceleration in along-track and cross-track directions, respectively. Given the relative size of the aerodynamic acceleration illustrated in Figure 6, we may expect differences of 6% in the neutral mass density observations and more than 200% in crosswind observations for GRACE during 2007-2010. Further, Table 5 demonstrates that the changes in solar radiation pressure and the newly implemented thermal emission cause the differences. This insight narrowed down the search space in fine-tuning the radiation pressure model.

Density observations
Neutral mass density observations q 1 are best compared to other observed or model densities q 2 in log space (Sutton, 2018). The mean density ratio quantifies deviations in the scale of the density observations, which could be related to incorrect scale factors in the  accelerometer data calibration, mismodeling of the aerodynamic and radiation pressure coefficient vectors, or a biases in the density observations. The standard deviation of the density ratio measures if the variability in the densities is the same. It is more convenient to express the deviation of r(q 1 , q 2 ) from unity as a percentage drðq 1 ; q 2 Þ ¼ ðrðq 1 ; q 2 Þ À 1Þ Â 100: ð39Þ Figure 8 compares the old (V1) and new (V2) neutral mass density observations of GRACE A, where we calculated the mean l(q V1 , q V2 ) and the standard deviation dr(q 1 , q 2 ) in a sliding window of the length of one orbital revolution. The mean is very close to 1 in the mission's first years but begins to deviate from 1 from 2006 onward, reaching values as low as 0.8 in 2009. The change in the radiation pressure modeling has a more considerable impact when the size of the radiation pressure acceleration approaches the size of the aerodynamic acceleration (cf. Fig. 6) and, thus, causes the deviation. The standard deviation also reflects this, which is about 10% in 2002-2003 and then gradually increases until 2009 when it reaches values up to 100%. Another cause of the large standard deviation in 2006-2010 is the improved accelerometer data preprocessing that reduces acceleration spikes due to attitude control thruster activations and other artifacts more effectively, which was not possible when producing the V1 dataset. Further, several large spikes in the mean are associated with spikes in the temperature, which may occur, e.g., when the accelerometer is power-cycled. Figure 9 shows the V1 and V2 neutral mass density observations in April 2008, when one such temperature spike occurred (black line). The V1 density observations (blue line) show an artificial variation, even dropping to negative values, which is not the case for the V2 density observations (red line). This comparison demonstrates that modeling the temperature-induced bias variations (cf. Sect. 2.1.2) improved the V2 dataset substantially.
We compare the neutral mass density observations to models in terms of the annual mean observation-model ratio in Figure 10 and annual standard deviation in Figure 11. We will not discuss the difference between the observations and the models, for which we refer to Bruinsma et al. (2018). Instead, we will focus on the difference between new (V1) and old (V2) observations and how the previously unavailable GRACE A and GRACE C observations (beyond 2011) compare to the existing datasets (until 2010).
First, we notice that the mean of the new and old CHAMP observations is practically identical. This agreement is not surprising since the newly calibrated accelerometer data is almost identical to the old version, the radiation pressure acceleration is small compared to the aerodynamic acceleration (cf. Fig. 6) and tends to average out to zero over long periods, and the aerodynamic model is the same for both versions. The standard deviation of the new observations in 2005-2009 is about 0.5% smaller than that of the old version. For the NRLMSISE-00 model, we notice that the standard deviation of the new observations in 2001-2004 is higher than that of the old version. However, this feature is not evident in the comparison to the DTM-2020 and TIE-GCM models.
We find the best insights for the GRACE A satellite, for which we have old and new observations for 2002-2009. We see a decrease of the mean density ratio from 2006 to 2009 of about 10% for the DTM-2020 model and more than 20% for the NRLMSISE-00 and TIE-GCM models (Fig. 10, middle panel). We attribute this mainly to the models overestimating density as the solar activity decreased towards the deep solar minimum in 2008/2009. However, the decrease of the mean density ratio is a few percent smaller for the new observations, resulting in a more constant time series. This feature is most evident for the DTM-2020 model, where the decrease of the mean density ratio from 2005 to 2009 is about 6% smaller for the new density observations than the old version. We interpret this as a clear improvement in the radiation pressure modeling. From 2011 onward, the mean density ratio of the new observations and the DTM-2020 model varies between 1.0 and 1.2, which agrees well with the early years of the GRACE A satellite. The NRLMSISE-00 model shows a similar agreement, though the mean density ratio is lower, varying from 0.7 to 0.9 in 2011-2018 and from 0.8 to 0.85 in 2002-2005. The standard deviation of the GRACE A observation-model density ratios (Fig. 11, middle panel) provides the strongest evidence of the improvements in the new observations. The standard deviation of the old observations increased from 2002 to 2009 by several tens of percent for all models, whereas the increase is 10-15% smaller for the new observations. In the case of the DTM-2020 model, the peak standard deviation in 2009 is 30%, which is only marginally larger than the standard deviation in [2004][2005][2006][2007], which is about 25%. Beyond 2010, the standard deviation for the DTM-2020 decreases to the lowest value of about 18%. Even though the standard deviation is 5% larger for the TIE-GCM model, the time series show the same variations. The standard deviation resembles the radiation pressure and aerodynamic acceleration ratio presented in Figure 6 (top panel, blue curve). This resemblance confirms that the lower standard deviation of the new observations is due to   C. Siemes et al.: J. Space Weather Space Clim. 2023, 13, 16 the advanced radiation pressure modeling. However, reducing artifacts in the new preprocessing likely contributed to the improvement.
Since the GRACE C density observations are new, we compare them to the GRACE A observations focusing on the continuity from GRACE A to GRACE C, noting that the observation gap is only seven months. In the case of the DTM-2020 and the NRLMSISE-00 models, the mean density ratio connects well from GRACE A to GRACE C: The DTM-2020 model shows nearly identical values in 2017 and 2018, and the decrease in the NRLMSISE-00 mean density ratio from 2016 to 2017 continues to 2018. This agreement indicates that there are no significant offsets between GRACE A and GRACE C in the mean density ratio. We show the mean density ratio for the TIE-GCM model for completeness, noting that most GRACE C observations are above the upper model boundary and, therefore, are difficult to interpret.
We discussed above that the standard deviation is sensitive to the radiation pressure and aerodynamic acceleration ratio. There is a large discontinuity in the ratio of GRACE A and GRACE C because of the vast different altitudes near the re-entry of GRACE A and the launch of GRACE C (cf. Fig. 6, bottom panel). Therefore, we compare the GRACE C standard deviation to that of GRACE A from 2008-2009 since the radiation pressure and aerodynamic acceleration ratios are closest then (cf. Fig. 6, top panel) due to comparable altitude and solar activity. The GRACE C standard deviation is approximately 25-31% in 2018-2021 (Fig. 11, right panel). These values are consistent with the GRACE A standard deviation for DTM-2020DTM- , which was 30% in 2008DTM- -2009. We observe a similar consistency for the NRLMSISE-00 and TIE-GCM models. In this context, we point out that the GRACE C accelerometer data contains substantially fewer artifacts and has a much lower measurement noise level than the GRACE A accelerometer data. Thus, we can only explain the consistency of the GRACE A and GRACE C standard deviation by the accuracy of the radiation pressure model. Therefore, we attribute the improvement observed for GRACE A mostly to the advanced radiation pressure model.

Crosswind observations
When comparing the crosswind observations with the TIE-GCM model, we project the model wind vector v mod onto the direction of the observed crosswind u obs (unit vector) to obtain the model wind in the direction of the observed wind. Figure 12 shows the GRACE A observed crosswind speed in July 2013 in the top panel, the TIE-GCM model wind in the direction of the observed wind in the center panel, and the difference between the observed and model wind speed in the bottom panel. The data is organized by time on the horizontal axis and argument of latitude on the vertical axis. The latter is the angle within the orbital plane, measured from the ascending equator crossing (Montenbruck & Gill, 2012). Each "column" in this plot represents one orbital revolution, starting from the ascending equator crossing at 0°argument of latitude, through the northernmost point in the orbit at 90°argument of latitude, the descending equator crossing at 180°argument of latitude, the southernmost point in the orbit at 270°argument of latitude, until the next ascending equator crossing at 360°= 0°argument of latitude.
We observe high wind speeds up to 800 mÁs À1 near the northern (90°) and southernmost (270°) points in orbit, which are associated with the auroral oval (Lühr et al., 2007). Outside of the polar regions, the observed wind speed ranges from 0 mÁs À1 to 200 mÁs À1 . The almost horizontal stripe of fast wind speeds, starting at 180°argument of latitude on 1 July 2013 and lasting until 10 July, is an artifact of the eclipse entry, stemming inaccurate modeling of Earth's shadow. The pattern of the TIE-GCM model wind is very similar to the observed wind. In the polar regions, we see primarily positive model wind speeds, which means that the direction of the model and observed crosswind is the same, although the magnitude of the model wind is smaller than the observed one. Near the equator crossings (0°and 180°), the model wind is much smaller than the observed wind, which indicates that the inherent noise in the crosswind observations is about 200 mÁs À1 . However, we will demonstrate in the following that the noise in the crosswind observations is far from constant. Figures 13 and 14 provide long-term comparisons of the GRACE A and CHAMP observed crosswinds to the TIE-GCM model wind. The top panels show the mean crosswind speed, the center panels show the standard deviation, and the bottom panels show Pearson's correlation coefficient, all calculated within a sliding window of one day. The GRACE A observed crosswind mean (Fig. 13, top panel, blue curve) shows a large variability during the mission lifetime. , the observed crosswind mean agrees well with the model one, although it is slightly larger by a few tens of mÁs À1 as visible in the zoom on 2015-2016. This bias in the observed crosswind mean reflects the noise in the crosswind observations, which cannot assume negative values because the crosswind observations are positive by definition. The observed crosswind mean in 2004-2011 is unrealistically large compared to model one. The cause is remaining errors in the accelerometer data calibration, radiation pressure modeling errors, and cross-track acceleration measurement noise, which play a more considerable role when the radiation pressure to aerodynamic acceleration ratio is significant. Inspecting the ratio illustrated in Figure 6 (top panel, blue curve), we find that a ratio lower than 0.2 is necessary to derive meaningful GRACE A crosswind observations. Finally, the mean's magnitude follows a cyclic pattern of about 5.5 months, most apparent in 2012-2018. This pattern is caused by the beta angle progression of roughly 360°in 11 months, noting that the thermospheric wind is strongly organized by local time (Lühr et al., 2007).
The standard deviation of the GRACE A crosswind observations shows the same features as the mean, namely a good agreement with the TIE-GCM model in 2002-2003 and 2012-2018, unrealistically large values in 2004-2011, significant spikes after data gaps, and the cyclic pattern depending on the beta angle. The only difference to the mean is that the The correlation between the GRACE A and TIE-GCM crosswinds varies between À0.2 and 0.8. When inspecting the zoom box, we notice that the correlation is low when the mean and the standard deviation are small. When that happens, it is impossible to reliably quantify the correlation, meaning we should not conclude those values. With that in mind, we state that the correlation between observed and model crosswinds is about 0.5 in 2002 and continuously decreases to 0.0 in 2007, indicating that observed crosswinds are essentially random. The decreasing aerodynamic acceleration signal ultimately caused the decrease in correlation. The zero correlation persisted until 2009 when the solar activity reached a minimum (cf. Fig. 6, bottom panel). In 2010-2011, the correlation increased to 0.3, and from 2012 onward to 0.6.
In summary, the GRACE A crosswind observations show a good agreement with the TIE-GCM model in 2002-2003 and 2012-2018. This agreement is a remarkable result considering the processing complexity required to mitigate the adverse effects of the deactivated accelerometer thermal control, noting that the cross-track acceleration measurements are most sensitive to temperature variations.
The CHAMP crosswind observations, compared to the TIE-GCM in Figure 14, behave more consistently than the GRACE A crosswind observations. For CHAMP, the mean, standard deviation, and correlation also show a cyclic behavior with a period of 4.5 months related to the beta angle progression of roughly 360°in 9 months. The mean of the CHAMP crosswind observations follows the same pattern as the TIE-GCM winds. However, the observed crosswind mean is about 100 mÁs À1 , whereas the model crosswind is only about 50 mÁs À1 on average. The observed and model crosswinds' standard deviations are approximately 100 mÁs À1 and agree reasonably well. The correlation is 0.5-0.6 throughout the CHAMP mission lifetime. These consistently good results are a consequence of the consistently large signal-to-noise ratio in the CHAMP aerodynamic accelerations.

Conclusions and outlook
We aim to provide the scientific community with the most complete, consistent, and accurate database of thermosphere density and crosswind observations derived from accelerometer measurements. The results presented here have filled the remaining gaps in our database, which is an important milestone in achieving this goal. In particular, we have produced the first GRACE-FO neutral mass density dataset extended the GRACE neutral mass density observations from 2009 until decommissioning in 2017 and generated a new GRACE crosswind dataset. The latter is published only for 2002-2003 and October 2011 to October 2017 due to teh signal-to-noise ratio.
When producing the results of this study, we used a gas-surface interaction model with a constant value for the energy accommodation as described by March et al. (2021). This disputable choice affects the mean of the density observations as described by Mehta et al. (2022), with the largest effects at the highest altitude of 500 km during low solar activity. Incidentally, radiation pressure modeling is essential for deriving neutral mass density observations under the same conditions. Recent investigations by Bernstein et al. (2020) and Bernstein & Pilinski (2022) demonstrate that we need more accurate knowledge of the gas-surface interaction. The results of this study open up new possibilities for investigating gas-surface interaction modeling, which will benefit from the advances in radiation pressure modeling. After thoroughly investigating the model parameters and the resulting consistency between missions, we plan to employ a variable gas-surface interaction model. This upgrade will further improve the accuracy of the observations, especially those of the GRACE and GRACE-FO satellites, which have been orbiting near 500 km altitude for a significant fraction of the mission lifetime.
The advances in radiation pressure modeling may also prove helpful for gravity field modeling. The accelerometer on the GRACE B satellite was switched off at the end of the mission because of battery issues, necessitating the use of the GRACE A accelerometer measurements for both satellites . Since the satellites flew in nearly identical orbits separated by approximately 220 km and in different attitudes, they experienced slightly different non-gravitational accelerations. Applying several corrections to the GRACE A accelerometer measurements to account for that difference was referred to as "transplanting" the GRACE A accelerometer measurements to the GRACE B satellite. In the case of the GRACE-FO mission, the GRACE D satellite suffered from an anomaly shortly after launch, after which the accelerometer performed much worse than expected , requiring the same "transplantation" approach. Advanced radiation pressure modeling could increase the transplant's accuracy. Similarly, it would be interesting to test whether the accelerometer calibration and the model for the thermally-induced bias variations can improve the retrieval of gravity field models in the case of the GRACE mission.
Although we have made considerable progress in radiation pressure modeling, there is still room for improvement. For example, we currently neglect the conversion of absorbed radiation into electricity by solar arrays. While implementing this process is straightforward, it requires non-trivial fine-tuning of the absorption and reflection coefficients, as well as the thermal capacitance and conductivity parameters, which we have optimized without considering this effect. This fine-tuning is underway and will be implemented in the next release of the neutral mass density and crosswind observations. Finally, we plan to perform a comprehensive uncertainty quantification that takes into account all relevant sources of error and noise: accelerometer measurement noise, calibration errors in the accelerometer data, errors in the radiation pressure and aerodynamic models, including the gas-surface interaction model and thermosphere models providing atmospheric composition, temperature, and in-track wind . The uncertainty quantification will facilitate the assimilation of neutral mass density observations into thermospheric models.  Table A1. CHAMP thermal radiation panel model. We used an internal heat generation P gen = 30 W and a heat capacity C body = 100 kJÁK À1 .
Panel A j (m 2 ) n x,j (-) n y,j (-) n z,j (-) c vis,a (-) c ir,a (-) C j (JÁK À1 ) k j (WÁK  Table A3. GRACE-FO thermal radiation panel model. The absorption coefficients of the panels "Port (outer)" and "Starboard (outer)" are taken from the Swarm mission documentation (Siemes, 2020) because the solar arrays on these panels resemble those of the Swarm satellites rather than those of the GRACE satellites on photos. We used an internal heat generation P gen = 55 W and a heat capacity C body = 100 kJÁK À1 .