Assessment of CESE-HLLD ambient solar wind model results using multipoint observation

For a three-dimensional magnetohydrodynamics solar wind model, it is necessary to carry out assessment studies to reveal its ability and limitation. In this paper, the ambient solar wind results of year 2008 generated by the CESE-HLLD 3D MHD model are compared with multipoint in-situ measurements during the late declining phase of solar cycle 23. The near-ecliptic results are assessed both quantitatively and qualitatively by comparing with in-situ data obtained at the L1 point and by the twin STEREO spacecraft. The assessment reveals the model’s ability in reproducing the time series and statistical characteristics of solar wind parameters, and in catching the change of interplanetary magnetic field polarity and the occurrence of the stream interaction regions. We find that the two-stream structure observed near the ecliptic plane is reproduced, but the differences among observations at L1 and the twin STEREO spacecraft are not caught by the model. The latitudinal variation of the results is assessed by comparing with the Ulysses observation. The characters of variation in different latitudinal ranges are duplicated by the model, but biases of the results are seen, and the boundary layers between fast and slow solar wind are sometimes thicker than observation.


Introduction
Three-dimensional (3D) magnetohydrodynamics (MHD) ambient solar wind models are critical tools for space weather forecasting (Feng et al., 2011a(Feng et al., , 2013Wu & Dryer, 2015;Gombosi et al., 2018;Feng, 2020). The first 3D MHD solar wind model transitioned into operations is the Wang-Sheeley-Arge-ENLIL (WSA-ENLIL) model (Odstrcil, 2003), which is responsible for providing 1-4 day solar wind condition forecasts upstream of the Earth, and has been updated to its 2.0 version recently (Note that version number 2.0 of the WSA-ENLIL model is different from the individual version numbers of WSA and ENLIL models). Meanwhile, 3D MHD solar wind models with associated frameworks of physical components and software modules, including the magnetohydrodynamics around a sphere (MAS) (Riley et al., 2012) and space weather modeling framework (SWMF) (Tóth et al., 2012) model, have been installed at the Community Coordinate modeling Center to get them ready for the transition from research to operation (Jian et al., 2015). Targeted at space weather forecasting, the "European Heliospheric FORecasting Information Asset" (EUHFORIA) is developed with the capability to provide MHD modeling of the inner heliosphere (Pomoell & Poedts, 2018). The space-weather-forecast-usable system anchored by numerical operations and observations (SUSANOO) MHD solar wind model is routinely providing solar wind forecast at Earth, Venus and Mars on its website (Shiota et al., 2014). Once the ambient solar wind structures are obtained, these models can be further utilized to forecast the coronal mass ejections (CMEs) by introducing proper disturbances (e.g., Jin et al., 2017;Zhou & Feng, 2017;Török et al., 2018). It can be expected that the ability of solar wind models will be enhanced incrementally with continuous feedback between research and operation, a process that has been demonstrated in the development of numerical meteorology prediction models (Bauer et al., 2015).
To make a model useful and its forecast valuable, it is necessary to carry out assessment studies to understand the capabilities and limitations of the model, and to inspire modelers' path to eliminating or reducing the model's shortcomings (Spence et al., 2004;MacNeice, 2018). In such assessment, various metrics are calculated to quantify the difference between modeled results and observed data, and general trends of deviations are analyzed by inspecting the visual and statistical profiles (e.g., MacNeice et al., 2018;Wold et al., 2018). For the ambient solar wind models, such assessments are usually carried out utilizing near-earth in-situ observations (e.g., Lee et al., 2009;MacNeice, 2009a;Owens et al., 2013;Gressl et al., 2014;Jian et al., 2015;Reiss et al., 2016;Shen et al., 2018).
In our previous study (Li & Feng (2018), referring as 2018 Paper hereafter), the ambient solar wind simulation results of year 2008 were obtained by the Conservation Element Solution Element-Harten-Lax-van Leer Discontinuity (CESE-HLLD) 3D MHD ambient solar wind model in a data-driven way. The results were assessed visually and quantitatively by comparing with imaging observations of the corona and in-situ data at the Sun-Earth L1 point. The assessment results indicated that the model successfully reproduced the large-scale structures of the solar wind near Earth. The solar wind speed was better reproduced than all other parameters with a correlation coefficient of 0.7. The success rate of stream interaction region (SIR) reproduction was 74.19%, and the polarity of the interplanetary magnetic field (IMF) was correctly reproduced for 85.73% of the data points. However, the strength of the IMF and the temperature of the fast solar wind stream were underestimated.
Since inflow conditions for magnetiospheric models are usually determined by solar wind parameters at the L1 point (Eastwood et al., 2017), assessments of solar wind results at the L1 point can reflect the ability and usefulness of solar wind models for terrestrial space weather prediction. In-situ data at the L1 point have been continuously archived for tens of years, and are reliably available in the foreseeable future (Schrijver et al., 2015). Improvements of the model's ability can be objectively tracked by metrics calculated against L1 point data. However, to validate a 3D model covering the 4p domain of the heliosphere more convincingly, it is necessary to assess its results utilizing multipoint spacecraft data. Such multipoint assessment will help to further verify that the solar wind is correctly described at positions other than the L1 point, and provide more comprehensive information about the model's strength and weakness in reproducing the global structure of ambient solar wind in the heliosphere (e.g., Pahud et al., 2012;Broiles et al., 2013;Shiota et al., 2014;Wiengarten et al., 2014;Jian et al., 2016;Merkin et al., 2016).
During year 2008, the heliographic longitudinal separation of the STEREO-A(STA) and STEREO-B(STB) spacecraft increases from about 45°to 90°. The heliographic latitudinal separation between the twin spacecraft varies between about 0°and 10°due to the inclination of their orbital planes with respect to the ecliptic plane. The Earth is always located in the middle of STA and STB. The latitudinal separation among the three points and the temporal evolution of the solar wind structure during its propagation make the in-situ solar wind features exhibit differences among the three points (Simunac et al., 2009;Wiengarten et al., 2014). At the same time, despite approaching the end of its life, the Ulysses spacecraft still generated scientific data during year 2008 at the northern heliosphere. Although data gaps appear occasionally, in-situ observation can still cover the whole year and be utilized to evaluate the overall quality of the simulated results at latitudes away from the ecliptic plane.
In this paper, a multipoint assessment of the solar wind simulation results of year 2008 obtained by the CESE-HLLD 3D MHD model is carried out. The paper is organized as follows: In Section 2, we describe how the observed data and simulated results are obtained. In Section 3, the near-ecliptic simulated results are assessed both qualitatively and quantitatively by comparing with in-situ data of STB, L1 and STA. In Section 4, the latitudinal variation of the modeled results is assessed by utilizing mapped-back in-situ measurement of the Ulysses spacecraft. In Section 5, we make some concluding remarks and discussions.
2 Observational data and model overview

Observational data
The in-situ data of STA, STB and Ulysses are acquired via the Coordinate Data Analysis Web (CDAWeb) Service 1 . For the STEREO twin spacecraft, the solar wind plasma and magnetic field in-situ measurements are obtained by the PLASTIC (Galvin et al., 2008) and IMPACT-MAG (Luhmann et al., 2008) instrument respectively. Hourly averaged STEREO data generated by NASA's Space Physics Data Facility are used to assess the simulated results in this study. For the Ulysses spacecraft, the solar wind plasma and magnetic field in-situ measurements are obtained by SWOOP (Bame et al., 1992) and VHM/ FGM (Balogh et al., 1992) instrument respectively. The temporal resolution of the Ulysses data is also 1-h. The hourly averaged in-situ data at the Sun-Earth L1 point are provided by the OMNI database 2 (King & Papitashvili, 2005).

Model overview
The ambient solar wind simulation results are generated by the CESE-HLLD solar wind model in a data-driven way (Mikić et al., 1999;Feng et al., 2012Feng et al., , 2015Feng et al., , 2017. Continuously updated synoptic maps of the photospheric magnetic field are input into the model to drive the evolution of the corona and the inner-heliosphere. Comparing with steady-state ambient solar wind simulation, which usually produces a time-independent global solution to represent the solar wind structure of one Carrington Rotation (CR), the data-driven simulation generates time-dependent solutions, and therefore may better reflect the evolving nature of the ambient solar wind. The simulated results presented in this paper are obtained with 6-h-cadence radial magnetic field synoptic maps of the photosphere, which are provided by the Global Oscillation Network Group (GONG) 3 . The projected normal characteristic boundary treatment method (Yang et al., 2012;Feng et al., 2015) is used to couple observational input with information inside the computational domain. The simulation domain is divided into a coronal region and an inner-heliospheric region. The coronal region is covered by the Yin-Yang spherical grid, while the inner-heliospheric region is covered by the Cartesian adaptive mesh refinement (AMR) grid. The ideal MHD equations are solved numerically by the conservation element and solution element (CESE) method in the coronal region, and by the HLLD approximate Riemman solver with second-order Monotone Upstream-centered Schemes for Conservation Laws (MUSCL) reconstruction in the inner-heliospheric region. The corona-heliosphere coupling is realized by employing an auxiliary coupling layer.
A volume heating term Q e is added for the energy equation in the coronal region. This volume heating term is responsible for generating fast and slow solar wind flow patterns according to the coronal magnetic field topology (Feng et al., 2010(Feng et al., , 2011b. It is given as: where The values of Q 0 1 and Q 0 2 used in obtaining the simulation results of this paper are 1.60 Â 10 À8 Jm À3 s À1 and 2.14 Â 10 À8 Jm À3 s À1 respectively. L 1 and L 2 are set to be 1 R s and 0.8 R s . C a = C 0 a /800 is a normalized profile factor closely related to the widely used Wang-Sheeley-Arge (WSA) model, with The specific form of the WSA formula used here is adopted from McGregor et al. (2011). Here, f s stands for the expansion factor, and h b represents the minimum angular separation between an open magnetic field foot point and its nearest coronal hole boundary. f s and h b on each open field line are determined by magnetic field topology of the MHD solution, and updated every 24 h to include the effects of magnetic field evolution. Values of the WSA formula's free parameters used here are v s = 240, v f = 675, / = 1.0 and b = 1.2. These free parameters, as well as other free parameters in the heating source term including Q 0 1 and Q 0 2 , are adjusted by trial-and-error. The initial set of these free parameters are based on our previous experience and their values in the references. The adjustments are carried out by comparing results of a selected CR with coronal remote sensing observations (e.g., those analyzed in the 2018 paper) and L1 point in-situ solar wind data. The parameters obtained are used to carry out long-term simulation. To be specific, we select CR 2068 to determine the free parameters used for year 2008. For more detailed descriptions of the CESE-HLLD model and the process of obtaining the simulated results, one may refer to the 2018 paper. Note that except obtaining in-situ results for the twin STEREO spacecraft and Ulysses spacecraft, the results presented in this paper are identical to those in the 2018 paper, i.e., the same model configurations and input data are used.
Simulated solar wind parameters at the L1 point and the position of the twin STEREO spacecraft are interpolated from time-evolving global solutions of the model. During year 2008, the heliocentric distance of the Ulysses spacecraft increased from about 2 AU to 4 AU. However, the model's simulation domain only covers a cube region with a size of [À250 R s , 250 R s ] 3 . To compare the model's results with in-situ data obtained by Ulysses, in-situ solar wind measurements are mapped back to 1 AU under the assumption that the radial speed remains constant for the entire mapping interval (Neugebauer et al., 1998). Density data are scaled assuming the mass flux is conserved. The mapped-back position of the in-situ measurement at 1 AU is used for the interpolation. The time of interpolation is earlier than the time when the data is measured. The amount of time advance equals to the propagation time of the solar wind from 1 AU to the position where it is measured.
3 Assessment by STEREO and L1 data 3.1 Visual and statistical assessment Comparisons between simulated and observed solar wind speed are shown in Figure 1, where the data of STB, L1 and STA are plotted from top to bottom in the first three rows. This sequence is arranged for the convenience of analysis, as certain co-rotating solar wind structure would be first encountered by STB, then L1 and finally STA. In the fourth and fifth rows, the heliolatitude and heliolongitude in the heliocentric Carrington coordinate system are plotted for STB, L1 and STA to illustrate their positions.
Although a few interplanetary coronal mass ejections (ICMEs) are identified during year 2008 at L1 (Richardson & Cane, 2010), STA and STB , the maximum speed of these CMEs is no higher than 500 km/s, as indicated by the ICME lists for L1 4 and the twin STEREO spacecraft 5 (ICMEs mentioned in the following of this paper are all referred form these two lists). After examining the flow pattern during the periods of ICMEs' influence, we conclude that these ICMEs are weak and have no major influence on the variation trend of the solar wind speed. This is a natural result of the low solar activity level during year 2008, and provides an ideal condition for assessment of ambient solar wind results.
In Figure 1, it can be seen that the most pronounced feature in the observed time series is the two prominent fast streams (FSs) that appear recurrently in every CR. The analysis of the 2018 paper concluded that the first and second recurrent FS were associated with an isolated low latitude coronal hole (CH) and the low latitude extension of the south polar CH respectively. Overall, the model replicates the recurrent FSs at STB, L1 and STA, though the timing and shape have some discrepancies.
Some obvious errors of the modeled results can be found visually. At the beginning of CR 2069, an observed FS is missed by the model at STB and L1. At STA, this FS is absent in both observation and modeled results. A similar situation is found at the beginning of CR 2070, but the FS is missed for STB only. At the end of 2073, an FS is falsely created for L1 and STA. For STB, no well-defined FS can be found in simulated results and observed data. At the end of CR 2074, the model creates an FS not seen in the observation for STB and L1, and almost misses the second recurrent FS for STB.
To quantitatively assess the simulated results of the continuous solar wind parameters, two statistic metrics, the correlation coefficient (CC) and root mean square error (RMSE), are computed for each CR and the whole year. The CC, which reveals how well the simulated results follow the evolutionary trend of the observed data, is calculated by where Cov(X S , X O ) is the covariance between simulated results and observed data; Var[X S ] and Var[X O ] are the standard deviation of simulated results and observed data. The RMSE, which is a general measurement of the difference between the simulated results and observational data, is defined as where x S i and x O i are the ith simulated result and observed data respectively. The CC and RMSE for each CR are plotted in Figure 2, and their values for the whole year are marked at the top of each panel. Note that since lower RMSE means better quality of the simulated results, the axis direction of RMSE is reversed.
From the CC of the whole year marked in Figure 2, it is seen that the best correlation between simulated results and observed data is obtained for the parameter of speed, and the worst is obtained for magnetic field strength. For all of the four solar wind parameters, the highest CCs are obtained for STA, while the lowest CCs are obtained for STB. Meanwhile, even for STB, the CC of speed is above 0.6. From the perspective of CC, we may conclude that the best overall results are obtained for STA, and the worst are obtained for STB during 2008. However, in the perspective of RMSE, this conclusion may not hold for all parameters. From the values of RMSE of the whole year, we can see that for density and magnetic field strength, the smallest RMSEs are all obtained by STB, and the largest MSEs are obtained by STA and L1 respectively. The performance of the model is variable between CRs. For each CR, the performance of the model differs at STB, L1 and STA, and the differences are significant at some CRs. For example, at CR 2069, apparent difference between CCs of the three points can be seen for speed, density and magnetic field H. Li et al.: J. Space Weather Space Clim. 2020, 10, 44 strength. Larger differences between RMSEs can be seen between CRs 2072-2074 for speed, at CR 2071 and CR 2076 for density, at CR 2071 and 2075 for temperature, and at CR 2071 for magnetic field strength.
In Figure 2, low and negative CCs occur at CR 2066 and 2067 for density and magnetic field strength, which means that the modeled results of the two parameters do not reproduce the evolutionary trend of the observation well. To investigate the reason, in-situ data of density and magnetic field strength for CR 2066 and 2067 are plotted in Figure 3.
We first focus on the situation at L1. From Figure 1, we can see that the two recurrent FSs are reproduced for CR 2066. For the first recurrent FS, the observed one arrives around Day 32, and the simulated one arrives about 2 days later. From the density profiles in Figure 3, a density ridge can be found in the observed data around Day 31.5 and the simulated results around Day 34. The density ridge is generated by compression of the first recurrent FS to slow stream in front of it, and is a typical character of the stream interaction region (SIR). Later arrival of the simulated FS leads to displacement of the simulated density ridge from the observed one. Since the density enhancement structure itself lasts for only about 3 days, such displacement would result in anti-correlation between observed data and simulated results during some periods. For example, when the observed density starts to decrease at Day 32, the simulated density only starts to increase. For the second recurrent FS which is centered around Day 42 in the observation, it can be seen from Figure 1 that the simulated speed of the slow stream in front of it is overestimated, which results in an underestimate of the density at the arrival of this FS. The arrival time of this simulated FS is earlier than its observed counterpart, which leads to displacement of the density ridge and anti-correlation between simulated and observed density data during certain periods. One another flaw of the simulated results is the falsely created density ridge around Day 29. Displacement and false creation of density ridges can also be seen in Figure 3 at STB, but the displacement is less severe than that at the L1. The situation at STA is even better than that at STB. During CR 2066, CC of density at L1 has a negative sign and is the lowest among the three points. CC of density at STB is positive, but its value is close to zero. CC of density at STA is the highest among the three points. The information revealed by the visual profile in Figure 3 is consistent with the relative magnitude of CC at STB, L1 and STA. H. Li et al.: J. Space Weather Space Clim. 2020, 10, 44 For CR 2067, the cause of the low and negative value of density's CC is similar to that of CR 2066. For the magnetic field strength, the low and negative values of CC mainly result from displacement of magnetic field strength enhancements, which is also associated with the arrival time error of FSs. On one hand, the error of the simulated results is somewhat apparent for CR 2066 and 2067. This is also reflected by RMSEs. During CR 2066 and 2067, relatively large RMSEs of both density and magnetic field strength are obtained for L1, STA and STB. But on the other hand, as can be seen from the time series profile, the model does not fail to duplicate the basic feature of the observation.
In Figure 2, negative CCs are also observable for density, temperature and magnetic field strength in some other CRs. We have checked the time series of the solar wind parameters during these intervals, and found that the negative CCs can be attributed to the larger arrival time error as well as the shape error of the FS.

Comparison of temporally shifted data
To compare the difference among in-situ data of STB, L1 and STA, the STB and STA data are mapped to the L1 point by using temporal shifting. Such a shift removes the effect of co-rotation and radial propagation so that the differences among the data can be seen. Assuming that the propagation of the solar wind is purely radial, the shifted time (Dt) between two points located at longitude u B and u A and heliocentric distance r B and r A can be approximated by (Richardson et al., 1998;Gómez-Herrero et al., 2011): where X is the sidereal rotational angular velocity of the Sun, and v is the solar wind speed. Figure 4 presents the shifted solar wind speed. The first and second panels show the observed and simulated data respectively. The difference between STB/L1, L1/STA and STA/STB shifted speed are presented from the third to the fifth panel. In Figure 5, the CC between STA/L1, L1/STB and STA/STB shifted data are presented in the first to the third panel. The CC is calculated using a running 27-day-long time window to provide the degree of coincidence in the solar wind speed structures. The differences in the latitude and longitude among the three points are showed in the fourth and fifth panel.
We can see from Figure 4 that although the well-defined two-stream structures are observable at all of the three points, there are still notable differences between the observed speed during the whole study interval, which fluctuate between ±200 km/s. Several factors contribute to these differences. The shape and maximum speed of the fast streams are not exactly the same among the three points. The actual propagation time of the fast streams deviates from the "expected" time derived using equation (5). Some non-persistent fast streams are not sampled at all of the three points. While these differences occur during the whole study interval, the CCs between shifted observation show a tendency of less agreement among data of the three points between Day 86-178 (Interval I) and Day 317-360 (Interval II). During these two periods, the latitudinal separation among the three points is larger.
Although the comparison between observed and simulated solar speed is reasonable at each point, the differences between the speed of the three points are not well reproduced by the model. There are cases when the observed and simulated differences match. For an example, as can be seen in Figure 4, the fast stream between Day 21-29 (before the beginning of CR 2066) is broader at STA than at STB in observation, and the simulated results duplicate this feature. The observed and simulated speed difference is thus comparable during this period. However, during most of the study interval, the simulated speed differences do not follow those of the observation. From Figure 5, such inconsistency is more apparent. The variation of CC in observed data during intervals I and II are not shown in simulated CC, and change of CC between Day 240-300 is not seen in the observation. During the whole year, the modeled results are more similar among the three points than observation, as the magnitude of speed differences is lower and the CCs are higher than observation during most of the study interval. model underestimates the occurrence at values below 350 km/s. At STB and L1, the occurrence in the range of 350-650 km/s is mostly overestimated, and the occurrence above 650 km/s is slightly underestimated by the model. At STA, the underestimate and overestimate of occurrence can be found alternately between 340-600 km/s. Above 600 km/s, the model roughly reproduces the observed distribution. For the distribution of density, the single-peak shape of the observed distribution profile is reproduced by the model at STB and L1, but the occurrence below about 3 cm À3 is apparently underestimated. The occurrence above 3 cm À3 is mostly overestimated, and the extent of overestimate appears to be similar at each value bin. At STA, the problems are also underestimate of the low density occurrence and overestimate of the higher density occurrence. However, the simulated distribution has multiple local peaks, which are not seen in the observed profile. The variation range of the simulated temperature is much smaller than that of the observation at all of the three points. While the observed temperature mainly distributes between 0.1 Â 10 5 K and 2.5 Â 10 5 K, the simulated temperature rarely occurs below 0.3 Â 10 5 K or above 1.5 Â 10 5 K. The distribution profile of magnetic field strength looks like the left shift of the observed profile at all of the three points, indicating general underestimate of the magnetic field strength. Figure 7 plots the ratios of modeled to observed maximum (R(max)), mean (R(mea)) and minimum (R(min)) solar wind parameters for each CR. Following Jian et al. (2015), the ratio for the whole study interval is computed by taking the average of the ratios of each CR, and is marked at the top of each plot. This figure can provide the bias of the results in a more quantitative way. From the plots of speed, it can be seen that the observed value range of speed is well reproduced by the model. No ratio lies outside the ±20% region around value 1. The mean and minimum values are slightly overestimated in most CRs.

Distribution assessment
The maximum values are generally underestimated in most CRs, and the underestimate is more pronounced between CR 2071-2075. For the density at all of the three points, the mean values are overestimated by about 15-60% in each CR. The minimum values are mostly overestimated, especially after CR 2070 when all R(min)s are larger than 1.6. The maximum density of each CR, which is in conjunction with the density enhancement in front of the SIR, is also overestimated in most CRs. Comparison between results of the three points reveals that the overestimate for average density at STA is more severe than at L1 and STB. This may partially explain why STA has the largest CC and MSE for density among the three points at the same time. Though the evolution tendency is reproduced best for STA, the larger error of the average value cancels out this superiority and leads to the largest RMSE. The overestimate of the minimum temperature and the underestimate of the mean and maximum temperature reveal that as the simulated results have a systematic bias, their variations are also insufficient, and this has already been revealed by the distribution profiles in Figure 6. Both the maximum and mean magnetic field strength are underestimated by a factor of about 2, except for CR 2070 at the L1 point, where the maximum magnetic field is slightly overestimated. Figure 8 presents the joint distributions of observed data and simulated results in 2D space. This type of figure provides another way to compare distributions and analyze bias in different value ranges. The diagonal line in the contour marks the location of perfectly one-to-one correlation between simulated results and observed data. Above (below) this line, the model overestimates (underestimates) the solar wind parameter. It should be noted that the deviation between simulated and observed distribution may be attributed to multiple factors, including structural errors, arrival time errors and discrepancies of parameter averages. In the contours of speed, the model produces a smooth distribution around the diagonal for all of the three points, indicating that the modeled distribution reasonably correlated with the observed distribution. Nevertheless, the contours also exhibit tendencies of overestimating the observed speed in the range of 250-600 km/s. Above 600 km/s, the model has more chance to underestimate the speed at all of the three points, but for L1 and STA, the possibility of overestimate is also non-negligible. The distribution profiles of density reveal general tendencies of overestimating observed values by the model for all of the three points. The overestimate seems to be more significant for observed values between 2 and 4 cm À3 at STA. The problems reflected by Figure 8 for temperature and magnetic field strength results are similar to those revealed by Figure 6. The simulated temperature has an insufficient range of variation, and the magnetic field strength is generally underestimated by the model.
From the analysis in this subsection, we can see that the speed profile is reproduced satisfactorily by the model, and the results can still be improved. The main problem for the simulated speed is the magnitude of large-scale fluctuation is somewhat insufficient. This problem is reflected by the lack of occurrence in the lowest and highest speed value ranges in Figure 6, and R(max) < 1 as well as R(min) > 1 in most of the CRs. Since the anti-correlation between the solar wind speed and density is widely accepted (e.g., Wang, 2010;Le Chat et al., 2012), the solar wind of the lowest density usually has high speed. The occurrence of simulated results in the low-density range is much lower than the observation in Figure 6. The values of R(min) for density is much higher than 1, and larger than the values of R(mea) and R(max) in most CRs. Nearly all of the points with observed density lower than 3 cm À3 in Figure 8 have corresponding overestimated simulated density. Together with this information, it can be inferred that the most pronounced problem of the density results is the overestimate of density for the fast solar wind. At the same time, the general overestimate of the average density should also be investigated for further improvements. For temperature, information revealed by all of the three figures points to the same problem, that is, the insufficient range of variation. Similarly, the general underestimate of the magnetic field strength is also reflected by all of the figures analyzed in this subsection. Figure 9 shows the interplanetary magnetic field (IMF) polarity sectors of the observed data and simulated results, where the inward and outward polarity of the IMF are represented by 1 and À1 respectively. The method that filters the Fig. 7. The ratios of maximum (red), mean (green), and minimum (gray) solar wind parameters between modeled results and observed data (modeled/observed) at each CR. From left to right, the ratio for speed, number density, temperature and magnetic field strength are plotted. From the first to the third rows, the ratios obtained at STB, L1 and STA are presented. main IMF sectors from raw in-situ data with small scale fluctuations is adopted from MacNeice (2009b) and Jian et al. (2015), and has been described in Section 4.3 of the 2018 paper. The observed IMF sectors of STB, L1 and STA have a similar two-sector structure, i.e., two polarity reversals are observed within each CR. The exceptions are CR 2071 for STB, CR 2070 for L1 and STA, as well as CR 2077 for STA. The outward polarity sector lasts longer before CR 2070 and shorter after CR 2070 than the inward polarity sectors at all of the three points. As analyzed in our 2018 paper, the large-scale IMF polarity sector reversals result from the sweep of the HCS. The shrink of the southern polar CH leads to the southward movement of the HCS between 0°and 180°longitude, which in-turn changes the relative temporal length of the polarity sectors after CR 2070. Overall speaking, the observed polarity sectors are reasonably reproduced by our model. However, there are also noticeable discrepancies, including missing, false creation and large arrival time error of the IMF polarity reversals.

IMF polarity assessment
Quantitative metrics are computed for the IMF polarity results. The total match rate (TMR) represents the fraction of simulated data points that match the observation. An observed polarity reversal is defined as hit when it is reproduced by the model and miss when it is not. A simulated polarity reversal is defined as correct alarm when an observed polarity reversal associated with it can be found. Otherwise, it is defined as false alarm. The number of hit, miss, correct alarm and false alarm are denoted by N HIT , N MISS , N PA and N FA respectively. It should be noted that N PA = N HIT always holds. We use event-based metrics to measure the ability of the model to reproduce the polarity reversals at STB, L1 and STA. These metrics include the probability of detection which is the ratio of hit reversals in the observed reversals. The miss rate is defined similarly. The positive predictive value is the ratio of positive alarm reversals in the predicted reversals. In a similar way, the false alarm ratio is defined The threat score is a widely used event-based metric, which measures the overall performance of the model to forecast key events. It is defined as One another metric is the bias which indicates whether the model tends to over-predict (Bias > 1) or under-predict (Bias < 1) the observed events. The arrival time error of polarity reversal, Dt, is computed by Dt = t P M À t P O , where t P M and t P O are modeled and observed arrival time of the polarity reversal. The average of Dt and its absolute value |Dt|, that is, hDti and h|Dt|i, are then computed to represent the overall arrival time error.
The values of these metrics are listed in Table 1. From the value of TMR and TS, it may be concluded that the best polarity prediction is obtained at STB. At the same time, the metrics listed in Table 1 do not show remarkable differences between the three points. The values of Bias indicate that the model tends to under-predict polarity reversals at STB and STA, and overpredict them at L1. Meanwhile, since the values of Bias do not deviate from 1 much, these tendencies are not significant. The average arrival time errors are in the range of 1.4-1.8 day, as indicated by the value of h|Dt|i. The model tends to predict early arrival of the polarity reversal, since the signs of hDti are all negative at the three points.

SIR and FS assessment
During the declining phase of the solar cycle, SIRs and their following FSs are the predominant drivers of geomagnetic activities (Richardson & Cane, 2012;Vršnak et al., 2017). Although the SIR-driven geomagnetic storms are generally weaker than CME-driven storms, their effects on spacecraft surface charging and accumulated orbit decay are more severe (Borovsky & Denton, 2006;Chen et al., 2014). In the 2018 paper, a method that automatically identifies SIRs in observational data and simulated results is adopted (MacNeice, 2009b;Jian et al., 2015). The number and ratio of hits/misses among observed SIRs and correct/false alarms among the modeled SIRs are calculated to reflect the successfulness in catching SIRs. In this paper, the same SIR identification method is applied to in-situ data at STB, L1 and STA. The ability of the model in catching SIRs is indicated by the event-based metrics introduced in the previous subsection. The arrival time error of SIRs is represented by the temporal discrepancy between observed and modeled stream interfaces (SIs), where SI is defined to be the position of the highest speed gradient within each SIR. All identified SIs are marked in the speed profiles in Figure 1.
In this paper, in order to assess the model's capability of catching SIRs more adequately, new metrics for SIRs are also calculated. These metrics include the discrepancies of maximum speed, minimum speed and duration time between observed and modeled SIRs, which evaluate how well the model reproduces the shape of SIRs. It has been found that the FSs following the arrival SIRs also have significant geoeffect, such as the prolonged high-latitude AE activity (Tsurutani et al., 2006;Kilpua et al., 2017). To assess the FS results, the duration time as well as the average speed of FSs are compared between simulation and observation. Here, a threshold of v > 450 km/s is used to identify FSs following the arrival of SIRs (Richardson & Cane, 2012;Cranmer et al., 2017), and each FS region is restricted between the start position of two adjacent SIRs.
The assessment results are listed in Table 2. The highest TS is obtained at STA while the lowest is obtained at STB. At STB, the value of Bias indicates that the model has more chance to miss observed SIRs. At STA and L1, the chances of missing and falsely creating SIRs are almost the same. The modeled SIRs tend to arrive earlier than observation, as the signs of hDt SI i are all negative at STB, L1 and STA. The average arrival time error represented by h|Dt SI |i is around 1.2 day. From the value of h|Dt D-SIR |i and hDt D-SIR i, we can see that the modeled duration time of SIRs tends to be shorter than observation at STB and L1, and slightly longer than observation at STA. The average error of SIR duration time is around one day. The tendency of the model to overestimate the minimum speed and underestimate the maximum speed is revealed by the signs of hDv min i and hDv max i. The extent of deviation is characterized by the value of h|Dv min |i and h|Dv max |i. For the FSs following the arrival of SIRs, the model tends to predict longer duration than what is observed, and the average duration time errors range from 1.9 to 2.7 days.

Selected period assessment
In this subsection, we will focus on two special periods, i.e., CR 2070-2071 and CR 2076-2077. As can be seen from Figure 1, the latitudinal separation between STB, L1 and STA is most significant during these two periods, which provides chances to analyze possible causes of the deviations in the in-situ data and better understand the ability of our model by conducting comprehensive analysis based on both global structure and multipoint in-situ data. Before we continue, it should be noted that in the 2018 paper, the position and shape of the large-scale coronal structures have been validated by comparing modeled global structures with remote sensing images. These features include coronal holes, streamer belts formed around the HCS, and pseudo-streamer belts. The validation results indicate that the modeled structures are in basic agreement with observation. Since it is commonly accepted that coronal holes are associated with fast solar wind streams, and streamer and pseudostreamer belts are associated with solar wind streams of lower speed, we will assume in the following analysis that the overall large-scale solar wind speed structures are reliable, and try to find possible detailed discrepancies. Figure 10 presents the simulated low-latitude solar wind structures at 1 AU and the in-situ speed profile of STB, L1 and STA for CR 2070-2071. The trajectories of STB, L1 and STA are superimposed on the synoptic maps of the speed. To gain more information of the results, we sample solutions for virtual spacecraft. The heliolatitude of two such spacecraft are 5°(STB-S5) and 10°(STB-S10) south of STB. Similarly, another two spacecraft, namely STA-N5 and STA-N10, which are 5°and 10°north of STA, are set similarly.
As can be seen from Figure 10, at the beginning of CR 2070, a short observed FS is missed by the model at STB. This FS is ahead of the first main recurrent FS and not seen at the beginning of the next CR. The simulated speed sampled by STB-S5 and STB-S10 is significant higher than that of STB, indicating that STB is already at the edge of the simulated slow wind band. However, the sharp increase of the solar wind speed is not seen by STB-S5 and STB-S10, and the miss of this FS cannot be totally attributed to the positional error of the slow solar wind band. At the same time, the positional and structural error of the slow wind band is not so significant to cause other apparent discrepancies at L1 and STA, which are at the north of STB.
The first recurrent FS centered around Day 142-145 is well reproduced at all of the three points during CR 2070, but the difference among the arrival time of the shifted observation is not caught. However, arrival time difference can be seen among the results of STB, STB-S5 and STB-S10. This recurrent FS is associated with the high-speed region around longitude 225°. From the first speed map, we can see the leading edge of this high speed region around longitude 250°is oblique in the south of STB. Such an oblique edge would make STB, STB-S5 and STB-S10, which are separated in latitude but have the same longitude, encounter this simulated high speed region at different time. However, at the latitude of STB, L1 and STA, the edge is nearly vertical. The simulated shifted arrival times  H. Li et al.: J. Space Weather Space Clim. 2020, 10, 44 may therefore have no significant difference. During CR 2071, apparent underestimate of the maximum speed of this recurrent FS centered around Day 170-172 can be identified at all of the three points. The underestimate is most pronounced at STA and least pronounced at STB. The speed sampled by STB-S5 and STB-S10 does not differ much from that of STB, but the speed of STA-N5 and STA-N10 is much lower than that of STA. These data indicate that STB is at the center of this simulated high speed region, and STA is at the edge of it. The underestimate of solar wind speed at STB indicates that the speed at the center this high speed region is underestimated, and the underestimate at STA and L1 implies that the structural error also exists during CR 2071. The shifted results show more difference in the arrival time than shifted observation, and the deviation of the maximum speed between L1 and STB is more pronounced in the observation.
During CR 2070, the second recurrent FS centered around Day 155 is reasonably reproduced, but the simulated one arrives earlier and lasts longer than the observed one at all of the three points. This FS is associated with the low-latitude high-speed region extended from the south pole around longitude 90°in the synoptic maps of Figure 10. STA-N5 show more comparable duration time with observation, but the maximum speed of the FS is underestimated. In the shifted observational data, the arrival time of this FS for STB is later than that for L1 and STA. Although this feature is seen in the shifted modeled results, the differences among the arrival times are not so pronounced as observation. During CR 2071, this high speed region shrinks poleward in the east of longitude 90°. In the in-situ profile, the simulated duration time of the second recurrent FS around Day 180 is comparable to the observation at all of the three points, but the maximum speed is underestimated for STB and L1. The virtual spacecraft do not sample significant better results during this period. For this FS, the differences between the shifted results of the three points are much less than observation.
As revealed by Figure 11, during CR 2076, the first recurrent FS is observed around Day 302 at STB, Day 305 at L1 and Day 307 at STA. The arrival time error of this recurrent FS is not significant, but its maximum speed is underestimated at all of the three points. In the shifted data, the maximum speed of the FS at STA and L1 is higher than that at STB, and the model catch this feature. During CR 2077, the first recurrent FS appears around Day 330 at STB, Day 333 at L1 and Day 336 at STA in observed data. The simulated FS lasts longer and has a higher maximum speed than the observed one at STB and STA, but is more comparable with the observation at L1. For this FS, the virtual spacecraft do not sample significantly better results at CR 2076 and 2077. Note that the virtual spacecraft used during CR 2076 and 2077 are 5°(STB-N5) and 10°(STB-N10) north of STB, and 5°(STA-S5) and 10°(STA-S10) south of STA.
The second observed recurrent FS appears around Day 313 at STB, 315 at L1 and 317 at STA during CR 2076. The model reasonably reproduces this FS at L1. The arrival time error of this FS is larger at STB, and the observed two-peak structure of this FS is not reproduced at STA. During CR 2077, the second recurrent FS is centered around Day 338 at STB, Day 342 at L1 and Day 345 at STA. It should be noted that an ICME is identified between Day 339 and 340 at L1 with a  Fig. 11. Same as Figure 10, now for CR 2076-2077. Pink scatters in the second row represent results sampled by STB-N5 and STA-S5, while the purple scatters represent results sampled by STB-N10 and STA-S10.
H. Li et al.: J. Space Weather Space Clim. 2020, 10, 44 maximum speed of 510 km/s. Form the observed profile at the L1 point, it seems that the peak of the second recurrent FS can still be seen at about Day 343, but the overall profile of this FS may be disrupted by the ICME. For this FS, the model grossly overestimates the speed at STB, but produces more comparable results at L1 and STA. From the shifted data, we can see that the duration time of this FS is much shorter for STB than for L1 and STA, but the simulated results do not catch this feature.
Around Day 317 at STB and Day 322 at L1, an FS is seen in the observation. At these times, STB and L1 are in the north of the equator around longitude 60°. As can be seen from the first and second speed map in Figure 11, the simulated band of solar wind variability in the northern hemisphere bulges towards the equator in the vicinity longitude 60°. In the in-situ simulated results of STB and L1, the speed does raise when the observed FS appears, but it does not raise high enough to reproduce the observation. STB-N5 and STB-N10 sample higher solar wind speed, but its duration is much longer than observation of STB. Figures 12 and 13  As can be seen from the comparison, the position and the shape of the observed open field regions are roughly reproduced by the model. Such a conclusion was also drawn from the 2018 paper. However, some fine details of the observed open regions are lost by the model. During CR 2070-2071, the isolated low latitude coronal hole between longitude 270°and 315°has an arrow-like shape in the observation. However, the simulated coronal hole is much larger than observation and does not reproduce the arrow-like shape. At the beginning and end of CR 2071, the simulated CH even connected with the northern polar CH, while the observation show this CH is still isolated. During this period, the southern polar CH extends more to low latitude than the observation between longitude 0°and 90°. During CR 2076-2077, the isolated low latitude CH drifts to longitude 315°. As this CH tilts to the upper right in the observation, the simulated one tilts to the upper left. The simulated low latitude extension of the southern polar CH is much larger than observation.
The analyses in this subsection indicate that the results would not be significantly improved by only changing the latitude of the sampling point, and there are structural errors in the simulated high speed regions. When the same high-speed region is sampled by the spacecraft separated in latitude and longitude, the difference in the in-situ data of the spacecraft is determined by its shape and temporal evolution (e.g., Simunac et al., 2009;Riley et al., 2010;Gómez-Herrero et al., 2011). An example can be seen in the analysis of this subsection. When the edge of the high speed region is oblique in the latitude of STB, STB-S5 and STB-S10, the arrival time of the first recurrent FS is different among the three points at CR 2070. Therefore, the inability of the model to reproduce the difference among shifted data of STB, L1 and STA may also be partially attributed the structural error of the high speed region. As the CHs are the source region of the high speed region, the deviation of the simulated CHs in reproducing the fine detail of the observation should be responsible for the structural errors of the high speed regions, though it may not be the only cause of the problem.  a function of latitude are plotted in Figure 15 to support the validation. The latitudinal change profiles are sampled at longitude 45°, 135°and 225°. At longitude 135°and 225°, high-speed regions appear near the equator. At longitude 45°, the solar wind structure resembles the typical bipolar structure.

Assessment by Ulysses data
During year 2008, the Ulysses spacecraft cruises from the northern polar region above latitude 80°to the low latitude region around latitude 30°. Before the mid of CR 2069, the Ulysses spacecraft is above latitude 60°. Both the observed and simulated solar wind speed are around 750 km/s and show little variation, which is typical characteristics of the solar wind in the polar region during solar minimum and declining phase. Variation trend of the density is also comparable between simulation and observation, but the average density is underestimated by 0.530 cm À3 , which is about 29.0% of the average observed density. As can be seen from Figure 15, for the three selected CRs, both the southern and northern polar regions are dominated by relatively uniform high-speed solar wind. The magnitude and latitudinal change profile of the results are similar above 60°and below À60°, except that gradual decrease of speed and increase of density towards lower latitude are seen below À60°. Although Ulysses did not sample the southern polar region during 2008, data obtained during its first and third fast latitude scan indicated that the solar wind structures were largely symmetric between the poles (McComas et al., 2000;Sokół et al., 2013). Therefore, basic correctness of the solar wind solution at both polar regions may be concluded from the comparison here.  H. Li et al.: J. Space Weather Space Clim. 2020, 10, 44 Between the mid of CR 2069 and the mid of CR 2075, limited variations appear in the observed and simulated profiles. The model underestimates the solar wind speed during this interval by about 50 km/s. The average of the simulated density is 2.190 cm À3 and 8.7% higher than observation. During this interval, Ulysses is between latitudes 60°and 40°. As can be seen from the global structures in Figure 15, the solar wind speed within this latitudinal range is lower at CR 2071 than at CR 2066 and 2077. As the global solar wind structure is relatively stable during solar minimum, this change is likely to be unrealistic, and leads to the deviation between results and Ulysses data.
After the mid of CR 2075, more variable solar wind speed can be seen in the observed and simulated profile, and their magnitudes of variation reasonably match. During this interval, the Ulysses spacecraft is between latitude 40°and 30°, and at the edge of the band of solar wind variability in simulation results presented in Figure 15. As the simulated results and observed data start to show apparent variations at the same latitudinal range, it seems that the latitudinal extension of the simulated band of solar wind variability does not significantly deviate from reality. The thickness of the boundary layer between fast and slow wind is observed to be about 20°~30°during Ulysses's first and third fast latitude scan. However, at longitude 45°, where no complex structures like low latitude high-speed structures are sampled as latitude changes, the thickness of the simulated boundary layer is sometimes larger. For example, the southern boundary layer of CR 2071 has a thickness of about 40°, and the northern boundary layer of CR 2077 has a thickness of about 50°.
During the whole study interval, the CC between observed data and simulated results is 0.714 for speed and 0.317 for density. The RMSE is 0.495 Â 10 2 km/s for speed and 0.857 cm À3 for density.

Summary and discussion
In this paper, the ambient solar wind results generated by the CESE-HLLD 3D MHD model during year 2008 are assessed by comparing with multipoint in-situ measurements. Systematic assessment for the results near the ecliptic plane (also near the solar equator) is carried out by utilizing in-situ data obtained at the L1 point and by the twin STEREO spacecraft. The latitudinal variation of solar wind away from the ecliptic plane is assessed by utilizing mapped-back in-situ measurement obtained by the Ulysses spacecraft.
During year 2008, the longitudinal and latitudinal separation between STB, L1 and STA allow us to assess the near-ecliptic results from multiple sampling points, and therefore make the assessment more credible than single point assessment at L1. The two-stream structures are reproduced at all of the three points, which further confirm the basic validity of the model that has been demonstrated in the 2018 paper. However, deficiencies of the model are also found by comparison of data among the three points. There are notable differences among the observed data of the three points. Overall speaking, the model can hardly reproduce these differences, which are reflected by the fluctuation of speed differences and the variation trend of the CC between shifted observation. More detailed analysis in Section 3.6 reveal that the differences of the maximum speed, duration time and arrival time of the FSs among the three points are seldom caught by the model.
The common problems revealed by comparison at all of the three points are underestimate of the temperature for the fast solar wind and underestimate of the magnetic field strength. While these problems can be identified from single-point comparison, the multipoint comparison further confirms that these problems result from the common defects of the near-ecliptic solution. The solution of the ambient solar wind are sensitive to the latitudinal change, especially during the solar minimum (Owens et al., 2020). Some authors exploit this property to generate ensemble forecast to provide the range of uncertainty for the solar wind forecast at the L1 point (Owens & Riley, 2017;Reiss et al., 2019). The values of metrics of continuous solar wind parameters, IMF polarity and SIR events are different among STB, L1 and STA. Since the three points are latitudinally and longitudinally separated at the same time, metrics obtained at STB, L1 and STA by using historical data would also provide information on the possible ranges of metrics values for future forecasts. Once the model turns to operation, the multipoint validation carried out in this paper may become part of the forecast pipeline. The accumulation of validation results from real-time forecast data will provide more abundant information for the improvement of the model.
Comparison between modeled results and mapped-back Ulysses observation in Section 4 confirms that our model successfully duplicates the characters of variation in different latitudinal ranges. The magnitudes of speed and density are comparable, but systematic errors also appear during some intervals. Although the Ulysses only sampled solar wind data between latitude 80°and 30°, the solar wind profiles in other latitudinal ranges are analyzed qualitatively using the information provided by data obtained during Ulysses's first and third fast latitude scan. While the overall trend of latitudinal change is duplicated by the model, the boundary layers between fast and slow solar wind are sometimes thicker than observation.
The volume heating term specified by equation (1) plays an important role in determining the modeled coronal and heliospheric structure. Some of the problems found in this paper are likely to be solved by further optimization of the free parameters in the heating term. Examples of such problems are general underestimate or overestimate of speed for certain lowlatitude fast speed regions, and overestimate of the thickness of the boundary layer between fast and slow wind. The optimization can be carried out by manually tuning or adopting more advanced automatic optimization methods (e.g., Koziel & Yang, 2011). However, as we can see from Section 3.6, there are structural errors in solutions of our model, which are responsible for errors of the in-situ data at each point, and the inability of the model to reproduce differences among observation of the three points. To improve the quality of the heliospheric simulation results, the model should reproduce not only the overall structures of the corona and heliosphere, but also their fine structures. The volume heating term is ad-hoc and has its intrinsic limitation. In recent years, some more sophisticated and physically meaningful coronal heating and solar wind accelerating methods have been implemented for 3D MHD models successfully (e.g., van der Holst et al., 2014;Mikić et al., 2018), and produce promising detailed coronal structures. To further improve our model, it is necessary to implement more advanced coronal heating and solar wind accelerating method.
Another important factor that influences the modeled results is the photospheric synoptic maps.It has been found that modeled results obtained by using synoptic maps of different sources can differ significantly (Riley et al., 2012). Besides GONG, the updated synoptic maps can also be obtained from other data sources, like the Helioseismic and Magnetic Imager (HMI) onboard the Solar Dynamic Observatory (SDO), and the Air Force Data Assimilative Photospheric Flux Transport (ADAPT) model. In the future study, it will be beneficial to compare results obtained using synoptic maps from different data sources to further understand the influence of the synoptic map and choose the one that performs better with our model. RMSE and CC are widely used metrics to quantitatively assess the simulated results. Despite their simpleness, they are calculated on a point-to-point basis without regard to spatial information. A forecast feature with basic correctness in terms of size and structure might yield poor metric values if the feature is displaced in space and time. This phenomenon is known as "double penalty", as the forecast feature will be penalized once for missing the observation and again for falsely creating it in other position (Gilleland et al., 2009). In Section 3.1, we have analyzed the cause of low and negative CCs for density during some CRs. The primary reason is the displacement of the simulated density enhancements from those in the observation, which may be viewed as an example of the double penalty phenomenon. What's more, simple statistical metrics like CC and RMSE provide no feedback on the physical nature and source of the error. One way to resolve this problem is comprehensively assessing the results by utilizing a variety kinds of metrics and figures (e.g., Lee et al., 2009;Gressl et al., 2014;Jian et al., 2015Jian et al., , 2016. Another way is applying new metrics design principles (Casati et al., 2008;Ebert et al., 2013), such as the features-based approach, the field deformation verification, the neighborhood method and the scale separation method, to overcome the double penalty problem and offer more effective diagnosing tool for modelers and forecasters. One such work has been carried out by Owens (2018), who applies the neighborhood method to determine the time scales over which a solar wind forecast is or is not valuable.
After collecting treasure troves of data on solar wind under different solar activity levels for almost 20 years, the Ulysses mission ended in June 2009. Communication with STEREO-B was lost in October 2014. Since then, continuous in-situ solar wind observations are mainly available from the L1 point and STEREO-A spacecraft. However, data obtained by new spacecraft are coming. The Parker Solar Probe (PSP) launched in August 2018 has already brought in in-situ solar wind measurements in the inner heliosphere (e.g., Bale et al., 2019;Kasper et al., 2019). The perihelion of PSP will gradually reduce down to 9.5 R s in the coming years, which will enable the spacecraft to measure solar wind characteristics even closer to the Sun (Fox et al., 2016). The solar orbiter (SO) launched in February 2020 has an out-of-ecliptic orbit, and will measure solar wind parameters at different solar distance and latitudes (Müller et al., 2013). With these new data, the multi-point assessment may further reveal the model's ability to reproduce the radial and latitudinal variation of the solar wind, and in turn promote the improvement of the model.

Metrics of each CR
In Tables A.1