An operational solar wind prediction system transitioning fundamental science to operations

We present in this paper an operational solar wind prediction system. The system is an outcome of the collaborative efforts between scientists in research communities and forecasters at Space Environment Prediction Center (SEPC) in China. This system is mainly composed of three modules: (1) a photospheric magnetic field extrapolation module, along with the Wang-Sheeley-Arge (WSA) empirical method, to obtain the background solar wind speed and the magnetic field strength on the source surface; (2) a modified Hakamada-Akasofu-Fry (HAF) kinematic module for simulating the propagation of solar wind structures in the interplanetary space; and (3) a coronal mass ejection (CME) detection module, which derives CME parameters using the ice-cream cone model based on coronagraph images. By bridging the gap between fundamental science and operational requirements, our system is finally capable of predicting solar wind conditions near Earth, especially the arrival times of the co-rotating interaction regions (CIRs) and CMEs. Our test against historical solar wind data from 2007 to 2016 shows that the hit rate (HR) of the high-speed enhancements (HSEs) is 0.60 and the false alarm rate (FAR) is 0.30. The mean error (ME) and the mean absolute error (MAE) of the maximum speed for the same period are 73.9 km s 1 and 101.2 km s , respectively. Meanwhile, the ME and MAE of the arrival time of the maximum speed are 0.15 days and 1.27 days, respectively. There are 25 CMEs simulated and the MAE of the arrival time is 18.0 h.


Introduction
It is well accepted that large-scale interplanetary solar wind structures, mainly stream interaction regions (SIRs; Belcher & Davis, 1971;Sheeley et al., 1977;Smith & Wolfe, 1976;Pizzo, 1985;Tsurutani et al., 2006) and coronal mass ejections (CMEs; Gosling et al., 1974;Gosling, 1990;Tsurutani & Gonzalez, 1997;Richardson et al., 2001;Schwenn, 2000;Schwenn et al., 2005), can cause geo-space environmental disturbances, which may affect infrastructure systems and technologies in space and on Earth: satellite and airline operations, communication networks, navigation systems, and the electrical power grids (Horne et al., 2013;MacAlester & Murtagh, 2014). SIRs are the consequential results of interactions between high-speed streams originating from coronal holes on the Sun and the ambient slow solar wind in the interplanetary space. Due to the relatively long lifespan of coronal holes and their co-rotation with the Sun, co-rotating interaction regions (CIRs) often re-visit the Earth every 27 days approximately. CIRs are predominant during the declining phase and minimum phase of solar cycles (Tsurutani et al., 1995(Tsurutani et al., , 2006. CMEs are enormous eruptions of plasma ejected from the Sun into the interplanetary space over the course of minutes to hours. Very often, they substantially disturb the interplanetary medium. Once the CME is Earth-directed, it may trigger severe magnetic storms when colliding with Earth's magnetosphere. Gonzalez et al. (1999) found that the south component (Bz) of the interplanetary magnetic field (IMF) plays an essential role in causing intense geomagnetic storms and magnetic clouds with very intense core magnetic field can drive extreme storms. Although CMEs frequently occur during times of high solar activities, they also occur during other periods of solar cycles and even in solar minimum. Due to the adverse effects of large-scale interplanetary solar wind structures, it is of significant benefit to predict solar wind conditions near the Earth with at least several days in advance.
In research, simulation of the solar wind is often separated into two parts: the simulation of the inner corona (for example the magnetohydrodynamic algorism outside a sphere (MAS) model presented by Linker et al., 1999) and the simulation of the heliosphere (for example the ENLIL model presented by Odstrcil et al., 2003). The coronal part uses magnetic synoptic maps as input data and extrapolates it to the source surface. The outer boundary conditions derived by the coronal model are then used as the inner boundary conditions for the heliospheric part that simulates the solar wind propagating to 1 AU and beyond. When a CME is detected, its parameters can be derived by empirical models and are successively injected into the preexisting ambient conditions. Subsequent transient evolution of the heliospheric model provides the basis for predicting the CME arrival time on Earth. For each part, many research models or tools have been developed by space physicists.
Many numerical models of the background solar wind include two steps. First, one needs to extrapolate the solar magnetic field from the photosphere to the source surface. Representatives of such algorithms include potential field source surface (PFSS; Schatten et al., 1969), Schatten current sheet (SCS; Schatten, 1971), horizontal current-current sheet (HCCS; Zhao & Hoeksema, 1994), and current-sheet source surface (CSSS; Zhao & Hoeksema, 1995) models. Such extrapolation provides a three-dimensional global configuration of the large-scale magnetic field for the inner corona. The solar wind speed on the source surface is subsequently calculated using the extrapolated coronal magnetic field configuration. A representative of this method is the Wang-Sheeley-Arge (WSA) model. Wang & Sheeley (1990, 1991 found that solar wind speed was correlated with the expansion of solar coronal magnetic field. Arge & Pizzo (2000) and Arge et al. (2004) continued to make essential contributions to the development, implementation, and improvement of the algorithm, with the latter considering the effect of angular separation to the nearest coronal hole at the photosphere. A similar discussion alternatively measured by the distance to the coronal hole boundary is proposed by Riley et al. (2001Riley et al. ( , 2015. Since then, the empirical relationship between solar wind speed and coronal expansion factor was investigated to improve the accuracy of the WSA model (Owens et al., 2005(Owens et al., , 2008. Heliospheric models have been developed either by magnetohydrodynamic (MHD) or kinematic approaches. These models can be used to simulate the propagation of solar wind structures in the interplanetary space. A representative of the MHD model is the ENLIL model (Odstrcil, 2003). It uses an ideal-fluid approximation and solves the equations of ideal magnetohydrodynamics using a total variation diminishing Lax-Friedrich scheme algorithm. The Hakamada-Akasofu-Fry (HAF) model is a kinematic model that projects the fluid parcels of the solar wind from inhomogeneous sources near the Sun outward into the interplanetary space. The HAF model adjusts the flow empirically for stream-stream interactions as faster streams overtake slower ones (Hakamada & Akasofu, 1982;Sun et al., 1985;Akasofu & Fry 1986;Fry et al., 2001Fry et al., , 2003Wang et al., 2002).
As stated earlier, CME parameters are needed before injected into the pre-existing ambient solar wind conditions as inputs for heliospheric models. Researchers have also developed several tools to fit the CME parameters using coronagraph images taken by Large Angle and Spectrometric Coronagraph (LASCO; Brueckner et al., 1995) onboard the Solar and Heliospheric Observatory (SOHO) spacecraft (Domingo et al., 1995). The most widely used technique is the cone model (Fisher & Munro, 1984;Zhao et al., 2002;Xie et al., 2004;Xue et al., 2005), while different authors have used slightly different morphology assumptions. The shape, angular width, propagation direction, and speed of a CME can be estimated using these models, which empirically indicate the corresponding geoeffectiveness of the CME. Another widely used CME model is the graduated cylindrical shell (GCS) model (Thernisien et al., 2006). It is a forward fitting routine to the morphological white-light structure of the CME as it appears in the coronagraph and produces more parameters when compared to the cone model.
The research models or tools described above provide a foundation for the integration of an operational solar wind prediction system. A successful example is the WSA-ENLIL operational model at the National Oceanic and Atmospheric Administration (NOAA) Space Weather Prediction Center (Parsons et al., 2011). It is the first publicly reported operational space weather model, which provides 1-4 day advance warning of solar wind structures and Earth-directed CMEs. Clearly, multiple approaches exist by coupling these models that can provide a similar solar wind prediction as the WSA-ENLIL model does. Researchers have tested the performance of different combinations, for example the MAS/ENLIL model (Owens et al., 2008;Gressl et al., 2013). The MAS code solves the 3D-MHD equations numerically using finite difference over a logically rectangular non-uniform spherical grid (Linker et al., 1999).
At the Space Environment Prediction Center (SEPC; Liu & Gong 2015) affiliated with the National Space Science Center (NSSC) of the Chinese Academy of Sciences (CAS), a collaborative effort between scientists from the research communities and forecasters from SEPC was carried out recently, aiming to develop an operational prediction system for solar wind disturbances. For background solar wind estimation on the source surface, we used PFSS for magnetic field extrapolation and the WSA method for solar wind speed calculation. To gain the best performance, we tested several empirical functions published in literatures relating expansion of magnetic field to solar wind speed. For CME parameter fitting we utilized the ice-cream cone model by Xue et al. (2005). Automatic CME detection approach was developed. When the derived projected angular width of an automatically detected CME exceeds a specified threshold (e.g. 180°), we proceed to CME fitting process to obtain other CME parameters. Noticeably, this approach is capable to detect CMEs continuously and send immediate alert. It aims for providing a prompt warning for the Earth-directed CMEs. Considering that the characteristic parameters of CMEs play a very important role in the forecast accuracy of the CMEs' arrival time, manual CME detection approach was also imbedded to trace the CME fronts. This approach aims for providing a better forecast accuracy. We adopted a modified HAF model for the heliospheric component because of time-cost consideration.
The rest of the paper is organized in the following way. Section 2 describes the structure and components of the system. Section 3 describes the preliminary results of the system, especially with respect to forecasting whether or not the CMEs will arrive at the Earth and their arrival times. Section 4 summarizes the paper and presents the lessons learned during the research to operation (R2O) process.
2 Modules of the operational solar wind prediction system The operational solar wind prediction system consists of three components: the inner coronal module, the heliospheric module, and the CME detection module. The framework of the system is illustrated in Figure 1 along with the flow chart showing the inputs and outputs. The research models (marked by Lozenges in Figure 1), i.e., PFSS, WSA, HAF, and Cone models, are adopted and modified to satisfy the requirements of operational forecast services. An operational forecast system should have three features to meet the application requirements: 1. the system can provide the needed products; 2. the system must run the code automatically and the program is robust; 3. manual intervention at interfaces among individual modules should be allowed.
Our system conforms to these requirements. The original HAF model is modified to calculate the CME's arrival times, which are part of the core products in space weather services. The input data, including the real-time solar magnetic field synoptic charts and coronagraph images, are downloaded and injected into the system routinely, yielding a daily forecast. Both the inputs and the outputs are marked by the black rounded corner shapes in Figure 1. In our system, once there is an Earth-directed CME detected automatically, the manual detection procedure is applied by forecasters as soon as possible, composing ensemble CME forecast as supplementary references.
The transition process by which research models migrate to an operational system (R2O) is described in detail.

Background solar wind on the source surface
The inner coronal module is used to calculate the background solar wind on the source surface. The PFSS model (Schatten et al., 1969;Hakamada, 1995) extrapolating the magnetic field to the source surface (2.5 solar radii) is adopted. The magnetic vector potential is expressed by spherical harmonic expansion, which is truncated to the order of 90 in our calculation. We apply the empirical WSA method (Wang & Sheeley, 1990, 1991Arge & Pizzo, 2000;Arge et al., 2004) to compute the solar wind speed at the source surface. Further out, the empirical HAF model is then utilized to simulate the propagation of the solar wind. The real-time solar magnetic field synoptic charts obtained from the National Solar Observatory (NSO) Global Oscillation Network Group (GONG; Harvey et al., 1996) are fed into the PFSS model. The magnetic field and solar wind speed at the source surface are then derived, providing part of the initial inner boundary conditions for the heliospheric module (Sect. 2.2).
There are several functions published in literatures relating the expansion of magnetic field to solar wind speed empirically. The three functions described below were tested and coupled to our modified HAF model (Fry et al., 2001(Fry et al., , 2003, simulating the background solar wind from the Sun to 2 AU: where f s is the flux tube expansion factor and h b is the minimum angular separation (at the photosphere) between the foot point of an open field and its nearest coronal hole boundary. The resolution of h b is 5°. In Equation (3), the coefficient is changed from ''4.4'' in Owens et al. (2008) to ''1.4''. In fact, the coefficient ''4.4'' sometimes causes the value of in the bracket of Equation (3) to be negative, which is unreliable and unreasonable. In the rest of the paper, functions 1, 2, and 3 refer to Equations (1), (2), and (3), respectively.
The number density on the source surface is assumed to be uniform and adjustable. We will discuss this in detail in the next section.

Heliospheric model
As mentioned above, the HAF model (Fry et al., 2001(Fry et al., , 2003Wang et al., 2002) was used to simulate the background solar wind in the interplanetary space by taking the solar wind conditions on the source surface as an input. Further out, the interplanetary magnetic field is calculated through the conservation of the magnetic flux based on the so called ''frozenin'' condition. The number density of solar wind particles at a given point fixed in space is derived by determining the geometry of the spiraling streamlines and the magnetic field lines from different longitudes (with 0.59-degree interval) and using the conservations of particle flux (see in Hakamada & Akasofu, 1982). The number density on the source surface is assumed to be uniform and adjustable, and is currently set to be 4.236 · 10 4 cm À3 in our model. Specially, the number density at a point of 1 AU is given by where N 0 denotes the total number of magnetic field lines which cross a fixed solar radial line at distances between 1.0 AU and 1.1 AU; V / denotes the azimuthal component of the solar wind speed at 1 AU; V r0 and V /0 denote the radial and azimuthal component of the solar wind speed at 2.5 solar radii and 1 AU in the frame of reference rotating with the Sun, respectively. In Equation (4), the coefficient ''0.2'' is changed from ''0.125'' in Hakamada & Akasofu (1982). However, the south component of magnetic field cannot be obtained in the module, although it is essential to trigger a magnetic storm. The forecast of south component of magnetic field should be one of the priorities in predicting the geoeffectiveness of CMEs. We have modified the original HAF model to introduce the CME disturbances by taking into consideration the background solar wind simulation on the source surface, the eruption source information, and the three-dimensional characteristic parameters of CMEs. The eruption source information includes the beginning, maximum, end time of the flare, integrated X-ray flux maximum, and the location. The CME parameters include the start time, propagation direction, angular width, and the velocity. Hakamada & Akasofu (1982) suggested that flare disturbances can be included by superposing the flare-generated velocity disturbances on the background streams (see Fig. 2.7 in Hakamada & Akasofu, 1982). Consequently, the total distance traveled by the fluid parcels is given by (Hakamada & Akasofu, 1982) R where R f and R b are the distances contributed by the background solar wind and the transient disturbances, respectively. However, such a treatment may not necessarily be true because the combined speed can be an overestimation for the erupted fluids at the source surface. Fry et al. (2001) utilized the shock speed derived from type II radio frequency drift as one of the HAF model inputs. SOHO-LASCO images were used as a supplement by Fry et al. (2003) to improve the event source locations and speeds. The original HAF model (Fry et al., 2001) adopted the shock speed inferred from type II radio burst to mimic the CME speed and predict the CME arrival time. Gopalswamy et al. (2008) found that the CME speed is slightly less than the shock speed. Therefore, in our modified HAF model, we adopted the fitted CME speed from coronagraph images to predict the arrival time of CMEs. We first determined the CME speed from coronagraph images obtained by SOHO-LASCO and then, we adopted this value as the initial speed on the source surface and replace the background speed by the derived CME speed. Other CME parameters derived from coronagraph images such as the angular width and propagation direction were also injected into the modified HAF model as input for the CME propagation simulation.  (left), auto-detection results (middle), and manual detection results (right) of the CME eruption on April 18, 2014. The green asterisks represent the CME front derived by auto-detection, and the red plus signs represent the CME front derived by manual detection.
Furthermore, a traveling CME decelerates when impinging on the interplanetary medium in its path. The deceleration is complicated and still not fully understood (Yamashita et al., 2003). Nonetheless, the deceleration process is nonlinear and radially dependent; i.e., the speed falls faster close to the Sun and tends to be asymptotic further away. To reflect this and simultaneously simplify the problem, we assume the CME speed follows the definition where V f is the initial speed at the source surface, V s is the asymptotic value, and s is the characteristic time scale representing the asymptotic property. The time integral of Equation (6) gives where R f ¼ V f t and R s ¼ V s t are the distances that the fluid parcel has traveled when a CME is absent. The characteristic time scale s in Equation (7) can be replaced in terms of the characteristic length scale L. Subsequently, Equation (7) is rewritten as The characteristic length scale L depends on individual CME speed as well as the ambient solar wind conditions. However, we assume the length to be 0.9 AU in our model at current stage, which needs improvement in the future. The initial CME speed at the inner boundary (source surface) is timedependent with a characteristic decay time with maximum speed occurring at the maximum time of the related flare. Equation (8) replaces the original R-t relation of the HAF model and is also empirical. The angular width of the CME is used to calculate the initial CME speed spatial distribution over the source surface.
We follow a similar procedure as in the original HAF model (Fry et al., 2001(Fry et al., , 2003 for the CME speed temporal profile and the initial speed distribution over the source surface.

CME detection and 3D parameter derivation
The CME detection and 3D parameter derivation module utilizes SOHO-LASCO images to provide CME parameters for our heliospheric model. First, the CME fronts are detected and identified from coronagraph images through both automatic and manual detection procedures. Then, the 3D characteristic parameters of the CME, i.e., propagation direction, angular width, and velocity, are obtained by an ice-cream cone model (Xue et al., 2005). The derived CME propagation direction, angular width, and propagation velocity are subsequently loaded into the heliospheric model to simulate the evolution and propagation of the disturbance in the interplanetary space. The propagation direction and angular width of the CME, together with the simulation output, help the forecasters to predict whether it will reach the Earth or not.
Many auto-detection algorithms are based on coronagraph images obtained by SOHO-LASCO, including the Computer-Aided CME Tracking catalog (CACTus; Robbrecht et al., 2009), the Solar Eruptive Event Detection System (SEEDS; Olmedo et al., 2008), the Automatic Recognition of Transient Events and Marseille Inventory from Synoptic maps (ARTE-MIS; Boursier et al., 2009), and Coronal Image Processing (CORIMP; Morgan et al., 2012;Byrne, 2015). These tools can help detect and identify all kinds of CMEs quickly and easily and play an important role in the statistical study of the occurrence frequency of the CMEs and their characteristics. An algorithm using J-maps (Sheeley et al., 1999;Davis et al., 2009) and the Hough transform (Duda & Hart, 1972) for CME auto-detection is used by CACTus (Robbrecht et al., 2009). Hough transform is an image processing method and can identify lines or other shapes by voting procedure. Zhuan et al. (2017) also used this algorithm to detect and identify CMEs automatically. A comparison with the manual CME catalog in May 2011 at the Coordinated Data Analysis Workshop (CDAW) Data Center (Gopalswamy et al., 2009) reveals that the detection rate of major CMEs is 95% when using this algorithm (Zhuan et al., 2017). Therefore, we adopt in our operational system the same algorithm as in Zhuan et al. (2017). It runs in real-time and provides a rapid detection results as one of the important references for our forecaster. We have considered so far 25 CMEs that occurred between 2014 and 2016. All CMEs are listed in Table 1.
However, flaws may exist with the auto-detection algorithm when detecting partial-halo and halo CMEs. First, some partialhalo and halo CMEs are so faint and ambiguous that the contrast is too small in coronagraph images, which makes the detection of the CME front difficult (for example, the five ''no detection'' CMEs listed in Tab. 1). Second, it appears that the auto-detection algorithm usually detects only the brightest part, rather than the entire CME front, especially for those CMEs that have a large angular width in coronagraph images. Take the example of the halo CME of April 18, 2014 shown in Figure 2. The green asterisks represent the auto-detected CME front. In this case, the auto-detected angular width is 150°. A broken CME front rather than the entire front is identified by auto-detection process. The detected CME front is discrete and unreliable. Third, a CME event may be recognized as multiple events or vice versa. The auto-detection process cannot distinguish between multiple CMEs when their bright fronts appear in coronagraph images in rapid succession. Consider the halo CME of June 22, 2015 shown in Figure 3. Although parts of the entire CME front are detected, the automatic module yields three different CMEs with projected angular widths of 42°, 129°, and 18°, respectively. Fourth, it cannot distinguish whether a CME is coming toward Earth or traveling in the opposite direction, depending only in the auto-detection results of the CME by coronagraph images. In brief, although the auto-detection approach is useful in providing an early warning for the Earth-directed CMEs, the auto-detected CME fronts may differ greatly from manual recognition.
A human-computer interaction tool has been developed, as the detection of the CME front is essential to the CME geometry fitting and the subsequent propagation simulation. It is expected to detect the entire front of partial-halo and halo CMEs quickly and reliably. After selecting an interval of interest, a time series of coronagraph images obtained by SOHO-LASCO are displayed. Manual detection involves three steps: (1) drawing lines that describe the projected angular width of the CME in the difference images; (2) drawing dots on the lines that represent the CME fronts; and (3) submitting the detected CME fronts. The detection results will then be returned and displayed on the screen. In Figures 2 and 3, the manually detected fronts are shown as red plus signs over the blue lines that cover the projected angular width of the CMEs on April 18, 2014 and June 22, 2015. For these two cases, the manually detected CME fronts were found to be more reasonable than that identified by auto-detection. This manual detection tool can help us to identify the fronts of a halo-CME when it is rather faint in coronagraph images, making it difficult for the auto-detection algorithm to work. This also improves the fitting accuracy of a halo-CME, and consequently improves the forecast accuracy for the disturbance propagation simulation. Both the autodetection algorithm and manual detection tools are used at SEPC for CME detection and identification.
The detected CME information is then fed into the icecream cone model (Xue et al., 2005) for the geometry fitting.  The fronts of three CMEs derived by auto-detection are represented by green asterisks, crosses, and triangles. The red plus signs represent the CME front derived by manual detection. Xue et al. (2005) calculates the radial speed vis fitting the projected speed assuming that the geometrical shape of a CME is an ice-cream cone. Figure 4 shows the fitting results for the CME events on April 18, 2014 and June 22, 2015. The red plus signs represent the projected velocity derived by manual detection at each position angle (PA), while the red line represents the optimal projected velocity fitted by the cone model. For the upper panel, the fitted three-dimensional parameters derived based on the auto-detected CME front are as follows: the propagation direction is S53W53, the angular width is 143°, and the velocity is 1190 km s À1 . However, the calculated projected velocities shown by the green asterisks in Figure 4 are discrete and unreliable due to the incorrectly auto-detected CME front in Figure 2. Therefore, the fitted three-dimensional parameters based on the auto-detected CME front cannot be delivered to the solar wind propagation simulation. The fitted three-dimensional parameters derived from the manually detected CME front are as follows: the propagation direction is S02W19, the angular width is 133°, and the velocity is 1210 km s À1 . Such an event is expected to reach Earth and disturb the geomagnetic field.
We have run CME detection and parameter derivation for each event listed in Table 1. Information on the source, the geoeffectiveness, the auto-detection results, and the cone model fitting results obtained by both automatic and manual detection are listed in Tables 1 and 2. We have confirmed that the manual-detection tool helps to detect those halo CMEs that cannot  Fig. 9). The red plus signs represent the derived projected velocity in each PA by manual detection, and the green asterisks, crosses, and triangles represent the derived projected velocity by auto-detection. a The CME parameters taken as input are fitting results by cone-model of SEPC, using the source location as the CME propagation direction. b The CME parameters taken as input are fitting results by cone-model of SEPC. c The CME parameters taken as input are from Met office in CCMC CME scoreboard. d The CME parameters taken as input are from SWRC in CCMC CME scoreboard. e The CME is assumed to be not arrive or can't be recognized as arrive at the Earth.
be entirely detected by auto-detection, which improves the fitting results of the characteristic parameters by the cone-model. Let us take the ''no fit'' CME of November 5, 2016 as an example; it has an angular width of 18°as derived by autodetection, which is too small to be identified as either a halo or partial-halo CME, and therefore will not be applied to cone-model fitting in our operational system. Only 3 out of the 25 cases have an auto-detected projected angular width greater than 180°. Another example is the CME on June 18, 2015. We find that the auto-derived source location of the CME is S76E10, exhibiting a latitudinal separation of 88°, relative to the corresponding flare location of N12E33. The manually derived source location for this CME is N21E21, which appears to be appropriate. A CME is assumed to have a propagation direction, which is along the radial line connected to the center of the Sun. As we all know, the related flare location is not like the propagation direction of a CME event. For some cases, the fitted CME propagation direction based on the cone shape assumption may help to provide a reliable propagation direction. One example is the CME of February 25, 2014. While the flare location is S12E82 to the east of the disk, the manually derived CME propagation direction is S02E33. Assuming a CME propagating radially, a CME having a propagation direction of S02E33 (fitted by Cone-model) is more likely to reach the Earth, while a CME having a propagation direction of S12E82 (source location) is not likely to reach the Earth. For this case, using the related flare location of the CME as the propagation direction would have led to the ''no arrival'' result. In fact, this CME having a projected angular width of 360°fi nally reached the Earth. The selected and optimized models have been integrated by software engineers at SEPC to deliver a platform that is easy for forecasters to use.

Preliminary result 3.1 Background solar wind prediction
Comparing the solar wind simulations to the realistic solar wind conditions from January 2007 to December 2016 measured by the Advanced Composition Explorer (ACE; Stone et al., 1998) spacecraft, we have evaluated and verified function 1, 2, and 3 taking the mean absolute error (MAE) and mean relative error (MRE) as the guideline. The MAE value represents the average of absolute error between predictions and observations. The MRE value represents the average of relative error between predictions and observations respect to the corresponding observations. The less MAE and MRE are, the better the results should be. The comparison is done after eliminating the effects of the CMEs. We have adopted the near-Earth interplanetary CME lists compiled by Richardson & Cane (2010); also see http://www.srl.caltech.edu/ACE/ASC/DATA/level3/ icmetable2.htm, and removed the data relating to the CME arrival from the time series according to the lists. The verifications result is listed in Table 3, including the MAE and MRE of solar wind speed (V sw ), proton density (q), strength of magnetic field (B), and the accuracy of the predicted polarity of the magnetic field. The latter is calculated through a specified value. If the predictions and observations of the radial components of the magnetic field have the same polarity, the value is calculated as 1. On the contrary, it is 0. Subsequently, the accuracy of the predicted magnetic field polarity is calculated by the average of the specified value. The MAE verifications are drawn in Figure 5 illustrating, from top to bottom, the MAE with error bars depicting the standard deviation of solar wind speed (V sw ), proton density (q), strength of magnetic field (B), and the accuracy of the predicted magnetic polarity by month. The red, green, and blue colors represent the results of function 1, 2, and 3, respectively. The histogram of the observations and forecasts is shown in Figure 6. Also shown in the figure is the prediction error (differences between forecasts and observations), denoted by DV sw , Dq, and DB. The black bar represents the ACE observations. The red, green, and blue colors in Figure 6 correspond to function 1, 2, and 3, respectively.
We have identified the following: 1. It can be found from Table 3 that, over the major course of the 10-year period, the prediction error of the solar wind speed is the smallest when using function 1, whereas the prediction error of the density is the smallest when using function 3. It is noticeable in Figure 5 that the results when adopting any one of the three functions perform better in 2009 and 2010 than those during other periods. This implies that the empirical parameters used in the model need to be adjusted and verified over different periods. 2. From Figure 6 the HAF model overestimates the strength of magnetic field in many cases (Fry et al., 2001), similarly to the density. The majority of the predicted solar wind velocity falls between 350 km s À1 and 450 km s À1 , suggesting that the duration of the velocity enhancements may be underestimated. 3. The accuracy of the predicted magnetic polarity is about 0.65. It can be seen in Figures 5 and 6 that the predicted results using function 1-3 is about the same at most of the years.
The solar wind predictions using function 1 are plotted in red in Figure 7, while the observations are represented by black. The longitude of the predicted magnetic field in shown in Figure 8. The ambient solar wind speed and the longitude of magnetic field are successfully reproduced by the HAF model. In this paper, the high-speed enhancements (HSEs; Owens et al., 2005Owens et al., , 2008MacNeice, 2009;Norquist & Meeks, 2010) are also used to estimate the accuracy of the solar wind reconstruction. The HSEs are defined as regions where the solar wind speed increases 20% in no more than 2 days. We have checked whether there is a forecasted HSE associated with each HSE verified by ACE observations. If the difference between the arrival time of the maximum speed of a forecasted and the observed HSEs is no more than 3 days, we count the case as a correctly predicted HSE. The hit, miss, and false of the HSEs are shown in Table 4, as well as the mean error (ME) and MAE of both the maximum speed and the arrival time of the maximum speed. The maximum speed may be underestimated in many cases, and the predicted arrival times of the HSEs are usually later than the observations. The largest hit, and the least miss and false of HSEs usually come from results of function 1. The hit rate (HR; calculated as the ratio of hit to the sum of hit and miss) from 2007 to 2016 is 0.60 J. Wang et al.: J. Space Weather Space Clim. 2018, 8, A39 and the false alarm rate (FAR; calculated as the ratio of false to the sum of hit and false) is 0.30. The mean error (ME) and the mean absolute error (MAE) of the maximum speed for the same period are À73.9 km s À1 and 101.2 km s À1 , respectively. Meanwhile, the ME and MAE of the arrival time of the maximum speed are 0.15 days and 1.27 days, respectively.
A comparison to other literatures has been made in Table 5. There are slightly differences in the testing periods, magnetogram sources, calculation methods of the verifications between these researches, like MacNeice (2009), Norquist & Meeks (2010), Owen et al. (2010 and ours. Our model provides a higher HR of 0.60 and a higher FAR of 0.30 than other researches. In Owens et al. (2008), the ME and the MAE of the maximum speed from 1995 to 2002 by WSA-ENLIL model are À75.1 km s À1 and 97.2 km s À1 , respectively. Meanwhile, the ME and MAE of the arrival time of the maximum speed are 0.94 days and 2.08 days, respectively. The ME and MAE of the maximum speed by our model are consistent with Owens et al. (2008). Furthermore, our model provides a less MAE of the arrival time of the maximum speed than in Owens et al. (2008).
Considering that the velocity of the background solar wind plays an important role in the propagation of the CME into the interplanetary space, function 1 is adopted in our system.

CME arrival time prediction
The heliospheric model should be able to predict the CME arrival time and the magnitude of potential influences, and thus improve the timeliness of warning. Take the April 18, 2014 CME event as an example. The fast halo-CME erupted from the western disk (S16W41), AR12036 at 12:31 on April 18, 2014, and it was accompanied by an M7.3 flare. It reached Earth at 10:22 on April 20, 2014, causing a geomagnetic storm with a Kp index of as much as 5 and a Dst index of À25 nT. Both the background solar wind conditions on the source surface and the information of the associated flare or filament, were fed into the modified HAF model. Also utilized in the model were the 3D parameters of CME: the propagation direction of S02W19, angular width of 133°, and velocity of 1210 km s À1 . The simulated solar wind conditions in Earth's orbit and in the ecliptic plane are shown in Figures 9 and 10, respectively. In Figure 9, the black lines represent the simulated background solar wind, and the red lines represent the simulated disturbed solar wind upon arrival of the CME at Earth. The green lines represent the observations from the ACE satellite. The difference between the red and black lines indicates that the CME arrival affected the background solar wind condition. The top panel of Figure 9 shows that the velocity profile agrees very well with the hourly ACE observations, although the proton density and magnetic profiles have large deviations. The predicted arrival time of the CME was 12:00 on April 20, as shown in Figure 10.
The CME scoreboard (http://Kauai.ccmc.gsfc.nasa.gov/ CMEscoreboard) of the Community Coordinated Modeling Center (CCMC) of the National Aeronautics and Space Administration (NASA) Goddard Space Flight Center (GSFC) is a service that tracks forecasts and CME arrival times as determined by multiple sources within the space weather community. Several CME prediction methods are in widespread use, and the resulting forecasts are submitted in real time. The most commonly used method is the WSA-ENLIL-Cone model , which is implemented by NOAA's Space Weather Prediction Center (SWPC), NASA Goddard Space Weather Research Center (SWRC), and the Met Office Space Weather Operations Centre in the UK. The Solar Influences Data Analysis Center (SIDC) of the Royal Observatory of Belgium also submits their forecasts. In addition, the  Table 2. To obtain the arrival time of a CME, we first simulate the background solar wind for a few days. Then we launch the CME propagation. The difference between the background and disturbed solar wind conditions indicates that the arrival time. The first appearance of the sharp jump in velocity profile is the arrival time of the CME. One should note that when several CMEs are initiated in a close succession, like the five CMEs in June 2015, there will be interactions or even merging of the CMEs, which may alter their geoeffectiveness and the arrival time. For each CME event in such a situation, all previous CMEs will be fed into the heliospheric model in proper order to generate the background solar wind conditions before we launch the present one. The predicted arrival time of the CMEs is given in Table 2, comparing to the report by other agencies.
One should note that to verify the forecast accuracy, both the source location and cone-fit propagation direction of the CMEs are considered while generating the CME propagation direction for the CME simulation. These results are compared to an average prediction error for the CME arrival time in the CCMC list. We first take the source locations related to the CMEs in Table 1 as the input of the CMEs' propagation and feed them into the heliospheric model. On the other hand, we take the fitted CMEs' propagation direction by manual detection and cone-model as the input of the CMEs' propagation direction. Comparing the predicted CME arrival times obtained by both methods, it is very clear that the fitted CME's propagation direction, angular width, and initial velocity play a very important role in determining the prediction error of the arrival time. For the CME events listed in Table 1, the source locations are used as the CME propagation direction to the heliospheric model. Some of them yielded a result that they may not reach the Earth. These cases are shown as ''no arrival'' in the Table 2. However, when the cone-fit propagation directions of the CMEs are put into the heliospheric model, the results show that these CMEs will arrive at the Earth, which agree with the ACE observations. The fitted CME propagation direction based on the Cone shape assumption may help to provide a more reliable propagation direction than the source location in the two cases. For the CME in April 1, 2014 listed in Table 2, the fitted propagation speed exceeds 1700 km s À1 yielding a predicted arrival time at least 45 hours earlier than the observation. Using the CME parameters derived by cone-model, the MAE of the CMEs' arrival time is 18.0 h. For these cases, the MAE of the average of CMEs' arrival time in CCMC scoreboard is 11.0 h. For 11 out of the 25 CME events, the prediction error of the arrival time, using the manually predicted results, is less than the average for the CCMC.
Furthermore, we have also adopted the characteristic parameters of CMEs from SWRC and Met office given in CCMC CME scoreboard into our model and the arrival time prediction is listed for comparison in Table 2. Taking the CME parameters derived by Met office of 14 CMEs as input, the MAE of the CMEs' arrival time of our model and Met office's model is 15.3 h and 11.0 h, respectively. However, the CME of Jun 19, 2015 is not recognized, because the propagation speed from Met office is around 400 km s À1 and less than the background solar wind. The CMEs of September 18, 2015 and April 10, 2016 are not fed into our model, because we assume that the CMEs will propagate radially, and both will not arrive at the Earth given the propagation direction and angular width from the Met office. Taking the CME parameters derived by SWRC of 22 CMEs as input, the MAE of the CMEs' arrival time of our model and SWRC's model is 15.1 h and 12.5 h, respectively. However, the CME of May 2, 2015 is not recognized, because the propagation speed from SWRC is around 400 km s À1 and less than the background solar wind. The CMEs of August 12, 2015 and April 10, 2016 are not fed into our model for the same reason above. In brief, whether a CME will arrive at the Earth depends on the propagation direction and angular width taken as input in our model, and the fitted propagation speed is essential in predicting the arrival time of a CME.

Summary
In this paper, we introduce an operational prediction system for the background and disturbed solar wind. We discussed the performance of the background solar wind prediction. The HR (FAR) of the HSEs in 2007-2016 of our system is 0.60 (0.30). The MEs (MAEs) of the maximum speed and the arrival time of the maximum speed are À73.9 (101.2) km s À1 and 0.15 (1.27) day, respectively. We have simulated 25 CMEs. The performance of our prediction system is evaluated and  Norquist & Meeks (2010) 1997, 2003 The system is now available online. The CME lists determined by the auto-detection approach can be found at http:// eng.sepc.ac.cn/cme. One can use the manual detection approach to detect the CMEs and use the cone-model fitting approach to determine the CME parameters. The heliospehric model can also be running online and the simulation of disturbed solar wind conditions will be provided.
Lessons are learned during the process of migrating from research models to an operational system. Scientific models have been modified and integrated to create an operational forecast system. Research models tend to focus on the solutions of specific science questions and therefore require optimization and robustness when used with real-time (or quick-look) data, rather than scientific data (or corrected/verified data). Moreover, research models often fail to satisfy the requirements of downstream models, which may significantly influence the overall success of the project. Automation also incurs some problems. Operational tools, however, focus on the forecast accuracy and stability of a system, and are developed to satisfy the most urgent requirements of operational services, to forecast halo CMEs for example. In practice, research models may fail in terms of the detection rate and forecast accuracy of the halo CMEs. Therefore, scientific models cannot be put into operational use directly without modification, evaluation, and verification.
The system, however, is crude and requires further improvement. Many appropriate research models and solar-terrestrial observations with good features will undoubtedly emerge in the future. Both will help to improve our system. An ensemble forecasting method can enable a sensitivity analysis for forecast accuracy, and thus improve the accuracy significantly ; dynamic pressure (represented by P*R*R), and solar wind speed (represented by V). R is the radial distance in AU. The units of the D, B, P and V are cm À3 , nT, pPa and km s À1 , respectively. The red point represents the Earth. The inward field lines are shown as blue lines, and outward field lines as green lines. (Fry et al., 2003;Mays et al., 2015). The forecasters at SEPC will continue to test and improve this operational solar wind prediction system. Future work includes the forecasts of the geomagnetic indexes, Ap and Kp.