Analysis of signal to noise ratio in coronagraph observations of coronal mass ejections

We establish a baseline signal-to-noise ratio (SNR) requirement for the European Space Agency (ESA)-funded Solar Coronagraph for OPErations (SCOPE) instrument in its field of view of 2.5–30 solar radii based on existing observations by the Solar and Heliospheric Observatory (SOHO). Using automatic detection of coronal mass ejections (CMEs), we analyse the impacts when SNR deviates significantly from our previously established baseline. For our analysis, SNR values are estimated from observations made by the C3 coronagraph on the SOHO spacecraft for a number of different CMEs. Additionally, we generate a series of artificial coronagraph images, each consisting of a modelled coronal background and a CME, the latter simulated using the graduated cylindrical shell (GCS) model together with the SCRaytrace code available in the Interactive Data Language (IDL) SolarSoft library. Images are created with CME SNR levels between 0.5 and 10 at the outer edge of the field of view (FOV), generated by adding Poisson noise, and velocities between 700 km s⁻¹ and 2800 km s⁻¹. The images are analysed for the detectability of the CME above the noise with the automatic CME detection tool CACTus. We find in the analysed C3 images that CMEs near the outer edge of the field of view are typically 2% of the total brightness and have an SNR between 1 and 4 at their leading edge. An SNR of 4 is defined as the baseline SNR for SCOPE. The automated detection of CMEs in our simulated images by CACTus succeeded well down to SNR = 1 and for CME velocities up to 1400 km s⁻¹. At lower SNR and higher velocity of 2100 km s⁻¹ the detection started to break down.
For SCOPE, the results from the two approaches confirm that the initial design goal of SNR = 4 would, if achieved, deliver a comparable performance to established data used in operations today, with a more compact instrument design, and a margin in SNR before existing automatic detection produces significant false positives.


Introduction
Coronal mass ejections (CMEs) are arguably the cause of the most extreme space weather at Earth, posing a severe threat to human technological systems (Cannon et al., 2013; Eastwood et al., 2017; Riley et al., 2018). Space weather can affect our communication capabilities and our space-based infrastructure, including the accuracy and reliability of satellite positioning services such as the Global Positioning System (GPS), can affect organic life in the air and in space, and can cause damage to power grids.
Notable events in the past have caused damage to transformer units that led to power outages, e.g. the loss of 20 GW of power in the Hydro-Quebec power grid in 1989. Several countries have already assessed the potential socio-economic cost of space-weather impacts, and have ranked space weather at a very high level among other natural disasters. For example, the UK national risk register of civil emergencies 2017 (Cabinet Office, 2017) places space weather at 3 out of 5 on an impact severity scale and 4 out of 5 on a likelihood of occurrence within 5 years.

Topical Issue -Space Weather Instrumentation
Due to the optically thin nature and low brightness of CMEs, their continuous observation out to large angular distances from the Sun (elongations) can only be made from space. Coronagraphs can observe the solar corona over a field of view (FOV) extending out to some 30 solar radii (R☉); heliospheric imagers can observe further from the Sun, potentially up to 1 astronomical unit (AU) and beyond. At the present time, there are only three space missions hosting operating coronagraphs and three hosting operating heliospheric imagers. The Solar and Heliospheric Observatory (SOHO) (Domingo et al., 1995), which was launched in 1995 and is located at the Sun-Earth Lagrange-point L1, features two operating coronagraphs C2 and C3 in the Large Angle and Spectroscopic Coronagraph (LASCO) instrument package (Brueckner et al., 1995; note that the third coronagraph, C1, is no longer operational). The Solar Terrestrial Relations Observatory (STEREO) (Kaiser et al., 2008) consists of two nearly identical spacecraft, STEREO-A and -B, that were launched in 2006 into 1 AU heliocentric orbits (note that, while STEREO-A is functional, STEREO-B lost communications with Earth in 2014). Each STEREO satellite features two coronagraphs, COR1 and COR2, as well as two heliospheric imagers, HI1 and HI2, in the Sun-Earth Connection Coronal and Heliospheric Investigation (SECCHI) instrument package (Howard et al., 2008). Parker Solar Probe, which was launched in August 2018, hosts another functioning pair of heliospheric imagers in the WISPR instrument (Vourlidas et al., 2015). The recently launched Solar Orbiter spacecraft also hosts a coronagraph (Metis, Antonucci et al., 2019), which will observe in a FOV between 1.7 R☉ and 9 R☉ over its elliptic orbit, as well as a heliospheric imager (SoloHI, Howard et al., 2019), with a FOV extending from 5.25 R☉ to 47.25 R☉ at perihelion.
Another notable example from past missions is the Solar Mass Ejection Imager (SMEI; Eyles et al., 2003; Jackson et al., 2004) on the Coriolis spacecraft, which used three wide-angle cameras to map almost the entire heliosphere starting from an elongation of 20° from the Sun. Ground-based coronagraphs, such as the K-COR instrument at the Mauna Loa Solar Observatory (MLSO) in Hawaii (de Wijn et al., 2012), have a limited outer FOV of around 4 R☉ and thus cannot replace space-borne instruments.
SOHO and STEREO are scientific missions, aimed at, among other scientific objectives, furthering our understanding of the fundamental physical principles behind CMEs (Gopalswamy et al., 2005; Thernisien, 2011). However, the coronagraph images provided by these two missions in particular are critical input to current operational space weather forecasting endeavours (Kraft et al., 2017). Relying on them for our forecasting needs has three main issues:
1. Due to their science-oriented design, some technical aspects of the instruments and ground-segment operations of these missions are not ideal for real-time operational purposes. For example, SOHO data is only downlinked using the Deep Space Network for a few hours every day, with the schedule varying on a daily basis. This non-continuous telemetry can lead to a latency in receiving SOHO data of more than 6 h. The extremely fast CME of 2012-July-23, which impacted STEREO-A, had an average velocity, from Sun to Earth, of around 2000 km s⁻¹, corresponding to a travel time of only 21 h (Temmer & Nitta, 2015). For such a case, a latency of 6 h corresponds to a significant fraction of the available warning time. Also, real-time telemetry rates for science missions can be very low, as provision of real-time data is not a mission priority, resulting in a poor image quality (spatial resolution, sensitivity) and cadence.
2. The second issue, pertaining to the STEREO mission in particular, relates to the fact that the satellites move relative to Earth. This results in a non-optimal variable vantage point relative to Earth-bound CMEs, and at least certain periods where communication with Earth is not possible, i.e. when the satellites are close to or behind the Sun.
3. The final issue relates to mission lifetime. SOHO was launched in 1995 and STEREO in 2006. Their long lifetimes make them very successful, but their degradation and risk of imminent failure are increasing.
For example, both missions are exposed to solar energetic particle radiation, which can lead to upsets in the onboard computers (Harboe-Sorensen et al., 2001) and potentially a subsequent loss of the satellite. Radiation also gradually degrades the solar panels and possibly other system components, including the detectors. Moreover, the loss of any satellite is possible during events such as orbital maneuvers. STEREO-B was lost in 2014 during a test of superior conjunction operations, and has not been recovered to date. Of course, instruments can also fail: the LASCO/C1 coronagraph on SOHO stopped working a few years after launch.
It is thus necessary to design and launch suitable replacements to ensure our space weather forecasting capabilities are not impeded. The SCOPE (Solar Coronagraph for OPErations) instrument (Middleton et al., 2016, 2019), aimed at providing images for operational space weather services, is being developed under the European Space Agency's (ESA's) General Support Technology Programme (GSTP) by a consortium comprising the Rutherford Appleton Laboratory (RAL, lead institution), the Institut für Astrophysik der Georg-August-Universität Göttingen (IAG), the Royal Observatory of Belgium (ROB), the Centre Spatial de Liège (CSL) and Airbus DS Space Systems (ADS). SCOPE features a more compact optical design than a conventional coronagraph that simplifies the mechanism for stray light rejection and decreases the build volume for a given FOV. The SCOPE FOV in the Plane-Of-Sky (POS) extends from 2.5 to 30 R☉ and is thus quite similar to that of the SOHO/C3 coronagraph, which has a FOV of 3.7 to 30 R☉. Figure 1 shows an example image of a CME taken by SOHO/C3. CMEs are often found with a three-part and flux-rope structure, characterised by a bright leading front, a dark void formed by the magnetic flux rope, and a bright core (Illing & Hundhausen, 1985; Cremades & Bothmer, 2004). Fast CMEs additionally form magnetohydrodynamic shocks around the main structure, which must be distinguished for an accurate interpretation.
As its name suggests, the SCOPE instrument design is optimized for operational space weather forecasting purposes. Different orbital locations were considered in the SCOPE study: the Lagrange points L1 and L5, as well as different geocentric orbits. Currently, a modified SCOPE design is being considered for the Lagrange mission, the first mission to be positioned at L5 (Kraft et al., 2017). The appearance of a CME in a coronagraph (or heliospheric imager) image depends on viewing location relative to the CME's propagation direction. The example in Figure 1 shows a solar limb event, where the CME appears only on one side of the image and shows a clear structure, while Figure 2 shows a halo event, where the CME moves towards the observer and emerges around the entire solar occulter, often with a more diffuse appearance. For forecasting Earth-directed CMEs, ideally both perspectives will be available together, because observations taken from the Sun-Earth line (i.e. of a halo CME) provide more information on the direction relative to Earth, while observations of CMEs near the limb (e.g. an Earth-directed CME viewed from L5) help resolve the ambiguity between CME width and speed.
The detection of CMEs, and derivation of their characteristics, in white-light coronagraph images can be performed manually, but this can be time consuming and subjective, depending on the observer and image processing applied. An alternative, sometimes used by forecasting teams, is to apply an automated detection algorithm, such as the Computer Aided CME Tracking System (CACTus; Robbrecht & Berghmans, 2004) or the Solar Eruptive Events Detection System (SEEDS; Olmedo et al., 2008). CACTus is operated continuously at the Royal Observatory of Belgium. Email notifications are sent via a mailing list whenever a SOHO halo-CME (i.e. an Earth-directed event) is detected, and all detections are stored in an online catalogue (Robbrecht et al., 2009). CACTus uses a linear Hough transform in polar coordinates to detect CMEs in coronagraph running-difference images. The output includes the angular width and velocity in the POS of each detected feature. While CACTus was developed to automatically detect CMEs in image sequences from the coronagraphs on board SOHO, it has since been used with the COR2 coronagraph on STEREO (and subsequently with STEREO HI images; Pant et al., 2016). SEEDS also transforms coronagraph images to polar coordinates, and identifies CMEs by tracking their leading edge. Other notable CME detection algorithms include CORIMP (Byrne et al., 2012) and ARTEMIS (Boursier et al., 2009).
For a robust detection of a CME, a sufficiently large signal-to-noise ratio (SNR) is required. The precise value will depend on the defined goals of the coronagraph and the CME detection algorithm, i.e. it can depend on the structures that need to be identified. In the context of the SCOPE instrument, and future deep-space missions, the use of an automatic CME identification algorithm is particularly interesting. This is especially true for distant locations such as L5, where the telemetry rate is very low, e.g. 720 kbps for STEREO when using the Deep Space Network for science data telemetry (Driesman et al., 2008). A common trade-off with automated detection algorithms is how to define the detection threshold of the algorithm. A "tightly" set algorithm will only detect the biggest and brightest CMEs, whereas a "loosely" set algorithm will detect smaller and dimmer CMEs, but will also make more false positive detections of flows and noise in the dataset.
STEREO has tried to overcome the restriction of limited telemetry by transmitting a continuous set of so-called beacon mode data, where the image size (typically 128 × 128 or 256 × 256) is much smaller than that of the science frames (1024 × 1024 or 2048 × 2048), and the images are additionally compressed more heavily, resulting in further loss of image quality. The resulting low quality data is nevertheless being used for forecasting purposes, although a higher image quality is clearly desirable for that purpose (e.g. de Koning et al., 2009; Harrison et al., 2017). If CMEs can reliably be detected onboard, a satellite could potentially prioritise data for immediate or delayed downlink according to pre-defined schemes, e.g. trading between image resolution and cadence depending on the estimated CME velocity, to ensure a sufficient number of images are received for fast CMEs. Reliable automatic detection on-board could also be useful for sending the fastest possible warnings to operators and forecasters to ensure that detailed analysis can start as soon as possible. However, it is worth noting that the onboard computers on current spacecraft are generally less up-to-date and much less powerful than ground-based machines due to the preparation and reliability considerations required for radiation-hard equipment. This means that any onboard automated detection needs to use relatively undemanding algorithms.
The definition of the SNR requirement is not only important for detection algorithms, as mentioned above, but also as a required input parameter for other design aspects. This includes the light collecting power, which impacts the size and design of the imaging optics, or the stray light budget, which translates into the design of the occulter of a coronagraph, or the complexity of its internal baffle systems. For a given "budget" of dimensions, mass and fixed observation time, it is also possible to design the instrument with an SNR exceeding the critical requirements and open up the option to switch from a single exposure technique to one of multiple, shorter exposures that are subsequently combined. Depending on the exact exposure scheme adopted, the advantage of such a multiple exposure scheme can potentially be offset by reduced duty cycle, as each exposure takes a finite time to read out, and increased readout noise. On the upside, multiple shorter exposures allow for more effective removal (scrubbing) of solar energetic particle (SEP) artefacts. Such artefacts affect the number of useable SOHO and STEREO coronagraph images of very fast CMEs that tend to be accompanied by strong SEP events (Bothmer & Daglis, 2007). Moreover, such a multiple exposure scheme is already used for STEREO/HI (Eyles et al., 2009). Reducing any source of noise, in this case caused by SEPs, also offers potential improvements of the data compression that could reduce the telemetry budget. However, lossy compression algorithms need to be applied with care, as they may not only smooth out noise, but also weak signal from less dense CMEs.
For the design and construction of an optics breadboard test model, baseline design parameters of SCOPE were defined with a goal of SNR = 4 per pixel at 30 R☉, and for a CME with a brightness of 1% of the background corona at this point. In this paper, we present the results from two approaches to assess SNR in coronagraph observations that were performed as part of the SCOPE SNR definition process: (1) by analysing images from SOHO/C3, which has a similar FOV to SCOPE, and (2) by generating simulated coronagraph images with artificial noise and analysing them using the CACTus software as an investigative tool. The SNR values achieved by C3 yield a reference for the performance of a well established instrument that is currently operational. From the baseline design of SCOPE, higher SNR than in C3 would be expected, mainly as the result of a larger aperture (25 mm vs. 9 mm). A shorter exposure time of approximately 10 s (~20 s for C3) will be roughly balanced by the higher quantum efficiency of a modern backside-illuminated charge-coupled device (CCD) detector. Simulated data allows us to apply different levels of predefined noise; by adjusting the SNR we can assess the detectability of a CME with different levels of noise applied. Furthermore, it is possible to assess the detectability of the same event from different viewing angles, to simulate what spacecraft at different locations would observe. However, producing accurate simulations is challenging due to the differences between real observations and idealised models. The main intention behind this second approach was to evaluate the SNR using an existing, well established operational detection algorithm, namely the CACTus tool, including investigating if there are any particular effects when the design trade pushes SNR below established levels. Using simulations rather than real data allows more control over the relevant parameters.
Section 2 outlines the selection of C3-observations for analysis, and the processing steps applied to the data. Section 3 discusses the generation of the simulated data. Section 4 presents the results obtained from the analysis of C3-observations and the CACTus-analysis of simulated images. Finally, Section 5 outlines the conclusions of the results.

Analysis of SOHO/LASCO observations
In this section we describe the method used to estimate the SNR in SOHO/C3 observations. Eight CME events were chosen, close to solar maximum of cycle 24, and are listed in Table 1, together with some of their properties taken from the SOHO CDAW catalogue (Gopalswamy et al., 2009), a manually-produced CME catalogue. These CMEs were mainly from a single month (Jan 2012), although we included the aforementioned very fast CME from July 2012 as an example of an extreme event.
For each event, full-resolution, unpolarised Level 0.5 image data from C3 was downloaded and processed using the following steps:
1. For each event, the last image before the CME moved out of the FOV is identified for evaluation (third column in Table 1).
2. The CCD bias values (as reported in the image data file-header) are subtracted from each image.
3. Each image is divided by the exposure time (as reported in the file-header) for the computation of a background.
4. For the image selected in step 1, the background image is computed as the 25th percentile over ±1 day centred on that image. The 25th percentile is chosen over the median value to avoid the CME influencing the background value; it is chosen over the minimum as the latter is not robust to artefacts in the images. Processing of each image used for the background computation includes the processing described in steps 2 and 3.
5. The background image is then scaled to match the exposure time of the image it is to be subtracted from.
6. The image of interest and background image are multiplied by a conversion factor of 13 e⁻ per analog-to-digital unit (ADU) to obtain the number of photoelectrons (Morrill et al., 2006).
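The processing steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the code used for the analysis: the function names are hypothetical, the frames are assumed to be already loaded as arrays together with the bias and exposure values from their headers, and the only number taken from the text is the conversion factor of 13 e⁻ per ADU.

```python
import numpy as np

GAIN_E_PER_ADU = 13.0  # C3 conversion factor quoted in the text (Morrill et al., 2006)


def normalise(image_adu, bias_adu, exposure_s):
    """Steps 2-3: subtract the CCD bias, then divide by the exposure time (ADU/s)."""
    return (np.asarray(image_adu, dtype=float) - bias_adu) / exposure_s


def background_rate(frames_adu, biases_adu, exposures_s, percentile=25.0):
    """Step 4: per-pixel percentile over a stack of normalised frames (ADU/s).

    The caller is assumed to pass only frames within +/- 1 day of the image
    of interest (hypothetical interface).
    """
    stack = np.stack([normalise(f, b, t)
                      for f, b, t in zip(frames_adu, biases_adu, exposures_s)])
    return np.percentile(stack, percentile, axis=0)


def to_electrons(image_adu, bias_adu, exposure_s, bg_rate):
    """Steps 5-6: scale the background to the image exposure, convert both to e-."""
    image_e = (np.asarray(image_adu, dtype=float) - bias_adu) * GAIN_E_PER_ADU
    background_e = bg_rate * exposure_s * GAIN_E_PER_ADU
    return image_e, background_e
```

Using a low percentile rather than the median keeps a slowly passing CME out of the background estimate, while remaining less sensitive to single-frame artefacts than the per-pixel minimum.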
From the resulting images, the SNR is estimated under the assumption that shot noise is the dominant source of noise in the images. The analysis presented by Morrill et al. (2006) shows that C3 read noise is around 5–7 e⁻, and fixed pattern noise of the CCD is negligible, as are dark current noise and hot pixels due to the low CCD operating temperature. In comparison, the observed background at the outer FOV edge is around 10⁴ e⁻, so that the shot noise computes to $\sqrt{10^4} = 100$ e⁻. The SNR is thus computed according to the formula:

$$\mathrm{SNR} = \frac{n_\mathrm{image} - n_\mathrm{background}}{\sqrt{n_\mathrm{image}}} \qquad (1)$$

Additionally, the ratio of signal to total image brightness is computed:

$$f = \frac{n_\mathrm{image} - n_\mathrm{background}}{n_\mathrm{image}} \qquad (2)$$

where $n_\mathrm{image}$ and $n_\mathrm{background}$ are the photoelectron counts of the image of interest and of the scaled background image, respectively. The SNR and fractional signal (f) are then inspected around the apex of the CME, i.e. the furthest extent into the FOV, of the selected events, taking into account the irregular substructure of real CMEs. In contrast, the simulated events, which are presented in the following section, are perfectly regular because they are computed from an idealised model.
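As a worked example of the shot-noise estimate, the sketch below evaluates the SNR and the fractional signal for a single pixel. It assumes that the signal is the background-subtracted photoelectron count and that the noise is the square root of the total count, which is the same shot-noise form used for the simulated images later in the paper; the function name and the sample numbers are illustrative only.

```python
import math


def snr_and_fraction(n_image, n_background):
    """Shot-noise-limited SNR and fractional signal for one pixel.

    n_image, n_background: photoelectron counts of the image of interest
    and of the scaled background (assumed inputs).
    """
    n_signal = n_image - n_background
    snr = n_signal / math.sqrt(n_image)   # noise = sqrt(total counts)
    fraction = n_signal / n_image         # signal relative to total brightness
    return snr, fraction


# Illustrative pixel: background of 10^4 e- (as quoted for the outer FOV edge)
# plus a hypothetical CME contribution of 200 e-.
snr, f = snr_and_fraction(10_200.0, 10_000.0)
print(f"SNR = {snr:.2f}, f = {f:.2%}")
```

For this illustrative pixel the SNR comes out just below 2 and the fractional signal just below 2%, which is the order of magnitude discussed for real C3 events.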

Simulated images
In this section, we discuss the computation of fully synthetic simulated C2 and C3 images for analysis with the CME detection tool CACTus. The simulations are composed of three components: (1) a K- and F-corona model, (2) a modelled CME, and (3) added noise.
The F(raunhofer)- and K(ontinuum)-corona are the two most important components of the solar corona observed by white-light coronagraphs above a few R☉. The F-corona is created when sunlight Mie-scatters off dust particles, while the K-corona is created by Thomson-scattering of sunlight by free electrons. The spectrum of both components is essentially equivalent to the solar spectrum, but the faster velocity of free electrons compared to heavy dust particles causes a large Doppler-broadening, so that Fraunhofer absorption lines are only visible in the F-component. Moreover, Thomson-scattered light is strongly polarised, while the F-corona is only slightly polarised. The F-corona exceeds the quiet K-corona beyond around 4 R☉ (Kimura & Mann, 1998). CMEs are composed of ionised plasma, and thus contribute only to the K-component. The coronagraphs on board SOHO and STEREO include polarising filters that can be used to separate the K- from the F-corona as well as providing some information on CME directionality (de Koning & Pizzo, 2011), although most images are downlinked in unpolarised light. Since the SCOPE design does not include a polariser, we only focus on unpolarised imaging in this paper.
The total brightness I(R) of the lower F + K corona was modelled as early as 1937 by Baumbach (1937), who used measurements up to 5 R☉ taken during several solar eclipses between 1905 and 1929. He derived the following three-component power law:

$$I(R) = 10^{-6} \left( 2.565\, R^{-17} + 1.425\, R^{-7} + 0.0532\, R^{-2.5} \right) \qquad (3)$$

where I denotes the brightness relative to the solar disc centre and R the distance from the Sun centre in solar radii. We found this model to fit well even to the data in the FOV of SOHO/C3, so we use it as the basis for our corona model. Note that here the final term of the equation dominates, as it decreases slowest with elongation and equals the second term at 2.08 R☉. The coronal brightness is, however, not only a function of radial distance from the Sun, but is also higher near the ecliptic plane due to the higher concentration of dust, gas and plasma, resulting in an asymmetric shape of the corona. We included this asymmetry with a simple model of an exponential falloff with increasing position angle from the ecliptic plane (equation (4)), where I(R, 0) is the brightness in the ecliptic as given by equation (3), α the position angle between a point in the image and the ecliptic, and k a scaling coefficient. As the distribution of the corona with respect to the ecliptic appears to be different at different elongations, different values of k for the FOV of C2, k = 4.5, and C3, k = 2.3, were used to improve the visual match with a reference dataset of images from SOHO/C2 and C3, from the full day of 2013-Sep-29. These data were used to tune the simulated background computed from equations (3) and (4). This reference dataset comprises Level 0.5 C2 and C3 data, which have only been corrected for alignment, so that solar-north is along the Y-axis of the images. The purpose of this tuning is to set realistic levels for the K- and F-corona model, as well as to have realistic header information (e.g. exposure time, CCD bias, Sun centre position in the image) in the model, which is required by CACTus.
Tuning the simulations to realistic levels helps in the evaluation of real coronagraph data. A single reference dataset was used in all simulations for consistency.
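The background model can be illustrated with a short Python sketch. The radial part uses the commonly quoted coefficients of the Baumbach (1937) power law. The angular part is an assumption made purely for illustration: the text only states that the falloff is exponential in the position angle with a scaling coefficient k, so the specific form exp(-|α|/k) with α in radians, and both function names, are hypothetical.

```python
import math


def baumbach_brightness(r):
    """Baumbach (1937) F+K coronal brightness at r solar radii,
    in units of the solar disc-centre brightness."""
    return 1e-6 * (2.565 * r**-17 + 1.425 * r**-7 + 0.0532 * r**-2.5)


def corona_brightness(r, alpha_rad, k):
    """Brightness at distance r and position angle alpha from the ecliptic.

    ASSUMPTION: the falloff is modelled as exp(-|alpha| / k) with alpha in
    radians. The exact functional form is not given here and may differ.
    """
    return baumbach_brightness(r) * math.exp(-abs(alpha_rad) / k)


# Distance at which the second and third Baumbach terms are equal:
# 1.425 r^-7 = 0.0532 r^-2.5  =>  r = (1.425 / 0.0532)^(1 / 4.5)
r_cross = (1.425 / 0.0532) ** (1.0 / 4.5)
print(f"terms cross at r = {r_cross:.2f} solar radii")
```

The crossover evaluates to roughly 2.08 solar radii, consistent with the statement that the slowest-decaying term dominates across the coronagraph FOV.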
The CME component was simulated using the graduated cylindrical shell (GCS) model (Thernisien et al., 2006;Thernisien, 2011). GCS provides a geometrical model of a flux rope CME. It serves as a potential improvement over simpler models like cone models or a lemniscate, which are often used to fit CME observations (e.g. the CME-Analysis Tool, CAT, Millward et al., 2013, which is available in SolarSoft, Bentley & Freeland, 1998). The GCS model describes the flux rope CME with six parameters, two describing its basic geometry (half angle, aspect ratio) and four describing the position in 3D-space (apex-longitude, -latitude, -height, and tilt angle). An implementation of GCS (Thernisien, 2011) is also available in SolarSoft, and includes a ray tracing code to compute a simulated Thomson-scattering image of the chosen set of parameters for a given vantage point.
For the simulations, we used values for the GCS aspect ratio κ = 0.4, half angle α = 30° and tilt γ = 0°. These are the default values in the GCS-tool, but also represent a typical CME. For the direction of CME propagation, the apex-latitude was set to θ = 0° and the apex-longitude to three different values: φ = 0°, 60°, 90° relative to the observer. The first two cases simulate an Earth-bound CME observed by a spacecraft located at Earth or the Lagrange-point L1, and a spacecraft located at the Lagrange-point L5. In the third case, the CME propagates in the POS, for reference. Finally, the apex height h is a function of time as the CME propagates away from the Sun. We assumed a constant velocity for the CME, with values of 700 km s⁻¹, 1400 km s⁻¹, 2100 km s⁻¹ and 2800 km s⁻¹ to test the cases of slow and fast CMEs. While extreme events can be even faster than 2800 km s⁻¹ (Webb & Howard, 2012), this is the highest velocity for which our reference dataset, which has a 12-min cadence, should still satisfy the requirement of 13 frames in combined C2/C3 images (as specified in Robbrecht & Berghmans (2004), for avoiding false detections in CACTus).
After both the GCS CME image and the coronal background image (referred to as the coronal image from here on) have been computed, the CME image is scaled so that the brightest point of its outermost visible front (the apex in the case of POS propagation) is a prescribed percentage of the brightness of the coronal background at a reference heliocentric distance of 30 R☉ in the POS, i.e. the outer FOV-limit. Based on the observations in LASCO/C3, see Section 4, we have chosen percentages of 1% and 2%. To include an extreme test, some simulations were also computed using 10%.
Across the FOVs of C2 and C3, the brightness of the GCS-CME was then scaled using the density relation found by Bothmer & Schwenn (1997), where $N_p \sim R^{-2.4}$. This is similar to the effective gradient of the coronal brightness as described above. Alternatively, one could use the often-used assumption of self-similarity, which should lead to an $R^{-3}$ law. As this would increase the relative brightness in the inner FOV, one would assume this would simplify automatic detection, and we thus decided to use the empirical, more challenging case.
At this point in the process, the background coronal image and CME image are combined in units of brightness relative to the solar disc centre according to equation (3). To compute shot noise, the summed image must be converted to the number of photoelectrons n that the instrument's CCD detector would produce, and vignetting must be taken into consideration. For an image composed of a background corona and a CME with brightness 10% of the corona, for example, where the CME is the signal of interest, the SNR per pixel can be computed as follows:

$$\mathrm{SNR} = \frac{n_\mathrm{CME}}{\sqrt{n_\mathrm{CME} + n_\mathrm{Corona}}} \qquad (5)$$

where $n_\mathrm{CME}$ is the number of photoelectrons corresponding to the CME and $n_\mathrm{Corona}$ the number of photoelectrons corresponding to the background corona. Hence, with $n_\mathrm{CME} = 0.1\, n_\mathrm{Corona}$, the coronal background signal for this pixel, in photoelectrons, is given by

$$n_\mathrm{Corona} = \frac{1.1}{0.1^2}\, \mathrm{SNR}^2 = 110\, \mathrm{SNR}^2$$

This means that 440, 1760 and 11,000 electrons have to be collected in a pixel to achieve an SNR of 2, 4 and 10, respectively. In comparison, for a CME with 1% brightness relative to the background corona, the equations above yield the relationship:

$$n_\mathrm{Corona} = \frac{1.01}{0.01^2}\, \mathrm{SNR}^2 = 10{,}100\, \mathrm{SNR}^2$$

which results in a requirement of about 40,000, 160,000 and 1,000,000 photoelectrons for an SNR of 2, 4 and 10, respectively. In comparison, the full-well capacity (FWC) of a modern CCD sensor such as the Te2v CCD-230 family (baselined for SCOPE) is greater than 150,000 electrons per pixel. If high SNR is desired, to enable fainter CMEs to be detected, it must additionally be remembered that the image sensor would only be exposed to a few percent of its FWC near the outer FOV edge to prevent saturation at the inner edge of the FOV, due to the large intensity gradient of the corona. Other noise sources, such as CCD read noise and dark current, were not considered in this computation because they are typically only around 10 electrons per pixel.
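The electron counts quoted above follow directly from the shot-noise SNR definition once the CME is expressed as a fixed fraction of the coronal background. A minimal Python check, in which the function name is hypothetical and the general expression is simply the SNR definition solved for the background count, could look like this:

```python
def required_corona_electrons(snr, cme_fraction):
    """Background photoelectrons per pixel needed to reach a target SNR.

    Derived from SNR = n_CME / sqrt(n_CME + n_Corona) with
    n_CME = cme_fraction * n_Corona, solved for n_Corona.
    """
    return snr**2 * (1.0 + cme_fraction) / cme_fraction**2


for fraction in (0.10, 0.01):
    counts = [round(required_corona_electrons(s, fraction)) for s in (2, 4, 10)]
    print(f"CME at {fraction:.0%} of corona: {counts}")
```

For a 10% CME this gives 440, 1760 and 11,000 electrons, and for a 1% CME 40,400, 161,600 and 1,010,000 electrons, matching the rounded values in the text. The required count grows with the inverse square of the CME fraction, which is why faint CMEs quickly push the requirement towards the full-well capacity.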
In our simulated images, this saturation is not directly considered, as the image is scaled to real ADU-levels (see below). We can thus generate simulations with unrealistically high electron levels (and thus SNR) for comparison. In a real instrument, if the number of electrons needed to fulfil a specific SNR requirement exceeds the FWC, the limit can potentially be alleviated via a multiple-exposure scheme as explained above. The next step performed is the application of vignetting to the combined corona + CME image. This is performed using the C2/C3 vignetting profiles available within the SolarSoft library. After vignetting has been applied, Poisson-noise is added to give the images the prescribed level of SNR.
Since the Level 0.5 C2/C3 data files are not in units of solar disc centre brightness or photoelectrons, but in ADU, the simulated images are converted from units of photoelectrons to realistic ADU levels. This is done by scaling to the values observed in the real C2/C3 image data, using an image that contained no CME for both the simulated images and the 2013-Sep-29 data.
The final two steps that are necessary are to adjust the simulated images for variations of the exposure time reported in the file-headers, relative to a nominal exposure time of 20 s, and to add the bias value, also taken from the C2/C3 file-headers.
To summarise, simulated images are generated, for each time of a C2- and C3-observation on 2013-Sep-29, using the following steps:
1. Compute a coronal background image (according to Eqs. (3) and (4)).
2. Compute an artificial CME image using GCS.
3. Scale the GCS CME to a prescribed fraction of the coronal background at the edge of the FOV.
4. Add the background and CME image.
5. Convert brightness to number of electrons by scaling, at the brightest pixel of the CME apex, to the desired number of electrons according to equation (9).
6. Apply vignetting.
7. Add Poisson noise.
8. Convert the image from photoelectrons to ADU-levels of C2/C3 observations.
9. Apply exposure time variations taken from the corresponding C2/C3 file-header.
10. Add CCD bias taken from the corresponding C2/C3 file-header.
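Steps 4 to 7 of this summary can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the simulation code itself: the function name and its interface are hypothetical, the vignetting is passed in as a plain array of transmission factors instead of the SolarSoft profiles, and the scaling in step 5 is interpreted as fixing the coronal background count at a chosen reference pixel.

```python
import numpy as np


def simulate_noisy_image(corona, cme, n_corona_ref, ref_pixel, vignetting, rng):
    """Combine corona and CME, scale to photoelectrons, vignette, add shot noise.

    corona, cme   : 2-D arrays in relative brightness units
    n_corona_ref  : desired background photoelectron count at ref_pixel
                    (interpretation of step 5; an assumption)
    ref_pixel     : (row, column) of the reference pixel, e.g. the CME apex
    vignetting    : 2-D array of transmission factors in [0, 1]
    rng           : numpy.random.Generator
    """
    corona = np.asarray(corona, dtype=float)
    total = corona + np.asarray(cme, dtype=float)      # step 4
    scale = n_corona_ref / corona[ref_pixel]           # step 5
    electrons = total * scale * vignetting             # step 6
    return rng.poisson(electrons).astype(float)        # step 7


# Usage with a flat toy corona and a CME at 1% of the background.
rng = np.random.default_rng(0)
corona = np.full((50, 50), 1e-10)
cme = 0.01 * corona
noisy = simulate_noisy_image(corona, cme, 10_000.0, (0, 0),
                             np.ones((50, 50)), rng)
```

Applying the vignetting before drawing the Poisson samples matters, because the noise then scales with the number of electrons that actually reach the detector rather than with the unvignetted brightness.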
A comparison between a simulated C3-image and a real observation can be seen in Figure 3. It shows two profiles, one along the central meridian (red line for the simulation and green line for the C3 observation) and the other along the ecliptic (orange line for the simulation and blue line for the C3 observation). The data shown were computed for SNR = 2 at 1% relative CME brightness at the edge of the FOV and the brightest point of the front. Around the occulter, the simulation shows an asymmetry that does not exactly match the data. Information on the location of the Sun included in the data header was used, but there appears to be a residual offset in this case. Figure 3 also reveals one component of the real data that has not been included in the simulations: the diffraction rings around the occulter. These can be seen in the slices through the real data as broad spikes near the intensity drop corresponding to the occulter. As CACTus operates on difference images, these rings will be removed in data processing and should not play a significant role in the detection of the CME.
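The argument that static features such as diffraction rings drop out of difference images can be illustrated with a toy example. The following Python sketch is an assumption-laden illustration, not part of CACTus: the radial profile, ring shape, front shape and step size are all invented for the demonstration.

```python
import numpy as np

def running_difference(frames):
    """Subtract each frame from its successor along the time axis (axis 0)."""
    frames = np.asarray(frames, dtype=float)
    return frames[1:] - frames[:-1]

# Toy 1-D radial cut: a static ring near the occulter plus a moving front.
r = np.arange(100)
static_ring = 50.0 * np.exp(-0.5 * ((r - 10) / 2.0) ** 2)
frames = []
for t in range(4):
    # Hypothetical front that moves outward by 15 pixels per frame.
    front = 5.0 * np.exp(-0.5 * ((r - (30 + 15 * t)) / 3.0) ** 2)
    frames.append(static_ring + front)

diff = running_difference(frames)
```

In `diff`, the ring at pixel 10 cancels because it is identical in consecutive frames, while the moving front survives as a positive peak at its new position and a negative peak at its previous one.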
In total, more than 54 events were simulated (see Table 2 in Sect. 4 for a list of parameter combinations). This also includes a few "null simulations" without any CME in the FOV, but containing the same absolute noise levels in the range covered by the regular simulations. The simulated images are input into the automated CME detection algorithm CACTus (Robbrecht & Berghmans, 2004), which is used operationally and for research, and has been shown to be fairly robust when detecting larger CMEs, achieving a correspondence with the CDAW catalogue of up to 80% (Robbrecht et al., 2009). By using an automated detection algorithm, we remove the partiality of a human observer.
CACTus uses a Hough-transform to detect straight lines in height-time maps generated from running-difference images transformed to polar coordinates. The bright parts of CMEs show as bright tracks in the height-time maps as they propagate away from the Sun, and the gradient of the track represents the velocity. A full CME is then identified as a cluster in the resulting data over a range of position angles, onset-time and velocity. The detection thresholds of CACTus are tunable, whereby its detection ability can be adjusted for larger and brighter CMEs with a higher level of confidence, or to include smaller CMEs, but with a lower level of confidence, as more positive detections can arise from noise. The tuning parameters are an intensity threshold in the Hough-space, a noise constraint factor and a minimum angular width of the CME (full angular width as opposed to the half-angle used in the GCS model). In our simulations, we use threshold = 0.3, noise constraint factor = 4 and width = 5°, which are the default parameters.
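The idea of finding a straight track in a height-time map, with its gradient giving the velocity, can be sketched as a brute-force voting scheme over onset time and velocity. This is only an illustration of the principle, not the CACTus implementation: the function name, the uniform height grid, the 12-minute frame spacing and the candidate grids are all assumptions made for the example.

```python
import numpy as np

def hough_velocity(ht_map, times_s, heights_km, onsets_s, velocities_kms):
    """Vote for straight tracks h = v * (t - t0) in a height-time map.

    ht_map has shape (n_times, n_heights). The returned accumulator has
    shape (n_onsets, n_velocities); each cell is the summed intensity along
    the corresponding candidate track.
    """
    dh = heights_km[1] - heights_km[0]  # uniform height bins assumed
    rows = np.arange(len(times_s))
    acc = np.zeros((len(onsets_s), len(velocities_kms)))
    for i, t0 in enumerate(onsets_s):
        for j, v in enumerate(velocities_kms):
            # Height bin that the candidate track crosses in each frame.
            col = np.rint((v * (times_s - t0) - heights_km[0]) / dh).astype(int)
            ok = (col >= 0) & (col < len(heights_km))
            acc[i, j] = ht_map[rows[ok], col[ok]].sum()
    return acc

# Toy map: one bright track at 700 km/s starting at t = 0 (assumed 12-min frames).
times_s = 720.0 * np.arange(30)
heights_km = 1.0e5 * np.arange(210)
ht_map = np.zeros((times_s.size, heights_km.size))
track_cols = np.rint(700.0 * times_s / 1.0e5).astype(int)
ht_map[np.arange(times_s.size), track_cols] = 1.0

onsets_s = np.array([-1440.0, -720.0, 0.0, 720.0, 1440.0])
velocities_kms = np.arange(300.0, 2900.0, 100.0)
acc = hough_velocity(ht_map, times_s, heights_km, onsets_s, velocities_kms)
i_best, j_best = np.unravel_index(np.argmax(acc), acc.shape)
best_onset_s, best_velocity_kms = onsets_s[i_best], velocities_kms[j_best]
```

A threshold applied to `acc` plays a role loosely analogous to the intensity threshold in Hough space mentioned above: lowering it admits fainter tracks but also lets noise-induced alignments through.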
The primary output of CACTus is a list of detections and their characteristics, together with a plot of the detected bright fronts in a map of time vs. position angle. An example can be seen in Figure 10 in the results (for event no. 49 from our list of simulated events). The detections are separated into CMEs and flows, and the parametrisation includes onset time (t0), duration (dt0), central position angle (pa), angular span (da), velocity (v), and velocity span across the angular span (dv) together with minimum (minv) and maximum velocity (maxv). The events are furthermore flagged as (partial) halos if the angular span is larger than 90°. The colour indicated in the detection plot (Fig. 10b) has no meaning, and is used only to distinguish the individual detected events. CACTus also displays an example observation for each detection with the angular span indicated (e.g. Fig. 8) and a graph of detected velocity vs. position angle (not shown).

Results
Figures 4-6 show SOHO/C3 images of CMEs, processed as described in Section 2, from 25-Jan-2012 at 13:30 UT, 26-Jan-2012 at 09:06 UT and 27-Jan-2012 at 21:20 UT, respectively. These represent three events with velocities of v_CDAW = 532 km s⁻¹, v_CDAW = 1194 km s⁻¹, and v_CDAW = 2508 km s⁻¹. The SNR at the leading edge also increases across the three events, reaching peak values of 2, 3 and 4, respectively, near the outer edge of the FOV. Across our entire dataset (Table 1), we find SNR values between 1 and 4 near the CME leading edge. The corresponding number of CME electrons is between 300 e⁻ and 1000 e⁻, equating to between 0.75% and 2% of the background. All CMEs show considerable substructure and, because the coronal brightness decreases away from the Sun, higher SNR values are observed closer to the Sun. This is especially true for the extremely strong event of 2012-July-23, sometimes called a "superstorm event", which was likely a double-ejection of two CMEs in quick succession (Liu et al., 2014; Temmer & Nitta, 2015).
Fig. 3. Comparison of the brightness along the ecliptic and along the central meridian of a simulated C3 image and its corresponding C3 observation.
Table 2. List of the CACTus results for the CME simulations. The columns give: simulation number, the SNR value at the brightest point of the GCS CME, the longitude of the CME relative to the observer (with 0 being a halo CME), an indication if the detection was successful and contained further detections, the median velocity including variation reported by CACTus for the main CME where detected, and the corresponding maximum velocity reported by CACTus. Where event detections are marked "As flow", CACTus made a correct detection but reported it as a flow due to uncertainties. The velocity v_simu of the batches is the 3D apex velocity, and the POS velocity reported by CACTus differs depending on the relative longitude.
Figure 7 shows a comparison of simulations 29-32 (labelled (a)-(d)), which show an event with SNR 4 (a), 2 (b), 1 (c) and 0.5 (d). Each image is processed as a running-difference image. Moreover, each image was processed with enhanced contrast settings, plotted using the CME Analysis Tool (CAT). Down to SNR = 1, the CME can be seen clearly by eye. At SNR = 0.5, detection becomes difficult even for a human observer.
Table 2 lists the results obtained from the CACTus analysis of the simulated images. The simulations are grouped in batches of different CME (apex) velocity and relative CME brightness at 30 R☉ in the POS. At velocities of 700 km s⁻¹ and 1400 km s⁻¹, CACTus was able to detect the CME in all cases, though in some cases with additional CMEs or flows being reported that feature a small angular width. For a relative CME brightness of 2% and SNR = 0.5, which has a high absolute noise, the event is marked as a flow instead of a CME, but otherwise the detection parameters are similar.
In the extreme case with 10% relative brightness and SNR = 4, the CME was still detected, but accompanied by a much larger number of random events, most likely due to the higher absolute noise in this case. Interestingly, these random detections did not occur for SNR = 2.
Where the CME is detected, CACTus reports accurate POS-velocities. The maximum velocity has been compared to a manual fit using CAT assuming that the CME propagates in the POS. The result matches the maximum velocity v_max reported by CACTus to within a few km s⁻¹ in most cases. For simulations no. 36 and 38, a single outlier data point was responsible for a high v_max. The median velocity is computed over the detected range of PAs, and therefore expected to be lower. It is not as easy to compare to a manual fit exactly, but for the halo-CME it lies between the manually fitted velocity at the equator and the pole, and therefore appears to be consistent.
Fig. 7. Comparison of running-difference images of simulations no. 29 (top left; label (a)) to 32 (bottom right; label (d)), showing a CME at 60° elongation with SNR values of 4, 2, 1 and 0.5 respectively. The images were generated with the CAT software, using manual contrast adjustment to enhance the leading edge.
At a higher velocity of 2100 km s⁻¹, the detection started to become problematic. The halo-CME was still detected properly independent of SNR, while for the limb-events at longitudes of 60° and 90° only the two legs were detected as separate features. At 2800 km s⁻¹ (results not listed), the halo-CME was still detected, while for the CME at 60° longitude the fractional leg-detection was even more strongly separated and the CME at 90° was not detected at all. Slower events than 700 km s⁻¹ were also computed and successfully detected with similar fidelity to those at 700 km s⁻¹, so they are not shown here to keep Table 2 compact.
The null simulations without any CME in the image were computed with absolute noise levels corresponding to simulation no. 1-12 and 49-54, i.e. covering the cases of lowest and highest absolute noise. They did not produce any detections at the same CACTus settings, though with a slight reduction of the CACTus threshold parameter from 4.0 to 3.8, a few false positives could be produced even in these data.
Figures 8 and 9 show an example of a successful CME detection by CACTus for the case of simulation no. 22, at longitude = 90°, SNR = 2 at 2% relative brightness and v = 700 km s⁻¹. The angular extent of the CME is identified well, and the maximum velocity of 686 km s⁻¹ is close to the nominal 700 km s⁻¹. The variation of the median velocity of 125 km s⁻¹ is a result of the range of PAs over which the value is computed. The detection plot (Fig. 9) shows both the leading and the trailing edges of the flux rope, probably due to the low density inside the core of the GCS-flux rope.
Figure 10 shows the detection plot for simulation no. 49, with SNR = 4 at 10% relative brightness. The high absolute noise level in this case leads to false detections at various position angles, although the main CME is also detected correctly (CME 0001). Figure 11 shows a difference image for simulation no. 43, with v = 2100 km s⁻¹. Here, CACTus has detected one leg of the GCS flux tube, with the other leg being detected as a separate CME.

Conclusions
The SNR that a coronagraph needs to achieve matters both for the hardware, in terms of optical design, and for the software, in terms of the ability to detect CMEs (automatically). In this paper, we report the results from two approaches that establish and analyse a baseline SNR target for the coronagraph design study SCOPE. In the first approach, we analysed SOHO/C3-observations to find out the typical SNR values in data currently used for operational purposes. In the second approach, we studied the effects on automatic CME detection as the SNR deviates from the initially established design goal of SNR = 4, using artificial SOHO/C2 and C3 coronagraph images that were created by combining a modelled background corona image with GCS-flux rope renderings, and applying different levels of noise.
The inspection of SOHO/C3 images of several CMEs showed SNR values ranging from around 1 to 4 near the CME apex when the latter is close to the outer FOV limit, for fainter and brighter CMEs respectively. The brightness at the leading edge is furthermore observed to be in the range between 0.75% and 2% of the background corona. For SCOPE, these results confirm that the goal of SNR = 4 at 1% relative brightness will be compatible with operational demands. In comparison with C3, the SCOPE design features a larger aperture, wider spectral bandpass and more efficient detector, but also a shorter total exposure time to allow unblurred detection of exceptionally fast events (coupled with the summing of several exposures). As the FOV and image resolution are nearly identical, this should result in an increased SNR compared to C3. As we observed peak SNRs at the CME leading edge around 4 with C3, we can expect to reach this for more events with SCOPE if a similarly low instrument stray light can be achieved, or alternatively allow for a more relaxed stray light budget. Onboard summing of multiple shorter exposures helps to mitigate the impact of SEP events, as each exposure is scrubbed of particles before being summed onboard, without falling below the image quality achieved by SOHO.
[Figure caption, beginning missing: (v = 2100 km s⁻¹; SNR = 1 at 1% relative brightness), showing the angular span of the detected CME, which covers only one leg of the GCS-flux tube.]
The detection of our simulated events by CACTus succeeded down to SNR = 1 in the range of 1-2% relative brightness of the CME apex to the corona at 30 R☉, a similar range to what we observed in SOHO/C3. At SNR = 0.5, some events were marked as a flow instead of a CME, but otherwise parametrised correctly. Due to the definition of SNR, a higher relative brightness implies higher absolute noise at the same SNR. More pixels then fall above the detection threshold and trigger false detections, which appears to be why CACTus detects noise. In our results, this is visible as additional CMEs and flows being reported besides the correct detection of the simulated CME. This was observed especially for halo CMEs, but also with high absolute noise as seen in some of the simulations at 10% relative brightness. For future applications of automatic on-board detection, the implications of additional false positives should be investigated in further development work of such algorithms. In our simulations, the actual CME was still detected at SNR = 0.5, but for automatic decision-making false positives could lead to non-optimal performance. Because these false positives occurred in the halo events even at SNR ≥ 4, this might be more important for a coronagraph located at the Lagrange-point L1 than at L5.
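The link between relative brightness and absolute noise at fixed SNR can be made explicit with a short calculation. The Python sketch below assumes pure photon (Poisson) noise and SNR = S / sqrt(B + S), with S = f·B the CME electrons and B the background electrons; this definition is an assumption for the illustration and may differ in detail from the one used earlier in the paper. Both function names are hypothetical.

```python
import math

def background_electrons(snr, rel_brightness):
    """Background electrons B for which a CME of f * B electrons has the given SNR.

    Assumes SNR = S / sqrt(B + S) with S = f * B (photon noise only).
    """
    f = rel_brightness
    return snr ** 2 * (1.0 + f) / f ** 2

def relative_noise(snr, rel_brightness):
    """Noise standard deviation as a fraction of the background level."""
    b = background_electrons(snr, rel_brightness)
    s = rel_brightness * b
    return math.sqrt(b + s) / b  # algebraically equal to f / SNR
```

Under these assumptions the noise relative to the background is f / SNR. Since the simulated images are scaled to the fixed ADU level of a real background, a tenfold increase in relative brightness at the same SNR gives a tenfold increase in absolute noise in the final image.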
At a high velocity of 2100 km s⁻¹, the detection started to break down, with first only the legs of the limb-events being detected as individual events, while at 2800 km s⁻¹ the CME at 90° longitude was no longer detected at all. This is likely because the CME crosses the FOV so quickly that the number of images containing it falls below the internal limits built into CACTus. This is also consistent with the CACTus catalogue, where values of the maximum velocity above 2000 km s⁻¹ are observed only in very few cases. The results of the simulation thus also support the specifications of SCOPE.
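How quickly the number of usable frames shrinks with speed can be estimated with simple arithmetic. The Python sketch below is a rough illustration only: it assumes a constant plane-of-sky speed, a fixed image cadence and sharp FOV limits, and the cadence and FOV values in the usage lines are placeholders rather than the values of any specific instrument. It does not model the internal limits of CACTus.

```python
R_SUN_KM = 695_700.0  # nominal solar radius in km

def frames_in_fov(speed_kms, cadence_s, r_inner_rsun, r_outer_rsun):
    """Lower bound on the number of frames taken while a front crosses the FOV.

    Assumes constant plane-of-sky speed and a fixed cadence (both simplifications).
    """
    transit_s = (r_outer_rsun - r_inner_rsun) * R_SUN_KM / speed_kms
    return int(transit_s // cadence_s)

# Placeholder values: 12-minute cadence and a 2.5-30 solar radii FOV.
n_slow = frames_in_fov(700.0, 720.0, 2.5, 30.0)
n_fast = frames_in_fov(2100.0, 720.0, 2.5, 30.0)
n_extreme = frames_in_fov(2800.0, 720.0, 2.5, 30.0)
```

With these placeholder values the frame count drops by roughly a factor of three to four between the slowest and fastest cases, which is the kind of reduction that could push a track below a minimum-length requirement in a height-time detection scheme.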
Finally, it is worth noting that we encountered a few cases where CACTus runs of a simulation would abort with an error. This appeared to be caused by an incorrect convergence of an IDL-subroutine. A simple recomputation of the simulations without changes to the parameters, and hence only changes to the random noise, resolved the issue and the CME was detected correctly and consistently with other simulations. We are not aware of such behaviour in real data.
In future work, simulated coronagraph observations of improved quality could again be used to test and perhaps even pre-tune automatic detection algorithms, which in turn could help to transition a new instrument faster into operational services. Possible improvements to the model include:
1. An improved background corona-model with better out-of-ecliptic profiles.
2. Simulations at different times of the solar cycle and thus different background corona.
3. A larger variety of CME flux rope geometries.
4. Including background structures such as streamers or a stellar background.
5. Using non-linear CME propagation and changing geometry.
Improved corona background models include e.g. the recent work of Stenborg et al. (2018), who describe the shape of the out-of-ecliptic corona. Covering times across the solar cycle would require some variation in background structure. This could again be performed by using reference data sets and tuning the coefficients to each set of images, as was done in this work with equations (3) and (4). A larger variety of geometries should be used as CMEs are not all identical in this regard. Distributions of the parameters with respect to kinematic properties, e.g. as recently demonstrated by Mrotzek (2020), could be included. The range of CME speeds should also be varied, to investigate the detectability as a function of the number of images with the CME in the FOV and, as explained above, to ensure compliance with the algorithm for extreme events. Our choice of relative brightness was shown to be too high by the results of the LASCO image inspection, and hence should also be reduced. Finally, to reach an even higher degree of realism, "imperfections" can be introduced into the background corona, most importantly streamers. These should also be computed with some temporal variability, as they would otherwise be almost fully removed by the difference-imaging, depending on included or simulated spacecraft pointing errors.