Automated Detection of Solar Eruptions

Observation of the solar atmosphere reveals a wide range of motions, from small-scale jets and spicules to global-scale coronal mass ejections (CMEs). Identifying and characterizing these motions are essential to advancing our understanding of the drivers of space weather. Both automated and visual identifications are currently used in identifying CMEs. To date, eruptions near the solar surface (which may be precursors to CMEs) have been identified primarily by visual inspection. Here we report on EruptionPatrol (EP): a software module that is designed to automatically identify eruptions from data collected by SDO/AIA. We describe the method underlying the module and compare its results to previous identifications found in the Heliophysics Event Knowledgebase. EP identifies eruption events that are consistent with those found by human annotators, but in a significantly more consistent and quantitative manner. Eruptions are found to be distributed within 15 Mm of the solar surface. They possess peak speeds ranging from 4 to 100 km/sec and display a power-law probability distribution over that range. These characteristics are consistent with previous observations of prominences.


Introduction
Eruptions in the low solar atmosphere are key elements in generating space weather. Large eruptions can evolve into coronal mass ejections (CMEs) that can plow through the solar wind and ultimately impact the Earth's magnetosphere (Munro et al.(1979), Gosling(1993)). The source of many of these CMEs has been associated with prominence eruptions (Gopalswamy et al.(2003)). Yan et al.(2011) find that approximately half of active region filament eruptions are associated with CMEs and over 90% are associated with flares. Smaller eruptions may provide the ultimate source for the solar wind (Tian et al.(2014)). Regardless of their magnitude, eruptions play a significant role in the structure and dynamics of the solar atmosphere.
Identifying eruptions occurring near the solar surface is complicated by the presence of a wide variety of features and scales. Active regions, coronal holes and filaments persist for long periods, while the short-lived eruptions pass through and among them. Flares and other sudden changes in intensity add distractions that can mask or mimic motions that would otherwise be visible. As a result, automated detection of these eruptions has been challenging.
Previous studies have developed automated methods to detect and track filaments, primarily in H-alpha images (Gao et al.(2002), Wang et al.(2010), Schuh et al.(2014)). These methods typically use image-based feature detection followed by a tracking step comparing the results of sequential detections. Similarly, Gissot et al.(2008) used an optical flow method to analyze the motion of a filament in three dimensions as observed in the 304 Angstrom images acquired by the EUVI instruments on the two STEREO spacecraft (Kaiser et al.(2008)). Measured velocities for prominence eruptions in these studies tend to lie in the range of 10-100 km/sec, while quiescent prominences show velocities around 4 km/sec or less.
We take a different approach by first extracting velocities from a sequence of images and then identifying features within the resulting velocity fields. Here we use an optical flow method to identify regions of significant motion. Using derived velocity fields to define regions of interest rather than working directly from the images removes many of the distracting features and permits us to identify and characterize the flows in sufficient detail for further analysis.
In the following sections we present the underlying method used by Eruption Patrol, assess its performance, survey the statistical properties of the resulting detections and summarize our findings in the conclusion.

Method
Our approach to identifying solar eruptions is to extract velocity fields from sequences of solar images using the opflow3d method described in Hurlburt and Jaffey(2014), as applied to images obtained by the Atmospheric Imaging Assembly on the Solar Dynamics Observatory (SDO/AIA, Lemen et al.(2012)). Ten sequential He II 304 Ångstrom images (spanning two minutes) of full-resolution Level 1 data are processed to create a single velocity map. These images have had dark current and flat-field corrections applied and have had spikes caused by bad pixels and radiation hits removed.
Figure 1: … aligned with the local velocity, with areas proportional to the speed. The corresponding image is shown in the color background. Two regions are seen to be erupting: a long filament on disk is ascending into the corona; and another region in the upper left that may be part of the same eruption, or a sympathetic response. Only velocities over 1 km/sec are displayed and solar rotation is not removed. The peak velocity here is 3.8 km/sec.
After square-root compression these images are fed into opflow3d to extract a time-averaged velocity field with an effective spatial resolution of 60 arc-seconds. The opflow3d method uses a least squares approach that has been shown to minimize the effects of detector noise, transient intensity variations and other sources of measurement error. This presents a trade-off between the sample size in space and time versus accuracy and computational speed. With the choice of 60 arc-seconds (100 pixels) and 10 frames, we expect statistical errors of less than 1% and a computational time for a single velocity fit of approximately one minute on a 2013-vintage Apple iMac.
Previous studies found velocities exceeding 100 km/sec, which translates to about 17 arc-seconds during one velocity fit. This is within our chosen resolution, so systematic error due to smearing should be small; it also suggests that the maximum reliable velocity estimate we can expect for our sampling choice is about 350 km/sec. Fitting for higher speeds would require either smaller time samples or larger spatial windows. For instance, we could detect speeds approaching 3.5 Mm/sec over 60 arc-seconds by using the maximum AIA cadence of 12 seconds. (If we were to apply this method to the coronal images collected by AIA, say 193 Ångstroms, where velocities are expected to reach these ranges, we would need to adjust our parameters accordingly.) An example of the resulting flow field is displayed in Figure 1. The spatial resolution of our velocity fit was doubled to 30 arc-seconds here to better define the regions. The flow associated with the large filament eruption near the northern pole is clearly captured, as well as a few smaller-scale flows around the limb.
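The displacement and speed limits quoted above follow from simple arithmetic, which the following minimal sketch reproduces. It assumes a conversion of roughly 725 km per arc-second at the Sun as seen from 1 AU; that constant and the function names are illustrative and are not taken from EruptionPatrol.

```python
# Assumption: 1 arc-second corresponds to roughly 725 km on the Sun (from 1 AU).
KM_PER_ARCSEC = 725.0


def displacement_arcsec(speed_km_s, duration_s):
    """Apparent plane-of-sky displacement, in arc-seconds, during one fit."""
    return speed_km_s * duration_s / KM_PER_ARCSEC


def max_speed_km_s(window_arcsec, duration_s):
    """Speed at which a feature crosses the whole fitting window in one fit."""
    return window_arcsec * KM_PER_ARCSEC / duration_s


# 100 km/sec sustained over a two-minute fit.
shift = displacement_arcsec(100.0, 120.0)
# 60 arc-second window sampled over two minutes.
v_max = max_speed_km_s(60.0, 120.0)
# Same window, but sampled over a single 12-second cadence step.
v_max_fast = max_speed_km_s(60.0, 12.0)

print(f"shift = {shift:.1f} arcsec, v_max = {v_max:.0f} km/sec, "
      f"v_max_fast = {v_max_fast / 1000.0:.2f} Mm/sec")
```

With the assumed conversion, the results land close to the rounded values in the text: a shift of about 17 arc-seconds, a ceiling somewhat above 350 km/sec for the two-minute fit, and a ceiling ten times higher at the 12-second cadence.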
The derived velocity fields are composed of multiple components, some of which are sources of error for our application: these include solar rotation, super-granulation and other quasi-static motions. Most of these motions are small and reasonably isotropic. The solar rotation profile is neither, with a peak value of about 2.2 km/sec. This can introduce a bias when using a thresholding technique to identify eruption sites. Hence, EruptionPatrol subtracts a background velocity corresponding to that of solid-body rotation (but does not correct for the smaller effects of differential rotation and meridional circulation).
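To illustrate the kind of background removal described above, here is a minimal sketch of a solid-body rotation model in the plane of the sky. It assumes the rotation axis lies in the plane of the sky (solar B0 angle of zero), disk coordinates in units of the solar radius with x toward solar west, and no subtraction off the disk; the function names are hypothetical and this is not the EruptionPatrol implementation.

```python
import math


def solid_body_rotation(x, y, v_peak=2.2):
    """Plane-of-sky rotation velocity (km/sec) at disk position (x, y).

    x, y are in units of the solar radius, x toward solar west.
    Assumption: rotation axis in the plane of the sky (B0 = 0), so the
    transverse speed scales with the line-of-sight depth sqrt(1 - x^2 - y^2).
    """
    rho2 = x * x + y * y
    if rho2 >= 1.0:
        # Off-disk points: this sketch subtracts no background there.
        return (0.0, 0.0)
    return (v_peak * math.sqrt(1.0 - rho2), 0.0)


def remove_rotation(vx, vy, x, y, v_peak=2.2):
    """Subtract the solid-body background from a measured velocity."""
    bx, by = solid_body_rotation(x, y, v_peak)
    return (vx - bx, vy - by)
```

In this model the background is largest at disk center and falls to zero at the limb, which is why an unsubtracted rotation signal would bias a fixed speed threshold toward the center of the disk.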
EP samples velocities every 20 minutes and records the time, location and velocity at the point of maximum speed within each sample. As described above, this velocity corresponds to the best fit over a region of about 60 arc-seconds in a two-minute interval. Hence the precise position of the peak is only known to that resolution. Figure 2 displays the raw output of the patrol over a seven-week period starting on 29 March 2014. The effect of the rotation removal can be seen as a drop in the floor of the velocity measurements to values consistent with those expected from super-granulation (e.g. Shine et al.(2000)) and other sources. Peaks corresponding to eruptions and spacecraft motions are also clearly visible. The latter are excluded in the production version of the method.
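The per-sample record described above can be sketched as a search for the fastest point in a velocity map. The representation of the map as two nested lists and the function name are assumptions made for illustration only.

```python
import math


def peak_sample(time, vx, vy):
    """Return (time, row, col, vx, vy, speed) at the point of maximum speed.

    vx and vy are equally shaped 2-D lists of velocity components (km/sec).
    The row/column indices stand in for the position on the Sun.
    """
    best = None
    for i, (row_x, row_y) in enumerate(zip(vx, vy)):
        for j, (u, v) in enumerate(zip(row_x, row_y)):
            speed = math.hypot(u, v)
            if best is None or speed > best[5]:
                best = (time, i, j, u, v, speed)
    return best


# A tiny 2x2 velocity map whose fastest point is at row 0, column 1.
record = peak_sample("t0", [[0.0, 3.0], [1.0, 0.0]], [[0.0, 4.0], [1.0, 0.0]])
```

Because only one point is kept per sample, any second eruption occurring at the same time is not recorded at this stage.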
Sub-sampling the images as we do risks missing short-lived events. However, eruptions with lifetimes shorter than this probably have little impact on their surroundings. Our goal here is to identify eruptions that may have significant impacts, so the computational cost savings outweigh the loss of information. Our intent is to return to periods of significant motion for more detailed analysis in the future.
The results of this first pass are then processed to identify time periods where the velocity exceeds the threshold of 3.6 km/sec. This value was chosen to exclude the background motions seen in Figure 2 while still generating a moderate detection rate. It also corresponds to the level of motion found by Wang et al.(2010) in quiescent prominences. These periods, along with the largest velocity and its position, are then recorded to the Heliophysics Events Knowledgebase (HEK, Hurlburt et al.(2012)) as preliminary reports of eruptions. Our intent is to analyze these more carefully in a second "characterization" routine and then replace or update these entries with more details.
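The thresholding step can be sketched as grouping consecutive above-threshold samples into periods and keeping the largest speed and its position for each. How EP actually merges neighboring samples is not stated in the text, so the grouping rule below (strictly consecutive samples) and all names are assumptions.

```python
def eruption_periods(samples, threshold=3.6):
    """Group consecutive samples with speed above the threshold into periods.

    samples: time-ordered list of (time, speed_km_s, position) tuples.
    Returns a list of dicts with the start time, end time, peak speed and
    the position at which the peak speed was found.
    """
    periods = []
    current = None
    for time, speed, position in samples:
        if speed > threshold:
            if current is None:
                # A new period opens at the first above-threshold sample.
                current = {"start": time, "end": time,
                           "peak_speed": speed, "peak_position": position}
            else:
                current["end"] = time
                if speed > current["peak_speed"]:
                    current["peak_speed"] = speed
                    current["peak_position"] = position
        elif current is not None:
            # Speed fell back below the threshold: close the period.
            periods.append(current)
            current = None
    if current is not None:
        periods.append(current)
    return periods


# Times in minutes, one sample every 20 minutes; positions in arc-seconds.
demo = eruption_periods([
    (0, 1.0, (0, 0)),
    (20, 4.0, (10, 10)),
    (40, 8.0, (12, 11)),
    (60, 2.0, (0, 0)),
    (80, 5.0, (-300, 200)),
])
```

In this example the samples at 20 and 40 minutes form one period with a peak of 8.0 km/sec, and the isolated sample at 80 minutes forms a second one.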

Comparison with manual selection
We assess the performance of our method by comparing it to eruptions recorded manually in the HEK. These entries are primarily provided by members of the SDO/AIA science team who monitor data as it arrives at the AIA Validation Center (see Hurlburt et al.(2012) for details). Volunteer annotators regularly sign up for three-day shifts. Thus all the datasets used by EruptionPatrol, along with other AIA channels, have also been reviewed by this team. Over the interval from 18 April 2014 to 17 July 2014, a total of 43 filament eruptions and 44 eruptions were recorded by the team. For this case we consider an eruption to be any of two classes accepted by the HEK: eruptions and filament eruptions. The first is a catch-all category that may or may not be associated with a filament; the latter is associated with a filament that the observer considered to have ejected material into the corona.
As a first test, we queried the HEK for both classes of eruptions, using iSolsearch (http://www.lmsal.com/isolsearch) to select the events, then exporting them into SolarSoft (Freeland et al.(2000)) and using the hek match events routine. For this study we considered events that overlapped within an hour in time. The results are displayed in Table 1. Of the 43 filament eruptions reported by humans, 37 (79%) matched times reported by EP. The success drops to 24 (44%) when we also require a separation of less than 120 arc-seconds. Human reports of eruptions displayed a similar behavior, which is partly due to some observers selecting both classes when they generate their reports. We will discuss this further below.
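The matching criteria above can be illustrated with a small stand-alone sketch. It is not the SolarSoft routine: the event dictionaries, the field names and the reading of "overlapped within an hour" as intervals that overlap once padded by one hour are all assumptions made for illustration.

```python
import math


def times_overlap(a, b, tolerance_s=3600.0):
    """True if two intervals overlap once each is padded by the tolerance."""
    return (a["start"] <= b["end"] + tolerance_s
            and b["start"] <= a["end"] + tolerance_s)


def match_counts(manual, automated, tolerance_s=3600.0, max_sep_arcsec=120.0):
    """Count manual events matched in time, and in both time and position.

    Each event is a dict with start/end times in seconds and a position
    (x, y) in arc-seconds from disk center.
    """
    in_time = 0
    in_time_and_space = 0
    for m in manual:
        candidates = [a for a in automated if times_overlap(m, a, tolerance_s)]
        if not candidates:
            continue
        in_time += 1
        if any(math.hypot(m["x"] - a["x"], m["y"] - a["y"]) < max_sep_arcsec
               for a in candidates):
            in_time_and_space += 1
    return in_time, in_time_and_space


manual = [
    {"start": 0, "end": 600, "x": 100, "y": 100},
    {"start": 10000, "end": 10600, "x": -500, "y": 0},
    {"start": 50000, "end": 50600, "x": 0, "y": 0},
]
automated = [
    {"start": 1200, "end": 2400, "x": 150, "y": 100},
    {"start": 10000, "end": 11200, "x": 300, "y": 0},
]
counts = match_counts(manual, automated)
```

Here two of the three manual events match an automated detection in time, but only one of them also lies within 120 arc-seconds, mirroring the drop in success seen when the spatial criterion is added.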
As a second test, we selected the 29 events with speeds exceeding 30 km/sec from EruptionPatrol over the same interval and compared them to entries reported by human annotators. Nine (31%) matched the human annotations in time, while the remaining 20 did not. Only 7 of those 9 also overlapped spatially. All of these missed events were reviewed visually using the daily movies posted at http://sdowww.lmsal.com and were found to be associated with significant eruptions.
These patterns persist over the entire AIA dataset, as can be seen in Table 2. Overall EP finds about 70% of all time periods manually reported as erupting in either category. This mod- Comparisons of EP detections (again limited to those exceeding 30 km/sec) to manual ones also show a drop to 24%. This might indicate that the converse is also true: fast, short-duration eruptions may be overlooked by human annotators. This is consistent with what we found in the smaller sample, but we have not manually reviewed the 289 events to confirm this. The success at matching both the time and location of filament eruptions drops to 27% in the larger sample. This may again be due to large-scale filament eruptions. In that case, the position selected by annotators will tend to be the geometric center of the filament, while the position reported by EP will be the fastest moving element, or perhaps that of a separate, fast but short eruption. The two eruptions seen in Figure 1 provide an example of this situation.
The match rate for generic eruptions continues to be slightly over 50%. In this larger time span, the two sets (filament eruptions and generic eruptions) are reasonably independent, with the latter catching a larger range of behavior (which is left to the discretion of the annotator). The fact that there is not a higher success rate here is to be expected, since EP reports only a single location for a given time sample. If multiple eruptions are occurring, all but one of them will be missed. This is another issue that will be addressed in the characterization module.
The overall accuracy of manual entries, finding less than a quarter of significant time periods identified by EP and with only 11% overlapping spatially, is noteworthy. Their accuracy appears to be uncorrelated with the eruption's duration or magnitude. There may be some relation to spatial extent, but in reviewing the smaller sample, the main source of discrepancy is most likely a lack of attention or interest. Each annotator has particular interests related to their individual research, and there is a clear correlation between their success at this task and their interests.

Statistical properties
The Eruption Patrol module described above was run over the entire SDO mission up to July 12, 2014, thus spanning just over four years. Here we give an overview of the statistical properties of this sample. The left panel in Figure 3 displays the histogram of peak speeds detected for each recorded eruption. The distribution has an inverse square dependence on the speed, with the largest event having a speed of 96 km/sec. These are consistent with previous studies, such as Gopalswamy et al.(2003) and Wang et al.(2010). The right panel in Figure 3 displays the distribution of velocities as a polar plot. There does not appear to be a significant directional bias in the sample. Figure 4 displays the spatial distribution of all events over this period. Eruptions are detected almost everywhere on the disk, as seen in the left panel, but are clearly clustered near the activity belts and the limb. This distribution appears to be independent of the magnitude of the events (as indicated by the color of the dots). There is a clear lack of eruptions reported near the poles, which may be due to relatively slow-moving polar crown filament eruptions being masked by more dynamic regions, as described in the last section.
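One simple way to check a histogram for an inverse square dependence is a straight-line fit in log-log space, sketched below. This is a crude estimator chosen for illustration, not necessarily the fitting procedure used for Figure 3, and the synthetic counts are generated from an exact inverse square law rather than taken from the EP sample.

```python
import math


def power_law_slope(speeds, counts):
    """Least-squares slope of log(count) against log(speed).

    Empty bins are skipped because their logarithm is undefined.
    A slope near -2 corresponds to an inverse square dependence.
    """
    pts = [(math.log(v), math.log(n)) for v, n in zip(speeds, counts) if n > 0]
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    return sxy / sxx


# Synthetic bin centers (km/sec) and counts following count = 1000 / speed^2.
bin_speeds = [4.0, 8.0, 16.0, 32.0, 64.0]
bin_counts = [1000.0 / v ** 2 for v in bin_speeds]
slope = power_law_slope(bin_speeds, bin_counts)
```

For real, noisy histograms a maximum-likelihood estimate is generally more robust than a log-log fit, particularly in the sparsely populated high-speed tail.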
The apparent clustering near the limb is examined in the right panel, where the histogram as a function of radius (r) for these events is displayed. The distribution rises from zero near disk center (r = 0) until the active region bands begin to contribute at r ≈ 0.4 R_sun, where R_sun is the solar radius. The distribution remains relatively constant from that point until near the limb (r ≈ R_sun), where the counts climb rapidly before falling back to zero. This distribution is consistent with that expected in the case of a shallow, optically-thin formation layer (between 3 and 15 Mm) containing a uniformly random distribution of eruptions. Gopalswamy et al.(2003) report that (relatively large) eruptive prominences have heights between 1.1 and 1.5 R_sun. This suggests that there may be some scale dependence in the distribution that is neglected in our simple model, which might also explain the deviations from our model for r > R_sun. Some level of scale dependence is expected in the structuring of the solar atmosphere by magnetic fields, as in the magnetic carpet model of Title and Schrijver(1998).
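The limb enhancement expected from a shallow layer can be illustrated with the simplest possible case: events spread uniformly over an infinitely thin spherical shell and projected onto the plane of the sky. This is a simplification of the layer model described above (no finite thickness, no activity belts), and the function names are hypothetical. For a shell of radius a, the fraction of events with projected radius below r is 1 - sqrt(1 - (r/a)^2).

```python
import math


def fraction_inside(r, shell_radius=1.0):
    """Fraction of events with projected radius below r.

    Assumes events are uniformly distributed over a thin spherical shell,
    with radii in units of the solar radius.
    """
    if r <= 0.0:
        return 0.0
    if r >= shell_radius:
        return 1.0
    return 1.0 - math.sqrt(1.0 - (r / shell_radius) ** 2)


def fraction_between(r1, r2, shell_radius=1.0):
    """Fraction of events with projected radius between r1 and r2."""
    return fraction_inside(r2, shell_radius) - fraction_inside(r1, shell_radius)


# Compare the innermost and outermost tenth of the projected disk.
inner = fraction_between(0.0, 0.1)
outer = fraction_between(0.9, 1.0)
```

Under these assumptions the outermost tenth of the disk radius holds far more events than the innermost tenth, which reproduces the rise from zero at disk center and the rapid climb near the limb, though not the plateau attributed to the active region bands.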
Another source of systematic error may result from projection effects. If all eruptions were predominantly radial, we would expect a radial dependence in the magnitude and direction of the reported velocities. Gopalswamy et al.(2003) found this to be the case overall, but with many eruptions possessing tangential motions on the order of 10 km/sec. In contrast, the motions we detect are randomly oriented and show no significant projection effects. We may resolve this discrepancy by noting that the former study was effectively tracking the centroid of a prominence, while our method measures local velocities, which include twisting, writhing and streaming motions that frequently accompany eruptions. Hence we can identify the times and locations of eruptions based on these associated motions regardless of where they appear on the disk. A more complete reckoning of these velocity contributions falls to the future characterization module.

Conclusion
We have developed an automated method for finding eruptions in the lower solar atmosphere and have deployed it within the SDO/AIA Event Detection System, which operates on the data as it arrives. The method has been found to measure velocities with statistical properties consistent with previous studies. The reported eruptions also appear to be consistent with those reported by human reviewers. The automated detections are less prone to lapses in attention and less skewed by personal interests, but may miss slow, long-duration eruptions. They also provide a more complete characterization of eruptions by reporting both the location and plane-of-sky velocity. The reported events are found to be distributed in a layer near the solar surface and possess a power-law distribution in peak speed. Details of these events, including summary movies, can be found using a variety of tools including Helioviewer (http://helioviewer.org), iSolsearch (http://www.lmsal.com/isolsearch) and SolarSoft. As part of the HEK, they are automatically cross-referenced with solar datasets obtained by the Hinode (Kosugi et al.(2007)) and Interface Region Imaging Spectrograph (IRIS, De Pontieu et al.(2014)) missions.
Subsequent papers will explore how these eruptions compare with those found by other automated processes recorded in the HEK and will describe a characterization module that confirms and extracts more detailed information on the eruptions reported by the Eruption Patrol.