Statistical characterization of ionosphere anomalies and their relationship to space weather events

The statistical characterization of the relationship between thermosphere-ionosphere anomalies and space weather events, also referred to as space weather anomalies, such as solar coronal mass ejections (CMEs) and geomagnetic storms, is a crucial component in the development of a forecast capability for thermosphere-ionosphere disturbances. This manuscript presents a systematic statistical approach for analyzing historical ionosphere and space weather observations to derive a quantitative characterization of the relationships between the thermosphere-ionosphere anomalies and space weather anomalies. Based on 2 years of historical data, our analysis reveals the complex nature of the relationship between space weather disturbances and ionospheric responses.


Introduction
The primary objective of our efforts in medium-range thermosphere-ionosphere storm forecasting (Mannucci et al. 2015) is to develop the capability to predict thermosphere-ionosphere anomalies starting with observations of the solar corona. Despite the availability of advanced first-principle based models linking solar activities to the dynamics of the thermosphere and ionosphere, the quantity of observations of the space environment is, at the moment, inadequate for calibrating the chain of models to a degree where they could be used to deliver a reliable forecast. Additionally, it is well known that anomalies in the thermosphere and ionosphere can also be triggered by troposphere waves (Pedatella et al. 2008;Immel et al. 2009;Lu et al. 2015). It is therefore necessary to characterize statistically the connection or correlation between major space environment disturbances such as solar corona mass ejections (CMEs) or magnetosphere storms and ionosphere anomalies.
There is an abundance of research efforts focused on identifying and explaining the effects of space environment disturbances on the ionosphere (Fagundes et al. 2008;Pokhotelov et al. 2010;Lee et al. 2011;Burns et al. 2012;Tsurutani et al. 2012). The identification of anomalous ionospheric features is often made within the context of a known space weather event, as well as by comparing ionospheric observations immediately following the event to preceding ''quiet'' period data. By ''space weather event'', we refer specifically to periods when geomagnetic indices identify disturbed periods, or when solar wind conditions indicate a disturbance variation. Interestingly, a similar approach has been used in identifying thermosphere and ionosphere anomalies preceding or following an earthquake (Liu et al. 2004;Pulinets et al. 2006;Zhou et al. 2009) even though the latter lacks a convincing physical explanation linking the two events. A difficulty with this approach is in determining and quantifying what exactly constitutes nominal conditions for the thermosphere and ionosphere. A corollary to this, is the question: how often do anomalies that are similar in nature to those attributed to either a space weather event or an earthquake, occur in the historical data record? The answers to this question are crucial for producing reliable forecasts with well-determined uncertainties.
In describing our findings, we will rely on terminology from Bayesian theory and statistical hypothesis testing. We will be interested in assessing the ability of linking the occurrence of an anomalous global ionospheric map (GIM) of vertical total electron content to the occurrence of an observed space weather disturbance, such as a CME. Thus we will be interested in calculating the sensitivity, the conditional probability of the occurrence of an anomalous VTEC map given that a space weather event occurred, the specificity, the conditional probability that an anomalous map did not occur given that a space weather event did not occur, and, most significantly for our purposes here, the posterior conditional probability that a space weather event occurred given that an anomalous global VTEC distribution was observed. These probabilities are closely connected with what are referred to in the statistical literature as the probability of a Type I error, that is, the conditional probability that an anomalous global VTEC distribution was observed given that a space environment disturbance did not occur and the probability of a Type II error, that is, the conditional probability that an anomalous global VTEC distribution was not observed given that a space weather disturbance did in fact occur. The probability of the complement of a Type II error, or the conditional probability that an anomalous global VTEC distribution was observed given that there was a space weather disturbance, is referred to as the power of the forecast.
With the availability of observations of the space environment and the thermosphere and ionosphere, a purely statistical characterization of the relationship between these two types of anomalies is possible. One key aspect of the analysis required for this characterization is the classification of ionosphere anomalies. Recently there has been a great deal of progress in the development of new efficient data mining techniques in the context of a large variety of applications (Steinwart et al. 2005;Chandola et al. 2009;Sodemann et al. 2012;Grnitz et al. 2013;Mata et al. 2014). These techniques can be broadly grouped into supervised and unsupervised classification approaches. The most common analyses of storm effects on the ionosphere fall into the general category of supervised classification because the ''quiet'' or disturbed conditions are preidentified by the absence or presence of space environment anomalies. There have also been several attempts at the automated monitoring of ionosphere anomalies (Forbes et al. 2000;Luo et al. 2005;Mandrikova et al. 2012) that could be viewed as examples of unsupervised detection. In the training of a supervised classification approach, a data mining algorithm searches for ways of separating data points labeled as ''quiet'' from data points labeled as ''disturbed''. A trivial supervised classification of ionospheric condition consists of labeling an ionospheric condition as disturbed only when it is identical to conditions on one of the days during which solar or geomagnetic anomalies were observed. Given the natural variability of the ionosphere where no 2 days have identical conditions, the trivial classification may even be 100% correct when applied to a training data set. That is, at least for the training data set, the method would not classify any quiet day as disturbed. However, since it is well known that a portion of the solar or geomagnetic anomalies do not lead to significant disturbances in the ionosphere, a supervised classification algorithm risks relying on minor differences that are well within the natural variabilities of the ionosphere for the resulting classification. Since our goal is to characterize the connection between significant disturbances in the ionosphere and solar or geomagnetic anomalies, a classification of ionosphere disturbed conditions independent of the knowledge of solar or geomagnetic conditions is essential. Therefore, only unsupervised classification approaches are used to identify disturbed ionosphere conditions. As a part of our effort in forecasting thermosphere and ionosphere anomalies and their connection to external space weather anomalies in the solar wind and magnetosphere, we developed techniques for a statistical characterization of the relationship between these two types of anomalies. In particular, we have attempted to derive quantitative estimates of the sensitivities and specificities of anomalous VTEC distributions and the posterior probabilities of the occurrence of an externally driven space weather event. Our analyses use two years, 2011 and 2012, of historical data of both ionospheric VTEC and observations of the space weather environment to develop our estimates. In this manuscript, we report on our efforts at establishing a quantitative characterization of the correlation between interplanetary space weather and thermosphere and ionosphere anomalies. An outline of our manuscript is as follows. In Section 2 we describe the data upon which our analysis will be based and in Section 3 we provide a detailed description of the key steps in our analysis. In Section 4 we discuss the implications of our findings.

Description of data sets
For the analyses presented in this paper, the primary data set we use for identifying anomalous conditions in the ionosphere is the global vertical total electron content (VTEC) obtained from global ionosphere maps, or GIMs, produced by the Jet Propulsion Laboratory (Mannucci et al. 1998). The GIM-VTEC map is derived from data received from a network of 200-1,000 globally distributed ground-based Global Positioning Satellite (GPS) receivers. Although it is known that VTEC is not the most sensitive parameter to solar or geomagnetic disturbances, significant disturbances in the ionosphere often lead to unusual VTEC maps as well. Our primary reason for this selection is the global availability and the consistent quality (Komjathy et al. 2005a) of GIMs. As for the training of any automated classification algorithm, it is essential that the training data set is sufficiently large to include all possible scenarios. Daily GIM-VTEC maps have been operationally produced since the mid-1990s. Our analyses are based on global VTEC maps with a 1 by 1°resolution in latitude and longitude sampled at 15-min time intervals. Examples of these maps are shown in Figure 1. It is useful to remind ourselves that a single data point in our daily analysis consists of a collection of 96 global VTEC maps. Figure 1 only displays 12 of these maps. Although there are significant advantages of using a consistently produced gridded product, there are also drawbacks. In particular, as a map of VTEC, GIM inherently lacks many important descriptive parameters essential for a full characterization of the ionosphere such as the F 2 -region peak density, NmF2, and height HmF2. Alternatively, output of 3-dimensional electron density produced by data assimilation models, such as GAIM (Schunk et al. 2004;Wang et al. 2004), provides a complete description of the electron density vertical profile. However, the most advanced GAIM analyses are not currently operationally produced and readily available. In addition, a common shortcoming of GIM analyses and, for that matter, GAIM as well, is that they are not actual direct measurements. As a result, GIMs include artifacts associated with the analysis methods. For example, in regions where data are sparse or not available, these analyses are either extrapolated from other regions or products of climatological models. This aspect of the GIMs does not present a major problem for their use in anomaly detection because extrapolated or seasonally averaged data are not likely to be ''out of family''. GIM and GAIM have already been used extensively in studies of disturbances, so their deviation from average behavior as represented in the data is well established (Pi et al. 2009;Immel & Mannucci 2013). In fact, truly anomalous GIM maps would have to contain their anomalies in data-rich regions. On the other hand, while the direct use of raw ionosphere measurements such as NmF2 and HmF2 derived from ionosondes or slant TEC measurements would be more desirable, they would also be more difficult to use due to the lack of absolute calibration and consistent availability. Therefore, we feel that the use of GIM as a proxy indicator of the state of the ionosphere presents an appropriate trade-off for demonstrating our general approach (Komjathy et al. 2005b;Datta-Barua et al. 2014).
There are many data sources that can be used to represent the state of the space environment (Menvielle & Berthelier 1991;Richardson & Cane 2010). However, since the main goal of our analysis is to establish a statistical characterization of the link between significant ionosphere anomalies and anomalies in the space environment, we decided to use classifications of the space environment widely accepted by the space weather community based on the solar wind, since solar wind conditions are in principle forecastable and are the basis of medium-range forecast research (Mannucci et al. 2015). In particular, we use the list of CMEs and ICMEs since 1996 compiled by Richardson and Cane, which is available on the Internet at (http://www.srl.caltech.edu/ACE/ASC/DATA/ level3/icmetable2.htm) to identify space weather anomalies associated with the solar wind. On the other hand, instead of directly using the Kp or Ap index for magnetosphere conditions, we rely on the quiet days, or Q-days and disturbance days, or D-days classification produced by the GFZ German Research Center for Geosciences as a global description of geomagnetic conditions. A description of the method used for selecting Q-days and D-days is provided by  It should be noted that the determination of the occurrence of ICMEs (as defined in Richardson & Cane 2010) is not entirely unambiguous. Similarly, there is also significant nuance involved in the identification of Q-days and D-days (see Hibberd 1981).
In our analyses presented in this manuscript, we consider one calendar year of GIM-VTEC maps as a training data set. As a result, the classification of a normal or quiet day and an anomalous day is made in relationship to the rest of the days in the year. In particular, we have focused on 2011 and 2012, as two separate training sets, for our analyses. As a result, we also only use CME or ICME days and geomagnetic Q-days and D-days within these 2 years to investigate the correlations between the ionospheric and space environment anomalies. An interval of 1 year is considerably longer than the time intervals typically used in an episodical analysis of an isolated storm event. In those studies, on the other hand, the quiet reference time interval usually spans less than 20 days. Restricting our historical data set to one calendar year inevitably limits the validity of our observations to only one specific phase within one specific solar cycle. However, as a proof-ofconcept demonstration of our analysis methodology, we feel that our analysis time interval selection represents an appropriate trade-off between computational complexity and the statistical significance of our analyses.

Technical approach
Since an ionosphere anomaly is identified by comparing it to typical, or nominal, conditions, it is crucial to develop an unambiguous characterization of what can be considered to be the normal state of the ionosphere. For many previous space weather studies (Forbes et al. 2000), this requires only the characterization of the ionosphere during a time period absent of any geomagnetic storms or solar anomalies. This type of anomaly detection method is often referred to as supervised detection because two training data sets labeled ''normal'' and ''abnormal'' are considered available. However, the approach relies implicitly on two basic assumptions. First, we must assume that our observations of the geomagnetic storm and solar anomalies are complete, and second, we must believe that all ionosphere anomalies are consequences of either geomagnetic or solar anomalies. In fact, both of these assumptions are in dispute. Indeed, geomagnetic indices are indirect indicators of ionospheric conditions, and only very limited solar and interplanetary observations are available. At the same time, it is well known that troposphere waves can, in some cases, be the cause of ionosphere anomalies (Pedatella et al. 2008;Immel et al. 2009;Lu et al. 2015). Alternatively, an unsupervised detection approach relies primarily on the relative frequency of occurrence of normal conditions to distinguish them from abnormal conditions. The assumption that quiet conditions are a more common occurrence is implicit in many studies (see for examples, Tsurutani et al. 2004;Pokhotelov et al. 2010;Burns et al. 2012;Tsurutani et al. 2012). In developing an unsupervised classification scheme for ionosphere anomalies, we can attempt to establish possible correlations between ionosphere anomalies and externally driven space weather anomalies by examining the coincidence of occurrence of the two types of anomalies. In this section, we present the key steps in an unsupervised ionosphere anomaly classification approach and a statistical analysis for their correlation with space weather anomalies.

Cluster analysis and cluster radius
In unsupervised classification, data points are considered as belonging to the same class if the ''distances'' between them are small. Distance is determined by a metric that varies according to the degree of similarity between two VTEC distributions, and is zero for two identical distributions. For the purposes of classification of ionospheric conditions using VTEC maps, we consider a data point to be a collection of 96 global VTEC maps spanning a period of 24 h. As a result, for a training data set of one calendar year, there are 365 (or 366 in the case of 2012) such data points. Each data point can be considered to occupy a specific ''location'' in a multidimensional space of dimension 180 · 360 · 96 (for 1 · 1°maps of 15-min cadence). The coordinates within this space are determined by the TEC values at each location and time.
There are many different ways of quantifying the difference between two sets of VTEC maps. We will discuss some of them in the next subsection. A selected metric, or distance function, gives us the distance d i,j between the ith day, or data point, and the jth data point in the data set. Therefore the 365 · 365 matrix D defined by [D] i,j = d i,j is referred to as the distance matrix for a data set containing 365 data points. A graphic representation of a distance matrix is given in Figure 2. Since the metric used in Figure 2 is the average of the absolute VTEC differences, the distance is measured in TEC unit (TECU). A more precise definition of this metric is given in Section 3.2. Thus, a distance of 1 TECU indicates that the average difference of the two VTEC maps is 1 TECU for any longitude, latitude, and UT. One of the most noticeable features of the distance matrix is the relatively low values along and close to its principal diagonal (from the upper right corner to the lower left corner). The diagonal values are, of course, zero by definition, but the small values clustered around the diagonal are a consequence of the construction of the GIMs. Even though new data are used to refresh the VTEC maps, only a subset of ground receivers provide new data during any given 15-min sampling interval. Therefore the new refreshed map is the result of extrapolating data from the previous maps by taking into consideration the change of sun angle and geomagnetic field orientations. These approximations introduce what is in effect, a numerical diffusion phenomenon. As a result, the effect of any slant TEC measurement may persist over a day or two causing the VTEC maps to be statistically more similar to each other than are VTEC maps separated by several more days. It is also easily noticeable that substantially larger distances exist between maps for November and June than for January and June.
The goal of a cluster analysis based on a distance matrix is to identify data points that are surrounded by either a dense or sparse cluster of other data points. There are two main approaches to identify points in dense clusters. The first consists of counting the number of data points within a given radius r of a given data point. In the case of Figure 2, we can count the number of elements in a given column that have values less than r-TECU. The larger is the number of neighboring data points the more dense is the cluster of data around a given data point. We refer to the number of data points within r-TECU to a data point as the popularity of the data point. An example of popularity and location of closest neighbors is shown in Figure 3. Although not surprisingly most of the days within the 4.0-TECU radius are within a time interval of a few days to the given day, we note that not all days immediately before or after the given day are in the 4.0-TECU neighborhood. Conversely, there are also days several months before or after a given day that fall within a 4.0-TECU neighborhood. Alternatively, we could also calculate the largest distance, r n i , of the n closest nearby data points to a given data point i. The smaller the value of r n i the denser the population of data points around data point i. We refer to r n i as the radius of the n-cluster around point i. The evaluation of r n i is straightforward using the distance matrix D. In fact, we first sort the values in each column of the matrix D in increasing order. Then the n + 1st value of the sorted distance matrix of column i is equal to r n i . Figure 4 illustrates our approach for computing r 10 i . A more direct depiction of the cluster radius is shown in Figure 5. We observe from Figures 4 and 5 that the cluster radii are substantially larger for time periods from March to May and from September to December than the time period from June to August. On the other hand, in each of these periods of time, there are days that stand out with a much larger cluster radius relative to other days. This may be due to enhanced geomagnetic activity during these months, that create anomalous Fig. 2. Distance matrix for 2011 data set using the averaged absolute VTEC differences as a metric shown in a 2-dimensional color rendering in which red and blue represent, respectively, large and small distance between two data points. The most visible variations in the distance matrix, i.e., low values between data from June to September and high values between data from October to December, are most likely due to variations in geomagnetic activity levels.
ionospheres. Assuming that the metric we use is a valid way of comparing ionospheric conditions between any 2 days, the cluster radius shown in Figure 5 provides an indication of how unusual a given day's data is in relationship to other data points in the set. In Section 3.4, we shall present a method for identifying quiet and anomaly days using the cluster radii of the data points. However, in the next subsection, we first discuss the different metrics or distance measures we have considered in our analyses.

Metrics for comparing two VTEC maps
In the previous subsection the metric used in the examples presented was the average of the absolute value of pixelwise differences between 2 days of VTEC data. However, there are many ways to compare two sets of VTEC maps. In particular, we can define the following metrics.
where v i k represents the k-value of i-day of VTEC maps and N is the total number of VTEC values in these maps. The average of the absolute differences in VTEC, d abs i;j in (1), is one of the most direct ways of measuring the differences between VTEC maps. However, the typical values of VTEC can vary significantly and systematically. For example, at local noon time near the equator, the VTEC value is much higher than the values in predawn high-latitude regions. As a result, d abs i;j is always dominated by the differences in VTEC in the equatorial region. Conversely, the average of the relative difference, d rel i;j in Eq. (2), normalizes the VTEC difference by the absolute value of VTEC. However, when VTEC is very low, as they are in predawn high-latitude regions, the difference in VTEC maps may be primarily due to VTEC measurement uncertainties. Indeed, when normalized by small VTEC values in these regions, division by a relatively small value will tend to magnify measurement error. As a result, d rel i;j could be artificially dominated by differences in regions with very low VTEC values. An intermediate approach is given in (3) where the normalization factor is at least as large as a pre-set threshold value v threshold . In fact, d relthr i;j is simply a multiple of d abs i;j when v threshold is very large and d relthr i;j is equal to d rel i;j when v threshold is very small.
In addition to comparing the differences in the VTEC values, several researchers of ionosphere anomalies have indicated (see Luo et al. 2005) that differences in spatial and temporal gradient of VTEC could also be significant indicators of ionosphere anomalies. As a result, we can also define a metric that compares the gradients of VTEC maps by where D lat , D lon , and D UT are weighted difference operators in latitude, longitude, and UT, respectively. The selection of weights for these difference operators can definitely alter the results of an analysis. In our investigation, we select the weights to make the contributions of each of the three differences have comparable magnitude. In fact, each of the differences D lat v i k , D lon v i k , and D UT v i k is affected by different driving forces of the ionosphere. Figures 6-8 show examples   of these differences. For example, geo-electrical field strength strongly influences the ions' E · B velocity in the mid-and low-latitude regions. The resulting equatorial anomaly leads to the characteristic latitude-dependent features in the maps of D lat v i k shown in Figure 6. In many investigations of the effects of solar and geomagnetic storms (see Tsurutani et al. 2004;Lee et al. 2011;Tsurutani et al. 2012) different latitude regions on the globe show different sensitivity. As a result, we can evaluate the metrics d abs i;j ; d rel i;j ; d relthr i;j , and d H 1 i;j by summing up the differences only over regions of interest. Specifically, we can examine the difference only over high-latitude regions; mid-and lowlatitude regions; for selected local times, and for selected UTs. As a result, the combination of general metrics defined in Eqs. (1)-(4) and subsets of VTEC maps provide a large selection of metrics to compare VTEC maps for any 2 days. These metrics already include most commonly used approaches for identifying effects of solar and geomagnetic disturbances. In our subsequent statistical analysis, the sensitivity of these metrics to space weather disturbances is examined.

Correlation between geomagnetic Q-days and VTEC quiet conditions
The n-cluster radius for a given day derived in Section 3.1 is viewed as an indicator of how unusual are the VTEC maps for this day. It is therefore interesting to examine whether the  observed unusual nature of a particular map or set of maps is in anyway related to the geomagnetic activity level for this day. In particular, if for each month we select the 10 days with the smallest cluster radius, referred to as as VTEC Q-days, and 5 days with the largest cluster radius, referred to as as VTEC D-days, we would like to determine whether or not there is a substantial overlap with the Q-days and the D-days compiled by GFZ-Potsdam. A graphic representation of our analysis is shown in Figure 9. The two panels in Figure 9 represent the geomagnetic Q-days and D-days in comparison to VTEC Q-days and D-days. Each row in these tables represents 10 Q-days and five D-days of a month. The colors of the small rectangles represent the position of a Q-day or D-day within a month. The dark blue color represents the first of a month and the bright yellow color corresponds to the last day of the month. The correlation of two image panels is an indication of the correlation between the two sets of Q-days and D-days. At least for the 10-cluster radius derived from average absolute VTEC differences, the correlation of the two panels does not appear to be very strong.
The level of correlation of the n-cluster radius determined Q-days and D-days using different comparison metrics can serve as a useful indicator of the nature of the sensitivity of ionosphere to geomagnetic disturbances. We shall present a comparison of the correlation coefficients for several distance metrics in the next section.

Identification of outliers
As explained in Section 3.1, our ionosphere anomaly classification approach relies on the cluster radius determined using the metrics defined in Section 3.2. In general, data points with exceptionally large cluster radii are considered as outliers. A common approach in outlier identification (see for example, Fig. 11. The 10-cluster radius for the 2011 data set and identified outliers are shown in solid blue line and red circles. The vertical purple lines indicate days of ICME as compiled by Richardson & Cane (2010). Fig. 12. Panel on the (a) shows a comparison between P VTEC (k) and P VTEC (k|ICME). Panel on the (b) shows a comparison between P ICME (k) and P ICME (k|VTEC) for the 2011 data set. Sodemann et al. 2012) involves the comparison of a sampled empirical distribution with a normal approximation and labels the unexpectedly larger number of data points at the tails of the normal approximation as outliers. More precisely, for any level of significance a, a data point that is larger than F À1 (1 À a; l, r), where F represents the cumulative normal probability distribution with expected value l and standard deviation r, are classified as an outlier. A close examination of the empirical distribution of the cluster radius in Figure 10 reveals significant deviation from a normal distribution. This is not entirely surprising since we can already observe in Figure 5 that during the months of September through November, the cluster radius of all points is substantially higher than during the period of June-August. As shown in Figure 10 three normal approximations of the empirical distribution of the 10-cluster radius are evaluated. The blue curve represents a normal approximation with expected value and standard deviation equal to the sample mean and standard deviation, respectively. This approximation leads to the classification of the large number of data points with small cluster radii as outliers. For the red curve, we use the median of the sample as the expected value for the normal approximation. If we let r q (p) represent the value of the cluster radius at quantile p, then the standard deviation for the normal approximation is given by r r ¼ minðr q ð0:5 þ pÞ À r median ; r median À r q ð0:5 À pÞÞ F À1 ð0:5 þ p; 0; 1Þ : Since quantiles of a data set are in general less sensitive to outliers, r r is viewed as a more robust estimation of the standard deviation than the sample standard deviation. However, in the case of the cluster radii shown in Figure 10, the trimodal form of the empirical distribution makes r r even larger than the sample standard deviation. An alternative approach is to use the center of the bin in the histogram of the data set with the highest number of data points as an approximation to the expected value and then approximate the standard deviation by where Dr bin and n bin are, respectively, the width and the number of points in the bin with the highest number of data points, and N is the size of the data set. This approximation is given by the black curve. It is evident that this approximation may lead to the classification of a large number of data points as outliers.
In order to avoid under-and overclassification of outliers, we use a hybrid global and local normal approximation of data. This approach consists of two levels of classifications. The first level relies on the normal approximation of the entire data set with mean l equal to the sample mean, standard deviation r equal to the sample standard deviation, and the labeling of all data points with cluster radius larger than F À1 ð1 À a g ; " r; rÞ as outliers. Then, for data in a moving time window of width N s days, a normal approximation for the empirical distribution of the subset of data is derived with l and r equal to the subsample mean and standard deviation, respectively. Then, all data points with cluster radius larger than F À1 ð1 À a l ; " r; rÞ are labeled as outliers. This permits outlier identification within a time neighborhood of several days. An example of this classification is shown in Figure 11.
Since the n-cluster radii depend on the metric used to measure the difference between the data points, a variety of classifications of outliers are possible.

Empirical conditional probabilities for VTEC anomalies and ICME conditions
The VTEC outliers identified using approaches presented in the previous section enable us to establish empirical probabilities of coincident occurrence of ionosphere VTEC anomalies and the ICMEs. In fact, let n ICME (k) be the number of days identified by Richardson & Cane (2010) that an ICME is observed within the previous k days. Although several parameters are provided in the list compiled by Richardson & Cane (2010), in our analysis we choose not to stratify our analysis results by these parameters at this moment. Let n VTEC (k) be the number of days for which an anomaly day in the previous k days is identified by our method, and let n ICME+VTEC (k) be the number of days that both (1) an ICME is observed and (2) a day is identified as anomalous in the previous k days. The probability P ICME (k) that an ICME is observed within the previous k days is given by n ICME /N where N is the number of days in a data set. For 2011, N = 365. Similarly if we are interested in forecasting a VTEC anomaly using the observation of an ICME, we first note that the sensitivity P VTEC (k|ICME) of VTEC to ICME is the conditional probability of detecting a VTEC anomaly given that an ICME is observed in the previous k days. This quantity can be estimated as Conversely, the posterior probability of the VTEC anomalous response P ICME (k|VTEC) to ICME is given by the conditional probability of observing an ICME given that an anomalous VTEC map is identified in the last k days. That is, The reason we choose to examine the coincident occurrence between the two types of anomalies over a time interval of several days, is so that we can attempt to account for the delayed response of ionosphere anomalies caused by an ICME. In this analysis, for simplicity, we do not require that an ICME must occur prior to the VTEC anomaly. A comparison between P VTEC (k) and P VTEC (k|ICME) provides an indication of how predictive an ICME observation is for an occurrence of a VTEC anomaly. Figure 12 shows a comparison between the marginal probabilities P VTEC (k), P ICME (k) and the conditional probabilities P VTEC (k|ICME), P ICME (k|VTEC), respectively.
At a very basic level, the comparison in Figure 12 illustrates two obvious facts. First, there is not always an  occurrence of an ionosphere anomaly following an observation of an ICME. Although our definition of an ionosphere anomaly is dependent on many subjectively selected parameters such as the choice of a metric, the cluster size, and outlier detection threshold, the selection of these parameters is motivated by previous investigations (see for example Tsurutani et al. 2004;Burns et al. 2012;Tsurutani et al. 2012) in which similar metrics are used to explain the effects of an ICME. Conversely, it can also be observed that not all occurrences of anomalous VTEC maps are preceded by an ICME. A more comprehensive discussion of our findings will be presented in the next section.

Analysis of ionosphere anomalies and their connection with space weather events
As we have discussed in the preceding section, there are a multitude of ways that ionosphere VTEC maps can be compared and anomalies can be identified. In this section, we present our analyses for data sets from 2011 and 2012. Since in this study we have restricted ourselves to anomaly detection based on a single metric, one key issue we want to address is which metric provides the greatest sensitivity to either ICME or geomagnetic storms.
In addition to the general metrics defined in Eqs.
(1)-(4), we also restrict the summations in these definitions to three different subsets, or masks, to create additional metrics. The primary purpose of restricting our comparison between two sets of VTEC maps to masked region is to potentially increase the sensitivity of the differences to space weather events. Indeed, many research efforts on thermosphere-ionosphere disturbances due to space weather events have focused on specific regions as well (Tsurutani et al. 2004;Lee et al. 2011;Tsurutani et al. 2012). These three masks are labeled as high-latitude, mid-latitude noon, and antarctic. Figure 13 provides a depiction of these regions. The two masked regions, high-latitude and antarctic, are defined for fixed universal times. The mid-low-latitude region is defined for local time between 10:00 and 22:00 local time. As a result, a total of 16 different metrics are used for the classification of ionosphere anomalies. These metrics constitute a representative sample of the most often used criteria for identifying unusual features in a VTEC map. For future reference we assign a number to each of these metrics in Table 1.
For each of these 16 metrics, we evaluate the cross-correlation coefficients of the tables of Q/D-days determined by Kp index and that determined by the 10-cluster radius as presented in Section 3.3. Additionally, we tabulated the number of Q-days and D-days for each month that are identified by both classification methods as Q-days and D-days. The results are shown in Figures 14 and 15   15 indicate that for most of the metrics we considered, although the correlations between the Q/D-tables are positive, they are, in general, quite weak. This observation is also confirmed by the examination of the percentage of quiet days that are classified as such by both methods. In general, the percentages of common classification for D-days are somewhat higher than those for Q-days. Also, metrics 1 and 2 seem to have higher sensitivity than do metrics 12-16.
For the classification of VTEC anomalous days, we use the identical parameters for outlier classifications with the significance level of classification selected at 2.5% and a local outlier detection interval of 36 days. In Figures 16 and 17 each day of the year is represented by a row of pixels in the image and the first 16 columns represent the classifications by the 16 metrics. When a day is classified by the kth metric as disturbed, the pixel on the kth column of the corresponding row is given a nongray color. In fact, the color of the pixel corresponds to the total number of metrics by which the specific day is classified as disturbed. For example, if a white pixel is seen in a row which indicates that only one of 16 metrics classifies the day as disturbed, this must be the only non-gray pixel on this row. Conversely, if a black pixel is seen in a row which indicates all 16 metrics classify the day as disturbed, all 16 first pixels on the row must also be black. In column 17, the ICME days are shown. These figures show that although there are significant overlaps in the days identified by different metrics as anomalous, there are also significant differences among the classifications as well.
In Figures 18 and 19 we show the marginal probability of a VTEC anomaly over an interval of width of n days, as well as, the conditional probabilities given the occurrence of an ICME during the same period. The results in Figures 18 and 19 confirm our observation in Section 3.5 that the majority of the comparison metrics show a certain degree of sensitivity to the occurrence of an ICME. In fact, the probability of detecting a VTEC anomaly based on several comparison metrics increases significantly from the marginal probability when an ICME is observed. While some of these increases are observed for both the 2011 and 2012 data sets, the behavior can be different for other metrics. For example, for metric 5 (or 3), a substantial increase in probability is observed for the 2011 data set, while the 2012 data appears to be quite insensitive to the occurrence of an ICME. Fig. 18. Marginal probability of observing a VTEC anomaly during a time interval of n days (a), conditional probability of a VTEC anomaly given an ICME is observed during the same period, i.e., sensitivity (b), and conditional probability of observing an ICME given that a VTEC anomaly is observed during the time interval, i.e. posterior probability (c) for the 2011 data set. Note that the last row represents the probability of observing an ICME.   19. Marginal probability of observing a VTEC anomaly during a time interval of n days (a), conditional probability of a VTEC anomaly given an ICME is observed during the same period, i.e., sensitivity (b), and conditional probability of observing an ICME given that a VTEC anomaly is observed during the time interval, i.e. posterior probability (c) for 2012 data set. Note that the last row represents the probability of observing an ICME. Fig. 20. Marginal probability of observing a VTEC anomaly during a time interval of n days (a), conditional probability of a VTEC anomaly given that an ICME is observed during the same period, i.e., sensitivity (b), and conditional probability of observing an ICME given that a VTEC anomaly is observed during the time interval, i.e. posterior probability (c) for the 2011 data set. Note that the row numbers represent the number of comparison metrics required for classification of a VTEC anomaly.
In addition, to compute the marginal probability of observing a VTEC anomaly using a single comparison metric, we also computed the probability of observing a VTEC anomaly by at least k different comparison metrics for k = 1,. . ., 16. Naturally, as k increases, the marginal probability decreases because unless k comparison metrics all classified a data point as anomalous, the point will not be classified as anomalous. Figures 20 and 21 show the resulting marginal probabilities, as well as, the sensitivity and posterior probability of the anomalies to space weather events. The results in Figures 20 and 21 suggest that while the probability of identifying a VTEC anomaly by 4-10 metrics shows substantial sensitivity to the occurrence of an ICME, the classification by less than three metrics is overprescribed. These results also suggest that a more refined approach for using multiple comparison metrics has the potential to enhance both the sensitivity and posterior probability of VTEC anomaly detection. It is important to put the results presented in this section into perspective. First, the list of ICME events used in this analysis may include events that are not geo-efficient. Secondly, the differences in 2011 and 2012 results can simply be attributed to the difference in the phases of the solar cycle for these years.

Conclusion
The analysis results presented in the previous section constitute a demonstration of a technical approach for the statistical characterization of connections between space and ionosphere anomalies. Although these results already established quantitative estimates of sensitivity and specificity, these estimations should be considered as preliminary due to the limited time span of data used in this study. However, the potential value of a quantitative characterization of the connections between space weather events and thermosphere-ionosphere anomalies is quite evident. For example, as indicated by the results in Figure 18, without knowledge of an ICME, the probability of observing a VTEC anomaly is below 10%. With the detection of an ICME, the probability of observing a VTEC anomaly can be, in some cases, increased to 40-50%. In fact, a much richer pool of solar and magnetosphere observations can be mined to provide a more detailed characterization of the relationship between space weather events and ionosphere anomalies. In our view, the ultimate goal of forecasting ionospheric weather that results from solar and magnetospheric disturbances is advanced by a quantitative understanding of the causal relationship between these types of anomalies.