Improved characterization and modeling of equatorial plasma depletions

This manuscript presents a method to identify the occurrence of Equatorial Plasma Bubbles (EPBs) with data gathered from receivers of Global Navigation Satellite System (GNSS). This method adapts a previously existing technique to detect Medium Scale Travelling Ionospheric Disturbances (MSTIDs), which focus on the 2nd time derivatives of total electron content estimated from GNSS signals (2DTEC). Results from this tool made possible to develop a comprehensive analysis of the characteristics of EPBs. Analyses of the probability of occurrence, effective time duration, depth of the depletion and total disturbance of the EPBs show their dependence on local time and season of the year at global scale within the latitude belt from 35 N to 35 S for the descending phase of solar cycle 23 and ascending phase of solar cycle 24, 2002–2014. These results made possible to build an EPBs model, bounded with the Solar Flux index, that simulates the probability of the number of EPBs and their characteristics expected for a representative day at given season and local time (LT). The model results provided insight into different important aspects: the maximum occurrence of bubbles take place near the equatorial anomaly crests, asymmetry between hemispheres and preferred longitudes with enhanced EPBs activity. Model output comparisons with independent observations confirmed its soundness.


Introduction
Equatorial Plasma Bubbles (EPBs) represent plasma depletions with respect to the background ionosphere and are the major source of electron density irregularities in the equatorial F-region (e.g.Kelley, 1989).The ionospheric irregularities within the bubbles can affect the radio-based technologies by scattering and diffracting radio wave signals that pass through them resulting in random fluctuations in signal amplitude and phase known as scintillations (Hargreaves, 1992).These turbulent conditions can be observed in ionograms as a spread in the echo range and frequency (Equatorial Spread F, ESF) (McNamara et al., 2013) and can disrupt communications and navigations systems (i.e., Kumar et al., 2016aKumar et al., , 2017)).Although other mechanism can produce equatorial scintillations, the scintillations from EPB/ESF are likely to have the strongest impact on operational systems (Whalen, 2000).This is why a large number of studies has been carried out during the last decades to gain a better understanding and improve the modeling and prediction of the EPBs phenomena (i.e., Wernik et al., 1980;Vijayakumar and Pasricha, 1997;Paznukhov et al., 2012;Kumar, 2017).
These plasma depletions are generated through the generalized Rayleigh-Taylor instability (e.g.Ott and Russell, 1978) in the bottom side F region of the ionosphere, in the post sunset period, and they rise to high altitudes (Tsunoda et al. 1982;Sultan, 1996;Eccles, 1998).The occurrence of EPBs depends on many factors such as local time, longitude, season, solar cycle and geomagnetic activity (e.g.Huang et al., 2002;Burke et al., 2004;Magdaleno et al., 2012Magdaleno et al., , 2017;;Cherniak and Zakharenkova, 2016;Kumar et al., 2016b).Although EPBs are mainly equatorial phenomena, they can extend to higher latitudes along the magnetic field lines (e.g.Kil and Heelis 1998;Kil et al., 2009) and propagate eastward with a clearly defined a d m i structure, so can be considered as ''well-formed irregularities'' (Haaser et al., 2012).
The global climatology of EPBs has been studied for long time.The spatial and temporal distributions of the phenomenon have gradually been established thanks to the contributions of, among others, Maruyama and Matuura (1984), Kil and Heelis (1998), Huang et al. (2002), Burke et al. (2004), Gentile et al. (2006), Cervera and Thomas (2006), Kil et al. (2009), Magdaleno et al. (2012), Magdaleno et al. (2017).Data gathered from receivers of Global Navigation Satellite System (GNSS) signals have played a significant role in the knowledge of the climatology of EPBs since the late 1990 s.Such a climatology refers to the parameters used to characterize EPBs from total electron content (TEC) variation estimated by GNSS data.These parameters usually refer to the occurrence, intensity of the disturbance or depth of the depletion and size of EBPs.Occurrence rate (hereafter OR) of EPBs can be interpreted as an estimate of its probability of existence.The size of the EPBs relates to the effective time of duration (ETD) of the perturbation (e.g.Magdaleno et al., 2017) which is the time interval during which the ray path from the satellite to the receiver station crosses the EPB.ETD of an EPB is calculated as the difference between the times for the first point after starting the depletion and for the last point before ending the depletion and it may last from several minutes to few hours.The depth of an EPB usually refers to the largest anomaly of the depletion with respect to the background ionospheric content.The climatology of EPBs may be summarized as follows.EPBs are observed in the local nighttime, after the Sun sets.The largest OR of EPBs manifest at magnetic equator and especially in the South America to Africa sector, and the OR of EPBs decreases as the distance from the equator increases.OR of EPBs increases with the solar activity and peaks close to the equinox.However, the maximum values of OR of EPBs for low solar activity levels are observed in both sides of the magnetic equator, being better defined in the South America to Africa sector.This agrees with the characteristics of occurrence of the largest electron density gradient that provides the largest vertical plasma drift and the higher probability of EPB formation (Kelley, 1989;Fejer et al., 1999).EPBs can cause depletions of more than two orders of magnitude lower than the background ionospheric density and can occupy several hundred km in the east-west direction, thousands of km in the northsouth direction, and occur at altitudes of several hundred km.The later can be estimated by the ETD of EPBs as observed from the signals of individual satellites at a given GNSS receiver.The depth and ETD of EPBs highly correlates with OR of EPBs and both increase with increasing solar activity.In fact, temporal and spatial behavior of the depth and ETD of EPBs follow a very similar pattern.The larger values of depth and ETD occur in the South American sector and in the magnetic equator, with values decreasing with increasing magnetic latitude.As observed in the OR, both depth and ETD tend to peak near equinoxes.The observed ETD depends on the satellite path and trajectory and it is expected that ETD of EPBs should be larger close to the geographical equator, where the satellites are visible longer time, than at latitudes far from the equator.
The knowledge of the EPBs phenomenon has progressed thanks to these studies and many others.There are numerical models that could reproduce the structure of EPBs knowing physical constraints (e.g.Yokoyama et al., 2015).However, there is still needed an improved knowledge regarding the ionospheric spatial and temporal (small-scale) variability, including bubbles and depletions.Particularly, it is of interest to reproduce such a small-to medium-scale ionospheric disturbances, and to improve prediction capabilities for ionospheric effects, because they may challenge, at system level, the accurate estimation of ionospheric delay corrections for systems like SBAS (Satellite-based augmentation systems) when they operate in single-frequency mode (Datta-Barua, 2008).They are responsible, for example, for availability degradation during strong perturbations (Sanz et al., 2014), or also at user level for systems using code-carrier smoothing (Rovira-Garcia et al., 2016).In this sense, the European Space Agency (ESA) has requested, in the framework of the project named SCIONAV (ESA-ITT 1-8214/15/NL/LvH), a model for slow refractive scintillation, including the refractive ionospheric effects caused by ionospheric depletions and bubbles (EPBs) in equatorial regions (low frequency TEC fluctuations), was requested.
This paper aims at contributing to an improved characterization and modeling of equatorial plasma depletions.Model parameters are derived from a comprehensive analysis after the design of a method to identify unambiguously large depletions, bubbles and other significant disturbances.Such a method is based on the ideas and experience provided by other teams (e.g.Hernández-Pajares et al., 2006;Magdaleno et al., 2012;Pradipta et al., 2015).The empirical and analytical model will focus on low frequency fluctuations in the total electron content caused by the ionospheric EPBs in equatorial regions.The model will account for probability of occurrence of EPBs, their depth and size (in terms of the ETD and total disturbance of the depletion from its background).These characteristics depend on local time, season, solar activity and location in a global way.The results of this EPBs occurrence and characteristics probability model will be used by a general perturbation model derived from the SCIONAV project that reproduces the effects of EPBs on the slant Total Electron Content (sTEC) estimated by GNSS data (Camps et al., 2017).
The main objectives of this work are: 1) to develop an automatic EPB detection tool using ground-based GNSS receiver data and 2) to model, globally, the depletion occurrence rate and its main characteristics (duration, depth and size).

Data set
To perform this investigation, we have used the data gathered for a set of 71 ground-based GNSS receivers located at low latitudes and well distributed in longitude.These stations are listed in the Table 1 and their geographical distribution is depicted in Figure 1.These receivers belong to the International GNSS Service (IGS) network and are available on the Crustal Dynamics Data Information System Data Center, ftp://cddis.gsfc.nasa.gov/pub/gps/data/daily/.
Data covers years with different solar activity levels, such as 2002-2003, 2009, and 2014, which observed a yearly averaged solar flux of 172.28, 70.43, and 145.78 sfu respectively.GNSS observables are sampled at 30 s and we have processed 16 different weeks equally distributed within the year (Table 2) for each receiver under analysis.E. Blanch et al.: J. Space Weather Space Clim. 2018, 8, A38 3 Automatic EPB detection tool One of the objectives of this research is to design an automatic bubble detection tool to detect EPBs to later perform a comprehensive analysis and develop a global model of their main characteristics: probability of occurrence, duration (ETD), depth (maximum |DTEC|) and total disturbance of the depletion in the TEC variations (TDB).

Data processing and method
To develop the automatic EPB detection tool, we have used the ground-based IGS receiver data set corresponding to 2014 (see Table 1 and Figure 1 to identify such receivers and their location).
We have used the technique to detect Medium Scale Travelling Ionospheric Disturbances (MSTIDs) developed by Hernández-Pajares et al. (2006) as a basis to detect EPBs.That technique identifies MSTIDs by analyzing distinct changes in the temporal variations of the second slant Total Electron Content (sTEC) derivative or sTEC curvature (Eq.(2) from Hernández-Pajares et al., 2006).It uses the geometry-free combination of the two GPS carrier phases (LI = L1 -L2; Eq. (1) in Hernández-Pajares et al., 2006) to obtain the sTEC and apply an elevation mask of 50°to avoid any significant effect from the mapping function uncertainty.
We have adapted the Hernández-Pajares et al. (2006) technique to detect MSTIDs to develop the automatic EPBs detection tool.Principal differences and novelty of our proposed approach are: d Different from Hernández-Pajares et al., (2006), we detect the presence of EPBs from the second difference of TEC instead of sTEC.We do this in order to increase the number of data to analyze because we remove the elevation mask of 50°used in Hernández-Pajares et al. (2006) for LI and thus, we avoid amplifications of the sTEC at low elevation that could result in enhanced variability and produce false alarms.We have applied an obliquity factor (or mapping function) to obtain TEC from sTEC values, i.e.TEC = STEC/M.Where M is a mapping function, which depends on the elevation (e), and, assuming a single layer model for the ionosphere can be modeled as where R E is the Earth radius and H ion is the altitude of the ionospheric layer, which in this work is taken as 350 km (Mannucci et al., 1993;Kumar and Singh, 2009).
d Similar to previous approaches (Hernández-Pajares et al., 2006;Magdaleno et al., 2012), we estimate the TEC using the geometry-free combination of the ground-based GNSS carrier phase delays (LI = L1 À L2), which is much less noisy than TEC estimated from the geometry-free combination of the code delays (PI = P2 À P1).This is illustrated in Figure 2 (left), by comparing the estimation of the TEC obtained by different delays, LI and PI.It can be seen that the TEC estimated from LI, in black, is much more precise than the TEC estimated from PI, in grey.However, as it is known, carrier-phase can be affected by cycle-slips that produce jumps in the TEC estimation and that would result on an erroneous estimation of the bubble parameters.Such cycle slips are one of the effects caused by the ionospheric irregularities (scintillation) on the tracking of the GNSS signal by the receiver.Thus, as an alternative to extract additional information from the GNSS signal under severe conditions, we propose to use the TEC data estimated by smoothing the unambiguous but noisy code of the GNSS signal (PI).In particular, we have smoothed the TEC obtained from the geometry-free combination of the code delays (PI) by computing the 5-points running mean.Such technique will help to enlarge the TEC information connecting TEC estimated from the phase delays (LI) in different arcs (Fig. 2, right).Nevertheless, we have to take into account that the noise of the PI can be about 100 times larger than the noise on LI at low elevation, because of the increased noise due to the multipath in PI delay.Such as effect could be erroneously interpreted as an additional ionospheric activity.To avoid this, we have introduced an elevation mask of 20 degrees for any estimated TEC from PI delays.

Disturbance identification: ETD
Similar to Hernández-Pajares et al. ( 2006) approach, we use the second derivative of TEC to identify EPBs instead of the first derivative (Magdaleno et al., 2012) because the nominal variations of the TEC can present TEC rates of non-negligible values, which in turn can make more difficult the identification of the EPBs.Moreover, if we compare the variance of the second TEC derivative with that of the first TEC derivative we observe that the former remain more stable under quiet intervals than the latter, making easier the identification of depletions on the TEC.Thus, we use the second derivative of TEC (2DTEC) to detect automatically the presence of EPBs from ground-based GNSS data.Then, by computing the SIGMA of this second derivative (Eq.( 2)) we can define the starting and ending times of the plasma depletion, T 0 and T f respectively, as the times when the SIGMA is over a given threshold value.

SIGMA 2DTEC
ð Þ¼ In Equation ( 2), 2DTEC is the second TEC derivative and " TEC is the mean value of the data sample of n values.We calculate the SIGMA of the second difference of the TEC (SIGMA(2DTEC)) over sliding time windows of a given length to find distinct changes of its pattern as indicators of a disturbance.We have initially used a window of 600 s as a first approximation to calculate the SIGMA(2DTEC), similarly to that proposed by Magdaleno et al. (2012).We have also tested other time windows and we got to the conclusion that there are no significant differences in the detection of disturbed intervals.However, by enlarging the window length of analysis (900, 1200 s or more) we lose the possibility to find disturbed intervals of short time duration and consequently we will lose the capability to find small-scale size depletions.Notice that, as a zero order of approximation, assuming that EPBs drift with velocities of about 100 m s À1 and that geometry of the satellite observables do not affect the size estimation of the disturbance (which might not be true), the smallest size of the EPB we can observe using a 600 s window in the analysis would be of few tens of km.
In order to identify automatically the ionospheric disturbances (EPBs), we need to search for a threshold value of the SIGMA(2DTEC) above which we can define the starting and ending times of a disturbed interval and potential plasma depletion, T 0 and T f respectively.We have found that SIGMA(2DTEC) remains below a given threshold value for quiet intervals and SIGMA(2DTEC) systematically rises over such a given threshold when a depletion or large disturbance exists in the time variations of the TEC. Figure 3 shows two examples of the trend of SIGMA(2DTEC) for a quiet day (with no depletions) and for a disturbed day (with depletions).The left plots of Figure 3 depict the time variations of the TEC for a given day.The middle plots of Figure 3 depict the time variations of the SIGMA(2DTEC).Finally, the right plots of Figure 3 depict the cumulative number of windows which observes SIGMA(2DTEC) larger than a given value.
Under quiet conditions, one observes small values of the variance SIGMA(2DTEC) as depicted in the upper middle plot of Figure 3, which ranges from 0.0086 to 0.0714 TECU.The cumulative number of the SIGMA(2DTEC) over the time series displays a quasi linear trend (right upper plot).However, SIGMA(2DTEC) shows large values for disturbed intervals (middle bottom plot of Figure 3), which ranges from 0.0086 to 2.5 TECU.It is noticeable a significant change in the trend of the cumulative number of the SIGMA(2DTEC) over time series (right bottom plot) which clearly departs from the quasi linear trend observed for a quiet day.We have analyzed the above changes in the SIGMA(2DTEC) to identify the existence of plasma depletions within disturbed time intervals and to find a pattern enabling to fix their limits.Figure 4 shows the clear enhancement of the SIGMA(2DTEC) when a significant TEC depletion is observed.
According to this, we have analyzed the variability of the TEC for different seasons, longitudes and time intervals to characterize the above threshold value for the SIGMA(2DTEC) that will define the EPB time interval.We have done a statistical analysis for year 2014.We have used data from 25 GNSS receivers of low latitude IGS stations (30 s data sampling) (see Table 1 and Fig. 1), for 16 different weeks equally E. Blanch et al.: J. Space Weather Space Clim. 2018, 8, A38 distributed within the year (see Table 2) for all active satellites.Thus, the total data analyzed has been more than 10,000 weekly files.The results of this analysis show that no significant depletions are found for SIGMA(2DTEC) values lower than 0.714 TECU.It might happen that observing values of SIGMA(2D-TEC) !0.714 TECU can result with no depletion but a disturbance is present.However, we miss significant number of depletions by rising the threshold value above 0.714 TECU.That is why we fix a threshold of the SIGMA(2DTEC) = 0.714 TECU for estimating the time interval of a disturbance as potential indicator of a depletion caused by a EPB.Using this method, we obtain the time interval of the disturbance and its effective time duration (ETD; i.e.ETD = T f À T 0 ) (Fig. 5).However, additional constrains are needed to identify significant depletions as follow.

Bubble characterization: depth and total disturbance
The bubble depth is the maximum difference between the expected background TEC (TEC 0 ) and the estimated depleted TEC (TECD) within the limits identified as starting (T 0 ) and ending times (T f ) (Eq. 3): For this purpose, we must define an appropriate method for predicting the TEC 0 or must adopt an existing technique to trace the slow varying background TEC trend signal, which refers to the logical equilibrium level (e.g.Magdaleno et al., 2012;Pradipta et al., 2015).We estimate the background TEC 0 (t) values between T 0 and T f using a polynomial of degree 2 in the time domain, whose 3 coefficients have been determined by requiring continuity of both the function and its first derivative (Eq.( 4)).While the first condition is kept rigid, thus determining 2 degrees of freedom, the second condition has been relaxed and the 3rd degree of freedom is achieved by least squares fitting.
for T 0 < t < T f Figure 6 shows an example for estimating the TEC 0 (t) values between T 0 and T f (dark green dotted line in the top).When no boundary data is available in one of the edges (T 0 or T f ), the determination of the polynomial of degree 2 in the time domain might result into unrealistic and wrong estimation of the TEC 0 (t).In these cases, we use linear interpolation to estimate the background TEC 0 (t) (Eq.( 5)) to avoid potential false alarms caused by wrong estimation of the first derivative by least squares fitting when estimating the background TEC.
Once we have the TEC 0 (t) and TECD(t) we can obtain the disturbance itself DTEC(t) = TEC 0 (t) -TECD(t) and the Depth (Eq.( 3)).To avoid wrong effects in the determination of DTEC(t) produced by large data gaps in TECD(t), we only take under consideration, those time intervals in which the amount of data between T 0 and T f is larger than 60% of the expected data assuming a fixed 30 s data sampling.We also need to introduce additional constrains to identify a depletion caused by an EPB in the variations of the TEC.One of these is to ensure that detected depletion accounts for minimum Depth of 5 TEC units within the disturbed time interval, as requested by the SCIONAV project requirements, otherwise such interval is no longer considered for bubble identification.The last constraint is to ensure that the detected disturbance is caused by a depletion itself and to discard disturbances caused by enhanced TEC events or by TIDs-like disturbances.Although EPBs may be related to equatorial TID activity, a TID will be observed as a fluctuating pattern in the DTEC(t).Therefore, once we have obtained a disturbed time interval we require that the area limited by positive DTEC(t) values (PDTEC) must be lower than 40% of the area limited by the negative DTEC(t) values (NDTEC).Otherwise, such interval is no longer considered for bubble identification.
Finally, we can define the total disturbance caused by an EPB (TDB) as the integral of DTEC(t) (Eq.( 6)):

Tool operation and results
Considering the previous discussion in Sections 3.2 and 3.3, we have built an automatic detector of plasma depletions (EPB).It analyzes the second differences of TEC (2DTEC), and according to several conditions, it identifies the occurrence of EPBs and its main characteristics: duration (ETD), depth and total disturbance (TDB) that will be further used to build up a bubble prediction model.Figure 7 illustrates de tool operation describing the input data, the data processing within the automatic EPB detection tool and output products.
The automatic EPB detection tool provides, as an output, the bubble occurrence and their main characteristics.It also provides a plot for each case depicting the TEC (in TEC units), sigma of second difference of TEC, SIGMA(2DTEC), and DTEC(t) (see Fig. 6).
Comparing with other existing techniques (e.g.Portillo et al., 2008;Nishioka et al., 2008;Magdaleno et al., 2012;Cherniak and Zakharenkova, 2016;Kumar, 2017), the main advantages of using the proposed technique to detect automatically bubbles and depletions are: 1. the use of the TEC instead of the STEC means that we do not need to introduce a mask elevation when we use LI to avoid possible amplifications for low elevation data.Thus, more data can be analyzed and more bubbles or depletions can be detected; 2. the use of the unambiguous code delays (PI = P2 À P1) to obtain the TEC when data from the phase delays (LI=L2-L1) is not available.This usually happens when the signal pass through a significant disturbance, such a large bubble or depletion.Using the smoothed code delays, we are able to reconstruct the perturbation and analyze a larger amount of data.Note that this is only done for elevations larger than 20°because, at low elevation, code delays can be very noisy; Data from GNSS receivers 3. use of the second difference of TEC instead of the first derivative or rate of TEC change (ROT).The second difference remains more stable under quiet intervals and it makes easier the identification disturbed interval and therefore, the initial and final time of the bubbles or depletions.
Nevertheless, it is worth to mention that this technique, as existing ones that use the phase delays to obtain the TEC, presents false alarms when cycle-slips appear.It has been observed that when there is a cycle-slip in the phase delays, SIGMA(2DTEC) increases in the same way as if it was an ionospheric disturbance.The tool does not discriminate a cycle-slip from a bubble or depletion.Further work is needed to minimize the effect of the cycle-slips in the automatic detection tool.Two possible solutions have been identified to minimize the false alarms: (1) to use a cycle-slip detection method previous to calculate SIGMA(2DTEC) (2) to use only L1 instead of LI because it has been observed that L2 suffer more cycle-slips (e.g., Skone et al., 2001).
To validate the detection tool, we have compared the observations from different but collocated instruments similar to Burke et al. (2003) and Zakharenkova et al. (2016).We have compared the measurements by the GNSS Arequipa receiver, the Jicamarca unattended long-term studies of the ionosphere and atmosphere (JULIA) radar (Hysell and Burcham, 1998) and the Jicamarca Digisonde Portable Sounder (DPS-4).The Jicamarca JULIA radar is located at the Jicamarca Radio Observatory (lat À12.0, lon À76.8) near Lima, Peru.It is a low-power 50 MHz coherent scatter stratosphere-troposphere radar suited to observe the equatorial E and F region plasma irregularities and neutral atmospheric turbulences (Hysell and Burcham, 1998).The Jicamarca digisonde is also located at the Jicamarca Radio Observatory.An ionospheric sounder uses basic radar techniques to detect the electron density (equal to the ion density since the bulk plasma is neutral) of ionospheric plasma as a function of height.By scanning the transmitted frequency from 1 MHz to as high as 40 MHz and measuring the time delay of any echoes (i.e., apparent or virtual height of the reflecting medium) a vertically transmitting sounder can provide a profile of electron density versus height (e.g.Reinisch et al., 1992).We have compared the depletion in the TEC observed over the AREQ GNSS receiver caused by an EPB with the signal-to-noise ratio as a function of altitude and time observed by JULIA radar and the equatorial Spread F observed by the Jicamarca digisonde (Fig. 8).It is known that severe Spread F in range phenomena in low latitude ionograms are attributed to the plasma bubbles developing/passing above the station (e.g., Whalen, 2000;McNamara et al., 2013).Figure 8a shows an ionogram recorded over Jicamarca at 2:14:59 UT on 15th June 2002 (~21:40 LT 14th June) when no bubbles where detected.Thus, this ionogram was unaffected by the presence of plasma bubbles and shows a clear signal traces.However, Figure 8b shows a strong Spread F in both frequency and range at 2:14:59 UT on 2nd February 2002 (~21:40 LT on 1st February) that has been also observed by the JULIA radar between 20-22:30 LT (Fig. 9).JULIA measurements show that reflections from E layer altitudes persisted throughout the night.From 20:00 LT, we can observe the reflections from bottomside irregularities that rose in altitude few minutes after 21:00 LT.We can observe a plume at ~21:00LT (2:07 UT) that reach altitudes of 800 km and a second one at ~21:15 LT (2:22 UT).This irregularity observed at both instruments (JULIA radar and Jicamarca digisonde) is consistent with the formation of the Equatorial Plasma Bubble detected over AREQ on 2nd February 2002 at 2:02 UT (~21:26 LT) (Fig. 10).We have performed this validation experiment for all cases in 2002 with presence of bubbles (detected by the automatic detection tool) and the results are similar to those we present here: Jicamarca digisonde ionograms and JULIA radar present severe Spread F phenomena at same hour that the EPB occurrence.

Equatorial plasma depletions occurrence and characterization modelling
The second objective of this research is to develop an empirical model for the equatorial plasma depletions occurrence and their main characteristics (duration, depth and total disturbance) with respect to factors such as local time, season, solar cycle and geomagnetic latitude.

Data processing and methodology
We have used the automatic bubble detection tool described in the previous section to detect the bubbles and to obtain their main characteristics for further perform a comprehensive analysis to develop the empirical model.We have analyzed data from 52 ground-based low latitude IGS receivers for different years: 2002-2003 (maximum of solar cycle 23), 2009 (minimum of solar cycle 23) and 2014 (maximum of solar cycle 24).Table 1 shows a list of the IGS receivers used in this analysis and Figure 1 shows their geographic distribution.We can observe that data covers all longitudinal sectors except for that from À100°to À130°.For each period, we have processed 16 different weeks described in Table 2.For each analyzed receiver, week and solar activity we have detected a number of bubbles and obtained their main characteristics.From this information, we want (1) to obtain the probability of occurrence according to the local time, month and solar activity and (2) to obtain the probability to detect bubbles with a particular duration, depth and total disturbance.

Bubble occurrence probability
In order to obtain the probability of bubble occurrence in a determined location and year, we have calculated the number of reference bubbles per year (NRBY) which depends on the number of bubbles observed one year and the number of days analyzed during that year in order to avoid biases when no data is available (Eq. ( 7)).We also have defined the probability of bubble occurrence per month (POM) (Eq.( 8)) and the probability of bubble occurrence per Local Time (POLT) (Eq.( 9)): As a result, we have the number of reference bubble, the probability of bubble occurrence per month and the probability of bubble occurrence per LT for the three different periods of solar activity for each analyzed location.Our objective is to generate a global model for low latitude to predict the occurrence of bubbles.
To do this, we have interpolated the data using a Kriging method with an interpolation ellipse of 180°in longitude (to mitigate possible effects of data sparsity in longitude) and 30°in modip.Modip coordinates (Rawer, 1963) have been used due to the magnetic dependence of the EPBs development and the advantage of using a magnetic field coordinates in the global analysis.The interpolation has been done using a grid from À220 to 220 in longitude and À45 to 45 in modip in steps of 10°.To obtain a smooth transition for adjacent longitudes at À180 and 180, we have interpolated from À220 to 220 in longitude duplicating some of the stations (guam (lon., À215.13 E), tuva (lon., À180.8 E), nium (lon., 190.07E) and thti (lon., 210.39 E)) and also forcing the interpolation to be 0 at modip of 40, 45, À40 and À45 to have a smooth transition when moving to middle latitudes.From the interpolation results, we have generated a set of look up tables (LUTs) for geographic coordinates in steps of 10°.Above latitude 35°North, and below 35°South the presence of bubbles/depletions is negligible, so the corresponding data in the LUTs is set to zero. Figure 11 shows the number of reference bubbles for a given solar activity level (top panel), the probability of EPBs occurrence for a given month (middle panel) and Local Time (bottom panel) for high (left) and low solar activity (right) as an example.By multiplying the number of reference bubbles with the probability to detect bubbles in one month and the probability to detect bubbles at a local time we obtain the number of bubbles expected for one day at month M and local time LT (Eq.( 10)).
Data analyses have been applied to years with different solar activity levels, 2002-2003, 2009, and 2014.To obtain the probability of occurrence for other years (from 2002 to 2014), we have perform a linear interpolation of the probabilities for each year according to Solar Flux (F10.7)taking into account if the corresponding year is in the ascending or descending part of the solar cycle.With this, we have obtained a LUT for each year from 2002 to 2014 from which we have the number of reference bubbles per year, the probability of occurrence bubbles per month and the probability of occurrence bubbles per LT.For other years different from the interval 2002-2014, we can obtain the corresponding LUT by looking the most similar year in the period 2002-2014 according the Solar Flux and to the position in the solar cycle (ascending or descending part).The corresponding LUTs will be used to predict the presence of bubbles at any time for any year according to the Solar Flux.
Analyzing the results of the model for years from 2002 to 2014 for all months and hours, we have observed that, it simulates different important aspects: 1. maximum bubble activity occurs around the equatorial anomaly crest as it is expected according to the maximum scintillation effect in that area (De Rezende et al., 2007); 2. presence of bubbles over zones were no data is available as over the pacific; 3. asymmetry between hemispheres and longitudes that has not been reported so clearly before.
We have compared the number of reference bubbles (NRBY), which relates with the number of bubbles observed per year, for high and low solar activity (top panel of Fig. 11).It is observed that during low solar activity the presence of bubbles is very low and almost absent outside the Equator and South-America.It is clearly seen that during high solar activity, the number of reference bubbles is much larger than during low solar activity, and their distribution in latitude and longitude varies (e.g.Smith and Heelis, 2017).
Probability of occurrence per month (POM) has also been analyzed.It has been observed that, in general, the probability of occurrence of bubbles is larger during equinox months than during solstice's months, especially for June solstice (e.g.Zakharenkova et al., 2016).Moreover, it has been observed that the POM also depends on the longitude).As it has been described in previous studies (e.g.Tsunoda, 1985;Nishioka E. Blanch et al.: J. Space Weather Space Clim. 2018, 8, A38 et al., 2008;Magdaleno et al., 2017;Juan et al., 2018a), the relative position of the geomagnetic equator and the line of the solar terminator can explain these results: large values of E • B vertical plasma drift favors the generation of EPBs and larger values of E • B occurs when the angle between the solar terminator and the geomagnetic equator maximize.This depends on the longitude.
We also have analyzed the probability of bubble occurrence in local time (POLT).As it was expected, the presence of bubbles maximizes after the sunset.The bottom panel of Figure 11 shows the probability of occurrence at 20 LT for January 2014 and 2009 as an example.

Bubble characterization probability
As it has been explained in Section 4.1, we have used the automatic bubble detection tool to detect the presence of bubbles and their main characteristics for 2002-2003, 2009 and 2014.For each period, we have analyzed the effective time of duration (ETD), depth, and total disturbance (TDB) of the TEC perturbation and obtained the probability to observe a bubble with a given duration, depth and total disturbance independently of the location, month and local time (see Fig. 12 for 2009 and 2014).Similar to bubble occurrence probability, we have obtained the bubble characterization probability for other years (from 2002 to 2014), by interpolating the probabilities for each year according to Solar Flux (F10.7) and taking into account the position of the year within the solar cycle.
After examined the results for all years, we have observed that the depth of a bubble can be deeper during high solar activity than during low solar activity.At high solar activity, we have observed bubbles of 47 TECUs of depth (middle top panel of Fig. 12).Meanwhile, during low solar activity we have not detected depletions larger than 20 TECUs (middle bottom panel of Fig. 12).Similar behavior is observed for the effective time of duration and the total disturbance of the bubbles.At high solar activity, the bubble duration can be larger than during low solar activity when observed bubbles do not have a duration superior to 75 min while we can find bubbles at high solar activity that can last up to 95 min (left panels of Fig. 12).
Look Up Tables of these characteristics have also been generated to be used to estimate the characteristics of the bubbles causing TEC perturbations.Table 3 shows an example of LUT for the depth probability for year 2014.

Model results and validation
After applying the automatic bubble detection tool, we have analyzed the occurrence of bubbles according to longitude, latitude, local time, month and solar activity and their main characteristics according to solar activity.As result of this statistical analysis, we have built several Look Up Tables that give the probability of bubble occurrence and the probability to find a bubble with a given depth, ETD, TDB: -number of reference bubbles per year; -probability of occurrence per month; -probability of occurrence per Local Time.
As a result, these tables provide the probability of bubble occurrence for different local time, month and solar activity as well as their characteristics (Depth, ETD and TDB) in a global way.They can be found as additional material in the online version of the published article.We have observed that at high solar activity the occurrence of bubbles is larger than at low solar activity and that the longitudinal dependence depends on the month due to the different angle between the solar terminator and the geomagnetic field line.Depth, ETD and TDB have been also found to depend on the solar activity, being these values larger at high solar activity than at low solar  activity.These tables are used as input for a general perturbation model that reproduces the effects of EPBs on the slant Total Electron Content (sTEC) estimated by GNSS data as it is described in Camps et al. (2017).
We have obtained the number of bubbles observed per day for each month for 2014 for those stations and compared them with the expected number obtained from the interpolation.Figure 13 shows the position of those stations (light blue star) together with the number of bubbles per day observed and predicted by the model for January (Fig. 13a), March (Fig. 13b), April (Fig. 13c) and June (Fig. 13d).Although not shown here, similar behavior is observed for November and December, October, September and May to August respectively.Results show that larger number of bubbles are observed for those regions where larger number of bubbles are expected to occur.However, observed and modelled do not fit exactly.We should E. Blanch et al.: J. Space Weather Space Clim. 2018, 8, A38 keep in mind that expected values are modelled by the interpolation procedure explained in Section 4.2.This can be used as a first indication for the soundness of the model.From Figure 13 we also can observe the longitudinal and seasonal dependence of the bubble occurrence for this area: in wintertime, the maximum number of bubbles occurs over South American-Atlantic sector while during equinox, it occurs over the Atlantic-African sector (Zakharenkova et al., 2016).
Results have also been compared qualitatively with the observations reported in recent published papers by Kumar (2017), for Indian sector and Magdaleno et al. (2017) for different longitude sectors.In these researches, the authors analyzed the climatology of the equatorial plasma bubbles using different techniques that used in this paper.Our results confirm those reported by previous teams.Kumar (2017) reported that, over the Indian sector, the monthly EPB occurrence shows an equinoctial maximum during high solar activity years and maximum occurrence in June during solar minimum years.Our model provides the number of bubbles expected to observe one day at a given month (NBM = NRBY AE POM) and, over the Indian sector, it is larger during equinox months at high solar activity, as observed by Kumar (2017) but it does not show the maximum EPB probability at June during low solar activity.Magdaleno et al. (2017) analyzed the temporal behavior of the occurrence rate and its relation with the season and solar activity at different longitudes and latitudes.We have compared their results with the results of our model (number of bubbles expected to observe one day at a given month, NBM = NRBY AE POM) for similar locations over areq (lat., À16.47°N; lon., À71.49°E), nklg (lat., 0.35°N; lon., 9.67°E), issc (lat., 13.02°N; lon., 77.57°E) and guam (lat., 13.59°N; lon., 144.87°E).Figure 14 shows the expected number of bubbles per day at a given month for each year obtained from our model near over areq.If we compare these results with those reported in Magdaleno et al. (2017) for the same station (areq) and years (2002)(2003)(2004)(2005)(2006)(2007)(2008), we observe similar results: the peak of occurrence of bubbles occurs at March and October at high solar activity and during low solar activity, there is no probability of bubble occurrence.Although not shown here, all the stations show a good agreement between the observed by Magdaleno et al. (2017) and the predictions of our model.Magdaleno et al. (2017) have presented results at global scale for the occurrence rate (OR), depth end effective time duration (ETD).Their results indicate that the OR, depth and ETD maximizes at magnetic equator and these values decrease as the distance from the equator increases.However, our results go one step forward, certainly we confirm that the probability of occurrence of EPBs as function of local time and month (or season) tends to maximize at modip equator, especially by 19-20 local time.However, later on it tends to reach peak values near the anomaly crest (±10-15 modip) (Fig. 11).This can be explained as follows: EPBs are generated in the equatorial region and move to low latitudes along the magnetic field lines reaching the anomaly crest approximately in one hour.When they reach the anomaly crest, they present their larger amplitude (De Rezende et al., 2007).It is also known that the most intense scintillation occurs near the crest of the equatorial anomaly (e.g.De Paula et al., 2003).As EPBs are  2017) also reported for the temporal behavior in different longitudinal sectors.Our results confirm the results previously reported by Magdaleno et al. (2017) but expanded for the entire longitudes around the globe.We can confirm the larger EPB activity for the Atlantic to Africa sector (Fig. 11) as it has been observed with in situ measurements using Defense Meteorological Satellite Program (DMSP) data (Burke et al., 2004) and reported by Smith and Heelis (2017) based C/NOFS measurements of ion density and vertical ion drift.In addition to that, one can guess several cell structures of enhanced EPB activity at particular longitudes.These structures are centered at longitudes of À40°E, 40°E, 120°E, and 180°E approximately.Our results also report a longitudinal dependence of the seasonal occurrence of EPBs (Fig. 15).It is clear that occurrence of EPBs peaks near to equinoxes for all sectors.However, occurrence of EPBs for America to Africa sector is significant in November-December and insignificant in June-July (also reported in Burke et al. 2004 from measurements using DMSP data and by Huang 2017 using ion density and velocity data measured by C/NOFS).The opposite happens for the Pacific sector, where the occurrence of EPBs is insignificant in November-December and significant in June-July (see also Burke et al., 2004).As it is explained in Section 4.2 this is due to the dependence in longitude of the angle between the geomagnetic equator and the line of the solar terminator.
It is worth noticing that the technique used by Magdaleno et al. (2017) is independent to the one we have developed in this analysis.These authors bases the detection of a depletion on the variation of the slope of the sTEC values and the population variance of these slope values.Our approach to detect EPBs focus on the 2nd time derivatives of TEC (2DTEC).2DTEC looks more sensitive to irregularities and more stable to quiet intervals than 1st TEC derivatives.Therefore, confirmation of previous results obtained by different methods and techniques and independent observations confirms the soundness of both approaches.

Current application and future work
The results of this research have been used in the framework of the ESA project SCIONAV (ESA-ITT 1-8214/15/ NL/LvH).The automatic bubble detection tool has been used to obtain the occurrence of bubbles in a global way to later perform a statistical analysis to build the bubble occurrence model and the bubble characteristic probability.This analytical model is used by the generic tool developed in the framework of the SCIONAV project that evaluates the impact of ionospheric disturbances, particularly low frequency fluctuations in radio signals (Camps et al., 2017).Low-frequency TEC variations are modeled based on the EPBs characterization derived from our research concerning EPB Depth (DTEC) and the size, in terms of the effective time duration (ETD) and total disturbance of the depletion itself (TDB), both depending on local time, season and solar activity.In this generic model, EPBs are assumed to have a Gaussian shape to provide a continuous variation of the STEC (Eq.( 11)).All these parameters are geographically and temporally random variables.where DSTEC max , and T are depth and duration of the bubble respectively according to the LUTs described in Section 4. T 0 is the time of maximum bubble depth, ðx 0 ; y 0 Þ is the bubble's maximum depth position (lat, lon) at t = 0 s, ða; b; cÞ are the shape dimension parameters.Figure 1 of Camps et al. (2017) shows an example of a snap-shot of the bubbles generated.Current work of the team is in development to estimate and model the velocity vector of propagation and direction of EPBs depletions, whose results should improve the aforementioned EPB model predictions and use to evaluate the impact in radio signals of the low frequency fluctuations of the ionosphere in equatorial latitudes.We also pretend to extent the applicability of the bubble detection tool to detect and analyze the occurrence of bubbles at mid latitudes.Cherniak and Zakharenkova (2016) and Katamzi-Joseph et al. (2017) reported the observation of plasma bubbles at mid latitudes during geomagnetic storms.We would like to analyze the climatology of the bubbles occurrence at mid latitude and, if possible, to perform an empirical and analytical probability of occurrence model.Moreover, this technique can be also adapted to detect other ionospheric disturbances such as polar patches.
We already have tested its functionality with promising results.
As already stated and well known, the occurrence of EPBs is closely related to the scintillation phenomenon that strongly affects navigation and telecommunication systems based on radio techniques (e.g.Paznukhov et al., 2012).Ionospheric scintillation is one of the most challenging effects affecting precise positioning in Global Navigation Satellite Systems (GNSS).This perturbation is related with fluctuations in the intensity and the phase of electromagnetic signals when they are refracted and/or diffracted by irregularities in the electron distribution encountered during their travel along the ray propagation path.In this way, scintillation causes the GNSS signals to have an increased level of noise and the amplitude of the signals suffers deep fades (decreasing dramatically their signal to noise ratio) which can end on a cycle slip of the carrier phase tracking or even on a loss of lock, thereby disrupting the performance of space-based communication and navigation systems.This effect has affected our data analysis as explained in the Section 3, and we tried to mitigate effects of loss of lock in the carrier phase tracking (LI) by using the TEC data estimated by smoothing the unambiguous but noisy code of the GNSS signal (PI).In this way, we can enlarge the TEC information, connecting TEC estimated from the phase delays (LI) in different arcs (Fig. 2, right).In any case, the identification, correction or mitigation of the scintillation effects is one of the current challenges in achieving precise GNSS navigation (Pi et al., 2014) and would benefit to data analyses studies as the one we carried out in the present work.Recent research has accounted for this, (e.g.Juan et al., 2017).They have proposed a method able to isolate, over the combined signal applying the geodetic detrending, the effect of single jumps in the individual frequencies, and the frequency experiencing the cycle-slip using GNSS data at 1 Hz sampling.Application of this technique has been proved by Juan et al. (2018a) concluding that precise navigation is possible under strong scintillation conditions as long as the problem with the cycle slips would be properly addressed.Therefore, future use of Juan et al. (2017) method would improve carrier phase tracking by fixing the cycle slips effect and get improved estimation of TEC data and consequently a better estimation and characterization of equatorial plasma depletions, which in turn will provide better prediction model development.The latter opens future directions of work for the team.

Summary and conclusions
We have developed a method to detect Equatorial Plasma Bubbles from GNSS data.This methodology has been inspired on a previous work of Magdaleno et al. (2012) and has been improved using the proposed methodology of Hernández-Pajares et al. (2006) to detect Medium Scale TIDs.The novelty of this methodology is the use of the second difference of TEC instead of the first difference of STEC used in several studies (e.g.Portillo et al., 2008;Nishioka et al., 2008;Magdaleno et al., 2012;Cherniak and Zakharenkova, 2016;Kumar, 2017).This allows, on one hand, to analyze a larger amount of data because using TEC instead of sTEC we do no need to apply an elevation mask when using LI to avoid effects of low elevation data.On the other hand, this also allows a better determination of the initial and final time of the perturbation because the second derivative of TEC is less noisy than the first derivative.We also have introduced another novelty comparing with existing techniques: we use the noisier but unambiguous code (PI = P1 À P2) to reconstruct the perturbation when phase (LI = L1 À L2) is not available due to cycle slips or loss of lock.This allows to increase the number of analyzed data and to be able to detect large perturbations that could not be detected.We only use the code when the elevation angle is larger than 20°to avoid possible noise at low elevation.In case we cannot reconstruct the gap for elevations lower that 20°, the data interval will be only analyzed if LI gap is lower than 40%.As a result, we have developed an automatic bubble detection tool that give us the occurrence of a bubble from GNSS and provides information about the location, duration, depth and total disturbance.The second objective of this work is to apply this automatic detection tool to a large number of GNSS receivers located at low latitudes and well distributed in longitude in order to analyze the occurrence of bubbles according to latitude, longitude, solar activity, month and local time and their main characteristics.From these data, we have performed an exhaustive statistical analysis to obtain: (1) probability of bubble occurrence and (2) bubble characterization.As a result, we have obtained and empirical and statistical model that provides the expected number of bubbles per day in a given month and local time and the probability to detect a bubble with a given duration, depth and total disturbance according the solar activity.This model is based on a several Look Up Tables that give the corresponding outputs.Analyzing the output of this model, we have observed: (1) the number of bubbles observed per Local Time does not depend on the longitude, whereas (2) the number of bubbles observed per day at a given month depend on the longitude (longitude asymmetry).Other authors have observed this season-longitudinal E. Blanch et al.: J. Space Weather Space Clim. 2018, 8, A38 dependence previously (e.g.Burke et al., 2004;Su et al., 2008;Kumar et al., 2016b;Magdaleno et al., 2017).This is due to the dependence of the angle between the solar terminator and the magnetic field with the generation of the Rayleigh-Taylor instability, which maximize when this angle minimize (Tsunoda, 1985).We also have observed that (3) larger number of bubbles occurs over the South America-Atlantic region as has been already observed by other techniques using different data sources (Burke et al., 2004;Su et al., 2008).In addition, we have observed that (4) the maximum bubble activity occurs at the equatorial anomaly crest that has not been observed before (when comparing with Magdaleno et al., 2017)

Fig. 1 .
Fig. 1.Geographic distribution of the IGS GNSS receivers used in the analysis for different years.

Fig. 3 .Fig. 5 .Fig. 4 .
Fig.3.Examples of time variations of the TEC (left plots) and SIGMA(2DTEC) (middle plots) over the indicated day, and of its cumulative number (right plots) for a quiet day (upper plots) and disturbed day (lower plots).Light green shows the TEC estimated by the phase delays (LI), blue dots shows the TEC estimated by the code delays (PI) and red dots correspond to the 5-points smoothed code delays (PIs).

Fig. 6 .
Fig. 6.An example of the output of the automatic bubble detection tool for BAKO receiver, 12 October 2014 and satellite 21.Upper panel: Estimated TEC from LI (TECU) (light green dots), TEC from PI (TECU) (blue dots), the 5-point running means smoothed TEC from PI (TECU) with an elevation mask of 20°(red dots) and the predicted background TEC from LI (TECU) (dark green line).Vertical lines whose values are shown in the legend mark T 0 and T f .Middle panel: SIGMA(2DTEC) (TECU), where the horizontal blue line indicates the threshold value (0.075 m 2 = 0.714 TECU) defining T 0 and T f , and in turn ETD.Bottom panel: Difference between predicted background TEC and estimated TEC, DTEC(t) (TECU), from which we calculate the bubble Depth and the total disturbance TDB.

Fig. 7 .
Fig. 7. Tool operation description showing the input data (data from GNSS receivers), the automatic detection tool algorithm and the output products.

Fig. 10 .
Fig. 10.TEC depletion detected on 2nd February 2002 at 2:02 UT (~21:26 LT of previous day) from the areq GNSS receiver using the Equatorial Plasma Bubbles automatic detection tool described in Section 3. Depletion related with an EPB of 13.2 TECUs depth and a duration of 38 min.Light green dots: estimated TEC from LI (TECU).Blue dots: TEC from PI (TECU).Red dots: the 5-point running means smoothed TEC from PI (TECU) with an elevation mask of 20°.Dark green line: predicted background TEC from LI (TECU).Vertical lines whose values are shown in the legend mark the initial time (T 0 ) and final time (T f ) of the bubble detection.

NBMLT¼
Number of bubbles for one day at month M and local time LT ¼ NRBY Á POM Á POLT:

Fig. 11 .
Fig. 11.Interpolation maps for high solar activity (left) and low solar activity (right).Top: Number of reference bubbles for a given solar activity (NRBY).Middle: Probability of EPBs occurrence for a given month (POM), January.Bottom: Probability of EPBs occurrence for a given Local Time (POLT), 20LT.Blue triangles correspond to receivers used in the interpolation and the numbers are the corresponding values at each location.

d
Look Up Tables A, for the period 2002-2014 in steps of 10°of latitude and longitude with the following data:

Fig. 13 .
Fig. 13.Number of bubbles per day for January (a), March (b), April (c) and June (d) 2014.Blue triangles correspond to the IGS stations used to design the model.Light blue stars correspond to additional receivers not used in the construction of the model.Numbers for each station correspond first to the number of bubbles observed per day and second to the number of bubbles expected per day according to the interpolation.

Fig. 14 .
Fig.14.This plot shows the expected number of bubbles per day at a given month of a given year near to areq station (À70°E, À15°N).

Fig. 15 .
Fig. 15.Expected number of bubbles per day at a given month for 2014 at 5°North in latitude for different longitudes.

Table 1 .
List of IGS GNSS receivers used in this study arranged by longitude.
TEC variation for DoY 262 of year 2014 from RECF receiver and PRN 24.Left: Light green dots show the TEC estimated from the phase delays (LI) and blue dots show the TEC estimated from the code delays (PI).Right: Red dots correspond to the 5-points running mean TEC estimated from the code delays (PIs).Observe that part of the gap in the LI near the TEC depletion can be recovered.

Table 3 .
LUT for bubble depth probability obtained for 2014.
and it is expected according to the maximum scintillation effect in that area.