Database of episode-integrated solar energetic proton fluences

Abstract – A new database of episode-integrated proton fluences is described. This database contains data from two different instruments on multiple satellites. The data are from instruments on the Interplanetary Monitoring Platform-8 (IMP-8) and the Geostationary Operational Environmental Satellites (GOES) series. A method to normalize one set of data to the other is presented to create a seamless database spanning 1973 to 2016. A discussion of some of the characteristics that episodes exhibit is presented, including episode duration and number of peaks. As an example of what can be understood about episodes, the July 4, 2012 episode is examined in detail. The coronal mass ejections and solar flares that caused many of the fluctuations of the proton flux seen at Earth are associated with peaks in the proton flux during this episode. The reasoning for each choice is laid out to provide a reference for how CME and solar flare associations are made.


Introduction
Episodes of elevated proton fluxes are of interest because of the hazard they pose to space systems and space crews. Higher levels of radiation in space pose a risk to astronauts beyond low Earth orbit. This radiation can cause cancer, decrease central nervous system function, degenerate body tissue, or cause acute radiation syndrome (Chancellor et al., 2014). After multiple missions, astronauts on the International Space Station can exceed NASA's lifetime radiation limits (Cucinotta, 2014). The elevated radiation levels can damage spacecraft in multiple ways, including single event upsets (in which a microelectronic cell changes state) and satellite charging (Petersen et al., 1982; Suparta, 2014). Models such as the Cosmic Ray Effects on Micro-Electronics model, 1996 version (CREME96), its more recent update, the Space Ionizing Radiation Environments and Effects toolkit (SIRE2), and the SPace ENVironment Information System (SPENVIS) were developed to address the space radiation environment for spacecraft in orbit (Cressler and Mantooth, 2012; Adams et al., 2017). Many models try to give the probability that some solar proton fluence level will be exceeded during a space mission (King, 1974; Feynman et al., 1990; Xapsos et al., 1999a, b; Jiggens et al., 2012). Most of these models use a database of proton measurements to predict the probability of encountering extremely high levels of proton fluxes. Providing new and larger databases of proton measurements is one way to give the space radiation community new data to help improve the current models. With better models available, mission planners and spacecraft designers will have a better understanding of the space radiation environment and can use this knowledge to prepare space systems and space crews to better survive in this environment.
Episodes of solar proton activity consist of one or more solar energetic particle events. When an episode contains more than one event, it is because the events partly overlap in time. It can even happen that smaller events are hidden under larger events. For these reasons, individual events cannot always be cleanly separated. The properties of events within the same episode tend to be correlated. Episodes, however, tend not to be correlated with one another.
This paper presents a record of episodes of elevated solar proton flux observed by spacecraft in high Earth orbit. This record contains the differential energy spectra of the episode-integrated proton fluence for each episode along with its onset and end time and covers the period from November 1973 through December 2016. For episodes up through October 2001, these spectra begin at 0.88 MeV. After October 2001, the spectra begin at 4.2 MeV. All the spectra extend up to 485 MeV. In Section 2, the satellites and instruments that are used to measure the proton fluxes are described. Section 3 discusses the cleaning and episode identification process in the data. The process used to create a seamless database from different instruments is discussed in Section 4. Section 5 details the fitting of the Geostationary Operational Environmental Satellite data spectra and their conversion into the Goddard Medium Energy Experiment energy channels. Then a discussion of the episodes in this database is presented in Section 6. This includes listing some common features observed in episodes. Finally, Section 7 gives the conclusions.
2 Description of the Goddard Medium Energy Experiment and the Geostationary Operational Environmental Satellite datasets
The majority of the description of the satellites was first reported by Robinson (2015).
The Goddard Medium Energy (GME) Experiment was launched on the Interplanetary Monitoring Platform-8 (IMP-8) on October 26, 1973. Its orbit is approximately circular at 35 Earth radii, so GME was well positioned to measure interplanetary solar particle fluxes. See https://spdf.sci.gsfc.nasa.gov/pub/data/imp/imp8/documents/archived_website/gme/GME_instrument.html for a detailed description of the GME instrument. The GME data used in this paper cover the time period from November 1, 1973 to October 31, 2001 and are available with 30-minute resolution on the Coordinated Data Analysis Web (https://cdaweb.sci.gsfc.nasa.gov/index.html/). No later GME data are available, although IMP-8 continued to be tracked intermittently for a few more years before contact with the satellite was lost.
The GME data for this work were collected and analyzed by Michael Xapsos and Craig Stauffer in the mid-2000s. GME has a total of 30 channels that measure proton flux data. These 30 channels cover most of the range from 0.88 to 485 MeV. There is a 6 MeV gap (from 81 MeV to 87 MeV) between channels 22 and 23. In this work, interpolation was used to extend each of these channels to 83.95 MeV in order to create a continuous dataset. Channel 15 was also excluded from the dataset since most of this channel's energy range is covered by channel 16. This created a second data gap, from 18.7 MeV to 19.8 MeV, between channels 14 and 16. Both channels were extended to 19.24 MeV by interpolation to fill this gap. For this work, the channel numbers were reduced by one beginning with channel 16 so that the channels are numbered sequentially. The energy bin boundaries of the 29 GME energy channels used in this paper are given in Table 1.
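The two fill boundaries quoted above are numerically consistent with the geometric mean of the gap edges, i.e. the midpoint on a logarithmic energy axis. The sketch below checks that arithmetic; that this is the interpolation actually used is an assumption, and the function name `log_midpoint` is introduced here only for illustration.

```python
import math

def log_midpoint(lower_edge_mev, upper_edge_mev):
    """Geometric mean of two energies (midpoint on a log energy axis).

    Assumption: the paper only says "interpolation"; the geometric mean is
    one choice that reproduces the quoted boundaries.
    """
    return math.sqrt(lower_edge_mev * upper_edge_mev)

# Gap between channels 22 and 23 (81 to 87 MeV)
print(round(log_midpoint(81.0, 87.0), 2))
# Gap left between channels 14 and 16 after dropping channel 15 (18.7 to 19.8 MeV)
print(round(log_midpoint(18.7, 19.8), 2))
```

Rounded to two decimals, these give 83.95 MeV and 19.24 MeV, matching the boundaries stated in the text.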
The Geostationary Operational Environmental Satellite (GOES) data come from the Energetic Particle Sensors (EPS). These instruments, along with the High Energy Proton and Alpha Detector (HEPAD), are usually referred to as the Space Environment Monitor (SEM) package. The SEM data used in this paper are available at https://satdat.ngdc.noaa.gov/sem/goes/data/ as 5-minute average fluxes. When the authors first downloaded the GOES data, the data were in a different format from the one currently available online. For the more recent years (2012-2016), the authors used the newly formatted data, which is the only format currently found online. The data on the website come in two types, corrected and uncorrected. The corrected data use the Zwickl algorithm to remove the cosmic ray background and the secondary detector response (Onsager et al., 1996) and were used in this study.
There are multiple satellites in the GOES series. The first one was launched October 16, 1975. All of the GOES satellites orbit the Earth in geostationary orbit, approximately 35,800 km above Earth's surface (Onsager et al., 1996). For this database, the GOES series data were used from November 1, 2001 to December 31, 2016. NOAA's recommendations for the primary and secondary satellite were used for the proton measurements in different time periods. GOES-8 was used from November 1, 2001 to May 31, 2003, GOES-11 from June 1, 2003 to April 30, 2010, and GOES-13 from May 1, 2010 to December 31, 2016. The energy channels for the SEM package can be found in Table 2. A complete description of these instruments can be found in Onsager et al. (1996). In recent years, work has been done to redefine the proton energy channels on GOES; see Sandberg et al. (2014). The redefined energy channels were not used in this paper because the authors concluded that the best approach for this work would be to use the energy channel widths defined by NOAA and the data corrected by the Zwickl algorithm. Because the SEM instruments lack anti-coincidence detectors, side-penetrating events are not prevented from contributing to the count rate in each energy channel. These side-penetrating events are likely to make a significant contribution; therefore, a correction for them was needed. The Zwickl algorithm is designed to approximately correct the count rate for side-penetrating events. For this reason, we have chosen to use the corrected GOES data. Rather than adjusting the GOES energy channels to match the IMP-8/GME measurements, we chose to use the energy channel boundaries determined by NOAA and the instrument builder. These are presented in Onsager et al. (1996).
We then corrected the GOES data to the IMP-8/GME measurements in each GOES energy channel using the average of the ratio of the GME flux (integrated over each GOES channel) to the flux measured by GOES in the same channel during the same SEP events. We only used time periods when the fluxes in both satellites were well above background. This procedure brings the GOES channels into agreement with the GME channels without adjusting the energy boundaries of the GOES energy channels.
In this work, the GME data have been assumed to provide the most accurate measurements of the proton flux available in this time period, based on investigations of various instruments that have been reported in the literature. Smart and Shea (1999) showed that there is general agreement between GOES and IMP-8 to within a factor of 2. Rosenqvist et al. (2005) concluded that the most reliable instruments are IMP-8/GME, GOES-7, and GOES-8. Rodriguez et al. (2014) showed that the GOES-8 through GOES-15 EPS instruments agree to within 20% of each other. While Glover et al. (2008) recommend converting the IMP-8 data to GOES-8, the authors feel that using the GME energy channels has more benefits due to the higher energy resolution of GME. For these reasons, the authors feel confident that a consistent database can be built using the GME and GOES data due to the reliability of these instruments.
3 Method for preparing the data and identifying episodes of elevated proton flux
This section describes the process used to create a continuous dataset of episodes. First we describe the steps taken to process the GME data and how episodes were identified. Next, the GOES data processing is explained in detail. This includes correcting for gaps in the data record, background subtraction, and the episode identification process. The majority of this section was first reported by Robinson (2015).

GME data processing
Gaps in the GME data may be caused by incomplete data recovery by the tracking network, telemetry errors, or instrument saturation. These gaps were identified and filled as follows. Small data gaps can be filled reliably by interpolation using the good data preceding and following the gap. This was done using either a linear or logarithmic interpolation, depending on which was more consistent with the time profile of the event at the point where the gap occurs. If the gap fell in a period when the flux was changing rapidly, logarithmic interpolation was used. If the gap fell in a period when the flux was changing very slowly or hardly at all, linear interpolation was used. Larger data gaps exist, for example, when the GME instrument saturated. To fill these gaps, it was necessary to obtain the time profile from an alternate data source. For gaps occurring before 1986, data from the Charged Particle Measurement Experiment on IMP-8 were used. From 1986 until the data record ends in 2001, data from the GOES satellites were used. In every case, data were obtained from the alternate source to cover a broad time span overlapping the data gap. This allowed the data from the alternate source to be scaled to match the fluxes measured by GME preceding and following the gap. The alternate data, after scaling, were used to fill the gap. This method was tested by removing a section of GME data from an episode and filling it in with the GOES data following the method described here. The resulting time profile of the GOES data matched closely with the removed GME data.
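The difference between the two interpolation choices for small gaps can be illustrated with a minimal sketch. The function name `fill_gap` and its arguments are hypothetical; the paper does not give the exact implementation, and the choice of mode was made by inspecting the time profile rather than by a fixed rule.

```python
import math

def fill_gap(flux_before, flux_after, n_missing, mode="linear"):
    """Interpolate n_missing fluxes between two good measurements.

    mode="linear": straight line in flux (slowly varying flux).
    mode="log":    straight line in log(flux) (rapidly varying flux);
                   both end points must be positive.
    """
    if mode not in ("linear", "log"):
        raise ValueError("mode must be 'linear' or 'log'")
    if mode == "log" and (flux_before <= 0 or flux_after <= 0):
        raise ValueError("logarithmic interpolation needs positive fluxes")
    filled = []
    for k in range(1, n_missing + 1):
        weight = k / (n_missing + 1)  # fractional position inside the gap
        if mode == "linear":
            value = flux_before + weight * (flux_after - flux_before)
        else:
            log_value = math.log(flux_before) + weight * (
                math.log(flux_after) - math.log(flux_before)
            )
            value = math.exp(log_value)
        filled.append(value)
    return filled

# One missing point between fluxes of 1 and 100:
print(fill_gap(1.0, 100.0, 1, mode="linear"))  # arithmetic midpoint
print(fill_gap(1.0, 100.0, 1, mode="log"))     # geometric midpoint
```

During a steep rise or decay the flux changes roughly exponentially, so the logarithmic form avoids the overestimate that a straight line in flux would give.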
With a completely gap-free dataset, a search could be made for episodes of elevated proton flux. Episodes were identified when either the peak flux in the 1.15 to 1.43 MeV energy channel exceeded 4 cm⁻² · s⁻¹ · sr⁻¹ · MeV⁻¹ or the peak flux in the 42.9 to 51.0 MeV energy channel exceeded 0.001 cm⁻² · s⁻¹ · sr⁻¹ · MeV⁻¹. The trigger threshold for the high-energy bin was included to avoid missing small events with hard energy spectra. Such events can be submerged in the background at low energies, only appearing above background in the higher energy channels. Only 1 event in the dataset was found that exceeded the high energy threshold without exceeding the low energy threshold. The episodes that satisfied the threshold criteria described above are supplied in the supplemental text document. Whenever possible, the onset and end times of episodes were determined from the time at which the flux first exceeded background to the time at which it returned to background. Sometimes the flux fell below the threshold and went above it again before falling once more below the threshold. When this occurred, the second rise above threshold was interpreted as the start of a second independent episode. In these cases the two episodes were separated by the local minimum in the flux-time profile. If the flux showed successive rises and falls but remained above the threshold, only one episode was identified.
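The two-channel trigger described above amounts to a logical OR of two threshold tests. A minimal sketch, with hypothetical names and the threshold values taken from the text:

```python
# Thresholds from the text, in cm^-2 s^-1 sr^-1 MeV^-1
LOW_CHANNEL_THRESHOLD = 4.0      # 1.15 to 1.43 MeV channel
HIGH_CHANNEL_THRESHOLD = 0.001   # 42.9 to 51.0 MeV channel

def is_episode(peak_flux_low, peak_flux_high):
    """True if either channel's peak flux exceeds its trigger threshold."""
    return (peak_flux_low > LOW_CHANNEL_THRESHOLD
            or peak_flux_high > HIGH_CHANNEL_THRESHOLD)

# A small event with a hard spectrum: hidden at low energy, visible at high energy
print(is_episode(peak_flux_low=1.2, peak_flux_high=0.004))
```

The second test is what catches the hard-spectrum case; per the text, only one event in the dataset depended on it.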
Next, the background was removed from the data in each channel for every year. This was done by looking for periods of a few days to more than 2 weeks where the flux was at background. The flux during this period was then averaged to create a background value for each channel and year. The background was then subtracted from each flux measurement during an episode. If the flux measurement dropped below background at any point in a channel during an episode, the flux was recorded as 0. To calculate the episode-integrated fluences, the 30-minute averaged fluxes were multiplied by 1800 s to convert them to fluence and then summed over the entire episode to get the total fluence. Recently, Sandberg et al. (2014) provided a way to re-calibrate five of the GME channels due to a malfunction in the instrument after April 1984. The calibration functions had large corrections for low fluxes (10⁻⁴-10⁻² MeV) in five energy channels. However, at higher fluxes, the calibration function approaches a value of 1 for all five channels. Since this paper is looking at episode-integrated fluences, higher fluxes contribute more to the overall fluence of the episode than lower fluxes. For this reason, the calibration functions can be ignored in this dataset without sacrificing accuracy in the episode-integrated fluence for the GME data in those channels.
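The background subtraction and fluence integration described above can be sketched as follows. The function name is hypothetical; the 1800 s factor and the clipping of below-background fluxes to zero come from the text.

```python
SECONDS_PER_30_MIN = 1800

def episode_fluence(fluxes_30min, background):
    """Episode-integrated fluence for one channel.

    fluxes_30min: 30-minute averaged fluxes during the episode
    background:   background flux for this channel and year
    """
    fluence = 0.0
    for flux in fluxes_30min:
        net_flux = max(flux - background, 0.0)  # below background counts as 0
        fluence += net_flux * SECONDS_PER_30_MIN
    return fluence

# Three 30-minute samples, one of which sits below the background level
print(episode_fluence([1.5, 0.2, 2.5], background=0.5))
```

If the fluxes are differential (per MeV), the result is a differential fluence in the same per-MeV, per-steradian sense.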

GOES data processing
The data in GOES channel P1 are usually contaminated by particles of magnetospheric origin. These magnetospheric particles caused daily fluctuations in channel P1 over many months in all three of the GOES satellites used in this paper. This time-varying background made it difficult to identify episodes in this channel. Generally, the solar energetic particle flux in P1 only exceeded the magnetospheric background during relatively large solar energetic particle events. For this reason, the channel P1 data were excluded from this database.

Cleaning and background subtraction
The raw data from GOES contain scattered missing data. These missing data were marked by the values 32700 or -99999, depending on whether the data chosen from NOAA's website were the old average or new average fluxes, respectively. The only difference between the old and new average data is the value used for the missing data. These missing data have to be corrected through interpolation or substitution. If there were only one or two successive bad 5-minute average fluxes, the missing data were replaced by linearly interpolating between the preceding and following good data. The larger gaps (when three or more bad 5-minute average fluxes appear in succession) were not filled unless they occurred during an episode. For such unfilled gaps, the value 9.0E6 was inserted as a temporary missing-data marker. In what follows, the result will be referred to as the "cleaned 5-minute-averaged flux data". The cleaned 5-minute-averaged flux data were used so that the missing-data value of 32700 would not be confused with data measured during an episode. The value of 32700 was comparable to the flux levels in channel P2 during some of the largest solar particle events. Using the value 9.0E6 for the missing data allowed it to be easily recognized when the authors were visually identifying the episodes.
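The cleaning rule above (interpolate runs of one or two bad samples, flag longer runs with 9.0E6) can be sketched as below. The function and constant names are hypothetical, and the handling of a short bad run at the very start or end of the series (flagged, since no neighbor exists on one side) is an assumption not stated in the text.

```python
MISSING_MARKERS = {32700.0, -99999.0}  # old and new NOAA missing-data values
GAP_MARKER = 9.0e6                     # temporary marker for unfilled gaps

def clean_fluxes(raw):
    """Return a cleaned copy of a list of 5-minute average fluxes."""
    cleaned = list(raw)
    n = len(cleaned)
    i = 0
    while i < n:
        if cleaned[i] not in MISSING_MARKERS:
            i += 1
            continue
        start = i
        while i < n and cleaned[i] in MISSING_MARKERS:
            i += 1                     # i ends one past the bad run
        run_length = i - start
        has_both_neighbors = start > 0 and i < n
        if run_length <= 2 and has_both_neighbors:
            left, right = cleaned[start - 1], cleaned[i]
            for k in range(run_length):
                weight = (k + 1) / (run_length + 1)
                cleaned[start + k] = left + weight * (right - left)
        else:
            for k in range(start, i):
                cleaned[k] = GAP_MARKER
    return cleaned

print(clean_fluxes([1.0, 32700, 3.0]))                   # short gap: interpolated
print(clean_fluxes([1.0, -99999, -99999, -99999, 5.0]))  # long gap: flagged
```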
Even though the corrected data files were chosen from the NOAA website, they still include background heliospheric particles and particles of magnetospheric origin. These backgrounds must be removed. The cleaned 5-minute averaged fluxes were combined to obtain daily flux averages and were plotted for each year. In each annual plot, a search was made for a period of at least 10 days with the lowest fluxes in each energy channel. The daily-averaged fluxes were averaged over these periods to obtain estimates of the residual backgrounds for each energy channel. The authors strove to include the maximum number of days in these averages in order to reduce the effects of daily fluctuations. These background fluxes were subtracted from the cleaned 5-minute-averaged flux data. When the 5-minute averaged flux was less than the background level, the background-subtracted flux was set to zero.
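The authors located the quiet periods by inspecting annual plots. As a rough automated stand-in (an assumption, not the procedure used in the paper), one can take the 10-day window with the lowest mean daily flux; the function name `quiet_background` is hypothetical.

```python
def quiet_background(daily_flux, window_days=10):
    """Mean flux of the quietest contiguous window of daily averages.

    daily_flux:  one channel's daily-averaged fluxes for a year
    window_days: window length; the text requires at least 10 days
    """
    if len(daily_flux) < window_days:
        raise ValueError("need at least window_days daily values")
    window_sum = sum(daily_flux[:window_days])
    best_sum = window_sum
    # Slide the window one day at a time, updating the running sum
    for end in range(window_days, len(daily_flux)):
        window_sum += daily_flux[end] - daily_flux[end - window_days]
        best_sum = min(best_sum, window_sum)
    return best_sum / window_days

# Synthetic year fragment: active, quiet, active
daily = [5.0] * 10 + [1.0] * 10 + [5.0] * 10
print(quiet_background(daily))
```

A fixed window ignores the authors' preference for including as many quiet days as possible, so this gives only a starting estimate.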

Episode identification
For episode identification, it was decided to create 30-minute averages of the cleaned and background-subtracted fluxes because the higher statistical precision of the 30-minute averages outweighed the resulting loss in temporal resolution. This choice was confirmed by comparing GME and GOES measurements during the period from January 1, 2000 to October 31, 2001, when measurements from both satellites were available. Both 5-minute and 30-minute averages of the GOES data were used to determine onset and end times for episodes of elevated proton flux, and it was concluded that the 30-minute averages gave start and stop times that agreed more closely with those from GME for the episodes examined. All the cleaned and background-subtracted 5-minute data for each month were then converted into 30-minute flux averages.
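Converting 5-minute averages to 30-minute averages is a block mean over six consecutive samples. A minimal sketch with a hypothetical function name; dropping an incomplete trailing block is an assumption, and it presumes gaps have already been filled as described earlier.

```python
def to_30min_averages(flux_5min):
    """Average consecutive groups of six 5-minute fluxes.

    Any incomplete trailing group is dropped (an assumption for this sketch).
    """
    block = 6  # six 5-minute samples per 30 minutes
    usable = len(flux_5min) - len(flux_5min) % block
    return [
        sum(flux_5min[i:i + block]) / block
        for i in range(0, usable, block)
    ]

# Thirteen samples: two complete 30-minute blocks plus one leftover sample
print(to_30min_averages([1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 9]))
```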
Episodes of elevated proton flux were identified visually by graphing the data from channels P2 through P7 for each month from November 2001 to the end of 2016. These graphs were used to find the onset and end times for each episode. An episode was recognized when the flux was seen to rise above the residual background. The onset of each episode was identified as the point at which the flux first rose above the residual background level. The end was identified as the point at which the flux returned to the residual background level. The panels of Figure 1 show the July 4, 2012 episode plotted for two channels. Onset and end times were identified in channels P2 through P7. The onset and end times in channel P2 were used to define the temporal extent of individual episodes. In the higher energy channels, the onset sometimes occurred earlier, and the flux usually returned to background earlier. In addition, there were times when the flux in a higher energy channel dropped to background and then rose again. In these cases, a second onset and end time was determined for the channel, but both periods of elevated flux were defined as part of the same episode. This effect is shown in the lower panel of Figure 1. The flux returns to background around days 194 and 197 in channel P4. All three events in this channel were considered to be part of the same episode since the flux never returned to background in channel P2. After the onset and end times were identified in each energy channel, the 30-minute-averaged fluxes in each channel were converted to fluences using the same procedure as was used for GME.
The fluences for each episode were checked for extremely high values indicating the presence of one or more data gaps that had been filled with flux values of 9.0E6 during the cleaning process described above. The monthly plots of the 30-minute averaged flux for the episode were examined for the channels containing the high fluence to find the data gap or gaps during the episode. These data gaps were filled using data from one of the secondary GOES satellites recommended by NOAA. In those cases where the secondary satellites had a data gap at the same time, logarithmic interpolation of the primary satellite's data was used for these larger data gaps. The authors note that this form of interpolation was never needed for a period longer than two hours.

Normalization
In order to create a single internally consistent dataset, it is necessary to normalize the measurements. As stated in Section 2, GME was chosen as the benchmark instrument based on investigations of various instruments that have been reported in the literature.
To normalize GOES-8 to GME, periods of time where the proton flux was well above background were found during the overlapping period of GOES-8 and GME. The flux had to be well above background to mitigate any residual background left in the data. Due to the differences in the pointing of the instruments on the two satellites, data for the normalization calculations were only used where the proton flux was believed to be isotropic throughout the sky. Thus, periods of time were identified that were after a large event and at least an order of magnitude above the residual background. Since it is known that the particle environment around Earth becomes isotropic during the declining phase of events, this allowed the comparison between the two instruments to be unaffected by the different spins of the satellites.
With the normalization periods identified, the proton fluxes from both instruments were gathered during the chosen time periods. The GME energy channels were combined to match the larger GOES channels. The fluxes for each normalization period were summed and then multiplied by the duration of each measurement in seconds and by the GOES energy channel widths, which are shown in Table 2. The resulting number is the fluence during the normalization period, in units of protons per centimeter-squared-steradian. The fluence from each GOES normalization period was divided by the fluence from the corresponding period in GME. These ratios were then averaged for each channel to create the GOES-8 to GME normalization factors. These ratios are shown in Table 3.
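The fluence and ratio arithmetic in this paragraph can be sketched as follows. All names and the numerical inputs are hypothetical; the ratio direction (GOES fluence divided by GME fluence) follows the sentence above, and the actual factors are those in Table 3.

```python
def period_fluence(fluxes, seconds_per_sample, channel_width_mev):
    """Fluence over one normalization period for one GOES-width channel."""
    return sum(fluxes) * seconds_per_sample * channel_width_mev

def normalization_factor(fluence_pairs):
    """Average of per-period GOES/GME fluence ratios for one channel.

    fluence_pairs: list of (goes_fluence, gme_fluence), one pair per period.
    """
    ratios = [goes / gme for goes, gme in fluence_pairs]
    return sum(ratios) / len(ratios)

# Illustrative numbers only: two normalization periods for one channel
goes_1 = period_fluence([1.0, 2.0], seconds_per_sample=300, channel_width_mev=10.0)
gme_1 = period_fluence([4.5], seconds_per_sample=1800, channel_width_mev=10.0)
print(normalization_factor([(goes_1, gme_1), (3.0, 2.0)]))
```

Averaging ratios per period, rather than taking the ratio of total fluences, gives each normalization period equal weight regardless of its size.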
The GOES-11 and GOES-13 data were normalized to GOES-8. This was done using the normalizations between the different GOES satellites reported by Rodriguez et al. (2014). These normalizations had to be made using intermediate GOES satellites. GOES-13 was normalized to GOES-8 via GOES-10, while GOES-11 was normalized to GOES-8 via GOES-10 and GOES-13. The GOES-8 to GME ratios were then used to normalize GOES-11 and GOES-13 to GME. These results are also shown in Table 3.

Fitting the GOES spectra
The process described in this section is similar to the one used by Robinson (2015). In order to combine the GME and GOES data into a single seamless dataset, it was decided to present the GOES fluence spectra in the GME energy channels.
To do this, it was necessary to distribute the fluences measured in the broader GOES energy channels into the more finely spaced GME channels. This was done by fitting the GOES fluence spectra to spectral representations found in the literature. The GOES differential fluence spectra were fit using four trial spectral models: the Band Function (Band et al., 1993), the Ellison-Ramaty model (Ellison and Ramaty, 1985), the Weibull model (Xapsos et al., 2000), and a power law in energy, $f(E) = A E^{-\gamma}$.
With the GOES differential fluence spectral fit, the median energy value for each channel had to be refined. Due to the width of the GOES energy channels, a new median energy, $\bar{E}$, was calculated for each channel, i.e. a value such that the energies of half the events in the channel fell below it and half above it:

$$\int_{E_0}^{\bar{E}} f(E)\,dE = \int_{\bar{E}}^{E_1} f(E)\,dE \qquad (1)$$

where $E_0$ and $E_1$ are the lower and upper energy bounds of a GOES energy channel and $f(E)$ is the fitted differential fluence spectrum. An iterative procedure was used to fit the spectrum with the four trial spectral models discussed above.
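For the power-law trial model, the median energy defined above (half of the channel's events below, half above) has a closed form. The sketch below is an illustration for that one model only; the function name and the spectral index symbol `gamma` are assumptions, and the other three models would need numerical integration.

```python
import math

def power_law_median_energy(e0, e1, gamma):
    """Energy splitting the integral of E**(-gamma) over [e0, e1] in half."""
    if e0 <= 0 or e1 <= e0:
        raise ValueError("require 0 < e0 < e1")
    if abs(gamma - 1.0) < 1e-12:
        # Integral of 1/E is logarithmic, so the median is the geometric mean
        return math.sqrt(e0 * e1)
    p = 1.0 - gamma
    # The antiderivative is proportional to E**p; average it at the two edges
    return (0.5 * (e0 ** p + e1 ** p)) ** (1.0 / p)

# A flat spectrum gives the arithmetic midpoint; a steep one shifts it downward
print(power_law_median_energy(1.0, 3.0, gamma=0.0))
print(power_law_median_energy(1.0, 3.0, gamma=2.0))
```

The steeper the spectrum, the further the median falls below the channel midpoint, which is why the wide GOES channels need this refinement.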
Each model was used in Equation (1) to refine the estimates of the median energy $\bar{E}$ for each energy channel. Every spectrum was then re-fit with the four trial models using the refined values of $\bar{E}$. The fitting process was considered a success if the reduced χ² was less than 1.5 for any of the fits. If multiple models produced a reduced χ² < 1.5, then the model with the smallest χ² was chosen to represent the spectrum. The χ² value of 1.5 was chosen because most fits no longer described the data well at higher χ² values. Applying an upper limit on the χ² also allowed the fit to be judged on its accuracy without bias from the authors. An example of the Band, Weibull, and Ellison-Ramaty model fits for the April 18, 2014 episode is shown in Figure 2. When the best fit gave a reduced χ² value of 1.5 or higher, the onset and end times for the episode were checked and corrected if necessary to remove residual background. In a few instances the best-fit χ² value remained too high. In these cases, it was found that the episode was dominated by two SEP events, one with a soft energy spectrum and one with a hard spectrum, so that the soft event dominated the episode-integrated spectrum at low energies while the hard event dominated at high energies. In these cases, it was necessary to fit the spectrum with a combination of a power law in energy and one of the other trial functions in order to obtain a reduced χ² < 1.5.
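The selection rule (accept fits with reduced χ² below 1.5, then take the smallest) can be written compactly. The function name and the example values are hypothetical.

```python
def select_model(reduced_chi2_by_model, limit=1.5):
    """Name of the acceptable model with the smallest reduced chi-square.

    Returns None when no model is below the limit, which in the text
    triggers a re-check of the onset and end times or a combined fit.
    """
    acceptable = {
        name: chi2
        for name, chi2 in reduced_chi2_by_model.items()
        if chi2 < limit
    }
    if not acceptable:
        return None
    return min(acceptable, key=acceptable.get)

# Illustrative values only, not taken from the paper
fits = {"band": 1.2, "ellison_ramaty": 1.7, "weibull": 0.9, "power_law": 3.0}
print(select_model(fits))
```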
There were also some small episodes that only had flux present in channel P2 on GOES. These episodes had to be excluded from the database since there was no reliable method to accurately fit their fluence spectra.
With all the episode spectra properly fit, the fits were used to distribute the GOES fluence into the 29 GME energy channels. The corresponding median energy for each GME channel was simply taken to be the midpoint of the energy channel. Since GME has much narrower energy channels than GOES, the midpoint energy of each channel is much closer to the median energy than it is in the GOES channels. The resulting database can be found in the accompanying file. This file contains the start and end times for each episode along with the episode-integrated fluence for the 29 GME energy channels. If there is a value of 0 in a channel, the channel was at background during this episode and had no fluence present during this time. The only exception is for channels 1-6 of the GOES episodes. Since channel P1 in the GOES data had to be discarded due to contamination, there were no data at such low energies in the GOES episodes. The fluence was reported as 0 in these channels, even though the flux was above background at that time, rather than trying to extrapolate from the data that were discarded.

Discussion of episodes of elevated solar energetic proton flux
The database produced in the study contained 690 episodes. However, 50 of these episodes were GOES episodes that only had flux above background in channel P2. These were excluded from our database, as detailed in Section 5. Of the remaining 640 episodes, 479 were identified from the GME data and 161 were identified in the GOES data.
To determine whether the process used to create this database is sound and consistent with existing databases, two episodes in this study were compared to the Solar Energetic Particle Environment Modeling (SEPEM) system (Crosby et al., 2015). The two episodes chosen for this comparison were the January 15, 2005 and the March 4, 2012 episodes. Using these two episodes allowed for a comparison of the normalization factors for two different instruments, GOES-11 to GME and GOES-13 to GME. The SEPEM data had to have the background subtracted. This was done by averaging the flux over a quiet period of time before the episode started. That average was then subtracted from each measurement during the episode. The SEPEM episodes were identified by eye for each channel using the same process used in this paper for the GOES data. The fluence for each channel was then found by multiplying each measurement by the number of seconds in five minutes and then summing over the entire episode to create an episode-integrated fluence. The smaller energy channels in this database were then combined to match the SEPEM energy channels. The results of this comparison can be found in Tables 4 and 5. The SEPEM energy channels are also provided in Table 4.
Looking at both episodes, there is good agreement between the two datasets in the lower energy channels. Channels 8-10 have the largest difference between the two datasets, with the reported fluences in this dataset 100-175% higher than the SEPEM data. The difference in the fluence in these channels can probably be linked to the difference in the processing of the GOES data. The SEPEM data used the corrected GOES energy channels that were reported in Sandberg et al. (2014). The corrected energy channels are narrower than the ones reported by NOAA, especially in the higher energy channels, where the channel width has been reduced by over 50%. The reduction in the width of the GOES energy channels will lead to a smaller episode-integrated fluence.
The rest of this section is broken up into three separate parts. The first part gives a comparison of the length of episodes during solar maximum and solar minimum. The second part discusses the characteristics of episodes of elevated proton flux. The characteristics include duration, the number of peaks in the episodes, and the steepness of the increase and decrease in the flux around each peak. These characteristics were first reported by Robinson (2015). The third part gives a detailed description of the July 4, 2012 episode found in this database and the identification of peaks with outbursts on the Sun.

Episode lengths
Looking at the episode lengths of all 690 episodes identified in this study, we see that there is a wide range of episode lengths. There were 42 episodes that lasted for less than 1 day. The shortest episode occurred on March 21, 2004 and lasted for 4 hours. There were also 4 episodes that lasted over 30 days. The longest episode began on April 24, 1981 and lasted 33 days and 22.5 hours. The average length of an episode in this study was 7.21 days.
The episodes were also split into solar maximum and solar minimum. To do this, the monthly sunspot numbers were combined to create yearly sunspot numbers for a year starting on November 1 of the previous year and ending on October 31. So the sunspot year 1975 runs from November 1, 1974 to October 31, 1975. For the years with the most sunspots, the sunspot maximum was taken to occur at the midpoint of the year. So for the year 1980, the sunspot maximum occurred on May 1, 1980. Solar maximum was taken to last 7 years, with 2.5 years occurring before the sunspot maximum and 4.5 years occurring afterwards. With this guideline, solar maximum was found to have occurred 4 times in this database, with the dates listed in Table 6. The periods of time not listed in Table 6 but covered in this study are assumed to be part of solar minimum.
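The 7-year window rule can be worked through with simple month arithmetic. The helper names are hypothetical, and the output for 1980 is an application of the stated rule, not a value copied from Table 6.

```python
from datetime import date

def shift_months(d, months):
    """Shift a date by a whole number of months, keeping the day of month.

    Only used here with day 1, which exists in every month.
    """
    total = d.year * 12 + (d.month - 1) + months
    return date(total // 12, total % 12 + 1, d.day)

def solar_max_window(peak_sunspot_year):
    """Start and end of the 7-year solar maximum period.

    The sunspot year runs November 1 to October 31, so its midpoint is May 1;
    the window spans 2.5 years (30 months) before and 4.5 years (54 months) after.
    """
    midpoint = date(peak_sunspot_year, 5, 1)
    return shift_months(midpoint, -30), shift_months(midpoint, 54)

start, end = solar_max_window(1980)
print(start.isoformat(), end.isoformat())
```

For a 1980 peak this gives a window from November 1, 1977 to November 1, 1984, aligned with the November-to-October sunspot year.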
Using these dates, 584 episodes occurred during solar maximum and 106 during solar minimum. A histogram of episode lengths for both solar maximum and solar minimum is shown in Figure 3. The average length of episodes during solar maximum and solar minimum was 7.35 and 6.42 days, respectively. Since the average lengths differ by almost a full day, a two-sample Kolmogorov-Smirnov test was performed to see whether the two episode-length distributions came from the same parent distribution. The p-value for this test was 0.294, so the hypothesis that the episode lengths during solar maximum and solar minimum come from the same parent distribution cannot be rejected. This suggests that an episode occurring during solar minimum is as likely to be long as one occurring during solar maximum. Since longer episodes tend to have higher episode-integrated fluences, it is useful to know that a solar particle episode has a similar chance of being long during both solar maximum and solar minimum.
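For readers who want to repeat this kind of comparison, the sketch below computes the two-sample Kolmogorov-Smirnov statistic and a p-value in plain Python. It is an illustration under stated assumptions: the p-value uses the standard asymptotic series with a small-sample correction to the effective sample size, which may not be the implementation the authors used, and the episode lengths in the example are hypothetical values, not data from this database.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Return (D, p) for the two-sample Kolmogorov-Smirnov test.

    D is the largest distance between the two empirical CDFs.
    p is an asymptotic approximation (assumption: adequate for the
    sample sizes of interest; exact small-sample p-values differ).
    """
    a, b = sorted(sample_a), sorted(sample_b)
    n_a, n_b = len(a), len(b)
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate their difference at every data point.
    d = max(abs(bisect_right(a, x) / n_a - bisect_right(b, x) / n_b)
            for x in a + b)
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return d, 1.0  # the series below converges poorly here; p is ~1
    p = 2.0 * sum((-1) ** (j - 1) * math.exp(-2.0 * (j * lam) ** 2)
                  for j in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


# Hypothetical episode lengths in days, for illustration only.
solar_max_lengths = [0.5, 2.0, 3.5, 5.0, 7.0, 9.5, 12.0, 20.0]
solar_min_lengths = [1.0, 2.5, 4.0, 6.0, 8.0, 15.0]
d_stat, p_value = ks_two_sample(solar_max_lengths, solar_min_lengths)
print(f"D = {d_stat:.3f}, p = {p_value:.3f}")

# Consistency check on the averages quoted in the text.
overall_mean = (584 * 7.35 + 106 * 6.42) / 690
print(f"weighted mean length = {overall_mean:.2f} days")
```

The last two lines confirm that the solar maximum and solar minimum averages, weighted by their episode counts, reproduce the overall average of 7.21 days.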

Characteristics of episodes
The method used for defining episodes creates a database in which episodes are so weakly correlated that they can be treated as statistically independent. In reality, these episodes are not completely independent. However, if it can be assumed that an active region on the Sun does not contribute to more than one episode, then the occurrence of episodes will be approximately independent. Although active regions that contribute events to more than one episode have been found in the database, they are rare (<10%). For this reason, the episodes in this database can be considered statistically independent of one another.
In general, the authors observed that longer episodes had more peaks than shorter episodes. A peak occurs when the proton flux in a channel reaches a local maximum. Most of the time, a peak can be traced to part of a solar particle storm. However, there were many instances where episodes of roughly the same length had widely different numbers of peaks. This is clearly demonstrated by the episodes starting on July 16, 2002 and August 14, 2002 (see Figure 4). The July episode was observed for 16 days but had only 3 peaks during that time. The August episode was observed for 14 days but contained 13 separate peaks. An interesting feature that occasionally appeared in episodes was a plateau-like period, i.e., an interval during which the flux stays fairly constant.
Another trait of the episodes in this dataset is the steepness of the rise and fall of each episode's flux around a peak. In the top panel of Figure 5, the July 4, 2012 episode shows an example of the 30-minute flux increasing rapidly over a few hours. Not all episodes show the flux changing dramatically in a short time. The bottom panel of Figure 5 shows the June 18, 1983 episode, which had a peak with the flux rising slowly over about 3 days. The July 4, 2012 and June 18, 1983 episodes are at opposite ends of the spectrum and demonstrate how different two episodes can be. When an episode was above background in channel P7, it was almost always a steeply rising episode. Figure 6 shows the June 4, 2011 episode, which is an example of steeply decreasing flux at the end of an episode. Even though its rise at the beginning is not as steep as in other episodes, it still has a relatively steep end. Compared with the June 18, 1983 episode (see bottom panel of Fig. 5), the June 4, 2011 episode has a rapid decrease in flux at its end.
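The definition of a peak as a local maximum of the flux in a channel can be sketched as follows. This is only an illustration: the function name and the optional `min_flux` cut are hypothetical, the flux values are made up, and any smoothing or prominence criteria the authors applied when counting peaks are not described here and are not modeled.

```python
def find_peaks(flux, min_flux=0.0):
    """Return the indices of local maxima in a flux time series.

    A point counts as a peak when it is higher than the previous sample
    and at least as high as the next one, so the first sample of a flat
    top is reported once. `min_flux` (hypothetical) discards peaks that
    do not rise above a chosen level, e.g. the background.
    """
    peaks = []
    for i in range(1, len(flux) - 1):
        rising = flux[i] > flux[i - 1]
        not_rising_after = flux[i] >= flux[i + 1]
        if rising and not_rising_after and flux[i] > min_flux:
            peaks.append(i)
    return peaks


# Made-up flux samples, for illustration only.
example_flux = [1, 3, 2, 5, 5, 4, 1, 2, 1]
print(find_peaks(example_flux))                # all local maxima
print(find_peaks(example_flux, min_flux=2.5))  # only peaks above 2.5
```

Counting the returned indices for two episodes of similar duration gives the kind of comparison made above between the July 16, 2002 and August 14, 2002 episodes.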
The final trait of episodes that the authors would like to discuss is the hardness of their spectra. A soft spectrum is one that declines quickly with increasing energy, while a hard spectrum is one that declines slowly. A soft spectrum is seen in episodes that are above background in only one or two of the lowest energy channels, for example, all the P2-only episodes that had to be dropped from the dataset. A pure hard-spectrum episode (one with only hard events) is rare, but the October 14, 2013 episode is an example of one and is shown in Figure 7. The hard spectrum occurs around day 285 in channels P6 and P7. Usually a hard-spectrum episode contains a mix of events with soft and hard spectra. There were a few such episodes where events with soft spectra contributed a significant portion of the episode-integrated fluence in the lower energy channels while events with hard spectra dominated the higher energy channels. In the June 4, 2011 episode shown in Figure 8, the last peak (day 168) is from a soft-spectrum event that contributes slightly more than the first peak (day 159) to the episode-integrated fluence. However, this event is absent in channels P5 and above, whereas the first peak is from a hard-spectrum event that is still quite large in channel P7.

July 4, 2012 episode
The proton flux often peaks multiple times during an episode. These peaks can sometimes be associated with solar energetic particle events caused by coronal mass ejections (CMEs) or solar flares. Some peaks are storm-time events, caused by the arrival of an interplanetary shock at Earth. Still others result from propagation effects caused by changes in the magnetic connectivity of the Earth to the accelerating region.
As an example, a detailed look at the July 4, 2012 episode is given here. This episode, which began on July 4 (day 186) and ended on August 7 (day 220) of 2012, is very complex and contains many features. Figure 9 shows the episode as recorded in the six highest energy GOES EPS channels. Table 7 lists some of the features that can be identified in the figure. We have attempted to associate CMEs or flares with these features using the SOHO LASCO CME Catalogue, STEREO movies, and SolarMonitor.org. Ten solar outbursts were identified; they are listed in Table 8. The last column of Table 7 lists the outbursts from Table 8 that appear to be responsible for the features listed.
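Times in this section are given as days of the year 2012, in some cases with a decimal fraction. The helper below is a small sketch for converting such values to calendar dates and for turning the difference between two day numbers into hours. The function names are hypothetical, and it is an assumption that day 1.0 corresponds to 00:00 on January 1 and that the fraction measures elapsed time within the day.

```python
from datetime import datetime, timedelta


def doy_to_datetime(year, day_of_year):
    """Convert a (possibly fractional) day of year to a datetime.

    Assumption: day 1.0 is 00:00 on January 1 of `year`.
    """
    return datetime(year, 1, 1) + timedelta(days=day_of_year - 1)


def delay_hours(earlier_day, later_day):
    """Hours elapsed between two fractional day-of-year values."""
    return (later_day - earlier_day) * 24.0


# Start and end of the episode as quoted in the text.
print(doy_to_datetime(2012, 186))   # July 4, 2012
print(doy_to_datetime(2012, 220))   # August 7, 2012
print(delay_hours(186, 220) / 24.0, "days between the two dates")
```

The same helper can be applied to any pair of day numbers to check a quoted delay between a solar outburst and the corresponding flux peak.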
The flux profile in channel P2 is ragged and has many peaks. The seventeen most prominent are listed in Table 7. It is instructive to attempt to follow these peaks into the higher energy channels. The first feature listed in Table 7 only appears distinctly above the background in channel P4. This feature appears to be due to a small and soft SPE. The particles in this SPE may have been accelerated by the flare that occurred on day 186.92. This flare gave rise to a CME, but its speed was only 556 km/s, so it may not have been an effective particle accelerator. The flare and CME both launched from 22° west. The onset of this SPE was very close to the time of the flare, and the SPE appears to have already been in progress when the CME launched. This SPE also has a gradual onset, suggesting that Earth was not well connected to the acceleration site. If we assume that the solar wind speed measured at Earth is typical of the speed over most of the distance back to the Sun, we can estimate the foot-point location at the time of the CME. Using data from the SWEPAM instrument on the ACE satellite, we found that the average speed of the solar wind in transit from the Sun to the Earth during the time preceding the onset of this CME was 571 km/s. For this average speed, the predicted foot-point would be at 41°W, some 19° from the flare site. Since flare acceleration sites are small, we would not expect the Earth to be well connected to this flare, which is consistent with the SPE's slow onset.
Feature #2 in Table 7 is the peak of an SPE with a hard spectrum. This SPE is well above background in all the channels. We identify it with the second CME listed in Table 8, which launched on day 188.98 from 49° west of the center of the Sun with a speed of 1828 km/s. The average speed of the solar wind in transit from the Sun to the Earth during the time preceding the onset of this CME was 469 km/s. For this average speed, the predicted foot-point would be at 50.7°, so this CME was very well connected to Earth, which explains the rapid rise of the flux in all the energy channels.
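The foot-point estimates in the two preceding paragraphs can be approximated with a simple Parker-spiral calculation, in which the field line through Earth is wound back to the Sun at a constant solar wind speed. The sketch below is an assumption-laden illustration rather than the authors' procedure: the solar rotation period (25.38 days, sidereal), the use of the full 1 AU distance, and the function name are all choices made here, since the text does not state the constants that were used.

```python
import math

AU_KM = 1.495978707e8           # 1 astronomical unit in km
SIDEREAL_ROTATION_DAYS = 25.38  # assumption: solar rotation period in days


def footpoint_longitude_deg(v_sw_km_s,
                            rotation_days=SIDEREAL_ROTATION_DAYS,
                            distance_km=AU_KM):
    """Longitude (degrees west of central meridian) of the foot-point of
    the field line connected to Earth, for a constant solar wind speed.

    During the transit time distance/v_sw the Sun rotates by
    omega * distance / v_sw, which sets the westward offset.
    """
    omega = 2.0 * math.pi / (rotation_days * 86400.0)  # rad/s
    return math.degrees(omega * distance_km / v_sw_km_s)


for speed in (571.0, 469.0):  # average solar wind speeds quoted in the text
    print(f"{speed:.0f} km/s -> {footpoint_longitude_deg(speed):.1f} deg W")
```

With these assumed constants the sketch gives roughly 43° and 52° for 571 km/s and 469 km/s, within about two degrees of the 41° and 50.7° quoted above. The residual difference presumably reflects a different rotation rate or other details of the authors' procedure.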
The third feature in the table is a peak that becomes distinct in channel P3, peaking on day 191.3, and can be seen above background in all the higher energy channels. Like the second feature, this one is an SPE; it results from the third CME listed in Table 8. This CME launched on day 190.7 from the same active region as the first two. By this time the active region was located at 74° west. The speed of this CME was 1495 km/s. Using the procedure described above, we estimate the foot-point to be at 54.8°, some 19° east of the site where the CME launched; it is therefore likely that the east side of the CME was better connected to Earth than its front. In Figure 9 it can be seen that the onset of this SPE is slower than that of the second one in channels P3, P4 and P5 but similar to it in channels P6 and P7. This behavior is consistent with theoretical predictions that higher energy particles are accelerated more effectively on the sides of CMEs, where the shock normal is quasi-perpendicular to the downstream field, while lower energy particles are accelerated predominantly near the front of the shock, where the shock normal is quasi-parallel to the downstream field.
Fig. 8. The June 4, 2011 episode demonstrates a mix of hard- and soft-spectrum events. The beginning of this episode contains a hard-spectrum event that persists into channel P7, while the end has a soft-spectrum event that is not seen in the higher energy channels. This image has been modified from Robinson (2015).
About 17 hours after this SPE, a small peak can be seen in channels P2, P3 and P4 on day 192.0. This peak may be associated with the fast CME that erupted on day 191.24. This CME was located near the west limb. It was from the same active region as the previous three CMEs. This could explain the 18-hour delay between the CME and the peak of the SPE. This SPE appears to have a gradual onset, but it is almost buried by the flux from the two previous SPEs, so most of the rise to the peak cannot be seen.
On day 194.9, there was another SPE peak. Just before the onset of this event, the proton flux had returned to background in all channels except P2, so the sudden onset of this event could be seen above background in all the channels, even though this event was softer than the second and third SPEs in this episode. This SPE is due to a CME from a new active region. It launched from 1° west on day 194.70 with a speed of 885 km/s. Because this CME is slower, it will be farther from the solar surface when it exceeds the speed of sound and begins to drive a shock capable of accelerating particles. The SOHO LASCO CME catalog reports its angular extent to be 76°. This may help to explain the sudden onset of this event even though the CME launched from near the center of the Sun.
The peak at 194.9 is followed by five small peaks. The last of these peaks is on day 196.8. It is coincident with an interplanetary shock that arrived at ACE on day 196.72. The shock also created a sharp compression in the interplanetary magnetic field as recorded on GOES-13 and GOES-15, so this appears to be a storm-time event. We have been unable to identify a solar or interplanetary event associated with the first four of these peaks.
There is a closely spaced group of three peaks of almost identical amplitude between days 199.8 and 200.0. The first of these (and perhaps all three) is an SPE that results from a CME which launched from 65°W on day 199.58. The onset of the first of these peaks is at 199.65. Using the same procedure discussed above, we estimate the foot-point to be at 50°W. LASCO reports that the angular size of this CME is 176°, making it exceptionally wide. This appears to have enabled the Earth to be well connected to the CME shock that was accelerating the particles, and it explains the sharp onset in channels P2 through P5. This SPE is intermediate in its hardness, falling to near background in channels P6 and P7.
The tenth entry in Table 7 is a feature that peaks early on day 201 and can be seen in channels P2, P3, and P4. It occurs while the flux from the preceding event is falling, so the entire onset cannot be seen, but it looks sharp in P2. There are no CMEs and no interplanetary shocks at the time of this event. There is only a C2.4 class X-ray flare at this time, and it is at 40°E, so we have no explanation for this feature.
There is a hard SPE which peaks on day 201.5 ± 0.1 in all the channels. It is due to the CME which launched from 65°W on day 201.23 with a speed of 1631 km/s. The onset is more sudden in the higher energy channels, perhaps because the Earth is connected to the east side of the CME and sees the higher energy particles more promptly.
The episode contains a broad feature with a soft spectrum that peaks on day 202.2. It can be seen clearly above background in channels P2 and P3. In P4 it appears to be part of an even broader feature extending to earlier times. In channel P5 there is just a hint of it in the declining flux that follows the preceding hard SPE. There is no flare or CME at the time of the peak, and no interplanetary shock passage. There is, however, a flare at 201.68 on the back side of the Sun at about 150°E, with an associated small CME with a speed of 731 km/s. The characteristics of this feature are consistent with a poor and delayed connection to a remote backside event, so it is possible that this feature was caused by the small CME.
Another hard SPE peaks around day 205.7. It is above background in all the channels, but it is on the declining side of the previous events in channels P2, P3 and P4. This SPE appears to be associated with a CME that occurred on the back side of the Sun at 205.11 near 160°E. The snow in the STEREO A EUVI 195 image confirms that this CME produced solar energetic particles. This SPE began to appear in the STEREO A EUVI 195 image only at 205.43, so the 14-hour delay in the particles reaching Earth is not surprising. The strength and relatively quick commencement of the event in channels P4 through P7 are unusual for an event that is so poorly magnetically connected to Earth.
The last two features identified in Table 7 are at 209.2 and 215.7. Both are above background only in channels P2, P3 and P4. There is no fast CME or gradual X-ray flare associated with the first feature, and no interplanetary shock reached Earth at this time, so we have no explanation for it. The feature peaking at 215.7 appears to be a soft SPE whose onset is very close in time to a CME that launched from 108°W, just behind the west limb. This CME has a speed of only 563 km/s, which could account for the softness of this SPE. The onset in channel P3 is gradual. It is difficult to judge the speed of the onset in P2 and P4, as the event is only slightly above background in these channels.
This episode was one of the largest of Solar Cycle 24. Table 9 gives the start and stop times of solar proton events according to the Space Weather Prediction Center (SWPC) at NOAA. SWPC identifies a solar proton event when the integral proton flux for the >10 MeV channel has a value above 10 proton flux units. The SWPC events and the description of how they are identified can be found at ftp://ftp.swpc.noaa.gov/pub/indices/SPE.txt. SWPC identified four different solar proton events during this time, and they have been matched to the features identified in this database. The first event in Table 9 is believed to be linked to feature #2 in Table 7. The associations identified in this paper match the SWPC identifications fairly well: the CME and flare times are the same, and the flare locations are very close to each other on the surface of the Sun. The second event in Table 9 is the same one as feature #7 in this paper. Once again, our association matches the SWPC identification well.
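The threshold criterion quoted above (integral >10 MeV proton flux above 10 proton flux units) can be illustrated with a simple interval finder. This is a sketch only: the function name and the sample values are hypothetical, and any additional persistence or end-of-event rules in the operational SWPC definition are not modeled.

```python
def threshold_intervals(times, flux, threshold=10.0):
    """Return (start, end) pairs during which `flux` exceeds `threshold`.

    `times` and `flux` are equal-length sequences. An interval starts at
    the first sample above the threshold and ends at the first later
    sample at or below it (or at the last sample if the series ends
    while the flux is still elevated).
    """
    intervals = []
    start = None
    for t, f in zip(times, flux):
        if f > threshold and start is None:
            start = t
        elif f <= threshold and start is not None:
            intervals.append((start, t))
            start = None
    if start is not None:
        intervals.append((start, times[-1]))
    return intervals


# Made-up integral flux values in proton flux units, for illustration only.
sample_times = [0, 1, 2, 3, 4, 5, 6, 7]
sample_flux = [1, 12, 50, 9, 3, 20, 30, 15]
print(threshold_intervals(sample_times, sample_flux))
```

Because a single episode in this database can contain several such threshold crossings, one episode may correspond to more than one SWPC event, as is the case for the four SWPC events matched to the July 4, 2012 episode.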
The third event in Table 9 is more challenging to associate with one particular feature from this paper. Because two features were identified within 5 hours of each other in this paper, the slightly later feature was associated with the CME and flare. Because SWPC had only one event, the earlier feature was the one associated with the flare and CME. However, the flare and CME dates were once again very close, with the CME time matching exactly and the flare times within 6 hours of each other. The last event identified by SWPC is believed to be associated with feature #15 in this paper. The association made in this paper does not match the SWPC event. While the CME launch times match, SWPC does not provide a flare time for this event. Moreover, the active region location given by SWPC is not the same as the flare location in this paper. From the information at hand, the authors believe that their association differs from that of SWPC.
One last thing to note is that the normalization factors and the GOES spectral fitting process have changed since they were first described by Robinson (2015). The authors feel the methods described in the previous sections give better results than those previously reported. The normalization factors were recalculated to include only instances when the two instruments would most likely be measuring identical proton environments. This removed most of the episodes where the proton fluxes around the satellites were not identical and could affect the overall normalization factors between the satellites. The GOES spectral fitting process now calculates the median energy for each channel for every fit and not just for the fit with the best χ². This removes the bias from the first best-fitting model on the second round of fitting. As evidence that these two changes improved the data, the authors point out that in the GOES data from November 1, 2001 to December 31, 2013, there were 18 bad fits in the database after the second iteration of fitting using the method in Robinson (2015). With the new approaches described in this paper, only 9 episodes had to be reexamined after the second iteration of fitting, cutting the number of bad fits in half.
In this paper, a new record of SPEs was described. Data were combined to create one of the largest continuous records of episode-integrated SPE fluence spectra to date. This database combined data from two completely different instruments across multiple satellites to create a seamless database for the years 1973-2016. The episodes in this database were compared to the SEPEM system, and good agreement was observed in the low energy channels. The differences in the higher energy channels were linked to the different processing techniques used on the GOES data. Episode lengths during solar maximum and solar minimum were also compared, and no significant dependence of episode length on solar cycle phase was found. Characteristics of the episodes in this database were discussed. These characteristics can be useful for better understanding how these episodes could affect space crews and missions. Finally, the July 4, 2012 episode was examined extensively. CMEs and flares were associated with the different peaks seen in the episode. The identifications made in this study were also compared to those made by SWPC.
In the future, the authors hope to continue to add to this database. With the launch of the GOES-16 satellite in late 2016, there will be different energy channels that will need to be normalized to the current database before the data can be added. The authors also plan to do a more in-depth study of individual episodes in this database to find features of interest.

Fig. 1. The proton flux in two different channels is plotted for the July 4, 2012 episode. The red line in both graphs marks the times during which the episode is above background in each channel. Top panel: The proton flux for channel P2. Bottom panel: The proton flux for channel P4. The flux returned to background on day 194 and between days 197 and 199 in channel P4.

Fig. 2. This figure shows the spectral fitting process using the Band, Weibull, and Ellison-Ramaty functions for the April 18, 2014 episode. The top panel shows the Band function fit, which had a χ² value of 2.02. The middle panel shows the Ellison-Ramaty fit, which had a χ² value of 0.9839. The bottom panel shows the Weibull fit, which had a χ² value of 1.214. The Ellison-Ramaty model was chosen to fit this episode.

Fig. 3. This histogram shows the probability of episode lengths found in this database during solar maximum and solar minimum. As can be seen in the figure, the distributions look somewhat similar.

Fig. 4. Channel P2 flux measurements are plotted to show how episodes can have different numbers of peaks. The peaks in each episode are marked by the circles on the graphs. These images have been modified from Robinson (2015). Top panel: The July 16, 2002 episode. Bottom panel: The August 14, 2002 episode.

Fig. 5. The two graphs demonstrate rapid and slow flux changes at the beginning of a peak. Top panel: The July 4, 2012 episode. This episode has multiple steep peaks, with the steepest occurring around day 195. Bottom panel: The June 18, 1983 episode, shown to illustrate a slowly rising peak. The episode takes close to 3 days to reach its peak, which occurs at day 173. The GME data used in this figure were combined to form the GOES P2 channel.

Fig. 6. Channel P2 of the June 4, 2011 episode is plotted to demonstrate a rapidly decreasing flux at the end of an episode.

Fig. 7. The October 14, 2013 episode demonstrates a mix of hard- and soft-spectrum events. The hard-spectrum event occurs around day 285, while the soft-spectrum event occurs between days 288 and 289.

Fig. 9. The July 4, 2012 episode is shown in this image for all channels.

Table 3. GOES to GME normalization factors.

Table 6. Dates of solar maximum.

Table 8. CMEs identified during the episode beginning on July 4, 2012.

Table 9. Solar proton events identified by the Space Weather Prediction Center at NOAA.