Authors: Jorge Padial (Vanderbilt University), Kelly Holley-Bockelmann (Vanderbilt University), Eric Jonas (University of Chicago)
Predicting extreme space-weather events is important for protecting electrical transmission and distribution grids from strong geomagnetic storms. Attempts to predict when and where the strongest solar flares will occur have used a range of machine learning models, all yielding similar evaluation metrics (TSS ~ 0.72 ± 0.13). This stagnation may indicate either that machine learning approaches have reached their limit for this task or that the data being used are incorrectly labeled. Evidence for the latter is that more than 50% of flares in the NOAA database have no location metadata. To test the second hypothesis, we developed the Automatically Labeled EUV and X-Ray Incident SolarFlare (ALEXIS) pipeline. ALEXIS learns solar flare locations by recreating the full-disk X-ray time series (XRS) as a weighted linear combination of discrete regions observed in the multi-pixel EUV images. The ALEXIS catalog returns flare peak times, coordinates, the corrected scaled X-ray magnitude, and the associated NOAA active region with a HARP identifier number, independently of any external data products (SWPC catalog, SolarSoft catalog, or HARP catalog). A proof-of-concept run of ALEXIS was performed locally, parsing a total of 14 TB of AIA and SXI images in search of 1054 randomly selected solar flares of C-class magnitude and above. Comparison of the ALEXIS catalog with those produced by SWPC and SolarSoft shows that these canonical databases need revisiting for 59% and 17% of the subsample, respectively. In addition, ALEXIS increases the number of flares reported by 15% and 16% relative to SWPC and SolarSoft, respectively. However, ALEXIS misses 7% of the subsample and returns 6% false positives. The pipeline can be modified to ingest near-real-time data, providing a new way to monitor for imminent flaring regions, and it has provided the first observational evidence for sympathetic and synchronous flares.
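The idea of recreating the full-disk X-ray time series from a weighted linear combination of regional EUV light curves can be illustrated with a small sketch. This is not the ALEXIS implementation: the use of ordinary least squares, the rule of assigning the flare to the region with the largest weighted contribution at the X-ray peak, the toy data, and all function names (`fit_region_weights`, `locate_flare`, `solve_linear_system`) are assumptions made for illustration only.

```python
# Hypothetical sketch: fit a full-disk X-ray light curve as a weighted sum of
# per-region EUV light curves, then attribute the flare to the region that
# contributes most at the X-ray peak. Ordinary least squares is an assumption;
# the abstract does not state how ALEXIS estimates its weights.


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("region light curves are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):  # back substitution
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_region_weights(region_curves, xray_curve):
    """Least-squares weights w minimising |sum_j w_j * region_j - xray|^2."""
    n = len(region_curves)
    # Normal equations: (R R^T) w = R y, with one row of R per region.
    gram = [[sum(p * q for p, q in zip(region_curves[i], region_curves[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(p * y for p, y in zip(region_curves[i], xray_curve))
           for i in range(n)]
    return solve_linear_system(gram, rhs)


def locate_flare(region_curves, xray_curve):
    """Return (peak time index, index of dominant region, fitted weights)."""
    weights = fit_region_weights(region_curves, xray_curve)
    peak = max(range(len(xray_curve)), key=lambda t: xray_curve[t])
    contributions = [w * curve[peak]
                     for w, curve in zip(weights, region_curves)]
    dominant = max(range(len(contributions)), key=lambda j: contributions[j])
    return peak, dominant, weights


# Toy data (invented): three regions, six time steps; region 1 brightens at t = 3.
regions = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 4.0, 2.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
]
true_weights = [0.5, 2.0, 0.1]
xray = [sum(w * r[t] for w, r in zip(true_weights, regions)) for t in range(6)]

peak_index, dominant_region, fitted = locate_flare(regions, xray)
```

In this toy case the synthetic X-ray curve is an exact combination of the regional curves, so the fit recovers the weights used to build it and the flare is assigned to the region that brightens. Real light curves are noisy and the regions are far more numerous, so a constrained or regularized fit would likely be needed.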
We have scaled ALEXIS to 100 TB of AIA images, using more than 10,000 node hours at the Argonne Leadership Computing Facility, to process all 8300 flare entries compiled by NOAA from May 2010 to March 2020. A preliminary analysis of the full catalog will be presented.