A Machine Learning-Ready Dataset of GAMERA-GL CME Simulations for Forward and Inverse Modeling of the Interplanetary Magnetic Field at L1

Authors: Viacheslav Sadykov (Georgia State University), Elena Provornikova (Johns Hopkins Applied Physics Laboratory), Dustin Kempton (Georgia State University), Djalil Sawadogo (Georgia State University), Evangelos Paouris (Johns Hopkins Applied Physics Laboratory), Charles N Arge (NASA Goddard Space Flight Center), Tamima Saba (Georgia State University), Debosmita Mallick (Georgia State University), Rafal Angryk (Georgia State University), Petrus C Martens (Georgia State University)

Understanding coronal mass ejections (CMEs) and their impact on the geomagnetic environment is among the most critical problems in space weather research. Recent advances in physics-based CME modeling have led to the development of extensive CME simulation datasets and to the use of data-driven techniques for understanding the physics of CME propagation and impact. We present a machine learning-ready dataset constructed from an existing grid of GAMERA-GL simulations of CMEs propagating in the inner heliosphere. The dataset covers three background solar wind options (corresponding to solar activity minimum and to its rising and declining phases) and includes Gibson-Low flux ropes of varying properties initiated at different locations, resulting in ~23,000 complete simulation runs and ~7.4M unique time series of solar wind properties at hypothetical L1 locations. We consider applications of this dataset to several problems, including (1) prediction of key CME properties at the L1 point, such as the Bz magnetic field component of the CME and the CME arrival time, and (2) development of an inverse model that constrains the initial properties of the CME close to the Sun from the L1 time series dynamics and the CME geometry. We highlight how combining large simulation grids with machine learning approaches can help us understand CME dynamics and enhance space weather forecasting.
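
To make the forward-modeling targets named above more concrete, the following minimal Python sketch shows one way an arrival time and a minimum Bz could be derived from a single L1 time series. It is an illustration only, not the procedure used to build the dataset: the function name `summarize_l1_series`, its parameters, the speed-jump arrival criterion, and the synthetic input values are all assumptions introduced here, and the actual dataset format and labeling may differ.

```python
from statistics import median


def summarize_l1_series(t_hours, speed, bz, jump_fraction=0.2, n_background=10):
    """Return (arrival_time, min_bz) for one L1 time series.

    Hypothetical criterion (an assumption, not taken from the dataset):
    the CME "arrives" at the first sample whose speed exceeds the median
    of the first n_background samples by more than jump_fraction.
    """
    background = median(speed[:n_background])
    threshold = (1.0 + jump_fraction) * background
    for i, v in enumerate(speed):
        if v > threshold:
            # Minimum Bz is taken over all samples from arrival onward.
            return t_hours[i], min(bz[i:])
    # No disturbance detected in this series.
    return None, None


# Synthetic example values, invented for illustration only.
t = [float(h) for h in range(100)]                       # time in hours
speed = [600.0 if h >= 40.0 else 400.0 for h in t]       # km/s, step at 40 h
bz = [-12.0 if 45.0 <= h < 60.0 else 1.0 for h in t]     # nT, southward interval

arrival, min_bz = summarize_l1_series(t, speed, bz)
print(arrival, min_bz)
```

In a machine learning setting, quantities of this kind would serve as regression targets for the forward problem, while the full time series would serve as input features for the inverse problem.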