Baselight

NASA And NOAA Satellites Solar-Wind Dataset

Predict DISTURBANCES In Earth’s Geomagnetic Field

@kaggle.arashnic_soalr_wind

About this Dataset

Context

The efficient transfer of energy from solar wind into the Earth’s magnetic field causes geomagnetic storms. The resulting variations in the magnetic field increase errors in magnetic navigation. The disturbance-storm-time index, or Dst, is a measure of the severity of the geomagnetic storm.

As a key specification of the magnetospheric dynamics, the Dst index is used to drive geomagnetic disturbance models such as NOAA/NCEI’s High Definition Geomagnetic Model - Real-Time (HDGM-RT). Additionally, magnetic surveyors, government agencies, academic institutions, satellite operators, and power grid operators use the Dst index to analyze the strength and duration of geomagnetic storms.

Empirical models were proposed as early as 1975 to forecast Dst solely from solar-wind observations made at the Lagrangian (L1) position by satellites such as NOAA’s Deep Space Climate Observatory (DSCOVR) or NASA's Advanced Composition Explorer (ACE). Over the past three decades, several models have been proposed for forecasting Dst from solar-wind data, including empirical, physics-based, and machine learning approaches. While the ML models generally perform better than models based on the other approaches, there is still room to improve, especially when predicting extreme events. More importantly, solutions are needed that work on the raw, real-time data streams and are agnostic to sensor malfunctions and noise. Improved models can provide more advanced warning of geomagnetic storms and reduce errors in magnetic navigation systems.

Content

The data is composed of solar wind measurements collected from two satellites: NASA's Advanced Composition Explorer (ACE) and NOAA's Deep Space Climate Observatory (DSCOVR).

To ensure similar distributions between the training and test data, the data is separated into three non-contiguous periods. All data are provided with a period and timedelta multi-index which indicates the relative timestep for each observation within a period, but not the real timestamp. The period identifiers and timedeltas are common across datasets.
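
As a rough sketch of how the period and timedelta multi-index might be read, the snippet below parses a few hypothetical rows with Python's standard library. The column names `period` and `timedelta`, the period labels, and the `"0 days 00:01:00"` text format are assumptions made for illustration; this description does not specify them, so check them against the actual files.

```python
import csv
import io
import re
from datetime import timedelta

# Hypothetical sample rows; the header names and the timedelta text format
# are assumptions, not taken from the dataset description.
SAMPLE = """period,timedelta,bt
train_a,0 days 00:00:00,5.1
train_a,0 days 00:01:00,5.3
train_b,0 days 00:00:00,4.8
"""

_TD_RE = re.compile(r"(\d+) days (\d+):(\d+):(\d+)")


def parse_timedelta(text):
    """Convert an 'N days HH:MM:SS' string into a datetime.timedelta."""
    match = _TD_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized timedelta: {text!r}")
    days, hours, minutes, seconds = map(int, match.groups())
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def load_indexed(stream):
    """Return {(period, timedelta): remaining columns} for joining across files."""
    rows = {}
    for row in csv.DictReader(stream):
        key = (row.pop("period"), parse_timedelta(row.pop("timedelta")))
        rows[key] = row
    return rows


table = load_indexed(io.StringIO(SAMPLE))
```

Because the same timedelta can appear in more than one period, the key has to include both fields. Keying every file the same way makes it straightforward to look up matching rows across the datasets.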

The primary feature data are provided in solar_wind.csv. They are composed of solar-wind readings from the ACE and DSCOVR satellites:

  • bx_gse - Interplanetary-magnetic-field (IMF) X-component in geocentric solar ecliptic (GSE) coordinates (nanotesla (nT))
  • by_gse - Interplanetary-magnetic-field Y-component in GSE coordinates (nT)
  • bz_gse - Interplanetary-magnetic-field Z-component in GSE coordinates (nT)
  • theta_gse - Interplanetary-magnetic-field latitude in GSE coordinates (defined as the angle between the magnetic vector B and the ecliptic plane, being positive when B points North) (degrees)
  • phi_gse - Interplanetary-magnetic-field longitude in GSE coordinates (the angle between the projection of the IMF vector on the ecliptic and the Earth–Sun direction) (degrees)
  • bx_gsm - Interplanetary-magnetic-field X-component in geocentric solar magnetospheric (GSM) coordinates (nT)
  • by_gsm - Interplanetary-magnetic-field Y-component in GSM coordinates (nT)
  • bz_gsm - Interplanetary-magnetic-field Z-component in GSM coordinates (nT)
  • theta_gsm - Interplanetary-magnetic-field latitude in GSM coordinates (degrees)
  • phi_gsm - Interplanetary-magnetic-field longitude in GSM coordinates (degrees)
  • bt - Interplanetary-magnetic-field magnitude (nT)
  • density - Solar wind proton density (N/cm^3)
  • speed - Solar wind bulk speed (km/s)
  • temperature - Solar wind ion temperature (Kelvin)
  • source - Starting in 2016, the solar wind data for any given point in time can be sourced from either DSCOVR or ACE satellites depending on the quality. "ac" denotes it was sourced from ACE, and "ds" from DSCOVR.
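
The latitude and longitude definitions in the list above can be illustrated with a short sketch that derives the field magnitude and both angles from the three Cartesian components. It assumes the usual spherical-coordinate convention and wraps longitude into [0, 360) degrees. The dataset's own angle range is not stated here, so treat that wrap as an assumption; the function name `imf_angles` is hypothetical.

```python
import math


def imf_angles(bx, by, bz):
    """Return (bt, theta_deg, phi_deg) for one IMF vector given in nT."""
    # Total field magnitude.
    bt = math.sqrt(bx * bx + by * by + bz * bz)
    # Latitude: angle between B and the XY plane, positive when B points north (+Z).
    theta = math.degrees(math.atan2(bz, math.hypot(bx, by)))
    # Longitude: angle of the XY projection measured from the +X (Earth-Sun) axis.
    # Wrapping into [0, 360) is an assumed convention.
    phi = math.degrees(math.atan2(by, bx)) % 360.0
    return bt, theta, phi


bt, theta, phi = imf_angles(3.0, -4.0, 0.0)
```

The same function applies to either coordinate system: pass the `_gse` components to compare against the `_gse` angles, or the `_gsm` components for the `_gsm` angles.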

Satellite Data

The ACE and DSCOVR satellites are not stationary. They orbit around the L1 point, staying in a relatively constant position with respect to the Earth as the Earth revolves around the Sun.

satellite_pos.csv records the daily positions of the DSCOVR and ACE spacecraft in Geocentric Solar Ecliptic (GSE) coordinates for projections in the XY, XZ, and YZ planes. The columns for each spacecraft are denoted by the suffixes _ace or _dscovr.

  • gse_x - Position of the satellite in the X direction of GSE coordinates (km)
  • gse_y - Position of the satellite in the Y direction of GSE coordinates (km)
  • gse_z - Position of the satellite in the Z direction of GSE coordinates (km)
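
Since GSE coordinates are centered on the Earth, the three position columns give each spacecraft's distance from Earth directly. The sketch below assumes the suffixed column names take the form `gse_x_ace`, `gse_y_ace`, and so on. That form is inferred from the suffix description above rather than confirmed, and the sample values are made up.

```python
import math


def earth_distance_km(row, spacecraft):
    """Distance from Earth's center (km) for one daily record.

    `spacecraft` is the column suffix without the underscore, e.g. "ace".
    """
    x = float(row[f"gse_x_{spacecraft}"])
    y = float(row[f"gse_y_{spacecraft}"])
    z = float(row[f"gse_z_{spacecraft}"])
    return math.sqrt(x * x + y * y + z * z)


# Hypothetical daily record; values are illustrative only.
row = {"gse_x_ace": "1500000", "gse_y_ace": "200000", "gse_z_ace": "-100000"}
distance = earth_distance_km(row, "ace")
```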

Sunspots Data

The Sun exhibits a well-known, periodic variation in the number of spots on its disk over a period of about 11 years, called a solar cycle. In general, large geomagnetic storms occur more frequently during the peak of these cycles. Sunspot numbers might allow for calibration of models to the solar cycle.
Sunspots are indexed according to the first corresponding day in labels.csv.
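
If the sunspot series turns out to be sparser than the hourly labels (an assumption here, since its cadence is not stated above), one simple way to align the two is to carry the most recent sunspot value forward to each label timestep. A minimal sketch with made-up values and a hypothetical helper name:

```python
from bisect import bisect_right
from datetime import timedelta


def forward_fill(sparse, query_times):
    """For each query time, return the latest sparse value at or before it.

    Returns None for query times earlier than the first sparse entry.
    """
    times = sorted(sparse)
    filled = []
    for t in query_times:
        i = bisect_right(times, t)
        filled.append(sparse[times[i - 1]] if i else None)
    return filled


# Hypothetical sunspot readings for one period, keyed by timedelta.
sunspots = {timedelta(days=0): 65.4, timedelta(days=31): 72.0}
hours = [timedelta(hours=5), timedelta(days=30, hours=23), timedelta(days=31)]
values = forward_fill(sunspots, hours)
```

The fill should be done separately for each period, because timedeltas restart within every period.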

Labels
The labels are hourly Dst values, indexed using the same period and timedelta multi-index. The goal is to predict Dst at the current timestep (t0) and the following timestep (t+1). Remember not to use historical Dst values as an input for prediction.
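
The two prediction targets can be sketched as pairs built from consecutive hourly labels. The example assumes the Dst values for a single period are already sorted by timedelta with no missing hours; the function name and the sample values are illustrative only.

```python
def make_targets(dst_hourly):
    """Pair each hour's Dst (t0) with the next hour's Dst (t+1).

    The final hour is dropped because it has no following label.
    """
    return [
        (dst_hourly[i], dst_hourly[i + 1])
        for i in range(len(dst_hourly) - 1)
    ]


# Hypothetical hourly Dst values (nT) for one period.
dst = [-7, -10, -10, -6]
targets = make_targets(dst)
```

These pairs are training targets only. Model inputs should be built from the solar-wind, satellite-position, and sunspot data, never from earlier Dst values.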

![dst](https://drivendata-public-assets.s3.amazonaws.com/noaa-magnetosphere.jpg =800x400)

Earth's magnetosphere. The Dst or disturbance-storm-time index is a measure of the “ring current” (blue) around the Earth. The ring current is an electric current carried by charged particles trapped in the magnetosphere.

Starter Kernel(s)

EDA, Preprocessing and Keras LSTM Base Model

Acknowledgements

This data is provided by NOAA, with support from NASA. NOAA’s National Centers for Environmental Information (NCEI), in partnership with the University of Colorado’s Cooperative Institute for Research in Environmental Sciences (CIRES), is conducting an open data-science challenge to forecast Dst using the real-time solar-wind (RTSW) data in an operationally viable setup. Recent advances in machine learning research hold immediate promise for improving Dst forecasting even without formal training in space physics. This dataset is published in that context and can help identify solutions that are both operationally viable and highly accurate.

NOAA's National Centers for Environmental Information (NCEI) hosts and provides public access to one of the most significant archives for environmental data on Earth. NCEI contributes to the NESDIS mission by developing new products and services that span the science disciplines and enable better data discovery.

Inspiration

  • Develop models for forecasting Dst that push the boundary of predictive performance, under operationally viable constraints, using the real-time solar-wind (RTSW) data feeds from NOAA’s DSCOVR and NASA’s ACE satellites.
