Suspended Substances Prediction In River Water

Predicting suspended substances in river water from monitoring station data

@kaggle.vbmokin_suspended_substances_prediction_in_river_water

About this Dataset

Content

This dataset contains the concentration of suspended substances ("Suspended") in river water.

There are 8 consecutive stations of the state water monitoring system. The task is to predict the value at the eighth (target) station from the values at the other seven stations. Stations are numbered upstream from the target station: the station closest to it is 1, the next one upstream is 2, and so on.

The data are monthly averages. The length of the observation series differs between stations (from 4 to about 20 years).

The training and test data are chosen so that the percentage of non-NA values at stations with long and short data series is approximately the same in both. The test data do not contain the target column, because a competition to predict these values is planned for the future.

Suspended substances concentration (SSC) is measured in mg/dm³ (milligrams per cubic decimeter).

The maximum permissible value of SSC in Ukraine is 15 mg/dm³.
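As a small illustration of how the limit above might be applied, the sketch below flags monthly readings that exceed 15 mg/dm³. The function name `exceeds_limit`, the constant name, and the sample readings are hypothetical and not part of the dataset. Treating a reading of exactly 15 as permitted is an assumption based on the word "maximum".

```python
MAX_PERMISSIBLE_SSC = 15.0  # mg/dm³, the limit stated in the dataset description


def exceeds_limit(ssc, limit=MAX_PERMISSIBLE_SSC):
    """Return True if a reading is strictly above the limit.

    Missing readings (None) are never flagged. Treating a value equal to
    the limit as permitted is an assumption.
    """
    return ssc is not None and ssc > limit


# Made-up monthly readings in mg/dm³; None marks a missing observation.
readings = [9.5, 15.0, 18.2, None, 22.0]
flags = [exceeds_limit(value) for value in readings]
exceedance_count = sum(flags)
```

The same check could be applied to the target column or to any of the station columns.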

Id - the unique identifier of a monthly averaged record;

target - the monthly averaged SSC value at the target station, mg/dm³;

1-7 - the monthly averaged SSC values at stations 1-7 (the seven stations located upstream of the target station), mg/dm³.
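The column layout described above can be read with the standard library alone. The following is a minimal sketch that parses CSV text with the columns Id, target and 1-7, treats empty or "NA" cells as missing, and computes the share of non-missing values per station. The inline sample is synthetic, and the helper names (`parse_value`, `load_rows`, `non_missing_share`) are hypothetical. How missing values are actually encoded in the real files is an assumption.

```python
import csv
import io

# Synthetic sample in the layout described above; the values are made up
# for illustration and are not taken from the real dataset.
SAMPLE_CSV = """Id,target,1,2,3,4,5,6,7
0,12.0,10.5,11.0,,9.0,,,8.0
1,14.5,13.0,,,10.0,,,
2,9.0,8.5,9.5,7.0,,,,6.5
"""

STATION_COLUMNS = [str(i) for i in range(1, 8)]


def parse_value(raw):
    """Convert a CSV cell to float; empty, 'NA' or 'NaN' cells become None."""
    if raw is None:
        return None
    raw = raw.strip()
    if raw == "" or raw.upper() in {"NA", "NAN"}:
        return None
    return float(raw)


def load_rows(text):
    """Read CSV text into a list of dicts with float-or-None values."""
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for record in reader:
        row = {"Id": int(record["Id"])}
        for column in ["target"] + STATION_COLUMNS:
            if column in record:  # the test data have no target column
                row[column] = parse_value(record[column])
        rows.append(row)
    return rows


def non_missing_share(rows, column):
    """Fraction of rows in which the given column has a value."""
    present = sum(1 for row in rows if row.get(column) is not None)
    return present / len(rows)


rows = load_rows(SAMPLE_CSV)
shares = {column: non_missing_share(rows, column) for column in STATION_COLUMNS}
```

Running the same computation on the training and test files separately would show how closely the shares of non-NA values match between the two.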

Acknowledgements

I thank the State Water Resources Agency of Ukraine and the Portal (https://data.gov.ua/) for providing the water monitoring data used for this dataset.

I am also grateful for the photo used as the dataset image: photo by Samara Doole on Unsplash.

Inspiration

The most interesting tasks are the following:

  1. Analysis of dependencies in the data, including EDA.

  2. Prediction of the target data (water quality at the target station) with the highest accuracy.

  3. Analysis of the impact on prediction accuracy of the first two stations (1-2) and of the next five stations (3-7), considered separately.
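One possible way to start on the third task is sketched below. It compares a deliberately naive baseline (the average of the available readings in a station group) for stations 1-2 against stations 3-7, using mean absolute error. The rows are made up, so the resulting numbers say nothing about the real data. The baseline and the helper names are illustrative assumptions, not the dataset author's method. A real analysis would replace the baseline with a fitted model.

```python
from statistics import mean

# Made-up rows in the dataset layout: target plus stations "1".."7" (mg/dm³).
# None marks a missing monthly value.
rows = [
    {"target": 12.0, "1": 11.0, "2": 13.0, "3": 8.0, "4": None, "5": 6.0, "6": None, "7": 4.0},
    {"target": 20.0, "1": 19.0, "2": None, "3": 14.0, "4": 12.0, "5": None, "6": None, "7": 10.0},
    {"target": 9.0, "1": None, "2": None, "3": 7.0, "4": 5.0, "5": None, "6": None, "7": None},
]

NEAR_STATIONS = ["1", "2"]                # the two stations closest to the target
FAR_STATIONS = ["3", "4", "5", "6", "7"]  # the five stations further upstream


def naive_prediction(row, stations):
    """Average the available readings of the given stations, or None if all are missing."""
    values = [row[s] for s in stations if row[s] is not None]
    return mean(values) if values else None


def mean_absolute_error(rows, stations):
    """MAE of the naive prediction over rows where both target and prediction exist."""
    errors = []
    for row in rows:
        prediction = naive_prediction(row, stations)
        if row["target"] is not None and prediction is not None:
            errors.append(abs(row["target"] - prediction))
    return mean(errors) if errors else None


mae_near = mean_absolute_error(rows, NEAR_STATIONS)
mae_far = mean_absolute_error(rows, FAR_STATIONS)
```

Rows in which every station of a group is missing are skipped for that group, so the two errors may be computed over different sets of months. That should be kept in mind when comparing them.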
