Time Series Practice Dataset

Synthetic time series data

@kaggle.samuelcortinhas_time_series_practice_dataset

About this Dataset

Context

This dataset is intended for anybody who wants to practice and improve their time series skills.

Content

This dataset contains simulated time series data covering 10 years (2010-2019). The features are date, store id, product id and number sold. train.csv covers the years 2010-2018 and test.csv covers 2019 only. There are 7 unique stores and 10 unique products, and there are no null values. The objective is to predict the number sold feature in test.csv.
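As a starting point, the layout described above can be read into one series per (store, product) pair. The sketch below is only an illustration: the column names `Date`, `store`, `product` and `number_sold`, the ISO date format, and the sample rows are all assumptions, so check them against the actual CSV header before relying on it.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Assumed column names (not confirmed by the description); adjust as needed.
DATE_COL = "Date"
STORE_COL = "store"
PRODUCT_COL = "product"
TARGET_COL = "number_sold"


def load_series(handle):
    """Group rows into one date-sorted series per (store, product) pair."""
    series = defaultdict(list)
    for row in csv.DictReader(handle):
        key = (row[STORE_COL], row[PRODUCT_COL])
        # Assumes ISO dates such as 2010-01-01.
        day = date.fromisoformat(row[DATE_COL])
        series[key].append((day, float(row[TARGET_COL])))
    for points in series.values():
        points.sort()  # order each series chronologically
    return dict(series)


# Tiny made-up sample in the assumed layout, standing in for train.csv.
sample = io.StringIO(
    "Date,store,product,number_sold\n"
    "2010-01-02,0,0,810\n"
    "2010-01-01,0,0,801\n"
    "2010-01-01,0,1,851\n"
)
series = load_series(sample)
```

With the real files, pass `open("train.csv", newline="")` instead of the sample. Given 7 stores and 10 products, the result should hold 70 series if every combination is present.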

I created this time series data by combining multiple components, including various long-term trends, year-long seasonality patterns, weekday/weekend effects and noise. Moreover, the products and the stores are supposed to be weakly correlated.
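To get a feel for how such components combine, here is a minimal sketch of an additive series built from a trend, a yearly cycle, a weekend bump and noise. It is not the generator used for this dataset: the function name `simulate_sales`, the additive form and every coefficient are hypothetical and chosen only for illustration.

```python
import math
import random
from datetime import date, timedelta


def simulate_sales(start, days, seed=0):
    """Return (date, number_sold) pairs from a toy additive model."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    values = []
    for i in range(days):
        day = start + timedelta(days=i)
        # All coefficients below are made-up illustration values.
        trend = 800.0 + 0.05 * i  # slow linear growth
        day_of_year = day.timetuple().tm_yday
        seasonal = 50.0 * math.sin(2 * math.pi * day_of_year / 365.25)
        weekend = 30.0 if day.weekday() >= 5 else 0.0  # Saturday/Sunday bump
        noise = rng.gauss(0.0, 10.0)
        sold = max(0, round(trend + seasonal + weekend + noise))
        values.append((day, sold))
    return values


toy = simulate_sales(date(2010, 1, 1), 365)
```

Fitting a model to a toy series like this one, where the true components are known, can be a useful sanity check before moving to the real data.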

Assessment

To compare with other people's solutions, I recommend using the MAPE (Mean Absolute Percentage Error) metric.
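For reference, MAPE is the mean of |actual - predicted| / |actual| over all observations, usually reported as a percentage. The helper below is a minimal sketch of that standard definition. The function name `mape` and the choice to raise an error on zero actual values are assumptions of this example, not something the dataset specifies.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")
    if any(a == 0 for a in actual):
        # The ratio is undefined when the true value is zero.
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Both forecasts are off by 10% of the true value, so the mean is about 10.
score = mape([100, 200], [110, 180])
```

Because each error is divided by the true value, MAPE weighs a miss of 10 units on a low-selling product more heavily than the same miss on a high-selling one.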
