
F1 Archive 1950-2022

All race and qualifying data from 1950 to 2022

@kaggle.rprkh15_f1_race_and_qualifying_data

About this Dataset

Context

Formula One is the highest class of international racing for open-wheel single-seater racing cars sanctioned by the Fédération Internationale de l'Automobile (FIA). Ever since its inaugural season in 1950, Formula One has been regarded as the pinnacle of motorsport.

Content

This dataset contains detailed qualifying and race results for every track across multiple seasons. Each season has its own directory with two sub-directories: Qualifying Results and Race Results. The Race Results directory contains an overall_race_results.csv file, which summarizes the race results for the entire season, along with separate .csv files for the results of each race in the season. The Qualifying Results directory contains separate .csv files for the qualifying results before the start of each race.
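
The layout described above could be walked with a few lines of standard-library Python. This is a minimal sketch, not the author's own loading code: the root folder name, the assumption that each season directory is named after its year, and the helper names list_season_files and read_rows are all hypothetical. Only the sub-directory names and overall_race_results.csv come from the description.

```python
import csv
from pathlib import Path


def list_season_files(root, season):
    """Return (summary_path, race_paths, qualifying_paths) for one season.

    Assumes <root>/<season>/Race Results/*.csv and
    <root>/<season>/Qualifying Results/*.csv. The exact directory
    names on disk are an assumption and may need adjusting.
    """
    season_dir = Path(root) / str(season)
    race_dir = season_dir / "Race Results"
    quali_dir = season_dir / "Qualifying Results"

    summary = race_dir / "overall_race_results.csv"
    # Every other .csv in Race Results is treated as a single-race file.
    race_files = sorted(p for p in race_dir.glob("*.csv") if p.name != summary.name)
    quali_files = sorted(quali_dir.glob("*.csv"))
    return summary, race_files, quali_files


def read_rows(path):
    """Read one CSV file into a list of dicts keyed by its header row."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling list_season_files("f1_archive", 2021) would then give the season summary path plus the per-race and per-qualifying file lists, each of which can be passed to read_rows.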

Note

For the 1982 season and earlier, each qualifying results file contains only one entry: that of the polesitter. The lap times of the other drivers are not included, because the official website lists only one entry under the qualifying results for those seasons.

Inspiration

F1 is one of my favorite sports and I almost never miss a race 😄

The motivation behind creating this dataset was to learn more about web scraping and try to perform a statistical analysis of the data. Some of the things you could do with the entire dataset are as follows:

  • Identify the driver with the most poles
  • Compare qualifying times of different drivers (championship contenders, team-mates, etc)
  • Determine how often a particular driver out-qualifies his team-mate
  • Compare qualifying lap times for the same race across previous seasons
  • Identify the driver with the most wins at a particular track
  • Analyze how the championship battle unfolded based on the number of points scored by the drivers (especially interesting for the 2021 F1 season 👀)
  • Identify drivers with the highest number of wins, podiums, DNFs, etc.
  • Compare the average lap times of different tracks to identify the slowest and fastest tracks on the calendar
  • Compare the number of laps for each race in the season (Belgium 2021 being the clear winner 😂)
  • Find out who won the Drivers' Championship based on the total number of points
  • Find out who won the Constructors' Championship based on the total number of points for each team
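
As one way to start on the points-based ideas above, the sketch below sums points per driver over rows read from the race files. The column names "Driver" and "Points" and the helper name total_points are assumptions, so check them against the actual CSV headers. A plain sum is also only an approximation for some older seasons, where only a driver's best results counted toward the championship.

```python
import math
from collections import defaultdict


def total_points(rows, key="Driver", points_col="Points"):
    """Sum points per driver (or per team, via key) and rank the totals.

    rows is an iterable of dicts, e.g. from csv.DictReader.
    Column names are assumptions and may differ in the real files.
    """
    totals = defaultdict(float)
    for row in rows:
        raw = (row.get(points_col) or "").strip()
        try:
            points = float(raw)  # float handles half points such as 12.5
        except ValueError:
            continue  # blank or non-numeric cell: nothing to add
        if math.isnan(points):
            continue  # skip cells that literally contain "nan"
        totals[row[key]] += points
    # Highest total first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Made-up rows for illustration only, not real results.
sample = [
    {"Driver": "Driver A", "Points": "25"},
    {"Driver": "Driver B", "Points": "18"},
    {"Driver": "Driver B", "Points": "12.5"},
]
ranking = total_points(sample)
```

Passing key="Team" (again an assumed column name) would give the equivalent ranking for constructors.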

Some Common F1 Terms You Might Come Across

  • DNF: Did Not Finish. Commonly used for drivers who crashed or otherwise failed to complete the entire race
  • DNQ: Did Not Qualify. This abbreviation replaces what would otherwise be missing values in the qualifying datasets for drivers who failed to qualify
  • NC: Not Classified. For drivers that DNF, the term NC is used in the Position column
  • DQ: Disqualified. Drivers are generally disqualified from races due to technical infringements or a breach of sporting regulations (Example: Sebastian Vettel was disqualified from the 2021 Hungarian Grand Prix due to fuel irregularities and stripped of all the points he earned from finishing the race in P2)
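
Because the Position column can hold these abbreviations as well as numbers, it helps to separate the two before counting wins or podiums. The following is a rough sketch under the assumption that the codes appear exactly as listed above. The helper name parse_position and the status labels are hypothetical.

```python
NON_FINISH_CODES = {"DNF", "DNQ", "NC", "DQ"}


def parse_position(value):
    """Split a Position cell into (numeric_position, status).

    Returns (int, "classified") for a numeric finish,
    (None, code) for one of the abbreviations above,
    and (None, "unknown") for anything else.
    """
    text = str(value).strip().upper()
    if text in NON_FINISH_CODES:
        return None, text
    try:
        return int(text), "classified"
    except ValueError:
        return None, "unknown"


def is_podium(value):
    """True when the cell is a classified finish in the top three."""
    position, _status = parse_position(value)
    return position is not None and position <= 3
```

Keeping the status alongside the number makes it straightforward to count DNFs or disqualifications separately from classified finishes.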

Future Work

As I collect more data for the previous seasons, I will create new versions of the dataset. The goal is to build an archive of qualifying and race data from 1950 to 2021. The dataset will also be updated when the 2022 season commences.
