Dataset Overview
This dataset, created in collaboration with my classmate Sadık Can Barut, contains detailed match statistics for football games. It has approximately 100,000 rows and 91 columns for analyzing team and player performance. The columns cover team names, scores, season years, match dates, and in-game statistics such as possession, distance covered, clearances, and pass success rates.
The dataset covers 18 competitions, three from each of six countries:
- Germany
  - Bundesliga
  - 2. Bundesliga
  - DFB-Pokal
- France
  - Ligue 1
  - Ligue 2
  - Coupe de France
- England
  - Premier League
  - Championship
  - FA Cup
- Italy
  - Serie A
  - Serie B
  - Coppa Italia
- Spain
  - La Liga
  - La Liga 2
  - Primera RFEF Group 1
- Turkey
  - Süper Lig
  - 1. Lig
  - 2. Lig White Group
More Data
If you want to fetch more data, I have shared the code and the Streamlit interface on my GitHub.
Tags
- football
- sports analytics
- machine learning
- soccer
- data science
- football statistics
- streamlit
- data visualization
- football dataset
This dataset is ideal for exploring football analytics, developing machine learning models for match prediction, performance analysis, or even deep dives into team dynamics and player contributions throughout the seasons.
We hope it will be useful to anyone working on football analytics, machine learning, or data science challenges, and that it can serve as a foundation for further analyses and projects.
Columns and Descriptions
Match Information
- Country: The country where the match was played.
- Lig: The league or competition in which the match took place.
- home_team: The name of the home team.
- away_team: The name of the away team.
- home_score: The number of goals scored by the home team.
- away_score: The number of goals scored by the away team.
- season_year: The season in which the match was played.
- Date_day: The day of the match.
- Date_hour: The hour when the match started.
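For match-prediction work, a common first step is to turn the two score columns into a single outcome label. The following is a minimal sketch using only the Python standard library. It assumes `home_score` and `away_score` hold values that parse as numbers; the function name `match_result`, the labels `H`/`D`/`A`, and the sample rows are illustrative choices, not part of the dataset.

```python
def match_result(row):
    """Label a match as 'H' (home win), 'D' (draw) or 'A' (away win).

    `row` is a mapping holding the `home_score` and `away_score` columns,
    for example one row produced by csv.DictReader. Returns None when a
    score is missing or cannot be parsed (an assumption about how gaps
    might appear in the file).
    """
    try:
        home = int(float(row["home_score"]))
        away = int(float(row["away_score"]))
    except (KeyError, TypeError, ValueError, OverflowError):
        return None
    if home > away:
        return "H"
    if home < away:
        return "A"
    return "D"


# Hypothetical rows, shaped like the columns described above.
sample = [
    {"home_team": "Team A", "away_team": "Team B", "home_score": "2", "away_score": "1"},
    {"home_team": "Team C", "away_team": "Team D", "home_score": "0", "away_score": "0"},
]
labels = [match_result(r) for r in sample]
```

Returning None instead of raising keeps incomplete rows easy to filter out before training a model.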
Goals and Timing
- first_half: Details about the first-half performance (if available).
- second_half: Details about the second-half performance (if available).
- home_team_goals_current_time: The time of each goal scored by the home team.
- home_team_goals_current_score: The cumulative score of the home team after each goal.
- home_team_goals: Details about the goals scored by the home team.
- home_team_goals_assist: Players who assisted the home team's goals.
- away_team_goals_current_time: The time of each goal scored by the away team.
- away_team_goals_current_score: The cumulative score of the away team after each goal.
- away_team_goals: Details about the goals scored by the away team.
- away_team_goals_assist: Players who assisted the away team's goals.
Cards
- home_team_yellow_card_current_time: The time of each yellow card received by the home team.
- home_team_yellow_card: Players who received yellow cards on the home team.
- home_team_yellow_card_why: Reasons for the yellow cards received by the home team.
- away_team_yellow_card_current_time: The time of each yellow card received by the away team.
- away_team_yellow_card: Players who received yellow cards on the away team.
- away_team_yellow_card_why: Reasons for the yellow cards received by the away team.
- home_team_red_card_current_time: The time of each red card received by the home team.
- home_team_red_card: Players who received red cards on the home team.
- home_team_red_card_why: Reasons for the red cards received by the home team.
- away_team_red_card_current_time: The time of each red card received by the away team.
- away_team_red_card: Players who received red cards on the away team.
- away_team_red_card_why: Reasons for the red cards received by the away team.
Substitutions
- home_team_substitutions_current_time: The time of each substitution made by the home team.
- home_team_substitutions: Home team players who were substituted off.
- home_team_substitutions_with: Home team players who came on as their replacements.
- home_team_substitution_why: Reasons for substitutions in the home team.
- away_team_substitutions_current_time: The time of each substitution made by the away team.
- away_team_substitutions: Away team players who were substituted off.
- away_team_substitutions_with: Away team players who came on as their replacements.
- away_team_substitution_why: Reasons for substitutions in the away team.
Match Statistics
- expected_goals_xg_home: Expected goals for the home team.
- expected_goals_xg_host: Expected goals for the away team.
- Ball_Possession_Home: Ball possession percentage for the home team.
- Ball_Possession_Host: Ball possession percentage for the away team.
- Goal_Attempts_Home: Number of goal attempts by the home team.
- Goal_Attempts_Host: Number of goal attempts by the away team.
- Shots_on_Goal_Home: Number of shots on target by the home team.
- Shots_on_Goal_Host: Number of shots on target by the away team.
- Shots_off_Goal_Home: Number of shots off target by the home team.
- Shots_off_Goal_Host: Number of shots off target by the away team.
- Blocked_Shots_Home: Number of blocked shots by the home team.
- Blocked_Shots_Host: Number of blocked shots by the away team.
- Free_Kicks_Home: Number of free kicks awarded to the home team.
- Free_Kicks_Host: Number of free kicks awarded to the away team.
- Corner_Kicks_Home: Number of corner kicks awarded to the home team.
- Corner_Kicks_Host: Number of corner kicks awarded to the away team.
- Offsides_Home: Number of offsides committed by the home team.
- Offsides_Host: Number of offsides committed by the away team.
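Most statistics come as a pair of columns: the `_Home` suffix is the home team's value and the `_Host` suffix is the away team's. The sketch below shows one way to find such pairs from the header and compute home-minus-away differences, which are often used as model features. It assumes the values parse as plain numbers (the file may store some of them differently, for example with a percent sign); the helper names `paired_stats` and `stat_differences` are hypothetical. Note that the expected-goals columns use lowercase `_home`/`_host` and are not matched by this helper.

```python
def paired_stats(columns):
    """Return statistic names that have both a `_Home` and a `_Host` column."""
    cols = set(columns)
    suffix = "_Home"
    return sorted(
        name[: -len(suffix)]
        for name in cols
        if name.endswith(suffix) and name[: -len(suffix)] + "_Host" in cols
    )


def stat_differences(row, stats):
    """Home-minus-away difference per statistic, skipping unparseable values."""
    diffs = {}
    for stat in stats:
        try:
            diffs[stat] = float(row[f"{stat}_Home"]) - float(row[f"{stat}_Host"])
        except (KeyError, TypeError, ValueError):
            continue  # leave out statistics that are missing or non-numeric
    return diffs


# Hypothetical row using column names from the list above.
row = {
    "Corner_Kicks_Home": "7", "Corner_Kicks_Host": "3",
    "Fouls_Home": "12", "Fouls_Host": "15",
    "referee": "N/A",
}
stats = paired_stats(row.keys())
diffs = stat_differences(row, stats)
```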
Additional Statistics
- Throw_ins_Home: Number of throw-ins by the home team.
- Throw_ins_Host: Number of throw-ins by the away team.
- Goalkeeper_Saves_Home: Number of saves by the home team's goalkeeper.
- Goalkeeper_Saves_Host: Number of saves by the away team's goalkeeper.
- Fouls_Home: Number of fouls committed by the home team.
- Fouls_Host: Number of fouls committed by the away team.
- Red_Cards_Home: Total number of red cards for the home team.
- Red_Cards_Host: Total number of red cards for the away team.
- Yellow_Cards_Home: Total number of yellow cards for the home team.
- Yellow_Cards_Host: Total number of yellow cards for the away team.
Passing and Tackles
- Total_Passes_Home: Total number of passes made by the home team.
- Total_Passes_Host: Total number of passes made by the away team.
- Completed_Passes_Home: Number of completed passes by the home team.
- Completed_Passes_Host: Number of completed passes by the away team.
- Tackles_Home: Number of tackles by the home team.
- Tackles_Host: Number of tackles by the away team.
Advanced Metrics
- Crosses_Completed_Home: Number of successful crosses by the home team.
- Crosses_Completed_Host: Number of successful crosses by the away team.
- Interceptions_Home: Number of interceptions by the home team.
- Interceptions_Host: Number of interceptions by the away team.
- Attacks_Home: Total number of attacks by the home team.
- Attacks_Host: Total number of attacks by the away team.
- Dangerous_Attacks_Home: Total number of dangerous attacks by the home team.
- Dangerous_Attacks_Host: Total number of dangerous attacks by the away team.
Physical Metrics
- Distance_Covered_(km)_Home: Distance covered by the home team in kilometers.
- Distance_Covered_(km)_Host: Distance covered by the away team in kilometers.
- Clearances_Completed_Home: Number of clearances by the home team.
- Clearances_Completed_Host: Number of clearances by the away team.
- Pass_Success_per_Home: Passing success percentage for the home team.
- Pass_Success_per_Host: Passing success percentage for the away team.
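The pass success percentage can also be recomputed from the total and completed pass counts, which is useful as a sanity check or when the stored percentage is missing. This is a sketch under assumptions: that `Pass_Success_per_*` corresponds to completed passes divided by total passes (the dataset description does not confirm this), and that the pass counts parse as numbers. The function name `pass_success_pct` is hypothetical.

```python
import math


def pass_success_pct(row, side="Home"):
    """Completed passes as a percentage of total passes for one side.

    `side` is 'Home' or 'Host' (the dataset's suffix for the away team).
    Returns None if a value is missing, non-numeric, or the total is zero.
    """
    try:
        total = float(row[f"Total_Passes_{side}"])
        completed = float(row[f"Completed_Passes_{side}"])
    except (KeyError, TypeError, ValueError):
        return None
    if not (math.isfinite(total) and math.isfinite(completed)) or total <= 0:
        return None
    return 100.0 * completed / total


# Hypothetical row using column names from the lists above.
row = {
    "Total_Passes_Home": "500", "Completed_Passes_Home": "425",
    "Total_Passes_Host": "320", "Completed_Passes_Host": "240",
}
home_pct = pass_success_pct(row, "Home")  # 85.0
away_pct = pass_success_pct(row, "Host")  # 75.0
```

Comparing the result with the stored `Pass_Success_per_Home` and `Pass_Success_per_Host` values is a quick way to spot rows where the counts and the percentage disagree.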
Miscellaneous
- referee: Name of the referee officiating the match.
- venue: Venue where the match was played.
- capacity: Seating capacity of the venue.
- attendance: Attendance count for the match.