The dataset was built from the files available at https://cricsheet.org/. The site updates its data every few days.
The dataset includes matches played from 2002/12/29 to 2023/10/11.
ball_by_ball:
- match_id - unique id of all the matches available
- date - date on which match is played
- inning_no - innings number of the match (3 and 4 occur only when a super over is bowled)
- batting_team - team batting
- over_num - over number of the innings
- balls - ball number of the over
- batsmen - name of the batsman on strike
- bowler - name of the bowler bowling the over
- non_striker - name of the non-striker when the ball is bowled
- batsmen_runs - runs scored by the batsman on the ball
- extra_runs - extra runs conceded on the ball, if any
- extra_type - type of extra, if any
- wicket_type - type of dismissal, if a wicket fell
- player_out - name of the player who is out
- fielder - name of the fielder involved in the dismissal
- total_runs - total runs scored on the ball
- is_wicket - whether a wicket fell on the ball
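
As a rough illustration of how the ball_by_ball columns can be combined, the sketch below sums total_runs and is_wicket per innings. The sample rows are invented for the example, and it assumes is_wicket is stored as 0/1; check the actual file before relying on that.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows using the ball_by_ball column names; not real match data.
SAMPLE = """match_id,inning_no,batting_team,over_num,balls,batsmen_runs,extra_runs,total_runs,is_wicket
1,1,Team A,0,1,4,0,4,0
1,1,Team A,0,2,0,1,1,0
1,1,Team A,0,3,0,0,0,1
1,2,Team B,0,1,6,0,6,0
"""

def innings_totals(rows):
    """Sum runs and wickets per (match_id, inning_no)."""
    totals = defaultdict(lambda: {"runs": 0, "wickets": 0})
    for row in rows:
        key = (row["match_id"], int(row["inning_no"]))
        totals[key]["runs"] += int(row["total_runs"])
        # Assumption: is_wicket is 0 or 1.
        totals[key]["wickets"] += int(row["is_wicket"])
    return dict(totals)

totals = innings_totals(csv.DictReader(io.StringIO(SAMPLE)))
for (match_id, inning_no), score in sorted(totals.items()):
    print(match_id, inning_no, f'{score["runs"]}/{score["wickets"]}')
```

To run this on the real data, replace the in-memory sample with an opened ball_by_ball file passed to csv.DictReader.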
match_by_match:
- match_id - unique id of all the matches available
- date - date on which match is played
- event_name - series or tournament in which match is played
- match_type - type of match (ODI)
- gender - gender of the teams playing the match (male or female)
- city - city in which match is played
- venue - venue in which match is played
- overs - total number of overs to be played
- team_1 - home team
- team_2 - away team
- toss_winner - winner of the toss
- toss_decision - decision taken by toss winner (bat/bowl)
- match_winner - winner of match
- player_of_match - name of player of match
- by_runs - margin in runs by which the team won
- by_wickets - margin in wickets by which the team won
- umpire_1 - umpire 1 standing in the match
- umpire_2 - umpire 2 standing in the match
- tv_umpire - tv umpire of the match
- reserve_umpire - reserve umpire of the match
- match_refree - referee of the match
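
The match_id column appears in both tables, so match_by_match rows can be indexed by it and looked up from ball_by_ball. The sketch below builds such an index and turns match_winner, by_runs and by_wickets into a one-line result. The sample rows are invented, and it assumes an unused margin is stored as an empty cell or 0, which may not match the actual file.

```python
import csv
import io

# Invented sample rows using the match_by_match column names; not real match data.
SAMPLE = """match_id,team_1,team_2,match_winner,by_runs,by_wickets
1,Team A,Team B,Team A,25,
2,Team C,Team D,Team D,,6
3,Team E,Team F,,,
"""

def describe_result(match):
    """Build a one-line result from match_winner, by_runs and by_wickets."""
    winner = match["match_winner"].strip()
    if not winner:
        return "no winner recorded"
    runs = match["by_runs"].strip()
    wickets = match["by_wickets"].strip()
    # Assumption: an unused margin is an empty cell or 0.
    if runs and float(runs) > 0:
        return f"{winner} won by {int(float(runs))} runs"
    if wickets and float(wickets) > 0:
        return f"{winner} won by {int(float(wickets))} wickets"
    return f"{winner} won (margin not recorded)"

# Index matches by match_id so ball_by_ball rows can look up their match.
matches = {row["match_id"]: row for row in csv.DictReader(io.StringIO(SAMPLE))}
results = {match_id: describe_result(match) for match_id, match in matches.items()}
for match_id, text in results.items():
    print(match_id, text)
```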
Some matches are not included in the dataset because no data was available for them.
The complete list of missing matches can be found at https://cricsheet.org/missing/