NBA Final Game 5 Process Mining
Luka's Dallas Mavericks take on Tatum's Boston Celtics
@kaggle.kurtissmith_nba_final_game_5_process_mining
I wanted to explore process mining techniques on basketball but couldn't find an appropriate dataset. The closest is play-by-play data, but it is heavily stripped down and focuses on traditionally captured activities such as shots, rebounds, and steals. Frequent activities such as passes and dribbles are not captured.
Manually enriching the play-by-play data for one game took me about a month. It involved watching the game at 0.25x speed and pausing repeatedly to capture timestamps.
The data is from Game 5 of the 2024 NBA Finals. The Finals are won by the first team to win 4 games, so a series can run to at most 7 games, but Boston beat Dallas in 5. The data covers the series-clinching game.
CREATE TABLE nba_final_2024_game_5 (
"seq" BIGINT,
"case_id" VARCHAR,
"activity_id" VARCHAR,
"timestamp" VARCHAR,
"resource_id" VARCHAR,
"player_team" VARCHAR,
"teams_play" VARCHAR,
"quarter" VARCHAR,
"outcome" BIGINT,
"shot_type" VARCHAR,
"shot_distance_by_foot" DOUBLE,
"shot_points" DOUBLE,
"score_board_dal" BIGINT,
"score_board_bos" BIGINT
);
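As a rough illustration of how this schema could feed a process mining workflow, the sketch below counts directly-follows relations (how often one activity is immediately followed by another within the same case). It assumes that "case_id" groups events into one case and that "seq" gives their order; the sample rows, the case labels, and the activity names are hypothetical and not taken from the dataset.

```python
from collections import Counter, defaultdict

# Hypothetical sample rows as (seq, case_id, activity_id).
# These values are illustrative only; they are not taken from the dataset.
events = [
    (1, "case_1", "dribble"),
    (2, "case_1", "pass"),
    (3, "case_1", "shot"),
    (4, "case_2", "rebound"),
    (5, "case_2", "pass"),
    (6, "case_2", "shot"),
]


def directly_follows(rows):
    """Count how often activity a is immediately followed by b within a case."""
    traces = defaultdict(list)
    # Sorting the tuples orders events by seq (assumed to be the event order).
    for seq, case_id, activity in sorted(rows):
        traces[case_id].append(activity)

    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its successor inside the same case only.
        counts.update(zip(activities, activities[1:]))
    return counts


dfg = directly_follows(events)
for (a, b), n in sorted(dfg.items()):
    print(f"{a} -> {b}: {n}")
```

Pairs are formed only inside a case, so the last event of one case is never linked to the first event of the next. The resulting counts are the edge weights of a directly-follows graph, a common starting point for process discovery.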