Baselight

NBA Basic Player Data By Game

Aggregated Box Scores (1996-2020)

@kaggle.kenhuang41_nba_basic_game_data_by_player


About this Dataset

Description

Individual player data for every game from the start of the 1996-97 season (when my source began tracking individual plus-minus) through December 31, 2020.

  • Scraped from Basketball Reference using Python (bs4); additional columns were added using Python and a VBA macro
  • Passed all assertion tests during scraping but has not been rigorously verified (I take no responsibility for screwing up your analysis hehe)
  • Append GAME_ID to "https://www.basketball-reference.com" to get the URL of the game's webpage
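The URL note above can be sketched in a few lines of Python. This is only an illustration: the helper name `game_url` and the example GAME_ID value are hypothetical, and the sketch assumes GAME_ID is a site-relative path (it adds a leading slash if one is missing).

```python
BASE_URL = "https://www.basketball-reference.com"


def game_url(game_id: str) -> str:
    """Build a game page URL by appending game_id to the site root."""
    # Assumption: game_id is a site-relative path. If it does not already
    # start with "/", add one so the result is a well-formed URL.
    path = game_id if game_id.startswith("/") else "/" + game_id
    return BASE_URL + path


# Hypothetical GAME_ID value, used only for illustration.
example_id = "/boxscores/EXAMPLE.html"
print(game_url(example_id))
```

Check one real GAME_ID value from the data against the site before relying on the leading-slash handling.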

Limitations

  • Does not indicate whether a game is a playoff game
  • MP may not be entirely accurate (sometimes does not add up to 48)
  • The TOTAL_MINS and STARTER columns are only available in the primary "games.csv" file
  • Had to round MP for one game due to a TOTAL_MINS inaccuracy
  • A few players are identified with the wrong team in the decade files (these have been manually corrected in the primary file)
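Given the MP caveat above, it may be worth flagging games where the minutes look off before analyzing them. The following is a rough sketch, not the author's validation method. It assumes that total_mins holds the expected team total for a game, that rows are dictionaries keyed by the lowercase column names of the primary table, and that a tolerance of 0.5 minutes is acceptable. The helper name `minute_mismatches` and the sample rows are made up.

```python
from collections import defaultdict


def minute_mismatches(rows, tolerance=0.5):
    """Return {(game_id, team): (summed_mp, total_mins)} where the two differ."""
    summed = defaultdict(float)
    expected = {}
    for row in rows:
        key = (row["game_id"], row["team"])
        summed[key] += row["mp"]            # add up each player's minutes
        expected[key] = row["total_mins"]   # assumed to be the team total
    return {
        key: (summed[key], expected[key])
        for key in summed
        if abs(summed[key] - expected[key]) > tolerance
    }


# Made-up rows for illustration only (not taken from the dataset).
rows = [
    {"game_id": "g1", "team": "AAA", "mp": 120.0, "total_mins": 240},
    {"game_id": "g1", "team": "AAA", "mp": 120.0, "total_mins": 240},
    {"game_id": "g1", "team": "BBB", "mp": 119.0, "total_mins": 240},
    {"game_id": "g1", "team": "BBB", "mp": 120.0, "total_mins": 240},
]
print(minute_mismatches(rows))
```

Since TOTAL_MINS exists only in the primary file, this check does not apply to the decade files as written.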

Credits

If you use this data, I'd love to hear what projects my fellow basketball nerds are up to, or even collaborate! Shoot me an email at kh19@princeton.edu and please credit me with a link to this Kaggle page or my website. This is my first venture into scraping, so any suggestions or tips would be greatly appreciated. Enjoy!

Coding Time: 20 hours
Scraping Time: 9 hours

Tables

Games

@kaggle.kenhuang41_nba_basic_game_data_by_player.games
  • 9.64 MB
  • 743,423 rows
  • 27 columns
CREATE TABLE games (
  "game_id" VARCHAR,
  "team" VARCHAR,
  "oppt" VARCHAR,
  "team_score" BIGINT,
  "oppt_score" BIGINT,
  "result" VARCHAR,
  "score_diff" BIGINT,
  "player" VARCHAR,
  "mp" DOUBLE,
  "fg" BIGINT,
  "fga" BIGINT,
  "fg3" BIGINT,
  "fg3a" BIGINT,
  "ft" BIGINT,
  "fta" BIGINT,
  "orb" BIGINT,
  "drb" BIGINT,
  "trb" BIGINT,
  "ast" BIGINT,
  "stl" BIGINT,
  "blk" BIGINT,
  "tov" BIGINT,
  "pf" BIGINT,
  "plus_minus" BIGINT,
  "pts" BIGINT,
  "total_mins" BIGINT,
  "starter" VARCHAR
);
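Because the data has not been rigorously verified, a few box-score identities can serve as a quick sanity check on rows from this table. The sketch below assumes each row is a dictionary keyed by the column names in the schema above, and that fg counts all made field goals including three-pointers, as in standard box scores. The helper name `box_score_issues` and the sample rows are made up.

```python
def box_score_issues(row):
    """Return a list of identity violations found in one player-game row."""
    issues = []
    # Total rebounds should equal offensive plus defensive rebounds.
    if row["trb"] != row["orb"] + row["drb"]:
        issues.append("trb != orb + drb")
    # Each made field goal is worth 2 points, a made three adds 1 more,
    # and each made free throw is worth 1.
    if row["pts"] != 2 * row["fg"] + row["fg3"] + row["ft"]:
        issues.append("pts != 2*fg + fg3 + ft")
    # Makes can never exceed attempts.
    for made, attempted in (("fg", "fga"), ("fg3", "fg3a"), ("ft", "fta")):
        if row[made] > row[attempted]:
            issues.append(f"{made} > {attempted}")
    return issues


# Made-up rows for illustration only (not taken from the dataset).
good = {"fg": 8, "fga": 15, "fg3": 2, "fg3a": 5, "ft": 4, "fta": 4,
        "orb": 1, "drb": 6, "trb": 7, "pts": 22}
bad = dict(good, trb=8, pts=23)
print(box_score_issues(good))
print(box_score_issues(bad))
```

Rows that fail a check are not necessarily wrong in the source, but they are worth looking up on the game's webpage.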

Games 1990s

@kaggle.kenhuang41_nba_basic_game_data_by_player.games_1990s
  • 1.18 MB
  • 89,326 rows
  • 25 columns
CREATE TABLE games_1990s (
  "game_id" VARCHAR,
  "team" VARCHAR,
  "oppt" VARCHAR,
  "team_score" BIGINT,
  "oppt_score" BIGINT,
  "result" VARCHAR,
  "score_diff" BIGINT,
  "player" VARCHAR,
  "mp" DOUBLE,
  "fg" BIGINT,
  "fga" BIGINT,
  "fg3" BIGINT,
  "fg3a" BIGINT,
  "ft" BIGINT,
  "fta" BIGINT,
  "orb" BIGINT,
  "drb" BIGINT,
  "trb" BIGINT,
  "ast" BIGINT,
  "stl" BIGINT,
  "blk" BIGINT,
  "tov" BIGINT,
  "pf" BIGINT,
  "plus_minus" BIGINT,
  "pts" BIGINT
);

Games 2000s

@kaggle.kenhuang41_nba_basic_game_data_by_player.games_2000s
  • 4.03 MB
  • 310,353 rows
  • 25 columns
CREATE TABLE games_2000s (
  "game_id" VARCHAR,
  "team" VARCHAR,
  "oppt" VARCHAR,
  "team_score" BIGINT,
  "oppt_score" BIGINT,
  "result" VARCHAR,
  "score_diff" BIGINT,
  "player" VARCHAR,
  "mp" DOUBLE,
  "fg" BIGINT,
  "fga" BIGINT,
  "fg3" BIGINT,
  "fg3a" BIGINT,
  "ft" BIGINT,
  "fta" BIGINT,
  "orb" BIGINT,
  "drb" BIGINT,
  "trb" BIGINT,
  "ast" BIGINT,
  "stl" BIGINT,
  "blk" BIGINT,
  "tov" BIGINT,
  "pf" BIGINT,
  "plus_minus" BIGINT,
  "pts" BIGINT
);

Games 2010s

@kaggle.kenhuang41_nba_basic_game_data_by_player.games_2010s
  • 4.46 MB
  • 343,744 rows
  • 25 columns
CREATE TABLE games_2010s (
  "game_id" VARCHAR,
  "team" VARCHAR,
  "oppt" VARCHAR,
  "team_score" BIGINT,
  "oppt_score" BIGINT,
  "result" VARCHAR,
  "score_diff" BIGINT,
  "player" VARCHAR,
  "mp" DOUBLE,
  "fg" BIGINT,
  "fga" BIGINT,
  "fg3" BIGINT,
  "fg3a" BIGINT,
  "ft" BIGINT,
  "fta" BIGINT,
  "orb" BIGINT,
  "drb" BIGINT,
  "trb" BIGINT,
  "ast" BIGINT,
  "stl" BIGINT,
  "blk" BIGINT,
  "tov" BIGINT,
  "pf" BIGINT,
  "plus_minus" BIGINT,
  "pts" BIGINT
);
