NBA Basic Player Data By Game

Aggregated Box Scores (1996-2020)

@kaggle.kenhuang41_nba_basic_game_data_by_player

About this Dataset

Description

Individual player data for every game from the start of the 1996-97 season (when my source began tracking individual plus-minus) through December 31, 2020.

  • Scraped from Basketball Reference using Python (bs4); additional columns were added using Python and a VBA macro
  • Passed all assertion tests during scraping but not verified rigorously (I take no responsibility for screwing up your analysis hehe)
  • Users can append GAME_ID to "https://www.basketball-reference.com" to get URL of game webpage
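
As a minimal sketch of the GAME_ID tip above: the helper below appends a GAME_ID to the site root, as described. The function name game_url and the sample GAME_ID value are hypothetical and only for illustration; check the GAME_ID column in the data for the actual format (this sketch assumes the stored value already begins with a slash and does not try to correct it).

```python
BASE_URL = "https://www.basketball-reference.com"


def game_url(game_id: str) -> str:
    """Build a game page URL by appending GAME_ID to the site root."""
    # Strip stray whitespace that can sneak in when reading a CSV.
    return BASE_URL + game_id.strip()


if __name__ == "__main__":
    # Hypothetical GAME_ID value; the real format is whatever the dataset stores.
    example_id = "/boxscores/199611010BOS.html"
    print(game_url(example_id))
```

The same function can be mapped over the GAME_ID column to get a link for every row.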

Limitations

  • Does not include information about whether it is a playoff game
  • MP may not be entirely accurate (sometimes does not add to 48)
  • TOTAL_MINS and STARTER columns are only available in the primary "games.csv" file
  • Had to round MP for one game due to TOTAL_MINS inaccuracy
  • A few players are identified with the wrong team in the decade files (these have been manually corrected in the primary file)

Credits

If you use this data, I'd love to hear what projects my fellow basketball nerds are up to, or even collaborate! Shoot me an email at kh19@princeton.edu and please credit me with a link to this Kaggle page or my website. This is my first venture into scraping, so any suggestions or tips would be greatly appreciated. Enjoy!

Coding Time: 20 hours
Scraping Time: 9 hours
