Name: March Madness Historical DataSet (2002 To 2025)
Creator: Kaggle
License: https://opensource.org/licenses/MIT

About this Dataset

March Madness Historical DataSet (2002 To 2025)

This Kaggle dataset comes from an output dataset that powers my March Madness Data Analysis dashboard in Domo.

Click here to view this dashboard: Dashboard Link
Click here to view this dashboard's features in a Domo blog post: Hoops, Data, and Madness: Unveiling the Ultimate NCAA Dashboard

This dataset is one of the most robust resources you will find for discovering key insights through data science and data analytics using historical NCAA Division 1 men's basketball data. The data, sourced from KenPom, goes as far back as 2002 and is updated with the latest 2025 data. It is structured to provide every piece of information that I could pull from the site as an open-source tool for March Madness analysis.

Key features of the dataset include:

Historical Data: Provides all historical KenPom data from 2002 to 2025 from the Efficiency, Four Factors (Offense & Defense), Point Distribution, Height/Experience, and Misc. Team Stats endpoints from KenPom's website. Please note that the Height/Experience data only goes as far back as 2007, but every other source contains data from 2002 onward.
Data Granularity: This dataset features an individual line item for every NCAA Division 1 men's basketball team in every season, containing every KenPom metric pulled from the sources listed above. It can serve as a single source of truth for your March Madness analysis and provides the granularity needed for team-level and season-level analysis.
2025 Tournament Insights: Contains all seed and region information for the 2025 NCAA March Madness tournament. Please note that I will keep adding the seed and region information for previous tournaments as work on this dataset continues.
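Because the Height/Experience data starts in 2007, any join against it leaves the earlier seasons without height values. The following is a minimal sketch of how that coverage could be checked per season, using Python's built-in sqlite3 module. The table and column names follow the schemas listed under Tables, but the rows are toy values invented for illustration, not real KenPom numbers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE int_kenpom_summary (season BIGINT, teamname VARCHAR, adjem DOUBLE);
CREATE TABLE int_kenpom_height (season BIGINT, teamname VARCHAR, avgheight DOUBLE);
-- Toy rows, invented for illustration only.
INSERT INTO int_kenpom_summary VALUES (2005, 'Team A', 10.0), (2007, 'Team A', 12.0);
INSERT INTO int_kenpom_height VALUES (2007, 'Team A', 77.0);
""")

# LEFT JOIN keeps every summary row; COUNT(h.avgheight) only counts
# rows where a height record was actually found.
coverage = conn.execute("""
SELECT s.season,
       COUNT(*) AS teams,
       COUNT(h.avgheight) AS teams_with_height
FROM int_kenpom_summary AS s
LEFT JOIN int_kenpom_height AS h
  ON h.season = s.season AND h.teamname = s.teamname
GROUP BY s.season
ORDER BY s.season
""").fetchall()

for season, teams, with_height in coverage:
    print(season, teams, with_height)
```

Seasons where the last number is zero have no height data, so models that use height fields may need to exclude those seasons or handle the missing values explicitly.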

These datasets were created by downloading the raw CSV files for each season from the various sections of KenPom's website (Efficiency, Offense, Defense, Point Distribution, Summary, Miscellaneous Team Stats, and Height). All of these raw files were uploaded to Domo and imported into dataflows using Domo's Magic ETL. In these dataflows, the column headers for each previous season are standardized to the current 2025 naming structure so that all of the historical data can be viewed under the same field names. The cleaned datasets are then appended together, and some additional clean-up takes place before creating the intermediate (INT) datasets that are uploaded to this Kaggle dataset.

Once all of the INT datasets were created, I joined the tables together on team name and season so that all of these metrics can be viewed in a single view. From there, I joined an NCAAM Conference & ESPN Team Name Mapping table to add each conference's full name and acronym, as well as the team name that ESPN currently uses. Please note that this reference table is an aggregated view of every conference a team has been part of since 2002 and every team name that KenPom has used historically, so it is needed to map all of the teams properly and to differentiate historical conferences from current ones.

I then joined a reference table of all current NCAAM coaches and their active coaching lengths, because a coach's active tenure typically correlates with a team's success in the March Madness tournament. I also joined another reference table of post-season tournament teams, which currently shows 2024 teams but will eventually feature 2025 teams once that data becomes available. After some additional data clean-up, all of this cleaned data is exported into the "DEV _ March Madness" file, which contains the consolidated view.
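The standardize, append, and join steps described above were built in Domo's Magic ETL, which is not shown here. As a rough illustration of the same idea in plain Python, the sketch below renames older headers to the current names, stamps each row with its season, and merges tables on the (season, teamname) key. The legacy header names in RENAME and all row values are hypothetical and are not taken from the actual KenPom files.

```python
from collections import defaultdict

# Hypothetical legacy headers mapped to the current field names.
RENAME = {
    "Team": "teamname",
    "TeamName": "teamname",
    "AdjT": "adjtempo",
    "AdjTempo": "adjtempo",
    "AvgHgt": "avgheight",
}


def standardize(rows, season):
    """Rename headers to the current naming and add the season to each row."""
    return [
        {"season": season, **{RENAME.get(key, key): value for key, value in row.items()}}
        for row in rows
    ]


def join_on_team_season(*tables):
    """Merge rows that share the same (season, teamname) key.

    This behaves like a full outer join: a key found in any table is kept.
    """
    merged = defaultdict(dict)
    for table in tables:
        for row in table:
            merged[(row["season"], row["teamname"])].update(row)
    return [merged[key] for key in sorted(merged)]


# Toy inputs, invented for illustration only.
summary = (
    standardize([{"Team": "Team A", "AdjT": 65.1}], 2006)          # older header style
    + standardize([{"TeamName": "Team A", "AdjTempo": 67.3}], 2025)  # current style
)
height = standardize([{"TeamName": "Team A", "AvgHgt": 77.2}], 2025)

combined = join_on_team_season(summary, height)
for row in combined:
    print(row)
```

The 2006 row ends up without an avgheight field because no height record exists for that key, which mirrors how the pre-2007 seasons lack Height/Experience data in the consolidated file.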

This dataset gives users the flexibility to export data for further analysis in platforms such as Domo, Power BI, Tableau, Excel, and more. It is designed for users who wish to conduct their own analysis, develop predictive models, or simply gain a deeper understanding of what drives the excitement of Division 1 men's college basketball every March. Whether you are using this dataset for academic research, personal interest, or professional work, I hope it serves as a foundational tool for exploring college basketball's most riveting and anticipated event of the season.

Tables

Dev March Madness

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.dev_march_madness

3.49 MB
8,314 rows
147 columns

CREATE TABLE dev_march_madness (
  "season" BIGINT,
  "short_conference_name" VARCHAR,
  "adjusted_temo" DOUBLE,
  "adjusted_tempo_rank" BIGINT,
  "raw_tempo" DOUBLE,
  "raw_tempo_rank" BIGINT,
  "adjusted_offensive_efficiency" DOUBLE,
  "adjusted_offensive_efficiency_rank" BIGINT,
  "raw_offensive_efficiency" DOUBLE,
  "raw_offensive_efficiency_rank" BIGINT,
  "adjusted_defensive_efficiency" DOUBLE,
  "adjusted_defensive_efficiency_rank" BIGINT,
  "raw_defensive_efficiency" DOUBLE,
  "raw_defensive_efficiency_rank" BIGINT,
  "avg_possession_length_offense" DOUBLE,  -- Avg Possession Length (Offense)
  "avg_possession_length_offense_rank" DOUBLE,  -- Avg Possession Length (Offense) Rank
  "avg_possession_length_defense" DOUBLE,  -- Avg Possession Length (Defense)
  "avg_possession_length_defense_rank" DOUBLE,  -- Avg Possession Length (Defense) Rank
  "efgpct" DOUBLE,
  "rankefgpct" BIGINT,
  "topct" DOUBLE,
  "ranktopct" BIGINT,
  "orpct" DOUBLE,
  "rankorpct" BIGINT,
  "ftrate" DOUBLE,
  "rankftrate" BIGINT,
  "offft" DOUBLE,
  "rankoffft" BIGINT,
  "off2ptfg" DOUBLE,
  "rankoff2ptfg" BIGINT,
  "off3ptfg" DOUBLE,
  "rankoff3ptfg" BIGINT,
  "defft" DOUBLE,
  "rankdefft" BIGINT,
  "def2ptfg" DOUBLE,
  "rankdef2ptfg" BIGINT,
  "def3ptfg" DOUBLE,
  "rankdef3ptfg" BIGINT,
  "tempo" DOUBLE,
  "ranktempo" BIGINT,
  "adjtempo" DOUBLE,
  "rankadjtempo" BIGINT,
  "oe" DOUBLE,
  "rankoe" BIGINT,
  "adjoe" DOUBLE,
  "rankadjoe" BIGINT,
  "de" DOUBLE,
  "rankde" BIGINT,
  "adjde" DOUBLE,
  "rankadjde" BIGINT,
  "adjem" DOUBLE,
  "rankadjem" BIGINT,
  "fg2pct" DOUBLE,
  "rankfg2pct" BIGINT,
  "fg3pct" DOUBLE,
  "rankfg3pct" BIGINT,
  "ftpct" DOUBLE,
  "rankftpct" BIGINT,
  "blockpct" DOUBLE,
  "rankblockpct" BIGINT,
  "oppfg2pct" DOUBLE,
  "rankoppfg2pct" BIGINT,
  "oppfg3pct" DOUBLE,
  "rankoppfg3pct" BIGINT,
  "oppftpct" DOUBLE,
  "rankoppftpct" BIGINT,
  "oppblockpct" DOUBLE,
  "rankoppblockpct" BIGINT,
  "fg3rate" DOUBLE,
  "rankfg3rate" BIGINT,
  "oppfg3rate" DOUBLE,
  "rankoppfg3rate" BIGINT,
  "arate" DOUBLE,
  "rankarate" BIGINT,
  "opparate" DOUBLE,
  "rankopparate" BIGINT,
  "stlrate" DOUBLE,
  "rankstlrate" BIGINT,
  "oppstlrate" DOUBLE,
  "rankoppstlrate" BIGINT,
  "dfp" DOUBLE,
  "nstrate" DOUBLE,
  "ranknstrate" DOUBLE,
  "oppnstrate" DOUBLE,
  "rankoppnstrate" DOUBLE,
  "avgheight" DOUBLE,
  "rankavgheight" DOUBLE,
  "centerheight" DOUBLE,
  "rankcenterheight" DOUBLE,
  "pfheight" DOUBLE,
  "rankpfheight" DOUBLE,
  "sfheight" DOUBLE,
  "ranksfheight" DOUBLE,
  "sgheight" DOUBLE,
  "ranksgheight" DOUBLE,
  "pgheight" DOUBLE,
  "rankpgheight" DOUBLE,
  "effectiveheight" DOUBLE,
  "rankeffectiveheight" DOUBLE,
  "experience" DOUBLE
);

Int Kenpom Defense

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_defense

394.09 kB
9,268 rows
10 columns

CREATE TABLE int_kenpom_defense (
  "season" BIGINT,
  "teamname" VARCHAR,
  "efgpct" DOUBLE,
  "rankefgpct" BIGINT,
  "topct" DOUBLE,
  "ranktopct" BIGINT,
  "orpct" DOUBLE,
  "rankorpct" BIGINT,
  "ftrate" DOUBLE,
  "rankftrate" BIGINT
);

Int Kenpom Efficiency

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_efficiency

201.61 kB
9,268 rows
19 columns

CREATE TABLE int_kenpom_efficiency (
  "season" BIGINT,
  "team" VARCHAR,
  "conference" VARCHAR,
  "adjusted_temo" DOUBLE,
  "adjusted_tempo_rank" BIGINT,
  "raw_tempo" DOUBLE,
  "raw_tempo_rank" BIGINT,
  "adjusted_offensive_efficiency" DOUBLE,
  "adjusted_offensive_efficiency_rank" BIGINT,
  "raw_offensive_efficiency" DOUBLE,
  "raw_offensive_efficiency_rank" BIGINT,
  "adjusted_defensive_efficiency" DOUBLE,
  "adjusted_defensive_efficiency_rank" BIGINT,
  "raw_defensive_efficiency" DOUBLE,
  "raw_defensive_efficiency_rank" BIGINT,
  "avg_possession_length_offense" DOUBLE,  -- Avg Possession Length (Offense)
  "avg_possession_length_offense_rank" DOUBLE,  -- Avg Possession Length (Offense) Rank
  "avg_possession_length_defense" DOUBLE,  -- Avg Possession Length (Defense)
  "avg_possession_length_defense_rank" DOUBLE  -- Avg Possession Length (Defense) Rank
);

Int Kenpom Height

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_height

626.89 kB
6,670 rows
50 columns

CREATE TABLE int_kenpom_height (
  "season" BIGINT,
  "teamname" VARCHAR,
  "avgheight" DOUBLE,
  "rankavgheight" BIGINT,
  "centerheight" DOUBLE,
  "rankcenterheight" BIGINT,
  "pfheight" DOUBLE,
  "rankpfheight" BIGINT,
  "sfheight" DOUBLE,
  "ranksfheight" BIGINT,
  "sgheight" DOUBLE,
  "ranksgheight" BIGINT,
  "pgheight" DOUBLE,
  "rankpgheight" BIGINT,
  "effectiveheight" DOUBLE,
  "rankeffectiveheight" BIGINT,
  "experience" DOUBLE,
  "rankexperience" BIGINT,
  "bench" DOUBLE,
  "benchrank" BIGINT,
  "centerpts" DOUBLE,
  "rankcenterpts" BIGINT,
  "pfpts" DOUBLE,
  "rankpfpts" BIGINT,
  "sfpts" DOUBLE,
  "ranksfpts" BIGINT,
  "sgpts" DOUBLE,
  "ranksgpts" BIGINT,
  "pgpts" DOUBLE,
  "rankpgpts" BIGINT,
  "centeror" DOUBLE,
  "rankcenteror" BIGINT,
  "pfor" DOUBLE,
  "rankpfor" BIGINT,
  "sfor" DOUBLE,
  "ranksfor" BIGINT,
  "sgor" DOUBLE,
  "ranksgor" BIGINT,
  "pgor" DOUBLE,
  "rankpgor" BIGINT,
  "centerdr" DOUBLE,
  "rankcenterdr" BIGINT,
  "pfdr" DOUBLE,
  "rankpfdr" BIGINT,
  "sfdr" DOUBLE,
  "ranksfdr" BIGINT,
  "sgdr" DOUBLE,
  "ranksgdr" BIGINT,
  "pgdr" DOUBLE,
  "rankpgdr" BIGINT
);

Int Kenpom Miscellaneous Team Stats

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_miscellaneous_team_stats

1.11 MB
8,314 rows
35 columns

CREATE TABLE int_kenpom_miscellaneous_team_stats (
  "season" BIGINT,
  "teamname" VARCHAR,
  "fg2pct" DOUBLE,
  "rankfg2pct" BIGINT,
  "fg3pct" DOUBLE,
  "rankfg3pct" BIGINT,
  "ftpct" DOUBLE,
  "rankftpct" BIGINT,
  "blockpct" DOUBLE,
  "rankblockpct" BIGINT,
  "oppfg2pct" DOUBLE,
  "rankoppfg2pct" BIGINT,
  "oppfg3pct" DOUBLE,
  "rankoppfg3pct" BIGINT,
  "oppftpct" DOUBLE,
  "rankoppftpct" BIGINT,
  "oppblockpct" DOUBLE,
  "rankoppblockpct" BIGINT,
  "fg3rate" DOUBLE,
  "rankfg3rate" BIGINT,
  "oppfg3rate" DOUBLE,
  "rankoppfg3rate" BIGINT,
  "arate" DOUBLE,
  "rankarate" BIGINT,
  "opparate" DOUBLE,
  "rankopparate" BIGINT,
  "stlrate" DOUBLE,
  "rankstlrate" BIGINT,
  "oppstlrate" DOUBLE,
  "rankoppstlrate" BIGINT,
  "dfp" DOUBLE,
  "nstrate" DOUBLE,
  "ranknstrate" DOUBLE,
  "oppnstrate" DOUBLE,
  "rankoppnstrate" DOUBLE
);

Int Kenpom Offense

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_offense

359.25 kB
9,268 rows
10 columns

CREATE TABLE int_kenpom_offense (
  "season" BIGINT,
  "teamname" VARCHAR,
  "efgpct" DOUBLE,
  "rankefgpct" BIGINT,
  "topct" DOUBLE,
  "ranktopct" BIGINT,
  "orpct" DOUBLE,
  "rankorpct" BIGINT,
  "ftrate" DOUBLE,
  "rankftrate" BIGINT
);
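The offense and defense tables use identical column names (efgpct, topct, orpct, ftrate), so joining them requires aliases to keep the two sides apart. The following is a small sketch of one way to do that with Python's built-in sqlite3 module. The off_ and def_ prefixes and the efgpct_margin column are naming choices made here for illustration, and the rows are invented toy values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE int_kenpom_offense (season BIGINT, teamname VARCHAR, efgpct DOUBLE, topct DOUBLE);
CREATE TABLE int_kenpom_defense (season BIGINT, teamname VARCHAR, efgpct DOUBLE, topct DOUBLE);
-- Toy rows, invented for illustration only.
INSERT INTO int_kenpom_offense VALUES (2025, 'Team A', 55.0, 15.0);
INSERT INTO int_kenpom_defense VALUES (2025, 'Team A', 47.5, 20.0);
""")

# Alias each side so offensive and defensive values stay distinguishable.
rows = conn.execute("""
SELECT o.season,
       o.teamname,
       o.efgpct AS off_efgpct,
       d.efgpct AS def_efgpct,
       o.efgpct - d.efgpct AS efgpct_margin
FROM int_kenpom_offense AS o
JOIN int_kenpom_defense AS d
  ON d.season = o.season AND d.teamname = o.teamname
""").fetchall()

for row in rows:
    print(row)
```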

Int Kenpom Point Distribution

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_point_distribution

533.43 kB
9,269 rows
14 columns

CREATE TABLE int_kenpom_point_distribution (
  "season" BIGINT,
  "teamname" VARCHAR,
  "offft" DOUBLE,
  "rankoffft" BIGINT,
  "off2ptfg" DOUBLE,
  "rankoff2ptfg" BIGINT,
  "off3ptfg" DOUBLE,
  "rankoff3ptfg" BIGINT,
  "defft" DOUBLE,
  "rankdefft" BIGINT,
  "def2ptfg" DOUBLE,
  "rankdef2ptfg" BIGINT,
  "def3ptfg" DOUBLE,
  "rankdef3ptfg" BIGINT
);
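The row counts listed for the INT tables are not all identical (for example, 9,269 rows here versus 9,268 in several others). One way to locate keys that exist in one table but not in another is an anti-join on season and team name. The sketch below uses Python's built-in sqlite3 module with invented toy rows; it shows the technique only and makes no claim about which rows actually differ in the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE int_kenpom_point_distribution (season BIGINT, teamname VARCHAR, offft DOUBLE);
CREATE TABLE int_kenpom_summary (season BIGINT, teamname VARCHAR, adjem DOUBLE);
-- Toy rows, invented for illustration only.
INSERT INTO int_kenpom_point_distribution VALUES (2025, 'Team A', 18.0), (2025, 'Team B', 20.0);
INSERT INTO int_kenpom_summary VALUES (2025, 'Team A', 10.0);
""")

# Anti-join: keep point-distribution rows that found no partner in summary.
unmatched = conn.execute("""
SELECT p.season, p.teamname
FROM int_kenpom_point_distribution AS p
LEFT JOIN int_kenpom_summary AS s
  ON s.season = p.season AND s.teamname = p.teamname
WHERE s.teamname IS NULL
ORDER BY p.season, p.teamname
""").fetchall()

print(unmatched)
```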

Int Kenpom Summary

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.int_kenpom_summary

583.84 kB
9,268 rows
16 columns

CREATE TABLE int_kenpom_summary (
  "season" BIGINT,
  "teamname" VARCHAR,
  "tempo" DOUBLE,
  "ranktempo" BIGINT,
  "adjtempo" DOUBLE,
  "rankadjtempo" BIGINT,
  "oe" DOUBLE,
  "rankoe" BIGINT,
  "adjoe" DOUBLE,
  "rankadjoe" BIGINT,
  "de" DOUBLE,
  "rankde" BIGINT,
  "adjde" DOUBLE,
  "rankadjde" BIGINT,
  "adjem" DOUBLE,
  "rankadjem" BIGINT
);
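As a quick example of working with the summary table, the sketch below ranks teams within a season by adjusted efficiency margin and recomputes that margin from its parts. It assumes that adjem equals adjoe minus adjde, which is how KenPom defines adjusted efficiency margin; small rounding differences may appear in the real data. The rows are toy values invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE int_kenpom_summary (
  season BIGINT, teamname VARCHAR, adjoe DOUBLE, adjde DOUBLE, adjem DOUBLE
);
-- Toy rows, invented for illustration only.
INSERT INTO int_kenpom_summary VALUES
  (2025, 'Team A', 120.5, 95.5, 25.0),
  (2025, 'Team B', 110.0, 100.0, 10.0),
  (2024, 'Team A', 105.0, 101.0, 4.0);
""")

# Top teams for one season, with the margin recomputed as a sanity check.
top = conn.execute("""
SELECT teamname,
       adjem,
       ROUND(adjoe - adjde, 1) AS adjem_check
FROM int_kenpom_summary
WHERE season = ?
ORDER BY adjem DESC
LIMIT 2
""", (2025,)).fetchall()

for teamname, adjem, adjem_check in top:
    print(teamname, adjem, adjem_check)
```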

Ref 2024 Post Season Tournament Teams

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.ref_2024_post_season_tournament_teams

8.26 kB
131 rows
6 columns

CREATE TABLE ref_2024_post_season_tournament_teams (
  "team_name" VARCHAR,
  "seed" BIGINT,
  "region" VARCHAR,
  "correct_team_name" VARCHAR,  -- Correct Team Name?
  "post_season_tournament" VARCHAR,
  "post_season_tournament_sorting_index" BIGINT
);

Ref Current Ncaam Coaches

@kaggle.jonathanpilafas_2024_march_madness_statistical_analysis.ref_current_ncaam_coaches

18.46 kB
364 rows
4 columns

CREATE TABLE ref_current_ncaam_coaches (
  "current_coach" VARCHAR,
  "team" VARCHAR,
  "since" BIGINT,
  "join_team" VARCHAR
);
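The coaching length mentioned in the dataset description can be derived from this table. The sketch below assumes that "since" holds the year the coach took over the team and that "join_team" is the team name used as the join key; neither is documented in the schema, so treat both as assumptions. The coach and team names are invented placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ref_current_ncaam_coaches (
  current_coach VARCHAR, team VARCHAR, since BIGINT, join_team VARCHAR
);
-- Toy rows, invented for illustration only.
INSERT INTO ref_current_ncaam_coaches VALUES
  ('Coach X', 'Team A', 2010, 'Team A'),
  ('Coach Y', 'Team B', 2024, 'Team B');
""")

SEASON = 2025  # season to measure tenure against

# Tenure is the difference between the chosen season and the "since" year
# (assumed here to be a start year).
tenure = conn.execute(
    "SELECT join_team, current_coach, ? - since AS seasons_since_hire "
    "FROM ref_current_ncaam_coaches "
    "ORDER BY seasons_since_hire DESC",
    (SEASON,),
).fetchall()

for join_team, coach, seasons in tenure:
    print(join_team, coach, seasons)
```

Because the table lists only current coaches, a tenure computed this way is meaningful for the latest season and should not be applied to historical seasons without care.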