MLB Predict Home Runs
@kaggle.jcraggy_baseball
Try to find the best predictors indicative of a home run, reported as log loss
Log loss scores a classifier's predicted probabilities, penalizing confident wrong predictions most heavily.
Lower is better: minimizing log loss rewards well-calibrated probabilities, which usually, though not necessarily, goes along with higher accuracy.
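As a rough illustration of how the metric behaves, here is a minimal sketch of binary log loss in plain Python. The labels and probabilities are made up for the example and do not come from the data set; the function name `log_loss` and the clipping value `eps` are assumptions of this sketch, not part of the competition's scoring code.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary log loss. Probabilities are clipped away from 0 and 1
    so that math.log never receives 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical labels (1 = home run) and predicted probabilities.
labels = [1, 0, 0, 0]
confident_right = [0.9, 0.1, 0.1, 0.1]
hedged = [0.5, 0.5, 0.5, 0.5]
confident_wrong = [0.1, 0.9, 0.9, 0.9]

for name, probs in [
    ("confident and right", confident_right),
    ("hedged at 0.5", hedged),
    ("confident and wrong", confident_wrong),
]:
    print(f"{name}: {log_loss(labels, probs):.4f}")
```

The confident and correct predictions score lowest, the constant 0.5 guess scores ln(2), and the confident but wrong predictions score far worse than either.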
Data set adapted from the SLICED competition, Season 01 Episode 09: https://www.kaggle.com/c/sliced-s01e09-playoffs-1
Those competitions last only two hours, so time is limited. Here we can use the data set and take more time for analysis.
Adapted largely from David Robinson's work on YouTube. His modeling techniques are greatly appreciated as a guide to traversing the tidyverse.
Numeric predictors, categorical predictors, hybrid, how low can we go?
test.csv: same as train.csv but without the target variable is_home_run
CREATE TABLE park_dimensions (
"park" BIGINT,
"name" VARCHAR,
"cover" VARCHAR,
"lf_dim" BIGINT,
"cf_dim" BIGINT,
"rf_dim" BIGINT,
"lf_w" BIGINT,
"cf_w" BIGINT,
"rf_w" BIGINT
);
CREATE TABLE test (
"bip_id" BIGINT,
"game_date" TIMESTAMP,
"home_team" VARCHAR,
"away_team" VARCHAR,
"batter_team" VARCHAR,
"batter_name" VARCHAR,
"pitcher_name" VARCHAR,
"batter_id" BIGINT,
"pitcher_id" BIGINT,
"is_batter_lefty" BIGINT,
"is_pitcher_lefty" BIGINT,
"bb_type" VARCHAR,
"bearing" VARCHAR,
"pitch_name" VARCHAR,
"park" BIGINT,
"inning" BIGINT,
"outs_when_up" BIGINT,
"balls" BIGINT,
"strikes" BIGINT,
"plate_x" DOUBLE,
"plate_z" DOUBLE,
"pitch_mph" DOUBLE,
"launch_speed" DOUBLE,
"launch_angle" DOUBLE
);
CREATE TABLE train (
"bip_id" BIGINT,
"game_date" TIMESTAMP,
"home_team" VARCHAR,
"away_team" VARCHAR,
"batter_team" VARCHAR,
"batter_name" VARCHAR,
"pitcher_name" VARCHAR,
"batter_id" BIGINT,
"pitcher_id" BIGINT,
"is_batter_lefty" BIGINT,
"is_pitcher_lefty" BIGINT,
"bb_type" VARCHAR,
"bearing" VARCHAR,
"pitch_name" VARCHAR,
"park" BIGINT,
"inning" BIGINT,
"outs_when_up" BIGINT,
"balls" BIGINT,
"strikes" BIGINT,
"plate_x" DOUBLE,
"plate_z" DOUBLE,
"pitch_mph" DOUBLE,
"launch_speed" DOUBLE,
"launch_angle" DOUBLE,
"is_home_run" BIGINT
);
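The schemas above suggest that the `park` column in train and test is a key into park_dimensions, although the page does not state this explicitly. Under that assumption, here is a minimal sketch of attaching park dimensions to each batted ball, using Python's built-in `sqlite3` module with a subset of the columns. All rows are invented for illustration and are not real values from the data set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the columns listed above; SQLite accepts these type names.
conn.executescript("""
CREATE TABLE park_dimensions (
  "park" BIGINT, "name" VARCHAR,
  "lf_dim" BIGINT, "cf_dim" BIGINT, "rf_dim" BIGINT
);
CREATE TABLE train (
  "bip_id" BIGINT, "park" BIGINT,
  "launch_speed" DOUBLE, "launch_angle" DOUBLE, "is_home_run" BIGINT
);
""")

# Hypothetical rows, for illustration only.
conn.executemany(
    "INSERT INTO park_dimensions VALUES (?, ?, ?, ?, ?)",
    [(0, "Park A", 330, 400, 330), (1, "Park B", 310, 420, 302)],
)
conn.executemany(
    "INSERT INTO train VALUES (?, ?, ?, ?, ?)",
    [(1, 0, 104.2, 27.0, 1), (2, 0, 88.1, 12.0, 0), (3, 1, 99.5, 31.0, 0)],
)

# LEFT JOIN keeps every batted ball even if its park has no dimensions row.
rows = conn.execute("""
SELECT t.bip_id, p.name, p.cf_dim,
       t.launch_speed, t.launch_angle, t.is_home_run
FROM train AS t
LEFT JOIN park_dimensions AS p ON p.park = t.park
ORDER BY t.bip_id
""").fetchall()

for row in rows:
    print(row)
```

A left join is used here so that no training rows are dropped during feature building; an inner join would be the alternative if rows without park information should be excluded.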