Adult Census Income Data
The dataset contains a training data file, a test data file, and a description file
@kaggle.mikolajbabula_adult_census_income_data
This dataset contains three files:
adult_data.csv - train dataset
adult_test.csv - test dataset
adult_descr.csv - file with description of the data
In my kernel, I start by exploring what the dataset contains: which features it includes and what information can be extracted and compared. Next, I clean and prepare the data into a form suitable for the models I test. Starting with basic classification models, moving through hyperparameter tuning, and ending with boosting algorithms, I try to find the best model, which is finally evaluated on the test dataset.
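The cleaning step mentioned above can be sketched roughly as follows. This is a minimal illustration, not the kernel's actual code: the helper names `clean_row` and `encode_income` are hypothetical, and it assumes that fields carry stray whitespace, that missing values are marked with `?`, and that income labels may end with a trailing period in the test file.

```python
# Assumption: missing values in the raw files are marked with "?".
MISSING_TOKEN = "?"


def clean_row(raw_fields):
    """Strip whitespace from each field and replace the missing token with None."""
    cleaned = []
    for field in raw_fields:
        value = field.strip()
        cleaned.append(None if value == MISSING_TOKEN else value)
    return cleaned


def encode_income(label):
    """Map an income label to 1 (>50K) or 0 (<=50K).

    Assumption: a trailing period, if present, is not part of the label.
    """
    normalized = label.strip().rstrip(".")
    if normalized == ">50K":
        return 1
    if normalized == "<=50K":
        return 0
    raise ValueError(f"unexpected income label: {label!r}")


row = clean_row([" State-gov", " ?", "39"])
target = encode_income(" <=50K")
```

Here `row` holds the stripped values with the missing field set to `None`, and `target` is the binary label a classifier would be trained on.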
CREATE TABLE adult_data (
"n_39" BIGINT -- 39,
"n__state_gov" VARCHAR -- State-gov,
"n__77516" BIGINT -- 77516,
"n__bachelors" VARCHAR -- Bachelors,
"n__13" BIGINT -- 13,
"n__never_married" VARCHAR -- Never-married,
"n__adm_clerical" VARCHAR -- Adm-clerical,
"n__not_in_family" VARCHAR -- Not-in-family,
"n__white" VARCHAR -- White,
"n__male" VARCHAR -- Male,
"n__2174" BIGINT -- 2174,
"n__0" BIGINT -- 0,
"n__40" BIGINT -- 40,
"n__united_states" VARCHAR -- United-States,
"n__50k" VARCHAR -- <=50K
);

CREATE TABLE adult_descr (
"unnamed_0" VARCHAR -- Unnamed: 0,
"n__this_was_extracted_from_the_census_bureau_database_found_at" VARCHAR -- This Data Was Extracted From The Census Bureau Database Found At
);

CREATE TABLE adult_test (
"unnamed_0" VARCHAR -- Unnamed: 0,
"n_1x3_cross_validator" VARCHAR -- 1x3 Cross Validator
);
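The column names in the `adult_data` schema look like values from the first record (39, State-gov, 77516, ...), which suggests the CSV has no header row. A minimal sketch of reading such a file with explicit column names is shown below. The names in `COLUMNS` are an assumption based on the conventional Adult census attribute names and do not appear in the files themselves; `read_adult` is a hypothetical helper, and the sample row is taken from the schema comments above.

```python
import csv
import io

# Assumption: conventional Adult census attribute names, in file order.
COLUMNS = [
    "age", "workclass", "fnlwgt", "education", "education_num",
    "marital_status", "occupation", "relationship", "race", "sex",
    "capital_gain", "capital_loss", "hours_per_week", "native_country",
    "income",
]

# The six columns typed BIGINT in the schema above.
NUMERIC = {
    "age", "fnlwgt", "education_num",
    "capital_gain", "capital_loss", "hours_per_week",
}


def read_adult(handle):
    """Read headerless Adult rows into dicts keyed by COLUMNS."""
    # skipinitialspace drops the space that follows each comma.
    reader = csv.reader(handle, skipinitialspace=True)
    records = []
    for fields in reader:
        if len(fields) != len(COLUMNS):
            continue  # skip blank or malformed lines
        record = dict(zip(COLUMNS, fields))
        for name in NUMERIC:
            record[name] = int(record[name])
        records.append(record)
    return records


# First record, as it appears in the schema comments.
sample = io.StringIO(
    "39, State-gov, 77516, Bachelors, 13, Never-married, Adm-clerical, "
    "Not-in-family, White, Male, 2174, 0, 40, United-States, <=50K\n"
)
records = read_adult(sample)
```

To read the real file, the same function could be called with `open("adult_data.csv", newline="")` in place of the in-memory sample. Lines that do not have exactly 15 fields are skipped, which also covers a non-data first line such as the one the `adult_test` schema hints at.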