Titanic With Names Replaced By Honorifics
@kaggle.pietromaldini1_titanic_with_names_replaced_by_honorifics
This is a simple variant of the Titanic survival dataset.
The main difference is that each passenger's name is replaced by its honorific alone, which should make the feature easier to use. Only 4 honorifics are retained; the less common ones are grouped under a new "Rare" honorific.
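As a rough illustration of this kind of honorific extraction, here is a minimal sketch. It assumes the names follow the original Titanic format ("Surname, Title. Given names") and that the four retained honorifics are Mr, Mrs, Miss and Master; the description does not list them, so `KEPT_HONORIFICS` and `extract_honorific` are hypothetical names, not the author's code.

```python
import re

# Assumption: these are the four retained honorifics. The dataset
# description only says that four are kept, not which ones.
KEPT_HONORIFICS = {"Mr", "Mrs", "Miss", "Master"}


def extract_honorific(name: str) -> str:
    """Return the honorific found in a 'Surname, Title. Given names' string.

    Any title outside KEPT_HONORIFICS, or a name that does not match the
    expected pattern, is mapped to "Rare".
    """
    # Capture the text between the first comma and the following period.
    match = re.search(r",\s*([^.,]+)\.", name)
    if match is None:
        return "Rare"
    title = match.group(1).strip()
    return title if title in KEPT_HONORIFICS else "Rare"
```

For example, `extract_honorific("Braund, Mr. Owen Harris")` gives `"Mr"`, while a title such as "the Countess" falls into `"Rare"`.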
Two columns replace the Cabin column: cabin_code holds the code of the passenger's first cabin, and cabin_amount holds the number of cabins listed for that passenger. If the Cabin is not available, a new cabin category "N" (not available) is used.
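The cabin split could look roughly like the sketch below. Two things are assumptions rather than facts from the description: that the "code" is the leading deck letter of the first listed cabin, and that a missing cabin gets a count of 0. `split_cabin` is a hypothetical helper name.

```python
from typing import Optional, Tuple


def split_cabin(cabin: Optional[str]) -> Tuple[str, int]:
    """Map a raw Cabin value such as "C23 C25 C27" to (cabin_code, cabin_amount)."""
    if cabin is None or not cabin.strip():
        # "N" is the category the description uses for unavailable cabins.
        # Assumption: the cabin count is 0 in that case.
        return "N", 0
    cabins = cabin.split()
    # Assumption: the code is the first character (deck letter) of the
    # first cabin in the whitespace-separated list.
    return cabins[0][0], len(cabins)
```

With these assumptions, `split_cabin("C23 C25 C27")` returns `("C", 3)` and `split_cabin(None)` returns `("N", 0)`.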
Lastly, an indicator column (missing_age) flags missing age values, and the missing values themselves are filled with -1.
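A minimal sketch of this age encoding follows. It assumes the indicator is 1 when the age was missing and 0 otherwise, which the description does not state explicitly; `encode_age` is a hypothetical name.

```python
import math
from typing import Optional, Tuple


def encode_age(age: Optional[float]) -> Tuple[float, int]:
    """Return (age, missing_age) for one passenger.

    A missing age (None or NaN) becomes -1.0 with the indicator set to 1.
    Assumption: 1 means "missing" and 0 means "present".
    """
    if age is None or math.isnan(age):
        return -1.0, 1
    return float(age), 0
```

Because -1 is a sentinel rather than a real age, any statistic over the age column (a mean, for instance) should probably skip rows where missing_age is set.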
Open to feedback and suggestions
CREATE TABLE test (
"passengerid" BIGINT,
"pclass" BIGINT,
"sex" VARCHAR,
"age" DOUBLE,
"sibsp" BIGINT,
"parch" BIGINT,
"ticket" VARCHAR,
"fare" DOUBLE,
"embarked" VARCHAR,
"honorific" VARCHAR,
"cabin_code" VARCHAR,
"cabin_amount" BIGINT,
"missing_age" BIGINT
);

CREATE TABLE train (
"passengerid" BIGINT,
"survived" BIGINT,
"pclass" BIGINT,
"sex" VARCHAR,
"age" DOUBLE,
"sibsp" BIGINT,
"parch" BIGINT,
"ticket" VARCHAR,
"fare" DOUBLE,
"embarked" VARCHAR,
"honorific" VARCHAR,
"cabin_code" VARCHAR,
"cabin_amount" BIGINT,
"missing_age" BIGINT
);