Context
Preprocessed dataset for the Titanic disaster competition.
Content
The train_set contains the following features: PassengerId, Survived, Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, CabinFloor, CabinNumber, FamilySize, IsAlone, Title.
- Age, Fare and Title have been simplified into different categories.
- Cabin has been split into CabinFloor and CabinNumber.
- CabinNumber has been simplified into two sides.
- SibSp and Parch have been used to create the new features FamilySize and IsAlone.
- Name, Cabin and Ticket have been dropped.
All features have been mapped to floats.
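The steps above could be reproduced roughly as in the sketch below. It is only an illustration and not the exact code used to build this dataset: the bin edges, the deck-letter codes, the FamilySize formula (SibSp + Parch + 1), and the reading of "two sides" as odd versus even cabin numbers are all assumptions, and the names AGE_EDGES, DECK_CODES, to_category and derive_features are hypothetical.

```python
import bisect
import re

# Hypothetical bin edges for Age; the real dataset may use different ones.
AGE_EDGES = [16, 32, 48, 64]

# Hypothetical float codes for deck letters (A=1.0, B=2.0, ...); 0.0 means unknown.
DECK_CODES = {letter: float(i) for i, letter in enumerate("ABCDEFGT", start=1)}


def to_category(value, edges):
    """Map a numeric value to the index of its bin, returned as a float."""
    return float(bisect.bisect_right(edges, value))


def derive_features(row):
    """Derive the engineered columns from one raw row.

    `row` is a plain dict with the raw columns SibSp, Parch and Cabin;
    a missing cabin is expected to be None or an empty string.
    """
    # Assumption: family size counts the passenger plus all relatives aboard.
    family_size = row["SibSp"] + row["Parch"] + 1
    is_alone = 1.0 if family_size == 1 else 0.0

    # Assumption: only the first cabin listed is used.
    cabin = row.get("Cabin") or ""
    match = re.match(r"([A-Za-z])\s*(\d+)?", cabin)
    cabin_floor, cabin_side = 0.0, 0.0
    if match:
        cabin_floor = DECK_CODES.get(match.group(1).upper(), 0.0)
        number = match.group(2)
        if number:
            # Assumption: the "two sides" are odd (1.0) and even (2.0) numbers.
            cabin_side = 1.0 if int(number) % 2 else 2.0

    return {
        "FamilySize": float(family_size),
        "IsAlone": is_alone,
        "CabinFloor": cabin_floor,
        "CabinNumber": cabin_side,
    }
```

For example, derive_features({"SibSp": 1, "Parch": 0, "Cabin": "C85"}) gives a FamilySize of 2.0 and an IsAlone of 0.0, and to_category(22, AGE_EDGES) places a 22-year-old in the second age bin under these assumed edges.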
Acknowledgements
Most of the ideas come from tutorials.