Large Passenger Plane Crashes 1933-2009
Clustered by High and Low Fatality
@kaggle.juancarlosventosa_large_passenger_plane_crashes_19332009
This small dataset contains information on 456 large passenger plane crashes that occurred between 1933 and 2009. It was derived from the popular Airplane_Crashes_and_Fatalities_Since_1908.csv data made available on Kaggle by Sauro Grandi. In this version, large passenger plane crashes were extracted from the original dataset and labeled as either “High Fatality” or “Low Fatality” based on each crash's ratio of survivors to total people aboard. The labels were determined by K-means clustering (see Surviving Air Disasters: Cluster Analysis).
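As a rough illustration of how a two-cluster split on the survival ratio can work, here is a minimal sketch of K-means (Lloyd's algorithm) with k = 2 on one-dimensional data. The description above does not state which features, initialisation, or library the original analysis used, so the function names `two_means_1d` and `label`, the extreme-value initialisation, and the sample rates are all assumptions made for illustration only.

```python
def two_means_1d(values, iterations=100):
    """Lloyd's algorithm for k=2 on one-dimensional data.

    Returns the (low, high) centroids. Initialising the centroids at the
    minimum and maximum value is an assumption of this sketch.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        # Assignment step: each value joins the nearer centroid (ties go low).
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Update step: each centroid moves to the mean of its group.
        new_lo = sum(low_group) / len(low_group)
        new_hi = sum(high_group) / len(high_group) if high_group else hi
        if (new_lo, new_hi) == (lo, hi):
            break  # converged: centroids no longer move
        lo, hi = new_lo, new_hi
    return lo, hi


def label(rate, lo, hi):
    """Map a survival rate to a cluster label; a low rate means high fatality."""
    return "High Fatality" if abs(rate - lo) <= abs(rate - hi) else "Low Fatality"


# Made-up survival rates (survivors / aboard), not taken from the dataset.
rates = [0.0, 0.0, 0.05, 0.10, 0.80, 0.90, 1.0]
lo, hi = two_means_1d(rates)
labels = [label(r, lo, hi) for r in rates]
print(lo, hi)
print(labels)
```

With only one feature and two clusters, the result amounts to a single threshold halfway between the two centroids, which is why crashes fall cleanly into a low-survival and a high-survival group.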
Date: Date of accident
Time: Local time in 24-hour format (hh:mm)
Location: Location of the accident
Operator: Airline or operator of the aircraft
Flight: Flight number assigned by the aircraft operator
Route: Complete or partial route flown prior to the accident
Type: Aircraft type
Registration: ICAO registration of the aircraft
cn/In: Construction or serial number / Line or fuselage number
Aboard: Total people aboard
Fatalities: Total fatalities aboard
Ground: Total killed on the ground
Survivors: Total survivors aboard
Survival Rate: Survivors divided by Aboard, stored as a float (the proportion of people aboard who survived)
Summary: Brief description of the accident and cause if known
ClustID: Cluster label describing the fatalities aboard; one of two string values ("High Fatality", "Low Fatality")
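The two derived columns in the list above can be sketched as follows. This is a minimal example, assuming that Survivors equals Aboard minus Fatalities (the list does not state this explicitly) and that Ground deaths are excluded because Fatalities counts only people aboard. The function name `survival_fields` is hypothetical.

```python
def survival_fields(aboard, fatalities):
    """Return (survivors, survival_rate) for one crash record.

    Assumption: survivors = aboard - fatalities. Deaths on the ground are
    not part of either figure.
    """
    if aboard <= 0:
        raise ValueError("aboard must be positive to compute a rate")
    if not 0 <= fatalities <= aboard:
        raise ValueError("fatalities must be between 0 and aboard")
    survivors = aboard - fatalities
    return survivors, survivors / aboard


# Made-up record: 120 people aboard, 90 of whom died.
survivors, rate = survival_fields(aboard=120, fatalities=90)
print(survivors, rate)
```

Rejecting records with zero people aboard avoids a division by zero; whether the dataset contains such rows is not stated here.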
This version was inspired by the many fellow Kagglers who analyzed this popular dataset and shared their insights. I added it to compare High Fatality and Low Fatality large passenger plane crashes and to investigate possible causes.
CREATE TABLE large_passenger_plane_crashes_1933_to_2009 (
"date" TIMESTAMP,
"time" VARCHAR,
"location" VARCHAR,
"operator" VARCHAR,
"flight" VARCHAR, -- Flight..
"route" VARCHAR,
"type" VARCHAR,
"registration" VARCHAR,
"cn_in" VARCHAR,
"aboard" BIGINT,
"fatalities" BIGINT,
"ground" BIGINT,
"survivors" BIGINT,
"survivalrate" DOUBLE,
"summary" VARCHAR,
"clustid" VARCHAR
);
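To show how the schema above supports the High versus Low Fatality comparison, here is a hedged sketch that runs a grouped query through Python's built-in sqlite3 module. SQLite is only a stand-in engine chosen so the example is self-contained; the table here keeps just five of the columns, and every inserted row is invented for illustration rather than taken from the dataset.

```python
import sqlite3

TABLE = "large_passenger_plane_crashes_1933_to_2009"

conn = sqlite3.connect(":memory:")
# Reduced copy of the schema: only the columns the query needs.
conn.execute(
    f"""
    CREATE TABLE {TABLE} (
        "aboard" BIGINT,
        "fatalities" BIGINT,
        "survivors" BIGINT,
        "survivalrate" DOUBLE,
        "clustid" VARCHAR
    )
    """
)

# Invented rows, for illustration only.
rows = [
    (100, 100, 0, 0.0, "High Fatality"),
    (80, 60, 20, 0.25, "High Fatality"),
    (150, 15, 135, 0.9, "Low Fatality"),
    (60, 0, 60, 1.0, "Low Fatality"),
]
conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?, ?)", rows)

# Per-cluster crash count, total fatalities aboard, and mean survival rate.
summary = conn.execute(
    f"""
    SELECT "clustid",
           COUNT(*) AS crashes,
           SUM("fatalities") AS total_fatalities,
           AVG("survivalrate") AS mean_survival_rate
    FROM {TABLE}
    GROUP BY "clustid"
    ORDER BY "clustid"
    """
).fetchall()
conn.close()

for clustid, crashes, total_fatalities, mean_rate in summary:
    print(clustid, crashes, total_fatalities, round(mean_rate, 3))
```

The same SELECT statement should carry over to other SQL engines with little change, since it uses only standard aggregates and the quoted column names from the schema.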