Airplane Crashes Since 1908
Full history of airplane crashes throughout the world, from 1908-present
@kaggle.saurograndi_airplane_crashes_since_1908
When this dataset was created on Kaggle (2016-09-09), the original version was hosted by Open Data by Socrata at https://opendata.socrata.com/Government/Airplane-Crashes-and-Fatalities-Since-1908/q2te-8cvq, but it is no longer available there. The dataset covers airplane accidents involving civil, commercial and military transport worldwide from 1908-09-17 to 2009-06-08.
While applying for a data scientist job opportunity, I was asked the following questions on this dataset:
My solution was:
The following bar charts display the answers requested by point 1. of the assignment, in particular:
The following answers address point 2 of the assignment.
I identified 7 clusters by applying k-means clustering to a matrix built from a text corpus, which was prepared with standard text-analysis steps (convert to plain text, remove punctuation, convert to lowercase, etc.).
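As a rough illustration of the steps just described (not the code behind the results reported here), the sketch below cleans a few texts, builds a document-term count matrix, and runs a plain Lloyd's k-means on it. The sample sentences, the stopword list, the helper names, and the choice of 3 clusters are all assumptions made for the toy example; the analysis above used 7 clusters.

```python
import random
import string
from collections import Counter

# Hypothetical stand-ins for crash descriptions; not taken from the dataset.
summaries = [
    "Crashed shortly after taking off.",
    "The plane crashed shortly after taking off!",
    "Engine failure; fire on the runway.",
    "Fire after engine failure on the runway.",
    "Crashed into a mountain in heavy fog.",
    "Flew into a mountain in fog.",
]

STOPWORDS = {"a", "after", "in", "into", "on", "the"}  # tiny illustrative list


def clean(text):
    """Lowercase, strip punctuation, drop stopwords, and return the tokens."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]


def term_matrix(docs):
    """Build a document-term count matrix plus its sorted vocabulary."""
    tokens = [clean(doc) for doc in docs]
    vocab = sorted({tok for doc in tokens for tok in doc})
    rows = []
    for doc in tokens:
        counts = Counter(doc)
        rows.append([float(counts[term]) for term in vocab])
    return rows, vocab


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(rows, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns one cluster label per row."""
    rng = random.Random(seed)
    centroids = [list(row) for row in rng.sample(rows, k)]
    labels = None
    for _ in range(iterations):
        # Assignment step: each row goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(row, centroids[c]))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, label in zip(rows, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


rows, vocab = term_matrix(summaries)
labels = kmeans(rows, k=3)
print(list(zip(labels, summaries)))
```

The result of k-means depends on the initial centroids, so a different seed can give a different grouping; on real data it is common to run several initializations and keep the best one.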
The following table summarizes, for each cluster, the number of crashes and deaths.
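A per-cluster summary of this kind could be computed roughly as follows. This is a minimal sketch that assumes each crash already carries a cluster label and a fatalities value that may be missing; the `records` list and the `summarize` helper are hypothetical and the numbers are not from the dataset.

```python
from collections import defaultdict

# Hypothetical (cluster label, fatalities) pairs; None marks a missing value.
records = [(1, 5.0), (2, 12.0), (1, 3.0), (3, None), (2, 40.0)]


def summarize(records):
    """Count crashes and sum deaths per cluster, skipping missing fatalities."""
    totals = defaultdict(lambda: {"crashes": 0, "deaths": 0.0})
    for cluster, fatalities in records:
        totals[cluster]["crashes"] += 1
        if fatalities is not None:
            totals[cluster]["deaths"] += fatalities
    return dict(sorted(totals.items()))


summary = summarize(records)
print(f"{'cluster':>7} {'crashes':>7} {'deaths':>7}")
for cluster, row in summary.items():
    print(f"{cluster:>7} {row['crashes']:>7} {row['deaths']:>7.0f}")
```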
The following picture shows clusters using the first 2 principal components:
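For readers who want to reproduce a plot of this kind, the sketch below projects a document-term matrix onto its first two principal components using NumPy's SVD. The matrix `X` and the helper name `first_two_components` are assumptions for illustration only; the resulting 2-D coordinates would then be scattered and colored by cluster label.

```python
import numpy as np

# Hypothetical document-term matrix: one row per text, one column per word.
X = np.array([
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 1.0],
])


def first_two_components(matrix):
    """Return 2-D PCA coordinates and the variance share of each component."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:2].T
    explained = singular_values**2 / np.sum(singular_values**2)
    return coords, explained[:2]


coords, explained = first_two_components(X)
print(coords.shape, explained)
```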
For each cluster, I summarize the most frequently used words and try to identify the cause of the crashes:
Cluster 1 (258):
aircraft, crashed, plane, shortly, taking.
Not much information about this cluster can be deduced using text analysis
Cluster 2 (500):
aircraft, airport, altitude, crashed, crew, due, engine, failed, failure, fire, flight, landing, lost, pilot, plane, runway, takeoff, taking.
Engine failure on the runway after landing or takeoff
Cluster 3 (211):
aircraft, crashed, fog
Crash caused by fog
Cluster 4 (1014):
aircraft, airport, attempting, cargo, crashed, fire, land, landing, miles, pilot, plane, route, runway, struck, takeoff
Struck a cargo during landing or takeoff
Cluster 5 (2749):
accident, aircraft, airport, altitude, approach, attempting, cargo, conditions, control, crashed, crew, due, engine, failed, failure, feet, fire, flight, flying, fog, ground, killed, land, landing, lost, low, miles, mountain, pilot, plane, poor, route, runway, short, shortly, struck, takeoff, taking, weather
Struck a cargo due to engine failure or bad weather conditions, mainly fog
Cluster 6 (195):
aircraft, crashed, engine, failure, fire, flight, left, pilot, plane, runway
Engine failure on the runway
Cluster 7 (341):
accident, aircraft, altitude, cargo, control, crashed, crew, due, engine, failure, flight, landing, loss, lost, pilot, plane, takeoff
Engine failure during landing or takeoff
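Word lists like the ones above can be obtained by counting terms within each cluster. The following is a minimal sketch under assumed inputs: the `labelled` pairs, the stopword set, and the `top_terms` helper are hypothetical, and the cut-off of two terms per cluster is arbitrary.

```python
import string
from collections import Counter, defaultdict

# Hypothetical (cluster label, text) pairs; not taken from the dataset.
labelled = [
    (3, "Crashed in fog."),
    (3, "The aircraft crashed in heavy fog."),
    (6, "Engine failure on the runway."),
    (6, "Engine fire; the plane left the runway."),
]
STOPWORDS = {"in", "on", "the"}  # tiny illustrative list


def top_terms(labelled, n=2):
    """Return the n most frequent non-stopword terms for each cluster."""
    strip_punct = str.maketrans("", "", string.punctuation)
    counts = defaultdict(Counter)
    for cluster, text in labelled:
        tokens = text.lower().translate(strip_punct).split()
        counts[cluster].update(tok for tok in tokens if tok not in STOPWORDS)
    return {
        cluster: [term for term, _ in counter.most_common(n)]
        for cluster, counter in sorted(counts.items())
    }


top = top_terms(labelled)
for cluster, terms in top.items():
    print(f"Cluster {cluster}: {', '.join(terms)}")
```

Frequent words describe what the texts in a cluster have in common, but naming a cause from them remains a manual interpretation step.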
Better solutions are welcome.
Thanks,
Sauro
CREATE TABLE airplane_crashes_and_fatalities_since_1908 (
"date" TIMESTAMP,
"time" VARCHAR,
"location" VARCHAR,
"operator" VARCHAR,
"flight" VARCHAR, -- Flight #
"route" VARCHAR,
"type" VARCHAR,
"registration" VARCHAR,
"cn_in" VARCHAR,
"aboard" DOUBLE,
"fatalities" DOUBLE,
"ground" DOUBLE,
"summary" VARCHAR
);
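To show how a table with this schema might be queried, here is a small sketch using Python's built-in `sqlite3` module. It is an assumption-laden example: only a subset of the columns is created, the inserted rows are invented placeholders, and the per-year aggregation is just one possible query, not one of the assignment questions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the schema above, kept short for the example.
conn.execute(
    """
    CREATE TABLE airplane_crashes_and_fatalities_since_1908 (
        "date" TIMESTAMP,
        "operator" VARCHAR,
        "aboard" DOUBLE,
        "fatalities" DOUBLE,
        "ground" DOUBLE
    )
    """
)

# Hypothetical rows; None stands for a missing fatalities value.
rows = [
    ("1950-01-10 00:00:00", "Operator A", 10.0, 4.0, 0.0),
    ("1950-06-02 00:00:00", "Operator B", 20.0, 20.0, 1.0),
    ("1951-03-15 00:00:00", "Operator A", 5.0, None, 0.0),
]
conn.executemany(
    "INSERT INTO airplane_crashes_and_fatalities_since_1908 VALUES (?, ?, ?, ?, ?)",
    rows,
)

# Crashes and total deaths per year; missing fatalities count as 0 in the sum.
query = """
    SELECT strftime('%Y', "date") AS year,
           COUNT(*) AS crashes,
           COALESCE(SUM("fatalities"), 0) AS deaths
    FROM airplane_crashes_and_fatalities_since_1908
    GROUP BY year
    ORDER BY year
"""
per_year = conn.execute(query).fetchall()
for year, crashes, deaths in per_year:
    print(year, crashes, deaths)
```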