Baselight

Of Genomes And Genetics

HackerEarth Machine Learning Challenge

@kaggle.aryarishabh_of_genomes_and_genetics_hackerearth_ml_challenge


About this Dataset

Context

Since the dawn of human life, the global population has grown rapidly. It was estimated at 1 billion people in 1800 and had reached 6 billion by the turn of the twenty-first century. Roughly 227,000 people are added to the world every day, and projections suggest that the world's population may exceed 11 billion by the end of the 21st century.

According to reports, the unsustainable increase in population, combined with a lack of access to adequate health care, food, and shelter, has led to a rise in the number of genetic disorders. Hereditary illnesses are becoming more common due to a lack of understanding about the need for genetic testing. Children often die as a result of these illnesses, so genetic testing during pregnancy is critical.

Task

You have been hired as a Machine Learning Engineer by a government agency. You are given a dataset that contains medical information about children who have genetic disorders. Your task is to predict the following:

Genetic disorder
Disorder subclass

Dataset Details

The dataset folder contains the following files:
train.csv: 22083 x 45
test.csv: 9465 x 43
sample_submission.csv: 5 x 3
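
The sizes above read as rows x columns, which matches the table summaries further down (for example, 22,083 rows and 45 columns for train). A minimal sketch for confirming the shape of a downloaded file, assuming it is a standard comma-separated file with a single header row. The helper name csv_shape is hypothetical and the demo data is made up.

```python
import csv
import io


def csv_shape(stream):
    """Return (data_rows, columns) for a CSV stream with one header row."""
    reader = csv.reader(stream)
    header = next(reader, None)
    if header is None:
        # Empty input: no header and no rows.
        return (0, 0)
    # Count data rows only; the header is excluded from the row count.
    n_rows = sum(1 for _ in reader)
    return (n_rows, len(header))


# Tiny in-memory stand-in for a real file (values are invented).
demo = io.StringIO(
    "patient_id,genetic_disorder,disorder_subclass\nP1,A,B\nP2,C,D\n"
)
shape = csv_shape(demo)
```

For a file on disk, open it with open(path, newline="") and pass the handle; if the listed sizes are accurate, train.csv should give (22083, 45).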

Check details of each attribute of the dataset here

Source

https://www.hackerearth.com/challenges/competitive/hackerearth-machine-learning-challenge-genetic-testing/

Submission

Participants are encouraged to submit their solutions here

Tables

Sample Submission

@kaggle.aryarishabh_of_genomes_and_genetics_hackerearth_ml_challenge.sample_submission
  • 3.26 kB
  • 5 rows
  • 3 columns
CREATE TABLE sample_submission (
  "patient_id" VARCHAR,
  "genetic_disorder" VARCHAR,
  "disorder_subclass" VARCHAR
);
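
The submission table pairs each patient ID with the two predicted labels. A minimal sketch of writing predictions in that three-column layout, assuming the column names shown in the table above (the header text in the original CSV may differ). The function name write_submission and the example IDs and labels are hypothetical.

```python
import csv
import io

# Column names follow the sample_submission table; the original CSV header
# spelling may differ (assumption).
SUBMISSION_COLUMNS = ["patient_id", "genetic_disorder", "disorder_subclass"]


def write_submission(predictions, stream):
    """Write (patient_id, disorder, subclass) triples as CSV rows."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(SUBMISSION_COLUMNS)
    for patient_id, disorder, subclass in predictions:
        writer.writerow([patient_id, disorder, subclass])


# Placeholder predictions; the IDs and labels are invented for illustration.
example_predictions = [
    ("PID_0001", "disorder_a", "subclass_x"),
    ("PID_0002", "disorder_b", "subclass_y"),
]
buffer = io.StringIO()
write_submission(example_predictions, buffer)
submission_text = buffer.getvalue()
```

To write to disk instead, pass a handle from open("submission.csv", "w", newline="") in place of the in-memory buffer.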

Test

@kaggle.aryarishabh_of_genomes_and_genetics_hackerearth_ml_challenge.test
  • 461.56 kB
  • 9,465 rows
  • 43 columns
CREATE TABLE test (
  "patient_id" VARCHAR,
  "patient_age" BIGINT,
  "genes_in_mother_s_side" VARCHAR,  -- Genes In Mother's Side
  "inherited_from_father" VARCHAR,
  "maternal_gene" VARCHAR,
  "paternal_gene" VARCHAR,
  "blood_cell_count_mcl" DOUBLE,  -- Blood Cell Count (mcL)
  "patient_first_name" VARCHAR,
  "family_name" VARCHAR,
  "father_s_name" VARCHAR,  -- Father's Name
  "mother_s_age" BIGINT,  -- Mother's Age
  "father_s_age" BIGINT,  -- Father's Age
  "institute_name" VARCHAR,
  "location_of_institute" VARCHAR,
  "status" VARCHAR,
  "respiratory_rate_breaths_min" VARCHAR,  -- Respiratory Rate (breaths/min)
  "heart_rate_rates_min" VARCHAR,  -- Heart Rate (rates/min
  "test_1" BIGINT,
  "test_2" BIGINT,
  "test_3" BIGINT,
  "test_4" BIGINT,
  "test_5" BIGINT,
  "parental_consent" VARCHAR,
  "follow_up" VARCHAR,
  "gender" VARCHAR,
  "birth_asphyxia" VARCHAR,
  "autopsy_shows_birth_defect_if_applicable" VARCHAR,  -- Autopsy Shows Birth Defect (if Applicable)
  "place_of_birth" VARCHAR,
  "folic_acid_details_peri_conceptional" VARCHAR,  -- Folic Acid Details (peri-conceptional)
  "h_o_serious_maternal_illness" VARCHAR,
  "h_o_radiation_exposure_x_ray" VARCHAR,  -- H/O Radiation Exposure (x-ray)
  "h_o_substance_abuse" VARCHAR,
  "assisted_conception_ivf_art" VARCHAR,
  "history_of_anomalies_in_previous_pregnancies" VARCHAR,
  "no_of_previous_abortion" BIGINT,  -- No. Of Previous Abortion
  "birth_defects" VARCHAR,
  "white_blood_cell_count_thousand_per_microliter" DOUBLE,  -- White Blood Cell Count (thousand Per Microliter)
  "blood_test_result" VARCHAR,
  "symptom_1" BOOLEAN,
  "symptom_2" BOOLEAN,
  "symptom_3" BOOLEAN,
  "symptom_4" BOOLEAN,
  "symptom_5" BOOLEAN
);

Train

@kaggle.aryarishabh_of_genomes_and_genetics_hackerearth_ml_challenge.train
  • 1.14 MB
  • 22,083 rows
  • 45 columns
CREATE TABLE train (
  "patient_id" VARCHAR,
  "patient_age" DOUBLE,
  "genes_in_mother_s_side" VARCHAR,  -- Genes In Mother's Side
  "inherited_from_father" VARCHAR,
  "maternal_gene" VARCHAR,
  "paternal_gene" VARCHAR,
  "blood_cell_count_mcl" DOUBLE,  -- Blood Cell Count (mcL)
  "patient_first_name" VARCHAR,
  "family_name" VARCHAR,
  "father_s_name" VARCHAR,  -- Father's Name
  "mother_s_age" DOUBLE,  -- Mother's Age
  "father_s_age" DOUBLE,  -- Father's Age
  "institute_name" VARCHAR,
  "location_of_institute" VARCHAR,
  "status" VARCHAR,
  "respiratory_rate_breaths_min" VARCHAR,  -- Respiratory Rate (breaths/min)
  "heart_rate_rates_min" VARCHAR,  -- Heart Rate (rates/min
  "test_1" DOUBLE,
  "test_2" DOUBLE,
  "test_3" DOUBLE,
  "test_4" DOUBLE,
  "test_5" DOUBLE,
  "parental_consent" VARCHAR,
  "follow_up" VARCHAR,
  "gender" VARCHAR,
  "birth_asphyxia" VARCHAR,
  "autopsy_shows_birth_defect_if_applicable" VARCHAR,  -- Autopsy Shows Birth Defect (if Applicable)
  "place_of_birth" VARCHAR,
  "folic_acid_details_peri_conceptional" VARCHAR,  -- Folic Acid Details (peri-conceptional)
  "h_o_serious_maternal_illness" VARCHAR,
  "h_o_radiation_exposure_x_ray" VARCHAR,  -- H/O Radiation Exposure (x-ray)
  "h_o_substance_abuse" VARCHAR,
  "assisted_conception_ivf_art" VARCHAR,
  "history_of_anomalies_in_previous_pregnancies" VARCHAR,
  "no_of_previous_abortion" DOUBLE,  -- No. Of Previous Abortion
  "birth_defects" VARCHAR,
  "white_blood_cell_count_thousand_per_microliter" DOUBLE,  -- White Blood Cell Count (thousand Per Microliter)
  "blood_test_result" VARCHAR,
  "symptom_1" DOUBLE,
  "symptom_2" DOUBLE,
  "symptom_3" DOUBLE,
  "symptom_4" DOUBLE,
  "symptom_5" DOUBLE,
  "genetic_disorder" VARCHAR,
  "disorder_subclass" VARCHAR
);
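
Comparing the two schemas, train carries two extra columns (genetic_disorder and disorder_subclass, the prediction targets), and several shared columns differ in type: the age columns, test_1 to test_5, and no_of_previous_abortion are DOUBLE in train but BIGINT in test, while symptom_1 to symptom_5 are DOUBLE in train but BOOLEAN in test. A minimal sketch of coercing such cells to a common float representation before modeling. The accepted string spellings and the choice to treat empty cells as missing are assumptions, and to_float is a hypothetical helper.

```python
def to_float(value):
    """Coerce a raw cell to float, or None when the cell is missing."""
    if value is None:
        return None
    # Check bool before anything else: True/False map to 1.0/0.0.
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    text = str(value).strip().lower()
    if text == "":
        # Assumption: an empty cell means the value is missing.
        return None
    if text in ("true", "false"):
        return 1.0 if text == "true" else 0.0
    return float(text)


# The same symptom flag as it might appear in train (DOUBLE) and test (BOOLEAN).
train_cell = to_float("1.0")
test_cell = to_float("true")
missing_cell = to_float("")
```

With both tables mapped through the same function, a symptom flag of 1.0 in train and true in test end up as the same feature value.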
