Esophageal Cancer Dataset
Comprehensive Esophageal Cancer Dataset for AI-Driven Early Detection & Research
@kaggle.abhinaba1biswas_esophageal_cancer_dataset
Esophageal cancer remains one of the most aggressive cancers with a high mortality rate worldwide, presenting significant challenges for early detection and effective treatment. To support the global fight against this disease, we introduce a comprehensive clinical dataset on esophageal cancer, available on Kaggle. This dataset includes patient demographics, clinical data, and cancer-specific attributes that can be leveraged to develop AI models for detection, prognosis, and treatment planning.
This dataset is a valuable resource for healthcare professionals and researchers working on cancer detection, personalized treatments, and prognosis models.
The Esophageal Cancer Dataset provides high-quality, comprehensive clinical data, essential for advancing research in esophageal cancer detection, treatment, and prognosis. We encourage the research community to utilize this dataset to drive innovation and improve patient outcomes.
CREATE TABLE esophageal_dataset (
"unnamed_0" BIGINT, -- Unnamed: 0
"patient_barcode" VARCHAR,
"tissue_source_site" VARCHAR,
"patient_id" VARCHAR,
"bcr_patient_uuid" VARCHAR,
"informed_consent_verified" VARCHAR,
"icd_o_3_site" VARCHAR,
"icd_o_3_histology" VARCHAR,
"icd_10" VARCHAR,
"tissue_prospective_collection_indicator" VARCHAR,
"tissue_retrospective_collection_indicator" VARCHAR,
"days_to_birth" BIGINT,
"country_of_birth" VARCHAR,
"gender" VARCHAR,
"height" DOUBLE,
"weight" DOUBLE,
"country_of_procurement" VARCHAR,
"state_province_of_procurement" VARCHAR,
"city_of_procurement" VARCHAR,
"race_list" VARCHAR,
"ethnicity" VARCHAR,
"other_dx" VARCHAR,
"history_of_neoadjuvant_treatment" VARCHAR,
"person_neoplasm_cancer_status" VARCHAR,
"vital_status" VARCHAR,
"days_to_last_followup" DOUBLE,
"days_to_death" DOUBLE,
"tobacco_smoking_history" DOUBLE,
"age_began_smoking_in_years" DOUBLE,
"stopped_smoking_year" DOUBLE,
"number_pack_years_smoked" DOUBLE,
"alcohol_history_documented" VARCHAR,
"frequency_of_alcohol_consumption" DOUBLE,
"amount_of_alcohol_consumption_per_day" DOUBLE,
"reflux_history" VARCHAR,
"antireflux_treatment_types" VARCHAR,
"h_pylori_infection" VARCHAR,
"initial_diagnosis_by" VARCHAR,
"barretts_esophagus" VARCHAR,
"goblet_cells_present" VARCHAR,
"history_of_esophageal_cancer" VARCHAR,
"number_of_relatives_diagnosed" DOUBLE,
"has_new_tumor_events_information" VARCHAR,
"day_of_form_completion" BIGINT,
"month_of_form_completion" BIGINT,
"year_of_form_completion" BIGINT,
"has_follow_ups_information" VARCHAR,
"has_drugs_information" VARCHAR,
"has_radiations_information" VARCHAR,
"project" VARCHAR,
"stage_event_system_version" VARCHAR,
"stage_event_clinical_stage" VARCHAR,
"stage_event_pathologic_stage" VARCHAR,
"stage_event_tnm_categories" VARCHAR,
"stage_event_psa" VARCHAR,
"stage_event_gleason_grading" VARCHAR,
"stage_event_ann_arbor" VARCHAR,
"stage_event_serum_markers" VARCHAR,
"stage_event_igcccg_stage" VARCHAR,
"stage_event_masaoka_stage" VARCHAR,
"primary_pathology_tumor_tissue_site" VARCHAR,
"primary_pathology_esophageal_tumor_cental_location" VARCHAR,
"primary_pathology_esophageal_tumor_involvement_sites" VARCHAR,
"primary_pathology_histological_type" VARCHAR,
"primary_pathology_columnar_metaplasia_present" VARCHAR,
"primary_pathology_columnar_mucosa_goblet_cell_present" VARCHAR,
"primary_pathology_columnar_mucosa_dysplasia" VARCHAR,
"primary_pathology_neoplasm_histologic_grade" VARCHAR,
"primary_pathology_days_to_initial_pathologic_diagnosis" BIGINT,
"primary_pathology_age_at_initial_pathologic_diagnosis" BIGINT,
"primary_pathology_year_of_initial_pathologic_diagnosis" DOUBLE,
"primary_pathology_initial_pathologic_diagnosis_method" VARCHAR,
"primary_pathology_init_pathology_dx_method_other" VARCHAR,
"primary_pathology_lymph_node_metastasis_radiographic_evidence" VARCHAR,
"primary_pathology_primary_lymph_node_presentation_assessment" VARCHAR,
"primary_pathology_lymph_node_examined_count" DOUBLE,
"primary_pathology_number_of_lymphnodes_positive_by_he" DOUBLE,
"primary_pathology_number_of_lymphnodes_positive_by_ihc" DOUBLE,
"primary_pathology_planned_surgery_status" VARCHAR,
"primary_pathology_treatment_prior_to_surgery" VARCHAR,
"primary_pathology_residual_tumor" VARCHAR,
"primary_pathology_karnofsky_performance_score" DOUBLE,
"primary_pathology_eastern_cancer_oncology_group" DOUBLE,
"primary_pathology_radiation_therapy" VARCHAR,
"primary_pathology_postoperative_rx_tx" VARCHAR
);
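The schema stores follow-up and birth information as day offsets rather than as ready-made survival or age fields. As a minimal sketch, the helpers below derive a (duration, event) pair for prognosis models and an age in years from those columns. Several things here are assumptions and not confirmed by the dataset description: that `vital_status` holds the values "Dead" and "Alive", that `days_to_birth` is a negative offset counted back from diagnosis (a convention common in TCGA-style clinical exports, which the column names resemble), and that empty DOUBLE cells arrive as `None` or NaN. The function names are hypothetical.

```python
import math
from typing import Optional, Tuple


def _is_missing(value) -> bool:
    """True for None or NaN, the usual encodings of an empty numeric cell."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def survival_record(
    vital_status: Optional[str],
    days_to_death: Optional[float],
    days_to_last_followup: Optional[float],
) -> Optional[Tuple[float, int]]:
    """Return (duration_in_days, event), or None if the row is unusable.

    event is 1 for an observed death and 0 for a censored patient.
    Assumption: vital_status is "Dead" or "Alive" (case-insensitive).
    """
    status = (vital_status or "").strip().lower()
    if status == "dead" and not _is_missing(days_to_death):
        return float(days_to_death), 1
    if status == "alive" and not _is_missing(days_to_last_followup):
        return float(days_to_last_followup), 0
    return None


def age_at_diagnosis_years(days_to_birth: Optional[int]) -> Optional[float]:
    """Convert days_to_birth to an age in years.

    Assumption: the value is a day offset from diagnosis back to birth,
    typically negative; abs() keeps the function sign-agnostic.
    """
    if _is_missing(days_to_birth):
        return None
    return abs(days_to_birth) / 365.25
```

With made-up inputs (not rows from the dataset), `survival_record("Dead", 410.0, None)` returns `(410.0, 1)`, `survival_record("Alive", float("nan"), 732.0)` returns `(732.0, 0)`, and `age_at_diagnosis_years(-21915)` returns `60.0`. Rows with an unknown status or a missing offset return `None`, so they can be dropped explicitly instead of being silently treated as censored.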