Symptoms To Diseases
Symptom-Based Disease Dataset
@kaggle.abhishekgodara_symptoms_to_diseases
This dataset is part of the Digital Diagnosis Project, an AI-based initiative to create a comprehensive, machine-readable symptom–disease dataset for research, experimentation, ML models and medical NLP tasks.
It combines two versions of the same data source:
- A structured, raw dataset with 713 diseases and 377 binary symptom columns.
- A processed, NLP-ready dataset with 254 diseases and natural-language symptom descriptions.
Together, they form one of the most versatile open-source datasets for both classical ML and deep learning (Transformer-based) medical research.
The dataset is designed for classical machine learning tasks such as multi-label classification, for NLP tasks, and for fine-tuning LLMs.
Raw dataset (structured, binary symptom columns):

| Attribute | Description |
|---|---|
| Rows (Diseases) | 713 |
| Columns (Symptoms) | 377 |
| Data Type | Binary (0 = symptom absent, 1 = symptom present) |
| Target Variable | Disease |
| Use Case | ML-based disease prediction |
Sample data:
| disease | fever | cough | headache | nausea | chest_pain | ... |
|---|---|---|---|---|---|---|
| influenza | 1 | 1 | 1 | 0 | 0 | ... |
| migraine | 0 | 0 | 1 | 1 | 0 | ... |
| heart_attack | 0 | 0 | 0 | 1 | 1 | ... |
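To make the binary layout concrete, here is a minimal sketch of how a row of 0/1 flags could be matched against an observed set of symptoms. It uses only the three sample rows and five symptom columns shown above; the function names (`present_symptoms`, `jaccard`, `rank_diseases`) and the choice of Jaccard overlap are illustrative assumptions, not part of the dataset or the author's method.

```python
# Sample rows copied from the table above; the full file has 377 symptom columns.
SYMPTOMS = ["fever", "cough", "headache", "nausea", "chest_pain"]
ROWS = {
    "influenza": [1, 1, 1, 0, 0],
    "migraine": [0, 0, 1, 1, 0],
    "heart_attack": [0, 0, 0, 1, 1],
}


def present_symptoms(flags):
    """Return the set of symptom names whose flag is 1."""
    return {name for name, flag in zip(SYMPTOMS, flags) if flag == 1}


def jaccard(a, b):
    """Jaccard similarity between two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_diseases(observed):
    """Rank diseases by Jaccard overlap with the observed symptom set.

    This is a simple illustrative baseline (an assumption), not a trained model.
    """
    scores = {
        disease: jaccard(observed, present_symptoms(flags))
        for disease, flags in ROWS.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_diseases({"fever", "cough"})
print(ranking[0])  # best-matching disease and its score
```

A set-overlap baseline like this is mainly useful as a sanity check before training a real classifier on the full binary matrix.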
Processed dataset (NLP-ready text):

| Attribute | Description |
|---|---|
| Rows (Diseases) | 254 |
| Format | Each row represents a natural-language description of symptoms and its corresponding disease. |
| Data Type | Text + Label |
| Target Variable | Disease |
| Use Case | NLP and deep learning models such as BERT, BioBERT, DistilBERT, and LSTM. |
Sample data:
| disease | symptom_text |
|---|---|
| influenza | fever, cough, sore throat, and headache |
| asthma | persistent cough, chest tightness, wheezing |
| heart_attack | sudden chest pain, sweating, nausea |
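Text classifiers usually need integer labels rather than disease names. The sketch below shows one possible way to turn the (disease, symptom_text) rows above into (text, label id) pairs. The helper names `build_label_map` and `encode` are hypothetical, and the lowercasing and whitespace normalisation are assumed preprocessing steps, not something the dataset prescribes.

```python
# Sample rows copied from the table above.
SAMPLES = [
    ("influenza", "fever, cough, sore throat, and headache"),
    ("asthma", "persistent cough, chest tightness, wheezing"),
    ("heart_attack", "sudden chest pain, sweating, nausea"),
]


def build_label_map(samples):
    """Assign an integer id to each distinct disease, ordered by name."""
    diseases = sorted({disease for disease, _ in samples})
    return {disease: idx for idx, disease in enumerate(diseases)}


def encode(samples, label_map):
    """Return (text, label_id) pairs with lowercased, whitespace-normalised text."""
    return [
        (" ".join(text.lower().split()), label_map[disease])
        for disease, text in samples
    ]


label_map = build_label_map(SAMPLES)
pairs = encode(SAMPLES, label_map)
print(label_map)
print(pairs[0])
```

Sorting the disease names before numbering keeps the ids reproducible across runs, which matters when a saved model is reloaded later.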
CREATE TABLE data (
"diseases" VARCHAR,
"anxiety_and_nervousness" BIGINT,
"depression" BIGINT,
"shortness_of_breath" BIGINT,
"depressive_or_psychotic_symptoms" BIGINT,
"sharp_chest_pain" BIGINT,
"dizziness" BIGINT,
"insomnia" BIGINT,
"abnormal_involuntary_movements" BIGINT,
"chest_tightness" BIGINT,
"palpitations" BIGINT,
"irregular_heartbeat" BIGINT,
"breathing_fast" BIGINT,
"hoarse_voice" BIGINT,
"sore_throat" BIGINT,
"difficulty_speaking" BIGINT,
"cough" BIGINT,
"nasal_congestion" BIGINT,
"throat_swelling" BIGINT,
"diminished_hearing" BIGINT,
"lump_in_throat" BIGINT,
"throat_feels_tight" BIGINT,
"difficulty_in_swallowing" BIGINT,
"skin_swelling" BIGINT,
"retention_of_urine" BIGINT,
"groin_mass" BIGINT,
"leg_pain" BIGINT,
"hip_pain" BIGINT,
"suprapubic_pain" BIGINT,
"blood_in_stool" BIGINT,
"lack_of_growth" BIGINT,
"emotional_symptoms" BIGINT,
"elbow_weakness" BIGINT,
"back_weakness" BIGINT,
"pus_in_sputum" BIGINT,
"symptoms_of_the_scrotum_and_testes" BIGINT,
"swelling_of_scrotum" BIGINT,
"pain_in_testicles" BIGINT,
"flatulence" BIGINT,
"pus_draining_from_ear" BIGINT,
"jaundice" BIGINT,
"mass_in_scrotum" BIGINT,
"white_discharge_from_eye" BIGINT,
"irritable_infant" BIGINT,
"abusing_alcohol" BIGINT,
"fainting" BIGINT,
"hostile_behavior" BIGINT,
"drug_abuse" BIGINT,
"sharp_abdominal_pain" BIGINT,
"feeling_ill" BIGINT,
"vomiting" BIGINT,
"headache" BIGINT,
"nausea" BIGINT,
"diarrhea" BIGINT,
"vaginal_itching" BIGINT,
"vaginal_dryness" BIGINT,
"painful_urination" BIGINT,
"involuntary_urination" BIGINT,
"pain_during_intercourse" BIGINT,
"frequent_urination" BIGINT,
"lower_abdominal_pain" BIGINT,
"vaginal_discharge" BIGINT,
"blood_in_urine" BIGINT,
"hot_flashes" BIGINT,
"intermenstrual_bleeding" BIGINT,
"hand_or_finger_pain" BIGINT,
"wrist_pain" BIGINT,
"hand_or_finger_swelling" BIGINT,
"arm_pain" BIGINT,
"wrist_swelling" BIGINT,
"arm_stiffness_or_tightness" BIGINT,
"arm_swelling" BIGINT,
"hand_or_finger_stiffness_or_tightness" BIGINT,
"wrist_stiffness_or_tightness" BIGINT,
"lip_swelling" BIGINT,
"toothache" BIGINT,
"abnormal_appearing_skin" BIGINT,
"skin_lesion" BIGINT,
"acne_or_pimples" BIGINT,
"dry_lips" BIGINT,
"facial_pain" BIGINT,
"mouth_ulcer" BIGINT,
"skin_growth" BIGINT,
"eye_deviation" BIGINT,
"diminished_vision" BIGINT,
"double_vision" BIGINT,
"cross_eyed" BIGINT,
"symptoms_of_eye" BIGINT,
"pain_in_eye" BIGINT,
"eye_moves_abnormally" BIGINT,
"abnormal_movement_of_eyelid" BIGINT,
"foreign_body_sensation_in_eye" BIGINT,
"irregular_appearing_scalp" BIGINT,
"swollen_lymph_nodes" BIGINT,
"back_pain" BIGINT,
"neck_pain" BIGINT,
"low_back_pain" BIGINT,
"pain_of_the_anus" BIGINT,
"pain_during_pregnancy" BIGINT,
"pelvic_pain" BIGINT
);
CREATE TABLE final_symptoms_to_disease (
"diseases" VARCHAR,
"symptom_text" VARCHAR
);
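As a rough illustration of querying the `data` table, the sketch below builds a cut-down copy in an in-memory SQLite database using Python's standard `sqlite3` module. Only three of the symptom columns from the schema are created, the inserted rows are hypothetical placeholders rather than real dataset records, and the helper `diseases_with` is an assumed name introduced for this example.

```python
import sqlite3

# Three symptom columns taken from the schema above; the real table has many more.
COLUMNS = ("cough", "sore_throat", "headache")

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE data ("diseases" VARCHAR, "cough" BIGINT, '
    '"sore_throat" BIGINT, "headache" BIGINT)'
)
# Hypothetical placeholder rows for illustration only; not taken from the dataset.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("example_disease_a", 1, 1, 0),
        ("example_disease_b", 0, 0, 1),
        ("example_disease_c", 1, 0, 1),
    ],
)


def diseases_with(connection, symptoms):
    """Return diseases whose rows have every listed symptom flag set to 1."""
    # Column names cannot be bound as parameters, so validate them first.
    if not symptoms or not set(symptoms) <= set(COLUMNS):
        raise ValueError("unknown or missing symptom column")
    where = " AND ".join(f'"{name}" = 1' for name in symptoms)
    query = f'SELECT "diseases" FROM data WHERE {where} ORDER BY "diseases"'
    return [row[0] for row in connection.execute(query)]


matches = diseases_with(conn, ["cough"])
both = diseases_with(conn, ["cough", "headache"])
print(matches, both)
```

Checking the requested column names against a fixed allow-list avoids interpolating arbitrary strings into the SQL text.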