Baselight

Symptoms To Diseases

Symptom-Based Disease Dataset

@kaggle.abhishekgodara_symptoms_to_diseases


About this Dataset

Symptoms To Diseases

This dataset is part of the Digital Diagnosis Project, an AI-based initiative to create a comprehensive, machine-readable symptom–disease dataset for research, experimentation, ML models and medical NLP tasks.

It combines two versions of the same data source:

A structured, raw dataset with 713 diseases and 377 binary symptom columns.

A processed, NLP-ready dataset with 254 diseases and natural-language symptom descriptions.

Together, they support both classical ML and Transformer-based deep learning research on symptom-to-disease prediction.

It is designed for classical machine learning tasks such as multi-label classification, for NLP tasks, and for fine-tuning LLMs.

First Dataset (data.csv)

| Attribute | Description |
|-----------|-------------|
| Rows (Diseases) | 713 |
| Columns (Symptoms) | 377 |
| Data Type | Binary (0 = symptom absent, 1 = symptom present) |
| Target Variable | Disease |
| Use Case | ML-based disease prediction |

Sample Dataset

| disease | fever | cough | headache | nausea | chest_pain | ... |
|----------|--------|--------|-----------|----------|-------------|-----|
| influenza | 1 | 1 | 1 | 0 | 0 | ... |
| migraine | 0 | 0 | 1 | 1 | 0 | ... |
| heart_attack | 0 | 0 | 0 | 1 | 1 | ... |
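
As a minimal sketch of how the wide binary layout could be turned into a feature matrix and a label list, the snippet below parses a small in-memory CSV built from the sample rows above. The helper name `load_binary_symptoms` and the `label_column` parameter are illustrative assumptions, not part of the dataset; the real file has 377 symptom columns, and the table schema on this page names the label column `diseases`.

```python
import csv
import io

# Toy stand-in for data.csv, copied from the sample rows above.
# Assumption: the real file follows the same "label + 0/1 columns" layout.
SAMPLE_CSV = """disease,fever,cough,headache,nausea,chest_pain
influenza,1,1,1,0,0
migraine,0,0,1,1,0
heart_attack,0,0,0,1,1
"""


def load_binary_symptoms(handle, label_column="disease"):
    """Return (symptom_names, X, y) from a wide 0/1 symptom table."""
    reader = csv.DictReader(handle)
    # Every column except the label is treated as a binary symptom feature.
    symptom_names = [name for name in reader.fieldnames if name != label_column]
    X, y = [], []
    for row in reader:
        X.append([int(row[name]) for name in symptom_names])
        y.append(row[label_column])
    return symptom_names, X, y


symptom_names, X, y = load_binary_symptoms(io.StringIO(SAMPLE_CSV))
print(symptom_names)
print(X[0], y[0])
```

To read the actual file, the same function could be called on `open("data.csv", newline="")` with `label_column="diseases"`, assuming the CSV header matches the schema listed further down.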

Second Dataset (final_symptoms_to_disease.csv)

| Attribute | Description |
|-----------|-------------|
| Rows (Diseases) | 254 |
| Format | Each row represents a natural-language description of symptoms and its corresponding disease. |
| Data Type | Text + Label |
| Target Variable | Disease |
| Use Case | NLP and deep learning models such as BERT, BioBERT, DistilBERT, and LSTM. |

Sample Dataset

| disease | symptom_text |
|---------|--------------|
| influenza | fever, cough, sore throat, and headache |
| asthma | persistent cough, chest tightness, wheezing |
| heart_attack | sudden chest pain, sweating, nausea |
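
Text classifiers such as the ones named above generally expect integer class ids rather than disease strings. The following is a rough sketch of that preparation step using the three sample rows; `encode_labels` is a hypothetical helper, and sorting the disease names to assign ids is just one convenient convention.

```python
# Sample (disease, symptom_text) pairs copied from the table above.
SAMPLE_ROWS = [
    ("influenza", "fever, cough, sore throat, and headache"),
    ("asthma", "persistent cough, chest tightness, wheezing"),
    ("heart_attack", "sudden chest pain, sweating, nausea"),
]


def encode_labels(rows):
    """Map each disease name to an integer id and return (text, id) examples."""
    # Sorting makes the id assignment reproducible across runs.
    labels = sorted({disease for disease, _ in rows})
    label_to_id = {label: index for index, label in enumerate(labels)}
    examples = [(text, label_to_id[disease]) for disease, text in rows]
    return label_to_id, examples


label_to_id, examples = encode_labels(SAMPLE_ROWS)
print(label_to_id)
print(examples[0])
```

The resulting `(text, id)` pairs could then be passed to whichever tokenizer and model are being used; that part is left out here because it depends on the chosen framework.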

Tables

Data

@kaggle.abhishekgodara_symptoms_to_diseases.data
  • 1.53 MB
  • 246,945 rows
  • 378 columns
CREATE TABLE data (
  "diseases" VARCHAR,
  "anxiety_and_nervousness" BIGINT,
  "depression" BIGINT,
  "shortness_of_breath" BIGINT,
  "depressive_or_psychotic_symptoms" BIGINT,
  "sharp_chest_pain" BIGINT,
  "dizziness" BIGINT,
  "insomnia" BIGINT,
  "abnormal_involuntary_movements" BIGINT,
  "chest_tightness" BIGINT,
  "palpitations" BIGINT,
  "irregular_heartbeat" BIGINT,
  "breathing_fast" BIGINT,
  "hoarse_voice" BIGINT,
  "sore_throat" BIGINT,
  "difficulty_speaking" BIGINT,
  "cough" BIGINT,
  "nasal_congestion" BIGINT,
  "throat_swelling" BIGINT,
  "diminished_hearing" BIGINT,
  "lump_in_throat" BIGINT,
  "throat_feels_tight" BIGINT,
  "difficulty_in_swallowing" BIGINT,
  "skin_swelling" BIGINT,
  "retention_of_urine" BIGINT,
  "groin_mass" BIGINT,
  "leg_pain" BIGINT,
  "hip_pain" BIGINT,
  "suprapubic_pain" BIGINT,
  "blood_in_stool" BIGINT,
  "lack_of_growth" BIGINT,
  "emotional_symptoms" BIGINT,
  "elbow_weakness" BIGINT,
  "back_weakness" BIGINT,
  "pus_in_sputum" BIGINT,
  "symptoms_of_the_scrotum_and_testes" BIGINT,
  "swelling_of_scrotum" BIGINT,
  "pain_in_testicles" BIGINT,
  "flatulence" BIGINT,
  "pus_draining_from_ear" BIGINT,
  "jaundice" BIGINT,
  "mass_in_scrotum" BIGINT,
  "white_discharge_from_eye" BIGINT,
  "irritable_infant" BIGINT,
  "abusing_alcohol" BIGINT,
  "fainting" BIGINT,
  "hostile_behavior" BIGINT,
  "drug_abuse" BIGINT,
  "sharp_abdominal_pain" BIGINT,
  "feeling_ill" BIGINT,
  "vomiting" BIGINT,
  "headache" BIGINT,
  "nausea" BIGINT,
  "diarrhea" BIGINT,
  "vaginal_itching" BIGINT,
  "vaginal_dryness" BIGINT,
  "painful_urination" BIGINT,
  "involuntary_urination" BIGINT,
  "pain_during_intercourse" BIGINT,
  "frequent_urination" BIGINT,
  "lower_abdominal_pain" BIGINT,
  "vaginal_discharge" BIGINT,
  "blood_in_urine" BIGINT,
  "hot_flashes" BIGINT,
  "intermenstrual_bleeding" BIGINT,
  "hand_or_finger_pain" BIGINT,
  "wrist_pain" BIGINT,
  "hand_or_finger_swelling" BIGINT,
  "arm_pain" BIGINT,
  "wrist_swelling" BIGINT,
  "arm_stiffness_or_tightness" BIGINT,
  "arm_swelling" BIGINT,
  "hand_or_finger_stiffness_or_tightness" BIGINT,
  "wrist_stiffness_or_tightness" BIGINT,
  "lip_swelling" BIGINT,
  "toothache" BIGINT,
  "abnormal_appearing_skin" BIGINT,
  "skin_lesion" BIGINT,
  "acne_or_pimples" BIGINT,
  "dry_lips" BIGINT,
  "facial_pain" BIGINT,
  "mouth_ulcer" BIGINT,
  "skin_growth" BIGINT,
  "eye_deviation" BIGINT,
  "diminished_vision" BIGINT,
  "double_vision" BIGINT,
  "cross_eyed" BIGINT,
  "symptoms_of_eye" BIGINT,
  "pain_in_eye" BIGINT,
  "eye_moves_abnormally" BIGINT,
  "abnormal_movement_of_eyelid" BIGINT,
  "foreign_body_sensation_in_eye" BIGINT,
  "irregular_appearing_scalp" BIGINT,
  "swollen_lymph_nodes" BIGINT,
  "back_pain" BIGINT,
  "neck_pain" BIGINT,
  "low_back_pain" BIGINT,
  "pain_of_the_anus" BIGINT,
  "pain_during_pregnancy" BIGINT,
  "pelvic_pain" BIGINT
);
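
The listing reports 246,945 rows, while the description above mentions 713 diseases, so a natural first check is how many distinct diseases the table holds and how many rows each one has. The sketch below runs that check against an in-memory SQLite table that keeps only three of the symptom columns; the inserted rows are made-up toy data for illustration, not values from the dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced copy of the schema above: the label plus three real column names.
conn.execute(
    'CREATE TABLE data ("diseases" VARCHAR, "cough" BIGINT, '
    '"headache" BIGINT, "nausea" BIGINT)'
)

# Toy rows (assumption): invented purely to exercise the queries.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("influenza", 1, 1, 0),
        ("influenza", 1, 0, 0),
        ("migraine", 0, 1, 1),
    ],
)

# Number of distinct disease labels in the table.
n_diseases = conn.execute(
    "SELECT COUNT(DISTINCT diseases) FROM data"
).fetchone()[0]

# Row count per disease, largest classes first.
rows_per_disease = conn.execute(
    "SELECT diseases, COUNT(*) AS n_rows FROM data "
    "GROUP BY diseases ORDER BY n_rows DESC, diseases"
).fetchall()

print(n_diseases)
print(rows_per_disease)
```

The two `SELECT` statements use only standard SQL aggregates, so they should carry over to the hosted table with little or no change.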

Final Symptoms To Disease

@kaggle.abhishekgodara_symptoms_to_diseases.final_symptoms_to_disease
  • 2.64 MB
  • 192,715 rows
  • 2 columns
CREATE TABLE final_symptoms_to_disease (
  "diseases" VARCHAR,
  "symptom_text" VARCHAR
);
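
With 192,715 rows of text and label, a model trained on this table would normally be evaluated on a held-out portion. Below is a minimal sketch of a per-disease (stratified) split using only the standard library. The function name, the 20% test fraction, and the placeholder rows are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_split(rows, test_fraction=0.2, seed=0):
    """Split (disease, text) rows so each disease appears in the training part."""
    by_label = defaultdict(list)
    for disease, text in rows:
        by_label[disease].append((disease, text))

    rng = random.Random(seed)  # seeded so the split is reproducible
    train, test = [], []
    for disease in sorted(by_label):
        group = by_label[disease]
        rng.shuffle(group)
        # Always leave at least one row per disease in the training part.
        n_test = min(len(group) - 1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Placeholder rows (assumption): the texts are dummies, not dataset contents.
rows = [("influenza", f"text {i}") for i in range(10)]
rows += [("asthma", f"text {i}") for i in range(5)]

train, test = stratified_split(rows)
print(len(train), len(test))
```

Splitting per disease rather than over the whole table keeps rare diseases from ending up only in the test part, which matters when class sizes are uneven.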
