Baselight

Heart Disease Dataset

Identifying patients at high cardiovascular risk

@kaggle.ronanazarias_heart_desease_dataset

About this Dataset

Heart Disease Dataset

Cardiovascular diseases are the leading cause of death worldwide, claiming an estimated 17.9 million lives each year, or 31% of all deaths. Four out of every five deaths from cardiovascular disease are due to heart attacks and strokes, and a third of these deaths occur prematurely in people under 70 years of age.

People with cardiovascular disease or at high cardiovascular risk (due to one or more risk factors such as hypertension, diabetes, hyperlipidemia, or established disease) need early detection and management, a task in which a machine learning model can be of great benefit.

  • Measurements of 11 variables that characterize each sample (the features of the problem):
    • 1 - Age: patient's age (years)
    • 2 - Sex: patient's sex (M: Male, F: Female)
    • 3 - ChestPainType: chest pain type (TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic)
    • 4 - RestingBP: resting blood pressure (mm Hg)
    • 5 - Cholesterol: serum cholesterol (mg/dl)
    • 6 - FastingBS: fasting blood sugar (1: if fasting blood sugar > 120 mg/dl, 0: otherwise)
    • 7 - RestingECG: Resting ECG results (Normal: normal, ST: with ST-T wave abnormality, LVH: showing probable or definite left ventricular hypertrophy by Estes criteria)
    • 8 - MaxHR: maximum heart rate reached (Numeric value between 60 and 202)
    • 9 - ExerciseAngina: exercise-induced angina (Y: Yes, N: No)
    • 10 - Oldpeak: ST depression induced by exercise relative to rest (numeric value)
    • 11 - ST_Slope: the slope of the peak exercise ST segment (Up: upsloping, Flat: flat, Down: downsloping)

In addition, there is the response variable, which in this case is a binary variable:

  • 12 - HeartDisease: output class (1: heart disease, 0: normal)
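
The variable list above can be turned into a small schema check and encoder. The following is a minimal sketch in Python, not part of the dataset itself. It assumes the data is delivered as a CSV file whose header uses the column names listed above. The names `validate_row` and `encode_row` are hypothetical helpers, and the two rows in `SAMPLE_CSV` are made-up illustrative values, not records from the dataset.

```python
import csv
import io

# Categorical codes as listed in the dataset description.
CATEGORICAL = {
    "Sex": {"M", "F"},
    "ChestPainType": {"TA", "ATA", "NAP", "ASY"},
    "FastingBS": {"0", "1"},
    "RestingECG": {"Normal", "ST", "LVH"},
    "ExerciseAngina": {"Y", "N"},
    "ST_Slope": {"Up", "Flat", "Down"},
    "HeartDisease": {"0", "1"},
}
# Columns expected to hold plain numbers.
NUMERIC = ["Age", "RestingBP", "Cholesterol", "MaxHR", "Oldpeak"]


def validate_row(row):
    """Return a list of problems found in one CSV row (empty if none)."""
    problems = []
    for column, allowed in CATEGORICAL.items():
        if row.get(column) not in allowed:
            problems.append(f"{column}: unexpected value {row.get(column)!r}")
    for column in NUMERIC:
        try:
            float(row.get(column, ""))
        except (TypeError, ValueError):
            problems.append(f"{column}: not numeric {row.get(column)!r}")
    return problems


def encode_row(row):
    """Split a validated row into (features, target), one-hot encoding the categoricals."""
    features = {name: float(row[name]) for name in NUMERIC}
    for column, allowed in CATEGORICAL.items():
        if column == "HeartDisease":
            continue  # the response variable is returned separately
        for code in sorted(allowed):
            features[f"{column}_{code}"] = 1.0 if row[column] == code else 0.0
    return features, int(row["HeartDisease"])


# Made-up rows for illustration only; the second has an invalid ChestPainType.
SAMPLE_CSV = """Age,Sex,ChestPainType,RestingBP,Cholesterol,FastingBS,RestingECG,MaxHR,ExerciseAngina,Oldpeak,ST_Slope,HeartDisease
50,M,ASY,130,220,0,Normal,140,N,1.0,Flat,1
45,F,XYZ,120,200,0,Normal,150,N,0.0,Up,0
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
for row in rows:
    issues = validate_row(row)
    if issues:
        print("rejected:", issues)
    else:
        features, target = encode_row(row)
        print("target:", target, "feature count:", len(features))
```

One-hot encoding is only one possible choice here; tree-based models, for example, may work equally well with ordinal codes for the categorical columns.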
