About the Dataset
Context
This dataset dates from 1988 and combines four databases: Cleveland, Hungary, Switzerland, and Long Beach VA. It contains 76 attributes, including the predicted attribute, but published experiments have predominantly used a subset of 14 of them. The "target" field indicates the presence of heart disease in the patient and is integer valued (0: no disease, 1: disease).
Heart disease, a collective term for ailments impacting the heart and circulatory system, is a global health concern and a leading cause of disability. Given the heart's pivotal role in bodily functions, diseases affecting it can have far-reaching consequences on other organs and physiological processes. Various forms of heart disease exist, including those causing coronary artery narrowing, valve malfunctions, heart enlargement, and more, often leading to heart failure and heart attacks.
This dataset, which focuses specifically on heart disease, is a useful resource for examining the significance of each feature and the relationships among them. In this analysis, our primary objective is to estimate the probability that an individual is susceptible to a severe heart problem.
Content
Attribute Information:
Age: Numeric, in years (e.g., 52)
Sex: Categorical (0: Female, 1: Male)
Chest Pain Type: Categorical (0: Typical Angina, 1: Atypical Angina, 2: Non-anginal Pain, 3: Asymptomatic)
Resting Blood Pressure: Numeric, in mm Hg (e.g., 125)
Serum Cholesterol: Numeric in mg/dL (e.g., 212)
Fasting Blood Sugar: Categorical (0: <= 120 mg/dL, 1: > 120 mg/dL)
Resting Electrocardiographic Results: Categorical (0: Normal, 1: ST-T Wave Abnormality, 2: Left Ventricular Hypertrophy)
Maximum Heart Rate Achieved: Numeric, in beats per minute (e.g., 168)
Exercise-Induced Angina: Categorical (0: No, 1: Yes)
Oldpeak (ST Depression): Numeric, ST depression induced by exercise relative to rest (e.g., 1.0)
Slope of Peak Exercise ST Segment: Categorical (0: Upsloping, 1: Flat, 2: Downsloping)
Number of Major Vessels Colored by Fluoroscopy: Numeric (0 to 3)
Thalassemia: Categorical (0: Normal, 1: Fixed Defect, 2: Reversible Defect)
Target: Categorical (0: No Disease, 1: Disease)
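The integer codes in the attribute list can be turned back into readable labels with a small lookup table. The following is a minimal sketch, not part of the dataset itself: the short column names (cp, fbs, restecg, and so on) are assumed to match the headers commonly used in CSV distributions of this data, the helper decode_record is hypothetical, and the example record is assembled from the sample values in the list rather than taken from a real row.

```python
# Code-to-label tables that follow the attribute list above.
# ASSUMPTION: the short column names are not given in the text; they are
# the headers commonly seen in CSV copies of this dataset.
CATEGORY_LABELS = {
    "sex": {0: "Female", 1: "Male"},
    "cp": {0: "Typical Angina", 1: "Atypical Angina",
           2: "Non-anginal Pain", 3: "Asymptomatic"},
    "fbs": {0: "<= 120 mg/dL", 1: "> 120 mg/dL"},
    "restecg": {0: "Normal", 1: "Abnormality", 2: "Hypertrophy"},
    "exang": {0: "No", 1: "Yes"},
    "slope": {0: "Upsloping", 1: "Flat", 2: "Downsloping"},
    "thal": {0: "Normal", 1: "Fixed Defect", 2: "Reversible Defect"},
}


def decode_record(record):
    """Return a copy of `record` with categorical codes replaced by labels.

    Attributes without a lookup table (the numeric ones) pass through
    unchanged. A code that is not in the table raises ValueError so that
    unexpected values are noticed instead of silently kept.
    """
    decoded = {}
    for name, value in record.items():
        labels = CATEGORY_LABELS.get(name)
        if labels is None:
            decoded[name] = value            # numeric attribute
        elif value in labels:
            decoded[name] = labels[value]    # known categorical code
        else:
            raise ValueError(f"unexpected code {value!r} for {name!r}")
    return decoded


# Illustrative record built from the sample values in the list;
# it is not a real patient row.
example = {
    "age": 52, "sex": 1, "cp": 0, "trestbps": 125, "chol": 212,
    "fbs": 0, "restecg": 1, "thalach": 168, "exang": 0,
    "oldpeak": 1.0, "slope": 2, "ca": 2, "thal": 2,
}

if __name__ == "__main__":
    for name, value in decode_record(example).items():
        print(f"{name}: {value}")
```

Raising an error on unlisted codes is a deliberate choice in this sketch: if a file contains values outside the documented ranges, it is usually better to find out early than to carry them into an analysis unlabeled.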
To protect patient privacy, the names and social security numbers of the patients were recently removed from the database and replaced with dummy values.