Real breast cancer sample dataset, for healthcare and cancer data analysis.
Dataset Description
Content
The data cover only a short time frame, but they are still useful for hypothesis testing and statistical analysis. With more than 400 rows, this is a good dataset for beginners.
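As one example of the kind of hypothesis test this data supports, the sketch below runs a Pearson chi-square test of independence between two categorical variables listed under Background, Tumour_Stage (I, II, III) and Patient_Status (Alive/Dead). The function name chi_square_3x2 and the counts in the example table are assumptions made for illustration; the counts are invented and are not taken from the dataset. The closed-form p-value only holds for a 3x2 table (2 degrees of freedom), and the code assumes every row and column has at least one observation.

```python
import math


def chi_square_3x2(table):
    """Pearson chi-square test of independence for a 3x2 table of counts.

    With (3 - 1) * (2 - 1) = 2 degrees of freedom, the p-value has the
    closed form exp(-statistic / 2), so no statistics library is needed.
    """
    if len(table) != 3 or any(len(row) != 2 for row in table):
        raise ValueError("expected 3 rows of 2 counts each")

    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if stage and status were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    return statistic, math.exp(-statistic / 2)


# Invented counts, for illustration only.
# Rows: Tumour_Stage I, II, III. Columns: Alive, Dead.
example_table = [[50, 10], [150, 40], [60, 20]]
statistic, p_value = chi_square_3x2(example_table)
print(f"chi-square = {statistic:.3f}, p = {p_value:.3f}")
```

In practice the table would be built by counting rows of the real file, after dropping patients whose Patient_Status is null.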
Background
This dataset consists of a group of breast cancer patients, who had surgery to remove their tumour. The dataset consists of the following variables:
Patient_ID: unique identifier of a patient
Age: age at diagnosis (Years)
Gender: Male/Female
Protein1, Protein2, Protein3, Protein4: expression levels (undefined units)
Tumour_Stage: I, II, III
Histology: Infiltrating Ductal Carcinoma, Infiltrating Lobular Carcinoma, Mucinous Carcinoma
ER status: Positive/Negative
PR status: Positive/Negative
HER2 status: Positive/Negative
Surgery_type: Lumpectomy, Simple Mastectomy, Modified Radical Mastectomy, Other
Date_of_Surgery: Date on which surgery was performed (in DD-MON-YY)
Date_of_Last_Visit: Date of last visit (in DD-MON-YY) [can be null if the patient did not visit again after the surgery]
Patient_Status: Alive/Dead [can be null if the patient did not visit again after the surgery and no information is available on whether the patient is alive or dead]
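The date and null conventions above can be handled with the Python standard library alone. The following is a minimal sketch, assuming the data is stored as a CSV file whose headers match the variable names listed here and that null values appear as empty cells; neither assumption is confirmed by this description. The helper names parse_date and load_rows and the derived field follow_up_days are hypothetical, and the two sample rows are invented for illustration rather than taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Two invented rows for illustration only; they are NOT from the dataset.
SAMPLE_CSV = """Patient_ID,Age,Date_of_Surgery,Date_of_Last_Visit,Patient_Status
P-001,52,15-Jan-17,19-Jun-17,Alive
P-002,61,26-Apr-17,,
"""


def parse_date(text):
    """Parse a DD-MON-YY string such as '15-Jan-17'; return None for blanks."""
    text = (text or "").strip()
    if not text:
        return None
    # %b matches English month abbreviations in the default C locale.
    return datetime.strptime(text, "%d-%b-%y").date()


def load_rows(handle):
    """Read patient rows, converting dates and mapping empty cells to None."""
    rows = []
    for raw in csv.DictReader(handle):
        surgery = parse_date(raw["Date_of_Surgery"])
        last_visit = parse_date(raw["Date_of_Last_Visit"])
        # Follow-up length is only defined when both dates are present.
        follow_up = (last_visit - surgery).days if surgery and last_visit else None
        rows.append({
            "Patient_ID": raw["Patient_ID"],
            "Age": int(raw["Age"]),
            "Date_of_Surgery": surgery,
            "Date_of_Last_Visit": last_visit,
            "Patient_Status": raw["Patient_Status"].strip() or None,
            "follow_up_days": follow_up,
        })
    return rows


rows = load_rows(io.StringIO(SAMPLE_CSV))
for row in rows:
    print(row["Patient_ID"], row["Patient_Status"], row["follow_up_days"])
```

To read a real file, pass an open file handle to load_rows instead of the in-memory sample. Keeping missing values as None, rather than guessing a status, preserves the distinction between patients known to be alive and patients lost to follow-up.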
Related Datasets
- Breast Cancer Dataset (@kaggle)
- Dhds Dataset (@cdc)