Baselight

Real Breast Cancer Data

Real breast cancer sample dataset, for healthcare and cancer data analysis.

@kaggle.amandam1_breastcancerdataset

About this Dataset

Content

The data covers a short time frame, but it is useful for hypothesis testing and statistical analysis. There are >400 rows, so it is a great beginner's dataset.

Background

This dataset consists of a group of breast cancer patients, who had surgery to remove their tumour. The dataset consists of the following variables:

Patient_ID: unique identifier of a patient

Age: age at diagnosis (Years)

Gender: Male/Female

Protein1, Protein2, Protein3, Protein4: expression levels (undefined units)

Tumour_Stage: I, II, III

Histology: Infiltrating Ductal Carcinoma, Infiltrating Lobular Carcinoma, Mucinous Carcinoma

ER status: Positive/Negative

PR status: Positive/Negative

HER2 status: Positive/Negative

Surgery_type: Lumpectomy, Simple Mastectomy, Modified Radical Mastectomy, Other

Date_of_Surgery: Date on which surgery was performed (in DD-MON-YY)

Date_of_Last_Visit: Date of last visit (in DD-MON-YY) [can be null if the patient didn’t visit again after the surgery]

Patient_Status: Alive/Dead [can be null if the patient didn’t visit again after the surgery and no information is available on whether the patient is alive or dead].
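
Because the two date variables are stored as DD-MON-YY text and the last-visit date can be null, a follow-up interval has to be computed defensively. The following is a minimal Python sketch, not part of the dataset itself. It assumes English month abbreviations (for example `15-Jan-17`) and that missing values arrive as `None` or an empty string; the helper names `parse_date` and `follow_up_days` and the sample dates are hypothetical.

```python
from datetime import datetime
from typing import Optional

# Assumption: dates look like "15-Jan-17" (DD-MON-YY, English month names).
DATE_FORMAT = "%d-%b-%y"


def parse_date(value: Optional[str]) -> Optional[datetime]:
    """Parse a DD-MON-YY string; return None when the value is missing."""
    if value is None or not value.strip():
        return None
    return datetime.strptime(value.strip(), DATE_FORMAT)


def follow_up_days(date_of_surgery: Optional[str],
                   date_of_last_visit: Optional[str]) -> Optional[int]:
    """Days between surgery and last visit, or None if either date is missing."""
    surgery = parse_date(date_of_surgery)
    last_visit = parse_date(date_of_last_visit)
    if surgery is None or last_visit is None:
        return None
    return (last_visit - surgery).days


if __name__ == "__main__":
    # Made-up example values, not taken from the dataset.
    print(follow_up_days("15-Jan-17", "19-Jun-17"))
    print(follow_up_days("15-Jan-17", None))
```

Returning `None` rather than 0 for a missing last visit keeps patients without follow-up from being mistaken for patients seen on the day of surgery.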

Tables

Brca

@kaggle.amandam1_breastcancerdataset.brca
  • 30.17 kB
  • 341 rows
  • 16 columns
CREATE TABLE brca (
  "patient_id" VARCHAR,
  "age" DOUBLE,
  "gender" VARCHAR,
  "protein1" DOUBLE,
  "protein2" DOUBLE,
  "protein3" DOUBLE,
  "protein4" DOUBLE,
  "tumour_stage" VARCHAR,
  "histology" VARCHAR,
  "er_status" VARCHAR,
  "pr_status" VARCHAR,
  "her2_status" VARCHAR,
  "surgery_type" VARCHAR,
  "date_of_surgery" VARCHAR,
  "date_of_last_visit" VARCHAR,
  "patient_status" VARCHAR
);
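
To experiment with this schema locally, one option is to load it into an in-memory SQLite database through Python's built-in `sqlite3` module. This is only a sketch: SQLite is a stand-in and not necessarily the engine behind the hosted table, the function name `status_counts` is hypothetical, and the sample rows are invented purely for illustration. It counts patients by `patient_status`, grouping null or empty values under "Unknown".

```python
import sqlite3

# Schema copied from the table definition above.
SCHEMA = """
CREATE TABLE brca (
  "patient_id" VARCHAR,
  "age" DOUBLE,
  "gender" VARCHAR,
  "protein1" DOUBLE,
  "protein2" DOUBLE,
  "protein3" DOUBLE,
  "protein4" DOUBLE,
  "tumour_stage" VARCHAR,
  "histology" VARCHAR,
  "er_status" VARCHAR,
  "pr_status" VARCHAR,
  "her2_status" VARCHAR,
  "surgery_type" VARCHAR,
  "date_of_surgery" VARCHAR,
  "date_of_last_visit" VARCHAR,
  "patient_status" VARCHAR
);
"""

# Invented rows (patient_id, age, patient_status); not real dataset values.
SAMPLE_ROWS = [
    ("P-001", 45.0, "Alive"),
    ("P-002", 60.0, "Dead"),
    ("P-003", 52.0, None),   # no follow-up, status unknown
    ("P-004", 38.0, "Alive"),
]


def status_counts(rows):
    """Return (status, count) pairs, treating NULL or '' as 'Unknown'."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SCHEMA)
        conn.executemany(
            "INSERT INTO brca (patient_id, age, patient_status) VALUES (?, ?, ?)",
            rows,
        )
        cursor = conn.execute(
            """
            SELECT COALESCE(NULLIF(patient_status, ''), 'Unknown') AS status,
                   COUNT(*) AS n
            FROM brca
            GROUP BY COALESCE(NULLIF(patient_status, ''), 'Unknown')
            ORDER BY status
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for status, n in status_counts(SAMPLE_ROWS):
        print(status, n)
```

Columns omitted from the `INSERT` are left as NULL, which mirrors how missing follow-up information is described in the variable list.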
