Baselight

Cirrhosis Patient Survival Prediction

Use 17 clinical features to predict the survival of patients with liver cirrhosis

@kaggle.joebeachcapital_cirrhosis_patient_survival_prediction

About this Dataset

Use 17 clinical features to predict the survival state of patients with liver cirrhosis. The survival states are 0 = D (death), 1 = C (censored), and 2 = CL (censored due to liver transplantation).
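The encoding of the three survival states can be sketched as a small lookup. This is a minimal illustration, not part of the dataset itself: the names `STATUS_CODES`, `encode_status`, and `is_event` are hypothetical, and treating only D as an observed event follows from the description of C and CL as censored.

```python
# Integer codes for the survival states, as listed in the dataset description.
STATUS_CODES = {"D": 0, "C": 1, "CL": 2}


def encode_status(status: str) -> int:
    """Return the integer class for a raw status string such as 'CL'."""
    return STATUS_CODES[status.strip().upper()]


def is_event(status: str) -> bool:
    """True only for an observed death; both censored states are non-events."""
    return encode_status(status) == STATUS_CODES["D"]
```

For example, `encode_status("CL")` returns 2 and `is_event("CL")` returns False.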

For what purpose was the dataset created?

Cirrhosis results from prolonged liver damage, leading to extensive scarring, often due to conditions like hepatitis or chronic alcohol consumption. The data provided is sourced from a Mayo Clinic study on primary biliary cirrhosis (PBC) of the liver carried out from 1974 to 1984.

Who funded the creation of the dataset?

Mayo Clinic

What do the instances in this dataset represent?

People

Does the dataset contain data that might be considered sensitive in any way?

Gender, Age

Was there any data preprocessing performed?

  1. Drop all rows where the Drug column has a missing value (NA)
  2. Impute the remaining missing values with the mean
  3. One-hot encode all categorical attributes
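The three preprocessing steps above can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the dataset author's actual code: the `preprocess` helper is hypothetical, the sample rows are made up, `None` stands in for NA, and the mean is assumed to be the per-column mean computed after the drop.

```python
from statistics import mean

# Hypothetical sample rows; None stands in for a missing value (NA).
rows = [
    {"drug": "D-penicillamine", "sex": "F", "cholesterol": 200.0},
    {"drug": "Placebo", "sex": "M", "cholesterol": None},
    {"drug": None, "sex": "F", "cholesterol": 300.0},
    {"drug": "Placebo", "sex": "F", "cholesterol": 400.0},
]


def preprocess(rows, numeric_cols, categorical_cols):
    # Step 1: drop rows whose drug value is missing (copy the rest).
    kept = [dict(r) for r in rows if r["drug"] is not None]

    # Step 2: fill missing numeric values with the column mean of the
    # remaining rows (assumes each column has at least one observed value).
    for col in numeric_cols:
        fill = mean(r[col] for r in kept if r[col] is not None)
        for r in kept:
            if r[col] is None:
                r[col] = fill

    # Step 3: one-hot encode each categorical column
    # (assumes no missing categorical values remain).
    for col in categorical_cols:
        levels = sorted({r[col] for r in kept})
        for r in kept:
            value = r.pop(col)
            for level in levels:
                r[f"{col}_{level}"] = 1 if value == level else 0
    return kept


clean = preprocess(rows, numeric_cols=["cholesterol"],
                   categorical_cols=["drug", "sex"])
```

In this toy input, the row with a missing drug is dropped, the missing cholesterol value is filled from the two remaining observed values, and `drug` and `sex` are replaced by indicator columns such as `sex_F` and `sex_M`.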

Additional Information

Between 1974 and 1984, 424 PBC patients referred to the Mayo Clinic qualified for the randomized placebo-controlled trial of the drug D-penicillamine. The first 312 of these patients took part in the trial and have largely complete data. The remaining 112 patients did not join the clinical trial but consented to having basic measurements recorded and to being followed for survival. Six of these patients were lost to follow-up shortly after diagnosis, leaving data for 106 of them in addition to the 312 who took part in the randomized trial.
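The patient counts in this paragraph can be checked with simple arithmetic. The variable names below are illustrative only; the numbers are taken from the text.

```python
referred = 424          # PBC patients who qualified for the trial
randomized = 312        # patients who took part in the randomized trial
lost_to_follow_up = 6   # non-trial patients who could not be traced

non_trial = referred - randomized                      # 112
non_trial_with_data = non_trial - lost_to_follow_up    # 106
total_with_data = randomized + non_trial_with_data     # 418
```

The total of 418 matches the row count listed for the cirrhosis table.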

Tables

Cirrhosis

@kaggle.joebeachcapital_cirrhosis_patient_survival_prediction.cirrhosis
  • 29.38 KB
  • 418 rows
  • 20 columns

CREATE TABLE cirrhosis (
  "id" BIGINT,
  "n_days" BIGINT,
  "status" VARCHAR,
  "drug" VARCHAR,
  "age" BIGINT,
  "sex" VARCHAR,
  "ascites" VARCHAR,
  "hepatomegaly" VARCHAR,
  "spiders" VARCHAR,
  "edema" VARCHAR,
  "bilirubin" DOUBLE,
  "cholesterol" DOUBLE,
  "albumin" DOUBLE,
  "copper" DOUBLE,
  "alk_phos" DOUBLE,
  "sgot" DOUBLE,
  "tryglicerides" DOUBLE,
  "platelets" DOUBLE,
  "prothrombin" DOUBLE,
  "stage" DOUBLE
);
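To try the schema locally, one option is an in-memory SQLite database from Python's standard library. This is only a sketch: SQLite is an assumption here and is not necessarily the engine behind this page, and the inserted rows are made-up placeholders rather than values from the dataset.

```python
import sqlite3

# Schema copied from the table definition above.
DDL = """
CREATE TABLE cirrhosis (
  "id" BIGINT, "n_days" BIGINT, "status" VARCHAR, "drug" VARCHAR,
  "age" BIGINT, "sex" VARCHAR, "ascites" VARCHAR, "hepatomegaly" VARCHAR,
  "spiders" VARCHAR, "edema" VARCHAR, "bilirubin" DOUBLE,
  "cholesterol" DOUBLE, "albumin" DOUBLE, "copper" DOUBLE,
  "alk_phos" DOUBLE, "sgot" DOUBLE, "tryglicerides" DOUBLE,
  "platelets" DOUBLE, "prothrombin" DOUBLE, "stage" DOUBLE
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Hypothetical rows for illustration only; not taken from the dataset.
sample = [
    (1, 400, "D", "D-penicillamine"),
    (2, 4500, "C", "Placebo"),
    (3, 1000, "CL", "Placebo"),
    (4, 2000, "C", "D-penicillamine"),
]
conn.executemany(
    'INSERT INTO cirrhosis ("id", "n_days", "status", "drug") VALUES (?, ?, ?, ?)',
    sample,
)

# Column names as SQLite sees them (the second field of each PRAGMA row).
columns = [row[1] for row in conn.execute("PRAGMA table_info(cirrhosis)")]

# Number of patients per survival state.
counts = dict(
    conn.execute('SELECT "status", COUNT(*) FROM cirrhosis GROUP BY "status"')
)
```

Note that the column name `tryglicerides` is spelled this way in the schema, so queries need to use that exact spelling.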
