Baselight

Comprehensive Diabetes Clinical Dataset (100k Rows)

100,000-Row Diabetes Dataset for Predictive Modeling and Health Analytics

@kaggle.priyamchoksi_100000_diabetes_clinical_dataset

About this Dataset

This dataset comprises health and demographic data for 100,000 individuals and is intended to support diabetes-related research and predictive modeling. It includes gender, age, location, race, hypertension, heart disease, smoking history, BMI, HbA1c level, blood glucose level, and diabetes status.

Dataset Use Cases

This dataset can be used for various analytical and machine learning purposes, such as:

  1. Predictive Modeling: Build models to predict the likelihood of diabetes based on demographic and health-related features.
  2. Health Analytics: Analyze the correlation between different health metrics (e.g., BMI, HbA1c level) and diabetes.
  3. Demographic Studies: Examine the distribution of diabetes across different demographic groups and locations.
  4. Public Health Research: Identify risk factors for diabetes and target interventions to high-risk groups.
  5. Clinical Research: Study how comorbid conditions such as hypertension and heart disease relate to diabetes.
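
As a rough illustration of the comorbidity use case, the sketch below compares the share of diabetic individuals between rows with and without a given condition. It assumes that hypertension, heart_disease, and diabetes are encoded as 0/1 flags, which the schema's BIGINT type suggests but does not confirm. The function name diabetes_rate and the sample rows are hypothetical; the rows are synthetic placeholders, not values from the dataset.

```python
def diabetes_rate(rows, flag):
    """Return {flag_value: share of rows with diabetes == 1} for a 0/1 column."""
    totals, positives = {}, {}
    for row in rows:
        key = row[flag]
        totals[key] = totals.get(key, 0) + 1
        positives[key] = positives.get(key, 0) + row["diabetes"]
    return {key: positives[key] / totals[key] for key in totals}


# Synthetic placeholder rows, invented for illustration only.
sample = [
    {"hypertension": 1, "heart_disease": 0, "diabetes": 1},
    {"hypertension": 1, "heart_disease": 1, "diabetes": 0},
    {"hypertension": 0, "heart_disease": 0, "diabetes": 0},
    {"hypertension": 0, "heart_disease": 0, "diabetes": 0},
    {"hypertension": 0, "heart_disease": 1, "diabetes": 1},
    {"hypertension": 0, "heart_disease": 0, "diabetes": 0},
]

# Share of diabetic rows among those with and without hypertension.
rates = diabetes_rate(sample, "hypertension")
print(rates)
```

The same helper can be pointed at "heart_disease". A difference in raw rates is only a starting point, since it does not adjust for age or other confounders.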

Potential Analyses

  • Descriptive Statistics: Summarize the dataset to understand the central tendencies and dispersion of features.
  • Correlation Analysis: Identify the relationships between features.
  • Classification Models: Use machine learning algorithms to classify individuals as diabetic or non-diabetic.
  • Trend Analysis: Use the year column to see how diabetes prevalence changes across the years recorded.
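
Two of the analyses above, descriptive statistics and trend analysis, can be sketched with the Python standard library alone. The column names year, bmi, and diabetes come from the table schema; the helper names, the assumption that diabetes is a 0/1 flag, and the sample rows are illustrative only. The rows are synthetic placeholders, not values from the dataset.

```python
from collections import defaultdict
from statistics import mean, median


def prevalence_by_year(rows):
    """Return {year: share of rows with diabetes == 1}, ordered by year."""
    flags_by_year = defaultdict(list)
    for row in rows:
        flags_by_year[row["year"]].append(row["diabetes"])
    return {
        year: sum(flags) / len(flags)
        for year, flags in sorted(flags_by_year.items())
    }


def summarize(rows, column):
    """Return the mean and median of one numeric column."""
    values = [row[column] for row in rows]
    return {"mean": mean(values), "median": median(values)}


# Synthetic placeholder rows, invented for illustration only.
sample = [
    {"year": 2019, "bmi": 22.0, "diabetes": 0},
    {"year": 2019, "bmi": 30.0, "diabetes": 1},
    {"year": 2020, "bmi": 26.0, "diabetes": 0},
    {"year": 2020, "bmi": 26.0, "diabetes": 0},
]

trend = prevalence_by_year(sample)
bmi_summary = summarize(sample, "bmi")
print(trend, bmi_summary)
```

On the full table, the same grouping would show whether prevalence differs between years, provided each year has enough rows to make the comparison meaningful.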

Tables

Diabetes Dataset

@kaggle.priyamchoksi_100000_diabetes_clinical_dataset.diabetes_dataset
  • 567.52 KB
  • 100000 rows
  • 16 columns

CREATE TABLE diabetes_dataset (
  "year" BIGINT,
  "gender" VARCHAR,
  "age" DOUBLE,
  "location" VARCHAR,
  "race_africanamerican" BIGINT,
  "race_asian" BIGINT,
  "race_caucasian" BIGINT,
  "race_hispanic" BIGINT,
  "race_other" BIGINT,
  "hypertension" BIGINT,
  "heart_disease" BIGINT,
  "smoking_history" VARCHAR,
  "bmi" DOUBLE,
  "hba1c_level" DOUBLE,
  "blood_glucose_level" BIGINT,
  "diabetes" BIGINT
);
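
To show how the schema above might be queried, the following sketch loads it into an in-memory SQLite database and computes the diabetes rate per gender. SQLite is only a stand-in here, chosen because it ships with Python and accepts these type names; it is not necessarily the engine behind the hosted table. The inserted rows, including the gender, location, and smoking_history values, are synthetic placeholders and are not taken from the dataset.

```python
import sqlite3

# Schema copied from the table definition; SQLite is a stand-in engine.
SCHEMA = """
CREATE TABLE diabetes_dataset (
  "year" BIGINT,
  "gender" VARCHAR,
  "age" DOUBLE,
  "location" VARCHAR,
  "race_africanamerican" BIGINT,
  "race_asian" BIGINT,
  "race_caucasian" BIGINT,
  "race_hispanic" BIGINT,
  "race_other" BIGINT,
  "hypertension" BIGINT,
  "heart_disease" BIGINT,
  "smoking_history" VARCHAR,
  "bmi" DOUBLE,
  "hba1c_level" DOUBLE,
  "blood_glucose_level" BIGINT,
  "diabetes" BIGINT
);
"""

# Synthetic placeholder rows (16 values each), invented for illustration only.
ROWS = [
    (2019, "Female", 54.0, "Alabama", 0, 0, 1, 0, 0, 1, 0, "never", 27.3, 6.6, 140, 1),
    (2019, "Female", 31.0, "Alabama", 0, 1, 0, 0, 0, 0, 0, "never", 22.1, 5.0, 90, 0),
    (2020, "Male", 47.0, "Alaska", 1, 0, 0, 0, 0, 0, 0, "former", 25.4, 5.7, 100, 0),
    (2020, "Male", 62.0, "Alaska", 0, 0, 0, 1, 0, 1, 1, "current", 29.8, 6.0, 130, 0),
]


def diabetes_rate_by_gender(conn):
    """Return (gender, row count, diabetes rate) tuples, ordered by gender."""
    query = """
        SELECT "gender", COUNT(*) AS n, AVG("diabetes") AS diabetes_rate
        FROM diabetes_dataset
        GROUP BY "gender"
        ORDER BY "gender"
    """
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
placeholders = ",".join("?" * 16)  # one bound parameter per column
conn.executemany(f"INSERT INTO diabetes_dataset VALUES ({placeholders})", ROWS)

result = diabetes_rate_by_gender(conn)
print(result)
```

Taking AVG over the diabetes column gives a rate only if the column is a 0/1 flag, which is an assumption based on its BIGINT type and its description as a diabetes status.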
