Dataset: Comprehensive Diabetes Clinical Dataset(100k Rows)

About this Dataset

This dataset contains health and demographic data for 100,000 individuals and is intended to support diabetes-related research and predictive modeling. It includes information on year, gender, age, location, race, hypertension, heart disease, smoking history, BMI, HbA1c level, blood glucose level, and diabetes status.

Dataset Use Cases

This dataset can be used for various analytical and machine learning purposes, such as:

Predictive Modeling: Build models to predict the likelihood of diabetes based on demographic and health-related features.
Health Analytics: Analyze the correlation between different health metrics (e.g., BMI, HbA1c level) and diabetes.
Demographic Studies: Examine the distribution of diabetes across different demographic groups and locations.
Public Health Research: Identify risk factors for diabetes and target interventions to high-risk groups.
Clinical Research: Study the relationship between diabetes and comorbid conditions such as hypertension and heart disease.
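
As a rough illustration of the demographic and comorbidity use cases above, the sketch below computes the share of diabetic records within each group. It assumes that `diabetes` and `hypertension` are 0/1 flags, which the schema's integer types suggest but do not confirm. The rows and the helper `prevalence_by` are made up for illustration and are not taken from the dataset.

```python
from collections import defaultdict

# Made-up rows for illustration only; real rows would come from diabetes_dataset.
rows = [
    {"gender": "Female", "hypertension": 0, "diabetes": 0},
    {"gender": "Female", "hypertension": 1, "diabetes": 1},
    {"gender": "Male", "hypertension": 0, "diabetes": 0},
    {"gender": "Male", "hypertension": 1, "diabetes": 1},
    {"gender": "Male", "hypertension": 1, "diabetes": 0},
]


def prevalence_by(rows, key):
    """Return {group value: share of rows with diabetes == 1} (hypothetical helper)."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        # Assumes diabetes is coded 0/1, so summing it counts positive cases.
        positives[row[key]] += row["diabetes"]
    return {group: positives[group] / totals[group] for group in totals}


by_gender = prevalence_by(rows, "gender")
by_hypertension = prevalence_by(rows, "hypertension")
print(by_gender)
print(by_hypertension)
```

The same helper can be pointed at `location` or `smoking_history` to compare other groups.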

Potential Analyses

Descriptive Statistics: Summarize the dataset to understand the central tendencies and dispersion of features.
Correlation Analysis: Measure pairwise relationships between features, such as BMI and HbA1c level.
Classification Models: Use machine learning algorithms to classify individuals as diabetic or non-diabetic.
Trend Analysis: Use the year column to analyze how diabetes prevalence has changed over time.
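
Three of the analyses listed above (descriptive statistics, correlation, and trend analysis) can be sketched with the Python standard library alone. The rows below are made up for illustration, the helpers `pearson` and `prevalence_by_year` are hypothetical names, and `diabetes` is assumed to be a 0/1 flag. A real analysis would more likely load the full table into a dataframe library.

```python
import math
from collections import defaultdict
from statistics import mean, stdev

# Made-up rows for illustration only; values are not taken from the dataset.
rows = [
    {"year": 2019, "bmi": 22.0, "hba1c_level": 5.0, "diabetes": 0},
    {"year": 2019, "bmi": 31.0, "hba1c_level": 6.8, "diabetes": 1},
    {"year": 2020, "bmi": 27.0, "hba1c_level": 5.7, "diabetes": 0},
    {"year": 2020, "bmi": 35.0, "hba1c_level": 7.5, "diabetes": 1},
    {"year": 2020, "bmi": 24.0, "hba1c_level": 5.2, "diabetes": 0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient; undefined if either input is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def prevalence_by_year(rows):
    """Share of rows with diabetes == 1 for each year, in ascending year order."""
    by_year = defaultdict(list)
    for row in rows:
        by_year[row["year"]].append(row["diabetes"])
    return {year: mean(flags) for year, flags in sorted(by_year.items())}


bmi = [r["bmi"] for r in rows]
diabetes = [r["diabetes"] for r in rows]

# Descriptive statistics: central tendency and dispersion of one feature.
summary = {"mean": mean(bmi), "stdev": stdev(bmi)}

# Correlation analysis: linear association between BMI and diabetes status.
r_bmi_diabetes = pearson(bmi, diabetes)

# Trend analysis: diabetes prevalence per year.
trend = prevalence_by_year(rows)

print(summary, r_bmi_diabetes, trend)
```

With a binary outcome such as `diabetes`, the Pearson coefficient is the point-biserial correlation, so it measures linear association only and says nothing about causation.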

Tables

Diabetes Dataset

@kaggle.priyamchoksi_100000_diabetes_clinical_dataset.diabetes_dataset

567.52 KB
100000 rows
16 columns


CREATE TABLE diabetes_dataset (
  "year" BIGINT,
  "gender" VARCHAR,
  "age" DOUBLE,
  "location" VARCHAR,
  "race_africanamerican" BIGINT,
  "race_asian" BIGINT,
  "race_caucasian" BIGINT,
  "race_hispanic" BIGINT,
  "race_other" BIGINT,
  "hypertension" BIGINT,
  "heart_disease" BIGINT,
  "smoking_history" VARCHAR,
  "bmi" DOUBLE,
  "hba1c_level" DOUBLE,
  "blood_glucose_level" BIGINT,
  "diabetes" BIGINT
);
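
The schema above can be tried out locally with a short sketch like the following. It uses an in-memory SQLite database as a stand-in, since the source does not name the SQL engine, and it assumes the five `race_*` columns are 0/1 indicators with exactly one set per row. The inserted rows, including the location and smoking history values, are placeholders rather than real records.

```python
import sqlite3

# Schema copied from the table definition above.
DDL = """
CREATE TABLE diabetes_dataset (
  "year" BIGINT,
  "gender" VARCHAR,
  "age" DOUBLE,
  "location" VARCHAR,
  "race_africanamerican" BIGINT,
  "race_asian" BIGINT,
  "race_caucasian" BIGINT,
  "race_hispanic" BIGINT,
  "race_other" BIGINT,
  "hypertension" BIGINT,
  "heart_disease" BIGINT,
  "smoking_history" VARCHAR,
  "bmi" DOUBLE,
  "hba1c_level" DOUBLE,
  "blood_glucose_level" BIGINT,
  "diabetes" BIGINT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Placeholder rows for illustration only; not real records from the dataset.
sample = [
    (2020, "Female", 45.0, "PlaceA", 0, 0, 1, 0, 0, 0, 0, "never", 27.3, 5.8, 140, 0),
    (2020, "Male", 62.0, "PlaceB", 1, 0, 0, 0, 0, 1, 0, "former", 31.1, 7.0, 200, 1),
]
placeholders = ",".join("?" * 16)
conn.executemany(f"INSERT INTO diabetes_dataset VALUES ({placeholders})", sample)

# Collapse the indicator columns into one label (assumes 0/1 one-hot coding),
# then compute the record count and diabetes share per label.
QUERY = """
SELECT CASE
         WHEN race_africanamerican = 1 THEN 'AfricanAmerican'
         WHEN race_asian = 1 THEN 'Asian'
         WHEN race_caucasian = 1 THEN 'Caucasian'
         WHEN race_hispanic = 1 THEN 'Hispanic'
         WHEN race_other = 1 THEN 'Other'
       END AS race_label,
       COUNT(*) AS n,
       AVG(diabetes) AS prevalence
FROM diabetes_dataset
GROUP BY race_label
ORDER BY race_label
"""
results = conn.execute(QUERY).fetchall()
column_count = len(conn.execute("PRAGMA table_info(diabetes_dataset)").fetchall())
print(column_count, results)
```

SQLite maps `BIGINT`, `VARCHAR`, and `DOUBLE` onto its own type affinities, so the definition runs unchanged, although other engines may enforce the declared types more strictly.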