Wine Quality

Analyzing "Vinho Verde" Wine Quality: A Data Science Approach

@kaggle.abdelazizsami_wine_quality

About this Dataset

Dataset Overview

Input Variables: Eleven physicochemical measurements (e.g., pH, alcohol content, acidity), one column each in the tables below.
Output Variable: A sensory rating (quality), stored as an integer; the ratings are ordered categories.

Tasks

Classification or Regression:

Treat the output as a categorical variable (classification) or as a continuous score (regression).

Outlier Detection:

Identify outliers (e.g., excellent or poor wines) using techniques like Isolation Forest or Local Outlier Factor (LOF).
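
As a rough sketch of the two detectors named above, the snippet below runs scikit-learn's IsolationForest and LocalOutlierFactor on synthetic data. The data is a random stand-in with 11 columns, not the wine tables, and the contamination rate of 0.05 and the five planted rows are assumptions chosen only so there is something to find.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

rng = np.random.default_rng(0)

# Synthetic stand-in for the 11 physicochemical columns (NOT real dataset values).
X = rng.normal(loc=0.0, scale=1.0, size=(200, 11))
X[:5] += 8.0  # plant five obviously unusual rows (assumption for illustration)

# Both detectors label inliers as 1 and outliers as -1.
iso = IsolationForest(contamination=0.05, random_state=0)
iso_labels = iso.fit_predict(X)

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
lof_labels = lof.fit_predict(X)

# Row indices flagged by each method.
iso_outliers = np.flatnonzero(iso_labels == -1)
lof_outliers = np.flatnonzero(lof_labels == -1)
print("Isolation Forest flagged rows:", iso_outliers)
print("LOF flagged rows:", lof_outliers)
```

On the real tables, the flagged rows would still need to be compared against the quality column to see whether they correspond to excellent or poor wines.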

Feature Selection:

Apply methods such as Recursive Feature Elimination (RFE), LASSO, or tree-based feature importance to identify relevant features.
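
One possible way to try two of the methods named above is sketched here with scikit-learn's RFE and Lasso. The data is synthetic: the target is built so that only columns 0 and 3 matter, which is an assumption made for illustration and says nothing about the wine features. The penalty strength alpha=0.2 is likewise an arbitrary choice.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import Lasso, LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n_features = 11  # same width as the input columns; values are synthetic

X = rng.normal(size=(300, n_features))
# Synthetic target: only columns 0 and 3 carry signal, by construction.
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(scale=0.5, size=300)

# Put features on a common scale so coefficients are comparable.
X_scaled = StandardScaler().fit_transform(X)

# RFE: repeatedly fit the estimator and drop the weakest feature.
rfe = RFE(LinearRegression(), n_features_to_select=2).fit(X_scaled, y)
rfe_selected = np.flatnonzero(rfe.support_)

# LASSO: the L1 penalty shrinks uninformative coefficients to exactly zero.
lasso = Lasso(alpha=0.2).fit(X_scaled, y)
lasso_selected = np.flatnonzero(lasso.coef_ != 0)

print("RFE kept columns:", rfe_selected)
print("LASSO kept columns:", lasso_selected)
```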

Suggested Analysis Steps

Data Preprocessing:

  • Handle missing values, if any.
  • Normalize or standardize input features; scale-sensitive models such as Logistic Regression and SVR usually benefit, while tree-based models are largely unaffected.
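
The two preprocessing steps above could look roughly like this in NumPy. The small matrix is a made-up placeholder, and median imputation is just one reasonable choice, not something the dataset description prescribes.

```python
import numpy as np

# Tiny illustrative matrix: rows are wines, columns are two features.
# The values are placeholders, not taken from the dataset.
X = np.array([
    [7.0, 0.5],
    [8.0, np.nan],   # a missing value to impute
    [9.0, 0.7],
    [10.0, 0.9],
])

# 1) Impute: replace each NaN with the median of its column.
col_medians = np.nanmedian(X, axis=0)
X_filled = np.where(np.isnan(X), col_medians, X)

# 2) Standardize: zero mean and unit variance per column.
# In a real workflow, compute mean and std on the training split only.
mean = X_filled.mean(axis=0)
std = X_filled.std(axis=0)
X_std = (X_filled - mean) / std

print(X_std)
```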

Exploratory Data Analysis (EDA):

  • Visualize the distribution of quality ratings.
  • Use pair plots or correlation heatmaps to understand relationships between features.
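
Plotting aside, the numbers behind those two views can be computed as in the sketch below: a count per quality rating (what a bar chart would show) and a Pearson correlation matrix (what a heatmap would colour). The six rows are made-up placeholders, not rows from the tables.

```python
from collections import Counter

import numpy as np

# Made-up placeholder rows: (alcohol, volatile_acidity, quality).
rows = [
    (9.0, 0.60, 5),
    (9.5, 0.80, 5),
    (10.5, 0.40, 6),
    (11.0, 0.30, 7),
    (9.2, 0.65, 5),
    (12.0, 0.25, 7),
]
names = ["alcohol", "volatile_acidity", "quality"]

# Distribution of quality ratings, printed as a text bar chart.
quality_counts = Counter(q for _, _, q in rows)
for rating in sorted(quality_counts):
    print(rating, "#" * quality_counts[rating])

# Pearson correlation matrix between all columns (columns are variables).
data = np.array(rows, dtype=float)
corr = np.corrcoef(data, rowvar=False)
for name, row in zip(names, corr):
    print(f"{name:>18}", np.round(row, 2))
```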

Modeling:

For Classification:

Try models like Logistic Regression, Decision Trees, Random Forest, or Gradient Boosting.

For Regression:

Use Linear Regression, Support Vector Regression (SVR), or tree-based models like Random Forest Regressor.
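
To make the two framings concrete, here is a minimal sketch that fits one classifier and one regressor to the same integer target with scikit-learn. The features and target are synthetic, the 3 to 8 score range is an arbitrary choice for the demo, and the specific models are just two picks from the lists above.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)

# Synthetic stand-in: 11 features and an integer score (NOT real dataset values).
X = rng.normal(size=(300, 11))
score = 5.5 + 0.8 * X[:, 0] - 0.6 * X[:, 1] + rng.normal(scale=0.5, size=300)
y = np.clip(np.rint(score), 3, 8).astype(int)  # arbitrary range for the demo

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Classification: each quality level is treated as a separate class label.
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
class_pred = clf.predict(X_test)

# Regression: quality is treated as a number, so predictions can be fractional.
reg = LinearRegression().fit(X_train, y_train)
reg_pred = reg.predict(X_test)

print("classifier predictions:", class_pred[:5])
print("regressor predictions:", np.round(reg_pred[:5], 2))
```

The classifier can only return labels it saw during training, while the regressor returns values between the integer ratings; which behaviour is preferable depends on how the predictions will be used.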

Evaluation:

  • Use metrics like accuracy, F1-score, or ROC-AUC for classification.
  • For regression, consider MAE, MSE, or R².
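
For reference, most of these metrics are short enough to write out by hand. The sketch below implements accuracy, macro-averaged F1, MAE, MSE and R² in plain Python on toy labels; ROC-AUC is left out because it needs predicted scores rather than hard labels. The macro averaging of F1 is an assumption here, since the text does not say which averaging to use.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy labels and predictions (placeholders, not model output on the dataset).
y_true = [5, 6, 5, 7]
y_class = [5, 5, 5, 7]          # hard class predictions
y_reg = [5.5, 5.5, 5.0, 6.0]    # continuous predictions

print("accuracy:", accuracy(y_true, y_class))   # 0.75
print("macro F1:", macro_f1(y_true, y_class))
print("MAE:", mae(y_true, y_reg))               # 0.5
print("MSE:", mse(y_true, y_reg))               # 0.375
print("R2:", r2(y_true, y_reg))
```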

Feature Importance:

Analyze which features contribute the most to the predictions to aid in understanding the data.
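
One common way to do this is to read the impurity-based importances from a fitted random forest, sketched below with scikit-learn. The column names come from the table schema, but the data is synthetic and the target is built to depend on `alcohol` and `volatile_acidity` purely by construction; the resulting ranking says nothing about the real wines.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Input column names as listed in the table schema.
feature_names = [
    "fixed_acidity", "volatile_acidity", "citric_acid", "residual_sugar",
    "chlorides", "free_sulfur_dioxide", "total_sulfur_dioxide", "density",
    "ph", "sulphates", "alcohol",
]

rng = np.random.default_rng(1)
X = rng.normal(size=(400, len(feature_names)))  # synthetic, NOT real values

# Planted signal (assumption for illustration only):
# column 10 ("alcohol") and column 1 ("volatile_acidity") drive the target.
y = 2.0 * X[:, 10] - 1.0 * X[:, 1] + rng.normal(scale=0.3, size=400)

model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Importances sum to 1; sort from most to least important.
ranking = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranking:
    print(f"{name:>22}  {importance:.3f}")
```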

Tables

Winequality Red

@kaggle.abdelazizsami_wine_quality.winequality_red
  • 34.21 kB
  • 1,599 rows
  • 12 columns
CREATE TABLE winequality_red (
  "fixed_acidity" DOUBLE,
  "volatile_acidity" DOUBLE,
  "citric_acid" DOUBLE,
  "residual_sugar" DOUBLE,
  "chlorides" DOUBLE,
  "free_sulfur_dioxide" DOUBLE,
  "total_sulfur_dioxide" DOUBLE,
  "density" DOUBLE,
  "ph" DOUBLE,
  "sulphates" DOUBLE,
  "alcohol" DOUBLE,
  "quality" BIGINT
);

Winequality White

@kaggle.abdelazizsami_wine_quality.winequality_white
  • 76.61 kB
  • 4,898 rows
  • 12 columns
CREATE TABLE winequality_white (
  "fixed_acidity" DOUBLE,
  "volatile_acidity" DOUBLE,
  "citric_acid" DOUBLE,
  "residual_sugar" DOUBLE,
  "chlorides" DOUBLE,
  "free_sulfur_dioxide" DOUBLE,
  "total_sulfur_dioxide" DOUBLE,
  "density" DOUBLE,
  "ph" DOUBLE,
  "sulphates" DOUBLE,
  "alcohol" DOUBLE,
  "quality" BIGINT
);
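
Because both tables share the same 12 columns, they can be stacked into one result set. The sketch below recreates the two schemas in an in-memory SQLite database through Python's sqlite3 module and combines them with UNION ALL. The `wine_type` column is a hypothetical label added for illustration, and the inserted rows are placeholders rather than real measurements.

```python
import sqlite3

# Column definitions copied from the CREATE TABLE statements above.
COLUMNS = """
  "fixed_acidity" DOUBLE,
  "volatile_acidity" DOUBLE,
  "citric_acid" DOUBLE,
  "residual_sugar" DOUBLE,
  "chlorides" DOUBLE,
  "free_sulfur_dioxide" DOUBLE,
  "total_sulfur_dioxide" DOUBLE,
  "density" DOUBLE,
  "ph" DOUBLE,
  "sulphates" DOUBLE,
  "alcohol" DOUBLE,
  "quality" BIGINT
"""

conn = sqlite3.connect(":memory:")
for table in ("winequality_red", "winequality_white"):
    conn.execute(f"CREATE TABLE {table} ({COLUMNS})")

# Placeholder rows so the query returns something (NOT real dataset values).
marks = ", ".join("?" * 12)
red_row = (7.0, 0.5, 0.1, 2.0, 0.08, 15.0, 40.0, 0.996, 3.3, 0.6, 10.0, 5)
white_row = (6.5, 0.3, 0.3, 5.0, 0.05, 30.0, 120.0, 0.994, 3.2, 0.5, 10.5, 6)
conn.execute(f"INSERT INTO winequality_red VALUES ({marks})", red_row)
conn.execute(f"INSERT INTO winequality_white VALUES ({marks})", white_row)

# Stack both tables, tagging each row with a hypothetical wine_type label.
combined = conn.execute(
    """
    SELECT 'red' AS wine_type, * FROM winequality_red
    UNION ALL
    SELECT 'white' AS wine_type, * FROM winequality_white
    """
).fetchall()

for row in combined:
    print(row)
```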
