Context
The two datasets are related to the red and white variants of the Portuguese "Vinho Verde" wine.
For more details, consult the reference [Paulo Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis, 2009]. Due to privacy and logistic issues, only physicochemical (input) and sensory (output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
These datasets can be viewed as classification or regression tasks. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones).
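To make the two framings concrete, here is a minimal Python sketch. It assumes that the integer quality score is used directly as the regression target, and that for classification it is binned into three ordered labels. The cut-offs (`POOR_MAX`, `EXCELLENT_MIN`) and the example scores are hypothetical and chosen only for illustration; the dataset itself does not define them.

```python
from collections import Counter

# Hypothetical cut-offs, chosen only for illustration; the dataset does not define them.
POOR_MAX = 4
EXCELLENT_MIN = 7

# Ordered class labels, from worst to best.
CLASS_ORDER = ("poor", "normal", "excellent")


def quality_to_class(score: int) -> str:
    """Map an integer quality score (0-10) to an ordered class label."""
    if not 0 <= score <= 10:
        raise ValueError(f"quality score out of range: {score}")
    if score <= POOR_MAX:
        return "poor"
    if score >= EXCELLENT_MIN:
        return "excellent"
    return "normal"


def class_distribution(scores):
    """Return the fraction of samples per class, which exposes class imbalance."""
    counts = Counter(quality_to_class(s) for s in scores)
    total = sum(counts.values())
    return {label: counts[label] / total for label in CLASS_ORDER}


# Made-up scores for demonstration only, not taken from the dataset.
example_scores = [5, 6, 5, 7, 4, 6, 5, 6, 8, 5]
distribution = class_distribution(example_scores)
print(distribution)
```

With imbalanced classes like these, plain accuracy can be misleading, so it may be worth checking the class distribution before choosing an evaluation metric.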
Content
For more information, please read [Cortez et al., 2009].
Number of Instances:
| Dataset | Count |
| --- | --- |
| Red Wine | 1599 |
| White Wine | 4898 |
Number of Attributes:
11 input attributes plus 1 output attribute. The input and output features are:
Input variables (based on physicochemical tests):
1. fixed acidity
2. volatile acidity
3. citric acid
4. residual sugar
5. chlorides
6. free sulfur dioxide
7. total sulfur dioxide
8. density
9. pH
10. sulphates
11. alcohol
Output variable (based on sensory data):
12. quality (score between 0 and 10)
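The attribute list above can be turned into a small loader. The following is a sketch only, using the Python standard library. It assumes the CSV header uses exactly the attribute names listed above and that fields are separated by semicolons; check the header and delimiter of your copy of the files, since they may differ. The function name `read_wine_rows` is hypothetical, and the sample row is there only to show the format.

```python
import csv
import io

# Column names as listed above; assumed to match the CSV header exactly.
INPUT_COLUMNS = [
    "fixed acidity",
    "volatile acidity",
    "citric acid",
    "residual sugar",
    "chlorides",
    "free sulfur dioxide",
    "total sulfur dioxide",
    "density",
    "pH",
    "sulphates",
    "alcohol",
]
TARGET_COLUMN = "quality"


def read_wine_rows(handle, delimiter=";"):
    """Yield (features, quality) pairs from an open CSV file handle.

    The delimiter is an assumption; pass delimiter="," if your file uses commas.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [c for c in INPUT_COLUMNS + [TARGET_COLUMN] if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        features = [float(row[c]) for c in INPUT_COLUMNS]
        yield features, int(row[TARGET_COLUMN])


# A single illustrative row, included only to demonstrate the expected layout.
sample = (
    ";".join(INPUT_COLUMNS + [TARGET_COLUMN])
    + "\n7.4;0.70;0.00;1.9;0.076;11;34;0.9978;3.51;0.56;9.4;5\n"
)
rows = list(read_wine_rows(io.StringIO(sample)))
print(rows)
```

To read a real file, open it with `open(path, newline="")` and pass the handle to `read_wine_rows` instead of the in-memory sample.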
Acknowledgements
This dataset is also available from the UCI Machine Learning Repository.
I shared it on Kaggle only for convenience; I am not the owner of this dataset. If I am mistaken and its public license does not permit this, I will remove the dataset when requested and notified. If you plan to use this dataset in an article or other research, you must consult and read the main source in the UCI Machine Learning Repository.
Inspiration - Relevant Papers:
- P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis. Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.
- Additional Information about Wine: For a good evaluation, I recommend learning a little more about wine. Wikipedia is a good starting point. Source 1: Acids in Wine | Source 2: Chemistry of Wine