UCI Air Quality Dataset
Comprehensive Data on Pollutant Concentrations Over Time
@kaggle.dakshbhalala_uci_air_quality_dataset
This dataset contains air quality measurements collected over several months, covering a range of pollutants. It is intended for predictive modeling and data analysis in environmental science and public health. The recorded gas concentrations make it suitable for both regression and classification tasks in machine learning.
Feature | Description |
---|---|
Date | The date of the measurement. |
Time | The time of the measurement. |
CO(GT) | Concentration of carbon monoxide (CO) in the air (mg/m³). |
PT08.S1(CO) | Sensor measurement for CO concentration. |
NMHC(GT) | Concentration of non-methane hydrocarbons (NMHC) (µg/m³). |
C6H6(GT) | Concentration of benzene (C6H6) in the air (µg/m³). |
PT08.S2(NMHC) | Sensor measurement for NMHC concentration. |
NOx(GT) | Concentration of nitrogen oxides (NOx) in the air (ppb). |
PT08.S3(NOx) | Sensor measurement for NOx concentration. |
NO2(GT) | Concentration of nitrogen dioxide (NO2) in the air (µg/m³). |
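Because `Date` and `Time` are stored as separate columns, time-series work usually starts by combining them into a single timestamp. The following is a minimal sketch using only the Python standard library. The sample rows are made-up values for illustration, and the comma separator, the `DD/MM/YYYY` date format, and the `HH.MM.SS` time format are assumptions that should be checked against the actual file; `load_rows` is a hypothetical helper, not part of any published loader for this dataset.

```python
import csv
import io
from datetime import datetime

# Made-up sample rows using the column names from the table above.
# Separator and date/time formats are assumptions; adjust to your copy.
SAMPLE = """Date,Time,CO(GT),PT08.S1(CO),NMHC(GT),C6H6(GT),PT08.S2(NMHC),NOx(GT),PT08.S3(NOx),NO2(GT)
10/03/2004,18.00.00,2.5,1300,140,11.0,1000,160,1050,110
10/03/2004,19.00.00,2.0,1250,110,9.0,950,100,1150,92
"""


def load_rows(text, date_fmt="%d/%m/%Y", time_fmt="%H.%M.%S"):
    """Parse CSV text into dicts with a combined `timestamp` and float readings."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        # Join the two text fields and parse them as one datetime.
        stamp = datetime.strptime(
            f"{raw['Date']} {raw['Time']}", f"{date_fmt} {time_fmt}"
        )
        row = {"timestamp": stamp}
        for name, value in raw.items():
            if name not in ("Date", "Time"):
                row[name] = float(value)
        rows.append(row)
    return rows


rows = load_rows(SAMPLE)
print(rows[0]["timestamp"], rows[0]["CO(GT)"])
```

To read a real file, pass its contents to `load_rows` and override `date_fmt` or `time_fmt` if the formats differ.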
The dataset includes frequency distributions for each feature, binned into value ranges.
This dataset is publicly available for research purposes. If you use this dataset, please cite it as follows:
[Insert citation details based on the original source of the dataset].
Created by: [Include authors or organizations responsible for the dataset].
The dataset has been utilized in numerous studies focusing on air quality analysis and its implications for public health. It serves as a foundational resource for applying various data mining techniques to explore pollutant concentrations and their correlations with health outcomes.
The dataset features temporal measurements related to air quality, enabling the assessment of pollution trends over time. It can be leveraged for both classification and regression tasks, with a focus on data normalization and strategies for handling missing values.
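One way to approach the missing-value handling and normalization mentioned above is sketched below in dependency-free Python. It assumes missing readings are coded with the sentinel value `-200`, as in the original UCI release; verify this against your copy before relying on it. Linear interpolation and min-max scaling are illustrative choices, not methods prescribed by the dataset, and the function names and sample readings are hypothetical.

```python
MISSING = -200.0  # assumed sentinel for missing readings; verify for your copy


def mask_missing(values, sentinel=MISSING):
    """Replace sentinel-coded readings with None."""
    return [None if v == sentinel else v for v in values]


def fill_linear(values):
    """Fill interior gaps by linear interpolation.

    Gaps at either end take the nearest observed value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no observed values to interpolate from")
    filled = list(values)
    # Leading and trailing gaps: copy the nearest observed reading.
    for i in range(known[0]):
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        filled[i] = values[known[-1]]
    # Interior gaps: straight line between the neighbouring observations.
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


def min_max_scale(values):
    """Rescale values to the range [0, 1]; a constant series maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up hourly readings with two missing entries.
co = [2.0, -200.0, 4.0, 3.0, -200.0]
clean = fill_linear(mask_missing(co))
scaled = min_max_scale(clean)
print(clean)   # [2.0, 3.0, 4.0, 3.0, 3.0]
print(scaled)  # [0.0, 0.5, 1.0, 0.5, 0.5]
```

Interpolation suits short gaps in hourly series. For long runs of missing values, dropping the affected rows or the whole column may be the safer choice.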