Air Quality Measurements Dataset
Description
This dataset contains air quality measurements collected over several months, covering several gaseous pollutants. It is intended for predictive modeling and data analysis in environmental science and public health. Because it records pollutant concentrations alongside the corresponding sensor readings, it suits both regression and classification tasks.
Features
| Feature | Description |
| --- | --- |
| Date | The date of the measurement. |
| Time | The time of the measurement. |
| CO(GT) | Concentration of carbon monoxide (CO) in the air (µg/m³). |
| PT08.S1(CO) | Sensor measurement for CO concentration. |
| NMHC(GT) | Concentration of non-methane hydrocarbons (NMHC) (µg/m³). |
| C6H6(GT) | Concentration of benzene (C6H6) in the air (µg/m³). |
| PT08.S2(NMHC) | Sensor measurement for NMHC concentration. |
| NOx(GT) | Concentration of nitrogen oxides (NOx) in the air (µg/m³). |
| PT08.S3(NOx) | Sensor measurement for NOx concentration. |
| NO2(GT) | Concentration of nitrogen dioxide (NO2) in the air (µg/m³). |
Statistical Overview
The dataset includes frequency distributions for each feature, binned into value ranges. Notable observations:
- CO(GT): The minimum is around -200, which is the marker for missing or invalid data rather than a measured concentration.
- NOx(GT): Values span a wide range, with some exceeding 2000 µg/m³.
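A frequency distribution of this kind can be sketched with pandas. The readings and bin edges below are illustrative assumptions, not values or ranges taken from the dataset; the -200 marker is dropped first so that it does not distort the counts.

```python
import pandas as pd

# Hypothetical NOx(GT) readings; -200 is the missing-value marker
# described in this document.
nox = pd.Series([-200, 45, 120, 310, 980, 2100], name="NOx(GT)")

# Drop the marker before binning so it is not counted as a concentration.
valid = nox[nox != -200]

# Bin edges are assumptions chosen for illustration only.
bins = [0, 100, 500, 1000, 2000, float("inf")]

# Count how many valid readings fall into each range, in range order.
freq = pd.cut(valid, bins=bins).value_counts().sort_index()
print(freq)
```

The same pattern applies to any other numeric column; only the bin edges need to change.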
Citation Request
This dataset is publicly available for research purposes. If you use this dataset, please cite it as follows:
[Insert citation details based on the original source of the dataset].
Sources
Created by: [Include authors or organizations responsible for the dataset].
Past Usage
The dataset has been used in numerous studies of air quality and its implications for public health. It serves as a basis for applying data mining techniques to pollutant concentrations and their correlations with health outcomes.
Relevant Information
The measurements are time-stamped, so pollution trends can be assessed over time. The dataset supports both classification and regression tasks; preprocessing typically involves normalizing the features and handling missing values.
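As a rough sketch of the trend and normalization steps, the snippet below combines the Date and Time columns into a timestamp, computes a daily mean, and applies min-max scaling. The sample rows and the day/month/year and HH:MM:SS formats are assumptions; adjust the `format` string to match the actual file. Any -200 markers should be removed before scaling, since they would otherwise become the minimum.

```python
import pandas as pd

# Hypothetical rows; the date and time formats are assumptions.
df = pd.DataFrame({
    "Date": ["10/03/2004", "10/03/2004", "11/03/2004"],
    "Time": ["18:00:00", "19:00:00", "18:00:00"],
    "C6H6(GT)": [11.9, 9.4, 6.4],
})

# Combine the two text columns into a single timestamp index.
df["timestamp"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%d/%m/%Y %H:%M:%S"
)
df = df.set_index("timestamp").sort_index()

# Daily mean concentration, a simple view of the trend over time.
daily_mean = df["C6H6(GT)"].resample("D").mean()

# Min-max normalization to the [0, 1] range.
col = df["C6H6(GT)"]
df["C6H6_scaled"] = (col - col.min()) / (col.max() - col.min())

print(daily_mean)
print(df[["C6H6(GT)", "C6H6_scaled"]])
```

Min-max scaling is only one option; standardizing to zero mean and unit variance is a common alternative when outliers are present.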
Number of Instances
- Total Records: 951 (across specified time frames)
Number of Attributes
- Input Attributes: 10 attributes related to air quality measurements.
Missing Attribute Values
- Some measurements are recorded as -200, which marks missing or invalid data points. Treat these as missing rather than as real readings before computing statistics.
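One way to handle the marker, sketched here with hypothetical sample values, is to convert -200 to NaN so that pandas excludes it from statistics, and then either drop or fill the gaps. Linear interpolation is used below purely as an example; it is an assumption, not a method prescribed by the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical readings containing the -200 missing-value marker.
df = pd.DataFrame({
    "CO(GT)": [2.6, -200, 2.2, 1.6],
    "NO2(GT)": [113, 92, -200, 122],
})

# Replace the marker with NaN so it is treated as missing.
cleaned = df.replace(-200, np.nan)

# Count missing values per column.
missing_per_column = cleaned.isna().sum()

# Fill gaps by linear interpolation between neighboring readings.
filled = cleaned.interpolate(limit_direction="both")

print(missing_per_column)
print(filled)
```

If many consecutive readings are missing, dropping the affected rows with `cleaned.dropna()` may be safer than interpolating across a long gap.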