This dataset contains information on vehicle specifications, fuel consumption, and CO2 emissions, collected to analyze the environmental impact of vehicles and predict their CO2 emissions using regression models. The dataset is structured to support both Simple Linear Regression (SLR) and Multiple Linear Regression (MLR) approaches for machine learning projects.
Key Features
Brand: The brand or manufacturer of the vehicle (e.g., Toyota, Ford, BMW).
Vehicle Type: Classification of vehicles based on size and usage (e.g., SUV, Sedan).
Engine Size (L): Engine displacement volume in liters.
Cylinders: Number of cylinders in the engine.
Transmission: Type of transmission (e.g., Automatic, Manual).
Fuel Type: Type of fuel used by the vehicle (e.g., Gasoline, Diesel, Hybrid).
Fuel Consumption (City, Hwy, and Combined): Fuel consumption for city, highway, and combined driving, measured in liters per 100 kilometers (L/100 km); lower values indicate better fuel efficiency.
CO2 Emissions (g/km): Carbon dioxide emissions in grams per kilometer (target variable for prediction).
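With columns like those listed above, a first cleaning and EDA pass might look like the sketch below. The exact CSV header names are assumptions taken from the feature labels, and the sample rows are made up purely so the code runs; they are not values from the dataset.

```python
import csv
import io
from collections import defaultdict
from math import sqrt

# Made-up sample rows for illustration only (NOT taken from the dataset).
# Header names are assumed to match the feature labels in this description.
SAMPLE_CSV = """Brand,Fuel Type,Engine Size (L),CO2 Emissions (g/km)
A,Gasoline,2.0,190
B,Gasoline,3.5,260
C,Diesel,2.0,170
D,Diesel,3.0,230
E,Hybrid,1.8,105
F,Gasoline,,300
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Cleaning step: drop rows where a numeric field is missing.
clean = [r for r in rows if r["Engine Size (L)"] and r["CO2 Emissions (g/km)"]]

# Mean CO2 emissions per fuel type.
by_fuel = defaultdict(list)
for r in clean:
    by_fuel[r["Fuel Type"]].append(float(r["CO2 Emissions (g/km)"]))
mean_co2 = {fuel: sum(v) / len(v) for fuel, v in by_fuel.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


engine = [float(r["Engine Size (L)"]) for r in clean]
co2 = [float(r["CO2 Emissions (g/km)"]) for r in clean]
corr = pearson(engine, co2)

print(mean_co2)
print(f"engine size vs CO2 correlation: {corr:.2f}")
```

On the real file, the same steps would start from the dataset's CSV instead of the inline sample, and the header names should be checked against the file before use.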
Use Cases
Exploratory Data Analysis (EDA):
Clean and analyze the dataset to understand trends in CO2 emissions based on engine size, fuel type, and vehicle type.
Simple Linear Regression (SLR):
Use Engine Size (L) to predict CO2 Emissions (g/km).
Multiple Linear Regression (MLR):
Add further predictors such as Fuel Consumption and Cylinders alongside Engine Size (L) to build more accurate prediction models.
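The two regression use cases can be sketched side by side with ordinary least squares. This is a minimal illustration, not a reference implementation: the data below is synthetic (generated with arbitrary coefficients so the code runs end to end), and `fit_ols` is a hypothetical helper name. With the real dataset, the arrays would come from the corresponding columns.

```python
import numpy as np

# Synthetic stand-in data (NOT rows from the dataset). The generating
# coefficients are arbitrary and chosen only for illustration.
rng = np.random.default_rng(0)
n = 200
engine_size = rng.uniform(1.0, 6.0, n)                        # liters
cylinders = np.round(engine_size * 1.5 + 2)                   # loosely tied to engine size
fuel_comb = 4.0 + 1.5 * engine_size + rng.normal(0, 1.0, n)   # L/100 km
co2 = 23.0 * fuel_comb + rng.normal(0, 5.0, n)                # g/km


def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns (coefficients, in-sample R^2)."""
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ coef
    r2 = 1.0 - (residuals @ residuals) / np.sum((y - y.mean()) ** 2)
    return coef, r2


# SLR: Engine Size (L) as the only predictor.
slr_coef, slr_r2 = fit_ols(engine_size.reshape(-1, 1), co2)

# MLR: Engine Size (L), Cylinders, and combined Fuel Consumption.
X_mlr = np.column_stack([engine_size, cylinders, fuel_comb])
mlr_coef, mlr_r2 = fit_ols(X_mlr, co2)

print(f"SLR: intercept={slr_coef[0]:.1f}, slope={slr_coef[1]:.1f}, R^2={slr_r2:.3f}")
print(f"MLR: R^2={mlr_r2:.3f}")
```

In-sample R^2 can never decrease when predictors are added to an OLS model, so a fair comparison of SLR and MLR on the real data should score both on a held-out test split.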
Objective
This dataset is designed to:
Help understand the relationship between vehicle specifications and their environmental impact.
Enable the application of regression models for predicting CO2 emissions.
Explore the impact of fuel efficiency and engine size on carbon emissions.
Why This Dataset?
Vehicle CO2 emissions contribute to climate change, so understanding what drives them is a necessary step toward reducing them.
Offers a practical application in which machine learning students and professionals can develop predictive models.
Provides an opportunity to practice EDA, data cleaning, and regression modeling techniques.
Dataset Highlights
Suitable for both beginners and advanced practitioners in data science.
Provides a hands-on opportunity to work on real-world data.
Well suited to showcasing machine learning skills in regression analysis.
Acknowledgements
This dataset is provided for educational purposes and is not intended for commercial use. If used in research or publications, please provide proper citation.