Automotive Price Prediction Dataset
Synthetic dataset of 1M+ vehicles for high-accuracy price prediction.
@kaggle.metawave_vehicle_price_prediction
This comprehensive dataset contains 1,000,000 entries for used vehicles, designed specifically for training high-accuracy price prediction models. The data was synthetically generated using a Python script that establishes realistic correlations between a vehicle's attributes and its market price. It includes 25 of the most common car brands, covering a wide range of models and specifications.
The dataset was created programmatically. The script's logic is designed to produce realistic data distributions and relationships between a vehicle's attributes.
This dataset is suited to a variety of machine learning tasks, including regression models that predict the price column.
CREATE TABLE vehicle_price_prediction (
"make" VARCHAR,
"model" VARCHAR,
"year" BIGINT,
"mileage" BIGINT,
"engine_hp" BIGINT,
"transmission" VARCHAR,
"fuel_type" VARCHAR,
"drivetrain" VARCHAR,
"body_type" VARCHAR,
"exterior_color" VARCHAR,
"interior_color" VARCHAR,
"owner_count" BIGINT,
"accident_history" VARCHAR,
"seller_type" VARCHAR,
"condition" VARCHAR,
"trim" VARCHAR,
"vehicle_age" BIGINT,
"mileage_per_year" DOUBLE,
"brand_popularity" DOUBLE,
"price" DOUBLE
);
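To illustrate how the schema above might be queried, here is a minimal sketch that loads it into an in-memory SQLite database using Python's standard library. The column names and order come from the CREATE TABLE statement. Everything else is an assumption made for illustration: the three sample rows, their categorical labels (such as "Automatic" or "Dealer"), and the helper names build_demo_db and average_price_by_make are hypothetical and not taken from the dataset. SQLite is a stand-in here, since it accepts the VARCHAR, BIGINT, and DOUBLE type names through type affinity.

```python
import sqlite3

# Schema copied from the CREATE TABLE statement above (same names, same order).
SCHEMA = """
CREATE TABLE vehicle_price_prediction (
    "make" VARCHAR, "model" VARCHAR, "year" BIGINT, "mileage" BIGINT,
    "engine_hp" BIGINT, "transmission" VARCHAR, "fuel_type" VARCHAR,
    "drivetrain" VARCHAR, "body_type" VARCHAR, "exterior_color" VARCHAR,
    "interior_color" VARCHAR, "owner_count" BIGINT,
    "accident_history" VARCHAR, "seller_type" VARCHAR, "condition" VARCHAR,
    "trim" VARCHAR, "vehicle_age" BIGINT, "mileage_per_year" DOUBLE,
    "brand_popularity" DOUBLE, "price" DOUBLE
);
"""

# Hypothetical rows for illustration only; values and labels are NOT from the dataset.
SAMPLE_ROWS = [
    ("Toyota", "Camry", 2018, 60000, 203, "Automatic", "Gasoline", "FWD",
     "Sedan", "White", "Black", 1, "None", "Dealer", "Good", "LE",
     7, 8571.4, 0.9, 18000.0),
    ("Toyota", "Corolla", 2020, 30000, 139, "Automatic", "Gasoline", "FWD",
     "Sedan", "Blue", "Gray", 1, "None", "Private", "Excellent", "SE",
     5, 6000.0, 0.9, 20000.0),
    ("Honda", "Civic", 2016, 90000, 158, "Manual", "Gasoline", "FWD",
     "Sedan", "Red", "Black", 2, "Minor", "Dealer", "Fair", "EX",
     9, 10000.0, 0.8, 15000.0),
]


def build_demo_db():
    """Create an in-memory database with the schema and the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    placeholders = ", ".join("?" * 20)  # one placeholder per column
    conn.executemany(
        f"INSERT INTO vehicle_price_prediction VALUES ({placeholders})",
        SAMPLE_ROWS,
    )
    conn.commit()
    return conn


def average_price_by_make(conn):
    """Return (make, average price, row count) tuples, sorted by make."""
    query = """
        SELECT "make", AVG("price"), COUNT(*)
        FROM vehicle_price_prediction
        GROUP BY "make"
        ORDER BY "make"
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for make, avg_price, n in average_price_by_make(build_demo_db()):
        print(make, avg_price, n)
```

The same GROUP BY query should carry over to the full table with little change, though exact type handling may differ from SQLite in whichever engine hosts the real data.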