Agricultural Yield Prediction Dataset (247K Rows)

Kaggle

@kaggle.hvanman2code_agricultural_yield_prediction_dataset_247k_rows

Weather, soil, fertilizer, irrigation, and crop data for yield prediction

Dataset Description

Crop Yield Prediction Dataset

This dataset contains 247,321 synthetic agricultural records. They were generated from agronomy-inspired relationships between weather conditions, soil properties, nutrient levels, farming practices, and crop yield.

The dataset was designed for machine learning regression tasks and includes both numerical and categorical features.

Features

Numerical Features

rainfall_mm
avg_temp_c
humidity_pct
soil_ph
nitrogen_kg_ha
phosphorus_kg_ha
potassium_kg_ha
irrigation_mm
days_from_last_harvest
pest_index

Categorical Features

soil_type
crop_type
fertilized
weather_zone
seed_quality
irrigation_method
season

Target Variable

crop_yield_tons
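
The column names listed above can be written down once as constants and used to check a downloaded file before modeling. The following is a minimal sketch using only the Python standard library. It assumes the data is exported as a CSV file whose header row uses exactly these names; the helper `check_header` and the in-memory sample are hypothetical and only illustrate the idea.

```python
import csv
import io

# Column names copied from the feature list in this description.
NUMERICAL = [
    "rainfall_mm", "avg_temp_c", "humidity_pct", "soil_ph",
    "nitrogen_kg_ha", "phosphorus_kg_ha", "potassium_kg_ha",
    "irrigation_mm", "days_from_last_harvest", "pest_index",
]
CATEGORICAL = [
    "soil_type", "crop_type", "fertilized", "weather_zone",
    "seed_quality", "irrigation_method", "season",
]
TARGET = "crop_yield_tons"
EXPECTED_COLUMNS = NUMERICAL + CATEGORICAL + [TARGET]


def check_header(csv_file):
    """Return (missing, unexpected) column names for an open CSV file.

    Hypothetical helper: reads only the first row and compares it
    against EXPECTED_COLUMNS.
    """
    header = next(csv.reader(csv_file), [])
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# In-memory stand-in for the real file (assumed header, no data rows).
sample = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
missing, unexpected = check_header(sample)
print(len(EXPECTED_COLUMNS), missing, unexpected)  # 18 [] []
```

With a real export, the same function can be called on `open(path, newline="")`, where `path` is wherever the file was saved.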

Dataset Characteristics

Rows: 247,321
Features: 17
Target: 1
Total Columns: 18
Mixed numerical and categorical data
Missing values included
Non-linear feature interactions
Crop-specific yield response functions
Approximately 5-10% stochastic noise

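
Because the dataset includes missing values, a per-column audit is a reasonable first step. The sketch below computes the fraction of missing entries in each column using only the standard library. How missing values are encoded in the file is not stated in this description, so the `MISSING_TOKENS` set is an assumption to adjust after inspecting the data, and the toy rows are made up for illustration.

```python
import csv
import io
from collections import Counter

# Assumption: these strings (case-insensitive) mark a missing value.
MISSING_TOKENS = {"", "na", "nan", "null", "none"}


def missing_rates(csv_file):
    """Return {column: fraction of rows whose value is missing}."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    total = 0
    for row in reader:
        total += 1
        for column, value in row.items():
            if column is None:
                continue  # extra fields beyond the header are ignored
            # DictReader yields None for fields absent from a short row.
            if value is None or value.strip().lower() in MISSING_TOKENS:
                counts[column] += 1
    if total == 0:
        return {}
    return {column: counts[column] / total for column in reader.fieldnames}


# Toy data with made-up values; one gap per column in four rows.
toy = io.StringIO(
    "rainfall_mm,soil_type,crop_yield_tons\n"
    "812.5,a,4.2\n"
    ",b,3.1\n"
    "640.0,,5.0\n"
    "701.3,c,\n"
)
print(missing_rates(toy))
# {'rainfall_mm': 0.25, 'soil_type': 0.25, 'crop_yield_tons': 0.25}
```

Rows with a missing target usually have to be dropped rather than imputed, so it is worth checking `crop_yield_tons` separately.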
Intended Use

Suitable for:

Regression
Feature engineering
Missing value imputation
Explainable AI (SHAP)
Ensemble models
AutoML benchmarking
Agricultural analytics projects
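
As one illustration of the missing value imputation use case listed above, the sketch below fills numerical gaps with the column median and categorical gaps with the most frequent value. This is a simple baseline chosen for the example, not a method prescribed by the dataset. The `impute` function is hypothetical, rows are assumed to be dictionaries with `None` marking a missing value, and the toy values are made up.

```python
from collections import Counter
from statistics import median


def impute(rows, numerical, categorical):
    """Return copies of rows with None values filled.

    Numerical columns use the median of observed values; categorical
    columns use the most frequent observed value. Columns with no
    observed values are left unchanged.
    """
    filled = [dict(row) for row in rows]
    for column in numerical:
        observed = [r[column] for r in rows if r[column] is not None]
        if not observed:
            continue
        fill = median(observed)
        for r in filled:
            if r[column] is None:
                r[column] = fill
    for column in categorical:
        observed = [r[column] for r in rows if r[column] is not None]
        if not observed:
            continue
        fill = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[column] is None:
                r[column] = fill
    return filled


# Toy rows using two column names from the dataset; values are made up.
rows = [
    {"soil_ph": 6.0, "season": "a"},
    {"soil_ph": None, "season": "a"},
    {"soil_ph": 7.0, "season": None},
    {"soil_ph": 6.4, "season": "b"},
]
result = impute(rows, numerical=["soil_ph"], categorical=["season"])
print(result[1]["soil_ph"], result[2]["season"])  # 6.4 a
```

For a regression benchmark, the fill values should be computed on the training split only and then reused on validation and test rows, so that no information leaks across splits.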

Important Note

This is a synthetic dataset generated for educational, benchmarking, and machine learning experimentation purposes. It does not represent measurements collected from real farms.