Description
This synthetic dataset models student performance using four factors: study habits, sleep patterns, socioeconomic background, and class attendance. Each row represents a hypothetical student and contains both the input features and the calculated target variable (grades).
The dataset can be used for predictive modeling, exploratory data analysis, or even as a beginner-friendly introduction to machine learning workflows.
Key Features
- Study Hours
  - Description: Average daily hours spent studying.
- Sleep Hours
  - Description: Average daily hours spent sleeping.
- Socioeconomic Score
  - Description: A normalized score (0-1) indicating the student's socioeconomic background.
- Attendance (%)
  - Description: The percentage of classes attended by the student.
- Grades (TARGET)
  - Description: The final performance score of the student, derived from a combination of study hours, sleep hours, socioeconomic score, and attendance.
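To make "derived from a combination" concrete, here is a minimal sketch of how a target like this could be produced: a weighted sum of the four inputs plus random noise, clipped to a 0-100 scale. Everything specific in it is an assumption made for illustration: the column names, the value ranges, the weights, the noise level, and the 0-100 scale are hypothetical and are not the formula used to build this dataset.

```python
import random


def make_students(n, seed=0):
    """Generate n hypothetical student rows (illustrative only)."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for _ in range(n):
        study = rng.uniform(0.0, 10.0)         # assumed range, hours per day
        sleep = rng.uniform(4.0, 10.0)         # assumed range, hours per day
        ses = rng.random()                     # normalized score in [0, 1)
        attendance = rng.uniform(40.0, 100.0)  # assumed range, percent

        # Hypothetical weights, NOT the dataset's actual formula.
        signal = 4.0 * study + 1.5 * sleep + 10.0 * ses + 0.3 * attendance
        noise = rng.gauss(0.0, 3.0)            # controlled randomness

        # Clip to an assumed 0-100 grade scale.
        grade = min(100.0, max(0.0, signal + noise))

        rows.append({
            "study_hours": study,
            "sleep_hours": sleep,
            "socioeconomic_score": ses,
            "attendance_pct": attendance,
            "grades": grade,
        })
    return rows


if __name__ == "__main__":
    for row in make_students(3, seed=42):
        print(row)
```

Seeding the generator keeps the sketch reproducible, and changing the noise standard deviation is a simple way to see how much harder the target becomes to predict.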
Usage Notes
- This dataset is synthetic and does not represent real students. It was created for educational and demonstration purposes.
- While the dataset strives for realism, it includes controlled randomness and noise to simulate real-world data variability.
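As a starting point for the exploratory analysis and predictive modeling mentioned in the description, the sketch below computes a Pearson correlation and fits a one-feature least-squares line using only the Python standard library. The function names and the toy values are made up for this example; they are not rows from the dataset, and in practice you would pass in the corresponding columns after loading the file.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Toy values invented for illustration; not taken from the dataset.
    study_hours = [1.0, 2.0, 3.0, 4.0, 5.0]
    grades = [52.0, 55.0, 61.0, 64.0, 70.0]

    r = pearson(study_hours, grades)
    slope, intercept = fit_line(study_hours, grades)
    print(f"correlation: {r:.3f}")
    print(f"grades ~ {slope:.2f} * study_hours + {intercept:.2f}")
```

A single-feature fit like this is only a baseline. Because the target combines all four inputs and includes noise, a model that uses every feature should be expected to explain more of the variance, though not all of it.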
Acknowledgments
This dataset is a beginner-friendly introduction to synthetic dataset creation, aiming to help the community experiment with realistic yet controlled data. Happy exploring! 🚀