Marathon Time Predictions
Predict Marathon Results from Athletes Open Data Sources
@kaggle.girardi69_marathon_time_predictions
Every marathoner has a time goal in mind, and the result reflects months of training. Long runs, strides, weekly kilometers, and physical exercise all contribute to the result. Marathon time prediction is an art, generally guided by expert physiologists who prescribe the weekly exercises and the milestones leading up to the marathon.
Unfortunately, runners face many distractions while preparing for a marathon (work, family, illness), so each of us arrives at the marathon with our own story.
The "simple" approach is to look at data after the competition, the Leaderboard.
As a start, I take just two values from the athlete's history, both meaningful and easy to extract: the average kilometers run during the 4 weeks before the marathon, and the average speed at which the athlete ran those kilometers.
Meaningful, because the last month of training recaps all the previous months that brought me to the marathon.
Easy to extract, because Strava offers a "side-by-side" comparison between myself and the reference athlete. In practice it is not that easy: I have to look up every athlete and write down those numbers as of the exact day of the marathon, otherwise the rest days after the marathon end up in the average.
My future work is to extract more data and build better algorithms. Thank you for helping me understand, and for any suggestions.
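The extraction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the activity layout (day, kilometers, hours) and the function name are hypothetical and not a Strava export format, the window is taken as the 28 days ending on marathon day, and speed is assumed to be in km/h, computed as total distance over total time.

```python
from datetime import date, timedelta


def four_week_summary(activities, marathon_day):
    """Return (average weekly km, average speed in km/h) for the 28 days
    ending on marathon_day, marathon included.

    activities: iterable of (day, km, hours) tuples. This layout is a
    hypothetical one chosen for the example.
    """
    start = marathon_day - timedelta(days=27)  # 28 days, both ends included
    # Keep only activities inside the window, so rest days and runs
    # after the marathon never enter the averages.
    window = [(km, hours) for day, km, hours in activities
              if start <= day <= marathon_day]
    total_km = sum(km for km, _ in window)
    total_hours = sum(hours for _, hours in window)
    if total_hours == 0:
        raise ValueError("no activity recorded in the four-week window")
    km_per_week = total_km / 4          # weekly average over the 4 weeks
    speed_kmh = total_km / total_hours  # distance-weighted average speed
    return km_per_week, speed_kmh
```

Dividing total distance by total time means slow warm-up and cool-down kilometers pull the average down in proportion to the time spent on them.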
id:
simple counter
Marathon:
The name of the marathon the data were extracted from. I use the data from the Strava "Side by side comparison" and from the final marathon results
Name:
The athlete's name. There are still some problems with UTF-8; I'll fix that soon
Category:
the sex and age group of a runner
km4week:
This is the average number of kilometers run per week over the last 4 weeks before the marathon, marathon included. If, for example, km4week is 100, the athlete has run 400 km in the four weeks before the marathon
sp4week:
This is the average speed of the athlete in the last 4 training weeks. The average counts all the kilometers run, including the slow kilometers before and after the main workout. A typical running session can be 2 km of slow running, then 12-14 km of fast running, and finally another 2 km of slow running. The average speed over the whole session is the number used here, and over time this is one of the numbers that has to be refined
cross training:
If the runner is also a cyclist or a triathlete, does it count? Use this parameter to see whether the athlete also cross-trains in other disciplines
Wall21:
In decimal. The tricky field. To count as a good performance, as a marathoner, I have to run the first half marathon with the same split as the second half. If, for example, I run the first half marathon in 1h30m, I must finish the marathon in 3h (for doing a good job). If I finish in 3h20m, I started too fast and hit "the wall". My training history is then a less reliable basis for prediction, since I misjudged the result I could achieve
Marathon time:
In decimal. This is the final result. Based on my training history, I must predict my expected Marathon time
Category:
This is an ancillary field. It gives some direction, so feel free to use or discard it. It groups in:
Thank you to the main sources of athlete data, GARMIN and STRAVA
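The Wall21 and Marathon time fields described above lend themselves to a small pacing check. The sketch below assumes both values are decimal hours (1h30m stored as 1.5). The function names and the 5-minute tolerance are hypothetical choices made for the example, not part of the dataset.

```python
def format_decimal_hours(hours):
    """Turn decimal hours (for example 1.5) into a string like '1h30m'."""
    total_minutes = round(hours * 60)
    return f"{total_minutes // 60}h{total_minutes % 60:02d}m"


def second_half_slowdown(wall21, marathon_time):
    """Hours lost in the second half compared with the first half."""
    second_half = marathon_time - wall21
    return second_half - wall21


def hit_the_wall(wall21, marathon_time, tolerance_hours=5 / 60):
    """True when the second half is slower than the first half by more
    than tolerance_hours (an arbitrary threshold, assumed here)."""
    return second_half_slowdown(wall21, marathon_time) > tolerance_hours
```

With the example from the Wall21 description, a 1h30m first half and a 3h finish gives `hit_the_wall(1.5, 3.0)` equal to `False`, while a 3h20m finish gives `hit_the_wall(1.5, 3 + 20 / 60)` equal to `True`. Rows flagged this way could be set aside or down-weighted when fitting a model.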
Based on my training history, I must predict my expected Marathon time. Which other relevant data could help me to be more precise? Heart rate, cadence, speed training, what else? And how could I get those data?
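As a starting point for that prediction, here is a minimal sketch of a one-feature least-squares baseline, for instance marathon time against sp4week. Ordinary least squares is an assumption chosen for simplicity, not the method used to build this dataset, and the function names are hypothetical. The inputs are plain lists of numbers read from the corresponding columns.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Predicted y (for example marathon time) for a new x value."""
    return slope * x + intercept
```

A typical use would be `slope, intercept = fit_line(sp4week_values, marathon_times)` followed by `predict(slope, intercept, my_sp4week)`. Comparing its error against a model that also uses km4week, heart rate, or cadence is one way to judge whether the extra data actually helps.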