Intro
Hi 👋,
As a fanatic climber and a junior data engineer, I see a lack of climbing datasets for people like us who want to learn and play around with our passion. With that in mind, I uploaded these 3 tables.
They are based on the original dataset scraped from 8a.nu by David Cohen (https://www.kaggle.com/datasets/dcohen21/8anu-climbing-logbook). Please refer to that dataset if you have any questions about this one.
Use of this dataset
Feel free to use this dataset as you please, just don't forget to mention the source.
Contents
Grading table: grades_conversion_table.csv
This table gives you the conversion from the numbered grades to French grades.
Routes table: routes_rated.csv
name_id -> the route id
I cleaned the ascents table and gave it some shape, so you don't have to worry about problems like having 10 different names for the same route or crag.
grade_mean -> average grade over all ascents of the route
I slightly adjusted the grade of each ascent: if someone stated that a route was a hard 7a, I put 7a/+, and likewise one step down for soft grading. After that I calculated the median of all the grades logged for each route (the median is more robust to outliers than the mean).
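The adjustment and median described above can be sketched in a few lines. This is only an illustration under assumptions: the `SCALE` list is a hypothetical slice of the grade scale (the real numbering is in grades_conversion_table.csv), the `feel` labels and function names are made up for the example, and `median_low` is used so the result always lands on an existing grade. The original notebook may handle ties differently.

```python
from statistics import median_low

# Hypothetical slice of an ordered grade scale (easiest to hardest).
# The real numbering lives in grades_conversion_table.csv.
SCALE = ["6c+", "6c+/7a", "7a", "7a/+", "7a+", "7a+/7b", "7b"]
INDEX = {grade: i for i, grade in enumerate(SCALE)}

def adjusted_index(grade, feel=None):
    """Shift a logged grade one step up for 'hard' or one step down for 'soft'."""
    shift = {"hard": 1, "soft": -1}.get(feel, 0)
    # Clamp so the shift never leaves the (hypothetical) scale.
    return min(max(INDEX[grade] + shift, 0), len(SCALE) - 1)

def route_grade(ascents):
    """Median adjusted grade for one route; ascents is a list of (grade, feel)."""
    return SCALE[median_low(adjusted_index(g, f) for g, f in ascents)]

# Five made-up ascents of the same route; the lone 7b is an outlier.
ascents = [("7a", "hard"), ("7a", None), ("7a", None), ("7a", "soft"), ("7b", None)]
print(route_grade(ascents))  # 7a
```

The single 7b log does not move the result, which is the reason for preferring the median here.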
rating_total -> I computed this from 3 features, taking the first component of a PCA over them:
- comment sentiment
- rating
- recommendations
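To show what "first component of the PCA" means for three features, here is a minimal pure-Python sketch. It is not the code from the original notebook: the standardization step, the power-iteration solver, the sign convention and the sample numbers are all assumptions made for the example, and it assumes at least one feature actually varies between routes.

```python
from statistics import mean, pstdev

def first_principal_component(rows, iterations=200):
    """Project standardized rows onto their first principal component."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by zero
    z = [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in rows]
    k, n = len(cols), len(rows)
    # Covariance matrix of the standardized features (k x k).
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(k)] for i in range(k)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # The sign of a component is arbitrary; orient it so larger inputs score higher.
    if sum(v) < 0:
        v = [-x for x in v]
    return [sum(a * b for a, b in zip(row, v)) for row in z]

# Made-up per-route values: (comment sentiment, rating, recommendations).
routes = [(0.1, 1.0, 0), (0.4, 2.0, 3), (0.6, 2.5, 5), (0.9, 3.0, 9)]
scores = first_principal_component(routes)
print([round(s, 2) for s in scores])
```

Because the three features tend to move together, a single component captures most of their variation, which is what makes it usable as one combined rating.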
tall_recommend_sum -> For each route I add up the following:
- if the person is tall and considers the route easy: +1
- if the person is tall and considers the route hard: -1
- if the person is short and considers the route easy: -1
- if the person is short and considers the route hard: +1
(tall means taller than 180 cm, short means shorter than 170 cm)
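The scoring rules above translate directly into code. In this sketch the function names, the `'easy'`/`'hard'` labels and the sample ascents are hypothetical; only the thresholds and the +1/-1 rules come from the description.

```python
TALL_CM = 180   # taller than this counts as tall
SHORT_CM = 170  # shorter than this counts as short

def height_vote(height_cm, opinion):
    """Return +1, -1 or 0 for one ascent; opinion is 'easy', 'hard' or None."""
    if opinion not in ("easy", "hard"):
        return 0
    if height_cm > TALL_CM:
        return 1 if opinion == "easy" else -1
    if height_cm < SHORT_CM:
        return -1 if opinion == "easy" else 1
    return 0  # climbers between the two thresholds do not contribute

def tall_recommend_sum(ascents):
    """Sum the votes of all ascents of one route; ascents is a list of (height_cm, opinion)."""
    return sum(height_vote(h, o) for h, o in ascents)

# Made-up ascents of a single route.
ascents = [(185, "easy"), (190, "easy"), (165, "hard"), (160, "easy"), (175, "hard")]
print(tall_recommend_sum(ascents))  # 1 + 1 + 1 - 1 + 0 = 2
```

A positive sum suggests the route suits tall climbers, a negative sum suggests it suits short climbers.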
cluster -> I grouped the routes into 9 clusters that can be roughly described as:
0 - Soft routes
1 - Routes for some reason preferred by women
2 - Famous routes
3 - Very hard routes
4 - Very repeated routes
5 - Chipped routes, with soft grading
6 - Traditional, not chipped routes
7 - Routes that are easy to onsight, not very repeated
8 - Very famous routes, but not so repeated and not so traditional
Climbers table: climber_df.csv
date_first -> date of the first ascent
date_last -> date of the last ascent
grades_first -> grade of the first ascent
grades_last -> grade of the last ascent
years_cl -> years climbing
grades_count -> number of routes done by the climber
year_first -> year of the first ascent
year_last -> year of the last ascent
How to obtain this data
If you want to see how I obtained these 3 tables from the raw data, please check my GitHub repos:
Climber table -> https://github.com/jordi-zaragoza/Climbing-Data-Analysis/blob/master/src/1.Project_clean.ipynb
Routes table -> https://github.com/jordi-zaragoza/Climbing-Route-Recommender/blob/master/src/1.get_routes_table.ipynb
Acknowledgement
Thanks to David Cohen (https://www.kaggle.com/datasets/dcohen21/8anu-climbing-logbook).