
Climb Dataset

Climber and route tables cleaned from the 8a.nu dataset

@kaggle.jordizar_climb_dataset


About this Dataset

Climb Dataset

Intro

Hi 👋,

As a fanatic climber and a junior data engineer, I see that there is a lack of climbing datasets for all the people like us who want to learn and play around with our passion. With that in mind, I uploaded these 3 tables.

It is based on the original dataset scraped from 8a.nu by David Cohen (https://www.kaggle.com/datasets/dcohen21/8anu-climbing-logbook). Please refer to that dataset if you have any questions about this one.

Use of this dataset

Feel free to use this dataset as you please, just don't forget to mention the source.

Contents

Grading table: grades_conversion_table.csv

This table gives you the conversion from the numeric grade scale to French grading.
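
For illustration, a minimal pandas sketch of looking up a French grade from a numeric id, assuming only the grade_id and grade_fra columns shown in the schema further down (the example id is made up):

import pandas as pd

# grade_id is the numeric grade, grade_fra is the French label
grades = pd.read_csv("grades_conversion_table.csv")

# Build a simple lookup from numeric id to French grade
id_to_fra = dict(zip(grades["grade_id"], grades["grade_fra"]))

# 49 is just an example id; check the table for the real mapping
print(id_to_fra.get(49))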

Routes table: routes_rated.csv

name_id -> the route id
I cleaned the ascents table and gave it some shape, so you don't have to worry about problems like having 10 different names for the same route or crag.

grade_mean -> mean grade of all ascents
I adjusted the ascent grading a little: if someone stated that a route was a hard 7a, then I put a 7a/+, and the same for soft grading. After that I calculated the median of all the grades for each route (more robust to outliers).
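
As a rough pandas sketch of that aggregation (not the notebook's exact code; the ascent-level column names is_hard / is_soft and the half-step adjustment are assumptions standing in for the 7a -> 7a/+ shift):

import pandas as pd

# Hypothetical ascent-level table from the raw 8a.nu dump, one row per logged ascent;
# column names here are assumptions for illustration.
ascents = pd.DataFrame({
    "name_id":  [1, 1, 1, 2, 2],
    "grade_id": [49, 49, 50, 53, 53],
    "is_hard":  [1, 0, 0, 0, 0],   # climber flagged the grade as hard
    "is_soft":  [0, 0, 0, 1, 0],   # climber flagged the grade as soft
})

# Nudge the grade half a step up for "hard" votes and down for "soft" votes
ascents["grade_adj"] = ascents["grade_id"] + 0.5 * ascents["is_hard"] - 0.5 * ascents["is_soft"]

# One consensus grade per route: the median is more robust to outliers than the mean
print(ascents.groupby("name_id")["grade_adj"].median())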

rating_tot -> I did this calculation based on 3 features, taking the first component of a PCA (see the sketch after this list):

  • comment sentiment
  • rating
  • recommendations
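
A minimal scikit-learn sketch of that idea, with made-up feature values and illustrative column names (the notebook may scale or name things differently):

import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Made-up per-route features; names and values are only illustrative
features = pd.DataFrame({
    "comment_sentiment": [0.2, 0.8, -0.1, 0.5],
    "rating":            [2.1, 2.9, 1.5, 2.4],
    "recommendations":   [3, 25, 0, 10],
})

# Standardize so no single feature dominates, then keep the first principal component
scaled = StandardScaler().fit_transform(features)
rating_tot = PCA(n_components=1).fit_transform(scaled).ravel()
print(rating_tot)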

tall_recommend_sum -> For each route I add up the following (see the sketch after this list):

  • if the person is tall and considers the route easy: +1
  • if the person is tall and considers the route hard: -1
  • if the person is short and considers the route easy: -1
  • if the person is short and considers the route hard: +1
    (considering tall > 180 cm, short < 170 cm)
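
A rough pandas sketch of that vote, assuming an ascent-level table with the climber's height joined in (column names are assumptions for illustration):

import pandas as pd

# Hypothetical ascent-level table with the climber's height attached
ascents = pd.DataFrame({
    "name_id": [1, 1, 2, 2],
    "height":  [185, 168, 192, 175],   # climber height in cm
    "is_soft": [1, 0, 0, 0],           # climber found the route easy for the grade
    "is_hard": [0, 1, 1, 0],           # climber found the route hard for the grade
})

tall = ascents["height"] > 180
short = ascents["height"] < 170
soft = ascents["is_soft"].astype(bool)
hard = ascents["is_hard"].astype(bool)

# +1 when a tall climber finds it easy or a short climber finds it hard, -1 for the opposite
vote = ((tall & soft) | (short & hard)).astype(int) - ((tall & hard) | (short & soft)).astype(int)

print(vote.groupby(ascents["name_id"]).sum())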

cluster -> I clustered the routes into 9 different clusters that can be more or less identified as follows (a rough clustering sketch follows the list):
0 - Soft routes
1 - Routes for some reason preferred by women
2 - Famous routes
3 - Very hard routes
4 - Very repeated routes
5 - Chipped routes, with a soft rating
6 - Traditional, not chipped routes
7 - Easy to on-sight routes, not very repeated
8 - Very famous routes, but not so repeated and not so traditional
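
The description does not state the exact algorithm or feature set, so this is only a hedged sketch of how routes could be split into 9 groups, here with KMeans on a few of the published columns:

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Use a few of the published numeric columns; the original feature set may differ
routes = pd.read_csv("routes_rated.csv")
X = StandardScaler().fit_transform(
    routes[["grade_mean", "rating_tot", "tall_recommend_sum"]]
)

# Partition the routes into 9 groups, mirroring the `cluster` column
labels = KMeans(n_clusters=9, n_init=10, random_state=0).fit_predict(X)
print(pd.Series(labels).value_counts())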

Climbers table: climber_df.csv

date_first -> date of the first ascent
date_last -> date of the last ascent
grades_first -> grade of the first ascent
grades_last -> grade of the last ascent
years_cl -> years climbing
grades_count -> number of routes done by the climber
year_first -> year of the first ascent
year_last -> year of the last ascent
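
A small usage sketch: loading the climber table and attaching the French label of each climber's hardest logged grade, assuming grades_max is on the same numeric scale as grade_id in the conversion table:

import pandas as pd

climbers = pd.read_csv("climber_df.csv")
grades = pd.read_csv("grades_conversion_table.csv")

# Attach the French label of each climber's hardest logged grade
climbers = climbers.merge(
    grades[["grade_id", "grade_fra"]],
    left_on="grades_max", right_on="grade_id", how="left",
)

print(climbers[["user_id", "years_cl", "grades_count", "grade_fra"]].head())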

How to obtain this data

If you want to see how I obtained these 3 tables from the raw data, please check my GitHub repos at:

Climber table -> https://github.com/jordi-zaragoza/Climbing-Data-Analysis/blob/master/src/1.Project_clean.ipynb
Routes table -> https://github.com/jordi-zaragoza/Climbing-Route-Recommender/blob/master/src/1.get_routes_table.ipynb

Acknowledgement

Thanks to David Cohen (https://www.kaggle.com/datasets/dcohen21/8anu-climbing-logbook).

Tables

Climber Df

@kaggle.jordizar_climb_dataset.climber_df
  • 321.3 KB
  • 10927 rows
  • 16 columns

CREATE TABLE climber_df (
  "user_id" BIGINT,
  "country" VARCHAR,
  "sex" BIGINT,
  "height" BIGINT,
  "weight" BIGINT,
  "age" DOUBLE,
  "years_cl" BIGINT,
  "date_first" TIMESTAMP,
  "date_last" TIMESTAMP,
  "grades_count" BIGINT,
  "grades_first" BIGINT,
  "grades_last" BIGINT,
  "grades_max" BIGINT,
  "grades_mean" DOUBLE,
  "year_first" BIGINT,
  "year_last" BIGINT
);

Grades Conversion Table

@kaggle.jordizar_climb_dataset.grades_conversion_table
  • 4.03 KB
  • 85 rows
  • 3 columns

CREATE TABLE grades_conversion_table (
  "unnamed_0" BIGINT,
  "grade_id" BIGINT,
  "grade_fra" VARCHAR
);

Routes Rated

@kaggle.jordizar_climb_dataset.routes_rated
  • 1.91 MB
  • 55858 rows
  • 10 columns

CREATE TABLE routes_rated (
  "unnamed_0" BIGINT,
  "name_id" BIGINT,
  "country" VARCHAR,
  "crag" VARCHAR,
  "sector" VARCHAR,
  "name" VARCHAR,
  "tall_recommend_sum" BIGINT,
  "grade_mean" DOUBLE,
  "cluster" BIGINT,
  "rating_tot" DOUBLE
);
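
As a quick usage sketch of this table, and assuming grade_mean sits on the same numeric scale as grade_id in the conversion table, the ten highest-rated routes with an approximate French grade could be listed like this:

import pandas as pd

routes = pd.read_csv("routes_rated.csv").dropna(subset=["grade_mean"])
grades = pd.read_csv("grades_conversion_table.csv")

# Round the mean grade to the nearest grade_id to look up an approximate French grade
routes["grade_id"] = routes["grade_mean"].round().astype(int)
routes = routes.merge(grades[["grade_id", "grade_fra"]], on="grade_id", how="left")

# Ten highest-rated routes with their approximate French grade
top = routes.sort_values("rating_tot", ascending=False).head(10)
print(top[["name", "crag", "country", "grade_fra", "rating_tot"]])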
