Skills of 17,000 github.com users.

Context

A sample of 17,000 github.com developers and the programming languages they know, or want to learn.

Content

I acquired the data by listing the repositories in the dataset of the 1,000 most starred repos and taking the first 30 users that starred each repo, then removing the duplicates.
Then, for each of the 17,000 users, I calculated the frequency of each of the 1,400 technologies in the metadata of the user's own and forked repositories.
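
The frequency step above could look roughly like the following. This is only a minimal sketch under stated assumptions, not the script used to build the dataset: the repository metadata shape (a dict with "language" and "topics" keys), the function name technology_frequencies, and the choice to normalize counts so they sum to 1 are all hypothetical.

```python
from collections import Counter


def technology_frequencies(repos, technologies):
    """Return, per technology, its share of all technology mentions in a user's repos.

    `repos` is an assumed shape: a list of dicts with a "language" string
    (or None) and a "topics" list. `technologies` is the set of known
    technology names, in lower case.
    """
    counts = Counter()
    for repo in repos:
        # Gather the repo's main language and topics, lower-cased and
        # de-duplicated so one repo counts a technology at most once.
        raw_tags = [repo.get("language"), *repo.get("topics", [])]
        tags = {tag.lower() for tag in raw_tags if tag}
        for tag in tags:
            if tag in technologies:
                counts[tag] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalizing is an assumption; raw counts would work as well.
    return {tech: n / total for tech, n in counts.items()}


example_repos = [
    {"language": "Python", "topics": ["flask", "python"]},
    {"language": "JavaScript", "topics": []},
    {"language": None, "topics": ["flask"]},
]
example_frequencies = technology_frequencies(
    example_repos, {"python", "flask", "javascript"}
)
```

Each user then becomes one row of such frequencies, with one column per technology.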

Acknowledgements

Thanks to Jihye Sofia Seo, whose dataset "Top 980 Starred Open Source Projects on GitHub" is the source for this dataset.

Inspiration

I am using this dataset for my GitHub recommendation engine: I use it to find similar developers and then use their starred repositories as recommendations.
I also use this dataset to categorize developer types and to understand the weight of a developer in a team, especially when a developer leaves the company, so that it is possible to estimate the talent lost for the team and the company.
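
Finding similar developers from the per-user technology frequencies can be sketched as below. Cosine similarity is an assumption here, chosen as a common measure for comparing such vectors; the recommendation engine itself may use something else. The names cosine_similarity, most_similar and the sample profiles are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    # Only technologies present in both profiles contribute to the dot product.
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(target, profiles, top_n=3):
    """Return up to `top_n` (user, score) pairs most similar to `target`."""
    scores = [
        (other, cosine_similarity(profiles[target], vector))
        for other, vector in profiles.items()
        if other != target
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


# Made-up profiles, for illustration only.
sample_profiles = {
    "alice": {"python": 0.5, "flask": 0.5},
    "bob": {"python": 0.6, "django": 0.4},
    "carol": {"javascript": 1.0},
}
neighbours = most_similar("alice", sample_profiles, top_n=2)
```

The starred repositories of the top-ranked neighbours would then serve as recommendation candidates for the target user.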
