Context
A sample of 17,000 github.com developers and the programming languages they know, or want to learn.
Content
I built the data by listing the repositories in the 1,000 most-starred repos dataset and taking the first 30 users who starred each repo, then removing the duplicate users.
Then, for each of the 17,000 users, I calculated the frequency of each of the 1,400 technologies in the metadata of the user's own and forked repositories.
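As a rough illustration of the frequency step above, here is a minimal Python sketch. It is not the script used to build the dataset: the metadata layout (a `language` field and a `topics` list per repository), the function name `technology_frequencies`, and the choice of relative frequency (share of the user's repos that mention a technology) are all assumptions made for the example.

```python
from collections import Counter


def technology_frequencies(repos, technologies):
    """Return, per technology, the share of a user's repos that mention it.

    `repos` is a list of dicts with hypothetical "language" and "topics"
    keys; `technologies` is a set of lowercase technology names.
    """
    counts = Counter()
    for repo in repos:
        # Gather the lowercase labels found in the assumed metadata fields.
        labels = {label.lower() for label in repo.get("topics", [])}
        if repo.get("language"):
            labels.add(repo["language"].lower())
        # Count each known technology at most once per repository.
        counts.update(labels & technologies)
    total = len(repos)
    return {
        tech: (counts[tech] / total if total else 0.0)
        for tech in sorted(technologies)
    }


# Made-up repositories (own and forked) for a single user.
repos = [
    {"language": "Python", "topics": ["flask", "docker"]},
    {"language": "Python", "topics": []},
    {"language": "Go", "topics": ["docker"]},
    {"language": None, "topics": []},
]
technologies = {"python", "go", "docker", "rust"}
freq = technology_frequencies(repos, technologies)
print(freq)
```

Running this for every user gives one row per user and one column per technology, which is the shape described above.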
Acknowledgements
Thanks to Jihye Sofia Seo, whose dataset, Top 980 Starred Open Source Projects on GitHub, is the source for this one.
Inspiration
I am using this dataset for my GitHub recommendation engine: I use it to find similar developers and then use their starred repositories as recommendations.
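One possible way to do the similar-developer lookup is sketched below in Python. The similarity measure is not stated here, so cosine similarity is an assumption, and the user names, repository names, and the helpers `cosine_similarity` and `recommend` are invented for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(target, profiles, starred, top_k=1):
    """Suggest repos starred by the top_k users most similar to `target`."""
    others = [user for user in profiles if user != target]
    # Most similar users first.
    others.sort(
        key=lambda user: cosine_similarity(profiles[target], profiles[user]),
        reverse=True,
    )
    seen = starred.get(target, set())
    suggestions = []
    for user in others[:top_k]:
        # Skip repos the target already starred or that were already suggested.
        for repo in sorted(starred.get(user, set()) - seen):
            if repo not in suggestions:
                suggestions.append(repo)
    return suggestions


# Made-up technology frequency profiles and starred repositories.
profiles = {
    "alice": {"python": 0.8, "docker": 0.2},
    "bob": {"python": 0.7, "docker": 0.3},
    "carol": {"go": 0.9, "rust": 0.1},
}
starred = {
    "alice": {"example/web-framework"},
    "bob": {"example/web-framework", "example/http-client"},
    "carol": {"example/go-tool"},
}
print(recommend("alice", profiles, starred))
```

With 17,000 users and 1,400 technologies, a dense matrix and a vectorized similarity computation would likely be more practical than this dictionary-based version.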
Also, I use this dataset to categorize developer types and to understand the weight of a developer in a team, especially when a developer leaves the company, so that it is possible to estimate the talent lost by the team and the company.