Baselight

Complete Kaggle Datasets Collection

A dataset of Kaggle datasets, so you can explore while you explore

@kaggle.jessevent_all_kaggle_datasets

About this Dataset

Summary

> Observations: 8,036 unique datasets
> Variables: 14
> Current as of: 16/01/2018

Description

For a bit of fun, I thought I'd write a quick script to retrieve all of the Kaggle datasets and do a bit of analysis on them.
The dataset contains every unique dataset hosted on Kaggle since its inception, and each entry links to the corresponding Kaggle dataset page through its url column.

Future Temptations

If the community is interested, I am tempted to scrape each dataset's page, retrieve its metadata, and consolidate everything into one huge Kaggle data dictionary.

Data Structure

Observations: 8,036 
Variables: 14 
 $ title          <chr> "Trending YouTube Video Statistics (UPDATED)", "7ecb8f4fe2ece9f4c8ffd2... 
 $ description    <chr> "Daily statistics (views, likes, category, tags+) for trending YouTube... 
 $ url            <chr> "https://www.kaggle.com/datasnaek/youtube-new", "https://www.kaggle.co.. 
 $ owner          <chr> "Mitchell J", "Vera Lei", "chfly2000", "snow2011", "Tjb5670", "gabro",... 
 $ kernels        <int> 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... 
 $ discussions    <int> 4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... 
 $ views          <int> 9484, 55, 26, 12, 7, 6, 5, 5, 5, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3... 
 $ downloads      <int> 1668, 2, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0... 
 $ last_updated   <date> 2018-01-16, 2018-01-16, 2018-01-16, 2018-01-16, 2018-01-16, 2018-01-1... 
 $ license        <chr> "CC0", "Other", "Other", "CC0", "CC0", "Other", "Other", "CC0", "Other... 
 $ size           <dbl> 35087677, 127264365, 0, 1635900, 18, 777566, 404381, 137847611, 807171... 
 $ featured       <dbl> 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... 
 $ super_featured <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... 
 $ upvotes        <int> 46, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... 
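
As a rough illustration of working with these 14 columns, here is a minimal Python sketch that reads rows and converts each field to the type shown in the listing above. It assumes the data is available as a comma-separated file with a header row and ISO-formatted dates, which this page does not confirm. The helper names (`parse_row`, `load_rows`) are hypothetical, and the one-row sample is built only from the first value of each column in the listing.

```python
import csv
import io
from datetime import date

# Column names and types follow the listing above; the CSV layout itself
# (header row, comma-separated, ISO dates) is an assumption.
INT_COLUMNS = ("kernels", "discussions", "views", "downloads", "upvotes")
FLOAT_COLUMNS = ("size", "featured", "super_featured")


def parse_row(raw):
    """Convert one csv.DictReader row of strings into typed Python values."""
    row = dict(raw)
    for name in INT_COLUMNS:
        row[name] = int(row[name])
    for name in FLOAT_COLUMNS:
        row[name] = float(row[name])
    row["last_updated"] = date.fromisoformat(row["last_updated"])
    return row


def load_rows(handle):
    """Read and type-convert every row from an open CSV file handle."""
    return [parse_row(raw) for raw in csv.DictReader(handle)]


# One-row sample taken from the first value of each column in the listing.
# The description is truncated there, so it stays truncated here.
SAMPLE_CSV = (
    "title,description,url,owner,kernels,discussions,views,downloads,"
    "last_updated,license,size,featured,super_featured,upvotes\n"
    '"Trending YouTube Video Statistics (UPDATED)",'
    '"Daily statistics (views, likes, category, tags+) for trending YouTube...",'
    "https://www.kaggle.com/datasnaek/youtube-new,Mitchell J,"
    "3,4,9484,1668,2018-01-16,CC0,35087677,1,0,46\n"
)

rows = load_rows(io.StringIO(SAMPLE_CSV))
first = rows[0]

# Share of views that turned into downloads for this dataset.
download_rate = first["downloads"] / first["views"]
print(first["title"], first["last_updated"], round(download_rate, 3))
```

To run this against a real export, pass an open file to `load_rows` instead of the in-memory sample. A real file may contain empty or malformed fields, which this sketch does not handle and which would raise a `ValueError` during conversion.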

Authors

Jesse Vent - Author - jessevent

Acknowledgments

  • GitHub - crypto R-Package
  • Kaggle - Kaggle; Need I say more?
