Baselight

Kaggle Dataset Metadata Repository

Comprehensive Metadata for Kaggle Datasets Including Owner, Usage, and Licensing

@kaggle.ijajdatanerd_kaggle_dataset_metadata_repository

About this Dataset

Kaggle Dataset Metadata Collection

This dataset provides metadata on Kaggle datasets, including information about dataset owners, creators, usage statistics, and licensing. It can help researchers, data scientists, and Kaggle enthusiasts quickly analyze the key attributes of datasets on Kaggle.

Dataset Overview:

  • Purpose: To provide detailed insights into Kaggle dataset metadata.
  • Content: Information related to the dataset's owner, creator, usage metrics, licensing, and more.
  • Target Audience: Data scientists, Kaggle competitors, and dataset curators.

Column Descriptions

  • datasetUrl: The URL of the Kaggle dataset page. This directs you to the specific dataset's page on Kaggle.

  • ownerAvatarUrl: The URL of the dataset owner's profile avatar on Kaggle.

  • ownerName: The name of the dataset owner. This can be the individual or organization that created and maintains the dataset.

  • ownerUrl: A link to the Kaggle profile page of the dataset owner.

  • ownerUserId: The unique user ID of the dataset owner on Kaggle.

  • ownerTier: The owner's tier, such as "Tier 1" or "Tier 2," indicating the owner's status or level on Kaggle.

  • creatorName: The name of the dataset creator, which could be different from the owner.

  • creatorUrl: A link to the Kaggle profile page of the dataset creator.

  • creatorUserId: The unique user ID of the dataset creator.

  • scriptCount: The number of scripts (kernels) associated with this dataset.

  • scriptsUrl: A link to the scripts (kernels) page for the dataset, where you can explore related code.

  • forumUrl: The URL to the discussion forum for this dataset, where users can ask questions and share insights.

  • viewCount: The number of views the dataset page has received on Kaggle.

  • downloadCount: The number of times the dataset has been downloaded by users.

  • dateCreated: The date when the dataset was first created and uploaded to Kaggle.

  • dateUpdated: The date when the dataset was last updated or modified.

  • voteButton: The metadata for the dataset's vote button, showing how users interact with the dataset's quality ratings.

  • categories: The categories or tags associated with the dataset, helping users filter datasets based on topics of interest (e.g., "Healthcare," "Finance").

  • licenseName: The name of the license under which the dataset is shared (e.g., "CC0," "MIT License").

  • licenseShortName: A short form or abbreviation of the dataset's license name (e.g., "CC0" for Creative Commons Zero).

  • datasetSize: The size of the dataset in terms of storage, typically measured in MB or GB.

  • commonFileTypes: A list of common file types included in the dataset (e.g., .csv, .json, .xlsx).

  • downloadUrl: A direct link to download the dataset files.

  • newKernelNotebookUrl: A link to a new kernel or notebook related to this dataset, for those who wish to explore it programmatically.

  • newKernelScriptUrl: A link to a new script for running computations or processing data related to the dataset.

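  • usabilityRating: A score representing how usable the dataset is. On Kaggle, this score reflects how completely the dataset is documented (for example its description, tags, license, and file and column descriptions) rather than direct user ratings.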

  • firestorePath: A reference to the path in Firestore where this dataset's metadata is stored.

  • datasetSlug: A URL-friendly version of the dataset name, typically used for URLs.

  • rank: The dataset's rank based on certain metrics (e.g., downloads, votes, views).

  • datasource: The source or origin of the dataset (e.g., government data, private organizations).

  • medalUrl: A URL pointing to the dataset's medal or badge, indicating the dataset's quality or relevance.

  • hasHashLink: Indicates whether the dataset has a hash link for verifying data integrity.

  • ownerOrganizationId: The unique organization ID of the dataset's owner if the owner is an organization rather than an individual.

  • totalVotes: The total number of votes the dataset has received from users, reflecting its popularity or quality.

  • category_names: A comma-separated string of category names that represent the dataset's classification.
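
As a rough illustration of how the columns above might be used together, the sketch below splits category_names into a list and derives a downloads-per-view figure from viewCount and downloadCount. It assumes the data is available as CSV with those column names and that the count columns hold plain integers; the sample rows and the helper names parse_categories and download_rate are invented for this example and are not part of the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows: the values are invented for illustration only.
SAMPLE_CSV = """datasetSlug,viewCount,downloadCount,licenseShortName,category_names
example-health-data,1000,250,CC0,"Healthcare, Tabular"
example-finance-data,400,20,MIT,"Finance"
example-empty-views,0,0,CC0,""
"""


def parse_categories(raw):
    """Split a comma-separated category string into a list of clean names."""
    return [name.strip() for name in raw.split(",") if name.strip()]


def download_rate(row):
    """Return downloads per view, or None when the page has no views."""
    views = int(row["viewCount"])
    downloads = int(row["downloadCount"])
    return downloads / views if views else None


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Count how many datasets carry each category name.
category_counts = Counter()
for row in rows:
    category_counts.update(parse_categories(row["category_names"]))

# Map each dataset slug to its downloads-per-view figure.
rates = {row["datasetSlug"]: download_rate(row) for row in rows}
```

Returning None for datasets with zero views avoids a division by zero and keeps those rows distinguishable from datasets that were viewed but never downloaded. With the real file, the in-memory string would be replaced by an opened file handle.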


This dataset is a valuable resource for those who want to analyze Kaggle's ecosystem, discover high-quality datasets, and explore metadata in a structured way.
