YouTube Dislikes Dataset

Data about trending YouTube videos as of December 13, 2021

@kaggle.dmitrynikolaev_youtube_dislikes_dataset

About this Dataset

This dataset contains information about trending YouTube videos from August 2020 to December 2021 for the USA, Canada, and Great Britain.

Context and Content

YouTube announced its decision to hide dislike counts from users around November 2021. However, the official YouTube Data API continued to return dislike counts until December 13, 2021.

This dataset contains the latest available information about dislikes, collected just before December 13. It covers videos that had been trending in the USA, Canada, and Great Britain during the preceding year.

The dataset is aimed at an English-speaking audience. In particular, all non-ASCII and non-Latin characters have been removed from the text fields.
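The source does not show how the characters were removed. As a minimal sketch of one common approach, assuming a plain ASCII filter (the function name `strip_non_ascii` is hypothetical and not taken from the dataset's code):

```python
def strip_non_ascii(text: str) -> str:
    """Drop every character outside the 7-bit ASCII range.

    Assumption: this is one plausible way to clean the text fields,
    not necessarily the method the dataset author used.
    """
    # errors="ignore" silently discards characters that cannot be encoded.
    return text.encode("ascii", errors="ignore").decode("ascii")


# Accented letters and emoji are dropped; surrounding spaces are kept.
print(strip_non_ascii("Café ☕ review"))
```

Note that a filter like this deletes characters rather than transliterating them, so a word such as "Café" loses its final letter instead of becoming "Cafe".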

The comments were retrieved using the following query and combined into a single string:

request = youtube.commentThreads().list(
    part="snippet",
    maxResults=20,
    order="relevance",
    textFormat="plainText",
    videoId=video_id,
)
response = request.execute()

The order="relevance" parameter is ignored when videoId is specified, so the result is effectively 20 arbitrary comments rather than the 20 most relevant ones.
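The step of combining the comments into one string is not shown in the source. A minimal sketch, assuming the response follows the documented commentThreads resource layout and that the comments are joined with a single space (the helper name `combine_comments` and the separator are assumptions):

```python
def combine_comments(response: dict, separator: str = " ") -> str:
    """Join the top-level comment texts of a commentThreads.list response.

    Assumption: the separator and this helper are illustrative; the
    dataset's actual joining logic is not given in the description.
    """
    texts = []
    for item in response.get("items", []):
        # Each item is a comment thread; its text sits in the top-level comment.
        snippet = item["snippet"]["topLevelComment"]["snippet"]
        texts.append(snippet["textDisplay"])
    return separator.join(texts)


# Hand-written stand-in for a real API response, for illustration only.
sample_response = {
    "items": [
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Great video"}}}},
        {"snippet": {"topLevelComment": {"snippet": {"textDisplay": "Thanks for sharing"}}}},
    ]
}
print(combine_comments(sample_response))
```

Using `response.get("items", [])` keeps the helper from failing when a response carries no comment threads; it then returns an empty string.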

The code used to collect this dataset is available here.

To learn more, visit this GitLab repo.

Acknowledgements

This dataset was collected using the official YouTube Data API v3.
Unique video IDs were extracted from YouTube Trending Video Dataset.
Banner image - photo by Alexander Shatov on Unsplash.

Inspiration

Possible uses of this dataset may include a wide range of tasks:

  • Exploratory Data Analysis and Sentiment Analysis
  • Clustering YouTube videos
  • Training neural networks to analyze comments or video descriptions
  • and so on
