Baselight

Top 10000 Popular Movies Dataset

Top 10000 Popular movies based on TMDB ratings

@kaggle.omkarborikar_top_10000_popular_movies

About this Dataset

Context

Recommendation systems are used everywhere nowadays: Netflix, Amazon Prime, YouTube, online shopping sites, and so on. Datasets like this one are a great way to start working on a recommendation system.
The dataset was created from the official API provided by TMDB.

Content

This dataset contains 10000 popular movies, selected on the basis of TMDB ratings. It is a good starting point for experimenting with recommendation algorithms.
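As one possible starting point for such experiments, here is a minimal sketch of a content-based recommender that ranks movies by word overlap (Jaccard similarity) between their `overview` texts. The helper names (`tokens`, `jaccard`, `recommend`) and the three toy records are invented for illustration and are not taken from the dataset; a real experiment would read the rows from the dataset file instead.

```python
import re

def tokens(text):
    """Return the set of lowercase words in an overview string."""
    return set(re.findall(r"[a-z]+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def recommend(movies, title, k=2):
    """Return up to k titles whose overviews overlap most with the given title's."""
    target = next(m for m in movies if m["original_title"] == title)
    target_tokens = tokens(target["overview"])
    scored = [
        (jaccard(target_tokens, tokens(m["overview"])), m["original_title"])
        for m in movies
        if m is not target
    ]
    # Highest similarity first; ties broken alphabetically by title.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [movie_title for _, movie_title in scored[:k]]

# Hypothetical toy records (NOT real rows from the dataset).
toy_movies = [
    {"original_title": "Placeholder A",
     "overview": "A detective hunts a thief in the city"},
    {"original_title": "Placeholder B",
     "overview": "A thief hides from a detective in the city"},
    {"original_title": "Placeholder C",
     "overview": "Two friends open a bakery"},
]

suggestions = recommend(toy_movies, "Placeholder A", k=1)
```

Plain word overlap is only a baseline. It ignores word importance and the other columns, so it is meant as a first experiment rather than a finished method.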

| Column Name | Description |
| --- | --- |
| id | Unique ID of the movie. |
| original_language | Original language of the movie, as an ISO 639-1 code (e.g. 'en' for English, 'hi' for Hindi). The column contains 44 languages in total; 7771 movies have English as their original language. |
| original_title | Title of the movie. |
| popularity | Popularity of the movie. The larger the number, the higher the popularity. |
| release_date | Release date of the movie. If no release date is present, the movie has not been released yet. |
| vote_average | Average rating (vote) for the movie. |
| vote_count | Number of ratings (votes) recorded for the movie. |
| genre | Genre of the movie. |
| overview | Brief description of the movie, as a string. |
| revenue | Revenue of the movie. |
| runtime | Runtime of the movie in minutes. |
| tagline | Tagline of the movie. |
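The column descriptions above can be turned into a small loading routine. The following is a sketch using only the Python standard library. It assumes the data is available as CSV with a header row matching the column names above; the actual file name, column order, and value formats are not stated here, and the three sample rows are invented placeholders rather than real records. Treating an empty `revenue` or `runtime` as 0 is also an assumption.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows (NOT real records from the dataset), using the
# column names listed above. The real file's layout may differ.
SAMPLE_CSV = """id,original_language,original_title,popularity,release_date,vote_average,vote_count,genre,overview,revenue,runtime,tagline
1,en,Example Movie A,120.5,2019-05-01,7.2,1500,Drama,A placeholder overview.,1000000,110,A placeholder tagline.
2,hi,Example Movie B,340.0,,0.0,0,Action,Another placeholder overview.,0,0,
3,en,Example Movie C,85.25,2010-11-12,6.1,800,Comedy,A third placeholder overview.,250000,95,
"""

def load_movies(handle):
    """Read CSV rows and convert the numeric columns to numbers.

    An empty release_date becomes None, which the column description
    interprets as "not released yet".
    """
    movies = []
    for row in csv.DictReader(handle):
        row["id"] = int(row["id"])
        row["popularity"] = float(row["popularity"])
        row["vote_average"] = float(row["vote_average"])
        row["vote_count"] = int(row["vote_count"])
        # Assumption: a missing revenue or runtime is treated as 0.
        row["revenue"] = int(float(row["revenue"] or 0))
        row["runtime"] = int(float(row["runtime"] or 0))
        row["release_date"] = row["release_date"] or None
        movies.append(row)
    return movies

movies = load_movies(io.StringIO(SAMPLE_CSV))

# Number of movies per ISO 639-1 language code.
language_counts = Counter(m["original_language"] for m in movies)

# Movies without a release date, i.e. not released yet.
unreleased = [m["original_title"] for m in movies if m["release_date"] is None]

# Larger popularity values first.
by_popularity = sorted(movies, key=lambda m: m["popularity"], reverse=True)
```

To run this against the real data, the `io.StringIO(SAMPLE_CSV)` argument could be replaced with a file opened via `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded file.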

Origin

The code used to extract this dataset can be found here: Creating Dataset of top 10000 popular movies

Update

Added the overview, revenue, runtime, and tagline columns for each movie.
