IMDb Movies
IMDb Movies Dataset (Sorted by popularity)
@kaggle.elvinrustam_imdb_movies_dataset
This dataset was scraped from IMDb, with movies ordered from highest to lowest popularity.
There are 9,083 movies in total in the dataset.
!UNCLEAN VERSION: IMDbMovies
Title: The name of the movie.
Summary: A brief overview of the movie's plot.
Director: The person responsible for overseeing the creative aspects of the film.
Writer: The individual who crafted the screenplay and story for the movie.
Main Genres: The primary categories or styles that the movie falls under.
Motion Picture Rating: The age-appropriate classification for viewers.
Motion Picture Rating Categories:
G (General Audience): Suitable for all ages; no offensive content.
PG (Parental Guidance): May contain mild language, violence, or thematic elements; parental guidance advised.
PG-13 (Parents Strongly Cautioned): Some material may be inappropriate for those under 13; more intense violence, language, or suggestive content.
R (Restricted): Viewers under 17 require an accompanying parent or adult guardian; may contain adult themes, strong language, sexual content, or violence.
NC-17 (Adults Only): No one 17 or under admitted; may contain explicit sexual content or graphic violence.
Runtime: The total duration of the movie.
Release Year: The year in which the movie was officially released.
Rating: The average score given to the movie by viewers.
Number of Ratings: The total count of ratings submitted by viewers.
Budget: The estimated cost of producing the movie.
Gross in US & Canada: The total earnings from the movie's screening in the United States and Canada.
Gross worldwide: The overall worldwide earnings of the movie.
Opening Weekend Gross in US & Canada: The amount generated during the initial weekend of the movie's release in the United States and Canada.
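As a quick sanity check, the columns described above can be compared against the header of a downloaded file. The following is a minimal sketch using only the Python standard library. It assumes the CSV header uses exactly the column names listed in this description, which may not hold for the real file, and the in-memory sample row is hypothetical.

```python
import csv
import io

# Column names as listed in the description above (assumed to match the
# CSV header exactly; adjust if the real file spells them differently).
EXPECTED_COLUMNS = [
    "Title", "Summary", "Director", "Writer", "Main Genres",
    "Motion Picture Rating", "Runtime", "Release Year", "Rating",
    "Number of Ratings", "Budget", "Gross in US & Canada",
    "Gross worldwide", "Opening Weekend Gross in US & Canada",
]


def missing_columns(csv_file):
    """Return the expected columns that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    header = reader.fieldnames or []
    return [name for name in EXPECTED_COLUMNS if name not in header]


# Hypothetical in-memory sample standing in for the real file.
sample = io.StringIO("Title,Summary,Director\nExample Movie,A plot.,Jane Doe\n")
absent = missing_columns(sample)
print(absent)
```

To run the same check on a real download, pass an open file object (for example `open(path, newline="", encoding="utf-8")`) instead of the `StringIO` sample.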
!CLEAN VERSION: IMDbMovies-Clean
What I did:
I kept all missing values. In most cases, missing values stem from a lack of information on the website; in a few cases, they stem from the scraper. For example, some movies are scheduled for release in 2024, so they have no runtimes or ratings yet.
I changed the format of the 'Runtime', 'Rating', 'Number of Ratings', 'Budget', 'Gross in US & Canada', 'Gross worldwide', and 'Opening Weekend Gross in US & Canada' columns.
In some cases, I utilized the information from a single column to create two separate columns.
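The cleaning steps above might look roughly like the sketch below. This is not the author's actual code: the raw string formats ("2h 22m" for runtime, "1.2M" for rating counts, "$25,000,000 (estimated)" for money) are assumptions, since the description does not document them, and the helper names are hypothetical. Missing values are kept as `None` rather than dropped, and the money helper illustrates how one column could yield two (currency and amount).

```python
import re

# Multipliers for the assumed "K"/"M" suffixes on rating counts.
_SUFFIX = {"K": 1_000, "M": 1_000_000}


def parse_runtime_minutes(text):
    """Convert an assumed '2h 22m' style string to minutes; None if missing."""
    if not text:
        return None
    match = re.fullmatch(r"\s*(?:(\d+)\s*h)?\s*(?:(\d+)\s*m)?\s*", text)
    if match is None or not any(match.groups()):
        return None
    hours = int(match.group(1) or 0)
    minutes = int(match.group(2) or 0)
    return hours * 60 + minutes


def parse_count(text):
    """Convert an assumed '1.2M' or '532' style count to an int; None if missing."""
    if not text:
        return None
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([KM]?)\s*", text)
    if match is None:
        return None
    return round(float(match.group(1)) * _SUFFIX.get(match.group(2), 1))


def split_money(text):
    """Split an assumed '$25,000,000 (estimated)' string into (currency, amount)."""
    if not text:
        return (None, None)
    match = re.match(r"\s*([^\d\s]*)\s*(\d[\d,]*)", text)
    if match is None:
        return (None, None)
    currency = match.group(1) or None
    amount = int(match.group(2).replace(",", ""))
    return (currency, amount)


print(parse_runtime_minutes("2h 22m"))
print(parse_count("1.2M"))
print(split_money("$25,000,000 (estimated)"))
```

Returning `None` for empty or unparseable input mirrors the decision to keep missing values in the clean version instead of filtering those rows out.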