Context
This dataset contains IMDb data in conjunction with descriptions for the titles taken from Rotten Tomatoes.
I collected this data to complement my Books dataset, so that the two can be used together in projects such as cross-content analysis and recommendations.
Please upvote if this helps you!
Content
The dataset has 21 features in total. The most prominent are:
- Title Type
- Primary Title
- Original Title
- Is Adult?
- Year
- Run-time Minutes
- Genres (Multiple)
- Average Rating (as on IMDb)
- Num. of Votes
- Region
- Genres
- Average Rating
- Number of Ratings
- Types
- Attributes
- Description
Important Data Context
The initial dataset was too large, so it has been filtered using the following criteria:
- Only titles from the 1990s onwards have been retained
- Only 'en' (English) language titles have been retained
- Only titles from the regions Canada, Great Britain, India, and the USA have been retained
- Movies/shows from the 90s-00s with a rating of 7.9 or higher have been retained
- Movies/shows from the 2000s onwards with a rating of 6.5 or higher have been retained
- For Canada and India, only titles with more than 3000 rating votes have been retained
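To make the criteria concrete, here is a minimal sketch of the filter in plain Python. It is not the script used to build the dataset. The keys (`title`, `year`, `language`, `region`, `rating`, `votes`) and the region codes (`CA`, `GB`, `IN`, `US`) are hypothetical and may not match the actual column names or values. The sketch also assumes the 7.9 threshold applies to 1990-1999 and the 6.5 threshold to 2000 onwards, which the description does not state precisely.

```python
# Hypothetical region codes; the real dataset may encode regions differently.
ALLOWED_REGIONS = {"CA", "GB", "IN", "US"}
VOTE_FILTERED_REGIONS = {"CA", "IN"}


def keep_title(row):
    """Return True if a row (a dict with hypothetical keys) passes the filters."""
    if row["year"] < 1990:
        return False
    if row["language"] != "en":
        return False
    if row["region"] not in ALLOWED_REGIONS:
        return False
    # Assumption: 1990-1999 uses the 7.9 threshold, 2000 onwards uses 6.5.
    min_rating = 7.9 if row["year"] < 2000 else 6.5
    if row["rating"] < min_rating:
        return False
    # The vote-count filter applies to Canada and India only.
    if row["region"] in VOTE_FILTERED_REGIONS and row["votes"] <= 3000:
        return False
    return True


# Made-up rows, for illustration only.
sample = [
    {"title": "A", "year": 1994, "language": "en", "region": "US", "rating": 9.3, "votes": 2000000},
    {"title": "B", "year": 1995, "language": "en", "region": "US", "rating": 7.0, "votes": 50000},
    {"title": "C", "year": 2010, "language": "en", "region": "IN", "rating": 7.0, "votes": 2500},
    {"title": "D", "year": 2010, "language": "en", "region": "GB", "rating": 6.5, "votes": 100},
    {"title": "E", "year": 1985, "language": "en", "region": "US", "rating": 8.5, "votes": 90000},
    {"title": "F", "year": 2005, "language": "fr", "region": "CA", "rating": 8.0, "votes": 5000},
]

kept = [row["title"] for row in sample if keep_title(row)]
print(kept)
```

In this toy sample only "A" and "D" pass: "B" is below the 90s rating threshold, "C" has too few votes for India, "E" predates 1990, and "F" is not in English.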
Inspiration
- Cluster movies/shows based on Descriptions and Genres
- Content-based recommendation system using Genre, Description, and Ratings
- Genre prediction from Description data (Multi-label classification)
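As a starting point for the content-based recommendation idea, the sketch below ranks titles by cosine similarity of simple word counts built from the genre and description text. It uses only the standard library. The titles, genres, and descriptions are made up, and the function names (`tokenize`, `cosine`, `recommend`) are hypothetical. A real project would more likely use TF-IDF or embeddings and remove stop words.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two Counter term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(titles, query, k=2):
    """Return the k titles most similar to `query`.

    `titles` maps a title name to a (genres, description) tuple.
    """
    vectors = {
        name: Counter(tokenize(genres + " " + description))
        for name, (genres, description) in titles.items()
    }
    query_vec = vectors[query]
    scored = [
        (cosine(query_vec, vec), name)
        for name, vec in vectors.items()
        if name != query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Made-up titles, for illustration only.
titles = {
    "Space Heist": ("Action,Sci-Fi", "A crew steals a starship from a space station."),
    "Star Run": ("Sci-Fi", "A pilot escapes a space station in a stolen starship."),
    "Baking Days": ("Comedy", "Two friends open a bakery in a small town."),
}

print(recommend(titles, "Space Heist", k=1))
```

Here "Star Run" ranks above "Baking Days" for "Space Heist" because it shares far more genre and description terms. Ratings could be folded in afterwards, for example by re-ranking the top matches by Average Rating.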