Japanese Anime: An In-Depth IMDb Data Set
Unlocking Insights into Popularity, Ratings, and Trends in Japanese Animation
@kaggle.lorentzyeung_all_japanese_anime_titles_in_imdb
The dataset was fetched on 8 September 2023 at 18:00 London time.
The dataset was generated using a web scraping script written in Python, utilizing the Scrapy library. The script navigates through IMDb's list of animations originating from Japan, scraping relevant information from each listing. The spider starts from the URL https://www.imdb.com/search/title/?genres=Animation&countries=jp and follows the "Next" links to traverse through multiple pages of listings.
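The "follow the Next link" traversal described above can be illustrated with a small standard-library sketch. This is not the original Scrapy spider: the `NextLinkParser` class, the `crawl` function, and the `fetch` callable are hypothetical names, and matching on link text that starts with "Next" is an assumption about the page markup, which may differ on the live site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

START_URL = "https://www.imdb.com/search/title/?genres=Animation&countries=jp"


class NextLinkParser(HTMLParser):
    """Remembers the href of the first <a> whose text starts with 'Next'."""

    def __init__(self):
        super().__init__()
        self._href = None      # href of the <a> currently being read
        self._text = []        # text fragments seen inside that <a>
        self.next_href = None  # result, stays None on the last page

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            label = "".join(self._text).strip()
            if self.next_href is None and label.startswith("Next"):
                self.next_href = self._href
            self._href = None


def crawl(start_url, fetch, max_pages=100):
    """Yield (url, html) per page, following 'Next' links until none remain.

    `fetch` is any callable mapping a URL to its HTML text (assumption:
    supplied by the caller, e.g. a wrapper around an HTTP client).
    """
    url, seen = start_url, set()
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        parser = NextLinkParser()
        parser.feed(html)
        url = urljoin(url, parser.next_href) if parser.next_href else None
```

The `seen` set and `max_pages` cap guard against pagination loops; a real crawl would also need rate limiting and to respect the site's terms of use.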
The dataset provides a comprehensive view of various animations listed on IMDb that are categorized under the genre "Animation" and originate from Japan. It includes details such as the title, genre, user rating, number of votes, runtime, year of release, summary, stars, certificate, metascore, gross earnings, episode flag, and episode title when applicable.
However, the dataset also includes some animations that are not regarded as Japanese anime, e.g., the Toy Story films.
This is because I could filter the listings only by region, not by the origin of production.
Title: The name of the animation.
Genre: The genre(s) under which the animation falls, e.g., Action, Adventure, etc.
User Rating: The IMDb user rating out of 10.
Number of Votes: The total number of IMDb users who have rated the animation.
Runtime: The duration of the animation in minutes.
Year: The year the animation was released or started airing.
Summary: A brief or full summary of the animation's plot. Full summaries are fetched when available.
Stars: List of main actors or voice actors involved in the animation.
Certificate: The certification of the animation, e.g., PG, PG-13, etc.
Metascore: The Metascore rating, if available, which is an aggregated score from various critics.
Gross: The gross earnings or box office collection of the animation.
Episode: A binary flag indicating whether the listing is for an episode of a series (1 for yes, 0 for no).
Episode Title: The title of the episode if the listing is for an episode; otherwise, it will be None.
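As a rough illustration of how the columns above might be turned into typed values, here is a minimal sketch using only the standard library. The CSV header names and the raw formats ("24 min", "1,234", a year in parentheses) are assumptions, not confirmed by the dataset, and the two sample rows are invented placeholders rather than real records.

```python
import csv
import io
import re


def first_number(text):
    """Return the first number found in text as a float, or None."""
    if not text:
        return None
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else None


def to_int(text):
    """Like first_number, but truncated to an int (None stays None)."""
    value = first_number(text)
    return None if value is None else int(value)


def clean_row(row):
    """Map one raw CSV row (assumed header names) to typed fields."""
    genres = [g.strip() for g in (row.get("Genre") or "").split(",")]
    return {
        "title": (row.get("Title") or "").strip(),
        "genres": [g for g in genres if g],
        "rating": first_number(row.get("User Rating")),
        "votes": to_int(row.get("Number of Votes")),
        "runtime_min": to_int(row.get("Runtime")),
        "year": to_int(row.get("Year")),
        "is_episode": to_int(row.get("Episode")) == 1,
    }


def load_rows(handle):
    """Read a CSV file object and return a list of cleaned rows."""
    return [clean_row(row) for row in csv.DictReader(handle)]


# Invented placeholder rows, only to show the call; not real data.
sample = io.StringIO(
    "Title,Genre,User Rating,Number of Votes,Runtime,Year,Episode\n"
    'Placeholder A,"Animation, Action",8.5,"1,234",24 min,(2019),0\n'
    "Placeholder B,Animation,,,,,1\n"
)
rows = load_rows(sample)
```

Missing values become `None` instead of raising, so later analysis steps can decide how to handle them.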
Exploratory Data Analysis (EDA)
Genre Popularity: Analyze which genres are most popular based on user ratings and number of votes.
Year-wise Trends: Examine how the popularity of anime has evolved over the years.
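One possible way to approach both EDA ideas is sketched here with the standard library. It assumes each record is a dict with `genres`, `rating`, `votes`, and `year` keys; these field names are hypothetical, and the three records are invented placeholders. Weighting ratings by vote count is one choice among several for measuring "popularity".

```python
from collections import defaultdict


def genre_popularity(records):
    """Vote-weighted mean rating and total votes per genre, most-voted first."""
    votes = defaultdict(int)
    weighted = defaultdict(float)
    for rec in records:
        if rec["rating"] is None or not rec["votes"]:
            continue  # skip listings without a usable rating
        for genre in rec["genres"]:
            votes[genre] += rec["votes"]
            weighted[genre] += rec["rating"] * rec["votes"]
    table = {g: (weighted[g] / votes[g], votes[g]) for g in votes}
    return sorted(table.items(), key=lambda item: item[1][1], reverse=True)


def titles_per_year(records):
    """Number of listings per release year, in chronological order."""
    counts = defaultdict(int)
    for rec in records:
        if rec["year"] is not None:
            counts[rec["year"]] += 1
    return dict(sorted(counts.items()))


# Invented placeholder records, only to show the calls.
records = [
    {"genres": ["Action", "Comedy"], "rating": 8.0, "votes": 100, "year": 2001},
    {"genres": ["Action"], "rating": 6.0, "votes": 300, "year": 2001},
    {"genres": ["Comedy"], "rating": None, "votes": None, "year": 2005},
]
by_genre = genre_popularity(records)
by_year = titles_per_year(records)
```

Because a listing can carry several genres, its votes are counted once per genre, so the per-genre totals do not sum to the overall vote count.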
Predictive Modeling
Rating Prediction: Use machine learning algorithms to predict the rating of an anime based on features like genre, runtime, and stars.
Success Prediction: Predict the financial success (Gross earnings) of an anime based on various features.
Content Recommendation
Personalized Recommendations: Use user ratings and genre information to build a recommendation system.
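Since the columns listed above hold one aggregate rating per title rather than per-user ratings, a content-based approach is one option. The sketch below ranks titles by Jaccard similarity of their genre sets and breaks ties by rating. The `catalog` layout, the function names, and the sample entries are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(liked_title, catalog, top_n=3):
    """Return up to top_n titles most similar in genre to liked_title.

    catalog maps title -> (list of genres, rating or None).
    Ties in similarity are broken by the higher rating.
    """
    liked_genres, _ = catalog[liked_title]
    scored = [
        (jaccard(liked_genres, genres), rating or 0.0, title)
        for title, (genres, rating) in catalog.items()
        if title != liked_title
    ]
    scored.sort(reverse=True)
    return [title for _, _, title in scored[:top_n]]


# Invented placeholder catalog, only to show the call.
catalog = {
    "Title A": (["Action", "Fantasy"], 8.0),
    "Title B": (["Action", "Fantasy"], 7.0),
    "Title C": (["Action"], 9.0),
    "Title D": (["Romance"], 9.5),
}
suggestions = recommend("Title A", catalog, top_n=2)
```

Genre overlap alone is a coarse signal; summaries or stars could be added as further features.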
Sentiment Analysis
Summary Sentiment: Perform sentiment analysis on the summary to see if the tone of the summary correlates with user ratings or other features.
Network Analysis
Actor Collaboration: Create a network graph to analyze frequent collaborations between actors.
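The edge weights for such a collaboration graph could be computed as in the following sketch, which counts how many listings each unordered pair of names shares. It assumes the Stars field has already been split into a list of names per listing; the function name and the sample names are placeholders.

```python
from collections import Counter
from itertools import combinations


def collaboration_counts(star_lists):
    """Count shared listings for every unordered pair of names."""
    pairs = Counter()
    for stars in star_lists:
        # Deduplicate and sort so each pair has one canonical ordering.
        names = sorted({s.strip() for s in stars if s.strip()})
        pairs.update(combinations(names, 2))
    return pairs


# Invented placeholder cast lists, only to show the call.
cast_lists = [
    ["Actor A", "Actor B", "Actor C"],
    ["Actor B", "Actor A"],
    ["Actor C"],
]
edges = collaboration_counts(cast_lists)
```

Each `(name, name)` key with its count can be loaded as a weighted edge into a graph library. Because the dataset lists individual episodes as well as series, it may be worth filtering on the Episode flag first so one long-running series does not dominate the counts.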
Time-Series Analysis
Rating Over Time: Analyze how ratings evolve over time for long-running series.
Market Research
Target Audience: Use the certificate and genre information to identify target demographics for marketing anime-related products.
Academic Research
Cultural Impact: Study the cultural impact of anime by analyzing its popularity, genres, and actors.
Data Visualization
Interactive Dashboards: Create dashboards to visualize the data and allow users to filter by various criteria like genre, year, or rating.
Natural Language Processing (NLP)
Topic Modeling: Use NLP techniques to identify common themes or topics in the summaries.
By leveraging Python for data analysis, you can use libraries like Pandas for data manipulation, Matplotlib and Seaborn for data visualization, and scikit-learn for machine learning to extract valuable insights from this dataset.