Overview
The CSV data file contains tweets about monkeypox scraped from Twitter. The file has eight significant columns:
date
- Date of the tweet
time
- Time of the tweet
id
- Twitter username ID of the person who tweeted about monkeypox
tweet
- Text about monkeypox
language
- Language used in the tweet
replies_count
- Number of replies for the tweet
retweets_count
- Number of retweets
likes_count
- Number of likes
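As a rough illustration of how the columns above might be read, here is a minimal sketch using Python's standard csv module. The names load_tweets, EXPECTED_COLUMNS and COUNT_COLUMNS and the file name in the usage comment are assumptions made for this example and are not part of the dataset. The sketch also assumes the file has a header row with exactly the column names listed above and that the three count columns hold numeric values.

```python
import csv
from typing import IO, Dict, List

# Column names as listed in the overview above.
EXPECTED_COLUMNS = [
    "date", "time", "id", "tweet", "language",
    "replies_count", "retweets_count", "likes_count",
]
# Columns assumed (not confirmed by the dataset page) to hold numeric counts.
COUNT_COLUMNS = ("replies_count", "retweets_count", "likes_count")


def load_tweets(handle: IO[str]) -> List[Dict[str, object]]:
    """Read tweets from an open CSV file, keeping only the eight listed columns."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    rows: List[Dict[str, object]] = []
    for raw in reader:
        row: Dict[str, object] = {name: raw[name] for name in EXPECTED_COLUMNS}
        for name in COUNT_COLUMNS:
            # Empty cells are treated as 0; values such as "3.0" become 3.
            row[name] = int(float(raw[name] or 0))
        rows.append(row)
    return rows


# Hypothetical usage; replace the path with the actual CSV file name:
# with open("monkeypox_tweets.csv", newline="", encoding="utf-8") as f:
#     tweets = load_tweets(f)
#     english = [t for t in tweets if t["language"] == "en"]
```

Any extra columns in the file are ignored, and a missing expected column raises an error early instead of failing later during analysis.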
Similar Dataset
You may also want to check out the Monkeypox Reddit Dataset: https://www.kaggle.com/datasets/vencerlanz09/monkeypox-reddit-topics
Monkeypox Reddit Topics EDA + Sentiment Analysis Notebook: https://www.kaggle.com/code/vencerlanz09/monkeypox-reddit-topics-eda-sentiment-analysis
Inspiration
I'm just starting to learn NLP, and I plan to build a model that predicts whether a given tweet is about monkeypox. Hopefully, I can grasp the concepts quickly and gather an appropriate dataset for this personal project.