
Covid Twitter Emotion Analysis

Kaggle

@kaggle.saurabhshahane_covid_twitter_emotion_analysis


Analysis of tweets during the COVID-19 period

Dataset Description

Context

Twitter data was collected using Twitter’s Application Programming Interface (API) and Tweepy, a Python library for accessing the Twitter API. Keywords related to COVID-19, such as Coronavirus, ncov, Wuhan, China, Covid-19, Epidemic, Pandemic, and SocialDistancing, were used to collect the tweets. Only English-language tweets that had a geo-tag were collected. During the exploratory data analysis, we noticed that many tweets consisted of only a few isolated words rather than proper sentences, and analyzing the emotion of such tweets might not give an accurate overview of the emotions expressed. Thus, only tweets with at least 6 words were used, which significantly reduced the number of tweets retained. The final collection contained over 1 million tweets spanning February, March, April, May, and June. The tweets were then further processed to remove all HTML text, ‘@’ mentions, URL links, and #hashtags.
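The filtering and cleaning steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual pipeline: the regular expressions, the function names, and the choice to count words before cleaning are all assumptions, since the description does not specify them.

```python
import html
import re

MIN_WORDS = 6  # threshold stated in the dataset description

# Illustrative patterns (assumptions); the original cleaning rules are not published here.
TAG_RE = re.compile(r"<[^>]+>")                  # HTML tags
URL_RE = re.compile(r"(?:https?://|www\.)\S+")   # URL links
MENTION_RE = re.compile(r"@\w+")                 # '@' mentions
HASHTAG_RE = re.compile(r"#\w+")                 # #hashtags
SPACE_RE = re.compile(r"\s+")


def has_enough_words(text: str, min_words: int = MIN_WORDS) -> bool:
    """Return True if the raw tweet has at least `min_words` whitespace-separated tokens."""
    return len(text.split()) >= min_words


def clean_tweet(text: str) -> str:
    """Strip HTML remnants, URLs, mentions, and hashtags, then collapse whitespace."""
    text = html.unescape(text)        # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = HASHTAG_RE.sub(" ", text)
    return SPACE_RE.sub(" ", text).strip()


def preprocess(tweets):
    """Keep tweets that pass the length filter, then clean each one."""
    return [clean_tweet(t) for t in tweets if has_enough_words(t)]
```

Counting words on the raw text follows the order in which the steps are described. Applying the threshold after cleaning would be stricter, because mentions, links, and hashtags would no longer count as words.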

Content

The data was analyzed using a machine learning model, and the tweets were categorized by emotion. The dataset provides the count of tweets per country per emotion for each of the 5 months.
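As a rough sketch of how per-tweet labels could be rolled up into counts of this shape, the snippet below tallies records by country, month, and emotion. The record keys (`country`, `month`, `emotion`) and the example emotion labels are hypothetical; the description does not give the dataset's column names, its emotion categories, or the model used.

```python
from collections import Counter


def count_by_country_emotion(records):
    """Tally labeled tweets by (country, month, emotion).

    `records` is an iterable of dicts with hypothetical keys
    'country', 'month', and 'emotion'.
    """
    counts = Counter()
    for rec in records:
        counts[(rec["country"], rec["month"], rec["emotion"])] += 1
    return counts


# Made-up sample records, for illustration only.
sample = [
    {"country": "India", "month": "March", "emotion": "fear"},
    {"country": "India", "month": "March", "emotion": "fear"},
    {"country": "India", "month": "April", "emotion": "joy"},
    {"country": "Canada", "month": "March", "emotion": "joy"},
]

counts = count_by_country_emotion(sample)
for (country, month, emotion), n in sorted(counts.items()):
    print(country, month, emotion, n)
```

A `Counter` returns 0 for any combination that never occurs, which makes it easy to look up sparse country and emotion pairs.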

Acknowledgements

Matta, Nikhil (2020), “Covid Twitter Emotion Analysis Data”, Mendeley Data, V1, doi: 10.17632/47hy8yyky5.1

