Predict emotion from textual data: Multi-class text classification

Context

Emotion detection from text is one of the challenging problems in Natural Language Processing. The main reasons are the scarcity of labeled datasets and the multi-class nature of the problem. Humans have a wide variety of emotions, and it is difficult to collect enough records for each one, so class imbalance arises. Here we have labeled data for emotion detection, and the objective is to build an efficient model to detect emotion.

Content

The data is a collection of tweets annotated with the emotions behind them. It has three columns: tweet_id, sentiment, and content. The "content" column holds the raw tweet, and the "sentiment" column holds the emotion behind the tweet. Refer to the starter notebook for more insights.
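As a first look at class imbalance, the number of tweets per label can be counted directly from the three-column layout. The following is a minimal sketch using only the Python standard library. The sample rows and the label names in them are made up for illustration and are not taken from the dataset; `count_labels` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the tweet_id / sentiment / content layout.
# These rows are invented for illustration, not copied from the dataset.
SAMPLE_CSV = """tweet_id,sentiment,content
1,happiness,Just got tickets for the show tonight!
2,sadness,Missing my old flat already
3,happiness,"Sunny morning, coffee in hand"
4,worry,Hope the exam results are okay
"""


def count_labels(csv_file):
    """Return a Counter mapping each sentiment label to its tweet count."""
    reader = csv.DictReader(csv_file)
    return Counter(row["sentiment"] for row in reader)


counts = count_labels(io.StringIO(SAMPLE_CSV))

# Print labels from most to least frequent to make imbalance visible.
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

To run the same count on the real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the downloaded CSV was saved.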

Acknowledgements

This public domain dataset was collected from the data.world platform. Thanks to data.world for releasing it under a Public License.

Inspiration

The data contains 40,000 records spanning 13 different emotions, so it is challenging to build an efficient multi-class classification model. We may need to reduce the number of classes in a logical way, for example by merging closely related emotions, and use more advanced methods to build an efficient model.
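One way to reduce the number of classes is to map fine-grained labels onto a smaller set of coarser ones before training. The sketch below shows the idea with a plain dictionary. The label names, the grouping in `MERGE_MAP`, and the `merge_label` helper are all assumptions chosen for illustration; a real grouping should be decided after inspecting the actual labels and their counts.

```python
from collections import Counter

# Hypothetical grouping of fine-grained emotions into coarser classes.
# Both the label names and the grouping are illustrative assumptions.
MERGE_MAP = {
    "happiness": "positive",
    "love": "positive",
    "sadness": "negative",
    "worry": "negative",
    "neutral": "neutral",
}


def merge_label(label, mapping=MERGE_MAP, default="other"):
    """Map a fine-grained label to its coarse class, or to `default`."""
    return mapping.get(label, default)


labels = ["happiness", "worry", "love", "neutral", "sadness", "boredom"]
merged = [merge_label(label) for label in labels]

# Class counts after merging; unmapped labels fall into "other".
print(Counter(merged))
```

Sending unmapped labels to a catch-all class keeps rare emotions in the data instead of dropping them, though whether that is desirable depends on the modeling goal.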

Related Datasets

Sentiment Analysis Of Tweets by @kaggle
AI Performance On Language Tasks by @owid
Economic Lexicon by @ecjrc
AI Performance On Coding Problems by @owid
AI Performance On Math Problems by @owid
Ethnic Power Relations Dataset (ETH, 2021) by @owid
