Stock Market News Data In Portuguese

Sentiment Analysis Dataset for Financial News in Brazilian Portuguese

@kaggle.mateuspicanco_financial_phrase_bank_portuguese_translation

About this Dataset

The Financial Phrase Bank is a dataset originally developed for the paper Good Debt or Bad Debt: Detecting Semantic Orientations in Economic Texts [1], made available by researchers from Aalto University and the Indian Institute of Management. It is a useful benchmark for fine-tuning language models on sentiment analysis tasks.

As the amount of annotated text data in Portuguese is limited (especially about the financial market), I translated the entire dataset so that people can try out sentiment analysis tasks in Portuguese.

Content

The original dataset contains about 4,840 manually annotated financial news sentences in English. This translated version consists of three columns:

  1. y: the annotated label for the sentiment of the news text (neutral, positive, negative);
  2. text: the original text for each record;
  3. text_pt: the Portuguese translation of the original record, which I manually validated.
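
The column layout above can be checked with a few lines of Python. This is a minimal sketch using only the standard library. It assumes the data is distributed as a CSV file with a header row naming the three columns. The helper names (read_records, label_counts) and the three sample rows are made up for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import Counter

# Column names and label values as described in the Content section.
EXPECTED_COLUMNS = ("y", "text", "text_pt")
VALID_LABELS = {"neutral", "positive", "negative"}


def read_records(handle):
    """Read rows from an open CSV handle, validating columns and labels."""
    reader = csv.DictReader(handle)
    fieldnames = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in fieldnames]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    records = []
    for row in reader:
        label = row["y"].strip().lower()
        if label not in VALID_LABELS:
            raise ValueError(f"unexpected label: {row['y']!r}")
        records.append({"y": label, "text": row["text"], "text_pt": row["text_pt"]})
    return records


def label_counts(records):
    """Count how many records carry each sentiment label."""
    return Counter(r["y"] for r in records)


# Illustrative rows only (invented for this sketch, not real dataset records).
SAMPLE_CSV = """y,text,text_pt
positive,Profit rose compared with last year.,O lucro aumentou em relação ao ano passado.
neutral,The company is based in Helsinki.,A empresa está sediada em Helsinque.
negative,Sales fell in the third quarter.,As vendas caíram no terceiro trimestre.
"""

records = read_records(io.StringIO(SAMPLE_CSV))
counts = label_counts(records)
print(dict(counts))
```

To run the same check on the real file, pass a handle opened with encoding="utf-8" and newline="" to read_records instead of the in-memory sample. The actual file name and encoding are not stated here, so verify both against the download.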

Acknowledgments

[1] Malo, P., Sinha, A., Korhonen, P., Wallenius, J., & Takala, P. (2014). Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology, 65(4), 782-796.

Photo by Markus Winkler on Unsplash