FIFA World Cup 2022 Tweets
A Twitter dataset about the FIFA World Cup 2022
@kaggle.tirendazacademy_fifa_world_cup_2022_tweets
Football is one of the most loved sports worldwide. The FIFA World Cup, a global football tournament held every four years, takes place in Qatar this year. This dataset contains 30,000 tweets from the first day of the FIFA World Cup 2022.
The dataset was created using snscrape to collect the tweets and the cardiffnlp/twitter-roberta-base-sentiment-latest model from the Hugging Face Hub to label their sentiment.
The dataset includes English-language tweets containing the hashtag #WorldCup2022. For data preprocessing, we used the tokenizer of the cardiffnlp/twitter-roberta-base-sentiment-latest model and the following function:
def preprocess(text):
    new_text = []
    for t in text.split(" "):
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)
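As a quick check of what the function does, the sketch below applies it to a made-up tweet. The sample text is invented for illustration and is not taken from the dataset. User mentions are replaced with '@user' and links with 'http'; everything else, including hashtags, is left untouched.

```python
def preprocess(text):
    """Mask user mentions and links, as in the function above."""
    new_text = []
    for t in text.split(" "):
        # A token like "@someone" becomes "@user"; a lone "@" is kept as-is.
        t = '@user' if t.startswith('@') and len(t) > 1 else t
        # Any token starting with "http" (http:// or https://) becomes "http".
        t = 'http' if t.startswith('http') else t
        new_text.append(t)
    return " ".join(new_text)


# Hypothetical sample tweet, invented for illustration only.
sample = "@fan123 What a goal! https://example.com/clip #WorldCup2022"
cleaned = preprocess(sample)
print(cleaned)  # @user What a goal! http #WorldCup2022
```

Because the function splits on single spaces only, a mention or link that directly follows a line break or tab inside a tweet would not be masked.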
The collected tweets have been consolidated into a single dataset and shared as a comma-separated values (CSV) file, "fifa_world_cup_2022_tweets.csv".
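For readers who want a first look at the file, here is a minimal sketch that counts the sentiment labels using only the Python standard library. The column name "sentiment" is an assumption taken from the table schema on this page; the header in the raw CSV may be spelled differently, so pass the actual name via the column argument. The UTF-8 encoding and the helper names sentiment_counts and load_counts are likewise assumptions made for this example.

```python
import csv
from collections import Counter


def sentiment_counts(csv_file, column="sentiment"):
    """Count how often each label appears in `column` of an open CSV file.

    `column` is an assumed header name; adjust it to match the actual file.
    """
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


def load_counts(path="fifa_world_cup_2022_tweets.csv", column="sentiment"):
    """Open the CSV at `path` (assumed to be UTF-8) and count the labels."""
    # newline="" lets the csv module handle line breaks inside quoted tweets.
    with open(path, newline="", encoding="utf-8") as f:
        return sentiment_counts(f, column)
```

Calling load_counts() from the directory that holds the downloaded file returns a Counter mapping each label to its number of tweets. If the header differs, a call such as load_counts(column="Sentiment") would be needed instead.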
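The dataset contains the following columns: date_created, number_of_likes, source_of_tweet, tweet, and sentiment, plus a leading column exported as unnamed_0.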
For more information about this dataset, see this blog post.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
Happy learning 😀
CREATE TABLE fifa_world_cup_2022_tweets (
  "unnamed_0" BIGINT, -- Unnamed: 0
  "date_created" VARCHAR,
  "number_of_likes" BIGINT,
  "source_of_tweet" VARCHAR,
  "tweet" VARCHAR,
  "sentiment" VARCHAR
);
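To illustrate how a table with this schema might be queried, the sketch below recreates it in an in-memory SQLite database and summarises tweets and likes per sentiment label. SQLite is used here only as a convenient stand-in; this page does not say which SQL engine hosts the table. The inserted rows, including the date format, source names, and label values, are made up for illustration and are not real records from the dataset.

```python
import sqlite3

# In-memory database so the sketch runs without any files.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE fifa_world_cup_2022_tweets (
        "unnamed_0" BIGINT,
        "date_created" VARCHAR,
        "number_of_likes" BIGINT,
        "source_of_tweet" VARCHAR,
        "tweet" VARCHAR,
        "sentiment" VARCHAR
    )
    """
)

# Made-up rows for illustration only; not real records from the dataset.
rows = [
    (0, "2022-11-20 12:00:00", 4, "Twitter Web App", "example tweet one", "positive"),
    (1, "2022-11-20 12:01:00", 0, "Twitter for iPhone", "example tweet two", "neutral"),
    (2, "2022-11-20 12:02:00", 2, "Twitter for Android", "example tweet three", "positive"),
]
conn.executemany(
    "INSERT INTO fifa_world_cup_2022_tweets VALUES (?, ?, ?, ?, ?, ?)", rows
)

# Number of tweets and average likes per sentiment label, most frequent first.
query = """
    SELECT sentiment, COUNT(*) AS n_tweets, AVG(number_of_likes) AS avg_likes
    FROM fifa_world_cup_2022_tweets
    GROUP BY sentiment
    ORDER BY n_tweets DESC
"""
summary = conn.execute(query).fetchall()
for sentiment, n_tweets, avg_likes in summary:
    print(sentiment, n_tweets, avg_likes)
conn.close()
```

The same SELECT statement should carry over to other SQL engines with little or no change, since it relies only on standard grouping and aggregate functions.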