Twitter Vaccination Dataset
Tweets on vaccination 2019-3-10 : 2019-21-06
@kaggle.keplaxo_twitter_vaccination_dataset
There is a lot more that we can attain from social media sentiment and data than mere likes and shares, especially where health care is concerned. This dataset is part of the data collected for the Vaccine hesitancy challenge on JOGL. We believe it is important to capture the views and trends of the public, and social media sites like Twitter provide a good window into this area.
We collected all tweets containing the search string "vaccination". Along with the tweet text, we downloaded the date and time when the tweet was published and the location of the user (if provided). We also downloaded the user id, follower ids, and friend ids. The followers of a user A are those users who will receive messages from user A. The friends of a user A are those users from whom user A receives messages. Thus, information flows from a user to their followers. We collected the tweets using the open-source information tool Twint (https://github.com/twintproject) and a Python algorithm.
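The follower and friend definitions above describe one directed relationship seen from two sides. A minimal sketch, assuming the follower ids are held in a plain dictionary (the function name and the user ids "A", "B", "C" are hypothetical, not part of the dataset), shows how a friends index can be derived by inverting the followers index:

```python
from collections import defaultdict


def build_friend_index(followers_of):
    """Invert a followers mapping into a friends mapping.

    followers_of[a] is the set of users who receive messages from a.
    The result maps each user b to the set of users b receives messages from.
    """
    friends_of = defaultdict(set)
    for user, followers in followers_of.items():
        for follower in followers:
            # Information flows user -> follower, so user is a friend of follower.
            friends_of[follower].add(user)
    return dict(friends_of)


# Hypothetical ids, for illustration only.
followers_of = {
    "A": {"B", "C"},
    "B": {"C"},
}
friends_of = build_friend_index(followers_of)
print(sorted(friends_of["C"]))  # ['A', 'B']
```

Here C receives messages from both A and B, so both appear as friends of C, while A has no friends in this toy example because nobody it follows is listed.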
In contrast to the open Twitter Search API, which only allows one to query tweets posted within the last seven days, Twint makes it possible to collect a much larger sample of Twitter posts spanning several years. We queried Twint for different key terms related to vaccination, covering the period from 2006 to 30 November 2019, and stored the results in an aggregated CSV file.
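The aggregation step is not described in detail, so the following is only a sketch of one way several per-term exports could be merged into a single CSV. It assumes every export shares the same header and that a tweet matching more than one key term should be kept once, identified by its id column. The function name, the sample rows, and the deduplication rule are all assumptions for illustration.

```python
import csv
import io


def aggregate_tweets(csv_streams, out_stream, key="id"):
    """Merge CSV exports into one stream, keeping the first row seen per key.

    Assumes all input streams share the same column header.
    Returns the number of rows written.
    """
    seen = set()
    writer = None
    kept = 0
    for stream in csv_streams:
        reader = csv.DictReader(stream)
        for row in reader:
            if row[key] in seen:
                continue  # the same tweet matched more than one search term
            seen.add(row[key])
            if writer is None:
                # Reuse the header of the first non-empty export.
                writer = csv.DictWriter(out_stream, fieldnames=reader.fieldnames)
                writer.writeheader()
            writer.writerow(row)
            kept += 1
    return kept


# Hypothetical per-term exports with made-up rows.
term_one = io.StringIO("id,date,tweet\n1,2019-03-10,first\n2,2019-03-11,second\n")
term_two = io.StringIO("id,date,tweet\n2,2019-03-11,second\n3,2019-03-12,third\n")
merged = io.StringIO()
kept = aggregate_tweets([term_one, term_two], merged)
print(kept)  # 3
```

With real files, the in-memory streams would be replaced by files opened with newline="" as the csv module recommends.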
We wouldn't be here without the help of others.
To my knowledge, no program is currently carrying out qualitative analysis of Twitter data for sentiment associated with vaccination. However, a number of studies have analysed Twitter for social media trends on vaccination.
The dataset can be used for analysis, including:
CREATE TABLE master (
"id" DOUBLE,
"conversation_id" DOUBLE,
"created_at" DOUBLE,
"date" TIMESTAMP,
"time" VARCHAR,
"timezone" VARCHAR,
"user_id" DOUBLE,
"username" VARCHAR,
"name" VARCHAR,
"place" VARCHAR,
"tweet" VARCHAR,
"mentions" VARCHAR,
"urls" VARCHAR,
"photos" VARCHAR,
"replies_count" BIGINT,
"retweets_count" BIGINT,
"likes_count" BIGINT,
"hashtags" VARCHAR,
"cashtags" VARCHAR,
"link" VARCHAR,
"retweet" BOOLEAN,
"quote_url" VARCHAR,
"video" BIGINT,
"near" VARCHAR,
"geo" VARCHAR,
"source" VARCHAR,
"user_rt_id" VARCHAR,
"user_rt" VARCHAR,
"retweet_id" VARCHAR,
"reply_to" VARCHAR,
"retweet_date" VARCHAR
);

CREATE TABLE vaccination2 (
"id" BIGINT,
"conversation_id" BIGINT,
"created_at" BIGINT,
"date" TIMESTAMP,
"time" VARCHAR,
"timezone" VARCHAR,
"user_id" BIGINT,
"username" VARCHAR,
"name" VARCHAR,
"place" VARCHAR,
"tweet" VARCHAR,
"mentions" VARCHAR,
"urls" VARCHAR,
"photos" VARCHAR,
"replies_count" BIGINT,
"retweets_count" BIGINT,
"likes_count" BIGINT,
"hashtags" VARCHAR,
"cashtags" VARCHAR,
"link" VARCHAR,
"retweet" BOOLEAN,
"quote_url" VARCHAR,
"video" BIGINT,
"near" VARCHAR,
"geo" VARCHAR,
"source" VARCHAR,
"user_rt_id" VARCHAR,
"user_rt" VARCHAR,
"retweet_id" VARCHAR,
"reply_to" VARCHAR,
"retweet_date" VARCHAR
);
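As a starting point for trend analysis, the sketch below computes daily tweet volume and engagement from a table shaped like vaccination2. It uses an in-memory SQLite database only because SQLite ships with Python; the hosted tables may sit on a different engine, only a subset of the columns is recreated, and the three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Subset of the vaccination2 columns, enough for a daily summary.
conn.execute(
    """
    CREATE TABLE vaccination2 (
        "id" BIGINT,
        "date" TIMESTAMP,
        "tweet" VARCHAR,
        "retweets_count" BIGINT,
        "likes_count" BIGINT
    )
    """
)

# Made-up rows, not taken from the dataset.
rows = [
    (1, "2019-03-10 08:00:00", "made-up tweet one", 2, 5),
    (2, "2019-03-10 21:30:00", "made-up tweet two", 0, 1),
    (3, "2019-03-11 09:15:00", "made-up tweet three", 4, 7),
]
conn.executemany("INSERT INTO vaccination2 VALUES (?, ?, ?, ?, ?)", rows)

# Group by the calendar-day prefix of the timestamp string.
daily = conn.execute(
    """
    SELECT substr("date", 1, 10) AS day,
           COUNT(*) AS tweets,
           SUM("likes_count") AS likes,
           SUM("retweets_count") AS retweets
    FROM vaccination2
    GROUP BY day
    ORDER BY day
    """
).fetchall()
print(daily)  # [('2019-03-10', 2, 6, 2), ('2019-03-11', 1, 7, 4)]
```

One caveat when adapting this to master: its id columns are declared DOUBLE, and a 64-bit float represents integers exactly only up to 2^53, which is smaller than tweet ids from 2019. Joins or lookups by tweet id may therefore be safer against vaccination2, where the ids are BIGINT.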