Baselight

Australian Election 2019 Tweets

May 18th 2019, 180k+ tweets

@kaggle.taniaj_australian_election_2019_tweets

About this Dataset

Context

During the 2019 Australian election I noticed that almost everything I was seeing on Twitter was unusually left-wing, so I decided to scrape some data and investigate. Unfortunately, my sentiment analysis has so far been too inaccurate to support any useful conclusions. I am sharing the data so that others can help with the sentiment analysis or try any other interesting analysis.

Content

Over 180,000 tweets, collected using the Twitter API keyword search between 10 May 2019 and 20 May 2019.
Columns are as follows:

  • created_at: Date and time of tweet creation
  • id: Unique ID of the tweet
  • full_text: Full tweet text
  • retweet_count: Number of retweets
  • favorite_count: Number of likes
  • user_id: User ID of tweet creator
  • user_name: Username of tweet creator
  • user_screen_name: Screen name of tweet creator
  • user_description: Description on tweet creator's profile
  • user_location: Location given on tweet creator's profile
  • user_created_at: Date the tweet creator joined Twitter

The latitude and longitude of each user_location are also available in location_geocode.csv. This information was retrieved using the Google Geocode API.
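As a rough illustration of how the two files might be combined, the sketch below builds a lookup from the geocode file and attaches coordinates to each tweet. It assumes that the `name` column of the geocode file holds the same strings as `user_location`, which this page does not state explicitly. The function names and the sample rows are hypothetical.

```python
import csv
import io


def load_geocodes(lines):
    """Map each location string to a (lat, long) tuple."""
    lookup = {}
    for row in csv.DictReader(lines):
        try:
            lookup[row["name"]] = (float(row["lat"]), float(row["long"]))
        except (KeyError, TypeError, ValueError):
            continue  # skip rows with missing or non-numeric coordinates
    return lookup


def attach_coordinates(tweets, lookup):
    """Yield (tweet, coords) pairs; coords is None when no match is found."""
    for tweet in tweets:
        # Assumption: user_location matches the geocode `name` column exactly.
        yield tweet, lookup.get(tweet.get("user_location"))


# Illustrative rows only, not taken from the dataset.
sample_geocodes = io.StringIO("name,lat,long\nSydney,-33.87,151.21\n")
sample_tweets = [{"user_location": "Sydney"}, {"user_location": "Mars"}]

lookup = load_geocodes(sample_geocodes)
for tweet, coords in attach_coordinates(sample_tweets, lookup):
    print(tweet["user_location"], coords)
```

With the real file, `load_geocodes(open("location_geocode.csv", newline="", encoding="utf-8"))` would be the likely call, though the encoding is also an assumption.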

Acknowledgements

Thanks to Twitter for providing the free API.

Inspiration

There are a lot of interesting things that could be investigated with this data. Primarily, I was interested in running sentiment analysis on tweets from before and after the election results were known, to determine whether Twitter users are indeed a left-leaning bunch. Did the tweets become more negative once the results were known?
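One possible way to frame the before/after comparison is sketched below. It is only a starting point: the cutoff time is a placeholder rather than the actual moment the results were known, the `created_at` format string is an assumption (the column is stored as text and its format is not described here), and `score` stands for whatever sentiment function you supply.

```python
from datetime import datetime
from statistics import mean

# Both values are assumptions for illustration; adjust them to the real data.
ASSUMED_CUTOFF = datetime(2019, 5, 18, 12, 0, 0)
ASSUMED_FORMAT = "%Y-%m-%d %H:%M:%S"


def mean_sentiment_by_period(tweets, score, cutoff=ASSUMED_CUTOFF, fmt=ASSUMED_FORMAT):
    """Average score(full_text) separately for tweets before and after cutoff."""
    before, after = [], []
    for tweet in tweets:
        created = datetime.strptime(tweet["created_at"], fmt)
        bucket = before if created < cutoff else after
        bucket.append(score(tweet["full_text"]))
    return {
        "before": mean(before) if before else None,
        "after": mean(after) if after else None,
    }
```

Time zones matter here: if `created_at` is in UTC while the cutoff is expressed in Australian local time, the two need to be aligned before comparing.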

Other ideas for investigation include:

  • Take into account retweets and favourites to weight overall sentiment analysis.

  • Which parts of the world, apart from Australia, are interested in (i.e. tweeting about) the Australian election?

  • How do the users who tweet about this sort of thing tend to describe themselves?

  • Is there a correlation between when the user joined Twitter and their political views (this assumes the sentiment analysis is already working well)?

  • Predict gender from username/screen name, then segment tweet count and sentiment by gender.
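For the first idea in the list, weighting sentiment by engagement, a minimal sketch could look like the following. The weighting scheme (one plus retweets plus favourites) is just one plausible choice and is an assumption, as is the `score` callable, which stands for any sentiment function returning a number.

```python
def weighted_mean_sentiment(tweets, score):
    """Engagement-weighted average of score(full_text).

    Each tweet gets weight 1 + retweet_count + favorite_count, so a tweet
    with no engagement still counts once. Returns None for an empty input.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for tweet in tweets:
        retweets = float(tweet.get("retweet_count") or 0)
        favourites = float(tweet.get("favorite_count") or 0)
        weight = 1.0 + retweets + favourites
        weighted_sum += weight * score(tweet["full_text"])
        total_weight += weight
    return weighted_sum / total_weight if total_weight else None
```

A log-scaled weight, such as `1 + math.log1p(retweets + favourites)`, may be worth trying as well if a handful of viral tweets would otherwise dominate the average.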

Tables

Auspol2019

@kaggle.taniaj_australian_election_2019_tweets.auspol2019
  • 35.6 MB
  • 183379 rows
  • 11 columns

CREATE TABLE auspol2019 (
  "created_at" VARCHAR,
  "id" VARCHAR,
  "full_text" VARCHAR,
  "retweet_count" DOUBLE,
  "favorite_count" DOUBLE,
  "user_id" DOUBLE,
  "user_name" VARCHAR,
  "user_screen_name" VARCHAR,
  "user_description" VARCHAR,
  "user_location" VARCHAR,
  "user_created_at" TIMESTAMP
);
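One thing to keep in mind with this schema is that `user_id` is typed as DOUBLE while `id` is VARCHAR. A 64-bit float represents every integer exactly only up to 2^53, and Twitter IDs can be larger than that, so some `user_id` values may have been rounded. Whether this affects the dataset in practice has not been verified here. The sketch below only shows how to check whether a given integer survives the conversion; the function name and the large example ID are made up.

```python
MAX_EXACT = 2 ** 53  # every integer up to this value has an exact 64-bit float


def survives_double(n: int) -> bool:
    """Return True if integer n round-trips through a 64-bit float unchanged."""
    return int(float(n)) == n


print(survives_double(12345))                # small IDs are safe
print(survives_double(MAX_EXACT + 1))        # first integer that is not exact
print(survives_double(1128000000000000001))  # hypothetical 64-bit style ID
```

If exact user IDs matter for your analysis, for example when joining against another Twitter dataset, it may be safer to compare on `user_screen_name` or to reload the original CSV with the column read as text.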

Location Geocode

@kaggle.taniaj_australian_election_2019_tweets.location_geocode
  • 266.18 KB
  • 11153 rows
  • 3 columns

CREATE TABLE location_geocode (
  "name" VARCHAR,
  "lat" DOUBLE,
  "long" DOUBLE
);
