Raw Twitter Timelines W/ No Retweets
Complete Twitter timelines of various popular celebrities, with retweets excluded.
@kaggle.speckledpingu_rawtwitterfeeds
This is a dataset of tweets from various active scientists and personalities ranging from Donald Trump and Hillary Clinton to Neil deGrasse Tyson. More are forthcoming.
The tweets were obtained by scraping the Twitter timeline in the browser with JavaScript rather than through the Tweepy Python library or the Twitter timeline API.
The inspiration for this dataset is my own Twitter analysis, which compares tweets to find who tweets like whom, e.g. do Trump and Hillary each tweet more like Kim Kardashian than like one another?
Because the tweets were scraped from the browser timeline rather than pulled from the API, this dataset goes further back in time than anything directly available from Twitter.
The data is in JSON format; a CSV version is forthcoming as well.
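Until a CSV version is available, the JSON records can be converted locally. The following is a minimal sketch, assuming each record carries the fields that the scraping snippet in this description produces (id, date, text, link, retweet). The helper names csvEscape and tweetsToCsv are hypothetical, and the sample record is made up for illustration.

```javascript
// Field names match the objects built by the scraping snippet in this description.
const FIELDS = ['id', 'date', 'text', 'link', 'retweet'];

// Quote a value when it contains a comma, double quote, or line break,
// and double any embedded quotes (the usual CSV convention).
function csvEscape(value) {
  const s = value == null ? '' : String(value);
  return /[",\r\n]/.test(s) ? '"' + s.replace(/"/g, '""') + '"' : s;
}

// Turn an array of tweet objects into CSV text with a header row.
function tweetsToCsv(tweets) {
  const header = FIELDS.join(',');
  const rows = tweets.map(t => FIELDS.map(f => csvEscape(t[f])).join(','));
  return [header, ...rows].join('\n');
}

// Made-up record, for illustration only.
const example = [
  { id: '1', date: 'Oct 14', text: 'Hello, "world"', link: '/user/status/1', retweet: false }
];
console.log(tweetsToCsv(example));
```

Tweet text often contains commas, quotes, and line breaks, which is why every field goes through csvEscape rather than being joined directly.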
Tweets from Kim Kardashian, Adam Savage, Bill Nye, Neil deGrasse Tyson, Donald Trump, and Hillary Clinton have been collected up to 2016-10-14.
Tweets from Richard Dawkins, Commander Scott Kelly, Barack Obama, NASA, and The Onion have been collected up to 2016-10-15.
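One possible way to approach the "who tweets like whom" comparison mentioned above is to compare word frequencies between authors. This is only a sketch of one technique (bag-of-words cosine similarity), not the method used in the original analysis. The function names and the placeholder tweets are assumptions made for illustration; real input would be the text fields from this dataset.

```javascript
// Lowercase the text and split it into word tokens.
// A leading "@" or "#" is kept so that handles and hashtags count as words.
function tokenize(text) {
  return text.toLowerCase().match(/[@#]?[a-z0-9']+/g) || [];
}

// Build a word -> count map over all of one author's tweets.
function wordCounts(tweets) {
  const counts = new Map();
  for (const t of tweets) {
    for (const w of tokenize(t.text)) {
      counts.set(w, (counts.get(w) || 0) + 1);
    }
  }
  return counts;
}

// Cosine similarity between two count maps:
// 1 means identical word proportions, 0 means no shared words.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (const [w, c] of a) {
    normA += c * c;
    dot += c * (b.get(w) || 0);
  }
  for (const c of b.values()) normB += c * c;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Placeholder tweets, not taken from the dataset.
const authorA = [{ text: 'big rally tonight' }, { text: 'big crowd' }];
const authorB = [{ text: 'big rally tomorrow' }];
const authorC = [{ text: 'stars and galaxies' }];

console.log(cosineSimilarity(wordCounts(authorA), wordCounts(authorB)));
console.log(cosineSimilarity(wordCounts(authorA), wordCounts(authorC)));
```

Raw counts are dominated by common words, so a real comparison would probably want stop-word removal or TF-IDF weighting on top of this.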
For your own pleasure, with special thanks to the Trump Twitter Archive for providing some of the code, here is the JavaScript used to scrape tweets off of a timeline and output the results to the clipboard in JSON format:
Construct the search query in the form from:TWITTERHANDLE since:DATE until:DATE, where each DATE is written as YYYY-MM-DD.
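As a small illustration of the query format above, the string can be assembled with a helper like the one below. The function name buildSearchQuery and the example handle and dates are assumptions chosen for the sketch; the dates use the YYYY-MM-DD form that Twitter's since: and until: operators expect.

```javascript
// Hypothetical helper: builds the from:/since:/until: search string.
function buildSearchQuery(handle, since, until) {
  // Drop a leading "@" so both "@NASA" and "NASA" are accepted.
  const cleanHandle = handle.replace(/^@/, '');
  return `from:${cleanHandle} since:${since} until:${until}`;
}

// The result can be pasted into the Twitter search box as is.
const query = buildSearchQuery('@NASA', '2014-01-01', '2016-10-15');
console.log(query);

// encodeURIComponent makes the same string safe to place in a URL parameter.
console.log(encodeURIComponent(query));
```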
In the browser console set up automatic scrolling with:
setInterval(function(){ scrollTo(0, document.body.scrollHeight) }, 2500)
Scrape the resulting timeline with:
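// Collect every tweet currently loaded in the timeline.
var allTweets = [];
var tweetElements = document.querySelectorAll('li.stream-item');
for (var i = 0; i < tweetElements.length; i++) {
  try {
    var el = tweetElements[i];
    var text = el.querySelector('.tweet-text').textContent;
    allTweets.push({
      id: el.getAttribute('data-item-id'),
      date: el.querySelector('.time a').textContent,
      text: text,
      link: el.querySelector('div.tweet').getAttribute('data-permalink-path'),
      // Flag manual quote-style retweets: text that starts with "@ and contains a colon.
      retweet: text.indexOf('"@') === 0 && text.includes(':')
    });
  } catch (err) {
    // Skip stream items that lack the expected elements (e.g. no .tweet-text).
  }
}
// copy() is a browser console utility; it puts the array on the clipboard as JSON.
copy(allTweets);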
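The retweet field set by the snippet above relies on a simple text heuristic, and it can help to see it in isolation. The sketch below pulls the same check into a standalone function; the name looksLikeQuotedRetweet and the sample strings are made up for illustration.

```javascript
// Same check the scraper applies: the text starts with a double quote
// followed by "@", and a colon appears somewhere in the text.
function looksLikeQuotedRetweet(text) {
  return text.startsWith('"@') && text.includes(':');
}

// Made-up sample texts, not taken from the dataset.
const samples = [
  '"@someuser: an example quoted tweet" Thanks!', // flagged
  '@someuser thanks for the note',                // reply, not flagged
  'Launch at 10:30 today'                         // has a colon but no leading "@
];
console.log(samples.map(looksLikeQuotedRetweet)); // [ true, false, false ]
```

Note that the colon may appear anywhere in the text, so a tweet that begins with "@ and merely contains a URL would also be flagged.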
Have fun!
CREATE TABLE adamsavagetweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE alltweets (
"unnamed_0_1" BIGINT, -- Unnamed: 0.1
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE barackobama (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE donaldtrump2014_01_01to2016_10_14tweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE donaldtrumptweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE fivethirtyeighttweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE hillaryclinton2014_01_01to2016_10_14tweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE hillaryclintontweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE kimkardashiantweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE n_10460kdnuggetstweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE neildegrassetysontweets (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE richarddawkins (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);
CREATE TABLE scottkelly (
"unnamed_0" BIGINT, -- Unnamed: 0
"date" VARCHAR,
"id" BIGINT,
"link" VARCHAR,
"retweet" BOOLEAN,
"text" VARCHAR,
"author" VARCHAR
);