Reddit Comments
Comments, authors and conversations
@kaggle.ignaciorusso_reddit_comments
reddit_comments.json. A JSON array in which each element represents one comment. Each comment has several attributes available for analysis. I treat only body as mandatory; it holds the comment text. A label is also available in the variable is_hate, coded as hate speech (1) or not hate (0).
conversations.csv. Each row in the file represents one conversational thread. A comma separates the different comments within the same thread.
reddit_authors.json. A JSON array in which each element represents one author. It complements the information in reddit_comments.json with all the attributes related to the authors. Not all authors may be present in the file, because some of them could have been suspended by Reddit.
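As a rough illustration of how the three files described above might be read together, here is a minimal Python sketch. It relies only on what the descriptions state: body and is_hate in the comments, one thread per CSV row, and authors that may be missing. Everything else is an assumption made for the example: the helper names, the join fields author and name, the karma attribute, and the sample values are all hypothetical and should be checked against the real files.

```python
import csv
import io
import json
from collections import Counter


def parse_comments(text):
    """Parse the JSON array and keep only comments with a non-empty body."""
    records = json.loads(text)
    return [r for r in records if isinstance(r, dict) and r.get("body")]


def label_counts(comments):
    """Count comments per is_hate value (1 = hate speech, 0 = not hate)."""
    return Counter(int(c["is_hate"]) for c in comments if c.get("is_hate") is not None)


def read_threads(handle):
    """Read one thread per CSV row, dropping empty cells and blank rows."""
    threads = []
    for row in csv.reader(handle):
        cells = [cell.strip() for cell in row if cell.strip()]
        if cells:
            threads.append(cells)
    return threads


def attach_authors(comments, authors, comment_key="author", author_key="name"):
    """Add an author_profile entry to each comment; None if the author is absent.

    comment_key and author_key are assumed field names, not confirmed by the dataset.
    """
    by_name = {a[author_key]: a for a in authors if author_key in a}
    return [{**c, "author_profile": by_name.get(c.get(comment_key))} for c in comments]


# Tiny made-up samples standing in for the three files.
comments = parse_comments(
    '[{"body": "hello", "is_hate": 0, "author": "ann"},'
    ' {"body": "", "is_hate": 1, "author": "bob"},'
    ' {"body": "go away", "is_hate": 1, "author": "cat"}]'
)
counts = label_counts(comments)
threads = read_threads(io.StringIO("c1,c2,c3\nc4\n\nc5,c6,\n"))
authors = json.loads('[{"name": "ann", "karma": 10}]')
joined = attach_authors(comments, authors)
```

With the real data, the same helpers could be fed from disk, for example parse_comments(open("reddit_comments.json", encoding="utf-8").read()) and read_threads(open("conversations.csv", newline="", encoding="utf-8")). Leaving author_profile as None keeps comments whose authors are absent from reddit_authors.json instead of silently dropping them.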
CREATE TABLE melbourne_house_prices_less (
"suburb" VARCHAR,
"address" VARCHAR,
"rooms" BIGINT,
"type" VARCHAR,
"price" DOUBLE,
"method" VARCHAR,
"sellerg" VARCHAR,
"date" TIMESTAMP,
"postcode" BIGINT,
"regionname" VARCHAR,
"propertycount" BIGINT,
"distance" DOUBLE,
"councilarea" VARCHAR
);

CREATE TABLE melbourne_housing_full (
"suburb" VARCHAR,
"address" VARCHAR,
"rooms" BIGINT,
"type" VARCHAR,
"price" DOUBLE,
"method" VARCHAR,
"sellerg" VARCHAR,
"date" TIMESTAMP,
"distance" DOUBLE,
"postcode" DOUBLE,
"bedroom2" DOUBLE,
"bathroom" DOUBLE,
"car" DOUBLE,
"landsize" DOUBLE,
"buildingarea" DOUBLE,
"yearbuilt" DOUBLE,
"councilarea" VARCHAR,
"lattitude" DOUBLE,
"longtitude" DOUBLE,
"regionname" VARCHAR,
"propertycount" DOUBLE
);