
Reddit Comments

Kaggle

@kaggle.ignaciorusso_reddit_comments


Comments, authors and conversations

Dataset Description

reddit_comments.json. A JSON array in which each element represents one comment. Each comment has several attributes available for analysis. I treat only body, which holds the comment text, as mandatory. Each comment also carries a label in the is_hate variable, coded as hate speech (1) or not hate (0).
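
As a minimal sketch of how the comment file could be read, the snippet below parses a JSON array, keeps only comments with a non-empty body, and counts the is_hate labels. The field names body and is_hate come from the description above; the helper names load_comments and label_counts and the inline sample records are hypothetical, and the sketch assumes is_hate is stored as 0/1 (as a number or a numeric string).

```python
import json
from collections import Counter

def load_comments(json_text):
    """Parse a JSON array of comments and keep those with a non-empty body."""
    comments = json.loads(json_text)
    return [c for c in comments if c.get("body")]

def label_counts(comments):
    """Count comments per is_hate value (1 = hate speech, 0 = not hate)."""
    return Counter(
        int(c["is_hate"]) for c in comments if c.get("is_hate") is not None
    )

# Hypothetical sample standing in for the contents of reddit_comments.json.
sample = (
    '[{"body": "first comment", "is_hate": 0},'
    ' {"body": "second comment", "is_hate": 1},'
    ' {"body": "", "is_hate": 0}]'
)

comments = load_comments(sample)
counts = label_counts(comments)
```

For the real file, the text would come from something like open("reddit_comments.json", encoding="utf-8").read(), assuming the file fits in memory.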

conversations.csv. Each row represents one conversation thread. Commas separate the different comments within the same thread.
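
The row-per-thread layout could be read roughly as follows. This is a sketch under assumptions: the description does not say whether each cell holds a comment identifier or the comment text, or whether the file has a header row, so the code simply returns every non-empty cell of every row. The function name read_threads and the sample values c1 to c6 are hypothetical.

```python
import csv
import io

def read_threads(csv_file):
    """Return one list per row; each list holds that thread's comment cells."""
    threads = []
    for row in csv.reader(csv_file):
        cells = [cell.strip() for cell in row if cell.strip()]
        if cells:  # skip blank lines
            threads.append(cells)
    return threads

# Hypothetical sample standing in for conversations.csv.
sample = io.StringIO("c1,c2,c3\nc4\nc5,c6\n")
threads = read_threads(sample)
```

Using csv.reader rather than splitting on commas by hand keeps quoted cells intact, which matters if the cells turn out to contain comment text with commas in it.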

reddit_authors.json. A JSON array in which each element represents one author. It complements the information in reddit_comments.json with all the author-related attributes. Not every author may appear in the file, because some may have been suspended by Reddit.
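
Because some authors may be absent, any join between comments and authors has to tolerate missing matches. The sketch below shows one way to do that. The key names author (on comments), name and karma (on author records) are assumptions for illustration only; the description does not list the actual attribute names, so they would need to be adjusted to the real files.

```python
import json

def attach_authors(comments, authors, comment_key="author", author_key="name"):
    """Add an 'author_info' entry to each comment; None when the author is absent."""
    # Index author records by name for constant-time lookup.
    by_name = {a[author_key]: a for a in authors if author_key in a}
    return [
        {**c, "author_info": by_name.get(c.get(comment_key))} for c in comments
    ]

# Hypothetical samples; "author", "name" and "karma" are assumed key names.
comments = json.loads(
    '[{"body": "hi", "author": "alice"}, {"body": "yo", "author": "bob"}]'
)
authors = json.loads('[{"name": "alice", "karma": 10}]')

joined = attach_authors(comments, authors)
# Comments whose author has no record, e.g. a suspended account.
missing = [c["author"] for c in joined if c["author_info"] is None]
```

Keeping unmatched comments with author_info set to None, instead of dropping them, preserves the labeled text for analysis even when the author metadata is unavailable.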

