Suicide And Depression Detection

A dataset for detecting suicidal ideation and depression in text.

@kaggle.nikhileswarkomati_suicide_watch

About this Dataset

Context

When I set out to build a text classifier to detect suicidal ideation, I couldn't find any public dataset. I hope this one is useful to anyone looking for suicide detection datasets and saves them some time 💜.

Content

The dataset is a collection of posts from the "SuicideWatch" and "depression" subreddits on Reddit. The posts were collected using the Pushshift API. All posts made to "SuicideWatch" from Dec 16, 2008 (the subreddit's creation) through Jan 2, 2021 were collected, while "depression" posts were collected from Jan 1, 2009 through Jan 2, 2021. All posts collected from SuicideWatch are labeled as suicide, while posts collected from the depression subreddit are labeled as depression. Non-suicide posts were collected from r/teenagers.
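The labeling rule described above can be sketched as a simple lookup from source subreddit to class label. This is only an illustration: the names `SUBREDDIT_TO_LABEL` and `label_for_subreddit` are hypothetical, and the exact label strings used in the published files may differ.

```python
from typing import Optional

# Assumed mapping, taken from the description: the label is decided
# purely by which subreddit a post came from.
SUBREDDIT_TO_LABEL = {
    "suicidewatch": "suicide",
    "depression": "depression",
    "teenagers": "non-suicide",
}


def label_for_subreddit(subreddit: str) -> Optional[str]:
    """Return the class label for a subreddit name, or None if it is not a source."""
    name = subreddit.strip()
    # Accept both "r/teenagers" and "teenagers".
    if name.lower().startswith("r/"):
        name = name[2:]
    return SUBREDDIT_TO_LABEL.get(name.lower())
```

Because the label comes from the subreddit rather than from the content of each post, it is best treated as a weak label.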

Version

The current version has only suicide and non-suicide labels.
Version V13 has suicide, depression, and teenagers (normal conversations) as labels.

Collection

A notebook is provided to show how posts can be collected from Reddit using the Pushshift API.
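As a rough sketch of what such a collection query might look like, the snippet below builds a request URL for one subreddit and date window. It does not reproduce the author's notebook. The endpoint URL and the `subreddit`, `after`, `before`, and `size` parameter names are assumptions based on the historical public Pushshift API, which may no longer be openly accessible. The helper names are hypothetical, and no network request is made.

```python
from datetime import date, datetime, timezone
from urllib.parse import urlencode

# Assumption: historical Pushshift submission search endpoint.
PUSHSHIFT_SUBMISSION_URL = "https://api.pushshift.io/reddit/search/submission/"


def to_epoch(day: date) -> int:
    """Convert a calendar date to a Unix timestamp at 00:00 UTC."""
    return int(datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp())


def build_query_url(subreddit: str, start: date, end: date, size: int = 100) -> str:
    """Build (but do not send) a query URL for posts in [start, end)."""
    params = {
        "subreddit": subreddit,
        "after": to_epoch(start),   # assumed parameter name
        "before": to_epoch(end),    # assumed parameter name
        "size": size,               # assumed page-size parameter
    }
    return f"{PUSHSHIFT_SUBMISSION_URL}?{urlencode(params)}"


# Date window for SuicideWatch as stated in the dataset description.
SUICIDEWATCH_URL = build_query_url("SuicideWatch", date(2008, 12, 16), date(2021, 1, 2))
```

A real collector would page through the results by moving the `after` timestamp forward until no more posts are returned.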
