Suicidal Behaviors And Attempts
@kaggle.thedevastator_suicidal_behaviors_and_attempts
By [source]
Before you begin your analysis, it is important that you are familiar with the dataset:
The columns include: User (a unique identifier for each user), Post (provides text of each post), Label (indicates whether a post is associated with suicidal behavior or not).
Each row in this dataset provides detailed information regarding one user’s post on Reddit related to suicide.
Now that you have an understanding of what’s included in this set, let's dive into working with it! First off, we recommend exploring within Jupyter Notebook given its ease of use and interactive nature - just open up a new notebook in Kaggle Notebooks. Here are some helpful tips:
Explore Data Types : Get familiar with the type of data in each column by using commands such as .dtypes or .info(). Knowing each column's type makes filtering columns easier later on. You can also check for missing values with .isnull().sum(), which indicates whether preprocessing, such as filling missing values, is needed before analysis.
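The checks described above can be sketched as follows. This is a minimal illustration, not the dataset author's code: the three rows are hypothetical placeholders (including one deliberately empty Post) so the snippet runs without the file, and the real values in 500_Reddit_users_posts_labels.csv may look different.

```python
import io

import pandas as pd

# Hypothetical stand-in rows so the example runs on its own.
# With the real file you would instead use:
#   df = pd.read_csv("500_Reddit_users_posts_labels.csv")
sample_csv = io.StringIO(
    "User,Post,Label\n"
    "user-0,placeholder text one,True\n"
    "user-1,,False\n"
    "user-2,placeholder text three,True\n"
)
df = pd.read_csv(sample_csv)

# Data type of each column
print(df.dtypes)

# Column names, non-null counts, and memory usage in one summary
df.info()

# Number of missing values per column
missing = df.isnull().sum()
print(missing)
```

If the missing-value counts are nonzero for Post, dropping or filling those rows before any text analysis is one reasonable option.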
Analyze Labels & Posts : Get a better understanding of the labels attached to posts using value_counts(), which summarizes how many posts fall under each label so that more informed decisions can be made during the later analysis and modeling stages. Real-world problems often require examining the labels involved before proceeding further, so take your time here. For example, posts can be grouped by label via groupby('Label').
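A short sketch of the label summary and grouping described above. The rows are hypothetical placeholders, the Label values follow the Boolean description in the column table, and post_length is an illustrative helper column that is not part of the dataset.

```python
import pandas as pd

# Hypothetical placeholder rows, not real data from the file
df = pd.DataFrame(
    {
        "User": ["user-0", "user-1", "user-2", "user-3"],
        "Post": [
            "short text",
            "a somewhat longer text",
            "text",
            "another text here",
        ],
        "Label": [True, False, True, True],
    }
)

# Absolute number of posts per label
label_counts = df["Label"].value_counts()

# Share of posts per label (fractions that sum to 1)
label_share = df["Label"].value_counts(normalize=True)

# Illustrative helper column: character length of each post
df["post_length"] = df["Post"].str.len()

# Group posts by label and compare the average post length
mean_length = df.groupby("Label")["post_length"].mean()

print(label_counts)
print(label_share)
print(mean_length)
```

Checking the label shares first shows whether the classes are imbalanced, which affects how a later model should be evaluated.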
Visualize your Results : Visualization makes findings easier to interpret. Try matplotlib (plt) for x-y plots or seaborn (sns) for heatmaps. Alternatively, use Tableau externally once data preparation has been completed in Jupyter Notebook. Python libraries such as scikit-learn and NumPy can be used alongside for modeling (for example, machine learning algorithms) and for numerical computation (for example, linear algebra), respectively.
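As one possible starting point, the sketch below draws a bar chart of posts per label with matplotlib. The rows are hypothetical placeholders, and the Agg backend and in-memory buffer are assumptions chosen so the snippet runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical placeholder labels, not real data from the file
df = pd.DataFrame({"Label": [True, False, True, True]})

# Count posts per label and order the bars by label value
counts = df["Label"].value_counts().sort_index()

fig, ax = plt.subplots(figsize=(4, 3))
ax.bar([str(label) for label in counts.index], counts.to_numpy())
ax.set_xlabel("Label")
ax.set_ylabel("Number of posts")
ax.set_title("Posts per label")
fig.tight_layout()

# Render to an in-memory PNG; use fig.savefig("label_counts.png") to write a file
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```

In a Kaggle or Jupyter notebook the matplotlib.use("Agg") line can be dropped, and the figure can be shown inline with plt.show().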
- Analyzing which risk factors associated with suicidal behavior are most prevalent in certain demographic groups, such as gender and age.
- Examining the potential outcomes of different methods of self-harm and understanding their lethality levels to create more effective prevention and response strategies.
- Creating predictive models for mental health workers to use when assessing individuals at risk of suicide, so they can identify those who may need immediate intervention or follow-up care.
If you use this dataset in your research, please credit the original authors.
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: 500_Reddit_users_posts_labels.csv
| Column name | Description |
|---|---|
| User | Unique identifier for each user. (String) |
| Post | Text of the post. (String) |
| Label | Label indicating whether the post is related to suicidal behavior or not. (Boolean) |