This dataset offers a window into one of the most talked-about online communities today: Reddit. Specifically, it focuses on r/funny, one of the most popular and highly engaged subreddits on the site. Beyond post titles, scores, and other details about post creation, it includes engagement metrics such as comment counts and timestamps. Digging into this data can help paint a fuller picture of what people find funny in the digital age: Which topics draw the most responses? How does sentiment change over time? And how can community managers use these insights to grow their platforms and better engage their users? With this dataset at your fingertips, you can explore each of these questions and more.
Introduction
Welcome to the r/funny Kaggle dataset. Here you can explore and analyze posts from this popular subreddit to gain insights into community engagement and learn how people interact with content on different topics. This guide provides further information on how to use the dataset in your data analysis projects.
Important Columns
This dataset contains the following columns: title, score, url, comms_num (number of comments), created (post creation time), body (post content), and timestamp. Together, these columns describe how users interact with each post on r/funny.
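A quick way to get started is to load the file with pandas and convert the `created` column (which Reddit datasets typically store as Unix epoch seconds; verify this against the actual file) into a proper datetime. The filename below and the sample rows are assumptions for illustration, not part of the dataset itself:

```python
import pandas as pd

# In practice, load the CSV shipped with the dataset, e.g.:
# df = pd.read_csv("r_funny_posts.csv")   # filename is an assumption

# Synthetic stand-in with the same schema, for illustration:
df = pd.DataFrame({
    "title": ["Cat knocks over vase", "My dog's reaction to snow"],
    "score": [1543, 287],
    "url": ["https://i.redd.it/abc.jpg", "https://i.redd.it/def.jpg"],
    "comms_num": [112, 34],
    "created": [1615000000.0, 1615100000.0],  # assumed Unix epoch seconds
    "body": ["", ""],
    "timestamp": ["2021-03-06 02:26:40", "2021-03-07 06:13:20"],
})

# Convert the epoch "created" column to a datetime for time-based analysis
df["created_dt"] = pd.to_datetime(df["created"], unit="s")
print(df[["title", "score", "comms_num", "created_dt"]].head())
```

With `created_dt` in place, grouping by hour or weekday (`df["created_dt"].dt.hour`, `.dt.dayofweek`) becomes straightforward.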
Exploratory Data Analysis
To get a better understanding of user engagement on the subreddit, some initial exploration is necessary. Graphical tools such as histograms and boxplots make it easy to see how basic quantities like scores and comment counts are distributed, either over time or across other groupings (for example, type of joke).
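As a sketch of that exploration, the snippet below plots a histogram of scores and a boxplot of comment counts. It uses synthetic, randomly generated values in place of the real columns, and assumes (as is typical for Reddit data) that scores are heavily right-skewed, so it plots them on a log scale:

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt

# Synthetic scores and comment counts standing in for the real dataset columns
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "score": rng.lognormal(mean=4, sigma=1.5, size=1000).astype(int),
    "comms_num": rng.poisson(lam=20, size=1000),
})

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
# Log-transform scores: a plain histogram of skewed data is unreadable
axes[0].hist(np.log10(df["score"] + 1), bins=30)
axes[0].set_xlabel("log10(score + 1)")
axes[0].set_title("Post score distribution")
axes[1].boxplot(df["comms_num"])
axes[1].set_title("Comments per post")
fig.savefig("engagement_distributions.png")

# Summary statistics complement the plots
print(df[["score", "comms_num"]].describe())
```

Swapping the synthetic DataFrame for the loaded dataset gives the same plots over real posts.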
Dimensionality reduction
For more advanced analytics, it is worth applying a dimensionality reduction technique such as PCA before tackling the real analysis tasks, for instance to numeric features derived from post text. Projecting the data onto a few principal components groups similar posts together and strips out redundant or conflicting variables that could otherwise cloud data-driven conclusions later on.
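One common way to do this for text is to vectorize post titles with TF-IDF and then project the result down to a couple of components with scikit-learn's PCA. The mini-corpus below is hypothetical; in practice you would pass the dataset's `title` column:

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical mini-corpus of post titles; use df["title"] on the real data
titles = [
    "Cat knocks over vase in slow motion",
    "Dog sees snow for the first time",
    "Cat stuck in a cardboard box again",
    "Dog fails to catch the ball",
]

# TF-IDF turns each title into a high-dimensional numeric vector
X = TfidfVectorizer().fit_transform(titles).toarray()

# PCA projects those vectors down to 2 components, suitable for
# plotting or clustering similar posts together
X2 = PCA(n_components=2).fit_transform(X)
print(X2.shape)  # (4, 2)
```

Note that scikit-learn's `PCA` requires a dense array (hence `.toarray()`); for large corpora, `TruncatedSVD` works directly on the sparse TF-IDF matrix and is usually the better choice.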
Further Guidance
If you need further help with this dataset, readings on topics such as text mining, natural language processing, and machine learning are highly recommended. These fields explain, step by step, the techniques that unlock the most value from text-heavy sources like r/funny, and should give readers and researchers ideas about which approaches suit data analysis of text-based online platforms such as Reddit.