Baselight

Reddit: /r/NotTheOnion

Discriminating Truth and Satire

@kaggle.thedevastator_discovering_fact_through_humor_investigating_r_n

About this Dataset

By Reddit [source]


This dataset collects posts from r/NotTheOnion, a subreddit for news stories that read like satire. When truth and satire look this similar, it is easy to be fooled by false, outlandish stories or satirical headlines. The dataset can help you study what makes the headlines on r/NotTheOnion so funny and how to tell fact from fiction.

How to use the dataset

This dataset contains Reddit posts from the r/NotTheOnion subreddit, which is dedicated to humorous news content. Headlines like these can make it difficult for readers to distinguish false stories from real ones, and the dataset is meant to help with that discernment.

The following columns are included in this dataset: title, score, id, url, comms_num, created, body and timestamp.

Here’s how you can use each column in your analysis:

  • title: The headline of the post, which tells you what the post is about before you read further.
  • score: How well liked a post is among viewers, measured in upvotes. Keep in mind that a high score reflects attention and does not tell you whether readers reacted well or badly to the story itself.
  • id: An identifier for the post (listed in the table schema), which can presumably serve as a key when de-duplicating rows.
  • url: The link to the article or website behind the post, which lets you check the story's legitimacy (or lack thereof) against its source.
  • comms_num: The number of comments a post has received; higher values typically suggest stories with more intrigue or controversy.
  • created: The date and time the post was created. This gives context, since certain events happen at certain times of year, and seasons or holidays may affect how readers respond to posted content.
  • body: The text of the post, which offers insight beyond the headline alone; readers may need these details to interpret a joke or recognize sarcasm. Word choice may hint at bias, and length may be a rough signal of how much background a piece includes, though neither proves that a story is factual.
  • timestamp: The exact moment the post was uploaded, which lets you look for temporal correlations among posts without relying on memory.

Using these columns together will give readers more context when evaluating if a story is humorous fact or funny fiction!
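As a minimal sketch of how the columns might be read together, the Python snippet below parses CSV text with the column names listed above and casts the numeric fields. The two sample rows are invented for illustration and are not taken from the dataset, and `load_posts` is a hypothetical helper name, not part of any published tooling for this dataset.

```python
import csv
import io

# Invented sample rows for illustration only; they are NOT real dataset rows.
SAMPLE_CSV = """\
title,score,id,url,comms_num,created,body,timestamp
Example headline A,120,abc1,https://example.com/a,15,1600000000.0,,2020-09-13 12:26:40
Example headline B,45,abc2,https://example.com/b,60,1600003600.0,Some text,2020-09-13 13:26:40
"""


def load_posts(handle):
    """Read posts from a CSV file object and cast the numeric columns."""
    posts = []
    for row in csv.DictReader(handle):
        row["score"] = int(row["score"])          # upvotes
        row["comms_num"] = int(row["comms_num"])  # comment count
        row["created"] = float(row["created"])    # stored as a number
        posts.append(row)
    return posts


posts = load_posts(io.StringIO(SAMPLE_CSV))
for post in posts:
    print(post["title"], post["score"], post["comms_num"])
```

To run the same helper against the real file, pass it an open handle such as `open("nottheonion.csv", newline="", encoding="utf-8")`, assuming the file has been downloaded locally.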

Research Ideas

  • Analyzing the most popular stories and examining their implications for news reporting.
  • Identifying the narratives which capture discussions about current events and topics of interest to readers in a humorous way.
  • Comparing different subreddits to see how r/NotTheOnion contributes to the international conversation around news media and what people find funny or entertaining when it comes to news content.
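The first idea, finding the most popular stories, could start from something like the sketch below. It ranks posts by `score` or by `comms_num` using the standard library. The records are invented placeholders rather than real dataset rows, and `top_posts` is a hypothetical helper name.

```python
import heapq

# Invented records for illustration only; they are NOT real dataset rows.
posts = [
    {"title": "Example headline A", "score": 120, "comms_num": 15},
    {"title": "Example headline B", "score": 45, "comms_num": 60},
    {"title": "Example headline C", "score": 300, "comms_num": 90},
]


def top_posts(posts, n=2, key="score"):
    """Return the n posts with the largest value in the given column."""
    return heapq.nlargest(n, posts, key=lambda post: post[key])


top_by_score = top_posts(posts)
top_by_comments = top_posts(posts, key="comms_num")

for post in top_by_score:
    print(post["score"], post["title"])
```

Ranking by `comms_num` instead of `score` gives a rough view of which stories drew the most discussion rather than the most upvotes.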

Acknowledgements

If you use this dataset in your research, please credit the original authors and Reddit.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: nottheonion.csv

Column name   Description
title         The title of the post. (String)
score         The number of upvotes the post has received. (Integer)
id            The identifier of the post. (String)
url           The URL of the post. (String)
comms_num     The number of comments the post has received. (Integer)
created       The date and time the post was created, stored as a number. (Double)
body          The body of the post. (String)
timestamp     The timestamp of the post. (Timestamp)

Tables

Nottheonion

@kaggle.thedevastator_discovering_fact_through_humor_investigating_r_n.nottheonion
  • 253.5 KB
  • 1627 rows
  • 8 columns

CREATE TABLE nottheonion (
  "title" VARCHAR,
  "score" BIGINT,
  "id" VARCHAR,
  "url" VARCHAR,
  "comms_num" BIGINT,
  "created" DOUBLE,
  "body" VARCHAR,
  "timestamp" TIMESTAMP
);
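The schema stores `created` as a DOUBLE, which suggests a Unix epoch value in seconds; that interpretation is an assumption, not something the page states. Under that assumption, a minimal Python sketch for turning the number into a timezone-aware datetime might look like this (`created_to_datetime` is a hypothetical helper name, and the input value is an arbitrary example, not a dataset row):

```python
from datetime import datetime, timezone


def created_to_datetime(created):
    """Convert an epoch-seconds value (ASSUMED unit) to a UTC datetime."""
    return datetime.fromtimestamp(created, tz=timezone.utc)


# Arbitrary example value, not taken from the dataset.
example = created_to_datetime(1600000000.0)
print(example.isoformat())
```

If the converted values do not line up with the `timestamp` column, the difference may come from a time zone offset or from `created` using a different unit, so it is worth checking a few rows before relying on either column.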
