Baselight

One Million Reddit Jokes

One million jokes straight from Reddit's top comedy community.

@kaggle.pavellexyr_one_million_reddit_jokes

About this Dataset

Context

Even in data science, there's always an opportunity to have a bit of fun. With that in mind, we present a million jokes.

Content

This dataset comprises a million joke posts dated April 1, 2020 or earlier, taken from the subreddit /r/jokes using SocialGrep.

All the posts are annotated with their score.

Acknowledgements

We would like to thank Stewart Munro for generously providing the cover image for this dataset.

Inspiration

This dataset is inspired by the elusive notion of comedy. What makes something funny? The question is hard enough that many professionals can't answer it properly. We hope that with this dataset, you can find something important. You know what they say - there's a grain of truth in every joke.

Tables

One Million Reddit Jokes

@kaggle.pavellexyr_one_million_reddit_jokes.one_million_reddit_jokes
  • 144.26 MB
  • 1000000 rows
  • 12 columns

CREATE TABLE one_million_reddit_jokes (
  "type" VARCHAR,
  "id" VARCHAR,
  "subreddit_id" VARCHAR,
  "subreddit_name" VARCHAR,
  "subreddit_nsfw" BOOLEAN,
  "created_utc" BIGINT,
  "permalink" VARCHAR,
  "domain" VARCHAR,
  "url" VARCHAR,
  "selftext" VARCHAR,
  "title" VARCHAR,
  "score" BIGINT
);
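To show how the columns above fit together, here is a minimal sketch that loads the schema into an in-memory SQLite database and pulls the highest-scoring posts. It rests on several assumptions not stated on this page: that "created_utc" holds Unix epoch seconds, that "title" carries the setup and "selftext" the punchline, and that SQLite is an acceptable stand-in for whatever engine actually hosts the table. The sample rows and the helper name top_jokes are invented for illustration and do not come from the dataset.

```python
import sqlite3
from datetime import datetime, timezone

# Schema copied from the table definition above (SQLite accepts these type names).
SCHEMA = """
CREATE TABLE one_million_reddit_jokes (
  "type" VARCHAR,
  "id" VARCHAR,
  "subreddit_id" VARCHAR,
  "subreddit_name" VARCHAR,
  "subreddit_nsfw" BOOLEAN,
  "created_utc" BIGINT,
  "permalink" VARCHAR,
  "domain" VARCHAR,
  "url" VARCHAR,
  "selftext" VARCHAR,
  "title" VARCHAR,
  "score" BIGINT
)
"""

# Invented placeholder rows, for illustration only; NOT real dataset content.
SAMPLE_ROWS = [
    ("post", "a1", "t5_x", "jokes", 0, 1585699200, "/r/Jokes/comments/a1/",
     "self.Jokes", "https://example.com/a1", "placeholder punchline 1",
     "placeholder setup 1", 12),
    ("post", "a2", "t5_x", "jokes", 0, 1577836800, "/r/Jokes/comments/a2/",
     "self.Jokes", "https://example.com/a2", "placeholder punchline 2",
     "placeholder setup 2", 250),
    ("post", "a3", "t5_x", "jokes", 0, 1546300800, "/r/Jokes/comments/a3/",
     "self.Jokes", "https://example.com/a3", "placeholder punchline 3",
     "placeholder setup 3", 3),
]


def top_jokes(conn, limit=2):
    """Return (title, selftext, score, posted_at) for the highest-scoring rows.

    Assumes created_utc is Unix epoch seconds and converts it to an aware
    UTC datetime.
    """
    rows = conn.execute(
        'SELECT "title", "selftext", "score", "created_utc" '
        'FROM one_million_reddit_jokes '
        'ORDER BY "score" DESC LIMIT ?',
        (limit,),
    ).fetchall()
    return [
        (title, selftext, score, datetime.fromtimestamp(ts, tz=timezone.utc))
        for title, selftext, score, ts in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO one_million_reddit_jokes VALUES (?,?,?,?,?,?,?,?,?,?,?,?)",
    SAMPLE_ROWS,
)
result = top_jokes(conn)
for title, selftext, score, posted_at in result:
    print(score, posted_at.date(), title, "/", selftext)
```

Against the real table, the same SELECT would be the part to reuse; the setup and insert steps exist only so the sketch runs on its own.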
