Baselight

One Million Reddit Questions

One million questions from /r/AskReddit, collected going backwards from September 2021.

@kaggle.pavellexyr_one_million_reddit_questions


About this Dataset

Context

Questions are one of the most important parts of natural dialogue, and automated question answering is a long-standing problem in NLP. We present this dataset to help solve it.

Content

This dataset comprises one million questions from /r/AskReddit, collected using SocialGrep.
Each question is labelled with its date of creation and its score.

Acknowledgements

We would like to thank Etienne Girardet for generously providing us with a background image for this dataset.

Inspiration

  • What makes a popular Reddit question?
  • What makes a good Reddit question?
  • Can Reddit teach us more about how to ask questions properly?
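
As one possible starting point for the first question, the sketch below groups questions by the first word of their title and compares mean scores. The function name mean_score_by_first_word and the sample rows are hypothetical and are not taken from the dataset. Grouping by first word is only one crude feature among many, chosen here for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_score_by_first_word(rows):
    """Group (title, score) pairs by the lowercased first word of the title
    and return the mean score for each group."""
    groups = defaultdict(list)
    for title, score in rows:
        words = title.split()
        key = words[0].lower() if words else ""
        groups[key].append(score)
    return {word: mean(scores) for word, scores in groups.items()}


# Hypothetical (title, score) pairs, for illustration only.
sample = [
    ("What is a skill everyone should learn?", 10),
    ("What made you change careers?", 30),
    ("Why do cats knock things off tables?", 5),
]

summary = mean_score_by_first_word(sample)
```

On the real table, the same function could be fed the "title" and "score" columns. Scores on Reddit tend to be heavily skewed, so a median or a log transform may be more informative than a plain mean.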

Tables

One Million Reddit Questions

@kaggle.pavellexyr_one_million_reddit_questions.one_million_reddit_questions
  • 108.69 MB
  • 1000000 rows
  • 12 columns

CREATE TABLE one_million_reddit_questions (
  "type" VARCHAR,
  "id" VARCHAR,
  "subreddit_id" VARCHAR,
  "subreddit_name" VARCHAR,
  "subreddit_nsfw" BOOLEAN,
  "created_utc" BIGINT,
  "permalink" VARCHAR,
  "domain" VARCHAR,
  "url" VARCHAR,
  "selftext" VARCHAR,
  "title" VARCHAR,
  "score" BIGINT
);
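
A minimal sketch of how this schema might be queried, using Python's built-in sqlite3 module as a stand-in for whichever SQL engine actually hosts the table. Two things here are assumptions rather than facts from the dataset page: that "created_utc" holds Unix epoch seconds (the name suggests it, but the page does not say), and the sample rows, which are invented for illustration. The helper name top_questions is likewise hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Same column list as the schema above; SQLite accepts these type names.
SCHEMA = """
CREATE TABLE one_million_reddit_questions (
  "type" VARCHAR,
  "id" VARCHAR,
  "subreddit_id" VARCHAR,
  "subreddit_name" VARCHAR,
  "subreddit_nsfw" BOOLEAN,
  "created_utc" BIGINT,
  "permalink" VARCHAR,
  "domain" VARCHAR,
  "url" VARCHAR,
  "selftext" VARCHAR,
  "title" VARCHAR,
  "score" BIGINT
);
"""

# Hypothetical rows, NOT taken from the dataset.
SAMPLE_ROWS = [
    ("post", "ex1", "sub_example", "askreddit", 0, 1630454400,
     "/r/AskReddit/comments/ex1/", "self.AskReddit", "", "",
     "What is your favourite book?", 12),
    ("post", "ex2", "sub_example", "askreddit", 0, 1627776000,
     "/r/AskReddit/comments/ex2/", "self.AskReddit", "", "",
     "Why do people enjoy long walks?", 57),
    ("post", "ex3", "sub_example", "askreddit", 0, 1625097600,
     "/r/AskReddit/comments/ex3/", "self.AskReddit", "", "",
     "How do you stay focused?", 3),
]


def top_questions(conn, limit=2):
    """Return (title, score, ISO date) for the highest-scoring questions.

    Assumes created_utc is Unix epoch seconds.
    """
    rows = conn.execute(
        'SELECT "title", "score", "created_utc" '
        "FROM one_million_reddit_questions "
        'ORDER BY "score" DESC LIMIT ?',
        (limit,),
    ).fetchall()
    return [
        (title, score,
         datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat())
        for title, score, ts in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.executemany(
    "INSERT INTO one_million_reddit_questions "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    SAMPLE_ROWS,
)
result = top_questions(conn)
```

The SELECT statement itself is plain SQL and should carry over to other engines with little or no change. Only the timestamp conversion is done in Python here, to avoid relying on engine-specific date functions.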
