Reddit r/AskScience Flair Dataset
@kaggle.sumitm004_reddit_raskscience_flair_dataset
Reddit is a large platform for news, content, and discussion, with millions of daily active users. Among its many subreddits, this dataset focuses on r/AskScience, a community where users ask and discuss science-related questions.
This dataset is derived from the r/AskScience subreddit and was collected between January 1, 2016, and May 20, 2022. It contains 612,668 data points across 22 columns, including the content of the questions, submission descriptions, associated flairs, NSFW/SFW status, and year of submission. The data was extracted using Python and the Pushshift API, then cleaned with NumPy and pandas. Detailed column descriptions are available.
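The kind of cleaning described above might look roughly like the following sketch. It is not the authors' pipeline: it uses only the Python standard library instead of NumPy and pandas so that it runs anywhere, the field names (`title`, `flair`, `created_utc`, `over_18`) are hypothetical stand-ins for the dataset's actual columns, the sample records are made up for illustration, and treating the collection window as UTC is an assumption.

```python
from collections import Counter
from datetime import datetime, timezone

# Collection window from the dataset description (UTC is an assumption).
START = datetime(2016, 1, 1, tzinfo=timezone.utc)
END = datetime(2022, 5, 20, 23, 59, 59, tzinfo=timezone.utc)


def clean(records):
    """Keep records with a title and a flair that fall inside the window.

    Field names are hypothetical; the real dataset's columns may differ.
    """
    cleaned = []
    for rec in records:
        title = (rec.get("title") or "").strip()
        flair = rec.get("flair")
        if not title or not flair:
            continue  # drop rows with a missing title or flair
        ts = datetime.fromtimestamp(rec["created_utc"], tz=timezone.utc)
        if not (START <= ts <= END):
            continue  # drop rows outside the collection window
        cleaned.append({
            "title": title,
            "flair": flair,
            "nsfw": bool(rec.get("over_18", False)),
            "year": ts.year,  # derived year-of-submission column
        })
    return cleaned


# Made-up sample records, for illustration only.
sample = [
    {"title": " Why is the sky blue? ", "flair": "Physics",
     "created_utc": 1577836800, "over_18": False},   # 2020-01-01 UTC
    {"title": "How do vaccines work?", "flair": "Biology",
     "created_utc": 1451606400},                     # 2016-01-01 UTC
    {"title": "Old question", "flair": "Physics",
     "created_utc": 1420070400},                     # 2015-01-01 UTC, too early
    {"title": "", "flair": "Chemistry", "created_utc": 1577836800},
    {"title": "No flair", "flair": None, "created_utc": 1577836800},
]

cleaned = clean(sample)
flair_counts = Counter(row["flair"] for row in cleaned)
```

With the real data, a tally like `flair_counts` gives a quick view of how submissions are distributed across flairs, which is useful to check before training a flair classifier.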
@kaggle
@euremarkable