Reddit: /r/CryptoCurrency
Posts, Scores, Comment Counts and Creation Timestamps
@kaggle.thedevastator_unlocking_financial_opportunities_through_crypto
By Reddit [source]
This dataset contains detailed information on posts from the Reddit subreddit ‘CryptoCurrency’, an online community devoted to discussion and analysis of blockchain investments, digital currencies, and related topics. For each post it provides the title, score (the net upvotes the post has received), comment count, creation date and timestamp. This snapshot of crypto discussions can show not only which topics have been popular over time but also how they are discussed across the community. Are there particular trends or patterns that emerge? It's up to you to uncover them!
This dataset contains posts from the subreddit ‘CryptoCurrency’, a widely followed discussion board devoted to cryptocurrencies, blockchain investments, and related topics. Each post comes with its score, comment count and creation timestamp. The dataset can be used for both research and practical business applications.
First, let's explore the columns in this dataset: title, score, url, comms_num (number of comments), created (date and time the post was created), body (the content of the post) and timestamp. The table schema at the end of this page also lists an id column.
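As a starting point, the columns above could be read with Python's standard `csv` module. This is only a minimal sketch: the sample rows are invented for illustration, the `load_posts` helper is hypothetical, and it assumes that `created` holds Unix epoch seconds (the schema stores it as a DOUBLE) and that the file has a header row with these column names.

```python
import csv
import io
from datetime import datetime, timezone

# Tiny in-memory stand-in for CryptoCurrency.csv; these rows are invented.
SAMPLE_CSV = """title,score,id,url,comms_num,created,body,timestamp
Example post A,120,abc123,https://example.com/a,45,1609459200.0,Some text,2021-01-01 00:00:00
Example post B,8,def456,https://example.com/b,3,1609545600.0,,2021-01-02 00:00:00
"""

def load_posts(handle):
    """Read rows from a CSV file handle and convert the numeric columns.

    Assumption: `created` is a Unix epoch value in seconds.
    """
    posts = []
    for row in csv.DictReader(handle):
        posts.append({
            "title": row["title"],
            "score": int(row["score"]),
            "id": row["id"],
            "url": row["url"],
            "comms_num": int(row["comms_num"]),
            "created": datetime.fromtimestamp(float(row["created"]), tz=timezone.utc),
            "body": row["body"],  # empty string when the post has no text
        })
    return posts

posts = load_posts(io.StringIO(SAMPLE_CSV))
print(len(posts), posts[0]["created"].isoformat())
```

To read the real file, the `io.StringIO` object would be replaced by something like `open("CryptoCurrency.csv", newline="", encoding="utf-8")`; the encoding is also an assumption.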
With this information at hand you can begin answering key questions such as: Which types of topics attract more attention? Which topics are unpopular? Is there a correlation between a post's score (net upvotes) and its number of comments?
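The last question can be approached with a Pearson correlation coefficient between `score` and `comms_num`. The sketch below implements the textbook formula by hand; the `pearson` helper and the sample numbers are invented for illustration and say nothing about the real data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Invented values standing in for the score and comms_num columns.
scores = [120, 8, 45, 300, 15]
comments = [45, 3, 20, 110, 9]

r = pearson(scores, comments)
print(f"correlation between score and comment count: {r:.3f}")
```

A value near 1 would suggest that highly scored posts also attract many comments. It would not show which one drives the other.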
Several tools can help answer these questions. Natural Language Processing techniques such as TF-IDF vectorization or Latent Dirichlet Allocation (LDA) can reveal which themes dominate these conversations. Unsupervised learning techniques, such as k-means clustering or Principal Component Analysis (PCA), could also help uncover structure in the data set. (K Nearest Neighbors, by contrast, is a supervised method that requires labels, not a clustering technique.) For example, to explore which words in titles are associated with higher scores, one could cluster the titles by their vocabulary and compare the average score of each cluster. This gives a first overview of how related words relate to scores before analyzing the body text or any other factors in the data set.
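To make the TF-IDF idea concrete, here is a small standard-library sketch that weights the words of each title. It uses the plain `tf * log(N / df)` form, which differs from the smoothed variants used by libraries such as scikit-learn. The `tokenize` and `tfidf` helpers and the titles are invented for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf(documents):
    """Return one {term: weight} dict per document, using tf * log(N / df)."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty titles
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

# Invented titles standing in for the title column.
titles = [
    "Bitcoin hits new high",
    "Ethereum upgrade delayed",
    "Bitcoin and Ethereum fees compared",
]

vectors = tfidf(titles)
for title, vector in zip(titles, vectors):
    top_term = max(vector, key=vector.get)
    print(f"{title!r}: most distinctive term is {top_term!r}")
```

Words that appear in every title get a weight of zero, so the remaining weights highlight what distinguishes one title from the others. The resulting vectors could then be passed to a clustering step.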
Furthermore, Reddit users actively engage with posts, so comment counts offer another view of popularity. For example, one may observe that posts made when coin values change tend to attract more comments than usual, which would indicate heightened user engagement at those moments compared with regular periods. Such observations could be useful when comparing individual coins.
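One way to look for such bursts of engagement is to total the comment counts per day and flag days that stand well above the average. The sketch below assumes `created` is a Unix epoch value in seconds. The helper names, the factor of 2 and the sample values are arbitrary choices for illustration.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

def comments_per_day(posts):
    """Sum comms_num per UTC calendar day for (created, comms_num) pairs."""
    totals = defaultdict(int)
    for created, comms_num in posts:
        day = datetime.fromtimestamp(created, tz=timezone.utc).date()
        totals[day] += comms_num
    return dict(totals)

def busy_days(totals, factor=2.0):
    """Return the days whose comment total exceeds `factor` times the mean."""
    baseline = mean(totals.values())
    return sorted(day for day, total in totals.items() if total > factor * baseline)

# Invented (created, comms_num) pairs; created is assumed to be epoch seconds.
posts = [
    (1609459200, 10),   # 2021-01-01
    (1609462800, 5),    # 2021-01-01, one hour later
    (1609545600, 12),   # 2021-01-02
    (1609632000, 200),  # 2021-01-03
    (1609718400, 8),    # 2021-01-04
]

totals = comments_per_day(posts)
busy = busy_days(totals)
print([day.isoformat() for day in busy])
```

A mean-based threshold is sensitive to outliers; a median or rolling baseline may be a better fit for real data.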
Overall, the value of this data depends on how it is used, whether for research only or for applied analytics aimed at predicting prices, engagement or user sentiment. Either way, the cryptocurrency discussion found on Reddit can be a valuable resource for many purposes when used properly.
- This dataset can be used for sentiment analysis of post titles and bodies on CryptoCurrency topics, and to track how these conversations have changed over time. This can help ascertain how different events in the crypto market were received by investors, speculators, and other users of the subreddit.
- The dataset can also be utilized to identify trends in successful topics of conversation (in terms of post scores) and give insight into what types of topics are popular among Redditors in the CryptoCurrency space.
- Furthermore, this dataset could provide insight into user behavior on the CryptoCurrency subreddit by enabling analysis of peak times for certain conversations or for post popularity. Analyzing which users post or comment more frequently than others would require author information, which is not among the columns listed for this file.
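The peak-time idea from the list above can be sketched by grouping posts by hour of day and averaging their scores. As before, this assumes `created` is a Unix epoch value in seconds, interpreted in UTC. The `mean_score_by_hour` helper and the sample values are invented.

```python
from collections import defaultdict
from datetime import datetime, timezone

def mean_score_by_hour(posts):
    """Average score per UTC hour of day for (created, score) pairs."""
    buckets = defaultdict(list)
    for created, score in posts:
        hour = datetime.fromtimestamp(created, tz=timezone.utc).hour
        buckets[hour].append(score)
    return {hour: sum(scores) / len(scores) for hour, scores in sorted(buckets.items())}

# Invented (created, score) pairs; created is assumed to be epoch seconds.
posts = [
    (1609459200, 10),  # 00:00 UTC
    (1609462800, 30),  # 01:00 UTC
    (1609545600, 20),  # 00:00 UTC on the next day
    (1609552800, 50),  # 02:00 UTC
]

by_hour = mean_score_by_hour(posts)
peak_hour = max(by_hour, key=by_hour.get)
print(by_hour, "peak hour:", peak_hour)
```

Hours with very few posts give unreliable averages, so a minimum post count per hour would be a sensible addition for real data.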
If you use this dataset in your research, please credit the original authors.
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: CryptoCurrency.csv
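| Column name | Description |
|---|---|
| title | The title of the post. (String) |
| score | The net number of upvotes the post has received. (Integer) |
| id | Not described in the original column list but present in the table schema; presumably the post's identifier. (String) |
| url | A direct link to the post. (String) |
| comms_num | The number of comments the post has received. (Integer) |
| created | The date and time the post was created. The table schema stores it as a DOUBLE, presumably a Unix epoch value. (Numeric) |
| body | The content of the post. (String) |
| timestamp | The timestamp of when the post was submitted. (DateTime) |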
If you use this dataset in your research, please credit Reddit.
CREATE TABLE cryptocurrency (
"title" VARCHAR,
"score" BIGINT,
"id" VARCHAR,
"url" VARCHAR,
"comms_num" BIGINT,
"created" DOUBLE,
"body" VARCHAR,
"timestamp" TIMESTAMP
);
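The schema above can be tried out locally. The following sketch loads it into an in-memory SQLite database through Python's `sqlite3` module and runs a simple ranking query. SQLite is only a stand-in here, since the source does not say which SQL dialect the schema was written for, and the inserted rows are invented.

```python
import sqlite3

# Same columns as the schema above; SQLite accepts these type names.
DDL = """
CREATE TABLE cryptocurrency (
    "title" VARCHAR,
    "score" BIGINT,
    "id" VARCHAR,
    "url" VARCHAR,
    "comms_num" BIGINT,
    "created" DOUBLE,
    "body" VARCHAR,
    "timestamp" TIMESTAMP
)
"""

# Invented rows for illustration only.
ROWS = [
    ("Example post A", 120, "abc123", "https://example.com/a", 45,
     1609459200.0, "Some text", "2021-01-01 00:00:00"),
    ("Example post B", 8, "def456", "https://example.com/b", 3,
     1609545600.0, "", "2021-01-02 00:00:00"),
    ("Example post C", 300, "ghi789", "https://example.com/c", 110,
     1609632000.0, "More text", "2021-01-03 00:00:00"),
]

conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.executemany("INSERT INTO cryptocurrency VALUES (?, ?, ?, ?, ?, ?, ?, ?)", ROWS)

# The two highest-scoring posts with their comment counts.
top_posts = conn.execute(
    'SELECT "title", "score", "comms_num" '
    'FROM cryptocurrency ORDER BY "score" DESC LIMIT 2'
).fetchall()
print(top_posts)
conn.close()
```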