Identifying Influential Bloggers: Techcrunch
@kaggle.lakritidis_identifying_influential_bloggers_techcrunch
This dataset is a crawl of the blog posts of the Techcrunch technology blog, conducted in April 2010. It was used as an experimental dataset for the research paper:
L. Akritidis, D. Katsaros, P. Bozanis, "Identifying the Productive and Influential Bloggers in a Community", IEEE Transactions on Systems, Man, and Cybernetics-Part C: Applications and Reviews, vol. 41, no. 5, pp. 759-764, 2011.
The primary goal of this dataset was to provide an active community in which members who are both productive and influential can be identified. However, since the full text of the posts is present, it can also be used for a wide variety of text mining tasks, such as sentiment analysis, opinion retrieval, and NLP. A (My)SQL version is also available.
Researchers who use this dataset are kindly asked to cite the aforementioned article in their work.
If you found this dataset useful, you may also want to check my TUAW dataset for identifying influential bloggers.
The repository consists of four files:
Precise descriptions and record counts for each file are provided below.