One Week Of Global News Feeds
@kaggle.therohk_global_news_week
This dataset is a snapshot of most of the new news content published online over one week. It covers the seven-day period from August 24 through August 30 in each of the years 2017 and 2018.
Articles per year: 2017: 1,398,431; 2018: 1,912,872
It includes approximately 3.3 million articles from 20,000 news sources in 20+ languages.
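As a quick sanity check on these figures, the short sketch below sums the two yearly counts and derives an average daily volume for each slice. It uses only the numbers quoted above; the variable names are illustrative and not part of the dataset.

```python
# Article counts per yearly slice, as quoted in the dataset description.
counts = {2017: 1_398_431, 2018: 1_912_872}
DAYS_PER_SLICE = 7  # August 24 through August 30, inclusive

# Combined size of both slices.
total = sum(counts.values())

# Average number of articles per day within each slice.
per_day = {year: n / DAYS_PER_SLICE for year, n in counts.items()}

print(f"total articles: {total:,}")
for year, rate in per_day.items():
    print(f"{year}: about {rate:,.0f} articles per day")
```

The total comes to 3,311,303 articles, which matches the "approximately 3.3 million" figure.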
This dataset has just four fields:
See the "Basic Feed-Code Exploration" notebook for a quick look at the dataset contents.
The sources include news feeds, news websites, government agencies, tech journals, company websites, blogs and Wikipedia updates. The data was collected by polling RSS feeds and by crawling other large news aggregators.
These 7-day slices were selected because, as of 2018, there had been no downtime or outage during the intervals. Publishers produce new news content at this rate every day, throughout the year.
This dataset is free to use with the following citation:
Rohit Kulkarni (2018), One Week of Global Feeds [News CSV Dataset], doi:10.7910/DVN/ILAT5B, Retrieved from: [this url]
Original paper by M Trampus, B Novak: Internals of An Aggregated Web News Feed
Hosted by: Jožef Stefan Institute, Slovenia (http://ailab.ijs.si/si/people)
Further Exploration and Live News: (eventregistry.org)