Baselight

News About Major Cryptocurrencies 2013-2018 (40k)

Scraped from CCN Bitcoin, CoinDesk, NewsBTC etc.

@kaggle.kashnitsky_news_about_major_cryptocurrencies_20132018_40k

About this Dataset

Context

Cryptocurrencies were a very hot topic until recently, and they still are, though perhaps less so. It is interesting to see what the media write about major cryptocurrencies such as BTC, Ethereum, TRON, and Litecoin.

Content

This dataset contains roughly 40k news articles about major cryptocurrencies, scraped from popular online sources such as CCN Bitcoin, CoinDesk, and NewsBTC. Each article comes with the following fields:

  • URL
  • Title
  • Text body of the article
  • HTML body of the article
  • Year
  • Author
  • Source

Inspiration

This dataset can be used to benchmark summarization algorithms or to perform classification (e.g., by sentiment), although additional labeling will be needed in the latter case.
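As a rough illustration of the summarization use case, the sketch below scores a lead-sentence baseline against an article's title using a unigram-overlap F1. Treating the title as a reference summary is an assumption made here for illustration, not something the dataset prescribes. The sample record and the helper names (`tokenize`, `lead_sentence`, `unigram_f1`) are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lead_sentence(text):
    """Return the first sentence of a body (naive split after ., ! or ?)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip(), maxsplit=1)
    return parts[0]


def unigram_f1(candidate, reference):
    """F1 over overlapping unigram counts (a simplified ROUGE-1-style score)."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical record shaped like a dataset row (not real data).
article = {
    "title": "Bitcoin price climbs above previous high",
    "text": "The bitcoin price climbs again today. Analysts disagree on why.",
}

summary = lead_sentence(article["text"])
score = unigram_f1(summary, article["title"])
print(summary)
print(round(score, 3))
```

A real benchmark would more likely use an established ROUGE implementation and human-written reference summaries; the overlap score here only shows the shape of the evaluation loop.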

Tables

Crypto News Parsed 2013–2017 Train

@kaggle.kashnitsky_news_about_major_cryptocurrencies_20132018_40k.crypto_news_parsed_2013_2017_train
  • 128.28 MB
  • 28,069 rows
  • 7 columns
CREATE TABLE crypto_news_parsed_2013_2017_train (
  "url" VARCHAR,
  "title" VARCHAR,
  "text" VARCHAR,
  "html" VARCHAR,
  "year" BIGINT,
  "author" VARCHAR,
  "source" VARCHAR
);

Crypto News Parsed 2018 Validation

@kaggle.kashnitsky_news_about_major_cryptocurrencies_20132018_40k.crypto_news_parsed_2018_validation
  • 47.1 MB
  • 11,239 rows
  • 7 columns
CREATE TABLE crypto_news_parsed_2018_validation (
  "url" VARCHAR,
  "title" VARCHAR,
  "text" VARCHAR,
  "html" VARCHAR,
  "year" BIGINT,
  "author" VARCHAR,
  "source" VARCHAR
);
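Because both tables share one schema, they can be queried together. The sketch below recreates the two tables in an in-memory SQLite database and counts articles per year and split. SQLite is only a stand-in here, since the engine that actually hosts the data is not stated. All inserted rows are made-up placeholders, not real dataset content.

```python
import sqlite3

# Column definitions copied from the CREATE TABLE statements above.
COLUMNS = (
    '"url" VARCHAR, "title" VARCHAR, "text" VARCHAR, "html" VARCHAR, '
    '"year" BIGINT, "author" VARCHAR, "source" VARCHAR'
)
TRAIN = "crypto_news_parsed_2013_2017_train"
VALIDATION = "crypto_news_parsed_2018_validation"

conn = sqlite3.connect(":memory:")
for table in (TRAIN, VALIDATION):
    conn.execute(f"CREATE TABLE {table} ({COLUMNS})")

# Hypothetical placeholder rows (not taken from the dataset).
train_rows = [
    ("https://example.com/a", "Title A", "Body A", "<p>Body A</p>", 2016, "Author 1", "Source X"),
    ("https://example.com/b", "Title B", "Body B", "<p>Body B</p>", 2017, "Author 2", "Source Y"),
    ("https://example.com/c", "Title C", "Body C", "<p>Body C</p>", 2017, "Author 1", "Source X"),
]
validation_rows = [
    ("https://example.com/d", "Title D", "Body D", "<p>Body D</p>", 2018, "Author 3", "Source Y"),
]
placeholders = "(?, ?, ?, ?, ?, ?, ?)"
conn.executemany(f"INSERT INTO {TRAIN} VALUES {placeholders}", train_rows)
conn.executemany(f"INSERT INTO {VALIDATION} VALUES {placeholders}", validation_rows)

# Count articles per year in each split, then combine the two results.
query = f"""
SELECT dataset_split, year, n_articles FROM (
    SELECT 'train' AS dataset_split, "year" AS year, COUNT(*) AS n_articles
    FROM {TRAIN} GROUP BY "year"
    UNION ALL
    SELECT 'validation' AS dataset_split, "year" AS year, COUNT(*) AS n_articles
    FROM {VALIDATION} GROUP BY "year"
)
ORDER BY year
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Splitting by year, as the table names suggest, keeps the validation articles strictly later than the training articles, which avoids leaking future news into a model trained on the earlier years.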