News Article Category Dataset

14 categories of news articles, each with a headline and body.

@kaggle.timilsinabimal_newsarticlecategories

About this Dataset

Context

The data was created for my academic project, News-Article-Classifier. This dataset can be used to train models that classify news articles into categories.

Content

It contains 6877 unique news articles published in HuffPost, spread across 14 categories: ARTS & CULTURE, BUSINESS, COMEDY, CRIME, EDUCATION, ENTERTAINMENT, ENVIRONMENT, MEDIA, POLITICS, RELIGION, SCIENCE, SPORTS, TECH, WOMEN.
The article count for each category is as follows:

  • ARTS AND CULTURE: 1002
  • BUSINESS: 501
  • COMEDY: 380
  • CRIME: 300
  • EDUCATION: 490
  • ENTERTAINMENT: 501
  • ENVIRONMENT: 501
  • MEDIA: 347
  • POLITICS: 501
  • RELIGION: 501
  • SCIENCE: 350
  • SPORTS: 501
  • TECH: 501
  • WOMEN: 501
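
The counts above are uneven (ARTS AND CULTURE has more than three times as many articles as CRIME), which matters when training a classifier. The following is a minimal sketch, not part of the original project, that checks the listed counts against the stated total and derives inverse-frequency class weights. The names `CATEGORY_COUNTS` and `class_weights` are hypothetical, and the weighting formula is one common heuristic among several.

```python
# Category counts copied from the list above.
CATEGORY_COUNTS = {
    "ARTS AND CULTURE": 1002,
    "BUSINESS": 501,
    "COMEDY": 380,
    "CRIME": 300,
    "EDUCATION": 490,
    "ENTERTAINMENT": 501,
    "ENVIRONMENT": 501,
    "MEDIA": 347,
    "POLITICS": 501,
    "RELIGION": 501,
    "SCIENCE": 350,
    "SPORTS": 501,
    "TECH": 501,
    "WOMEN": 501,
}


def class_weights(counts):
    """Inverse-frequency weights: total / (number_of_classes * class_count).

    This is an assumed heuristic for illustration; rarer classes get
    weights above 1 and more frequent classes get weights below 1.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * count) for label, count in counts.items()}


total = sum(CATEGORY_COUNTS.values())
weights = class_weights(CATEGORY_COUNTS)
largest = max(CATEGORY_COUNTS, key=CATEGORY_COUNTS.get)
smallest = min(CATEGORY_COUNTS, key=CATEGORY_COUNTS.get)

print(f"total articles: {total}")
print(f"largest class: {largest}, smallest class: {smallest}")
for label, weight in sorted(weights.items(), key=lambda item: item[1]):
    print(f"{label}: {weight:.2f}")
```

Weights like these could be passed to a classifier that accepts per-class weights, or the imbalance could instead be handled by resampling; which works better depends on the model.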

Acknowledgements

The data was created with the help of the News Category Dataset and scraped from HuffPost.

Inspiration

  • Do news articles from different categories have different writing styles?
  • What kinds of words contribute to each of the categories in News Articles?
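
One simple way to start on the second question is to rank words by how much more often they appear in one category than in the corpus as a whole. The sketch below uses only the standard library and scores each word by its in-category count times a smoothed log-ratio of in-category to overall frequency. Everything here is an assumption for illustration: the function names, the `(category, text)` row shape, and the toy `SAMPLE_ROWS`, which are invented sentences and not taken from the dataset.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(rows, top_n=3):
    """Rank words per category from an iterable of (category, text) pairs.

    Score = count_in_category * log(p_category / p_overall), with add-one
    smoothing on both probabilities. Ties are broken alphabetically.
    """
    per_category = defaultdict(Counter)
    overall = Counter()
    for category, text in rows:
        tokens = tokenize(text)
        per_category[category].update(tokens)
        overall.update(tokens)

    vocab_size = len(overall)
    total_overall = sum(overall.values())
    result = {}
    for category, counts in per_category.items():
        total_category = sum(counts.values())
        scores = {}
        for word, count in counts.items():
            p_category = (count + 1) / (total_category + vocab_size)
            p_overall = (overall[word] + 1) / (total_overall + vocab_size)
            scores[word] = count * math.log(p_category / p_overall)
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result[category] = ranked[:top_n]
    return result


# Hypothetical toy rows, invented for this example only.
SAMPLE_ROWS = [
    ("SPORTS", "team wins the final game"),
    ("SPORTS", "coach praises the team"),
    ("TECH", "new phone launches with faster chip"),
    ("TECH", "the chip shortage slows phone sales"),
]

top = distinctive_words(SAMPLE_ROWS, top_n=2)
for category, words in top.items():
    print(category, words)
```

Words shared across categories, such as "the" in the toy rows, score low because their in-category frequency is close to their overall frequency. On the real data, the rows would come from the dataset's category and text fields, whose exact column names are not stated here.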

Citation

If you're using this dataset for research purposes, please cite it with the following BibTeX entry:

@dataset{dataset,
author = {Timilsina, Bimal},
year = {2021},
month = {08},
pages = {},
title = {News Article Category Dataset},
}
