AG News (News Articles)
News Articles Text Classification
@kaggle.thedevastator_new_dataset_for_text_classification_ag_news
Source: Hugging Face Hub
The ag_news dataset is a widely used benchmark for text classification research. As published on the Hugging Face Hub, it consists of a training set of 120,000 examples and a test set of 7,600 examples, split evenly across four topic classes: World, Sports, Business, and Sci/Tech. The balanced classes make it well suited to comparing text classification methods.
- This dataset can be used to train a text classifier that automatically categorizes news articles by topic.
- This dataset can be used to develop a system that routes or tags incoming news articles by subject area.
- This dataset can be used to study how vocabulary and phrasing differ between news topics.
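The classification use case listed above can be sketched with a small baseline. The following is a minimal multinomial Naive Bayes written with the standard library only; it is one possible starting point, not a method prescribed by the dataset. The `tokenize` helper, the `NaiveBayes` class, and the toy rows are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy rows in the same (text, label) shape as train.csv; the texts and
# label values here are made up for illustration.
train_rows = [
    ("Stocks rally as markets close higher", 0),
    ("Central bank raises interest rates", 0),
    ("Team wins the championship game", 1),
    ("Striker scores twice in cup final", 1),
]
model = NaiveBayes().fit(*zip(*train_rows))
print(model.predict("bank rates and markets"))
print(model.predict("final game of the cup"))
```

On the real files, the toy list would be replaced by rows read from train.csv, and accuracy would be measured on test.csv.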
AG is a collection of more than 1 million news articles. The articles have been gathered from more than 2,000 news sources by ComeToMyHead in more than one year of activity. ComeToMyHead is an academic news search engine that has been running since July 2004. The dataset is provided by the academic community for research purposes in data mining (clustering, classification, etc.), information retrieval (ranking, search, etc.), XML, data compression, data streaming, and any other non-commercial activity. For more information, please refer to the link http://www.di.unipi.it/~gulli/AG_corpus_of_news_articles.html .
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: train.csv
| Column name | Description |
|---|---|
| text | The text of the news article. (string) |
| label | The label of the news article. (integer) |
File: test.csv
| Column name | Description |
|---|---|
| text | The text of the news article. (string) |
| label | The label of the news article. (integer) |
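As a rough sketch of how the two columns described above might be read, the snippet below parses CSV rows into `(text, label)` pairs and counts the rows per label. It assumes the files carry a header row named `text,label`, which the tables suggest but do not guarantee. The helper names `read_rows` and `label_distribution` and the sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def read_rows(handle):
    """Yield (text, label) pairs from a CSV with 'text' and 'label' columns."""
    for row in csv.DictReader(handle):
        yield row["text"], int(row["label"])


def label_distribution(rows):
    """Count how many rows carry each integer label."""
    return Counter(label for _, label in rows)


# Stand-in for open("train.csv", newline="", encoding="utf-8"); the two rows
# below are invented so the sketch runs without the real file.
sample = io.StringIO(
    'text,label\n"Oil prices climb, again",2\n"Late goal seals win",1\n'
)
rows = list(read_rows(sample))
print(label_distribution(rows))
```

Using `csv.DictReader` rather than splitting on commas matters here because article text routinely contains commas and quotes.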
CREATE TABLE test (
    "text" VARCHAR,
    "label" BIGINT
);

CREATE TABLE train (
    "text" VARCHAR,
    "label" BIGINT
);
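The two table definitions above can be tried out locally. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, which is an assumption: the source does not say which database the schema targets, though SQLite does accept the `VARCHAR` and `BIGINT` type names. The inserted rows are invented for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE train ("text" VARCHAR, "label" BIGINT);
    CREATE TABLE test  ("text" VARCHAR, "label" BIGINT);
    """
)

# Invented rows; in practice these would come from train.csv.
conn.executemany(
    'INSERT INTO train ("text", "label") VALUES (?, ?)',
    [
        ("Shares slip on weak earnings", 2),
        ("Rover sends back new images", 3),
        ("Chipmaker unveils faster processor", 3),
    ],
)

# Rows per label, a common first sanity check on a classification table.
per_label = conn.execute(
    'SELECT "label", COUNT(*) FROM train GROUP BY "label" ORDER BY "label"'
).fetchall()
print(per_label)
```

The same `GROUP BY` query run against the full train table would show whether the classes are as balanced as the description states.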