Fake News Classification
A Comprehensive Dataset for Fake News Detection
@kaggle.aadyasingh55_fake_news_classification
The Fake News Classification Dataset is an English-language dataset containing just over 45,000 unique news articles, each labeled as true (1) or fake (0). It is a useful resource for researchers and practitioners working on fake news detection with Transformer models. This is the first version of the dataset.
This dataset supports text-classification tasks, in particular fake news detection.
The dataset is primarily in English as generally spoken in the United States (en-US).
The dataset comprises 40,587 news-article records, each with three key fields: title, text, and label (plus an id). Each instance looks like this:
```json
{
  "id": "1",
  "title": "Palestinians switch off Christmas lights in Bethlehem in anti-Trump protest",
  "text": "RAMALLAH, West Bank (Reuters) - Palestinians switched off Christmas lights at Jesus' traditional birthplace in Bethlehem on Wednesday night in protest at U.S. President Donald Trump's decision to recognize Jerusalem as Israel's capital...",
  "label": "1"
}
```
The dataset is divided into three splits: train, validation, and test.
This dataset was created in Python, with the pandas library as the main processing tool. It merges several existing fake news datasets into a single comprehensive corpus for model training. All processes and code used for dataset creation are available in the repository: Fake News Detection Repository.
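The merging step described above can be sketched with pandas: concatenate the source tables and drop duplicate articles so each one is unique. The column names follow the example record; the toy source frames below are purely illustrative, not the actual Kaggle sources:

```python
import pandas as pd

# Two toy source tables standing in for the original Kaggle datasets.
df_a = pd.DataFrame({
    "title": ["Story A", "Story B"],
    "text":  ["Body A", "Body B"],
    "label": [1, 0],
})
df_b = pd.DataFrame({
    "title": ["Story B", "Story C"],   # "Story B" duplicates a row in df_a
    "text":  ["Body B", "Body C"],
    "label": [0, 1],
})

# Concatenate the sources, then keep only unique articles.
combined = (
    pd.concat([df_a, df_b], ignore_index=True)
      .drop_duplicates(subset=["title", "text"])
      .reset_index(drop=True)
)
print(len(combined))  # 3 unique articles remain
```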
The source data is a combination of multiple fake news datasets sourced from Kaggle, a platform for data science and machine learning competitions, datasets, and learning resources.
Version 1.0.0 supports supervised deep learning methodologies, with a focus on Transformer models for Natural Language Processing (NLP) applied to news articles from the United States.
The dataset covers three phases of model development:
Training Phase: for fitting your NLP model.
Validation Phase: for checking the effectiveness of training and watching for overfitting.
Test Phase: for evaluating the model's final performance and diagnosing fine-tuning mistakes.
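A simple way to produce such train/validation/test splits, sketched with Python's standard library (the 80/10/10 ratio and the helper name are assumptions for illustration, not documented properties of this dataset):

```python
import random

def three_way_split(records, val_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle records and cut them into train/validation/test lists."""
    rng = random.Random(seed)
    shuffled = records[:]            # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

train, val, test = three_way_split(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```

Fixing the seed keeps the split reproducible across runs, which matters when comparing fine-tuning experiments against the same validation set.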