Baselight

FakeCovid Fact-Checked News Dataset

International Coverage of COVID-19 in 40 Languages from 105 Countries

@kaggle.thedevastator_fakecovid_fact_checked_news_dataset


About this dataset

The FakeCovid dataset is a collection of 7623 fact-checked news articles related to COVID-19. The articles come from 92 fact-checking websites in 105 countries and span many sources and languages, with coverage across Africa, Europe, Asia, the Americas, and Oceania. The data was gathered from references on Poynter and Snopes, which makes it a useful resource for studying the accuracy of global news about the pandemic. Its columns record the countries involved; categories such as coronavirus health updates or political interference during the pandemic; URLs of the referenced articles; the fact-checking website that verified each article; the verdict class, which can range from true to false or a mixed evaluation; publication dates; article and reference sources; and the article text and language. Together these fields help in studying how COVID-19 information flows around the world and whose interests guide it.

How to use the dataset

The FakeCovid dataset is a multilingual, cross-domain collection of 7623 fact-checked news articles related to COVID-19. It was collected from 92 fact-checking websites and covers a wide range of sources and countries in Africa, Asia, Europe, the Americas, and Oceania. It can be used for research on the accuracy of COVID-19 news across countries and languages.

To use this dataset effectively, you will need basic knowledge of data science tooling, such as data manipulation with pandas or other Python libraries such as NumPy or scikit-learn. The data is in CSV (comma-separated values) format, which can be read by most spreadsheet applications or by a text editor such as Notepad++.
Here are some steps to get started:

  • Access the FakeCovid Fact-Checked News Dataset from Kaggle: https://www.kaggle.com/c/fakecovidfactcheckednewsdataset/data
  • Download the provided CSV file containing all fact-checked news articles and place it in your desired folder.
  • Load the CSV file into your preferred environment, such as Jupyter Notebook or RStudio.
  • Explore the dataset using built-in functions from data science libraries such as pandas and matplotlib: look for meaningful information through statistical analysis and/or create visualizations.
  • Modify values within the CSV file if required, and save your changes.
  • Share your creative projects through the Gitter chatroom #fakecovidauthors.
  • Publish any interesting discoveries in open source repositories such as GitHub.
  • Engage with our Hangouts group #FakeCoviDFactCheckersClub.
  • Show off fun graphics via the Twitter hashtag #FakeCovidiauthors.
  • Reach out if you have further questions via email contactfakecovidadatateam
  • Stay connected by joining our mailing list #FakeCoviDAuthorsGroup.
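
The loading and exploration steps above can be sketched with Python's standard library alone; pandas would work equally well. This is a minimal sketch, not an official loader. It assumes the file is saved as FakeCovid_July2020.csv in the working directory and is UTF-8 encoded, and the helper names `read_rows` and `count_values` are illustrative.

```python
import csv
from collections import Counter
from pathlib import Path

# Assumption: the downloaded file sits in the current working directory.
CSV_PATH = Path("FakeCovid_July2020.csv")

# Article text can be long, so raise the csv module's per-field size limit.
csv.field_size_limit(10_000_000)


def read_rows(handle):
    """Return all CSV rows as a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def count_values(rows, column):
    """Count how often each non-empty value appears in one column."""
    counts = Counter()
    for row in rows:
        value = (row.get(column) or "").strip()
        if value:
            counts[value] += 1
    return counts


if __name__ == "__main__" and CSV_PATH.exists():
    # newline="" lets the csv module handle line breaks inside quoted text.
    # Assumption: the file is UTF-8 encoded.
    with CSV_PATH.open(newline="", encoding="utf-8") as handle:
        rows = read_rows(handle)
    print(len(rows), "rows loaded")
    for label, n in count_values(rows, "class").most_common(10):
        print(label, n)
```

Counting the values of `class` and `lang` first is a quick way to see which verdict labels and languages actually occur before doing any modelling.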

We hope this guide helps you better understand how to use the FakeCovid Fact-Checked News Dataset to generate meaningful insights about COVID-19 news articles worldwide!

Research Ideas

  • Developing an automated algorithm to detect fake news related to COVID-19, using the fact-checking verdicts and other fields in this dataset for machine learning and natural language processing tasks.
  • Training a sentiment analysis model on the data to categorize articles by sentiment, as a basis for investigating how content or author bias (if any) relates to the outcomes seen for particular news topics or countries.
  • Using unsupervised clustering techniques to identify discrepancies between the news circulated in different populations (by language and region), so that publicists can focus on providing factual information rather than spreading false rumors or misinformation about the pandemic.
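
A first step toward the detection and cross-population ideas above is to collapse the verdict strings into a few coarse labels and compare their shares per language or country. The sketch below is illustrative only: the keyword rules in `normalize_label` are an assumption rather than the dataset's official label scheme, and both function names are hypothetical. It expects rows as dicts keyed by column name, for example from `csv.DictReader`.

```python
from collections import Counter, defaultdict


def normalize_label(raw):
    """Collapse a raw verdict string to 'true', 'false', 'mixed', or 'other'.

    The keyword rules are an illustrative assumption, not the dataset's
    official label scheme; inspect the real values of `class` first.
    """
    text = (raw or "").strip().lower()
    if not text:
        return "other"
    if any(word in text for word in ("partly", "partially", "mixed", "half")):
        return "mixed"
    if "false" in text:
        return "false"
    if "true" in text:
        return "true"
    return "other"


def verdict_share_by(rows, key):
    """Return {group: {label: share}} for rows grouped by the `key` column."""
    counts = defaultdict(Counter)
    for row in rows:
        group = (row.get(key) or "").strip() or "unknown"
        counts[group][normalize_label(row.get("class"))] += 1
    shares = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        shares[group] = {label: n / total for label, n in counter.items()}
    return shares
```

Calling `verdict_share_by(rows, "lang")` or `verdict_share_by(rows, "country")` gives a small table of verdict proportions per group, which can serve as a sanity check before training a classifier or clustering articles.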

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: FakeCovid_July2020.csv

Column name          Description
ref_category_title   The title of the category the article belongs to. (String)
ref_url              The URL of the article. (String)
verifiedby           The name of the fact-checking website that verified it. (String)
country              The country the article is from. (String)
class                Whether the article is true, false, or mixed. (String)
published_date       The date the article was published. (Date)
country1             The first country the article is related to. (String)
country2             The second country the article is related to. (String)
country3             The third country the article is related to. (String)
country4             The fourth country the article is related to. (String)
article_source       The source of the article. (String)
ref_source           The source of the reference. (String)
source_title         The title of the source. (String)
content_text         The text of the article. (String)
lang                 The language of the article. (String)
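
Because related countries are spread across `country1` to `country4`, it can help to merge them into a single list per article. A minimal sketch, assuming each row is a dict keyed by column name and that unused slots are empty or missing; `related_countries` is a hypothetical helper name.

```python
COUNTRY_COLUMNS = ("country1", "country2", "country3", "country4")


def related_countries(row):
    """Return the distinct, non-empty related countries of one article.

    The order of the country1..country4 columns is preserved.
    """
    seen = []
    for column in COUNTRY_COLUMNS:
        value = (row.get(column) or "").strip()
        if value and value not in seen:
            seen.append(value)
    return seen
```

The resulting list is easier to count or filter on than four separate columns, for example when tallying how many articles mention a given country.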

Tables

Fakecovid July2020

@kaggle.thedevastator_fakecovid_fact_checked_news_dataset.fakecovid_july2020
  • 28.55 MB
  • 7623 rows
  • 19 columns

CREATE TABLE fakecovid_july2020 (
  "id" VARCHAR,
  "ref_category_title" VARCHAR,
  "ref_url" VARCHAR,
  "pageid" VARCHAR,
  "verifiedby" VARCHAR,
  "country" VARCHAR,
  "class" VARCHAR,
  "title" VARCHAR,
  "published_date" VARCHAR,
  "country1" VARCHAR,
  "country2" VARCHAR,
  "country3" VARCHAR,
  "country4" VARCHAR,
  "article_source" VARCHAR,
  "ref_source" VARCHAR,
  "source_title" VARCHAR,
  "content_text" VARCHAR,
  "category" VARCHAR,
  "lang" VARCHAR
);
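
To try the schema above locally, one option is to recreate the table in an in-memory SQLite database and query it from Python. This is a sketch under assumptions: SQLite is a stand-in for whatever engine hosts the table, and the function names are illustrative. Rows are expected as dicts keyed by column name; missing keys are stored as NULL.

```python
import sqlite3

# Column names copied from the CREATE TABLE statement above.
COLUMNS = (
    "id", "ref_category_title", "ref_url", "pageid", "verifiedby",
    "country", "class", "title", "published_date", "country1",
    "country2", "country3", "country4", "article_source", "ref_source",
    "source_title", "content_text", "category", "lang",
)


def build_table(rows):
    """Create an in-memory SQLite copy of fakecovid_july2020 from dict rows."""
    conn = sqlite3.connect(":memory:")
    column_sql = ", ".join(f'"{name}" VARCHAR' for name in COLUMNS)
    conn.execute(f"CREATE TABLE fakecovid_july2020 ({column_sql})")
    placeholders = ", ".join("?" for _ in COLUMNS)
    conn.executemany(
        f"INSERT INTO fakecovid_july2020 VALUES ({placeholders})",
        ([row.get(name) for name in COLUMNS] for row in rows),
    )
    conn.commit()
    return conn


def rows_per_language(conn):
    """Return (lang, row count) pairs, most frequent language first."""
    query = (
        'SELECT "lang", COUNT(*) FROM fakecovid_july2020 '
        'GROUP BY "lang" ORDER BY COUNT(*) DESC, "lang"'
    )
    return conn.execute(query).fetchall()
```

Column names are double-quoted throughout, matching the schema, so a name such as `class` cannot be confused with a keyword in other SQL engines.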
