Baselight

COVID-19 Numerical Claims Open Research Dataset

Numerical claims related to COVID-19

@kaggle.dshah1612_covid19_numerical_claims_open_research_dataset

About this Dataset

The COVID-19 Numerical Claims Open Research Dataset (CONCORD) is a comprehensive, open-source dataset of numerical claims extracted from academic papers on COVID-19-related research. CONCORD contains approximately 203k numerical claims pertinent to COVID-19, extracted from more than 57,000 scientific research articles published between January 2020 and May 2022. The claims were extracted from full-text research articles annotated using a white-box, weakly supervised model. We used the CORD-19 repository as the raw dataset for our research work.

Why numerical claims?

  • Adding a numerical entity often increases a claim’s credibility and provides fine-grained, tangible information that is especially valuable in the biomedical domain.

Thumbnail Image source: https://indianexpress.com/article/cities/bangalore/unsustainable-urbanisation-coronavirus-variants-8062078/

Tables

Concord

@kaggle.dshah1612_covid19_numerical_claims_open_research_dataset.concord
  • 41.35 MB
  • 203539 rows
  • 10 columns

CREATE TABLE concord (
  "claim_uid" VARCHAR,
  "cord_uid" VARCHAR,
  "title" VARCHAR,
  "doi" VARCHAR,
  "numerical_claims" VARCHAR,
  "publish_time" TIMESTAMP,
  "authors" VARCHAR,
  "journal" VARCHAR,
  "country" VARCHAR,
  "institution" VARCHAR
);
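
As a rough illustration of how the table might be queried, the sketch below recreates the schema in an in-memory SQLite database and counts claims per source paper. It assumes that cord_uid identifies the CORD-19 paper a claim came from and that each row is one claim; neither is confirmed by the page. The rows are invented placeholders rather than real CONCORD data, the helper names are hypothetical, and SQLite stands in for whatever engine actually hosts the table.

```python
import sqlite3

# Column names follow the schema listed above.
# SQLite is an assumption used only to make the sketch runnable locally.
SCHEMA = """
CREATE TABLE concord (
  claim_uid VARCHAR,
  cord_uid VARCHAR,
  title VARCHAR,
  doi VARCHAR,
  numerical_claims VARCHAR,
  publish_time TIMESTAMP,
  authors VARCHAR,
  journal VARCHAR,
  country VARCHAR,
  institution VARCHAR
)
"""

# Placeholder rows for illustration only; NOT taken from the real dataset.
PLACEHOLDER_ROWS = [
    ("c1", "p1", "Example paper A", None, "placeholder claim 1",
     "2020-03-01 00:00:00", None, "Journal X", None, None),
    ("c2", "p1", "Example paper A", None, "placeholder claim 2",
     "2020-03-01 00:00:00", None, "Journal X", None, None),
    ("c3", "p2", "Example paper B", None, "placeholder claim 3",
     "2021-07-15 00:00:00", None, "Journal Y", None, None),
]


def build_demo_connection():
    """Create an in-memory database holding the placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO concord VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        PLACEHOLDER_ROWS,
    )
    return conn


def claims_per_paper(conn):
    """Return (cord_uid, claim_count) pairs, largest count first."""
    query = """
        SELECT cord_uid, COUNT(*) AS claim_count
        FROM concord
        GROUP BY cord_uid
        ORDER BY claim_count DESC, cord_uid
    """
    return conn.execute(query).fetchall()


conn = build_demo_connection()
result = claims_per_paper(conn)
print(result)
```

The same GROUP BY pattern should carry over to journal or country if a breakdown by those columns is more useful.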