Baselight

SciFact (Claims & Evidence-Containing Abstracts)

1.4K expert-written scientific claims paired with evidence-containing abstracts

@kaggle.thedevastator_discover_scientific_truths_with_scifact

About this Dataset

By Huggingface Hub [source]


SciFact is an expert-curated dataset of 1.4K scientific claims, each paired with evidence-containing abstracts. Researchers across fields can use it to analyze the evidence that supports or refutes each claim and to consider what the findings imply for policy decisions and public debate. The title, abstract, and structured columns describe the source abstracts, while the claim column holds the claims themselves.

How to use the dataset

The SciFact dataset is an expert-written collection of 1.4K scientific claims, paired with evidence-containing abstracts, and annotated with labels and rationales. This dataset can aid researchers in exploring different aspects of scientific claims and uncovering the truth behind them.
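
As a first look at the labels, the sketch below counts how often each evidence_label value occurs in a claims file. It uses only the Python standard library. The column names come from the table schemas on this page; the helper name `label_counts`, the sample rows, and the label values shown in them are hypothetical placeholders, not actual dataset content.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file):
    """Count how often each evidence_label value occurs in a claims CSV.

    `csv_file` is any open text file or file-like object with a header row.
    """
    reader = csv.DictReader(csv_file)
    # Rows with an empty label are grouped under a placeholder key.
    return Counter(row["evidence_label"] or "(no label)" for row in reader)


# Hypothetical sample rows that only mimic the column layout of the claims files.
SAMPLE = """id,claim,evidence_doc_id,evidence_label,evidence_sentences,cited_doc_ids
1,Claim A,101,SUPPORT,[0],[101]
2,Claim B,102,CONTRADICT,[1],[102]
3,Claim C,,,[],[103]
"""

counts = label_counts(io.StringIO(SAMPLE))
print(counts)
```

With a downloaded copy of the data, the same helper could be called on a real file, for example `with open("claims_train.csv", newline="", encoding="utf-8") as f: label_counts(f)`, assuming the file sits in the working directory.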

Research Ideas

  • Autocomplete factual reports and articles: This dataset can be used to build an autocomplete system that suggests relevant pieces of evidence for a scientific claim based on the nature of the query.
  • Fact checking: This dataset can be used to build models that detect false claims and warn against them by notifying readers when they are reading content that contains false information.
  • Developing educational AI tools: This dataset can be used to develop AI tools that help students interpret scientific evidence and see how experts in different fields use evidence to support or refute claims.
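
The evidence-suggestion idea above can be sketched with a very simple lexical baseline: rank abstracts by word overlap (Jaccard similarity) with a claim. This is only an illustrative starting point, not the method used by the dataset authors. The function names, the `top_k` parameter, and the sample texts are all hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rank_abstracts(claim, docs, top_k=3):
    """Rank abstracts by Jaccard overlap with the claim.

    `docs` maps a document id to its abstract text. Returns a list of
    (doc_id, score) pairs, best match first.
    """
    claim_tokens = tokenize(claim)
    scored = []
    for doc_id, abstract in docs.items():
        doc_tokens = tokenize(abstract)
        union = claim_tokens | doc_tokens
        score = len(claim_tokens & doc_tokens) / len(union) if union else 0.0
        scored.append((doc_id, score))
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


# Hypothetical abstracts and claim, not taken from the dataset.
docs = {
    1: "Aspirin reduces the risk of heart attack in adults.",
    2: "Zebrafish fins regenerate after injury.",
    3: "Vitamin D levels and bone density in adults.",
}
ranked = rank_abstracts("Aspirin lowers heart attack risk.", docs)
print(ranked)
```

A stronger system would likely replace the overlap score with TF-IDF or a learned retriever, but the ranking interface could stay the same.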

Acknowledgements

If you use this dataset in your research, please credit the original authors.
Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: corpus_train.csv

Column name Description
title The title of the paper the abstract comes from. (String)
abstract The paper's abstract, which may contain evidence that supports or refutes a claim. (String)
structured Whether the abstract is structured, that is, divided into labeled sections. (Boolean)

File: claims_validation.csv

Column name Description
claim The scientific claim being made. (String)

File: claims_test.csv

Column name Description
claim The scientific claim being made. (String)

File: claims_train.csv

Column name Description
claim The scientific claim being made. (String)

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Tables

Claims Test

@kaggle.thedevastator_discover_scientific_truths_with_scifact.claims_test
  • 22.72 KB
  • 300 rows
  • 6 columns

CREATE TABLE claims_test (
  "id" BIGINT,
  "claim" VARCHAR,
  "evidence_doc_id" VARCHAR,
  "evidence_label" VARCHAR,
  "evidence_sentences" VARCHAR,
  "cited_doc_ids" VARCHAR
);

Claims Train

@kaggle.thedevastator_discover_scientific_truths_with_scifact.claims_train
  • 61.22 KB
  • 1261 rows
  • 6 columns

CREATE TABLE claims_train (
  "id" BIGINT,
  "claim" VARCHAR,
  "evidence_doc_id" DOUBLE,
  "evidence_label" VARCHAR,
  "evidence_sentences" VARCHAR,
  "cited_doc_ids" VARCHAR
);

Claims Validation

@kaggle.thedevastator_discover_scientific_truths_with_scifact.claims_validation
  • 29.93 KB
  • 450 rows
  • 6 columns

CREATE TABLE claims_validation (
  "id" BIGINT,
  "claim" VARCHAR,
  "evidence_doc_id" DOUBLE,
  "evidence_label" VARCHAR,
  "evidence_sentences" VARCHAR,
  "cited_doc_ids" VARCHAR
);

Corpus Train

@kaggle.thedevastator_discover_scientific_truths_with_scifact.corpus_train
  • 4.37 MB
  • 5183 rows
  • 4 columns

CREATE TABLE corpus_train (
  "doc_id" BIGINT,
  "title" VARCHAR,
  "abstract" VARCHAR,
  "structured" BOOLEAN
);
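
The schemas above suggest that a claim can be linked to its evidence abstract by matching evidence_doc_id against doc_id. The sketch below tries this join in an in-memory SQLite database from Python's standard library. SQLite is only a stand-in here and is not necessarily the engine behind these tables. The inserted rows are hypothetical, and the cast reflects that evidence_doc_id is declared DOUBLE in claims_train while doc_id is BIGINT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table definitions copied from the schemas on this page.
conn.executescript("""
CREATE TABLE claims_train (
  "id" BIGINT,
  "claim" VARCHAR,
  "evidence_doc_id" DOUBLE,
  "evidence_label" VARCHAR,
  "evidence_sentences" VARCHAR,
  "cited_doc_ids" VARCHAR
);
CREATE TABLE corpus_train (
  "doc_id" BIGINT,
  "title" VARCHAR,
  "abstract" VARCHAR,
  "structured" BOOLEAN
);
""")

# Hypothetical rows for illustration only.
conn.executemany(
    "INSERT INTO claims_train VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Claim A", 101.0, "SUPPORT", "[0]", "[101]"),
        (2, "Claim B", None, "", "[]", "[102]"),
    ],
)
conn.executemany(
    "INSERT INTO corpus_train VALUES (?, ?, ?, ?)",
    [
        (101, "Paper one", "Abstract one.", 0),
        (102, "Paper two", "Abstract two.", 1),
    ],
)

# The inner join drops claims that have no evidence document (NULL id).
JOIN_SQL = """
SELECT c.id, c.claim, c.evidence_label, d.title
FROM claims_train AS c
JOIN corpus_train AS d
  ON d.doc_id = CAST(c.evidence_doc_id AS BIGINT)
ORDER BY c.id
"""
rows = conn.execute(JOIN_SQL).fetchall()
print(rows)
```

Note that claims_test declares evidence_doc_id as VARCHAR rather than DOUBLE, so a join against that table would need its own handling of the column type.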
