SciFact (Claims & Evidence-Containing Abstracts)
1.4K expert-written scientific claims paired with evidence-containing abstracts
@kaggle.thedevastator_discover_scientific_truths_with_scifact
By Huggingface Hub
The SciFact dataset contains 1.4K expert-written scientific claims, each paired with abstracts that contain evidence. Researchers across fields can use it to examine the evidence that supports or refutes each claim, which is relevant wherever scientific claims inform policy decisions or public debate. The files expose columns such as title, abstract, structured, and claim.
The SciFact dataset is an expert-written collection of 1.4K scientific claims, paired with evidence-containing abstracts, and annotated with labels and rationales. This dataset can aid researchers in exploring different aspects of scientific claims and uncovering the truth behind them.
- Autocomplete factual reports and articles: This dataset can be used to build an autocomplete system that suggests relevant pieces of evidence for a scientific claim based on the nature of the query.
- Fact checking: This dataset can be used to build models that detect false claims and warn against them by notifying readers when they are reading content that contains false information.
- Developing educational AI tools: This dataset can be used to build tools that help students interpret scientific evidence and see how experts in different fields use evidence to support or refute claims.
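The evidence-suggestion idea in the list above can be sketched as a word-overlap ranker that scores each abstract against a claim. This is a minimal baseline for illustration only, not a method from the dataset authors: the function names (`tokenize`, `overlap_score`, `rank_abstracts`) and the two toy records are hypothetical, and a real system would likely use a stronger retrieval model.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(claim, abstract):
    """Fraction of distinct claim tokens that also appear in the abstract."""
    claim_tokens = set(tokenize(claim))
    if not claim_tokens:
        return 0.0
    abstract_tokens = set(tokenize(abstract))
    return len(claim_tokens & abstract_tokens) / len(claim_tokens)

def rank_abstracts(claim, corpus, top_k=3):
    """Return up to top_k (score, doc_id) pairs, best score first."""
    scored = [(overlap_score(claim, doc["abstract"]), doc["doc_id"]) for doc in corpus]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:top_k]

# Toy records invented for this example; they are NOT rows from SciFact.
corpus = [
    {"doc_id": 1, "title": "Toy A", "abstract": "Aspirin reduces the risk of heart attack in adults."},
    {"doc_id": 2, "title": "Toy B", "abstract": "Zebrafish fins regenerate after injury."},
]
claim = "Aspirin lowers heart attack risk"

ranked = rank_abstracts(claim, corpus)
print(ranked)
```

Word overlap only measures topical similarity. It cannot tell whether an abstract supports or refutes a claim, so the label would still have to come from a separate classification step.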
If you use this dataset in your research, please credit the original authors.
Data Source
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.
File: corpus_train.csv
| Column name | Description |
|---|---|
| doc_id | Identifier of the document in the corpus. (Integer) |
| title | The title of the document the abstract belongs to. (String) |
| abstract | The abstract text, which may contain evidence that supports or refutes a claim. (String) |
| structured | Flag indicating whether the abstract is structured, that is, divided into labelled sections. (Boolean) |
File: claims_validation.csv
| Column name | Description |
|---|---|
| claim | The scientific claim being made. (String) |
File: claims_test.csv
| Column name | Description |
|---|---|
| claim | The scientific claim being made. (String) |
File: claims_train.csv
| Column name | Description |
|---|---|
| claim | The scientific claim being made. (String) |
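The claims files can be read with Python's standard `csv` module. The sketch below assumes the files carry a header row and that `evidence_doc_id` may be empty or written as a float such as `101.0` (the schema types it as DOUBLE for two of the three files). The helper names `parse_doc_id` and `read_claims` and the inline sample rows are hypothetical.

```python
import csv
import io

def parse_doc_id(raw):
    """Return the document ID as an int, or None when the field is empty."""
    raw = (raw or "").strip()
    if not raw or raw.lower() == "nan":
        return None
    # Values such as "101.0" are parsed as float first, then truncated to int.
    return int(float(raw))

def read_claims(handle):
    """Read claim rows from an open CSV handle and normalise evidence_doc_id."""
    rows = []
    for row in csv.DictReader(handle):
        row["evidence_doc_id"] = parse_doc_id(row.get("evidence_doc_id"))
        rows.append(row)
    return rows

# Inline sample invented for this example; it is NOT data from SciFact.
sample = io.StringIO(
    "id,claim,evidence_doc_id,evidence_label\n"
    "1,Toy claim with evidence,101.0,SUPPORT\n"
    "2,Toy claim without evidence,,\n"
)
claims = read_claims(sample)

# With the real file, the call would look like:
# with open("claims_train.csv", newline="", encoding="utf-8") as f:
#     claims = read_claims(f)
```

Normalising the ID to an integer up front makes it straightforward to look up the matching `doc_id` in corpus_train.csv later.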
If you use this dataset in your research, please credit Huggingface Hub.
CREATE TABLE claims_test (
"id" BIGINT,
"claim" VARCHAR,
"evidence_doc_id" VARCHAR,
"evidence_label" VARCHAR,
"evidence_sentences" VARCHAR,
"cited_doc_ids" VARCHAR
);
CREATE TABLE claims_train (
"id" BIGINT,
"claim" VARCHAR,
"evidence_doc_id" DOUBLE,
"evidence_label" VARCHAR,
"evidence_sentences" VARCHAR,
"cited_doc_ids" VARCHAR
);
CREATE TABLE claims_validation (
"id" BIGINT,
"claim" VARCHAR,
"evidence_doc_id" DOUBLE,
"evidence_label" VARCHAR,
"evidence_sentences" VARCHAR,
"cited_doc_ids" VARCHAR
);
CREATE TABLE corpus_train (
"doc_id" BIGINT,
"title" VARCHAR,
"abstract" VARCHAR,
"structured" BOOLEAN
);
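To see how the tables relate, the schema can be loaded into an in-memory SQLite database and the claims joined to the corpus. This is a sketch under two assumptions: that `evidence_doc_id` in the claims tables refers to `doc_id` in `corpus_train`, and that SQLite's type handling is close enough to the original engine for illustration. The inserted rows, including the label value and the bracketed list strings, are placeholders rather than dataset content.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table definitions copied from the schema above.
conn.executescript("""
CREATE TABLE corpus_train (
  "doc_id" BIGINT,
  "title" VARCHAR,
  "abstract" VARCHAR,
  "structured" BOOLEAN
);
CREATE TABLE claims_train (
  "id" BIGINT,
  "claim" VARCHAR,
  "evidence_doc_id" DOUBLE,
  "evidence_label" VARCHAR,
  "evidence_sentences" VARCHAR,
  "cited_doc_ids" VARCHAR
);
""")

# Placeholder rows invented for this example; they are NOT rows from SciFact.
conn.execute(
    "INSERT INTO corpus_train VALUES (?, ?, ?, ?)",
    (101, "Toy title", "Toy abstract text.", False),
)
conn.executemany(
    "INSERT INTO claims_train VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "Toy claim with evidence", 101.0, "SUPPORT", "[0]", "[101]"),
        (2, "Toy claim without evidence", None, None, None, "[101]"),
    ],
)

# LEFT JOIN keeps claims that have no evidence document (NULL evidence_doc_id).
rows = conn.execute("""
SELECT c.id, c.claim, c.evidence_label, d.title
FROM claims_train AS c
LEFT JOIN corpus_train AS d
  ON CAST(c.evidence_doc_id AS INTEGER) = d.doc_id
ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

The cast is there because `evidence_doc_id` is typed DOUBLE in `claims_train` and `claims_validation` while `doc_id` is BIGINT. In `claims_test` the same column is VARCHAR, so the join condition may need adjusting for that table.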