SQuAD2.0 by Kaggle | Other

About this Dataset

SQuAD2.0

Unanswerable questions written adversarially to look similar to answerable ones

Source

Hugging Face Hub: link

About this dataset

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
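The abstention requirement changes how a prediction is judged. The following is a minimal sketch, not the official evaluation script. It assumes that an abstention is represented as an empty-string prediction and that an unanswerable question carries an empty list of gold answers. The function names `normalize` and `exact_match` are illustrative.

```python
import re
import string
from typing import List


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """Return True when the prediction agrees with the gold annotation.

    Assumption: an empty gold_answers list marks an unanswerable question,
    so the only correct prediction is an abstention (empty string).
    """
    if not gold_answers:
        return normalize(prediction) == ""
    return any(normalize(prediction) == normalize(gold) for gold in gold_answers)
```

Under these assumptions, a system that always answers is penalized on every unanswerable question, and a system that always abstains is penalized on every answerable one.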

Research Ideas

The SQuAD dataset can be used to train a machine learning model to automatically generate answers to questions.

The SQuAD dataset can be used to train a machine learning model to automatically generate questions based on a given context.

The SQuAD dataset can be used to improve the accuracy of existing question answering systems.

Acknowledgements

The SQuAD2.0 dataset was created by the Stanford Question Answering Dataset (SQuAD) team at Stanford University.

The dataset is based on a set of Wikipedia articles. Each row provides a passage from an article, along with a human-written question about that passage and the corresponding answers.

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

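Column name	Description
id	The identifier of the question. (String)
title	The title of the Wikipedia article. (String)
context	The passage from the Wikipedia article that the question refers to. (String)
question	The question that the model will be asked. (String)
answers	The answer(s) to the question, serialized as a string; unanswerable questions have no answer text. (String)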

File: train.csv

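Column name	Description
id	The identifier of the question. (String)
title	The title of the Wikipedia article. (String)
context	The passage from the Wikipedia article that the question refers to. (String)
question	The question that the model will be asked. (String)
answers	The answer(s) to the question, serialized as a string; unanswerable questions have no answer text. (String)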
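As a rough sketch of how the columns above might be read, the following uses only the Python standard library. The `Example` class and `read_examples` function are hypothetical names. Because this description does not specify how the `answers` column is serialized, the sketch keeps it as a raw string rather than guessing at a parser. The `id` column is read defensively in case a file omits it.

```python
import csv
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Example:
    id: str
    title: str
    context: str
    question: str
    answers: str  # raw serialized value; its exact format is not assumed here


def read_examples(lines: Iterable[str]) -> Iterator[Example]:
    """Yield one Example per CSV row, using the header row for column names."""
    reader = csv.DictReader(lines)
    for row in reader:
        yield Example(
            id=row.get("id", ""),  # tolerate files without an id column
            title=row["title"],
            context=row["context"],
            question=row["question"],
            answers=row["answers"],
        )
```

To read a downloaded file, pass an open handle, for example `open("train.csv", newline="", encoding="utf-8")`; opening with `newline=""` lets the `csv` module handle line breaks inside quoted context passages.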

Tables

Train

@kaggle.thedevastator_squad2_0_a_challenge_for_question_answering_syst.train

19.99 MB
130319 rows
5 columns


CREATE TABLE train (
  "id" VARCHAR,
  "title" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answers" VARCHAR
);

Validation

@kaggle.thedevastator_squad2_0_a_challenge_for_question_answering_syst.validation

1.27 MB
11873 rows
5 columns


CREATE TABLE validation (
  "id" VARCHAR,
  "title" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answers" VARCHAR
);
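
The schema above can be tried out locally. The sketch below assumes SQLite (via Python's built-in `sqlite3` module) as a stand-in for the hosted tables, and fills the `train` table with clearly made-up placeholder rows, not real dataset content. The helper names `build_demo_db` and `questions_per_title` are illustrative.

```python
import sqlite3

# DDL copied from the train table definition above.
TRAIN_DDL = """
CREATE TABLE train (
  "id" VARCHAR,
  "title" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answers" VARCHAR
)
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding placeholder rows (not real data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(TRAIN_DDL)
    placeholder_rows = [
        ("q1", "Title A", "placeholder context", "placeholder question 1?", ""),
        ("q2", "Title A", "placeholder context", "placeholder question 2?", ""),
        ("q3", "Title B", "placeholder context", "placeholder question 3?", ""),
    ]
    conn.executemany("INSERT INTO train VALUES (?, ?, ?, ?, ?)", placeholder_rows)
    return conn


def questions_per_title(conn: sqlite3.Connection) -> list:
    """Return (title, question_count) pairs, largest count first."""
    cursor = conn.execute(
        'SELECT "title", COUNT(*) AS n FROM train '
        'GROUP BY "title" ORDER BY n DESC, "title"'
    )
    return cursor.fetchall()
```

The same `GROUP BY` query should carry over to the `validation` table, since both tables share the five-column layout.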