QuAIL - (Comprehensive Reading) by Kaggle | Other

About this Dataset

Introducing QuAIL: A Comprehensive Reading Comprehension Dataset

15K Multi-Choice Questions in 4 domains

The QuAIL dataset is a reading comprehension resource containing 15,000 multiple-choice questions across four domains: news, user stories, fiction, and blogs. The questions are balanced and annotated by question type, and each one tests a reader's comprehension of the text passage it accompanies.

How to use the dataset

To use this dataset, download the train.csv and validation.csv files from Kaggle. These files contain the training and validation data, respectively; a third, smaller file, challenge.csv, has the same columns. Each instance consists of a question, a context passage, and four possible answers. The candidate answers are stored in the 'answers' column, and the answer key is given in the 'correct_answer_id' column.

This dataset can be used to train and evaluate reading comprehension models. For example, you could use it to build a machine learning model that predicts which answer is correct for a given question-passage pair.
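Before training a model, it helps to know what a trivial predictor scores. The sketch below reads rows with Python's standard csv module and computes a majority-answer baseline from the 'correct_answer_id' column. The helper names `load_rows` and `majority_baseline` and the in-memory sample are illustrative assumptions, not part of the dataset; with the real files you would pass an open handle to train.csv or validation.csv instead.

```python
import csv
import io
from collections import Counter


def load_rows(source):
    """Read QuAIL-style rows from an open CSV file object into dicts."""
    rows = list(csv.DictReader(source))
    for row in rows:
        # correct_answer_id is declared BIGINT in the table schema.
        row["correct_answer_id"] = int(row["correct_answer_id"])
    return rows


def majority_baseline(train_rows, eval_rows):
    """Always predict the answer id that is most common in training."""
    counts = Counter(row["correct_answer_id"] for row in train_rows)
    guess = counts.most_common(1)[0][0]
    hits = sum(row["correct_answer_id"] == guess for row in eval_rows)
    return guess, hits / len(eval_rows)


# Tiny in-memory stand-in for train.csv; the values are made up.
SAMPLE = io.StringIO(
    "id,question,correct_answer_id\n"
    "a,Who spoke first?,2\n"
    "b,Why did she leave?,2\n"
    "c,What happened next?,0\n"
)
sample_rows = load_rows(SAMPLE)
guess, accuracy = majority_baseline(sample_rows, sample_rows)
print(guess, round(accuracy, 2))
```

Any model worth reporting should beat this number on validation.csv.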

Research Ideas

This dataset can be used to train a machine learning model to automatically generate multiple-choice questions from text passages.

This dataset can be used to train a machine learning model to automatically label questions by type (e.g., factual or inferential).

This dataset can be used as a benchmark for evaluating reading comprehension models on multiple-choice question answering tasks.
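Because every question carries a 'question_type' annotation, a benchmark run can report accuracy per type rather than a single overall score. The following is a minimal sketch of that breakdown; the function name `accuracy_by_type` and the type labels in the sample rows are placeholders chosen for illustration and may not match the labels in the real files.

```python
from collections import defaultdict


def accuracy_by_type(rows, predictions):
    """Return {question_type: accuracy} for parallel rows and predictions."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for row, predicted in zip(rows, predictions):
        qtype = row["question_type"]
        totals[qtype] += 1
        hits[qtype] += int(predicted == row["correct_answer_id"])
    return {qtype: hits[qtype] / totals[qtype] for qtype in totals}


# Made-up rows; "type_a" and "type_b" are placeholder labels.
rows = [
    {"question_type": "type_a", "correct_answer_id": 0},
    {"question_type": "type_a", "correct_answer_id": 3},
    {"question_type": "type_b", "correct_answer_id": 1},
]
predictions = [0, 1, 1]
report = accuracy_by_type(rows, predictions)
print(report)
```

The same function works for a per-domain breakdown if the grouping key is switched to 'domain'.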

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name	Description
domain	The domain the passage is drawn from. (String)
metadata	Metadata about the question. (JSON)
context	The text passage the question is about. (String)
question_type	The annotated type of the question. (String)
answers	The candidate answers to the question. (List of Strings)

File: train.csv

Column name	Description
domain	The domain the passage is drawn from. (String)
metadata	Metadata about the question. (JSON)
context	The text passage the question is about. (String)
question_type	The annotated type of the question. (String)
answers	The candidate answers to the question. (List of Strings)

File: challenge.csv

Column name	Description
domain	The domain the passage is drawn from. (String)
metadata	Metadata about the question. (JSON)
context	The text passage the question is about. (String)
question_type	The annotated type of the question. (String)
answers	The candidate answers to the question. (List of Strings)

Tables

Challenge

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.challenge

97.02 KB
556 rows
10 columns


CREATE TABLE challenge (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);

Train

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.train

1.56 MB
10246 rows
10 columns


CREATE TABLE train (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);

Validation

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.validation

353.38 KB
2164 rows
10 columns


CREATE TABLE validation (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);
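The description says the questions are balanced across question types and domains. One way to check this yourself is to count rows per domain and question type. The sketch below does so with Python's built-in sqlite3 module, reusing the column names from the train schema; the inserted rows and the "type_a"/"type_b" labels are made up for illustration, and SQLite is an assumption here rather than the engine Kaggle uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE train (
      "id" VARCHAR, "context_id" VARCHAR, "question_id" BIGINT,
      "domain" VARCHAR, "metadata" VARCHAR, "context" VARCHAR,
      "question" VARCHAR, "question_type" VARCHAR, "answers" VARCHAR,
      "correct_answer_id" BIGINT
    )
    """
)

# Made-up rows; only the columns the query needs are filled in.
conn.executemany(
    'INSERT INTO train ("id", "domain", "question_type") VALUES (?, ?, ?)',
    [
        ("q1", "news", "type_a"),
        ("q2", "news", "type_b"),
        ("q3", "fiction", "type_a"),
        ("q4", "fiction", "type_a"),
    ],
)

# Count questions for each (domain, question_type) pair.
balance = conn.execute(
    """
    SELECT "domain", "question_type", COUNT(*) AS n
    FROM train
    GROUP BY "domain", "question_type"
    ORDER BY "domain", "question_type"
    """
).fetchall()
for domain, question_type, n in balance:
    print(domain, question_type, n)
```

Roughly equal counts across the pairs would be consistent with the balance claim; large gaps would suggest stratifying any train/evaluation comparison.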