Baselight

QuAIL - (Comprehensive Reading)

15K Multi-Choice Questions in 4 domains

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre


Introducing QuAIL: A Comprehensive Reading Comprehension Dataset

About this dataset

The QuAIL dataset contains 15,000 multiple-choice reading comprehension questions across four domains: news, user stories, fiction, and blogs. The questions are balanced across question types and annotated with them, and each one tests comprehension of an accompanying text passage.

How to use the dataset

To use this dataset, download the train.csv and validation.csv files from Kaggle. These files contain the training and validation data, respectively; a third file, challenge.csv, holds a smaller challenge set. Each instance consists of a question, a context passage, and four possible answers. The candidate answers are stored in the 'answers' column, and the answer key is given in the 'correct_answer_id' column.
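As a rough illustration of reading one of these files, the sketch below parses rows with Python's standard csv module and looks up the correct answer text. The two sample rows are invented, and two things are assumptions rather than documented facts: that the 'answers' field is serialized as a Python-style list literal, and that 'correct_answer_id' (a column named in the table schemas on this page) is a 0-based index into that list. Check a few real rows before relying on either.

```python
import ast
import csv
import io

# Hypothetical two-row sample shaped like train.csv. For the real file, pass
# open("train.csv", newline="", encoding="utf-8") to load_examples instead.
SAMPLE_CSV = '''id,context_id,question_id,domain,metadata,context,question,question_type,answers,correct_answer_id
ex_0,ctx_0,0,news,{},"The council met on Monday.","When did the council meet?",Factual,"['not enough information', 'Monday', 'Friday', 'Sunday']",1
ex_1,ctx_0,1,news,{},"The council met on Monday.","Who met on Monday?",Factual,"['the council', 'the mayor', 'not enough information', 'nobody']",0
'''


def load_examples(handle):
    """Read QuAIL-style rows and decode the answers and label columns."""
    examples = []
    for row in csv.DictReader(handle):
        # Assumption: answers is stored as a Python-style list literal.
        row["answers"] = ast.literal_eval(row["answers"])
        # Assumption: correct_answer_id is a 0-based index into answers.
        row["correct_answer_id"] = int(row["correct_answer_id"])
        examples.append(row)
    return examples


examples = load_examples(io.StringIO(SAMPLE_CSV))
first = examples[0]
gold = first["answers"][first["correct_answer_id"]]
print(first["question"], "->", gold)
```

If ast.literal_eval raises on real rows, the list is probably serialized differently (for example without commas between items) and will need its own parser.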

This dataset can be used to train and evaluate reading comprehension models. For example, you could build a machine learning model that predicts which of the four answers is correct for a given question-passage pair.
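Before training anything, it can help to know what a trivial predictor scores. The sketch below computes plain accuracy against an answer key and compares two naive baselines. The gold labels are made up for illustration, and the function names are hypothetical helpers, not part of any QuAIL tooling.

```python
import random


def accuracy(predictions, gold_ids):
    """Fraction of questions where the predicted option index matches the key."""
    if len(predictions) != len(gold_ids):
        raise ValueError("predictions and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    correct = sum(p == g for p, g in zip(predictions, gold_ids))
    return correct / len(gold_ids)


def random_baseline(num_questions, num_options=4, seed=0):
    """Pick one option index per question uniformly at random."""
    rng = random.Random(seed)
    return [rng.randrange(num_options) for _ in range(num_questions)]


# Hypothetical gold labels standing in for an answer-key column.
gold_ids = [1, 0, 3, 2, 2, 0, 1, 3]
always_first = [0] * len(gold_ids)

print("always-first accuracy:", accuracy(always_first, gold_ids))
print("random accuracy:", accuracy(random_baseline(len(gold_ids)), gold_ids))
```

With four options per question, a model should clear roughly 25% accuracy before its predictions mean much.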

Research Ideas

  • This dataset can be used to train a machine learning model to automatically generate multiple-choice questions from text passages.

  • This dataset can be used to train a machine learning model to automatically label questions by type (e.g., factual, inferential, etc.).

  • This dataset can be used as a benchmark for evaluating reading comprehension models on multiple-choice question answering tasks.
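Because the questions are annotated by type and domain, a benchmark run can report accuracy per group rather than a single number. A minimal sketch of that breakdown follows; the rows, the question type names, and the predictions are all invented for illustration, and the answer key is assumed to be an integer option index.

```python
from collections import defaultdict


def accuracy_by_group(rows, predictions, key="question_type"):
    """Return {group value: accuracy} for rows grouped by the given column."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for row, pred in zip(rows, predictions):
        group = row[key]
        totals[group] += 1
        hits[group] += int(pred == row["correct_answer_id"])
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical rows; the type labels are placeholders, not the real annotation set.
rows = [
    {"question_type": "Factual", "domain": "news", "correct_answer_id": 1},
    {"question_type": "Factual", "domain": "fiction", "correct_answer_id": 0},
    {"question_type": "Causality", "domain": "news", "correct_answer_id": 2},
    {"question_type": "Causality", "domain": "blogs", "correct_answer_id": 3},
]
predictions = [1, 2, 2, 3]

by_type = accuracy_by_group(rows, predictions)
by_domain = accuracy_by_group(rows, predictions, key="domain")
print(by_type)
print(by_domain)
```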

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name      Description
domain           The domain of the question. (String)
metadata         Metadata about the question. (JSON)
context          The context of the question. (String)
question_type    The type of the question. (String)
answers          The answers to the question. (List of Strings)

File: train.csv

Column name      Description
domain           The domain of the question. (String)
metadata         Metadata about the question. (JSON)
context          The context of the question. (String)
question_type    The type of the question. (String)
answers          The answers to the question. (List of Strings)

File: challenge.csv

Column name      Description
domain           The domain of the question. (String)
metadata         Metadata about the question. (JSON)
context          The context of the question. (String)
question_type    The type of the question. (String)
answers          The answers to the question. (List of Strings)

Tables

Challenge

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.challenge
  • 97.02 KB
  • 556 rows
  • 10 columns

CREATE TABLE challenge (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);

Train

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.train
  • 1.56 MB
  • 10246 rows
  • 10 columns

CREATE TABLE train (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);

Validation

@kaggle.thedevastator_introducing_quail_a_comprehensive_reading_compre.validation
  • 353.38 KB
  • 2164 rows
  • 10 columns

CREATE TABLE validation (
  "id" VARCHAR,
  "context_id" VARCHAR,
  "question_id" BIGINT,
  "domain" VARCHAR,
  "metadata" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "question_type" VARCHAR,
  "answers" VARCHAR,
  "correct_answer_id" BIGINT
);
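To try the table schemas locally, the sketch below recreates the train table in an in-memory SQLite database and counts questions and distinct passages per domain. SQLite is only a stand-in here, an assumption and not necessarily the engine that hosts these tables; it accepts the VARCHAR and BIGINT type names through its type-affinity rules. The inserted rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same column list as the train table shown above.
conn.execute("""
    CREATE TABLE train (
      "id" VARCHAR,
      "context_id" VARCHAR,
      "question_id" BIGINT,
      "domain" VARCHAR,
      "metadata" VARCHAR,
      "context" VARCHAR,
      "question" VARCHAR,
      "question_type" VARCHAR,
      "answers" VARCHAR,
      "correct_answer_id" BIGINT
    )
""")

# Hypothetical rows: two questions on one news passage, one on a fiction passage.
sample_rows = [
    ("ex_0", "ctx_0", 0, "news", "{}", "passage A", "q0?", "Factual", "['a', 'b', 'c', 'd']", 1),
    ("ex_1", "ctx_0", 1, "news", "{}", "passage A", "q1?", "Factual", "['a', 'b', 'c', 'd']", 0),
    ("ex_2", "ctx_1", 0, "fiction", "{}", "passage B", "q2?", "Causality", "['a', 'b', 'c', 'd']", 3),
]
conn.executemany("INSERT INTO train VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", sample_rows)

# Questions and distinct passages per domain.
per_domain = conn.execute(
    'SELECT "domain", COUNT(*), COUNT(DISTINCT "context_id") '
    'FROM train GROUP BY "domain" ORDER BY "domain"'
).fetchall()
print(per_domain)
```

Counting distinct context_id values alongside rows shows how many questions share a passage, which matters if you re-split the data and want to keep all questions for a passage in the same split.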
