Baselight

Cosmos QA (Commonsense QA)

Pushing Commonsense Reasoning to the Next Level

@kaggle.thedevastator_cosmos_qa_a_large_scale_commonsense_based_readin

About this Dataset

Source

Huggingface Hub: link

About this dataset

The Cosmos QA dataset is a large-scale dataset of 35.6K problems that require commonsense-based reading comprehension, formulated as multiple-choice questions. The dataset focuses on reading between the lines over a diverse collection of people's everyday narratives, asking questions about the likely causes or effects of events that require reasoning beyond the exact text spans in the context.

Because the answers cannot be read directly from the text, the dataset allows more sophisticated models to be built and evaluated, which could lead to better performance on real-world tasks.

How to use the dataset

To use the Cosmos QA dataset, first download the data files from the Kaggle website. Then unzip them and place them in a directory on your computer.

Once the data files are in place, you can begin using the dataset for commonsense-based reading comprehension tasks. The data is provided as CSV files (train.csv, validation.csv and test.csv), so the first step is to open one of them in a spreadsheet program or load it with a data-analysis tool. Each row holds one problem: a context, a question about that context, four answer options and a label.

For each row, read through the context and the question to determine what type of answer would be most appropriate. After carefully reading the context, look at each of the answer choices and select the one that best fits what you have read.
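
The steps above can also be carried out programmatically. The following is a minimal sketch in Python using only the standard library. It assumes the CSV files use the column names shown in the table schemas on this page (id, context, question, answer0 to answer3, label) and that label is the index of the correct option. The helper names load_rows and format_example are hypothetical, and the sample row is invented for illustration.

```python
import csv
import io

ANSWER_COLUMNS = ["answer0", "answer1", "answer2", "answer3"]


def load_rows(file_obj):
    """Read an open CSV file into a list of dicts keyed by column name."""
    return list(csv.DictReader(file_obj))


def format_example(row):
    """Render one row as a multiple-choice prompt; also return the gold answer text."""
    lines = [f"Context: {row['context']}", f"Question: {row['question']}"]
    for i, col in enumerate(ANSWER_COLUMNS):
        lines.append(f"  ({i}) {row[col]}")
    label = int(row["label"])
    # Assumption: label indexes answer0..answer3. Any other value
    # (for example a hidden test label) yields no gold answer.
    gold = row[ANSWER_COLUMNS[label]] if 0 <= label < len(ANSWER_COLUMNS) else None
    return "\n".join(lines), gold


# Invented sample standing in for the contents of train.csv.
SAMPLE_CSV = """id,context,question,answer0,answer1,answer2,answer3,label
ex1,"I forgot my umbrella and got soaked.",Why did the narrator get wet?,It was raining.,They went swimming.,They spilled water.,None of the above choices .,0
"""

rows = load_rows(io.StringIO(SAMPLE_CSV))
prompt, gold = format_example(rows[0])
print(prompt)
print("Gold answer:", gold)
```

To read a real file, replace io.StringIO(SAMPLE_CSV) with open("train.csv", newline="", encoding="utf-8"), adjusting the path to wherever you unzipped the download. The UTF-8 encoding is an assumption.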

Research Ideas

  • This dataset can be used to develop and evaluate commonsense-based reading comprehension models.
  • This dataset can be used to improve and customize question answering systems for educational or customer service applications.
  • This dataset can be used to study how human beings process and understand narratives, in order to better design artificial intelligence systems that can do the same.
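
As a rough illustration of the first idea, evaluating a model on this dataset usually comes down to comparing predicted option indices with the label column. The sketch below computes accuracy and a majority-class baseline. The function names and the label lists are invented for illustration, and a real evaluation would read the labels from train.csv and validation.csv.

```python
from collections import Counter


def accuracy(predictions, labels):
    """Fraction of predictions that equal the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)


def majority_baseline(train_labels, n):
    """Predict the most frequent training label for each of n examples."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n


# Invented labels; each value is the index of the correct answer option.
train_labels = [0, 1, 1, 2, 1, 3]
validation_labels = [1, 0, 1, 2]

preds = majority_baseline(train_labels, len(validation_labels))
print(accuracy(preds, validation_labels))  # 0.5
```

A trained model should beat this baseline by a clear margin; with four answer options, uniform random guessing is expected to score about 0.25.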

Acknowledgements

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name    Description
context        The context of the question. (String)
answer0        The first answer option. (String)
answer1        The second answer option. (String)
answer2        The third answer option. (String)
answer3        The fourth answer option. (String)
label          The correct answer to the question, given as the index of the answer option. (Integer)

File: train.csv

Column name    Description
context        The context of the question. (String)
answer0        The first answer option. (String)
answer1        The second answer option. (String)
answer2        The third answer option. (String)
answer3        The fourth answer option. (String)
label          The correct answer to the question, given as the index of the answer option. (Integer)

File: test.csv

Column name    Description
context        The context of the question. (String)
answer0        The first answer option. (String)
answer1        The second answer option. (String)
answer2        The third answer option. (String)
answer3        The fourth answer option. (String)
label          The correct answer to the question, given as the index of the answer option. (Integer)

Tables

Test

@kaggle.thedevastator_cosmos_qa_a_large_scale_commonsense_based_readin.test
  • 2.72 MB
  • 6963 rows
  • 8 columns

CREATE TABLE test (
  "id" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answer0" VARCHAR,
  "answer1" VARCHAR,
  "answer2" VARCHAR,
  "answer3" VARCHAR,
  "label" BIGINT
);

Train

@kaggle.thedevastator_cosmos_qa_a_large_scale_commonsense_based_readin.train
  • 7.62 MB
  • 25262 rows
  • 8 columns

CREATE TABLE train (
  "id" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answer0" VARCHAR,
  "answer1" VARCHAR,
  "answer2" VARCHAR,
  "answer3" VARCHAR,
  "label" BIGINT
);

Validation

@kaggle.thedevastator_cosmos_qa_a_large_scale_commonsense_based_readin.validation
  • 1.17 MB
  • 2985 rows
  • 8 columns

CREATE TABLE validation (
  "id" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answer0" VARCHAR,
  "answer1" VARCHAR,
  "answer2" VARCHAR,
  "answer3" VARCHAR,
  "label" BIGINT
);
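
The three tables share one schema, so the same query works on each of them. As a sketch, the Python snippet below recreates the validation schema in an in-memory SQLite database and counts rows per label. SQLite is an assumption made for illustration, since this page does not say which SQL engine hosts the tables, and the inserted rows are invented placeholders.

```python
import sqlite3

# Schema copied from the validation table definition above.
SCHEMA = """
CREATE TABLE validation (
  "id" VARCHAR,
  "context" VARCHAR,
  "question" VARCHAR,
  "answer0" VARCHAR,
  "answer1" VARCHAR,
  "answer2" VARCHAR,
  "answer3" VARCHAR,
  "label" BIGINT
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Invented placeholder rows, not taken from the dataset.
sample_rows = [
    ("ex1", "ctx a", "q a", "a", "b", "c", "d", 0),
    ("ex2", "ctx b", "q b", "a", "b", "c", "d", 2),
    ("ex3", "ctx c", "q c", "a", "b", "c", "d", 2),
]
conn.executemany(
    "INSERT INTO validation VALUES (?, ?, ?, ?, ?, ?, ?, ?)", sample_rows
)

# Count how many rows carry each label value.
label_counts = conn.execute(
    "SELECT label, COUNT(*) FROM validation GROUP BY label ORDER BY label"
).fetchall()
print(label_counts)  # [(0, 1), (2, 2)]
conn.close()
```

A label distribution like this is a quick check that the answer positions are reasonably balanced before training a model.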
