CommonsenseQA (Multiple-Choice Q&A)
12,102 questions with one correct answer and four distractor answers
Source
Hugging Face Hub: link
About this dataset
CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions, each with one correct answer and four distractor answers. The dataset is provided in two major training/validation/testing splits: the "Random split", which is the main evaluation split, and the "Question token split"; see the paper for details.
How to use the dataset
Research Ideas
- This dataset can be used to train a model to predict the correct answers to multiple-choice questions.
- This dataset can be used to evaluate the performance of different models on the CommonsenseQA dataset.
- This dataset can be used to discover new types of commonsense knowledge required to predict the correct answers to questions in the CommonsenseQA dataset.
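As a rough illustration of the evaluation idea above, the sketch below scores a model by accuracy, that is, the fraction of questions whose predicted answer key matches the gold `answerKey`. The function name `accuracy` and the letter-style keys are assumptions made for this example; the dataset description only states that `answerKey` is a string.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], gold: Sequence[str]) -> float:
    """Return the fraction of questions whose predicted key matches the gold key."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty set of questions")
    # Compare keys case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().upper() == g.strip().upper() for p, g in zip(predicted, gold)
    )
    return correct / len(gold)


# Hypothetical answer keys for illustration only (not taken from the dataset).
gold_keys = ["A", "C", "E", "B"]
model_keys = ["A", "C", "D", "B"]
print(accuracy(model_keys, gold_keys))  # 0.75
```

With one correct answer and four distractors per question, uniform random guessing would be expected to score about 0.2, which is a useful floor when reading results.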
Acknowledgements
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: validation.csv
| Column name | Description |
|---|---|
| answerKey | The correct answer to the question. (String) |
| choices | The five possible answers for each question: one correct answer and four distractors. (List of strings) |
File: train.csv
| Column name | Description |
|---|---|
| answerKey | The correct answer to the question. (String) |
| choices | The five possible answers for each question: one correct answer and four distractors. (List of strings) |
File: test.csv
| Column name | Description |
|---|---|
| answerKey | The correct answer to the question. (String) |
| choices | The five possible answers for each question: one correct answer and four distractors. (List of strings) |
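The columns described above could be read with Python's standard library along the following lines. This is only a sketch: it assumes the `choices` column is serialized as a Python-style list literal, which the description does not confirm, and the sample row is made up for illustration rather than taken from the dataset. The helper name `read_rows` is hypothetical.

```python
import ast
import csv
import io

# Made-up sample in the assumed layout; the real files may serialize
# the `choices` column differently.
SAMPLE_CSV = '''answerKey,choices
B,"['bank', 'library', 'park', 'mall', 'school']"
'''


def read_rows(handle):
    """Yield (answer_key, choices) pairs from an open CSV file handle."""
    for row in csv.DictReader(handle):
        # Assumption: `choices` is stored as a Python-style list literal.
        choices = ast.literal_eval(row["choices"])
        yield row["answerKey"], choices


rows = list(read_rows(io.StringIO(SAMPLE_CSV)))
print(rows[0])
```

To read one of the actual files, the same function can be given a handle from `open("train.csv", newline="", encoding="utf-8")`. If `choices` turns out to be stored in another format, the parsing line is the only part that should need to change.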