SuperGLUE
A benchmark of difficult language understanding tasks
Sources
Hugging Face Hub: link
About this dataset
SuperGLUE is a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard.
BoolQ (Boolean Questions, Clark et al., 2019a) is a QA task where each example consists of a short passage and a yes/no question about the passage. The questions are provided anonymously and unsolicited by users of the Google search engine, and afterwards paired with a paragraph from a Wikipedia article containing the answer. Following the original work, we evaluate with accuracy.
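Accuracy here is the fraction of examples whose predicted answer matches the gold answer. A minimal sketch follows, assuming yes/no answers are encoded as 1 and 0; that encoding, the function name `accuracy`, and the sample values are illustrative assumptions, not taken from the dataset files.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical yes/no answers encoded as 1 (yes) and 0 (no).
gold = [1, 0, 1, 1]
pred = [1, 0, 0, 1]
print(accuracy(pred, gold))  # 3 of 4 match -> 0.75
```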
How to use the dataset
Research Ideas
- Train a model to perform question answering.
- Perform text classification.
- Train a model for entity recognition.
- Evaluate a model on the tasks.
- And more.
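A first step for any of these ideas is reading one of the CSV files and checking its label distribution. The sketch below uses only the Python standard library. The in-memory sample stands in for a real file such as boolq_train.csv, and its two rows and their label encoding are made up for illustration.

```python
import csv
import io
from collections import Counter
from typing import Dict, List, TextIO


def read_rows(handle: TextIO) -> List[Dict[str, str]]:
    """Read a CSV file object into a list of dicts keyed by the header row."""
    return list(csv.DictReader(handle))


def label_counts(rows: List[Dict[str, str]]) -> Counter:
    """Count how often each raw label value occurs."""
    return Counter(row["label"] for row in rows)


# Stand-in for open("boolq_train.csv", newline="", encoding="utf-8").
# Both rows and the 1/0 label encoding are invented for this example.
sample = io.StringIO(
    "question,passage,label\n"
    "is the sky blue,The sky appears blue in daylight.,1\n"
    "is ice hot,Ice is frozen water.,0\n"
)
rows = read_rows(sample)
print(rows[0]["question"])
print(label_counts(rows))
```

Note that `csv.DictReader` returns every value as a string, so numeric columns need an explicit conversion before use.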
Acknowledgements
License
> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
Columns
File: boolq_test.csv
| Column name | Description |
| --- | --- |
| question | The yes/no question to be answered. (String) |
| passage | The passage of text containing the answer to the question. (String) |
| label | The answer to the question. This is one of two values: yes (true) or no (false). (String) |
File: record_test.csv
| Column name | Description |
| --- | --- |
| passage | The passage of text containing the answer to the query. (String) |
| query | The cloze-style query to be answered, in which the missing entity is masked out. (String) |
| entities | The entities in the passage of text, which serve as candidate answers. (List of strings) |
| answers | The entities that correctly fill the query. (List of strings) |
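Because `answers` is a list, a prediction counts as correct if it matches any one of the listed strings. The sketch below shows a simplified exact-match check under that reading. The normalization (lowercasing and collapsing whitespace) is an assumption made for this example, not the official scoring procedure, and the sample answers are invented.

```python
from typing import Sequence


def normalize(text: str) -> str:
    """Lowercase the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, answers: Sequence[str]) -> bool:
    """Return True if the prediction equals any gold answer after normalization."""
    target = normalize(prediction)
    return any(target == normalize(answer) for answer in answers)


# Invented gold answers for one query.
answers = ["Barack Obama", "Obama"]
print(exact_match("obama", answers))
print(exact_match("Joe Biden", answers))
```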
File: rte_train.csv
| Column name | Description |
| --- | --- |
| label | Whether the premise entails the hypothesis. This is one of two classes: entailment or not_entailment. (String) |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges whether the premise entails it. (String) |
File: wic_test.csv
| Column name | Description |
| --- | --- |
| label | Whether the target word is used with the same sense in both sentences. This is one of two values: true or false. (String) |
| word | The target word whose sense is compared across the two sentences. (String) |
| sentence1 | The first sentence containing the target word. (String) |
| sentence2 | The second sentence containing the target word. (String) |
| start1 | The starting index of the word in the first sentence. (Integer) |
| start2 | The starting index of the word in the second sentence. (Integer) |
| end1 | The ending index of the word in the first sentence. (Integer) |
| end2 | The ending index of the word in the second sentence. (Integer) |
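The start and end columns can be used to pull the target word out of each sentence. The sketch below assumes the indices are character offsets with an exclusive end, so that Python slicing applies directly; verify this against a few rows before relying on it. The helper `extract_span` and the sample row are illustrative and not copied from the CSV.

```python
def extract_span(sentence: str, start: int, end: int) -> str:
    """Return sentence[start:end] after checking the offsets are in range."""
    if not 0 <= start <= end <= len(sentence):
        raise ValueError(f"offsets ({start}, {end}) fall outside the sentence")
    return sentence[start:end]


# Illustrative row with the same column names as the WiC files.
row = {
    "word": "place",
    "sentence1": "Do you want to come over to my place later?",
    "start1": 31,
    "end1": 36,
}
target = extract_span(row["sentence1"], row["start1"], row["end1"])
print(target)  # place
```

The extracted span may be an inflected form of `word` (for example a plural), so an equality check between the two should not be expected to hold for every row.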
File: record_validation.csv
| Column name | Description |
| --- | --- |
| passage | The passage of text containing the answer to the query. (String) |
| query | The cloze-style query to be answered, in which the missing entity is masked out. (String) |
| entities | The entities in the passage of text, which serve as candidate answers. (List of strings) |
| answers | The entities that correctly fill the query. (List of strings) |
File: wsc_validation.csv
| Column name | Description |
| --- | --- |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
File: copa_train.csv
| Column name | Description |
| --- | --- |
| premise | The premise sentence, given to the model as input. (String) |
| question | Whether the model must pick the more plausible cause or the more plausible effect of the premise. (String) |
| label | Which of the two candidate alternatives is the correct one. (String) |
File: wsc_test.csv
| Column name | Description |
| --- | --- |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
File: multirc_train.csv
| Column name | Description |
| --- | --- |
| question | The question to be answered. (String) |
| label | Whether the candidate answer to the question is correct. This is one of two values: true or false. (String) |
| paragraph | The paragraph of text containing the answer to the question. (String) |
File: cb_validation.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges how the premise relates to it. (String) |
| label | The relation between premise and hypothesis. This is one of three classes: entailment, neutral, or contradiction. (String) |
File: axg_test.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges whether the premise entails it. (String) |
| label | Whether the premise entails the hypothesis. This is one of two classes: entailment or not_entailment. (String) |
File: rte_test.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges whether the premise entails it. (String) |
| label | Whether the premise entails the hypothesis. This is one of two classes: entailment or not_entailment. (String) |
File: wic_train.csv
| Column name | Description |
| --- | --- |
| word | The target word whose sense is compared across the two sentences. (String) |
| sentence1 | The first sentence containing the target word. (String) |
| sentence2 | The second sentence containing the target word. (String) |
| start1 | The starting index of the word in the first sentence. (Integer) |
| start2 | The starting index of the word in the second sentence. (Integer) |
| end1 | The ending index of the word in the first sentence. (Integer) |
| end2 | The ending index of the word in the second sentence. (Integer) |
| label | Whether the target word is used with the same sense in both sentences. This is one of two values: true or false. (String) |
File: wsc.fixed_train.csv
| Column name | Description |
| --- | --- |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
File: boolq_train.csv
| Column name | Description |
| --- | --- |
| question | The yes/no question to be answered. (String) |
| passage | The passage of text containing the answer to the question. (String) |
| label | The answer to the question. This is one of two values: yes (true) or no (false). (String) |
File: record_train.csv
| Column name | Description |
| --- | --- |
| passage | The passage of text containing the answer to the query. (String) |
| query | The cloze-style query to be answered, in which the missing entity is masked out. (String) |
| entities | The entities in the passage of text, which serve as candidate answers. (List of strings) |
| answers | The entities that correctly fill the query. (List of strings) |
File: wsc_train.csv
| Column name | Description |
| --- | --- |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
File: cb_train.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges how the premise relates to it. (String) |
| label | The relation between premise and hypothesis. This is one of three classes: entailment, neutral, or contradiction. (String) |
File: copa_test.csv
| Column name | Description |
| --- | --- |
| premise | The premise sentence, given to the model as input. (String) |
| question | Whether the model must pick the more plausible cause or the more plausible effect of the premise. (String) |
| label | Which of the two candidate alternatives is the correct one. (String) |
File: rte_validation.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges whether the premise entails it. (String) |
| label | Whether the premise entails the hypothesis. This is one of two classes: entailment or not_entailment. (String) |
File: multirc_validation.csv
| Column name | Description |
| --- | --- |
| paragraph | The paragraph of text containing the answer to the question. (String) |
| question | The question to be answered. (String) |
| label | Whether the candidate answer to the question is correct. This is one of two values: true or false. (String) |
File: wsc.fixed_test.csv
| Column name | Description |
| --- | --- |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
File: axb_test.csv
| Column name | Description |
| --- | --- |
| sentence1 | The first sentence of the pair. (String) |
| sentence2 | The second sentence of the pair. (String) |
| label | Whether the first sentence entails the second. This is one of two classes: entailment or not_entailment. (String) |
File: wsc.fixed_validation.csv
| Column name | Description |
| --- | --- |
| text | The passage containing both spans. (String) |
| span1_text | The text of the first span, a noun phrase that is the candidate referent. (String) |
| span2_text | The text of the second span, the pronoun to be resolved. (String) |
| label | Whether the pronoun in the second span refers to the noun phrase in the first span. This is one of two values: true or false. (String) |
File: boolq_validation.csv
| Column name | Description |
| --- | --- |
| question | The yes/no question to be answered. (String) |
| passage | The passage of text containing the answer to the question. (String) |
| label | The answer to the question. This is one of two values: yes (true) or no (false). (String) |
File: cb_test.csv
| Column name | Description |
| --- | --- |
| premise | The premise text, given to the model as input. (String) |
| hypothesis | The hypothesis text, also given to the model as input; the model judges how the premise relates to it. (String) |
| label | The relation between premise and hypothesis. This is one of three classes: entailment, neutral, or contradiction. (String) |
File: multirc_test.csv
| Column name | Description |
| --- | --- |
| paragraph | The paragraph of text containing the answer to the question. (String) |
| question | The question to be answered. (String) |
| label | Whether the candidate answer to the question is correct. This is one of two values: true or false. (String) |
File: copa_validation.csv
| Column name | Description |
| --- | --- |
| premise | The premise sentence, given to the model as input. (String) |
| question | Whether the model must pick the more plausible cause or the more plausible effect of the premise. (String) |
| label | Which of the two candidate alternatives is the correct one. (String) |
File: wic_validation.csv
| Column name | Description |
| --- | --- |
| word | The target word whose sense is compared across the two sentences. (String) |
| sentence1 | The first sentence containing the target word. (String) |
| sentence2 | The second sentence containing the target word. (String) |
| start1 | The starting index of the word in the first sentence. (Integer) |
| start2 | The starting index of the word in the second sentence. (Integer) |
| end1 | The ending index of the word in the first sentence. (Integer) |
| end2 | The ending index of the word in the second sentence. (Integer) |
| label | Whether the target word is used with the same sense in both sentences. This is one of two values: true or false. (String) |