Baselight

SuperGLUE

A benchmark of difficult, task-specific language understanding tasks

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat


About this Dataset


Sources

Hugging Face Hub: link

About this dataset

SuperGLUE is a new benchmark styled after GLUE with a new set of more difficult language understanding tasks, improved resources, and a new public leaderboard.

BoolQ (Boolean Questions, Clark et al., 2019a) is a QA task where each example consists of a short passage and a yes/no question about the passage. The questions are provided anonymously and unsolicited by users of the Google search engine, and afterwards paired with a paragraph from a Wikipedia article containing the answer. Following the original work, we evaluate with accuracy.
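Accuracy, the BoolQ metric mentioned above, is simply the share of examples whose predicted answer matches the gold answer. The following is a minimal sketch; the function name `accuracy` and the sample lists are illustrative assumptions, not part of the dataset or of any official evaluation script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of predictions that exactly match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Hypothetical gold answers and model outputs (1 = yes, 0 = no).
gold = [1, 0, 1, 1]
pred = [1, 0, 0, 1]
print(accuracy(pred, gold))  # 3 of 4 predictions match
```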

How to use the dataset

Research Ideas

  • Train a model to perform question answering.
  • Perform text classification.
  • Train a model for entity recognition.
  • Evaluate a model on the tasks.
  • And more.

Acknowledgements

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: boolq_test.csv

Column name Description
question The question to be answered. (String)
passage The passage of text containing the answer to the question. (String)
label The answer to the question, stored as an integer class id: 0 = False (no), 1 = True (yes). Test-set answers are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: record_test.csv

Column name Description
passage The passage of text containing the answer to the question. (String)
query The question to be answered. (String)
entities The entities in the passage of text. (List of strings)
answers The answers to the question. (List of strings)

File: rte_train.csv

Column name Description
label Whether the premise entails the hypothesis: 0 = entailment, 1 = not_entailment. (Integer)
premise The source text against which the hypothesis is judged. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)

File: wic_test.csv

Column name Description
label Whether the word is used with the same sense in both sentences: 0 = False, 1 = True. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)
word The word in the question. (String)
sentence1 The first sentence in the question. (String)
sentence2 The second sentence in the question. (String)
start1 The starting index of the word in the first sentence. (Integer)
start2 The starting index of the word in the second sentence. (Integer)
end1 The ending index of the word in the first sentence. (Integer)
end2 The ending index of the word in the second sentence. (Integer)
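The start and end columns can be used to pull the target word out of each sentence. The sketch below assumes that the indices are character offsets and that the end index is exclusive, which is worth checking against a few real rows before relying on it. The row shown is hypothetical, and `target_span` is an illustrative helper name.

```python
def target_span(sentence: str, start: int, end: int) -> str:
    """Return the substring at [start, end), assuming character offsets."""
    return sentence[start:end]


# Hypothetical row in the shape of the WiC tables (not taken from the data).
row = {
    "word": "bank",
    "sentence1": "The bank raised its rates.",
    "sentence2": "We sat on the bank of the river.",
    "start1": 4, "end1": 8,
    "start2": 14, "end2": 18,
}

span1 = target_span(row["sentence1"], row["start1"], row["end1"])
span2 = target_span(row["sentence2"], row["start2"], row["end2"])
print(span1, span2)
```

If the extracted spans do not line up with the word column on real rows, the offsets probably follow a different convention (for example token indices), and the slicing should be adapted.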

File: record_validation.csv

Column name Description
passage The passage of text containing the answer to the question. (String)
query The question to be answered. (String)
entities The entities in the passage of text. (List of strings)
answers The answers to the question. (List of strings)

File: wsc_validation.csv

Column name Description
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. (Integer)
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)

File: copa_train.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
question The type of relation asked about: whether the correct choice is the cause or the effect of the premise. (String)
label The index of the correct alternative: 0 = choice1, 1 = choice2. (Integer)

File: wsc_test.csv

Column name Description
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: multirc_train.csv

Column name Description
question The question to be answered. (String)
label Whether the candidate answer (the answer column) correctly answers the question: 0 = False, 1 = True. (Integer)
paragraph The paragraph of text containing the answer to the question. (String)

File: cb_validation.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label The relation between premise and hypothesis: 0 = entailment, 1 = contradiction, 2 = neutral. (Integer)

File: axg_test.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label Whether the premise entails the hypothesis: 0 = entailment, 1 = not_entailment. (Integer)

File: rte_test.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label Whether the premise entails the hypothesis: 0 = entailment, 1 = not_entailment. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: wic_train.csv

Column name Description
word The word in the question. (String)
sentence1 The first sentence in the question. (String)
sentence2 The second sentence in the question. (String)
start1 The starting index of the word in the first sentence. (Integer)
start2 The starting index of the word in the second sentence. (Integer)
end1 The ending index of the word in the first sentence. (Integer)
end2 The ending index of the word in the second sentence. (Integer)
label Whether the word is used with the same sense in both sentences: 0 = False, 1 = True. (Integer)

File: wsc.fixed_train.csv

Column name Description
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. (Integer)

File: boolq_train.csv

Column name Description
question The question to be answered. (String)
passage The passage of text containing the answer to the question. (String)
label The answer to the question, stored as an integer class id: 0 = False (no), 1 = True (yes). (Integer)

File: record_train.csv

Column name Description
passage The passage of text containing the answer to the question. (String)
query The question to be answered. (String)
entities The entities in the passage of text. (List of strings)
answers The answers to the question. (List of strings)

File: wsc_train.csv

Column name Description
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. (Integer)

File: cb_train.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label The relation between premise and hypothesis: 0 = entailment, 1 = contradiction, 2 = neutral. (Integer)

File: copa_test.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
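question The type of relation asked about: whether the correct choice is the cause or the effect of the premise. (String)
label The index of the correct alternative: 0 = choice1, 1 = choice2. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)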

File: rte_validation.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label Whether the premise entails the hypothesis: 0 = entailment, 1 = not_entailment. (Integer)

File: multirc_validation.csv

Column name Description
paragraph The paragraph of text containing the answer to the question. (String)
question The question to be answered. (String)
label Whether the candidate answer (the answer column) correctly answers the question: 0 = False, 1 = True. (Integer)

File: wsc.fixed_test.csv

Column name Description
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: axb_test.csv

Column name Description
sentence1 The first sentence in the question. (String)
sentence2 The second sentence in the question. (String)
label Whether sentence1 entails sentence2: 0 = entailment, 1 = not_entailment. (Integer)

File: wsc.fixed_validation.csv

Column name Description
text The text of the question. (String)
span1_text The text of the first span. (String)
span2_text The text of the second span. (String)
label Whether the pronoun in span2_text refers to the noun phrase in span1_text: 0 = False, 1 = True. (Integer)

File: boolq_validation.csv

Column name Description
question The question to be answered. (String)
passage The passage of text containing the answer to the question. (String)
label The answer to the question, stored as an integer class id: 0 = False (no), 1 = True (yes). (Integer)

File: cb_test.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
hypothesis The statement to be judged against the premise. It is given to the model as input alongside the premise, not generated. (String)
label The relation between premise and hypothesis: 0 = entailment, 1 = contradiction, 2 = neutral. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: multirc_test.csv

Column name Description
paragraph The paragraph of text containing the answer to the question. (String)
question The question to be answered. (String)
label Whether the candidate answer (the answer column) correctly answers the question: 0 = False, 1 = True. Test-set labels are withheld in the benchmark, so this column may hold only a placeholder. (Integer)

File: copa_validation.csv

Column name Description
premise The premise of the question. This is the text that the model will be given as input. (String)
question The type of relation asked about: whether the correct choice is the cause or the effect of the premise. (String)
label The index of the correct alternative: 0 = choice1, 1 = choice2. (Integer)

File: wic_validation.csv

Column name Description
word The word in the question. (String)
sentence1 The first sentence in the question. (String)
sentence2 The second sentence in the question. (String)
start1 The starting index of the word in the first sentence. (Integer)
start2 The starting index of the word in the second sentence. (Integer)
end1 The ending index of the word in the first sentence. (Integer)
end2 The ending index of the word in the second sentence. (Integer)
label Whether the word is used with the same sense in both sentences: 0 = False, 1 = True. (Integer)

Tables

Axb Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.axb_test
  • 78.04 KB
  • 1104 rows
  • 4 columns

CREATE TABLE axb_test (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Axg Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.axg_test
  • 15.03 KB
  • 356 rows
  • 4 columns

CREATE TABLE axg_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Boolq Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.boolq_test
  • 1.23 MB
  • 3245 rows
  • 4 columns

CREATE TABLE boolq_test (
  "question" VARCHAR,
  "passage" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Boolq Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.boolq_train
  • 3.69 MB
  • 9427 rows
  • 4 columns

CREATE TABLE boolq_train (
  "question" VARCHAR,
  "passage" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);
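As a rough illustration of how a table with this schema can be queried, the sketch below recreates `boolq_train` in an in-memory SQLite database and counts rows per label. SQLite stands in for the hosted query engine here, and the three inserted rows are hypothetical; only the table and column names come from the schema above.

```python
import sqlite3

# In-memory stand-in for the hosted table; same column names and types.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE boolq_train ('
    '"question" VARCHAR, "passage" VARCHAR, "idx" BIGINT, "label" BIGINT)'
)

# Hypothetical rows, for illustration only.
rows = [
    ("is the sky blue", "The sky appears blue during the day.", 0, 1),
    ("do fish have lungs", "Most fish breathe through gills.", 1, 0),
    ("is water wet", "Water is a liquid at room temperature.", 2, 1),
]
conn.executemany("INSERT INTO boolq_train VALUES (?, ?, ?, ?)", rows)

# Count examples per label to check class balance.
query = (
    'SELECT "label", COUNT(*) FROM boolq_train '
    'GROUP BY "label" ORDER BY "label"'
)
label_counts = dict(conn.execute(query).fetchall())
print(label_counts)
conn.close()
```

The same GROUP BY query should carry over to the other label-bearing tables by swapping the table name.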

Boolq Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.boolq_validation
  • 1.23 MB
  • 3270 rows
  • 4 columns

CREATE TABLE boolq_validation (
  "question" VARCHAR,
  "passage" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Cb Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.cb_test
  • 63.06 KB
  • 250 rows
  • 4 columns

CREATE TABLE cb_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Cb Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.cb_train
  • 57.7 KB
  • 250 rows
  • 4 columns

CREATE TABLE cb_train (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Cb Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.cb_validation
  • 18.66 KB
  • 56 rows
  • 4 columns

CREATE TABLE cb_validation (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Copa Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.copa_test
  • 40.68 KB
  • 500 rows
  • 6 columns

CREATE TABLE copa_test (
  "premise" VARCHAR,
  "choice1" VARCHAR,
  "choice2" VARCHAR,
  "question" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Copa Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.copa_train
  • 34.53 KB
  • 400 rows
  • 6 columns

CREATE TABLE copa_train (
  "premise" VARCHAR,
  "choice1" VARCHAR,
  "choice2" VARCHAR,
  "question" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);
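One way to use the COPA columns is to turn each row into a two-way multiple-choice prompt. The sketch below assumes the `question` column holds the string "cause" or "effect" and that `label` is 0 for `choice1` and 1 for `choice2`; both assumptions should be verified against the data. The function name `copa_prompt`, the prompt wording, and the example row are illustrative.

```python
def copa_prompt(premise: str, choice1: str, choice2: str, question: str) -> str:
    """Format one COPA row as a numbered two-option prompt."""
    if question == "cause":
        ask = "What was the cause of this?"
    elif question == "effect":
        ask = "What happened as a result?"
    else:
        raise ValueError(f"unexpected question type: {question!r}")
    return f"{premise}\n{ask}\n1) {choice1}\n2) {choice2}"


# Hypothetical row in the shape of copa_train (not taken from the data).
example = {
    "premise": "The ground was wet.",
    "choice1": "It had rained overnight.",
    "choice2": "The sun was shining.",
    "question": "cause",
    "label": 0,
}

prompt = copa_prompt(
    example["premise"], example["choice1"], example["choice2"], example["question"]
)
# Under the assumed encoding, label 0 selects choice1 and label 1 selects choice2.
gold_choice = example["choice1"] if example["label"] == 0 else example["choice2"]
print(prompt)
print("Gold:", gold_choice)
```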

Copa Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.copa_validation
  • 13.09 KB
  • 100 rows
  • 6 columns

CREATE TABLE copa_validation (
  "premise" VARCHAR,
  "choice1" VARCHAR,
  "choice2" VARCHAR,
  "question" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Multirc Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.multirc_test
  • 488.38 KB
  • 9693 rows
  • 5 columns

CREATE TABLE multirc_test (
  "paragraph" VARCHAR,
  "question" VARCHAR,
  "answer" VARCHAR,
  "idx" VARCHAR,
  "label" BIGINT
);

Multirc Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.multirc_train
  • 1.32 MB
  • 27243 rows
  • 5 columns

CREATE TABLE multirc_train (
  "paragraph" VARCHAR,
  "question" VARCHAR,
  "answer" VARCHAR,
  "idx" VARCHAR,
  "label" BIGINT
);

Multirc Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.multirc_validation
  • 252.77 KB
  • 4848 rows
  • 5 columns

CREATE TABLE multirc_validation (
  "paragraph" VARCHAR,
  "question" VARCHAR,
  "answer" VARCHAR,
  "idx" VARCHAR,
  "label" BIGINT
);

Record Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.record_test
  • 6.3 MB
  • 10000 rows
  • 5 columns

CREATE TABLE record_test (
  "passage" VARCHAR,
  "query" VARCHAR,
  "entities" VARCHAR,
  "answers" VARCHAR,
  "idx" VARCHAR
);

Record Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.record_train
  • 59.19 MB
  • 100730 rows
  • 5 columns

CREATE TABLE record_train (
  "passage" VARCHAR,
  "query" VARCHAR,
  "entities" VARCHAR,
  "answers" VARCHAR,
  "idx" VARCHAR
);

Record Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.record_validation
  • 6.38 MB
  • 10000 rows
  • 5 columns

CREATE TABLE record_validation (
  "passage" VARCHAR,
  "query" VARCHAR,
  "entities" VARCHAR,
  "answers" VARCHAR,
  "idx" VARCHAR
);

Rte Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.rte_test
  • 602.2 KB
  • 3000 rows
  • 4 columns

CREATE TABLE rte_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Rte Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.rte_train
  • 546.11 KB
  • 2490 rows
  • 4 columns

CREATE TABLE rte_train (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Rte Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.rte_validation
  • 69.26 KB
  • 277 rows
  • 4 columns

CREATE TABLE rte_validation (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wic Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wic_test
  • 119.53 KB
  • 1400 rows
  • 9 columns

CREATE TABLE wic_test (
  "word" VARCHAR,
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "start1" BIGINT,
  "start2" BIGINT,
  "end1" BIGINT,
  "end2" BIGINT,
  "idx" BIGINT,
  "label" BIGINT
);

Wic Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wic_train
  • 306.46 KB
  • 5428 rows
  • 9 columns

CREATE TABLE wic_train (
  "word" VARCHAR,
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "start1" BIGINT,
  "start2" BIGINT,
  "end1" BIGINT,
  "end2" BIGINT,
  "idx" BIGINT,
  "label" BIGINT
);

Wic Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wic_validation
  • 60.75 KB
  • 638 rows
  • 9 columns

CREATE TABLE wic_validation (
  "word" VARCHAR,
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "start1" BIGINT,
  "start2" BIGINT,
  "end1" BIGINT,
  "end2" BIGINT,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Fixed Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_fixed_test
  • 13.56 KB
  • 146 rows
  • 7 columns

CREATE TABLE wsc_fixed_test (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Fixed Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_fixed_train
  • 28.37 KB
  • 554 rows
  • 7 columns

CREATE TABLE wsc_fixed_train (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Fixed Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_fixed_validation
  • 11.65 KB
  • 104 rows
  • 7 columns

CREATE TABLE wsc_fixed_validation (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Test

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_test
  • 13.56 KB
  • 146 rows
  • 7 columns

CREATE TABLE wsc_test (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Train

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_train
  • 28.26 KB
  • 554 rows
  • 7 columns

CREATE TABLE wsc_train (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);

Wsc Validation

@kaggle.thedevastator_task_oriented_natural_language_understanding_dat.wsc_validation
  • 11.64 KB
  • 104 rows
  • 7 columns

CREATE TABLE wsc_validation (
  "text" VARCHAR,
  "span1_index" BIGINT,
  "span2_index" BIGINT,
  "span1_text" VARCHAR,
  "span2_text" VARCHAR,
  "idx" BIGINT,
  "label" BIGINT
);
