Name: General Language Understanding Evaluation (GLUE)
Creator: Kaggle
License: https://creativecommons.org/publicdomain/zero/1.0/

About this Dataset

General Language Understanding Evaluation (GLUE)

The Famous General Language Understanding Evaluation benchmark

Source

Huggingface Hub: link

About this dataset

GLUE, the General Language Understanding Evaluation benchmark, is a collection of resources for training, evaluating, and analyzing natural language understanding systems.

Tasks

ax
A manually-curated evaluation dataset for fine-grained analysis of system performance on a broad range of linguistic phenomena. This dataset evaluates sentence understanding through Natural Language Inference (NLI) problems. Use a model trained on MultiNLI to produce predictions for this dataset.

cola
The Corpus of Linguistic Acceptability consists of English acceptability judgments drawn from books and journal articles on linguistic theory. Each example is a sequence of words annotated with whether it is a grammatical English sentence.

mnli
The Multi-Genre Natural Language Inference Corpus is a crowdsourced collection of sentence pairs with textual entailment annotations. Given a premise sentence and a hypothesis sentence, the task is to predict whether the premise entails the hypothesis (entailment), contradicts the hypothesis (contradiction), or neither (neutral). The premise sentences are gathered from ten different sources, including transcribed speech, fiction, and government reports. The authors of the benchmark use the standard test set, for which they obtained private labels from the corpus authors, and evaluate on both the matched (in-domain) and mismatched (cross-domain) sections. They also use and recommend the SNLI corpus as 550k examples of auxiliary training data.

mnli_matched
The matched validation and test splits from MNLI. See the "mnli" BuilderConfig for additional information.

mnli_mismatched
The mismatched validation and test splits from MNLI. See the "mnli" BuilderConfig for additional information.

mrpc
The Microsoft Research Paraphrase Corpus (Dolan & Brockett, 2005) is a corpus of sentence pairs automatically extracted from online news sources, with human annotations for whether the sentences in the pair are semantically equivalent.

qnli
The Stanford Question Answering Dataset is a question-answering dataset consisting of question-paragraph pairs, where one of the sentences in the paragraph (drawn from Wikipedia) contains the answer to the corresponding question (written by an annotator). The authors of the benchmark convert the task into sentence pair classification by forming a pair between each question and each sentence in the corresponding context, and filtering out pairs with low lexical overlap between the question and the context sentence. The task is to determine whether the context sentence contains the answer to the question. This modified version of the original task removes the requirement that the model select the exact answer, but also removes the simplifying assumptions that the answer is always present in the input and that lexical overlap is a reliable cue.
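The pair construction described above can be sketched as follows. This is only an illustration: the source does not say how lexical overlap is measured or what cutoff is used, so the token-set overlap, the `min_overlap` threshold, and the function names are all assumptions.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def make_question_sentence_pairs(question, sentences, answer_index, min_overlap=1):
    """Pair a question with each context sentence, dropping low-overlap pairs.

    The overlap measure and threshold are hypothetical stand-ins for the
    filtering step; the third tuple element says whether the sentence
    contains the answer.
    """
    question_tokens = tokenize(question)
    pairs = []
    for i, sentence in enumerate(sentences):
        overlap = len(question_tokens & tokenize(sentence))
        if overlap < min_overlap:
            continue  # filtered out: too little lexical overlap
        pairs.append((question, sentence, i == answer_index))
    return pairs


question = "Where is the Eiffel Tower located?"
sentences = [
    "The Eiffel Tower is located in Paris.",
    "It was completed in 1889.",
    "The tower attracts many visitors.",
]
pairs = make_question_sentence_pairs(question, sentences, answer_index=0)
for pair in pairs:
    print(pair)
```

In this toy paragraph the second sentence shares no tokens with the question and is dropped, while the third survives the filter even though it does not contain the answer, which is why overlap alone is not a reliable cue.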

qqp
The Quora Question Pairs dataset is a collection of question pairs from the community question-answering website Quora. The task is to determine whether the two questions in a pair are semantically equivalent.

rte
The Recognizing Textual Entailment (RTE) datasets come from a series of annual textual entailment challenges. The authors of the benchmark combined the data from RTE1 (Dagan et al., 2006), RTE2 (Bar Haim et al., 2006), RTE3 (Giampiccolo et al., 2007), and RTE5 (Bentivogli et al., 2009). Examples are constructed based on news and Wikipedia text. The authors of the benchmark convert all datasets to a two-class split, where for three-class datasets they collapse neutral and contradiction into not entailment, for consistency.
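As a minimal sketch of the two-class conversion described above, the mapping below collapses neutral and contradiction into a single class. The function name and the string label `not_entailment` are illustrative choices, not identifiers confirmed by the source.

```python
def collapse_to_two_class(label):
    """Map a three-class NLI label onto the two-class scheme."""
    if label == "entailment":
        return "entailment"
    if label in ("neutral", "contradiction"):
        return "not_entailment"
    raise ValueError(f"unexpected label: {label!r}")


three_class = ["entailment", "neutral", "contradiction"]
two_class = [collapse_to_two_class(label) for label in three_class]
print(two_class)
```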

sst2
The Stanford Sentiment Treebank consists of sentences from movie reviews and human annotations of their sentiment. The task is to predict the sentiment of a given sentence. It uses the two-way (positive/negative) class split, with only sentence-level labels.

stsb
The Semantic Textual Similarity Benchmark (Cer et al., 2017) is a collection of sentence pairs drawn from news headlines, video and image captions, and natural language inference data. Each pair is human-annotated with a similarity score from 1 to 5.

wnli
The Winograd Schema Challenge (Levesque et al., 2011) is a reading comprehension task in which a system must read a sentence with a pronoun and select the referent of that pronoun from a list of choices. The examples are manually constructed to foil simple statistical methods: each one is contingent on contextual information provided by a single word or phrase in the sentence. To convert the problem into sentence pair classification, the authors of the benchmark construct sentence pairs by replacing the ambiguous pronoun with each possible referent. The task is to predict if the sentence with the pronoun substituted is entailed by the original sentence. They use a small evaluation set consisting of new examples derived from fiction books that was shared privately by the authors of the original corpus. While the included training set is balanced between two classes, the test set is imbalanced between them (65% not entailment). Also, due to a data quirk, the development set is adversarial: hypotheses are sometimes shared between training and development examples, so if a model memorizes the training examples, it will predict the wrong label on the corresponding development set example. As with QNLI, each example is evaluated separately, so there is not a systematic correspondence between a model's score on this task and its score on the unconverted original task. The authors of the benchmark call the converted dataset WNLI (Winograd NLI).
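The conversion into sentence pairs can be illustrated with a small sketch. It assumes the pronoun and the candidate referents are already known, and it substitutes into the whole sentence; the actual WNLI hypotheses may be built differently (for example from a shorter clause), and the function name is hypothetical.

```python
import re


def make_wnli_pairs(sentence, pronoun, candidates, correct):
    """Build one (premise, hypothesis, entailed) triple per candidate referent."""
    pattern = re.compile(rf"\b{re.escape(pronoun)}\b")
    pairs = []
    for candidate in candidates:
        # Replace only the first whole-word occurrence of the pronoun.
        hypothesis = pattern.sub(lambda match: candidate, sentence, count=1)
        pairs.append((sentence, hypothesis, candidate == correct))
    return pairs


sentence = "The trophy doesn't fit in the suitcase because it is too big."
pairs = make_wnli_pairs(
    sentence,
    pronoun="it",
    candidates=["the trophy", "the suitcase"],
    correct="the trophy",
)
for premise, hypothesis, entailed in pairs:
    print(entailed, "|", hypothesis)
```

Each candidate yields its own example, which matches the note above that examples are evaluated separately rather than as a single multiple-choice question.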

How to use the dataset

This dataset provides each GLUE task and split as a separate CSV file, listed under Columns. Most tasks are single-sentence or sentence-pair problems, so a model can be trained on a task's train file and evaluated on its validation file. The ax diagnostic set ships only as a test file: to use it, train a model on the MultiNLI data and use it to produce predictions for ax.
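A minimal loading sketch, assuming the files follow the `<task>_<split>.csv` naming used in the Columns section and start with a header row. The helper name `load_split` and the stand-in rows are hypothetical; only the standard library is used.

```python
import csv
import tempfile
from pathlib import Path


def load_split(data_dir, task, split):
    """Read <task>_<split>.csv and return its rows as dictionaries."""
    path = Path(data_dir) / f"{task}_{split}.csv"
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


# Demo on a tiny stand-in file; the real CSVs come from the dataset download.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "rte_validation.csv"
    sample.write_text(
        "sentence1,sentence2,label,idx\n"
        "A dog runs.,An animal moves.,0,0\n",
        encoding="utf-8",
    )
    rows = load_split(tmp, "rte", "validation")

# csv.DictReader returns every value as a string, including label and idx.
print(rows[0]["sentence1"], rows[0]["label"])
```

Because the values come back as strings, cast `label` and `idx` to integers before training if your pipeline expects numeric labels.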

Research Ideas

Train a model to classify semantically equivalent sentences.

The dataset can be used to train a model to identify paraphrases.

The dataset can be used to train a model to identify the entailment relation between two sentences.

And much more.

Acknowledgements

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: mrpc_train.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the two sentences in the pair are semantically equivalent or not. (integer)

File: rte_train.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the first sentence entails the second: entailment or not entailment. (integer)

File: sst2_test.csv

Column name	Description
label	The sentiment of the sentence: positive or negative. (integer)
sentence	The sentence from a movie review. (string)

File: cola_validation.csv

Column name	Description
sentence	The sequence of words to be judged. (string)
label	Whether the sequence is a grammatical English sentence or not. (integer)

File: mnli_train.csv

Column name	Description
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)

File: qqp_train.csv

Column name	Description
label	Whether the two questions in the pair are semantically equivalent or not. (integer)
question1	The first question in the pair. (string)
question2	The second question in the pair. (string)

File: mnli_test_matched.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: mrpc_validation.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the two sentences in the pair are semantically equivalent or not. (integer)

File: sst2_validation.csv

Column name	Description
sentence	The sentence from a movie review. (string)
label	The sentiment of the sentence: positive or negative. (integer)

File: wnli_test.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the second sentence (with the pronoun substituted) is entailed by the first: entailment or not entailment. (integer)

File: sst2_train.csv

Column name	Description
sentence	The sentence from a movie review. (string)
label	The sentiment of the sentence: positive or negative. (integer)

File: qqp_test.csv

Column name	Description
question1	The first question in the pair. (string)
question2	The second question in the pair. (string)
label	Whether the two questions in the pair are semantically equivalent or not. (integer)

File: stsb_validation.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	The human-annotated similarity score for the pair, from 1 to 5. (number)

File: mnli_test_mismatched.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: wnli_validation.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the second sentence (with the pronoun substituted) is entailed by the first: entailment or not entailment. (integer)

File: rte_test.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the first sentence entails the second: entailment or not entailment. (integer)

File: stsb_train.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	The human-annotated similarity score for the pair, from 1 to 5. (number)

File: mnli_matched_validation.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: qqp_validation.csv

Column name	Description
question1	The first question in the pair. (string)
question2	The second question in the pair. (string)
label	Whether the two questions in the pair are semantically equivalent or not. (integer)

File: mnli_validation_mismatched.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: rte_validation.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the first sentence entails the second: entailment or not entailment. (integer)

File: stsb_test.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	The human-annotated similarity score for the pair, from 1 to 5. (number)

File: wnli_train.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the second sentence (with the pronoun substituted) is entailed by the first: entailment or not entailment. (integer)

File: qnli_test.csv

Column name	Description
sentence	The context sentence drawn from the paragraph. (string)
label	Whether the context sentence contains the answer to the question. (integer)
question	The question written by an annotator. (string)

File: mnli_mismatched_test.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: qnli_train.csv

Column name	Description
question	The question written by an annotator. (string)
sentence	The context sentence drawn from the paragraph. (string)
label	Whether the context sentence contains the answer to the question. (integer)

File: mnli_matched_test.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: mrpc_test.csv

Column name	Description
sentence1	The first sentence in the pair. (string)
sentence2	The second sentence in the pair. (string)
label	Whether the two sentences in the pair are semantically equivalent or not. (integer)

File: ax_test.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: mnli_validation_matched.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: mnli_mismatched_validation.csv

Column name	Description
premise	The premise sentence. (string)
hypothesis	The hypothesis sentence. (string)
label	Whether the premise entails the hypothesis (entailment), contradicts it (contradiction), or neither (neutral). (integer)

File: cola_train.csv

Column name	Description
sentence	The sequence of words to be judged. (string)
label	Whether the sequence is a grammatical English sentence or not. (integer)

File: cola_test.csv

Column name	Description
sentence	The sequence of words to be judged. (string)
label	Whether the sequence is a grammatical English sentence or not. (integer)

File: qnli_validation.csv

Column name	Description
question	The question written by an annotator. (string)
sentence	The context sentence drawn from the paragraph. (string)
label	Whether the context sentence contains the answer to the question. (integer)

Tables

Ax Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.ax_test

79.74 kB
1,104 rows
4 columns

CREATE TABLE ax_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Cola Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.cola_test

37.76 kB
1,063 rows
3 columns

CREATE TABLE cola_test (
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Cola Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.cola_train

249.5 kB
8,551 rows
3 columns

CREATE TABLE cola_train (
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Cola Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.cola_validation

37.62 kB
1,043 rows
3 columns

CREATE TABLE cola_validation (
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Matched Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_matched_test

761.33 kB
9,796 rows
4 columns

CREATE TABLE mnli_matched_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Matched Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_matched_validation

759.99 kB
9,815 rows
4 columns

CREATE TABLE mnli_matched_validation (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Mismatched Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_mismatched_test

794.63 kB
9,847 rows
4 columns

CREATE TABLE mnli_mismatched_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Mismatched Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_mismatched_validation

794.99 kB
9,832 rows
4 columns

CREATE TABLE mnli_mismatched_validation (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Test Matched

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_test_matched

761.33 kB
9,796 rows
4 columns

CREATE TABLE mnli_test_matched (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Test Mismatched

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_test_mismatched

794.63 kB
9,847 rows
4 columns

CREATE TABLE mnli_test_mismatched (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_train

50.18 MB
392,702 rows
4 columns

CREATE TABLE mnli_train (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Validation Matched

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_validation_matched

759.99 kB
9,815 rows
4 columns

CREATE TABLE mnli_validation_matched (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mnli Validation Mismatched

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mnli_validation_mismatched

794.99 kB
9,832 rows
4 columns

CREATE TABLE mnli_validation_mismatched (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mrpc Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mrpc_test

305.38 kB
1,725 rows
4 columns

CREATE TABLE mrpc_test (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mrpc Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mrpc_train

636.35 kB
3,668 rows
4 columns

CREATE TABLE mrpc_train (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Mrpc Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.mrpc_validation

76.8 kB
408 rows
4 columns

CREATE TABLE mrpc_validation (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qnli Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qnli_test

668.9 kB
5,463 rows
4 columns

CREATE TABLE qnli_test (
  "question" VARCHAR,
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qnli Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qnli_train

17.06 MB
104,743 rows
4 columns

CREATE TABLE qnli_train (
  "question" VARCHAR,
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qnli Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qnli_validation

667.27 kB
5,463 rows
4 columns

CREATE TABLE qnli_validation (
  "question" VARCHAR,
  "sentence" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qqp Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qqp_test

34.78 MB
390,965 rows
4 columns

CREATE TABLE qqp_test (
  "question1" VARCHAR,
  "question2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qqp Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qqp_train

31.78 MB
363,846 rows
4 columns

CREATE TABLE qqp_train (
  "question1" VARCHAR,
  "question2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Qqp Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.qqp_validation

3.61 MB
40,430 rows
4 columns

CREATE TABLE qqp_validation (
  "question1" VARCHAR,
  "question2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Rte Test

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.rte_test

617.18 kB
3,000 rows
4 columns

CREATE TABLE rte_test (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Rte Train

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.rte_train

559.73 kB
2,490 rows
4 columns

CREATE TABLE rte_train (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);

Rte Validation

@kaggle.thedevastator_nli_dataset_for_sentence_understanding.rte_validation

70.13 kB
277 rows
4 columns

CREATE TABLE rte_validation (
  "sentence1" VARCHAR,
  "sentence2" VARCHAR,
  "label" BIGINT,
  "idx" BIGINT
);