ANLI (Adversarial NLI Benchmark)

The Adversarial Natural Language Inference (ANLI) benchmark, introduced by Nie et al.

Source

Paper: link

About this dataset

The Adversarial Natural Language Inference (ANLI) dataset is a large-scale NLI benchmark. It was collected via an iterative, adversarial human-and-model-in-the-loop procedure, which makes it much more difficult than its predecessors such as SNLI and MNLI. It contains three rounds, each with train/dev/test splits, and the data fields are the same across all splits.

Because its examples were collected adversarially, ANLI is a demanding test for natural language understanding models and a useful benchmark for assessing the progress of NLI models.

How to use the dataset

The dataset ships as one CSV file per round and split. To work with the first round, download train_r1.csv (training data), dev_r1.csv (development data), and test_r1.csv (test data). Rounds 2 and 3 follow the same naming pattern, with r2 or r3 in place of r1.
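
As a minimal sketch of the download-then-load step described above, the snippet below reads the three CSV files of one round using only the Python standard library. The function name load_round and the data_dir argument are hypothetical, and the sketch assumes the files have already been downloaded into a single local directory under their original names.

```python
import csv
from pathlib import Path


def load_round(data_dir, round_no):
    """Read the train/dev/test CSVs of one ANLI round.

    Returns a dict mapping split name -> list of rows, where each row is a
    dict keyed by the CSV header. All values are kept as strings.
    """
    splits = {}
    for split in ("train", "dev", "test"):
        # File naming follows the pattern used on the dataset page,
        # e.g. train_r1.csv, dev_r1.csv, test_r1.csv.
        path = Path(data_dir) / f"{split}_r{round_no}.csv"
        with open(path, newline="", encoding="utf-8") as f:
            splits[split] = list(csv.DictReader(f))
    return splits


# Example usage (assumes the round-1 files sit in the current directory):
#   data = load_round(".", 1)
#   print(len(data["train"]), data["dev"][0]["premise"])
```

The csv module is used here instead of a dataframe library so that quoted premises containing commas or line breaks are parsed correctly without extra dependencies.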

Research Ideas

The ANLI dataset can be used to train models to better understand natural language.

The dataset can be used to develop models that are more robust to adversarial examples.

The dataset can be used to improve the accuracy of NLI systems.

Acknowledgements

The dataset was originally published on the Hugging Face Hub.

License

> License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
> No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

Files: train_r1.csv, dev_r1.csv, test_r1.csv, train_r2.csv, dev_r2.csv, test_r2.csv, train_r3.csv, dev_r3.csv, test_r3.csv

All nine files share the same columns:

Column name	Description
uid	Unique identifier of the example. (String)
premise	The premise text. (String)
hypothesis	The hypothesis text to be judged against the premise. (String)
label	The NLI label for the premise-hypothesis pair. (Integer; listed as BIGINT in the table schemas)
reason	The reason for the label. (String)
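
The label column is stored as an integer, so it usually needs decoding before it is readable. The sketch below assumes the mapping used by the Hugging Face release of ANLI (0 = entailment, 1 = neutral, 2 = contradiction); this page does not state the mapping, so check it against a few rows before relying on it. The helper decode_row and the example row are hypothetical.

```python
# Assumed label mapping (from the Hugging Face `anli` dataset, not from this page).
LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def decode_row(row):
    """Return a copy of a CSV row (dict of strings) with a readable label added."""
    out = dict(row)
    # CSV readers yield strings, so convert the label to int before the lookup.
    out["label_name"] = LABEL_NAMES[int(row["label"])]
    return out


# A made-up row, for illustration only (not taken from the dataset).
example = {
    "uid": "hypothetical-uid",
    "premise": "The cafe opens at nine on weekdays.",
    "hypothesis": "The cafe is closed at noon on Mondays.",
    "label": "1",
    "reason": "",
}
print(decode_row(example)["label_name"])  # prints "neutral"
```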

Tables

Dev R1

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.dev_r1

344.4 KB
1000 rows
5 columns


CREATE TABLE dev_r1 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);
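
The schema above can be tried out locally. The following is a minimal sketch that creates the dev_r1 table in an in-memory SQLite database through Python's sqlite3 module and counts rows per label. The helper label_counts and the inserted rows are hypothetical placeholders, not data from ANLI; SQLite accepts the VARCHAR and BIGINT type names as written, although other engines may map them differently.

```python
import sqlite3

# DDL copied from the dev_r1 schema on this page.
DDL = """
CREATE TABLE dev_r1 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);
"""


def label_counts(conn, table="dev_r1"):
    """Return {label: row count} for one table.

    The table name is interpolated into the SQL text, so pass only
    trusted, known table names.
    """
    cur = conn.execute(
        f'SELECT "label", COUNT(*) FROM {table} GROUP BY "label" ORDER BY "label"'
    )
    return dict(cur.fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
# Placeholder rows for illustration only (not real ANLI examples).
conn.executemany(
    "INSERT INTO dev_r1 VALUES (?, ?, ?, ?, ?)",
    [
        ("u1", "premise a", "hypothesis a", 0, "reason a"),
        ("u2", "premise b", "hypothesis b", 2, "reason b"),
        ("u3", "premise c", "hypothesis c", 2, "reason c"),
    ],
)
print(label_counts(conn))  # prints {0: 1, 2: 2}
```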

Dev R2

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.dev_r2

343.54 KB
1000 rows
5 columns


CREATE TABLE dev_r2 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Dev R3

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.dev_r3

421.98 KB
1200 rows
5 columns


CREATE TABLE dev_r3 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Test R1

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.test_r1

346.25 KB
1000 rows
5 columns


CREATE TABLE test_r1 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Test R2

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.test_r2

354.23 KB
1000 rows
5 columns


CREATE TABLE test_r2 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Test R3

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.test_r3

417.23 KB
1200 rows
5 columns


CREATE TABLE test_r3 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Train R1

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.train_r1

1.95 MB
16946 rows
5 columns


CREATE TABLE train_r1 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Train R2

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.train_r2

4.05 MB
45460 rows
5 columns


CREATE TABLE train_r2 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);

Train R3

@kaggle.thedevastator_anli_a_large_scale_nli_benchmark_dataset.train_r3

15.72 MB
100459 rows
5 columns


CREATE TABLE train_r3 (
  "uid" VARCHAR,
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT,
  "reason" VARCHAR
);
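
For planning experiments, it can help to see the split sizes side by side. The sketch below simply transcribes the row counts listed for the nine tables above and sums them per split; the names ROWS and total_rows are hypothetical.

```python
# Row counts copied from the table listings on this page: (split, round) -> rows.
ROWS = {
    ("train", 1): 16946, ("dev", 1): 1000, ("test", 1): 1000,
    ("train", 2): 45460, ("dev", 2): 1000, ("test", 2): 1000,
    ("train", 3): 100459, ("dev", 3): 1200, ("test", 3): 1200,
}


def total_rows(split):
    """Sum the row counts of one split ("train", "dev", or "test") over all rounds."""
    return sum(n for (s, _), n in ROWS.items() if s == split)


for split in ("train", "dev", "test"):
    print(split, total_rows(split))
# prints:
# train 162865
# dev 3200
# test 3200
```

Round 3 contributes well over half of the training rows, so a model trained on all rounds pooled together is weighted toward round 3 unless the rounds are resampled.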