CommonGen (Generative Commonsense Reasoning)

Constrained text generation task, associated with a benchmark dataset

@kaggle.thedevastator_commongen_a_benchmark_dataset_for_generative_com

About this Dataset

CommonGen: A Benchmark Dataset for Generative Commonsense Reasoning

A New Challenge for Artificial Intelligence


CommonGen is a benchmark dataset for generative commonsense reasoning: given a set of concepts, a model should generate a sentence about them. The task is challenging because it requires both relational reasoning over background commonsense knowledge and compositional generalization to unseen concept combinations. The dataset provides training, validation, and test data, each pairing concept sets with the target sentences that the model should generate.

How to use the dataset

This guide provides an overview of the CommonGen Dataset and how to use it for generative commonsense reasoning tasks.

The CommonGen Dataset contains training data, validation data, and test data. Each split consists of pairs of concepts and targets, where the target is a sentence that should be generated based on the concepts.

To use the CommonGen Dataset for generative commonsense reasoning tasks, train a model on the training data, then evaluate it on the validation and test data.
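Before training or evaluating, it usually helps to read a split and collect the target sentences that belong to each concept set. The following is a minimal sketch using only the Python standard library. It assumes each CSV has the `concept_set_idx`, `concepts`, and `target` columns shown in the table schemas on this page, and that rows sharing a `concept_set_idx` are alternative targets for the same concepts (an assumption, not something this page states). The sample rows and the helper names `group_references` and `concept_coverage` are hypothetical, and the coverage score is a rough sanity check rather than the benchmark's official metric.

```python
import csv
import io
from collections import defaultdict


def group_references(csv_file):
    """Map each concept_set_idx to its concepts string and its target sentences."""
    groups = defaultdict(lambda: {"concepts": None, "targets": []})
    for row in csv.DictReader(csv_file):
        idx = int(row["concept_set_idx"])
        groups[idx]["concepts"] = row["concepts"]
        groups[idx]["targets"].append(row["target"])
    return dict(groups)


def concept_coverage(concepts, sentence):
    """Fraction of concepts that are a prefix of some word in the sentence.

    Prefix matching is a crude stand-in for lemmatization
    (for example, "catch" matches "catches").
    """
    words = [w.strip(".,!?;:").lower() for w in sentence.split()]
    hits = sum(any(w.startswith(c.lower()) for w in words) for c in concepts)
    return hits / len(concepts) if concepts else 0.0


# Hypothetical rows in the shape of train.csv. A real file would be opened with
# open("train.csv", newline="", encoding="utf-8") instead of io.StringIO.
SAMPLE = io.StringIO(
    "concept_set_idx,concepts,target\n"
    "0,ski mountain skier,A skier skis down the mountain.\n"
    "0,ski mountain skier,Skier on a mountain with skis.\n"
    "1,dog frisbee catch,The dog catches the frisbee.\n"
)

references = group_references(SAMPLE)
print(len(references[0]["targets"]))  # number of targets for concept set 0
print(concept_coverage(["dog", "frisbee", "catch"], "The dog catches the frisbee."))
```

How the `concepts` string is serialized in the actual files is not documented here, so the sketch leaves it unparsed and passes concepts to `concept_coverage` as an explicit list.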

Research Ideas

  • Achieving human-like commonsense reasoning ability in AI models
  • Generating natural language descriptions of complex concept combinations
  • Developing new methods for efficient data-driven commonsense knowledge acquisition

Acknowledgements

This dataset was created by the research team at the Allen Institute for Artificial Intelligence (AI2), in collaboration with academics from Carnegie Mellon University, Stanford University, and the University of Washington. We would like to thank these institutions for their support.

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name   Description
concepts      A set of concepts that the model should generate a sentence about. (Input)
target        The target sentence that the model should generate. (Output)

File: train.csv

Column name   Description
concepts      A set of concepts that the model should generate a sentence about. (Input)
target        The target sentence that the model should generate. (Output)

File: test.csv

Column name   Description
concepts      A set of concepts that the model should generate a sentence about. (Input)
target        The target sentence that the model should generate. (Output)

Tables

Test

@kaggle.thedevastator_commongen_a_benchmark_dataset_for_generative_com.test
  • 39.54 KB
  • 1497 rows
  • 3 columns

CREATE TABLE test (
  "concept_set_idx" BIGINT,
  "concepts" VARCHAR,
  "target" VARCHAR
);

Train

@kaggle.thedevastator_commongen_a_benchmark_dataset_for_generative_com.train
  • 3 MB
  • 67389 rows
  • 3 columns

CREATE TABLE train (
  "concept_set_idx" BIGINT,
  "concepts" VARCHAR,
  "target" VARCHAR
);

Validation

@kaggle.thedevastator_commongen_a_benchmark_dataset_for_generative_com.validation
  • 157.79 KB
  • 4018 rows
  • 3 columns

CREATE TABLE validation (
  "concept_set_idx" BIGINT,
  "concepts" VARCHAR,
  "target" VARCHAR
);
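All three tables share the same schema, so a query written for one split should run against the others. As a rough sketch, the snippet below recreates the `train` table in an in-memory SQLite database, inserts a few hypothetical rows, and counts the target sentences per concept set. SQLite is a stand-in chosen for illustration only; the SQL engine behind this page may behave differently, and the sample rows are invented.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE train ("concept_set_idx" BIGINT, "concepts" VARCHAR, "target" VARCHAR)'
)

# Hypothetical rows in the shape of the train table.
rows = [
    (0, "ski mountain skier", "A skier skis down the mountain."),
    (0, "ski mountain skier", "Skier on a mountain with skis."),
    (1, "dog frisbee catch", "The dog catches the frisbee."),
]
conn.executemany("INSERT INTO train VALUES (?, ?, ?)", rows)

# Count how many target sentences each concept set has.
counts = conn.execute(
    """
    SELECT "concept_set_idx", COUNT(*) AS n_targets
    FROM train
    GROUP BY "concept_set_idx"
    ORDER BY "concept_set_idx"
    """
).fetchall()
print(counts)
conn.close()
```

Swapping `train` for `validation` or `test` in both statements gives the same summary for the other splits.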
