KOR NLI (Korean Natural Language Inference) by Kaggle | Other

About this Dataset

Unlock Complex Understanding of Sentences & Labels with KOR_NLI

By Huggingface Hub

KOR_NLI is a natural language inference (NLI) corpus for the Korean language. Each example pairs a premise with a hypothesis and carries a human-assigned label stating whether the two sentences entail, contradict, or are neutral toward each other. Researchers can use these pairs to build and evaluate predictive models that infer the relationship between two pieces of text. The corpus includes four files: SNLI Train, XNLI Test, XNLI Validation, and Multi-NLI Train, giving users a large range of data for exploring natural language inference. With premises and hypotheses accompanied by predetermined labels, KOR_NLI is a useful resource for anyone entering the field. As computing power becomes cheaper and more capable, allowing faster experimentation, users are invited to find out which models are best at recognizing patterns in this data.

How to use the dataset

The KOR_NLI dataset is a collection of Natural Language Inference (NLI) datasets for the Korean language. Each row holds a premise, an accompanying hypothesis, and a label indicating whether the premise and hypothesis entail, contradict, or are neutral with respect to each other. The data is divided into four files: SNLI Train, XNLI Test, XNLI Validation, and Multi-NLI Train.
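As a minimal sketch of reading one of these files, the snippet below parses rows into premise/hypothesis/label records using only the Python standard library. It assumes the CSV files are comma-delimited UTF-8 with a header row and an integer label column; the `NLIPair` type, the `read_pairs` helper, and the two sample rows are hypothetical and made up for illustration.

```python
import csv
import io
from typing import Iterable, List, NamedTuple


class NLIPair(NamedTuple):
    """One row of a KOR_NLI file (hypothetical helper type)."""
    premise: str
    hypothesis: str
    label: int


def read_pairs(lines: Iterable[str]) -> List[NLIPair]:
    """Parse CSV text that has premise, hypothesis, and label columns."""
    reader = csv.DictReader(lines)
    return [
        NLIPair(row["premise"], row["hypothesis"], int(row["label"]))
        for row in reader
    ]


# Tiny in-memory stand-in for a file such as snli_train.csv.
# The rows are invented examples, not taken from the dataset.
sample = io.StringIO(
    "premise,hypothesis,label\n"
    '"남자가 자고 있다.","사람이 잠들어 있다.",0\n'
    '"남자가 자고 있다.","남자가 달리고 있다.",2\n'
)
pairs = read_pairs(sample)

# With a real file (assumption: comma-delimited UTF-8 with a header row):
# with open("snli_train.csv", encoding="utf-8", newline="") as f:
#     pairs = read_pairs(f)
```

`int(row["label"])` raises `ValueError` on a blank or non-numeric label, which is a deliberate choice here so that malformed rows surface early rather than being silently skipped.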

Research Ideas

Automated Assessment of Composition & Essay Writing: This dataset can be used to develop algorithms for automatically assessing written work for natural language elements such as grammar, style, and content.

Context-Aware System Development: By applying Natural Language Inference (NLI) datasets such as KOR_NLI, developers can create intelligent systems that are better equipped to understand the broader implications of user queries in context.

Document Classification & Categorization: This dataset enables machine learning models to classify documents by how a hypothesis relates to its premise, assigning a label that indicates whether the two entail, contradict, or are neutral with respect to one another.
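Whichever of these ideas is pursued, a model's predicted labels are eventually compared against the dataset's label column. The following is a rough sketch of that comparison using a majority-class baseline, a common sanity check rather than anything prescribed by the dataset. The function names and the label lists are hypothetical and invented for illustration.

```python
from collections import Counter
from typing import List, Sequence


def accuracy(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot score an empty label list")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def majority_baseline(train_labels: Sequence[int], n: int) -> List[int]:
    """Predict the most frequent training label for each of n examples."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n


# Made-up label lists standing in for a training file and an evaluation file.
train_labels = [0, 1, 1, 2, 1]
test_labels = [1, 0, 1, 2]

preds = majority_baseline(train_labels, len(test_labels))
score = accuracy(test_labels, preds)
```

Any trained classifier should beat this baseline score on the evaluation files; if it does not, the pipeline likely has a data or label-handling problem.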

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

Columns

File: snli_train.csv

Column name	Description
premise	The first sentence in the pair. (String)
hypothesis	The second sentence in the pair. (String)
label	Indicates whether the two sentences entail, contradict, or are neutral relative to each other. (Integer class code; the table schemas store it as BIGINT)

File: xnli_test.csv

Column name	Description
premise	The first sentence in the pair. (String)
hypothesis	The second sentence in the pair. (String)
label	Indicates whether the two sentences entail, contradict, or are neutral relative to each other. (Integer class code; the table schemas store it as BIGINT)

File: xnli_validation.csv

Column name	Description
premise	The first sentence in the pair. (String)
hypothesis	The second sentence in the pair. (String)
label	Indicates whether the two sentences entail, contradict, or are neutral relative to each other. (Integer class code; the table schemas store it as BIGINT)

File: multi_nli_train.csv

Column name	Description
premise	The first sentence in the pair. (String)
hypothesis	The second sentence in the pair. (String)
label	Indicates whether the two sentences entail, contradict, or are neutral relative to each other. (Integer class code; the table schemas store it as BIGINT)
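Because the label column is stored as an integer, it usually needs decoding before it can be read or reported. The sketch below counts labels by name. The code-to-name mapping is an assumption based on the convention commonly used for NLI datasets on the Hugging Face Hub (0 = entailment, 1 = neutral, 2 = contradiction); this page does not state the mapping, so it should be verified against the data before use. The helper name and sample codes are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

# ASSUMPTION: this mapping is not documented on this page. It follows the
# convention commonly used for NLI datasets; check it against real rows.
ASSUMED_LABEL_NAMES = {0: "entailment", 1: "neutral", 2: "contradiction"}


def label_distribution(labels: Iterable[int]) -> Dict[str, int]:
    """Count labels by name, flagging any code outside the assumed mapping."""
    counts = Counter(labels)
    return {
        ASSUMED_LABEL_NAMES.get(code, f"unknown({code})"): n
        for code, n in sorted(counts.items())
    }


# Made-up label codes; 7 is included to show how unexpected codes surface.
dist = label_distribution([0, 2, 1, 0, 0, 7])
```

Reporting unexpected codes instead of dropping them makes it easier to spot rows that lack a gold label or use a different encoding.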

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Tables

Multi Nli Train

@kaggle.thedevastator_explore_korean_natural_language_inference_with_k.multi_nli_train

50.32 MB
392702 rows
3 columns


CREATE TABLE multi_nli_train (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT
);

Snli Train

@kaggle.thedevastator_explore_korean_natural_language_inference_with_k.snli_train

20.83 MB
550152 rows
3 columns


CREATE TABLE snli_train (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT
);

Xnli Test

@kaggle.thedevastator_explore_korean_natural_language_inference_with_k.xnli_test

335.63 KB
5010 rows
3 columns


CREATE TABLE xnli_test (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT
);

Xnli Validation

@kaggle.thedevastator_explore_korean_natural_language_inference_with_k.xnli_validation

170.57 KB
2490 rows
3 columns


CREATE TABLE xnli_validation (
  "premise" VARCHAR,
  "hypothesis" VARCHAR,
  "label" BIGINT
);
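
The schemas above can be tried out locally. The following is a small sketch that recreates the xnli_validation table in an in-memory SQLite database and counts rows per label. SQLite is used here only as a convenient stand-in, since this page does not say which SQL engine hosts the tables, and the inserted rows are invented placeholders rather than real data.

```python
import sqlite3

# In-memory database; SQLite accepts the VARCHAR and BIGINT type names.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xnli_validation ("
    '"premise" VARCHAR, "hypothesis" VARCHAR, "label" BIGINT)'
)

# Placeholder rows (made up) standing in for the real 2490-row table.
rows = [
    ("남자가 자고 있다.", "사람이 잠들어 있다.", 0),
    ("아이가 웃고 있다.", "아이가 기뻐한다.", 0),
    ("남자가 자고 있다.", "남자가 달리고 있다.", 2),
]
conn.executemany("INSERT INTO xnli_validation VALUES (?, ?, ?)", rows)

# Count how many rows carry each label code.
label_counts = conn.execute(
    'SELECT "label", COUNT(*) FROM xnli_validation '
    'GROUP BY "label" ORDER BY "label"'
).fetchall()
conn.close()
```

The same `GROUP BY` query should carry over to the other three tables by changing only the table name, since all four share the same three columns.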