QaZre (Reading Comprehension Questions) by Kaggle | Other

About this Dataset

A dataset reducing relation extraction to simple reading comprehension questions

By Huggingface Hub [source]

QaZre is a corpus for developers, researchers, and data scientists who want to explore relation extraction from conversational language. The dataset comes as three files (train, validation, and test) that share the same five fields, so entries are consistent across training and evaluation. The fields are the relation type between the subject and the context, a question that expresses that relation, the subject being asked about, the context (the sentence or paragraph in which the answer is sought), and the answers, which can be used to score model predictions. Relations extracted this way can feed knowledge graph applications.

How to use the dataset

How to Use the QaZre Dataset for Knowledge Graphs
The QaZre dataset was designed to make relation extraction from conversational language simple and straightforward. This guide explains how to use the three CSV files (train, validation, and test) included in this repository and the fields they share.

What is a Knowledge Graph?
A knowledge graph is a collection of information structured as entities and the relationships between them, so that complex networks of concepts can be navigated and visualized. In essence, it is a digital map used to move between related ideas or topics. For example, if you look up cats on Wikipedia, the interconnected network represented by a knowledge graph could also surface related articles about other domestic animals, such as dogs or horses.
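
As a rough illustration of the idea, a knowledge graph can be held as a set of (subject, relation, object) triples. The sketch below is a minimal in-memory version; the function names `build_graph` and `neighbors` and the sample facts are made up for illustration and are not part of QaZre.

```python
from collections import defaultdict


def build_graph(triples):
    """Index (subject, relation, object) triples by subject."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def neighbors(graph, entity):
    """Return the objects directly linked from an entity, in insertion order."""
    return [obj for _, obj in graph.get(entity, [])]


# Hypothetical sample facts, for illustration only.
triples = [
    ("cat", "is_a", "domestic animal"),
    ("dog", "is_a", "domestic animal"),
    ("cat", "related_to", "dog"),
]
graph = build_graph(triples)
print(neighbors(graph, "cat"))  # ['domestic animal', 'dog']
```

Indexing by subject keeps lookups cheap for "what is linked from X" queries; a real graph store would usually index by object and relation as well.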

What is Relation Extraction?
Relation extraction is an important part of building large-scale knowledge graphs because it collects the relationships between entities mentioned in text. It uses natural language processing techniques, typically machine learning models, to determine how items relate to one another across large volumes of data without manual annotation. This surfaces connections within records that would take great effort to find by hand.

The QaZre Dataset
QaZre recasts relation extraction from conversational language as reading comprehension questions. The dataset consists of three CSV files (train, validation, and test), each containing the same fields. Given a subject and a context, a model answers the question associated with a relation, and that answer supplies the related entity. For example:

Relation: Location
Question: Where is Joe Smith?
Subject: Joe Smith
Context: He lives in Wellington
Answer: Wellington
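
The Joe Smith example can be sketched in code as follows. This is only an illustration under assumptions: the `Example` class and the `to_triple` helper are hypothetical names, and the rule "the answer must appear verbatim in the context" is a simplification chosen for this sketch, not something the dataset documentation states.

```python
from dataclasses import dataclass


@dataclass
class Example:
    relation: str
    question: str
    subject: str
    context: str
    answer: str


def to_triple(example):
    """Turn an answered example into a (subject, relation, object) triple.

    Returns None when the answer is empty or does not occur verbatim in the
    context, which this sketch treats as "no extractable relation".
    """
    if not example.answer or example.answer not in example.context:
        return None
    return (example.subject, example.relation, example.answer)


example = Example(
    relation="Location",
    question="Where is Joe Smith?",
    subject="Joe Smith",
    context="He lives in Wellington",
    answer="Wellington",
)
print(to_triple(example))  # ('Joe Smith', 'Location', 'Wellington')
```

Triples produced this way are the kind of edges a knowledge graph stores, which is how answered questions connect back to relation extraction.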

Key Points To Note
* Make sure all three CSVs (train.csv, validation.csv, test.csv) are downloaded from Kaggle before starting work.
* Each file contains the columns relation, question, subject, context, and answers, so an answer can be extracted from the context based on its relation to the given subject. Avoid exposing the model to all of the data during training (for example, hold out the test split) so that it still generalizes at deployment.
* Comparing extracted data against
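
A minimal sketch of reading one split and checking its header, using only the Python standard library. The lowercase column names are taken from the table schemas on this page; the `read_split` helper is a hypothetical name, and the sample row (including the form of the `answers` cell) is invented so the snippet runs without the real files.

```python
import csv
import io

EXPECTED_COLUMNS = ["relation", "question", "subject", "context", "answers"]


def read_split(handle):
    """Read one split from an open text handle and check its header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


# In-memory stand-in for a real file such as
# open("validation.csv", newline="", encoding="utf-8").
sample = io.StringIO(
    "relation,question,subject,context,answers\n"
    "Location,Where is Joe Smith?,Joe Smith,He lives in Wellington,Wellington\n"
)
rows = read_split(sample)
print(rows[0]["answers"])  # Wellington
```

`list(reader)` loads the whole file into memory, which is fine for the validation split but may be too much for the train split (about 1 GB); iterating over the reader row by row is the safer choice there.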

Research Ideas

Building knowledge graphs from conversational language.

Developing AI bots which can answer questions in a natural language format.

Integrating automated relation extraction algorithms into NLP applications such as search engines and summarizers for more accurate results.

Acknowledgements

If you use this dataset in your research, please credit the original authors.
Data Source

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name	Description
Relation	The type of relation between the subject and the context. (String)
Question	The question related to the relation. (String)

File: train.csv

Column name	Description
Relation	The type of relation between the subject and the context. (String)
Question	The question related to the relation. (String)

File: test.csv

Column name	Description
Relation	The type of relation between the subject and the context. (String)
Question	The question related to the relation. (String)

Acknowledgements

If you use this dataset in your research, please credit Huggingface Hub.

Tables

Test

@kaggle.thedevastator_unlock_the_power_of_knowledge_graphs_with_qazre.test

13.54 MB
120000 rows
5 columns


CREATE TABLE test (
  "relation" VARCHAR,
  "question" VARCHAR,
  "subject" VARCHAR,
  "context" VARCHAR,
  "answers" VARCHAR
);

Train

@kaggle.thedevastator_unlock_the_power_of_knowledge_graphs_with_qazre.train

1022.59 MB
8400000 rows
5 columns


CREATE TABLE train (
  "relation" VARCHAR,
  "question" VARCHAR,
  "subject" VARCHAR,
  "context" VARCHAR,
  "answers" VARCHAR
);

Validation

@kaggle.thedevastator_unlock_the_power_of_knowledge_graphs_with_qazre.validation

579.53 KB
6000 rows
5 columns


CREATE TABLE validation (
  "relation" VARCHAR,
  "question" VARCHAR,
  "subject" VARCHAR,
  "context" VARCHAR,
  "answers" VARCHAR
);
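
The schemas above can be tried locally with Python's built-in `sqlite3` module. The sketch below recreates the validation table in memory and counts rows per relation; the three inserted rows are invented placeholders, not actual dataset records.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE validation ("
    '"relation" VARCHAR, "question" VARCHAR, "subject" VARCHAR, '
    '"context" VARCHAR, "answers" VARCHAR)'
)

# Hypothetical rows, for illustration only.
sample_rows = [
    ("Location", "Where is Joe Smith?", "Joe Smith", "He lives in Wellington", "Wellington"),
    ("Location", "Where is Ann Lee?", "Ann Lee", "She lives in Oslo", "Oslo"),
    ("Occupation", "What does Ann Lee do?", "Ann Lee", "She works as a nurse", "nurse"),
]
conn.executemany("INSERT INTO validation VALUES (?, ?, ?, ?, ?)", sample_rows)

# Count how many questions exist for each relation type.
counts = conn.execute(
    'SELECT "relation", COUNT(*) FROM validation '
    'GROUP BY "relation" ORDER BY "relation"'
).fetchall()
print(counts)  # [('Location', 2), ('Occupation', 1)]
```

A per-relation count like this is a quick way to see how evenly the relation types are represented before training.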