Baselight

CoQA (Conversational Question Answering)

127k Questions With Answers, 8k Conversations About Text From Seven Domains.

@kaggle.thedevastator_unlock_the_answers_broaden_your_knowledge_with_c

About this Dataset

By Huggingface Hub [source]


CoQA is a large-scale dataset for conversational question answering, built on text passages from seven diverse domains. It contains 127,000 questions with answers, collected across 8,000 conversations. What sets CoQA apart from other question-answering datasets is that the questions are conversational in nature, so a question can depend on the turns that came before it. Each passage comes with its own set of answered questions, plus the corresponding evidence highlighted in the passage text. This makes CoQA a solid basis for building and evaluating conversational question-answering systems, both for tackling existing challenges and for exploring new approaches.

How to use the dataset

The CoQA Kaggle dataset is a useful resource for anyone building a conversational question-answering system. It ships as two CSV files, train.csv and validation.csv, with one row per conversation.
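
As a starting point, the files can be read with Python's standard csv module. The sketch below is a minimal example, not an official loader: the function name read_coqa_rows is hypothetical, the column names are taken from the CREATE TABLE statements on this page, and the sample row is made up so the snippet runs without the real files.

```python
import csv
import io

# Column names taken from the CREATE TABLE statements on this page.
EXPECTED_COLUMNS = ["source", "story", "questions", "answers"]


def read_coqa_rows(handle):
    """Yield one dict per conversation from an open CSV file handle.

    Hypothetical helper for illustration; not part of any library.
    """
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    for row in reader:
        # Keep only the documented columns, as raw strings.
        yield {name: row[name] for name in EXPECTED_COLUMNS}


# Made-up, in-memory stand-in for train.csv (not real dataset content).
SAMPLE_CSV = (
    "source,story,questions,answers\n"
    'demo,"Anna has a red bike.","What does Anna have?","A red bike."\n'
)

rows = list(read_coqa_rows(io.StringIO(SAMPLE_CSV)))
print(rows[0]["source"], len(rows))
```

With the real files, the same function could be given a handle from open("train.csv", newline="", encoding="utf-8"); the path and encoding are assumptions. The questions and answers columns arrive as plain strings, so it is worth inspecting how they are serialized before trying to parse them further.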

Research Ideas

  • Capturing natural language understanding by mapping questions to relevant portions in a passage.
  • Developing intelligent systems that can provide proper answers within a conversational state while taking into account the context of the conversation.
  • Creating models that respond interactively to users' inquiries using relevant evidence from the dataset's variety of domains.
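
One common way to take the conversation context into account is to feed a model the passage, the most recent question-answer turns, and the current question as a single input. The sketch below illustrates that idea only; the function name build_model_input, the "passage:/q:/a:" layout, and the max_turns parameter are all hypothetical choices, not a format defined by the dataset or its authors, and the example text is made up.

```python
def build_model_input(story, history, question, max_turns=2):
    """Join the passage, the most recent turns, and the current question.

    history is a list of (question, answer) pairs from earlier turns.
    The text layout is an illustrative assumption, not a dataset standard.
    """
    # history[-0:] would return the whole list, so handle 0 explicitly.
    recent = history[-max_turns:] if max_turns > 0 else []
    lines = [f"passage: {story}"]
    for past_q, past_a in recent:
        lines.append(f"q: {past_q}")
        lines.append(f"a: {past_a}")
    lines.append(f"q: {question}")
    return "\n".join(lines)


# Made-up example conversation (not real dataset content).
story = "Anna has a red bike. She rides it to school."
history = [
    ("What does Anna have?", "A red bike."),
    ("What color is it?", "Red."),
]
text = build_model_input(story, history, "Where does she ride it?", max_turns=1)
print(text)
```

Limiting the history to the last few turns keeps the input short while still letting the model resolve references such as "it" or "she" in the current question.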

License

License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

Columns

File: validation.csv

Column name   Description
source        The domain the conversation is drawn from. (String)
story         The text passage about which the questions were asked and answered. (String)
questions     The questions asked about the story. (String)
answers       The concise answers to the questions. (String)

File: train.csv

Column name   Description
source        The domain the conversation is drawn from. (String)
story         The text passage about which the questions were asked and answered. (String)
questions     The questions asked about the story. (String)
answers       The concise answers to the questions. (String)

Acknowledgements

If you use this dataset in your research, please credit the original authors and Huggingface Hub.

Tables

Train

@kaggle.thedevastator_unlock_the_answers_broaden_your_knowledge_with_c.train
  • 11.22 MB
  • 7199 rows
  • 4 columns

CREATE TABLE train (
  "source" VARCHAR,
  "story" VARCHAR,
  "questions" VARCHAR,
  "answers" VARCHAR
);

Validation

@kaggle.thedevastator_unlock_the_answers_broaden_your_knowledge_with_c.validation
  • 797.08 KB
  • 500 rows
  • 4 columns

CREATE TABLE validation (
  "source" VARCHAR,
  "story" VARCHAR,
  "questions" VARCHAR,
  "answers" VARCHAR
);
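
To experiment with this schema locally, the DDL above can be run against an in-memory SQLite database from Python. This is only a sketch: SQLite is used as a stand-in and may differ from the SQL dialect of the hosting platform, and the inserted rows (including the "demo_a" and "demo_b" source values) are made-up placeholders rather than real dataset content.

```python
import sqlite3

# DDL copied from this page; SQLite accepts VARCHAR as a text column type.
DDL = """
CREATE TABLE validation (
  "source" VARCHAR,
  "story" VARCHAR,
  "questions" VARCHAR,
  "answers" VARCHAR
);
"""

conn = sqlite3.connect(":memory:")
conn.execute(DDL)

# Made-up placeholder rows so the query below has something to count.
conn.executemany(
    "INSERT INTO validation VALUES (?, ?, ?, ?)",
    [
        ("demo_a", "Story one.", "Q1?", "A1."),
        ("demo_a", "Story two.", "Q2?", "A2."),
        ("demo_b", "Story three.", "Q3?", "A3."),
    ],
)

# Count conversations per source domain.
counts = conn.execute(
    'SELECT "source", COUNT(*) FROM validation '
    'GROUP BY "source" ORDER BY "source"'
).fetchall()
print(counts)
conn.close()
```

The same GROUP BY query, run over the real rows, would show how the conversations are spread across the source domains.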