DuoRC: (Q&A: Wikipedia And IMDB)
English dataset of questions and answers from Wikipedia and IMDb movie plots
@kaggle.thedevastator_duorc_a_dataset_of_movie_plots
By Huggingface Hub
The DuoRC dataset is an English-language dataset of questions and answers written by crowdsourced Amazon Mechanical Turk (AMT) workers about Wikipedia and IMDb movie plots. The workers were free either to pick an answer from the plot or to synthesize their own. It contains two sub-datasets, SelfRC and ParaphraseRC. SelfRC is built solely on Wikipedia movie plots. In ParaphraseRC, the questions are written from Wikipedia movie plots and the answers are given based on the corresponding IMDb movie plots.
- This dataset can be used to train a model to answer questions about movie plots.
- This dataset can be used to train a model to answer questions about Wikipedia articles.
- This dataset can be used to find paraphrases of questions about movie plots.
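Because workers could either copy an answer from the plot or write their own, a span-based question answering model can only be trained directly on rows whose answer appears verbatim in the plot. The sketch below shows one possible way to check that. The function name `find_answer_span` and the sample plot are hypothetical and not part of the dataset.

```python
import re


def find_answer_span(plot, answers):
    """Return (start, end, text) for the first answer found in the plot.

    The match is case-insensitive and literal (no regex syntax in the answer
    is interpreted). Returns None when no answer occurs in the plot, which is
    expected for answers that a worker synthesized.
    """
    for answer in answers:
        needle = answer.strip()
        if not needle:
            continue
        match = re.search(re.escape(needle), plot, flags=re.IGNORECASE)
        if match:
            return match.start(), match.end(), match.group(0)
    return None


# Made-up plot text for illustration only.
plot = "A detective hunts a thief across Paris."
span = find_answer_span(plot, ["the detective", "a detective"])
missing = find_answer_span(plot, ["He escapes"])
```

Rows for which this returns `None` are not necessarily unanswerable; they may simply need a generative rather than an extractive model.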
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication
No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: SelfRC_train.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
File: SelfRC_test.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
File: ParaphraseRC_train.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
File: SelfRC_validation.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
File: ParaphraseRC_test.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
File: ParaphraseRC_validation.csv
| Column name | Description |
|---|---|
| plot | The plot of the movie. (String) |
| title | The title of the movie. (String) |
| question | The question about the plot. (String) |
| answers | The answers to the question. (List of strings) |
| no_answer | A binary flag indicating that the question has no answer in the plot. (Integer) |
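Since `answers` is a list of strings stored in a single CSV cell, it usually has to be parsed after loading. The following is a minimal sketch using only the standard library. It assumes the list is serialized as a Python-style literal such as `['A', 'B']`, which should be checked against the actual files. The helper names `parse_answers` and `read_rows` and the sample rows are hypothetical.

```python
import ast
import csv
import io


def parse_answers(cell):
    """Turn a serialized `answers` cell into a list of strings.

    Assumption: the cell holds a Python-style list literal. A cell that does
    not parse is returned as a one-element list so no text is lost.
    """
    text = (cell or "").strip()
    if not text:
        return []
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return [text]
    if isinstance(value, (list, tuple)):
        return [str(item) for item in value]
    return [str(value)]


def read_rows(handle):
    """Yield one dict per CSV row with `answers` parsed and `no_answer` as bool."""
    for row in csv.DictReader(handle):
        row["answers"] = parse_answers(row.get("answers"))
        flag = str(row.get("no_answer", "")).strip().lower()
        row["no_answer"] = flag in {"1", "true"}
        yield row


# Tiny in-memory stand-in for a file such as SelfRC_train.csv (made-up rows).
SAMPLE_CSV = '''plot,title,question,answers,no_answer
"A detective hunts a thief.",Example Film,Who hunts the thief?,"['A detective', 'The detective']",0
"A detective hunts a thief.",Example Film,Who is the mayor?,[],1
'''
rows = list(read_rows(io.StringIO(SAMPLE_CSV)))
```

To read a real file, pass an open file handle instead, for example `open("SelfRC_train.csv", newline="", encoding="utf-8")`.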
If you use this dataset in your research, please credit Huggingface Hub.
CREATE TABLE paraphraserc_test (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);

CREATE TABLE paraphraserc_train (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);

CREATE TABLE paraphraserc_validation (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);

CREATE TABLE selfrc_test (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);

CREATE TABLE selfrc_train (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);

CREATE TABLE selfrc_validation (
"plot_id" VARCHAR,
"plot" VARCHAR,
"title" VARCHAR,
"question_id" VARCHAR,
"question" VARCHAR,
"answers" VARCHAR,
"no_answer" BOOLEAN
);
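As an illustration of how these tables might be queried, the sketch below recreates the `selfrc_train` schema in an in-memory SQLite database and counts answerable questions per title. SQLite is an assumption chosen for convenience, not necessarily the engine the schema was written for. The inserted rows are made up and the function name `answerable_counts` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE selfrc_train (
        "plot_id" VARCHAR,
        "plot" VARCHAR,
        "title" VARCHAR,
        "question_id" VARCHAR,
        "question" VARCHAR,
        "answers" VARCHAR,
        "no_answer" BOOLEAN
    )
    """
)

# Made-up rows for illustration only; they are not taken from the dataset.
conn.executemany(
    "INSERT INTO selfrc_train VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("p1", "A detective hunts a thief.", "Example Film",
         "q1", "Who hunts the thief?", "['A detective']", False),
        ("p1", "A detective hunts a thief.", "Example Film",
         "q2", "Who is the mayor?", "[]", True),
        ("p2", "Two friends cross a desert.", "Another Film",
         "q3", "What do the friends cross?", "['A desert']", False),
    ],
)


def answerable_counts(connection):
    """Count questions per title whose no_answer flag is false."""
    query = """
        SELECT title, COUNT(*) AS n
        FROM selfrc_train
        WHERE NOT no_answer
        GROUP BY title
        ORDER BY title
    """
    return connection.execute(query).fetchall()


counts = answerable_counts(conn)
```

The same query shape applies to the other five tables, since they share identical columns.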