Movie Dialogue Segment Extraction
Sampled test data for evaluation of segmentation algorithms
@kaggle.inteng_moviedialcorpus
This data comes from a project to build a dialogue corpus from movies.
The project is described at the link below, which also links to additional data and a paper:
http://i.yz.yamagata-u.ac.jp/moviedialcorpus/index.html
Three-column CSV files are uploaded. Each row corresponds to one automatically extracted segment. The columns are the beginning time, the ending time, and whether the segment is dialogue or not.
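As a minimal sketch of how such a file might be read, the Python snippet below parses headerless three-column rows and sums the time marked as dialogue. Several points are assumptions and not confirmed by the dataset description: that the files have no header row, that times are in seconds, and that the third column uses 1 for dialogue and 0 otherwise. The function names `load_segments` and `dialogue_seconds` are hypothetical, and only the first sample row reuses values visible in the schema; the other rows are made up for illustration.

```python
import csv
import io


def load_segments(source):
    """Read headerless rows of (begin, end, is_dialogue) from a file-like object.

    Assumption: the third column is 1 for dialogue and 0 otherwise.
    """
    segments = []
    for row in csv.reader(source):
        if not row:
            continue  # skip blank lines
        begin = float(row[0])
        end = float(row[1])
        is_dialogue = bool(int(float(row[2])))
        segments.append((begin, end, is_dialogue))
    return segments


def dialogue_seconds(segments):
    """Total duration of the segments flagged as dialogue (assumed seconds)."""
    return sum(end - begin for begin, end, is_dialogue in segments if is_dialogue)


# Illustrative in-memory data; only the first row mirrors the sample values.
sample = io.StringIO("0.0,46.99,0\n46.99,52.5,1\n52.5,60.0,0\n")
segments = load_segments(sample)
print(len(segments), round(dialogue_seconds(segments), 2))
```

To read a real file, pass an opened file object, for example `open(path, newline="")`, in place of the in-memory sample.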
We extracted dialogue segments from movies based on their audio. We evaluated the data using our own VAD (voice activity detection) algorithm and filtering rules. Accuracy is about 90%, except for music and musical movies, where performance was much worse.
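The description does not say how accuracy was computed. One simple possibility, sketched here purely as an assumption and not as the authors' method, is segment-level accuracy: the fraction of segments whose predicted dialogue flag matches a reference flag. The function name `segment_accuracy` and the label lists are hypothetical.

```python
def segment_accuracy(predicted, reference):
    """Fraction of segments whose predicted flag equals the reference flag.

    Assumption: both lists hold one 0/1 flag per segment, in the same order.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        return 0.0  # no segments to score
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(predicted)


# Made-up flags for five segments; four of the five agree.
predicted_flags = [1, 0, 1, 1, 0]
reference_flags = [1, 0, 0, 1, 0]
print(segment_accuracy(predicted_flags, reference_flags))
```

A duration-weighted variant, which counts matching seconds instead of matching segments, would penalize errors on long segments more heavily; which definition is appropriate depends on how the reference labels were produced.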
Although the performance is not bad, there seems to be considerable room for improvement. We would like to know whether there is a better algorithm for extracting dialogue segments from movies.
CREATE TABLE ia_6 (
"n_0_0" DOUBLE, -- 0.0
"n_46_99" DOUBLE, -- 46.99
"n_0" BIGINT -- 0
);