VGG-Sound: Only Cat And Dog Sounds
Only meows and barks (and hiss, and purr...)
@kaggle.kitonbass_vgg_sound_only_cat_and_dog_sounds
This is a subset of the VGGSound dataset: everything related to cats and dogs, converted to 10-second, 16 kHz mono WAV files.
I made it for my university research, because the original dataset is kind of huge :)
There are also two CSV files with the train/test split, taken from the VGGSound splits. All data is numbered according to the indexes of the original CSV tables.
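If you want to double-check that a clip really is 16 kHz, mono, and about 10 seconds long, a small sketch like the one below should do. It uses only Python's standard `wave` module, which reads plain PCM WAV files. The function names and the 0.05 s tolerance are my own choices for illustration, not something shipped with the dataset.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        channels = wf.getnchannels()
        duration = wf.getnframes() / rate
    return rate, channels, duration


def matches_expected_format(path, rate=16000, channels=1, seconds=10.0, tol=0.05):
    """True if the clip has the given rate and channel count and is about `seconds` long.

    `tol` (in seconds) is an assumed tolerance, since converted clips may be
    off by a few samples.
    """
    r, c, d = describe_wav(path)
    return r == rate and c == channels and abs(d - seconds) <= tol
```

Calling `matches_expected_format("some_clip.wav")` on a file from the dataset (the path here is a placeholder) returns `True` or `False`, which makes it easy to scan a whole folder for oddly converted clips.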
Each line in the CSV files has the following columns:
Index in the original VGGSound (my addition), YouTube ID, start second, label, train/test split.
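As a rough illustration of working with these columns, here is a sketch that counts clips per label using only the standard library. The column names `link`, `time`, `label` and `split` follow the table schema on this page. The sample rows are made up for the example and are not taken from the dataset, and `label_counts` is just a name I picked.

```python
import csv
import io
from collections import Counter


def label_counts(csv_file):
    """Count rows per label in an open CSV file that has a `label` column."""
    reader = csv.DictReader(csv_file)
    return Counter(row["label"] for row in reader)


# Hypothetical rows for illustration only (IDs, times and labels are invented).
sample = io.StringIO(
    "link,time,label,split\n"
    "AAAAAAAAAAA,30,cat meowing,train\n"
    "BBBBBBBBBBB,10,dog barking,train\n"
    "CCCCCCCCCCC,5,cat meowing,train\n"
)
counts = label_counts(sample)
print(counts.most_common())
```

With the real files you would pass an opened CSV instead of `sample`, for example `open(path, newline="")`, where `path` points at one of the two split files.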
Also, some of the video links in the tables (~800 of them) lead to unavailable videos (age-restricted, deleted, etc.). These were not downloaded and are therefore not included, so there is no audio for some indexes.
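Because of those gaps, it may help to separate the indexes that actually have audio from the ones that don't before training. The sketch below assumes each clip is stored as `<index>.wav` inside a single folder. That naming pattern is a guess on my part, so adjust `pattern` to whatever the files are really called. `split_by_availability` is a hypothetical helper, not part of the dataset.

```python
from pathlib import Path


def split_by_availability(indexes, audio_dir, pattern="{index}.wav"):
    """Partition indexes into (available, missing) by checking for a file on disk.

    `pattern` is an assumed file-naming scheme; change it to match the real files.
    """
    audio_dir = Path(audio_dir)
    available, missing = [], []
    for idx in indexes:
        path = audio_dir / pattern.format(index=idx)
        (available if path.is_file() else missing).append(idx)
    return available, missing
```

Feeding it the index column from one of the CSV files gives two lists, so rows without audio can be dropped up front instead of failing in the middle of a data loader.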
A practical example of using the dataset can be found in my VQ-VAE 2 notebook 👨💻.
I've also got this helper notebook 🐱🐶, which shows some simple things you can do with audio data. In particular:
And umm, I could not figure out how to do a proper citation, but here it is from the original VGGSound:
@InProceedings{Chen20,
author = "Honglie Chen and Weidi Xie and Andrea Vedaldi and Andrew Zisserman",
title = "VGGSound: A Large-scale Audio-Visual Dataset",
booktitle = "International Conference on Acoustics, Speech, and Signal Processing (ICASSP)",
year = "2020",
}
CREATE TABLE test (
"unnamed_0_1" BIGINT, -- Unnamed: 0.1
"unnamed_0" BIGINT, -- Unnamed: 0
"link" VARCHAR,
"time" BIGINT,
"label" VARCHAR,
"split" VARCHAR
);
CREATE TABLE train (
"unnamed_0_1" BIGINT, -- Unnamed: 0.1
"unnamed_0" BIGINT, -- Unnamed: 0
"link" VARCHAR,
"time" BIGINT,
"label" VARCHAR,
"split" VARCHAR
);