VGG-Sound: Only Cat And Dog Sounds

Only meows and barks (and hiss, and purr...)

@kaggle.kitonbass_vgg_sound_only_cat_and_dog_sounds

About this Dataset

This is the part of the VGGSound dataset related to cats and dogs, with every clip converted to 10-second, 16 kHz mono WAV.
I made it for my university research, because the original dataset is kind of huge :)

There are also two CSV files with the train/test split, collected from the VGGSound splits. All data is numbered according to the row indexes of the original CSV tables.

Each line in the CSV files has the following columns:
Index in the original VGGSound (my addition), YouTube ID, start seconds, label, train/test split.
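
As a rough illustration of that column layout, here is a minimal sketch that parses such rows using only the Python standard library. It assumes the columns appear in exactly the order listed above and that the index is an integer; the `Clip` type, the `read_clips` helper, and the sample rows are made up for the example and are not part of the dataset.

```python
import csv
import io
from typing import NamedTuple


class Clip(NamedTuple):
    index: int          # row index in the original VGGSound csv
    youtube_id: str
    start_seconds: int
    label: str
    split: str          # "train" or "test"


def read_clips(handle):
    """Parse rows laid out as: index, YouTube ID, start seconds, label, split."""
    clips = []
    for row in csv.reader(handle):
        # Skip blank lines, malformed lines, and a header row if one exists.
        if len(row) != 5 or not row[0].strip().isdigit():
            continue
        index, youtube_id, start, label, split = (field.strip() for field in row)
        clips.append(Clip(int(index), youtube_id, int(start), label, split))
    return clips


# Made-up sample rows standing in for one of the real csv files.
sample = io.StringIO(
    "12,abcdefghijk,30,dog barking,train\n"
    "57,lmnopqrstuv,110,cat purring,test\n"
)
clips = read_clips(sample)
train = [clip for clip in clips if clip.split == "train"]
```

To use it on a real file, pass an open file handle (for example `open(path, newline="")`) instead of the `io.StringIO` sample.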

Also, some of the video links in the tables (~800 of them) lead to unavailable videos (age-restricted, deleted, etc.). These were not downloaded and are therefore not here, so there is no audio for some indexes.
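
Because of these gaps, it can help to check which indexes actually have audio before building a training set. The sketch below is only an assumption about how that might look: it guesses that each clip is stored as `<index>.wav`, which may not match the real file names, so the `pattern` argument is there to be adjusted. The `available_indexes` helper is hypothetical.

```python
from pathlib import Path


def available_indexes(audio_dir, indexes, pattern="{index}.wav"):
    """Split indexes into those with an audio file on disk and those without.

    The "<index>.wav" naming is an assumption; change `pattern` to match
    the actual file names in the dataset.
    """
    audio_dir = Path(audio_dir)
    present, missing = [], []
    for index in indexes:
        target = audio_dir / pattern.format(index=index)
        (present if target.is_file() else missing).append(index)
    return present, missing
```

Calling `available_indexes(folder, indexes_from_csv)` returns two lists, so rows whose index lands in the second list can be dropped before loading any audio.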

An example of real practical use of the dataset can be found in my VQ-VAE 2 notebook 👨‍💻.
I've also got this helper notebook 🐱🐶, which shows some simple things you can do with the audio data. In particular:

  • Some of the audio files are only 9 seconds long: how to pad them
  • How to prepare spectrograms so they can be used as regular pictures
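
For a rough idea of what those two steps can look like, here is a minimal NumPy sketch. It is not the code from the helper notebook: the zero-padding at the end of the clip, the FFT size, the hop length, and the min-max scaling to 8-bit are all assumptions chosen for illustration, and the function names are made up.

```python
import numpy as np

SAMPLE_RATE = 16_000              # the dataset is 16 kHz mono
TARGET_LEN = 10 * SAMPLE_RATE     # 10 seconds of samples


def pad_or_trim(waveform, target_len=TARGET_LEN):
    """Zero-pad a short clip at the end (or cut a long one) to target_len samples."""
    waveform = np.asarray(waveform, dtype=np.float32)
    if len(waveform) >= target_len:
        return waveform[:target_len]
    return np.pad(waveform, (0, target_len - len(waveform)))


def log_spectrogram(waveform, n_fft=512, hop=256):
    """Log-power spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(waveform) - n_fft) // hop
    frames = np.stack(
        [waveform[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # Transpose so rows are frequency bins and columns are time frames.
    return 10.0 * np.log10(power + 1e-10).T


def to_uint8_image(spec):
    """Min-max scale a spectrogram to 0..255 so it behaves like a grayscale picture."""
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo + 1e-12)
    return (scaled * 255).astype(np.uint8)


# A synthetic 9-second tone standing in for one of the shorter clips.
t = np.arange(9 * SAMPLE_RATE) / SAMPLE_RATE
clip = np.sin(2 * np.pi * 440.0 * t)

padded = pad_or_trim(clip)                        # 160000 samples
image = to_uint8_image(log_spectrogram(padded))   # shape (257, 624), dtype uint8
```

With real data, `clip` would be the samples read from a WAV file. Padding with silence keeps every spectrogram the same width, which is what lets them be batched like ordinary images.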

And umm, I could not figure out how to do a proper citation, so here is the one from the original VGGSound:

@InProceedings{Chen20,
  author       = "Honglie Chen and Weidi Xie and Andrea Vedaldi and Andrew Zisserman",
  title        = "VGGSound: A Large-scale Audio-Visual Dataset",
  booktitle    = "International Conference on Acoustics, Speech, and Signal Processing (ICASSP)",
  year         = "2020",
}
