
VGG-Sound: Only Cat And Dog Sounds

Kaggle

@kaggle.kitonbass_vgg_sound_only_cat_and_dog_sounds

Only meows and barks (and hiss, and purr...)

Dataset Description

This is a subset of the VGGSound dataset: everything related to cats and dogs, converted to 10-second, 16 kHz mono WAV files.
I made it for my university research, because the original dataset is kind of huge :)

There are also two CSV files with the train/test split, collected from the VGGSound splits. All data is numbered according to the indexes of the original CSV tables.

Each line in the CSV files has the following columns:
Index in original VGGSound (my addition), YouTube ID, start seconds, label, train/test split.
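
A minimal sketch of reading rows in that column order, using only the Python standard library. The `Clip` and `read_clips` names are hypothetical, the sample rows are made up for illustration, and the sketch assumes the values are comma-separated with labels quoted where needed. Rows whose first field is not an integer (for example a header row, if the files have one) are skipped.

```python
import csv
import io
from typing import NamedTuple, TextIO


class Clip(NamedTuple):
    index: int          # row index in the original VGGSound table
    youtube_id: str
    start_seconds: int
    label: str
    split: str          # "train" or "test"


def read_clips(handle: TextIO) -> list[Clip]:
    """Parse rows laid out as: index, YouTube ID, start seconds, label, split."""
    clips = []
    for row in csv.reader(handle):
        # Skip blank lines, malformed rows, and a possible header row.
        if len(row) != 5 or not row[0].strip().isdigit():
            continue
        index, youtube_id, start, label, split = (field.strip() for field in row)
        clips.append(Clip(int(index), youtube_id, int(start), label, split))
    return clips


# Made-up rows for illustration only; they are not taken from the dataset.
sample = io.StringIO(
    '12,abc123XYZ_0,30,"cat meowing",train\n'
    '57,def456UVW_1,10,"dog barking",test\n'
)
clips = read_clips(sample)
print(clips[0].label, clips[1].split)
```

To read one of the real files, pass an open file handle instead of the `io.StringIO` sample.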

Also, some of the video links in the tables (~800 of them) lead to unavailable videos (age-restricted, deleted, etc.). These were not downloaded and are therefore not here, so there is no audio for some indexes.
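
Because of these gaps, it can help to drop table rows that have no audio file before training. The sketch below assumes each file is named `<index>.wav`, which is a guess at the naming scheme and should be checked against the actual files. `available_indexes` and `keep_available` are hypothetical helper names, and the demo uses empty placeholder files in a temporary directory.

```python
import tempfile
from pathlib import Path


def available_indexes(audio_dir: Path) -> set[int]:
    """Collect indexes from files named '<index>.wav' (naming is an assumption)."""
    return {int(p.stem) for p in audio_dir.glob("*.wav") if p.stem.isdigit()}


def keep_available(indexes, audio_dir):
    """Return only the indexes that have a matching audio file, in input order."""
    present = available_indexes(Path(audio_dir))
    return [i for i in indexes if i in present]


# Demo with empty placeholder files: index 5 stands in for an unavailable video.
with tempfile.TemporaryDirectory() as tmp:
    for i in (3, 7):
        (Path(tmp) / f"{i}.wav").touch()
    kept = keep_available([3, 5, 7], tmp)

print(kept)
```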

An example of real-world use of the dataset can be found in my VQ-VAE 2 notebook 👨‍💻.
I've also got this helper notebook 🐱🐶, which shows some simple things you can do with audio data. In particular:

  • Some of the audio files are 9 seconds long: how to pad them
  • How to prepare spectrograms so they can be used as regular pictures
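
Both steps can be sketched with NumPy alone. This is not the code from the helper notebook, just one plausible approach: zero-pad at the end up to 10 s at 16 kHz (160,000 samples), then take a log-magnitude short-time Fourier transform and scale it to 8-bit values so it can be handled like a grayscale picture. The function names and the `n_fft` and `hop` values are assumptions chosen for illustration, and the input is a synthetic tone rather than a file from the dataset.

```python
import numpy as np

SAMPLE_RATE = 16_000
TARGET_LEN = 10 * SAMPLE_RATE  # 10 seconds of samples


def pad_or_trim(wave, target_len=TARGET_LEN):
    """Zero-pad at the end (or cut) so the clip has exactly target_len samples."""
    wave = np.asarray(wave, dtype=np.float32)
    if len(wave) >= target_len:
        return wave[:target_len]
    return np.pad(wave, (0, target_len - len(wave)))


def log_spectrogram_image(wave, n_fft=512, hop=256):
    """Log-magnitude STFT scaled to uint8, shaped (frequency bins, time frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(wave) - n_fft) // hop
    frames = np.stack(
        [wave[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    log_mag = np.log1p(magnitude).T  # transpose to frequency x time
    span = log_mag.max() - log_mag.min()
    scaled = (log_mag - log_mag.min()) / (span if span > 0 else 1.0)
    return (scaled * 255).astype(np.uint8)


# Synthetic 9-second 440 Hz tone standing in for a short clip.
t = np.arange(9 * SAMPLE_RATE) / SAMPLE_RATE
short = np.sin(2 * np.pi * 440 * t).astype(np.float32)

padded = pad_or_trim(short)
image = log_spectrogram_image(padded)
print(padded.shape, image.shape, image.dtype)
```

A mel-scaled spectrogram from an audio library is a common alternative; the plain STFT is used here only to keep the example dependency-light.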

And umm, I could not figure out how to do a proper citation, but here it is from the original VGGSound:

@InProceedings{Chen20,
  author       = "Honglie Chen and Weidi Xie and Andrea Vedaldi and Andrew Zisserman",
  title        = "VGGSound: A Large-scale Audio-Visual Dataset",
  booktitle    = "International Conference on Acoustics, Speech, and Signal Processing (ICASSP)",
  year         = "2020",
}
