5.5k high-quality music captions written by musicians

The MusicCaps dataset contains 5,521 music examples, each labeled with an English aspect list and a free-text caption written by musicians.

An example aspect list is "pop, tinny wide hi hats, mellow piano melody, high pitched female vocal melody, sustained pulsating synth lead".
The caption consists of multiple sentences describing the music, e.g., "A low sounding male voice is rapping over a fast paced drums playing a reggaeton beat along with a bass. Something like a guitar is playing the melody along. This recording is of poor audio-quality. In the background a laughter can be noticed. This song may be playing in a bar."
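To make the two label types concrete, here is a minimal Python sketch that holds one labeled example in memory. The class name `MusicCapsExample`, its field names, and the helper `parse_aspect_list` are hypothetical and chosen only for illustration; they are not taken from the dataset's files. The sketch also assumes the aspect list is stored as one comma-separated string, as in the example above.

```python
from dataclasses import dataclass


@dataclass
class MusicCapsExample:
    """Hypothetical container for one labeled clip (field names are assumptions)."""
    aspects: list[str]  # short descriptive phrases
    caption: str        # multi-sentence free-text description


def parse_aspect_list(raw: str) -> list[str]:
    """Split a comma-separated aspect string into trimmed, non-empty phrases."""
    return [part.strip() for part in raw.split(",") if part.strip()]


raw_aspects = (
    "pop, tinny wide hi hats, mellow piano melody, "
    "high pitched female vocal melody, sustained pulsating synth lead"
)
caption = (
    "A low sounding male voice is rapping over a fast paced drums playing "
    "a reggaeton beat along with a bass. Something like a guitar is playing "
    "the melody along. This recording is of poor audio-quality. In the "
    "background a laughter can be noticed. This song may be playing in a bar."
)

example = MusicCapsExample(aspects=parse_aspect_list(raw_aspects), caption=caption)
print(len(example.aspects), "aspects; first:", example.aspects[0])
```

Note that the aspect list and the caption above come from two different examples in the description; they are paired here only to show the shape of a record.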

The text focuses solely on how the music sounds, not on metadata such as the artist name.

The labeled examples are 10-second music clips from the AudioSet dataset (2,858 from the eval split and 2,663 from the train split).
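As a quick sanity check on these figures, the short Python sketch below confirms that the two split counts add up to the 5,521 examples stated earlier, and shows how a 10-second window could be derived from a clip's start offset. The names `SPLIT_COUNTS`, `CLIP_SECONDS`, and `clip_window`, and the idea of a per-clip start offset in seconds, are assumptions made for illustration.

```python
CLIP_SECONDS = 10  # clip length stated in the description

# Number of labeled clips drawn from each AudioSet split.
SPLIT_COUNTS = {"eval": 2858, "train": 2663}

total = sum(SPLIT_COUNTS.values())
assert total == 5521, "split counts should add up to the dataset size"


def clip_window(start_s: int) -> tuple[int, int]:
    """Return (start, end) in seconds for a clip starting at start_s (hypothetical helper)."""
    return (start_s, start_s + CLIP_SECONDS)


print(total)            # 5521
print(clip_window(30))  # (30, 40)
```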

Please cite the corresponding paper when using this dataset: http://arxiv.org/abs/2301.11325 (DOI: 10.48550/arXiv.2301.11325)
