Musical Improvisation Dataset
Structured MIDI-like sequences with tempo, key, genre, style, and cluster labels
@kaggle.ziya07_musical_improvisation_dataset
This dataset contains 2765 rows of musical phrase data designed to support research, education, and creative projects in music technology. Each entry represents a unique musical phrase with detailed attributes describing its melodic, rhythmic, and expressive qualities.
The dataset includes note sequences, durations, velocities, tempo, musical key, genre, style label, and a target cluster label. These features reflect structured, MIDI-like representations of music suitable for analysis, classification, and generative tasks.
✅ Key Features
2765 musical phrases
MIDI-style note sequences
Note durations and velocities
Tempo and musical key annotations
Genre and style labels
Cluster label as a target column
CSV format for easy use in data analysis projects
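Because the data ships as CSV, it can be read with Python's standard library alone. The sketch below is illustrative only: it assumes the header row uses the column names documented for this dataset, and the two sample rows (including the style labels `bebop` and `baroque`) are invented for the example rather than taken from the data.

```python
import csv
import io

# Hypothetical two-row sample that mimics the documented columns.
# The values are made up for illustration and are NOT from the dataset.
SAMPLE_CSV = """phrase_id,note_sequence,duration_sequence,velocity_sequence,tempo,key,genre,style_label,cluster_label
1,60 62 64 65,1.0 0.5 0.5 2.0,80 72 90 64,120.0,Cmaj,jazz,bebop,0
2,57 60 64,1.0 1.0 2.0,70 70 85,90.0,Amin,classical,baroque,1
"""


def load_phrases(handle):
    """Read phrase rows from an open CSV file handle into a list of dicts."""
    rows = []
    for row in csv.DictReader(handle):
        # csv yields strings, so convert the numeric columns explicitly.
        row["phrase_id"] = int(row["phrase_id"])
        row["tempo"] = float(row["tempo"])
        row["cluster_label"] = int(row["cluster_label"])
        rows.append(row)
    return rows


phrases = load_phrases(io.StringIO(SAMPLE_CSV))
print(len(phrases), phrases[0]["genre"], phrases[1]["tempo"])
```

To read the real file, pass `open(path, newline="")` instead of the `io.StringIO` wrapper, where `path` is wherever the CSV was saved locally.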
🗂️ Example Columns
Column             Description
phrase_id          Unique phrase identifier
note_sequence      MIDI note numbers (space-separated)
duration_sequence  Note durations in beats
velocity_sequence  MIDI velocities
tempo              Beats per minute (BPM)
key                Musical key (e.g., Cmaj, Amin)
genre              Genre label (e.g., jazz, classical)
style_label        Style or idiom within the genre
cluster_label      Target label for grouping or classification
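The sequence columns are stored as strings, so they need parsing before analysis. The following is a minimal sketch under several assumptions that the column descriptions do not confirm: that `duration_sequence` and `velocity_sequence` are space-separated like `note_sequence`, that the three sequences have equal length, and that notes play one after another without overlap. The helper names `parse_phrase` and `phrase_seconds` are hypothetical.

```python
def parse_phrase(note_sequence, duration_sequence, velocity_sequence):
    """Split the three sequence strings into (note, beats, velocity) tuples."""
    notes = [int(tok) for tok in note_sequence.split()]
    # Assumption: durations and velocities are space-separated as well.
    durations = [float(tok) for tok in duration_sequence.split()]
    velocities = [int(tok) for tok in velocity_sequence.split()]
    # Assumption: one duration and one velocity per note.
    if not (len(notes) == len(durations) == len(velocities)):
        raise ValueError("sequence lengths differ")
    return list(zip(notes, durations, velocities))


def phrase_seconds(durations_in_beats, tempo_bpm):
    """Convert a total length in beats to seconds: beats * 60 / BPM."""
    return sum(durations_in_beats) * 60.0 / tempo_bpm


# Made-up example values, not taken from the dataset.
events = parse_phrase("60 62 64 65", "1.0 0.5 0.5 2.0", "80 72 90 64")
total = phrase_seconds([beats for _, beats, _ in events], 120.0)
print(events[0], total)
```

With these sample values the phrase spans 4 beats, which at 120 BPM is 2 seconds. If the real data uses a different separator or allows overlapping notes, the parsing and the length calculation would need adjusting.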
CREATE TABLE maestro_v3_0_0 (
    "canonical_composer" VARCHAR,
    "canonical_title" VARCHAR,
    "split" VARCHAR,
    "year" BIGINT,
    "midi_filename" VARCHAR,
    "audio_filename" VARCHAR,
    "duration" DOUBLE
);
CREATE TABLE musical_improvisation_dataset (
    "phrase_id" BIGINT,
    "note_sequence" VARCHAR,
    "duration_sequence" VARCHAR,
    "velocity_sequence" VARCHAR,
    "tempo" DOUBLE,
    "key" VARCHAR,
    "genre" VARCHAR,
    "style_label" VARCHAR,
    "cluster_label" BIGINT
);
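The `musical_improvisation_dataset` schema can be tried out locally with a quick aggregate query. The sketch below uses Python's built-in `sqlite3` module as a stand-in engine, which is an assumption made for self-containment and not necessarily the engine this schema was written for; SQLite accepts the same column type names through its type-affinity rules. The three inserted rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same columns as the schema above, created in an in-memory SQLite database.
conn.execute("""
CREATE TABLE musical_improvisation_dataset (
    "phrase_id" BIGINT,
    "note_sequence" VARCHAR,
    "duration_sequence" VARCHAR,
    "velocity_sequence" VARCHAR,
    "tempo" DOUBLE,
    "key" VARCHAR,
    "genre" VARCHAR,
    "style_label" VARCHAR,
    "cluster_label" BIGINT
)
""")

# Hypothetical rows; the values are NOT taken from the dataset.
rows = [
    (1, "60 62 64 65", "1.0 0.5 0.5 2.0", "80 72 90 64", 120.0, "Cmaj", "jazz", "bebop", 0),
    (2, "57 60 64", "1.0 1.0 2.0", "70 70 85", 90.0, "Amin", "classical", "baroque", 1),
    (3, "60 64 67", "0.5 0.5 1.0", "90 88 95", 140.0, "Cmaj", "jazz", "swing", 0),
]
conn.executemany(
    "INSERT INTO musical_improvisation_dataset VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
    rows,
)

# Count phrases and average the tempo per genre.
result = conn.execute("""
SELECT "genre", COUNT(*) AS phrases, AVG("tempo") AS mean_tempo
FROM musical_improvisation_dataset
GROUP BY "genre"
ORDER BY "genre"
""").fetchall()
print(result)
conn.close()
```

The same `SELECT` statement should carry over to other SQL engines with little or no change, since it relies only on standard grouping and aggregate functions.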