Baselight

Steven Wilson Detector

Finding songs that match Steven Wilson's style

@kaggle.danielgrijalvas_steven_wilson_analysis

About this Dataset

Context

I'll get straight to the point: I'm obsessed with Steven Wilson. I can't help it, I love his music. And I need more music in a similar (almost identical) style. So the problem I'm trying to solve here is: how can I find songs that match SW's style with almost zero error?

I'm aware that Spotify gives you recommendations, like similar artists and such. But that's not enough -- Spotify always gives you varied music. Progressive rock is a very broad genre, and I just want those songs that sound very, very similar to Steven Wilson or Porcupine Tree.

BTW, Porcupine Tree was Steven Wilson's band, and the two sound practically the same. I did an analysis comparing their musical similarities.

Content

I'm using the Spotify web API to get the data. They have an amazingly rich amount of information, especially the audio features.

This repository has 5 datasets:

  • StevenWilson.csv: contains Steven Wilson discography (65 songs)
  • PorcupineTree.csv: 65 Porcupine Tree songs
  • Complete Steven Wilson.csv: a merge of the previous two datasets (Steven Wilson + Porcupine Tree)
  • Train.csv: 200 songs used to train KNN. 100 are Steven Wilson songs and the rest are totally different songs
  • Test.csv: 100 songs that may or may not be like Steven Wilson's. I picked these songs from various prog rock playlists and my Discover Weekly from Spotify.
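To make the Train/Test split concrete, here is a minimal sketch of k-nearest-neighbours classification in plain Python. It is not the code behind this dataset: the feature values are made up, the choice of three features and k=3 is arbitrary, and the label encoding (1 = sounds like Steven Wilson, 0 = does not) is an assumption.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training rows closest to query."""
    # Euclidean distance from the query to every training row
    distances = [
        (math.dist(row, query), label)
        for row, label in zip(train_X, train_y)
    ]
    # Keep the k nearest neighbours and take a majority vote
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Made-up (energy, valence, acousticness) rows, NOT taken from Train.csv.
# Assumed encoding: 1 = sounds like Steven Wilson, 0 = does not.
train_X = [
    (0.60, 0.20, 0.30),
    (0.55, 0.25, 0.35),
    (0.65, 0.15, 0.25),
    (0.90, 0.90, 0.05),
    (0.85, 0.80, 0.10),
    (0.95, 0.85, 0.02),
]
train_y = [1, 1, 1, 0, 0, 0]

print(knn_predict(train_X, train_y, (0.58, 0.22, 0.32)))
print(knn_predict(train_X, train_y, (0.88, 0.82, 0.07)))
```

In practice the training rows would come from Train.csv and the queries from Test.csv, probably with more features than the three used here.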

Also, so far I've made two kernels:

Data

There are 21 or 22 columns, depending on the dataset.

Numerical: these columns were scraped using get_audio_features from the Spotify API.

  • acousticness: a confidence measure from 0.0 to 1.0 of whether the track is acoustic; 1.0 represents high confidence the track is acoustic
  • danceability: describes how suitable a track is for dancing; a value of 0.0 is least danceable and 1.0 is most danceable
  • duration_ms: the duration of the track in milliseconds
  • energy: a measure from 0.0 to 1.0 representing perceived intensity and activity
  • instrumentalness: predicts whether a track contains no vocals; values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0
  • liveness: detects the presence of an audience in the recording; 1.0 represents high confidence that the track was performed live
  • loudness: the overall loudness of a track in decibels (dB)
  • speechiness: detects the presence of spoken words in a track; the more exclusively speech-like the recording (e.g. talk show), the closer to 1.0 the attribute value
  • tempo: the overall estimated tempo of a track in beats per minute (BPM)
  • valence: a measure from 0.0 to 1.0 describing the musical positiveness; tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry)
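Most of these features already lie between 0.0 and 1.0, but loudness (dB), tempo (BPM) and duration_ms do not, so a distance-based method such as KNN would be dominated by the large-valued columns unless they are rescaled. The sketch below shows min-max scaling as one possible fix; the helper name and the sample rows are made up for illustration, and nothing here is confirmed to be the preprocessing used for this dataset.

```python
def min_max_scale(rows):
    """Rescale every column of rows to [0, 1]; constant columns become 0.0."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # Guard against division by zero when a column never varies
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled

# Made-up (energy, loudness in dB, tempo in BPM) rows, not real tracks
rows = [
    (0.40, -12.0, 90.0),
    (0.80, -6.0, 150.0),
    (0.60, -9.0, 120.0),
]

for scaled_row in min_max_scale(rows):
    print(scaled_row)
```

The minimum and maximum should be computed on the training rows only and then reused for the test rows, so that both sets share the same scale.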

Categorical: these features are categories represented as numbers.

  • key: the musical key the track is in. e.g. 0 = C, 1 = C♯/D♭, 2 = D, and so on
  • mode: indicates the modality (major or minor); major is represented by 1 and minor is 0
  • time_signature: an estimated overall time signature of a track; it is a notational convention to specify how many beats are in each bar (or measure). e.g. 4/4, 3/4, 8/4 etc.
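Because key and mode are stored as integers, a small lookup makes them readable. The sketch below follows the encoding described above (0 = C, 1 = C♯/D♭, and so on; mode 1 = major, 0 = minor). The function name is hypothetical, and treating any value outside 0 to 11 as "unknown" is an assumption about how undetected keys might appear.

```python
# Pitch-class names indexed by the integer in the key column
PITCH_CLASSES = [
    "C", "C♯/D♭", "D", "D♯/E♭", "E", "F",
    "F♯/G♭", "G", "G♯/A♭", "A", "A♯/B♭", "B",
]

def describe_key(key, mode):
    """Turn the numeric key and mode columns into a label such as 'D minor'."""
    if not 0 <= key <= 11:
        # Assumption: out-of-range values mean no key was detected
        return "unknown"
    modality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {modality}"

print(describe_key(2, 0))
print(describe_key(0, 1))
```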

Strings: these fields are mostly useless (except for name, album, artist and lyrics)

  • id: the Spotify ID of the song
  • name: name of the song
  • album: album of the song
  • artist: artist of the song
  • uri: the Spotify URI of the song
  • type: the type of the Spotify object
  • track_href: the Spotify API link of the song
  • analysis_url: the URL used for getting the audio features
  • lyrics: lyrics of the song in lower case

Future

Ever been obsessed with a song? An album? An artist? I'm planning on building a web app that solves this. It will help you find music extremely similar to the music you already love.

Tables

Complete Steven Wilson

@kaggle.danielgrijalvas_steven_wilson_analysis.complete_steven_wilson
  • 118.78 KB
  • 236 rows
  • 22 columns

CREATE TABLE complete_steven_wilson (
  "acousticness" DOUBLE,
  "album" VARCHAR,
  "analysis_url" VARCHAR,
  "artist" VARCHAR,
  "danceability" DOUBLE,
  "duration_ms" DOUBLE,
  "energy" DOUBLE,
  "id" VARCHAR,
  "instrumentalness" DOUBLE,
  "key" BIGINT,
  "liveness" DOUBLE,
  "loudness" DOUBLE,
  "mode" BIGINT,
  "name" VARCHAR,
  "speechiness" DOUBLE,
  "tempo" DOUBLE,
  "time_signature" BIGINT,
  "track_href" VARCHAR,
  "type" VARCHAR,
  "uri" VARCHAR,
  "valence" DOUBLE,
  "lyrics" VARCHAR
);

Porcupine Tree

@kaggle.danielgrijalvas_steven_wilson_analysis.porcupine_tree
  • 87.05 KB
  • 157 rows
  • 22 columns

CREATE TABLE porcupine_tree (
  "acousticness" DOUBLE,
  "album" VARCHAR,
  "analysis_url" VARCHAR,
  "artist" VARCHAR,
  "danceability" DOUBLE,
  "duration_ms" DOUBLE,
  "energy" DOUBLE,
  "id" VARCHAR,
  "instrumentalness" DOUBLE,
  "key" BIGINT,
  "liveness" DOUBLE,
  "loudness" DOUBLE,
  "mode" BIGINT,
  "name" VARCHAR,
  "speechiness" DOUBLE,
  "tempo" DOUBLE,
  "time_signature" BIGINT,
  "track_href" VARCHAR,
  "type" VARCHAR,
  "uri" VARCHAR,
  "valence" DOUBLE,
  "lyrics" VARCHAR
);

Steven Wilson

@kaggle.danielgrijalvas_steven_wilson_analysis.steven_wilson
  • 52.99 KB
  • 79 rows
  • 22 columns

CREATE TABLE steven_wilson (
  "acousticness" DOUBLE,
  "album" VARCHAR,
  "analysis_url" VARCHAR,
  "artist" VARCHAR,
  "danceability" DOUBLE,
  "duration_ms" DOUBLE,
  "energy" DOUBLE,
  "id" VARCHAR,
  "instrumentalness" DOUBLE,
  "key" BIGINT,
  "liveness" DOUBLE,
  "loudness" DOUBLE,
  "mode" BIGINT,
  "name" VARCHAR,
  "speechiness" DOUBLE,
  "tempo" DOUBLE,
  "time_signature" BIGINT,
  "track_href" VARCHAR,
  "type" VARCHAR,
  "uri" VARCHAR,
  "valence" DOUBLE,
  "lyrics" VARCHAR
);

Test

@kaggle.danielgrijalvas_steven_wilson_analysis.test
  • 36.43 KB
  • 100 rows
  • 21 columns

CREATE TABLE test (
  "acousticness" DOUBLE,
  "album" VARCHAR,
  "analysis_url" VARCHAR,
  "danceability" DOUBLE,
  "duration_ms" BIGINT,
  "energy" DOUBLE,
  "id" VARCHAR,
  "instrumentalness" DOUBLE,
  "key" BIGINT,
  "liveness" DOUBLE,
  "loudness" DOUBLE,
  "mode" BIGINT,
  "name" VARCHAR,
  "speechiness" DOUBLE,
  "tempo" DOUBLE,
  "time_signature" BIGINT,
  "track_href" VARCHAR,
  "type" VARCHAR,
  "uri" VARCHAR,
  "valence" DOUBLE,
  "class" DOUBLE
);

Training

@kaggle.danielgrijalvas_steven_wilson_analysis.training
  • 64.31 KB
  • 238 rows
  • 21 columns

CREATE TABLE training (
  "acousticness" DOUBLE,
  "album" VARCHAR,
  "analysis_url" VARCHAR,
  "danceability" DOUBLE,
  "duration_ms" DOUBLE,
  "energy" DOUBLE,
  "id" VARCHAR,
  "instrumentalness" DOUBLE,
  "key" BIGINT,
  "liveness" DOUBLE,
  "loudness" DOUBLE,
  "mode" BIGINT,
  "name" VARCHAR,
  "speechiness" DOUBLE,
  "tempo" DOUBLE,
  "time_signature" BIGINT,
  "track_href" VARCHAR,
  "type" VARCHAR,
  "uri" VARCHAR,
  "valence" DOUBLE,
  "class" BIGINT
);
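Note that "class" is typed BIGINT in the training table but DOUBLE in the test table, so a loader should accept both "1" and "1.0". The sketch below reads a CSV export with Python's csv module and splits it into feature rows and labels. The function name, the choice of four feature columns, the in-memory sample rows and the handling of blank labels are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical subset of the numeric columns; any of them could be used
FEATURES = ["acousticness", "danceability", "energy", "valence"]

def load_features_and_labels(handle, feature_names=FEATURES):
    """Read CSV rows into a list of feature vectors and a list of labels."""
    X, y = [], []
    for record in csv.DictReader(handle):
        X.append([float(record[name]) for name in feature_names])
        raw = record["class"]
        # Go through float so that both "1" and "1.0" parse;
        # a blank label is kept as None (assumed, not confirmed by the data)
        y.append(int(float(raw)) if raw else None)
    return X, y

# Made-up rows standing in for a real export of the training or test table
sample = io.StringIO(
    "name,acousticness,danceability,energy,valence,class\n"
    "Hypothetical Song A,0.12,0.45,0.70,0.30,1\n"
    "Hypothetical Song B,0.80,0.60,0.20,0.90,0.0\n"
)
X, y = load_features_and_labels(sample)
print(X)
print(y)
```

With a real file, `sample` would be replaced by an open file handle, for example `open("Train.csv", newline="")`.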
