About
This dataset was created as part of a school project for the PSZ (Pronalaženje Skrivenog Znanja, en. Data Mining and Semantic Web) course, which is part of the Master's studies at the School of Electrical Engineering, University of Belgrade.
The data was gathered by crawling the Discogs website for albums published in Serbia or Yugoslavia, along with the artists who published them and the songs they contain. The raw data (HTML pages) was then preprocessed and stored in the .csv files present in this dataset.
Content
The dataset consists of the following files:
- Serbia_albums_list.csv - contains a list of URLs for all albums on Discogs published in Serbia.
- Yugoslavia_albums_list.csv - contains a list of URLs for all albums on Discogs published in Yugoslavia.
- albums.csv - contains the data scraped from the album pages at those URLs.
- artists.csv - contains the scraped data for the artists who published the albums.
- songs.csv - contains the scraped data for the songs on the albums.
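
As a starting point for exploring the files listed above, the following is a minimal sketch that reports each file's header row and number of data rows. It uses only the Python standard library. The file names come from the list above; the column layout is not documented here, so the code assumes only that each file is UTF-8 encoded and that its first row is a header. The names `summarize_csv` and `summarize_dataset` are hypothetical helpers introduced for illustration.

```python
import csv
from pathlib import Path

# File names as listed in this description. Their columns are not documented
# here, so the code only inspects whatever header row each file has.
DATASET_FILES = [
    "Serbia_albums_list.csv",
    "Yugoslavia_albums_list.csv",
    "albums.csv",
    "artists.csv",
    "songs.csv",
]


def summarize_csv(path):
    """Return (header, data_row_count) for one CSV file.

    Assumption: the file is UTF-8 encoded and its first row is a header.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader, [])           # empty list if the file is empty
        row_count = sum(1 for _ in reader)  # remaining rows are data rows
    return header, row_count


def summarize_dataset(data_dir="."):
    """Summarize every listed file found in data_dir; missing files are skipped."""
    summary = {}
    for name in DATASET_FILES:
        path = Path(data_dir) / name
        if path.is_file():
            summary[name] = summarize_csv(path)
    return summary


if __name__ == "__main__":
    # Run from the directory that holds the downloaded .csv files.
    for name, (header, rows) in summarize_dataset().items():
        print(f"{name}: {rows} rows, columns = {header}")
```

Running the script in the directory containing the dataset prints one line per file that is present. Inspecting the headers this way is a reasonable first step before deciding how the albums, artists, and songs files relate to one another.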