The dataset comprises advertising data scraped from the nehnutelnosti.sk website in mid-November 2023. Nehnutelnosti specializes in real estate listings and services. In keeping with ethical scraping principles, the site's robots.txt was respected, among other precautions.
All records are apartment listings; houses and other types of real estate are not included. The dataset contains the following columns:
- name_nsi: Name of the commune
- price: Price in EUR
- index: "Index of Living," ranging from 0 to 10, calculated by the Slovak startup City Performer (https://cityperformer.com/). It considers six categories: environment, quality_of_living, safety, transport, services, and relax.
- quality_of_living: Component of the index
- safety: Component of the index
- transport: Component of the index
- services: Component of the index
- relax: Component of the index
- condition: Condition of the listed apartment
- area: Area in square meters
- energy_costs: Energy costs in EUR
- provision: Binary indicator; 1 if the agency's commission (provision) is included in the price, else 0
- certificate: Energy certificate of the building
- construction_type: Construction type of the building
- orientation: Geographical orientation
- year_built: Year of construction
- last_reconstruction: Year of the last reconstruction (no specification of what reconstruction means)
- total_floors: Total number of floors in the building
- floor: Floor on which the listed apartment is located
- lift: Binary indicator; 1 if the building has a lift, else 0
- balconies: Number of balconies
- loggia: Number of loggias
- cellar: Binary indicator; 1 if the building has a cellar, else 0
- type: Type of apartment
- rooms: Number of rooms
- district: District where the commune belongs
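The column list above can be checked against a loaded table before any analysis starts. The following is a minimal sketch in Python with pandas, not part of the original dataset tooling: the helper names `missing_columns` and `add_price_per_m2`, the derived `price_per_m2` column, and the sample rows are all assumptions made up for illustration. Because the data is raw, the sketch coerces `price` and `area` to numbers and leaves unparseable values as NaN instead of failing.

```python
import pandas as pd

# Column names exactly as listed in the description above.
EXPECTED_COLUMNS = [
    "name_nsi", "price", "index", "quality_of_living", "safety", "transport",
    "services", "relax", "condition", "area", "energy_costs", "provision",
    "certificate", "construction_type", "orientation", "year_built",
    "last_reconstruction", "total_floors", "floor", "lift", "balconies",
    "loggia", "cellar", "type", "rooms", "district",
]


def missing_columns(df: pd.DataFrame) -> list[str]:
    """Return the documented columns that are absent from df."""
    return [c for c in EXPECTED_COLUMNS if c not in df.columns]


def add_price_per_m2(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with numeric price/area and a derived price_per_m2 column."""
    out = df.copy()
    # Values that cannot be parsed as numbers become NaN.
    out["price"] = pd.to_numeric(out["price"], errors="coerce")
    out["area"] = pd.to_numeric(out["area"], errors="coerce")
    # Mask non-positive areas so the ratio is NaN rather than infinite.
    out["price_per_m2"] = out["price"] / out["area"].where(out["area"] > 0)
    return out


# Made-up rows for illustration only; these are NOT values from the dataset.
sample = pd.DataFrame({
    "name_nsi": ["A", "B", "C"],
    "price": ["100000", "not given", 150000],
    "area": [50, 60, 0],
})

result = add_price_per_m2(sample)
absent = missing_columns(sample)
```

With the real file, `sample` would be replaced by the table loaded from wherever the data is stored locally; the file name is not given here, so it is left out.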
This dataset has not been cleaned and is intentionally left raw. It is meant to serve as a practical resource for learning essential data science skills such as data cleaning, exploratory data analysis (EDA), and training predictive models. Embrace the opportunity to enhance your skills while exploring the nuances of real-world data.
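As one possible starting point for the cleaning step, the sketch below flags rows whose fields contradict each other. The three checks are assumptions about what might be worth a manual look; the description does not state whether such inconsistencies occur in the data. The function name `flag_suspect_rows` and the sample rows are hypothetical.

```python
import pandas as pd


def flag_suspect_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Return one boolean column per check, True where a row looks inconsistent."""
    cols = ["floor", "total_floors", "year_built", "last_reconstruction", "lift"]
    # Coerce to numbers; unparseable or missing values become NaN.
    num = df[cols].apply(pd.to_numeric, errors="coerce")

    flags = pd.DataFrame(index=df.index)
    # Comparisons involving NaN evaluate to False, so missing values are not flagged.
    flags["floor_above_total"] = num["floor"] > num["total_floors"]
    flags["reconstructed_before_built"] = num["last_reconstruction"] < num["year_built"]
    # lift is documented as a 0/1 indicator; flag any other non-missing value.
    flags["lift_not_binary"] = num["lift"].notna() & ~num["lift"].isin([0, 1])
    return flags


# Made-up rows for illustration only; these are NOT values from the dataset.
sample = pd.DataFrame({
    "floor": [3, 9, 2],
    "total_floors": [5, 8, 4],
    "year_built": [1980, 1975, 2005],
    "last_reconstruction": [2010, None, 1999],
    "lift": [1, 0, 2],
})

flags = flag_suspect_rows(sample)
suspect = sample[flags.any(axis=1)]
```

Flagging rather than dropping keeps the decision with the analyst: a flagged row may be a data entry error or a listing that uses a different convention, for example for floor numbering.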
Feedback Request:
If you find this dataset valuable for your work or studies, I kindly ask you to take a moment to upvote and leave comments. Your feedback is crucial in enhancing the quality and usefulness of this resource for the community.