Dataset Overview
This dataset contains 4,440 real estate listings from Utah, described by 14 columns. The columns cover property type, description, year built, number of bedrooms and bathrooms, garage spaces, lot size, square footage, stories, listing price, listing status, and the date the property was last sold. The data was collected from Realtor.com using an API provided by Apify.
Data Science Applications
With 4,440 entries and a mix of numeric, categorical, text, and date columns, this dataset suits a range of data science applications, including but not limited to:
- Regression Analysis: Predict property listing prices based on features like square footage, number of bedrooms and bathrooms, year built, and lot size.
- Classification: Classify properties by type (the type column) or into price ranges derived from listPrice.
- Time Series Analysis: Analyze trends in property sales over time using the lastSoldOn column.
- Feature Engineering: Create new features such as price per square foot or age of the property at the time of sale to enhance predictive models.
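As a minimal sketch of the feature engineering idea above, the following computes price per square foot and age at the time of sale for rows held as plain dictionaries. The helper name engineer_features and the sample rows are hypothetical, and the code assumes lastSoldOn is stored as an ISO date string (YYYY-MM-DD), which should be checked against the actual file.

```python
from datetime import date


def engineer_features(row):
    """Return a copy of `row` with price_per_sqft and age_at_sale added.

    Missing or zero inputs yield None instead of raising an error.
    """
    out = dict(row)

    price, sqft = row.get("listPrice"), row.get("sqft")
    out["price_per_sqft"] = round(price / sqft, 2) if price and sqft else None

    sold, built = row.get("lastSoldOn"), row.get("year_built")
    if sold and built:
        # Assumption: lastSoldOn is an ISO date string such as "2018-06-15".
        out["age_at_sale"] = date.fromisoformat(sold).year - int(built)
    else:
        out["age_at_sale"] = None
    return out


# Hypothetical rows for illustration only; not taken from the dataset.
sample = [
    {"type": "single_family", "listPrice": 450000, "sqft": 1800,
     "year_built": 1995, "lastSoldOn": "2018-06-15"},
    {"type": "land", "listPrice": 90000, "sqft": None,
     "year_built": None, "lastSoldOn": None},
]
features = [engineer_features(r) for r in sample]
```

Returning None for rows with missing inputs (land listings, for example, may have no square footage or build year) keeps the derived columns aligned with the original rows, so a later modeling step can decide whether to drop or impute them.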
Column Descriptors
- type: Type of property (e.g., single_family, land)
- text: Description of the property
- year_built: Year the property was built
- beds: Number of bedrooms
- baths: Total number of bathrooms
- baths_full: Number of full bathrooms
- baths_half: Number of half bathrooms
- garage: Number of garage spaces
- lot_sqft: Lot size in square feet
- sqft: Property size in square feet
- stories: Number of stories
- lastSoldOn: Date the property was last sold
- listPrice: Listing price of the property
- status: Current status of the property (e.g., for_sale)
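The column list above can be turned into a small typed loader. The sketch below uses only Python's standard csv module and reads from an in-memory sample; the helper names (parse_row, load_listings), the sample rows, and the choice to treat blank cells as missing values are all assumptions for illustration, not a description of how the published file is encoded.

```python
import csv
import io

# Column names come from the descriptors above; the int/float split is an assumption.
INT_COLUMNS = ("year_built", "beds", "baths", "baths_full", "baths_half",
               "garage", "stories")
FLOAT_COLUMNS = ("lot_sqft", "sqft", "listPrice")


def parse_row(raw):
    """Convert the numeric columns of one CSV row; blank cells become None."""
    row = dict(raw)
    for name in INT_COLUMNS + FLOAT_COLUMNS:
        value = (raw.get(name) or "").strip()
        if not value:
            row[name] = None
        elif name in INT_COLUMNS:
            row[name] = int(float(value))  # tolerates values exported as "3.0"
        else:
            row[name] = float(value)
    return row


def load_listings(handle):
    """Read every row from an open CSV file handle into a list of dicts."""
    return [parse_row(r) for r in csv.DictReader(handle)]


# Hypothetical two-row extract for illustration only; not real listings.
SAMPLE_CSV = """type,text,year_built,beds,baths,baths_full,baths_half,garage,lot_sqft,sqft,stories,lastSoldOn,listPrice,status
single_family,Sample description,1995.0,3,2,2,,2,8712,1800,2,2018-06-15,450000,for_sale
land,Sample lot,,,,,,,43560,,,,90000,for_sale
"""
listings = load_listings(io.StringIO(SAMPLE_CSV))
```

To read the real file, pass an open file handle to load_listings instead of the in-memory sample. The text, type, status, and lastSoldOn columns are left as strings so that date parsing and categorical encoding can be handled separately.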
Ethically Mined Data
This dataset was ethically mined from Realtor.com using an API provided by Apify. The data collection process ensured compliance with ethical standards and respect for the source of the information. The dataset is intended for educational and analytical purposes, promoting transparency and responsible data use.
Acknowledgements
- Apify: For providing the API used to mine the data.
- Realtor.com: For being the source of the data.
- DALL-E 3: For generating the thumbnail image for this dataset.