Baselight

Zillow Home Value Index (Updated Monthly)

Updated monthly; pulled from the FRED API.

@kaggle.robikscube_zillow_home_value_index

About this Dataset

Reference: https://www.zillow.com/research/zhvi-methodology/

Official Background

In setting out to create a new home price index, a major problem Zillow sought to overcome in existing indices was their inability to deal with the changing composition of properties sold in one time period versus another time period. Both a median sale price index and a repeat sales index are vulnerable to such biases (see the analysis here for an example of how influential the bias can be). For example, if expensive homes sell at a disproportionately higher rate than less expensive homes in one time period, a median sale price index will characterize this market as experiencing price appreciation relative to the prior period of time even if the true value of homes is unchanged between the two periods.
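The composition bias described above can be illustrated with a minimal sketch. The home values and the sales mix below are invented for illustration; the only point is that the median sale price moves even though no home's value changes.

```python
from statistics import median

# Hypothetical market: true home values are identical in both periods.
cheap_homes = [200_000] * 6
expensive_homes = [500_000] * 4

# Period 1: mostly cheap homes happen to sell.
sold_period_1 = cheap_homes[:5] + expensive_homes[:2]
# Period 2: mostly expensive homes happen to sell.
sold_period_2 = cheap_homes[:2] + expensive_homes[:4]

median_1 = median(sold_period_1)
median_2 = median(sold_period_2)

# Apparent appreciation produced purely by the change in sales mix.
apparent_change = median_2 / median_1 - 1

print(median_1, median_2, f"{apparent_change:.0%}")
```

In this toy market the median sale price jumps from 200,000 to 500,000 between periods, although every home is worth exactly what it was before.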

The ideal home price index would be based on sale prices for the same set of homes in each time period, so that the sales mix never differs across periods. This constant-basket approach is widely used; common examples are commodity price indices and consumer price indices. Unfortunately, unlike commodities and consumer goods, whose prices can be observed in every time period, prices for the same set of homes cannot be observed in every period, because not all homes sell in every period.

The innovation that Zillow developed in 2005 was a way of approximating this ideal home price index by leveraging the valuations Zillow creates on all homes (called Zestimates). Instead of actual sale prices on every home, the index is created from estimated sale prices on every home. While there is some estimation error associated with each estimated sale price (which we report here), this error is just as likely to be above the actual sale price of a home as below (in statistical terms, this is referred to as minimal systematic error). Because of this fact, the distribution of actual sale prices for homes sold in a given time period looks very similar to the distribution of estimated sale prices for this same set of homes. But, importantly, Zillow has estimated sale prices not just for the homes that sold, but for all homes even if they didn’t sell in that time period. From this data, a comprehensive and robust benchmark of home value trends can be computed which is immune to the changing mix of properties that sell in different periods of time (see Dorsey et al. (2010) for another recent discussion of this approach).
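The following is a rough simulation of that idea, not Zillow's actual method. It assumes a made-up housing stock, an assumed error model (symmetric, zero-mean noise of about 5%), and an invented rule for which homes sell. It compares a median of sale prices, which depends on the sales mix, with a median of estimated values over all homes, which does not.

```python
import random
from statistics import median

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical housing stock: true values are the same in both periods.
true_values = [100_000 + 50 * i for i in range(10_000)]


def estimate(value):
    # Assumed error model: symmetric, zero-mean noise (about 5% std. dev.).
    return value * (1 + rng.gauss(0, 0.05))


def sold_sample(values, favor_expensive):
    # Only some homes sell; which ones depends on the period's sales mix.
    cutoff = median(values)
    sample = []
    for v in values:
        expensive = v > cutoff
        p = 0.10 if expensive == favor_expensive else 0.02
        if rng.random() < p:
            sample.append(v)
    return sample


# Median of observed sales: sensitive to which homes happened to sell.
sales_median_1 = median(sold_sample(true_values, favor_expensive=False))
sales_median_2 = median(sold_sample(true_values, favor_expensive=True))

# Median of estimated values over ALL homes: the sales mix plays no role.
index_1 = median(estimate(v) for v in true_values)
index_2 = median(estimate(v) for v in true_values)

print(round(sales_median_1), round(sales_median_2))
print(round(index_1), round(index_2))
```

Under these assumptions the sales-based median shifts sharply between the two periods, while the estimate-based median stays close to the true median in both.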

For an in-depth comparison of the Zillow Home Value Index to the Case-Shiller Home Price Index, please refer to the Zillow Home Value Index Comparison to Case-Shiller.

Each Zillow Home Value Index (ZHVI) is a time series tracking the monthly median home value in a particular geographical region. In general, each ZHVI time series begins in April 1996. We generate the ZHVI at the following geographic levels: neighborhood, ZIP code, city, congressional district, county, metropolitan area, state and the nation.
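As a small illustration of "one time series of monthly medians per region", the sketch below groups per-home estimates by region and month and takes the median of each group. The region names and values are hypothetical, and the real index may apply further processing that is not described here.

```python
from collections import defaultdict
from statistics import median

# Hypothetical per-home estimates: (region, month, estimated value).
estimates = [
    ("Region A", "2020-01", 210_000),
    ("Region A", "2020-01", 250_000),
    ("Region A", "2020-01", 400_000),
    ("Region A", "2020-02", 215_000),
    ("Region A", "2020-02", 255_000),
    ("Region A", "2020-02", 410_000),
    ("Region B", "2020-01", 120_000),
    ("Region B", "2020-01", 180_000),
]

# Collect the estimates belonging to each (region, month) pair.
grouped = defaultdict(list)
for region, month, value in estimates:
    grouped[(region, month)].append(value)

# One time series per region: month -> median estimated value.
series = defaultdict(dict)
for (region, month), values in sorted(grouped.items()):
    series[region][month] = median(values)

print(dict(series))
```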

Underlying Data

Estimated sale prices (Zestimates) are computed based on proprietary statistical and machine learning models. These models begin the estimation process by subdividing all of the homes in the United States into micro-regions, or subsets of homes either near one another or similar in physical attributes to one another. Within each micro-region, the models observe recent sale transactions and learn the relative contribution of various home attributes in predicting the sale price. These home attributes include physical facts about the home and land, prior sale transactions, tax assessment information and geographic location. Based on the patterns learned, these models can then estimate sale prices on homes that have not yet sold.

The sale transactions from which the models learn patterns include all full-value, arms-length sales that are not foreclosure resales. The purpose of the Zestimate is to give consumers an indication of the fair value of a home under the assumption that it is sold as a conventional, non-foreclosure sale. Similarly, the purpose of the Zillow Home Value Index is to give consumers insight into the home value trends for homes that are not being sold out of foreclosure status. Zillow research indicates that homes sold as foreclosures have typical discounts relative to non-foreclosure sales of between 20 and 40 percent, depending on the foreclosure saturation of the market. This is not to say that the Zestimate is not influenced by foreclosure resales. Zestimates are, in fact, influenced by foreclosure sales, but the pathway of this influence is through the downward pressure foreclosure sales put on non-foreclosure sale prices. It is the price signal observed in the latter that we are attempting to measure and, in turn, predict with the Zestimate.

Market Segments

Within each region, we calculate the ZHVI for various subsets of homes (or market segments) so as to afford greater insight into what is happening in a particular market. All market segments are shown in the table below. Only residential properties are included in the ZHVI calculation. Non-residential properties, such as office buildings, shopping centers and farms, are not included.

One very useful form of market segmentation that we produce is based on the distribution of home values within the metropolitan area. Here we assign properties into one of three tiers based on their Zestimates on a particular date: top, middle or bottom tier. The thresholds for the price tiers vary from metro to metro and are determined by the distribution of home values in each metro. Since Zestimates are time-dependent, a property may belong to different price tiers at different dates. To reduce tier switching, we exclude properties near the boundaries of price tiers when assigning tiers. Thus, the number of Zestimates summed across all three tiers does not equal the number of Zestimates for the “All Homes” market segment.
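A minimal sketch of tier assignment with a boundary exclusion zone follows. The source does not state the actual thresholds or the width of the excluded band, so the tercile boundaries (1/3 and 2/3) and the `buffer` parameter are assumptions made purely for illustration, as is the function name `assign_tiers`.

```python
def assign_tiers(zestimates, buffer=0.04):
    """Assign each home to a tier by value rank; None means excluded."""
    n = len(zestimates)
    # Indices of homes ordered from lowest to highest estimated value.
    order = sorted(range(n), key=lambda i: zestimates[i])
    tiers = [None] * n
    for rank, i in enumerate(order):
        pct = (rank + 0.5) / n  # percentile position in (0, 1)
        # Assumed boundaries at 1/3 and 2/3; skip homes within `buffer` of one.
        if any(abs(pct - b) < buffer for b in (1 / 3, 2 / 3)):
            continue
        if pct < 1 / 3:
            tiers[i] = "bottom"
        elif pct < 2 / 3:
            tiers[i] = "middle"
        else:
            tiers[i] = "top"
    return tiers


# Hypothetical metro with 30 homes of evenly spaced values.
values = [100_000 + 10_000 * i for i in range(30)]
tiers = assign_tiers(values)
assigned = [t for t in tiers if t is not None]

print(len(values), len(assigned))
```

Because homes close to a boundary receive no tier, the three tiers together contain fewer homes than the full set, which is the effect the paragraph above describes.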

Market Segment    Number of Zestimates    Description
All Homes         87.3 M                  Single family + condominium + cooperative
Single Family     78.1 M                  Single family only
Condo             9.2 M                   Condominium + cooperative only
0 or missing      31.6 M                  0 Bedroom or missing
1 Bedroom         1.7 M                   1 Bedroom
2 Bedroom         11.1 M                  2 Bedroom
3 Bedroom         28.6 M                  3 Bedroom
4 Bedroom         11.7 M                  4 Bedroom
5+ Bedroom        2.7 M                   5 Bedroom or more

Tables

Zhvi

@kaggle.robikscube_zillow_home_value_index.zhvi
  • 171.96 KB
  • 299 rows
  • 52 columns
CREATE TABLE zhvi (
  "unnamed_0" TIMESTAMP,
  "virginia" DOUBLE,
  "california" DOUBLE,
  "florida" DOUBLE,
  "new_york" DOUBLE,
  "new_jersey" DOUBLE,
  "texas" DOUBLE,
  "michigan" DOUBLE,
  "massachusetts" DOUBLE,
  "arizona" DOUBLE,
  "washington" DOUBLE,
  "colorado" DOUBLE,
  "illinois" DOUBLE,
  "the_district_of_columbia" DOUBLE,
  "nevada" DOUBLE,
  "hawaii" DOUBLE,
  "new_hampshire" DOUBLE,
  "utah" DOUBLE,
  "georgia" DOUBLE,
  "montana" DOUBLE,
  "minnesota" DOUBLE,
  "louisiana" DOUBLE,
  "maryland" DOUBLE,
  "pennsylvania" DOUBLE,
  "south_carolina" DOUBLE,
  "north_carolina" DOUBLE,
  "vermont" DOUBLE,
  "tennessee" DOUBLE,
  "oregon" DOUBLE,
  "new_mexico" DOUBLE,
  "rhode_island" DOUBLE,
  "alaska" DOUBLE,
  "maine" DOUBLE,
  "alabama" DOUBLE,
  "wisconsin" DOUBLE,
  "arkansas" DOUBLE,
  "mississippi" DOUBLE,
  "indiana" DOUBLE,
  "west_virginia" DOUBLE,
  "idaho" DOUBLE,
  "north_dakota" DOUBLE,
  "connecticut" DOUBLE,
  "kentucky" DOUBLE,
  "missouri" DOUBLE,
  "kansas" DOUBLE,
  "delaware" DOUBLE,
  "wyoming" DOUBLE,
  "oklahoma" DOUBLE,
  "south_dakota" DOUBLE,
  "nebraska" DOUBLE,
  "iowa" DOUBLE,
  "ohio" DOUBLE
);
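
As a hedged usage sketch, the snippet below computes year-over-year change for two state columns. It makes several assumptions: "unnamed_0" is taken to hold the observation date, Python's built-in sqlite3 stands in for whatever engine actually hosts the table, only three of the 52 columns are recreated, and the two rows of values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Reduced copy of the schema above (three of the 52 columns).
conn.execute(
    'CREATE TABLE zhvi ("unnamed_0" TIMESTAMP, "california" DOUBLE, "texas" DOUBLE)'
)

# Made-up values for illustration only; not taken from the dataset.
conn.executemany(
    "INSERT INTO zhvi VALUES (?, ?, ?)",
    [
        ("2020-01-01", 500000.0, 200000.0),
        ("2021-01-01", 550000.0, 220000.0),
    ],
)

# Self-join each month to the same month one year earlier.
query = """
SELECT cur."unnamed_0" AS month,
       ROUND(cur."california" / prev."california" - 1, 4) AS california_yoy,
       ROUND(cur."texas" / prev."texas" - 1, 4) AS texas_yoy
FROM zhvi AS cur
JOIN zhvi AS prev
  ON prev."unnamed_0" = date(cur."unnamed_0", '-1 year')
ORDER BY month
"""
rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```

The date arithmetic (`date(..., '-1 year')`) is SQLite syntax; other engines typically use interval expressions instead.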