Baselight

IRS Migration Data - 1992 To 2020

Migration patterns within the US based on Federal Income Tax Returns

@kaggle.wumanandpat_irs_migration_data_1992_to_2020


About this Dataset

IRS Migration Data - 1992 To 2020

The IRS publishes migration data for the US population based on the individual tax returns filed with the IRS. On a year-by-year basis, the data tracks

  • where people were coming from - the prior state of residency
  • where people were moving to - the new state of residency
  • number of returns filed - approximate number of households that migrated
  • number of exemptions - approximate number of individuals
  • the adjusted gross income (AGI) - recorded in thousands of dollars

The raw data published on the IRS website has evolved over the years - what is recorded, how it is recorded, and the naming conventions used have all changed - making it a challenge to track changes in the underlying data over time. The current dataset attempts to address these shortcomings by normalizing the record layout, standardizing the conventions, and collecting the annual data into a single, coherent dataset.

An individual record is laid out with 11 fields

Y1
Y1_STATE_FIPS
Y1_STATE_ABBR
Y1_STATE_NAME
Y2
Y2_STATE_FIPS
Y2_STATE_ABBR
Y2_STATE_NAME
NUM_RETURNS
NUM_EXEMPTIONS
AGI
Here, Y1 refers to the first year (from where the people are migrating) while Y2 refers to the second year (to where the people are migrating). As this is annual data, Y2 should always be the year after Y1. Associated with each year are three different ways of identifying a state - the name of the state, its two-letter abbreviation, and its FIPS code. Granted, carrying around three IDs per state is redundant; however, the various IDs are useful in different contexts. One thing to note - the IRS data represents migration into and out of the country via the introduction of a fake state, identified by STATE_NAME=FOREIGN, STATE_ABBR=FR, and STATE_FIPS=57.
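
As a rough illustration of the record layout and the two conventions described above (Y2 = Y1 + 1, and the FOREIGN identifiers), here is a minimal Python sketch. The class name MigrationRecord, the helper check_record, and the example values are hypothetical and not part of the dataset; the exact spelling and casing of state names in the data is also an assumption.

```python
from dataclasses import dataclass

# Identifiers of the FOREIGN pseudo-state, as given in the description.
FOREIGN_NAME, FOREIGN_ABBR, FOREIGN_FIPS = "FOREIGN", "FR", 57


@dataclass(frozen=True)
class MigrationRecord:
    """One row of the dataset (hypothetical container, 11 fields)."""
    y1: int
    y1_state_fips: int
    y1_state_abbr: str
    y1_state_name: str
    y2: int
    y2_state_fips: int
    y2_state_abbr: str
    y2_state_name: str
    num_returns: int
    num_exemptions: int
    agi: float  # recorded in thousands of dollars

    @property
    def agi_dollars(self) -> float:
        # Convert from thousands of dollars to dollars.
        return self.agi * 1000


def _foreign_ids_consistent(fips: int, abbr: str, name: str) -> bool:
    # Either all three IDs point at FOREIGN, or none of them do.
    flags = {fips == FOREIGN_FIPS, abbr == FOREIGN_ABBR, name.upper() == FOREIGN_NAME}
    return len(flags) == 1


def check_record(rec: MigrationRecord) -> list:
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    if rec.y2 != rec.y1 + 1:
        problems.append("Y2 is not the year after Y1")
    if not _foreign_ids_consistent(rec.y1_state_fips, rec.y1_state_abbr, rec.y1_state_name):
        problems.append("Y1 state IDs disagree about FOREIGN")
    if not _foreign_ids_consistent(rec.y2_state_fips, rec.y2_state_abbr, rec.y2_state_name):
        problems.append("Y2 state IDs disagree about FOREIGN")
    return problems


# Made-up example values, for illustration only (not real IRS figures).
example = MigrationRecord(2000, 6, "CA", "California",
                          2001, 48, "TX", "Texas",
                          30, 70, 1500.0)
print(check_record(example))   # []
print(example.agi_dollars)     # 1500000.0
```

The check only covers the rules stated in the description; it does not verify that a FIPS code, abbreviation, and name refer to the same state.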

From any given state, the dataset records migration to 52 destinations

  • either not moving, or staying in the same state
  • migrating to one of the other 49 states
  • migrating to Washington DC
  • migrating overseas (i.e., to the FOREIGN state)

Similarly, the dataset represents the migration into any given state as coming from one of 52 origins. Typically, the numbers associated with "staying put" constitute, by far, the largest contingent of taxpayers for the given state. The one exception to this description is the FOREIGN state. The dataset does not record "staying put" outside of the country; there is no record for FOREIGN-to-FOREIGN migration. As such, there are 51, not 52, counterparts for migration to and from the FOREIGN state.
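
The counting above can be worked through as a short Python sketch. It assumes that every year pair contains a complete set of origin/destination records, which the description does not state explicitly; the variable names are illustrative. The total of 78387 rows is the figure reported for the table on this page.

```python
DOMESTIC = 51  # 50 states plus Washington DC

# From a domestic origin: 51 domestic destinations (including staying put)
# plus the FOREIGN pseudo-state.
dest_per_domestic_origin = DOMESTIC + 1

# From FOREIGN: only the 51 domestic destinations (no FOREIGN-to-FOREIGN).
dest_for_foreign_origin = DOMESTIC

records_per_year_pair = (DOMESTIC * dest_per_domestic_origin
                         + dest_for_foreign_origin)

TOTAL_ROWS = 78387  # row count reported for the table
year_pairs, remainder = divmod(TOTAL_ROWS, records_per_year_pair)

print(dest_per_domestic_origin)   # 52
print(records_per_year_pair)      # 2703
print(year_pairs, remainder)      # 29 0
```

Under that assumption, the reported row count divides evenly into 29 complete year pairs of 2703 records each.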

Tables

Irs Soi State Migraion Data

@kaggle.wumanandpat_irs_migration_data_1992_to_2020.irs_soi_state_migraion_data
  • 783.75 KB
  • 78387 rows
  • 11 columns

CREATE TABLE irs_soi_state_migraion_data (
  "y1" BIGINT,
  "y1_state_fips" BIGINT,
  "y1_state_abbr" VARCHAR,
  "y1_state_name" VARCHAR,
  "y2" BIGINT,
  "y2_state_fips" BIGINT,
  "y2_state_abbr" VARCHAR,
  "y2_state_name" VARCHAR,
  "num_returns" BIGINT,
  "num_exemptions" BIGINT,
  "agi" DOUBLE
);
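
To show how this schema might be queried, here is a minimal Python sketch that loads a few rows into an in-memory SQLite database and computes net migration (inflow minus outflow of returns) per state for one year pair. SQLite is only a stand-in for whatever engine hosts the table, the function net_returns_by_state is hypothetical, and the sample rows are made-up numbers, not real IRS figures. Rows where the origin and destination match are excluded, since they represent "staying put" rather than movement.

```python
import sqlite3

DDL = """
CREATE TABLE irs_soi_state_migraion_data (
  "y1" BIGINT, "y1_state_fips" BIGINT, "y1_state_abbr" VARCHAR, "y1_state_name" VARCHAR,
  "y2" BIGINT, "y2_state_fips" BIGINT, "y2_state_abbr" VARCHAR, "y2_state_name" VARCHAR,
  "num_returns" BIGINT, "num_exemptions" BIGINT, "agi" DOUBLE
);
"""

# Made-up rows for illustration only; these are NOT real IRS figures.
SAMPLE_ROWS = [
    (2000, 6, "CA", "California", 2001, 6, "CA", "California", 1000, 2500, 50000.0),
    (2000, 6, "CA", "California", 2001, 48, "TX", "Texas", 30, 70, 1500.0),
    (2000, 48, "TX", "Texas", 2001, 6, "CA", "California", 20, 45, 900.0),
    (2000, 48, "TX", "Texas", 2001, 48, "TX", "Texas", 800, 2000, 36000.0),
]


def net_returns_by_state(conn, y1):
    """Net number of returns (inflow - outflow) per state for the pair y1 -> y1 + 1."""
    query = """
        SELECT abbr, SUM(inflow) - SUM(outflow) AS net_returns
        FROM (
            -- each mover row counts as inflow for its destination ...
            SELECT y2_state_abbr AS abbr, num_returns AS inflow, 0 AS outflow
            FROM irs_soi_state_migraion_data
            WHERE y1 = ? AND y1_state_fips <> y2_state_fips
            UNION ALL
            -- ... and as outflow for its origin
            SELECT y1_state_abbr, 0, num_returns
            FROM irs_soi_state_migraion_data
            WHERE y1 = ? AND y1_state_fips <> y2_state_fips
        )
        GROUP BY abbr
        ORDER BY abbr
    """
    return dict(conn.execute(query, (y1, y1)).fetchall())


conn = sqlite3.connect(":memory:")
conn.execute(DDL)
conn.executemany(
    "INSERT INTO irs_soi_state_migraion_data VALUES (?,?,?,?,?,?,?,?,?,?,?)",
    SAMPLE_ROWS,
)
net = net_returns_by_state(conn, 2000)
print(net)  # {'CA': -10, 'TX': 10}
```

The same query shape should carry over to num_exemptions or agi by swapping the summed column, keeping in mind that agi is recorded in thousands of dollars.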
